ETC UPD30181AY

User’s Manual
VR4100 Series™
64-/32-bit Microprocessor
Architecture
Target Device
µPD30121 (VR4121TM)
µPD30122 (VR4122TM)
µPD30131 (VR4131TM)
µPD30181 (VR4181TM)
µPD30181A, 30181AY (VR4181ATM)
Document No. U15509EJ2V0UM00 (2nd edition)
Date Published June 2002 NS CP(K)
 NEC Corporation 2002
 MIPS Technologies, Inc. 1997, 2001
Printed in Japan
[MEMO]
2
User’s Manual U15509EJ2V0UM
NOTES FOR CMOS DEVICES
1
PRECAUTION AGAINST ESD FOR SEMICONDUCTORS
Note:
Strong electric field, when exposed to a MOS device, can cause destruction of the gate oxide and
ultimately degrade the device operation. Steps must be taken to stop generation of static electricity
as much as possible, and quickly dissipate it once, when it has occurred. Environmental control
must be adequate. When it is dry, humidifier should be used. It is recommended to avoid using
insulators that easily build static electricity. Semiconductor devices must be stored and transported
in an anti-static container, static shielding bag or conductive material. All test and measurement
tools including work bench and floor should be grounded. The operator should be grounded using
wrist strap. Semiconductor devices must not be touched with bare hands. Similar precautions need
to be taken for PW boards with semiconductor devices on it.
2
HANDLING OF UNUSED INPUT PINS FOR CMOS
Note:
No connection for CMOS device inputs can be cause of malfunction. If no connection is provided
to the input pins, it is possible that an internal input level may be generated due to noise, etc., hence
causing malfunction. CMOS devices behave differently than Bipolar or NMOS devices. Input levels
of CMOS devices must be fixed high or low by using a pull-up or pull-down circuitry. Each unused
pin should be connected to V DD or GND with a resistor, if it is considered to have a possibility of
being an output pin. All handling related to the unused pins must be judged device by device and
related specifications governing the devices.
3
STATUS BEFORE INITIALIZATION OF MOS DEVICES
Note:
Power-on does not necessarily define initial status of MOS device. Production process of MOS
does not define the initial operation status of the device. Immediately after the power source is
turned ON, the devices with reset function have not yet been initialized. Hence, power-on does
not guarantee out-pin levels, I/O settings or contents of registers. Device is not initialized until the
reset signal is received. Reset operation must be executed immediately after power-on for devices
having reset function.
VR10000, VR12000, VR4000, VR4000 Series, VR4100, VR4100 Series, VR4110, VR4120, VR4121, VR4122,
VR4130, VR4131, VR4181, VR4181A, VR4300, VR4305, VR4310, VR4400, VR5000, VR5000A, VR5432, VR5500,
and VR Series are trademarks of NEC Corporation.
MIPS is a registered trademark of MIPS Technologies, Inc. in the United States.
MC68000 is a trademark of Motorola Inc.
IBM370 is a trademark of IBM Corp.
Pentium is a trademark of Intel Corp.
DEC VAX is a trademark of Digital Equipment Corp.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through
X/Open Company, Ltd.
User’s Manual U15509EJ2V0UM
3
2
2
Purchase of NEC I C components conveys a license under the Philips I C Patent Rights to use these
2
2
components in an I C system, provided that the system conforms to the I C Standard Specification as defined
by Philips.
Exporting this product or equipment that includes this product may require a governmental license from the U.S.A. for some
countries because this product utilizes technologies limited by the export control regulations of the U.S.A.
• The information in this document is current as of April, 2002. The information is subject to change
without notice. For actual design-in, refer to the latest publications of NEC's data sheets or data
books, etc., for the most up-to-date specifications of NEC semiconductor products. Not all products
and/or types are available in every country. Please check with an NEC sales representative for
availability and additional information.
• No part of this document may be copied or reproduced in any form or by any means without prior
written consent of NEC. NEC assumes no responsibility for any errors that may appear in this document.
• NEC does not assume any liability for infringement of patents, copyrights or other intellectual property rights of
third parties by or arising from the use of NEC semiconductor products listed in this document or any other
liability arising from the use of such products. No license, express, implied or otherwise, is granted under any
patents, copyrights or other intellectual property rights of NEC or others.
• Descriptions of circuits, software and other related information in this document are provided for illustrative
purposes in semiconductor product operation and application examples. The incorporation of these
circuits, software and information in the design of customer's equipment shall be done under the full
responsibility of customer. NEC assumes no responsibility for any losses incurred by customers or third
parties arising from the use of these circuits, software and information.
• While NEC endeavours to enhance the quality, reliability and safety of NEC semiconductor products, customers
agree and acknowledge that the possibility of defects thereof cannot be eliminated entirely. To minimize
risks of damage to property or injury (including death) to persons arising from defects in NEC
semiconductor products, customers must incorporate sufficient safety measures in their design, such as
redundancy, fire-containment, and anti-failure features.
• NEC semiconductor products are classified into the following three quality grades:
"Standard", "Special" and "Specific". The "Specific" quality grade applies only to semiconductor products
developed based on a customer-designated "quality assurance program" for a specific application. The
recommended applications of a semiconductor product depend on its quality grade, as indicated below.
Customers must check the quality grade of each semiconductor product before using it in a particular
application.
"Standard": Computers, office equipment, communications equipment, test and measurement equipment, audio
and visual equipment, home electronic appliances, machine tools, personal electronic equipment
and industrial robots
"Special": Transportation equipment (automobiles, trains, ships, etc.), traffic control systems, anti-disaster
systems, anti-crime systems, safety equipment and medical equipment (not specifically designed
for life support)
"Specific": Aircraft, aerospace equipment, submersible repeaters, nuclear reactor control systems, life
support systems and medical equipment for life support, etc.
The quality grade of NEC semiconductor products is "Standard" unless otherwise expressly specified in NEC's
data sheets or data books, etc. If customers wish to use NEC semiconductor products in applications not
intended by NEC, they must contact an NEC sales representative in advance to determine NEC's willingness
to support a given application.
(Note)
(1) "NEC" as used in this statement means NEC Corporation and also includes its majority-owned subsidiaries.
(2) "NEC semiconductor products" means any semiconductor product developed or manufactured by or for
NEC (as defined above).
M8E 00. 4
4
User’s Manual U15509EJ2V0UM
Regional Information
Some information contained in this document may vary from country to country. Before using any NEC
product in your application, pIease contact the NEC office in your country to obtain a list of authorized
representatives and distributors. They will verify:
•
Device availability
•
Ordering information
•
Product release schedule
•
Availability of related technical literature
•
Development environment specifications (for example, specifications for third-party tools and
components, host computers, power plugs, AC supply voltages, and so forth)
•
Network requirements
In addition, trademarks, registered trademarks, export restrictions, and other legal issues may also vary
from country to country.
NEC Electronics Inc. (U.S.)
Santa Clara, California
Tel: 408-588-6000
800-366-9782
Fax: 408-588-6130
800-729-9288
NEC do Brasil S.A.
Electron Devices Division
Guarulhos-SP, Brasil
Tel: 11-6462-6810
Fax: 11-6462-6829
• Filiale Italiana
Milano, Italy
Tel: 02-66 75 41
Fax: 02-66 75 42 99
NEC Electronics Hong Kong Ltd.
• Branch The Netherlands
Eindhoven, The Netherlands
Tel: 040-244 58 45
Fax: 040-244 45 80
NEC Electronics Hong Kong Ltd.
• Branch Sweden
Taeby, Sweden
Tel: 08-63 80 820
NEC Electronics (Europe) GmbH Fax: 08-63 80 388
Duesseldorf, Germany
• United Kingdom Branch
Tel: 0211-65 03 01
Milton Keynes, UK
Fax: 0211-65 03 327
Tel: 01908-691-133
Fax: 01908-670-290
• Sucursal en España
Madrid, Spain
Tel: 091-504 27 87
Fax: 091-504 28 60
Hong Kong
Tel: 2886-9318
Fax: 2886-9022/9044
Seoul Branch
Seoul, Korea
Tel: 02-528-0303
Fax: 02-528-4411
NEC Electronics Shanghai, Ltd.
Shanghai, P.R. China
Tel: 021-6841-1138
Fax: 021-6841-1137
NEC Electronics Taiwan Ltd.
Taipei, Taiwan
Tel: 02-2719-2377
Fax: 02-2719-5951
NEC Electronics Singapore Pte. Ltd.
Novena Square, Singapore
Tel: 253-8311
Fax: 250-3583
• Succursale Française
Vélizy-Villacoublay, France
Tel: 01-30-67 58 00
Fax: 01-30-67 58 99
J02.4
User’s Manual U15509EJ2V0UM
5
PREFACE
Readers
This manual targets users who intend to understand the functions of the VR4100
Series, the RISC microprocessors, and to design application systems using them.
Purpose
This manual introduces the architecture of the VR4100 Series to users, following the
organization described below.
Organization
Two manuals are available for the VR4100 Series: Architecture User’s Manual (this
manual) and Hardware User’s Manual of each product.
Architecture
User's Manual
• Pipeline operation
• Cache organization and memory
management system
• Exception processing
• Interrupts
• Instruction set
How to read this manual
Hardware
User's Manual
•
•
•
•
•
Pin functions
Physical address space
Function of Coprocessor 0
Initialization interface
Peripheral units
It is assumed that the reader of this manual has general knowledge in the fields of
electric engineering, logic circuits, and microcomputers.
In this manual, the following products are referred to as the VR4100 Series.
Descriptions that differ between these products are explained individually, and
common parts are explained as for the VR4100 Series.
VR4121 (µPD30121)
VR4122 (µPD30122)
VR4131 (µPD30131)
VR4181 (µPD30181)
VR4181A (µPD30181A, 30181AY)
To learn in detail about the function of a specific instruction,
→ Read CHAPTER 2 CPU INSTRUCTION SET SUMMARY, CHAPTER 3
MIPS16 INSTRUCTION SET, CHAPTER 9
CPU INSTRUCTION SET
DETAILS, and CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT.
To learn about the overall functions of the VR4100 Series,
→ Read this manual in sequential order.
To learn about hardware functions,
→ Refer to Hardware User's Manual which is separately available.
To learn about electrical specifications,
→ Refer to Data Sheet which is separately available.
6
User’s Manual U15509EJ2V0UM
Conventions
Data significance:
Active low:
Note:
Caution:
Remark:
Numeric representation:
Higher on left and lower on right
XXX# (trailing # after pin and signal names)
Description of item marked with Note in the text
Information requiring particular attention
Supplementary information
binary/decimal ... XXXX
hexadecimal ... 0xXXXX
Prefixes representing an exponent of 2 (for address space or memory capacity):
10
K (kilo) ... 2 = 1024
20
2
M (mega) ... 2 = 1024
30
3
G (giga) ... 2 = 1024
40
4
T (tera) ... 2 = 1024
50
5
P (peta) ... 2 = 1024
60
6
E (exa) ... 2 = 1024
Related Documents
The related documents indicated here may include preliminary version. However,
preliminary versions are not marked as such.
Document name
Document number
VR4100 Series Architecture User’s Manual
This manual
VR4121 User’s Manual
U13569E
µPD30121 (VR4121) Data Sheet
U14691E
VR4122 User’s Manual
U14327E
µPD30122 (VR4122) Data Sheet
U16219E
VR4131 Hardware User’s Manual
U15350E
µPD30131 (VR4131) Data Sheet
To be prepared
VR4181 Hardware User’s Manual
U14272E
µPD30181 (VR4181) Data Sheet
U14273E
VR4181A Hardware User’s Manual
To be prepared
µPD30181A, 30181AY (VR4181A) Data Sheet
To be prepared
VR Series
TM
Programming Guide Application Note
User’s Manual U15509EJ2V0UM
U10710E
7
CONTENTS
CHAPTER 1 INTRODUCTION .............................................................................................................. 17
1.1 Features .................................................................................................................................... 17
1.2 CPU Core .................................................................................................................................. 19
1.2.1 CPU registers ..............................................................................................................................
20
1.2.2 Coprocessors ..............................................................................................................................
21
1.2.3 System control coprocessor (CP0) ..............................................................................................
21
1.2.4 Floating-point unit (FPU) .............................................................................................................
23
1.2.5 Cache memory ............................................................................................................................
23
1.3 CPU Instruction Set Overview ................................................................................................ 23
1.4 Data Formats and Addressing ................................................................................................ 26
1.5 Memory Management System ................................................................................................ 30
1.5.1 Translation lookaside buffer (TLB) ..............................................................................................
30
1.5.2 Processor modes .........................................................................................................................
30
1.6 Instruction Pipeline .................................................................................................................. 31
1.6.1 Branch prediction .........................................................................................................................
31
1.7 Code Compatibility .................................................................................................................. 32
CHAPTER 2 CPU INSTRUCTION SET SUMMARY ......................................................................... 33
2.1 Instruction Set Architecture .................................................................................................... 33
2.2 CPU Instruction Formats ......................................................................................................... 34
2.3 Instructions Added in the VR4100 Series ............................................................................... 35
2.3.1 Product-sum operation instructions .............................................................................................
35
2.3.2 Power mode instructions .............................................................................................................
35
2.4 Instruction Overview ............................................................................................................... 36
2.4.1 Load and store instructions .........................................................................................................
36
2.4.2 Computational instructions ..........................................................................................................
40
2.4.3 Jump and branch instructions ......................................................................................................
47
2.4.4 Special instructions ......................................................................................................................
51
2.4.5 System control coprocessor (CP0) instructions ...........................................................................
52
CHAPTER 3 MIPS16 INSTRUCTION SET ......................................................................................... 54
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
8
Outline .......................................................................................................................................
Features ....................................................................................................................................
Register Set ..............................................................................................................................
ISA Mode ...................................................................................................................................
54
54
55
56
3.4.1 Changing ISA mode bit by software ............................................................................................
56
3.4.2 Changing ISA mode bit by exception ...........................................................................................
56
3.4.3 Enabling change ISA mode bit ....................................................................................................
57
Types of Instructions ...............................................................................................................
Instruction Format ...................................................................................................................
MIPS16 Operation Code Bit Encoding ...................................................................................
Outline of Instructions .............................................................................................................
57
59
64
67
User’s Manual U15509EJ2V0UM
3.8.1 PC-relative instructions ...............................................................................................................
67
3.8.2 Extend instruction ........................................................................................................................
68
3.8.3 Delay slots ...................................................................................................................................
70
3.8.4 Instruction details ........................................................................................................................
71
CHAPTER 4 PIPELINE ......................................................................................................................... 84
4.1 Pipeline Stages ........................................................................................................................ 84
4.1.1 VR4121, VR4122, VR4181A .........................................................................................................
84
4.1.2 VR4131 ........................................................................................................................................
87
4.1.3 VR4181 ........................................................................................................................................
89
4.2 Branch Delay ............................................................................................................................ 90
4.2.1 VR4121, VR4122, VR4181A .........................................................................................................
90
4.2.2 VR4131 ........................................................................................................................................
91
4.2.3 VR4181 ........................................................................................................................................
93
4.3 Branch Prediction .................................................................................................................... 94
4.4
4.5
4.6
4.7
4.3.1 VR4122, VR4181A .......................................................................................................................
95
4.3.2 VR4131 ........................................................................................................................................
97
Load Delay ................................................................................................................................
Instruction Streaming ..............................................................................................................
Pipeline Activities ....................................................................................................................
Interlock and Exception ..........................................................................................................
101
101
102
116
4.7.1 Exception conditions ................................................................................................................... 119
4.7.2 Stall conditions ............................................................................................................................ 120
4.7.3 Slip conditions ............................................................................................................................. 121
4.7.4 Bypassing .................................................................................................................................... 123
CHAPTER 5 MEMORY MANAGEMENT SYSTEM ............................................................................ 124
5.1 Processor Modes ..................................................................................................................... 124
5.1.1 Operating mode ........................................................................................................................... 124
5.1.2 Addressing mode ........................................................................................................................ 124
5.2 Translation Lookaside Buffer (TLB) ...................................................................................... 125
5.2.1 Format of a TLB entry ................................................................................................................. 125
5.2.2 Manipulation of TLB .................................................................................................................... 126
5.2.3 TLB instructions ........................................................................................................................... 127
5.2.4 TLB exceptions ............................................................................................................................ 127
5.3 Virtual-to-Physical Address Translation ............................................................................... 128
5.3.1 32-bit mode address translation .................................................................................................. 131
5.3.2 64-bit mode address translation .................................................................................................. 132
5.4 Address Space ........................................................................................................................ 133
5.4.1 User mode virtual address space ................................................................................................ 133
5.4.2 Supervisor mode virtual address space ...................................................................................... 135
5.4.3 Kernel mode virtual address space ............................................................................................. 138
5.5 Memory Management Registers ............................................................................................ 146
5.5.1 Index register (0) ......................................................................................................................... 147
5.5.2 Random register (1) .................................................................................................................... 147
5.5.3 EntryLo0 (2) and EntryLo1 (3) registers ...................................................................................... 148
User’s Manual U15509EJ2V0UM
9
5.5.4 PageMask register (5) ................................................................................................................. 149
5.5.5 Wired register (6) ......................................................................................................................... 150
5.5.6 EntryHi register (10) .................................................................................................................... 151
5.5.7 Processor Revision Identifier (PRId) register (15) ....................................................................... 152
5.5.8 Config register (16) ...................................................................................................................... 153
5.5.9 Load Linked Address (LLAddr) register (17) ............................................................................... 155
5.5.10 TagLo (28) and TagHi (29) registers ......................................................................................... 156
CHAPTER 6 EXCEPTION PROCESSING ........................................................................................... 157
6.1 Exception Processing Overview ............................................................................................ 157
6.1.1 Precision of exceptions ................................................................................................................ 157
6.2 Exception Processing Registers ............................................................................................ 158
6.2.1 Context register (4) ...................................................................................................................... 159
6.2.2 BadVAddr register (8) .................................................................................................................. 160
6.2.3 Count register (9) ......................................................................................................................... 160
6.2.4 Compare register (11) ................................................................................................................. 161
6.2.5 Status register (12) ...................................................................................................................... 161
6.2.6 Cause register (13) ...................................................................................................................... 165
6.2.7 Exception Program Counter (EPC) register (14) ......................................................................... 167
6.2.8 WatchLo (18) and WatchHi (19) registers ................................................................................... 168
6.2.9 XContext register (20) .................................................................................................................. 169
6.2.10 Parity Error register (26) ............................................................................................................ 170
6.2.11 Cache Error register (27) ........................................................................................................... 170
6.2.12 ErrorEPC register (30) ............................................................................................................... 171
6.3 Overview of Exceptions .......................................................................................................... 173
6.3.1 Exception types ........................................................................................................................... 173
6.3.2 Exception vector locations ........................................................................................................... 173
6.3.3 Priority of exceptions ................................................................................................................... 175
6.4 Details of Exceptions ............................................................................................................... 176
6.4.1 Cold Reset exception .................................................................................................................. 176
6.4.2 Soft Reset exception ................................................................................................................... 177
6.4.3 NMI exception .............................................................................................................................. 178
6.4.4 Address Error exception .............................................................................................................. 179
6.4.5 TLB exceptions ............................................................................................................................ 180
6.4.6 Bus Error exception ..................................................................................................................... 183
6.4.7 System Call exception ................................................................................................................. 184
6.4.8 Breakpoint exception ................................................................................................................... 185
6.4.9 Coprocessor Unusable exception ................................................................................................ 186
6.4.10 Reserved Instruction exception ................................................................................................. 187
6.4.11 Trap exception ........................................................................................................................... 188
6.4.12 Integer Overflow exception ........................................................................................................ 188
6.4.13 Watch exception ........................................................................................................................ 189
6.4.14 Interrupt exception ..................................................................................................................... 190
6.5 Exception Processing and Servicing Flowcharts ................................................................. 191
10
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY .......................................................................................................... 198
7.1 Memory Organization .............................................................................................................. 198
7.1.1 On-chip caches ........................................................................................................................... 199
7.2 Cache Organization ................................................................................................................. 200
7.2.1 Instruction cache line .................................................................................................................. 200
7.2.2 Data cache line ............................................................................................................................ 201
7.2.3 Placement of cache data ............................................................................................................. 202
7.3 Cache Operations .................................................................................................................... 202
7.3.1 Cache data coherency ................................................................................................................ 203
7.3.2 Replacement of cache line .......................................................................................................... 203
7.3.3 Accessing the caches ................................................................................................................. 204
7.4 Cache States ............................................................................................................................ 205
7.4.1 Cache state transition diagrams .................................................................................................. 206
7.5 Cache Access Flow ................................................................................................................. 207
7.6 Manipulation of the Caches by an External Agent ............................................................... 220
7.7 Initialization of the Caches ..................................................................................................... 220
CHAPTER 8 CPU CORE INTERRUPTS ............................................................................................ 221
8.1 Types of Interrupt Request ..................................................................................................... 221
8.1.1 Non-maskable interrupt (NMI) ..................................................................................................... 221
8.1.2 Ordinary interrupts ....................................................................................................................... 221
8.1.3 Software interrupts generated in CPU core ................................................................................. 222
8.1.4 Timer interrupt ............................................................................................................................. 222
8.2 Acknowledging Interrupts ...................................................................................................... 222
8.2.1 Detecting hardware interrupts ..................................................................................................... 222
8.2.2 Masking interrupt signals ............................................................................................................. 223
CHAPTER 9 CPU INSTRUCTION SET DETAILS ............................................................................ 224
9.1 Instruction Notation Conventions .......................................................................................... 224
9.2 Notes on Using CPU Instructions .......................................................................................... 226
9.2.1 Load and Store instructions ......................................................................................................... 226
9.2.2 Jump and Branch instructions ..................................................................................................... 227
9.2.3 System control coprocessor (CP0) instructions .......................................................................... 228
9.3 CPU Instructions ...................................................................................................................... 228
9.4 CPU Instruction Opcode Bit Encoding .................................................................................. 383
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT ..................................................................... 386
CHAPTER 11 COPROCESSOR 0 HAZARDS ................................................................................... 421
APPENDIX INDEX ................................................................................................................................. 427
User’s Manual U15509EJ2V0UM
11
LIST OF FIGURES (1/3)
Fig. No.
Title
Page
1-1.
CPU Core Internal Block Diagram ........................................................................................................
19
1-2.
CPU Registers ......................................................................................................................................
21
1-3.
CPU Instruction Formats (32-bit Length Instruction) .............................................................................
23
1-4.
CPU Instruction Formats (16-bit Length Instruction) .............................................................................
25
1-5.
Byte Address in Big-Endian Byte Order ................................................................................................
27
1-6.
Byte Address in Little-Endian Byte Order ..............................................................................................
28
1-7.
Misaligned Word Accessing (Little-Endian) ..........................................................................................
29
2-1.
CPU Instruction Formats .......................................................................................................................
34
2-2.
Byte Specification Related to Load and Store Instructions ...................................................................
37
4-1.
Pipeline Stages (VR4121, VR4122, VR4181A) .......................................................................................
85
4-2.
Instruction Execution in the Pipeline (VR4121, VR4122, VR4181A) .......................................................
86
4-3.
Pipeline Stages (VR4131) ......................................................................................................................
87
4-4.
Instruction Execution in the Pipeline (VR4131) ......................................................................................
88
4-5.
Pipeline Stages (VR4181) ......................................................................................................................
89
4-6.
Instruction Execution in the Pipeline (VR4181) ......................................................................................
89
4-7.
Branch Delay (VR4121, VR4122, VR4181A) ...........................................................................................
90
4-8.
Branch Delay (VR4131, MIPS III Instruction Mode) ...............................................................................
91
4-9.
Branch Delay (VR4131, MIPS16 Instruction Mode) ...............................................................................
92
4-10.
Branch Delay (VR4181) .........................................................................................................................
93
4-11.
Pipeline on Branch Prediction (VR4122, VR4181A) ...............................................................................
95
4-12.
Pipeline on Branch Prediction (VR4131, When the Branch Is in the Lower Address) ...........................
97
4-13.
Pipeline on Branch Prediction (VR4131, When the Branch Is in the Higher Address) ..........................
99
4-14.
Pipeline Activities ..................................................................................................................................
102
4-15.
ADD Instruction Pipeline Activities (VR4121, VR4122, VR4181A) ..........................................................
104
4-16.
ADD Instruction Pipeline Activities (VR4131) ........................................................................................
105
4-17.
ADD Instruction Pipeline Activities (VR4181) ........................................................................................
105
4-18.
JALR Instruction Pipeline Activities (VR4121, VR4122, VR4181A) .........................................................
106
4-19.
JALR Instruction Pipeline Activities (VR4131) .......................................................................................
107
4-20.
JALR Instruction Pipeline Activities (VR4181) .......................................................................................
107
4-21.
BEQ Instruction Pipeline Activities (VR4121, VR4122, VR4181A) ..........................................................
108
4-22.
BEQ Instruction Pipeline Activities (VR4131) ........................................................................................
109
4-23.
BEQ Instruction Pipeline Activities (VR4181) ........................................................................................
109
4-24.
TLT Instruction Pipeline Activities (VR4121, VR4122, VR4181A) ...........................................................
110
4-25.
TLT Instruction Pipeline Activities (VR4131) ..........................................................................................
111
4-26.
TLT Instruction Pipeline Activities (VR4181) ..........................................................................................
111
4-27.
LW Instruction Pipeline Activities (VR4121, VR4122, VR4181A) ............................................................
112
4-28.
LW Instruction Pipeline Activities (VR4131) ..........................................................................................
113
4-29.
LW Instruction Pipeline Activities (VR4181) ..........................................................................................
113
4-30.
SW Instruction Pipeline Activities (VR4121, VR4122, VR4181A) ...........................................................
114
4-31.
SW Instruction Pipeline Activities (VR4131) ..........................................................................................
115
4-32.
SW Instruction Pipeline Activities (VR4181) ..........................................................................................
115
4-33.
Interlocks, Exceptions, and Faults ........................................................................................................
116
12
User’s Manual U15509EJ2V0UM
LIST OF FIGURES (2/3)
Fig. No.
4-34.
Title
Exception Detection ...............................................................................................................................
Page
119
4-35.
Data Cache Miss Stall ............................................................................................................................
120
4-36.
CACHE Instruction Stall .........................................................................................................................
120
4-37.
Load Data Interlock ................................................................................................................................
121
4-38.
MD Busy Interlock ..................................................................................................................................
122
5-1.
Format of a TLB Entry ............................................................................................................................
126
5-2.
TLB Manipulation Overview ...................................................................................................................
127
5-3.
Virtual-to-Physical Address Translation .................................................................................................
129
5-4.
Address Translation in TLB ....................................................................................................................
130
5-5.
32-bit Mode Virtual Address Translation ................................................................................................
131
5-6.
64-bit Mode Virtual Address Translation ................................................................................................
132
5-7.
User Mode Address Space ....................................................................................................................
134
5-8.
Supervisor Mode Address Space ..........................................................................................................
136
5-9.
Kernel Mode Address Space .................................................................................................................
139
5-10.
xkphys Area Address Space ..................................................................................................................
140
5-11.
Index Register ........................................................................................................................................
147
5-12.
Random Register ...................................................................................................................................
147
5-13.
EntryLo0 and EntryLo1 Registers ..........................................................................................................
148
5-14.
PageMask Register ................................................................................................................................
149
5-15.
Positions Indicated by the Wired Register .............................................................................................
150
5-16.
Wired Register .......................................................................................................................................
150
5-17.
EntryHi Register .....................................................................................................................................
151
5-18.
PRId Register .........................................................................................................................................
152
5-19.
Config Register ......................................................................................................................................
153
5-20.
LLAddr Register .....................................................................................................................................
155
5-21.
TagLo Register ......................................................................................................................................
156
5-22.
TagHi Register .......................................................................................................................................
156
6-1.
Context Register .....................................................................................................................................
159
6-2.
BadVAddr Register .................................................................................................................................
160
6-3.
Count Register ........................................................................................................................................
160
6-4.
Compare Register...................................................................................................................................
161
6-5.
Status Register .......................................................................................................................................
161
6-6.
Status Register Diagnostic Status Field ................................................................................................
163
6-7.
Cause Register .......................................................................................................................................
165
6-8.
EPC Register (When MIPS16 ISA Is Disabled) .....................................................................................
167
6-9.
EPC Register (When MIPS16 ISA Is Enabled) ......................................................................................
168
6-10.
WatchLo Register ...................................................................................................................................
168
6-11.
WatchHi Register....................................................................................................................................
168
6-12.
XContext Register...................................................................................................................................
169
6-13.
Parity Error Register ...............................................................................................................................
170
6-14.
Cache Error Register ..............................................................................................................................
170
6-15.
ErrorEPC Register (When MIPS16 ISA Is Disabled) .............................................................................
172
User’s Manual U15509EJ2V0UM
13
LIST OF FIGURES (3/3)
Fig. No.
Title
Page
6-16.
ErrorEPC Register (When MIPS16 ISA Is Enabled) .............................................................................
172
6-17.
Common Exception Handling ................................................................................................................
192
6-18.
TLB/XTLB Refill Exception Handling .....................................................................................................
194
6-19.
Cold Reset Exception Handling ............................................................................................................
196
6-20.
Soft Reset and NMI Exception Handling ...............................................................................................
197
7-1.
Logical Hierarchy of Memory ................................................................................................................
198
7-2.
On-chip Caches and Main Memory .......................................................................................................
199
7-3.
Instruction Cache Line Format ..............................................................................................................
200
7-4.
Data Cache Line Format .......................................................................................................................
201
7-5.
Cache Index and Data Output ...............................................................................................................
204
7-6.
Instruction Cache State Diagram ..........................................................................................................
206
7-7.
Data Cache State Diagram ...................................................................................................................
206
7-8.
Flow on Instruction Fetch ......................................................................................................................
207
7-9.
Flow on Load Operations ......................................................................................................................
208
7-10.
Flow on Store Operations .....................................................................................................................
209
7-11.
Flow on Index_Invalidate Operations ....................................................................................................
210
7-12.
Flow on Index_Writeback_Invalidate Operations ..................................................................................
211
7-13.
Flow on Index_Load_Tag Operations ...................................................................................................
211
7-14.
Flow on Index_Store_Tag Operations ...................................................................................................
212
7-15.
Flow on Create_Dirty Operations ..........................................................................................................
212
7-16.
Flow on Hit_Invalidate Operations ........................................................................................................
213
7-17.
Flow on Hit_Writeback_Invalidate Operations ......................................................................................
214
7-18.
Flow on Fill Operations .........................................................................................................................
215
7-19.
Flow on Hit_Writeback Operations .......................................................................................................
216
7-20.
Flow on Fetch_and_Lock Operations (VR4131 only) ............................................................................
217
7-21.
Writeback Flow .....................................................................................................................................
218
7-22.
Refill Flow ..............................................................................................................................................
218
7-23.
Writeback & Refill Flow .........................................................................................................................
219
8-1.
Non-maskable Interrupt Signal ..............................................................................................................
221
8-2.
Hardware Interrupt Signals ...................................................................................................................
222
8-3.
Masking of the Interrupt Request Signals .............................................................................................
223
9-1.
CPU Instruction Opcode Bit Encoding ..................................................................................................
383
14
User’s Manual U15509EJ2V0UM
LIST OF TABLES (1/2)
Table No.
Title
Page
1-1.
Comparison of Functions of VR4100 Series ...........................................................................................
18
1-2.
CP0 Registers ........................................................................................................................................
22
1-3.
List of Instructions Supported by VR Series Processors ........................................................................
32
35
2-1.
MACC Instructions (for VR4121, VR4122, VR4131, and VR4181A) .........................................................
2-2.
Product-Sum Operation Instructions (for VR4181) .................................................................................
35
2-3.
Power Mode Instructions .......................................................................................................................
35
2-4.
Number of Delay Slot Cycles Necessary for Load and Store Instructions .............................................
36
2-5.
Load/Store Instruction ............................................................................................................................
38
2-6.
Load/Store Instruction (Extended ISA) ..................................................................................................
39
2-7.
ALU Immediate Instruction ....................................................................................................................
40
2-8.
ALU Immediate Instruction (Extended ISA) ...........................................................................................
41
2-9.
Three-Operand Type Instruction ............................................................................................................
41
2-10.
Three-Operand Type Instruction (Extended ISA) ...................................................................................
42
2-11.
Shift Instruction ......................................................................................................................................
42
2-12.
Shift Instruction (Extended ISA) .............................................................................................................
43
2-13.
Multiply/Divide Instructions ....................................................................................................................
44
2-14.
Multiply/Divide Instructions (Extended ISA) ...........................................................................................
44
2-15.
Product-Sum Operation Instructions (for VR4121, VR4122, VR4131, and VR4181A) .............................
45
2-16.
Product-Sum Operation Instructions (for VR4181) .................................................................................
45
2-17.
Number of Stall Cycles in Multiply and Divide Instructions ....................................................................
46
2-18.
Jump Instructions ...................................................................................................................................
47
2-19.
Branch Instructions ................................................................................................................................
48
2-20.
Branch Instructions (Extended ISA) .......................................................................................................
49
2-21.
Special Instructions ................................................................................................................................
51
2-22.
Special Instructions (Extended ISA) ......................................................................................................
51
2-23.
System Control Coprocessor (CP0) Instructions ...................................................................................
52
3-1.
General-purpose Registers ....................................................................................................................
55
3-2.
Special Registers ...................................................................................................................................
56
3-3.
MIPS16 Instruction Set Outline ..............................................................................................................
57
3-4.
Field Definition .......................................................................................................................................
59
3-5.
Bit Encoding of Major Operation Code (op) ...........................................................................................
64
3-6.
RR Minor Operation Code (RR-Type Instruction) ..................................................................................
64
3-7.
RRR Minor Operation Code (RRR-Type Instruction) .............................................................................
65
3-8.
RRI-A Minor Operation Code (RRI-Type ADD Instruction) ....................................................................
65
3-9.
SHIFT Minor Operation Code (SHIFT-Type Instruction) ........................................................................
65
3-10.
I8 Minor Operation Code (I8-Type Instruction) .......................................................................................
65
3-11.
I64 Minor Operation Code (64-bit Only, I64-Type Instruction) ...............................................................
66
3-12.
Base PC Address Setting ......................................................................................................................
67
3-13.
Extendable MIPS16 Instructions ............................................................................................................
69
3-14.
Load and Store Instructions ...................................................................................................................
71
3-15.
ALU Immediate Instructions ...................................................................................................................
74
3-16.
Two-/Three-Operand Register Type ......................................................................................................
76
3-17.
Shift Instructions ....................................................................................................................................
78
User’s Manual U15509EJ2V0UM
15
LIST OF TABLES (2/2)
Table No.
Title
Page
3-18.
Multiply/Divide Instructions ....................................................................................................................
80
3-19.
Jump and Branch Instructions ..............................................................................................................
82
3-20.
Special Instructions ...............................................................................................................................
83
4-1.
Description of Pipeline Activities during Each Stage .............................................................................
103
4-2.
Correspondence of Pipeline Stage to Interlock and Exception Conditions ...........................................
117
4-3.
Pipeline Interlock ...................................................................................................................................
118
4-4.
Description of Pipeline Exception ..........................................................................................................
118
5-1.
User Mode Segments ...........................................................................................................................
134
5-2.
32-bit and 64-bit Supervisor Mode Segments .......................................................................................
137
5-3.
32-bit Kernel Mode Segments ...............................................................................................................
141
5-4.
64-bit Kernel Mode Segments ...............................................................................................................
143
5-5.
Cacheability and the xkphys Address Space ........................................................................................
144
5-6.
CP0 Registers .......................................................................................................................................
146
5-7.
Cache Algorithm ....................................................................................................................................
149
5-8.
Mask Values and Page Sizes ...............................................................................................................
149
5-9.
System Interface Clock Ratio (to PClock) .............................................................................................
154
5-10.
Instruction Cache Sizes ........................................................................................................................
155
5-11.
Data Cache Sizes .................................................................................................................................
155
6-1.
CP0 Registers .......................................................................................................................................
158
6-2.
Cause Register Exception Code Field ..................................................................................................
166
6-3.
32-Bit Mode Exception Vector Base Addresses ....................................................................................
174
6-4.
64-Bit Mode Exception Vector Base Addresses ....................................................................................
174
6-5.
Exception Priority Order ........................................................................................................................
175
7-1.
Cache Size, Line Size, and Index .........................................................................................................
204
9-1.
CPU Instruction Operation Notations ....................................................................................................
225
9-2.
Load and Store Common Functions .....................................................................................................
226
9-3.
Access Type Specifications for Loads/Stores .......................................................................................
227
11-1.
Coprocessor 0 Hazards ........................................................................................................................
422
11-2.
Calculation Example of CP0 Hazard and Number of Instructions Inserted ...........................................
426
16
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
This chapter gives an outline of the VR4121 (µPD30121), the VR4122 (µPD30122), the VR4131 (µPD30131), the
VR4181 (µPD30181), and the VR4181A (µPD30181A, 30181AY), which are 64-/32-bit RISC microprocessors. In this
manual, these products are referred to as the VR4100 Series.
1.1 Features
The VR4100 Series, which is a part of the RISC microprocessor VR Series, is a group of products developed for PDAs.
The VR Series is high-performance 64-/32-bit microprocessors employing the RISC (reduced instruction set computer)
TM
architecture developed by MIPS
manufactured by NEC.
The VR4100 Series accommodates the ultra low power consumption CPU core provided with cache memory, a
high-speed product-sum operation unit, and an address management unit. The VR4100 Series also has interface
units for the peripheral circuits required for battery-driven portable information equipment (refer to Hardware User's
Manual of each product for details about on-chip peripheral functions).
The features of the VR4100 Series are described below.
{ Employs 64-bit RISC core as a CPU
Possible to operate in 32-bit mode
{ Optimized instruction pipeline
{ On-chip cache memory
{ Employs write-back cache
Reduces store operations using system bus
{ Physical address space: 32 bits
Virtual address space: 40 bits
{ Translation lookaside buffer (TLB) with 32-double entries
{ Instruction set: MIPS III (however, the FPU, LL, LLD, SC, and SCD instructions are removed), MIPS16
{ Supports high-speed product-sum operation instructions
{ Effective power management features, which include the four modes of Fullspeed, Standby, Suspend, and
Hibernate
{ On-chip PLL and clock generator
{ Variable on-chip peripheral functions ideal for potable information equipment
The functions of the VR4100 Series are listed as follows.
User’s Manual U15509EJ2V0UM
17
CHAPTER 1 INTRODUCTION
Table 1-1. Comparison of Functions of VR4100 Series
Item
VR4121
Part number
µPD30121
CPU core
VR4120TM core
Instruction set
VR4122
µPD30122
VR4131
VR4181
VR4181A
µPD30131
µPD30181
µPD30181A,
30181AY
VR4130TM core
VR4110TM core
VR4120 core
MIPS I, II, III
MIPS I, II, III
MIPS I, II, III
+ high-speed product-sum (32-bit)
+ high-speed product-
+ high-speed product-
sum (16-bit)
+ MIPS16
sum (32-bit)
+ MIPS16
+ MIPS16
5-stage pipeline
5-/6-stage pipeline
Pipeline
5-/6-stage pipeline
On-chip cache
memory
• Instruction: 16KB
• Instruction: 32KB
• Instruction: 16KB
• Instruction: 4KB
• Instruction: 8KB
• Data: 8KB
• Data: 16KB
• Data: 16KB
• Data: 4KB
• Data: 8KB
• Direct map
• Direct map
• 2-way set-
• Direct map
• Direct map
2-way superscalar
6-/7-stage pipeline
associative
• With line lock
function
On-chip peripheral
functions
• Memory controller
• Memory controller
• Memory controller
• Memory controller
• Extension bus
• Extension bus interface (ISA, PCI)
• Extension bus
• Extension bus
interface (ISA)
interface (ISA)
• LCD interface
• LCD interface
• Touch panel
• Touch panel
interface (ISA)
• LCD interface
• Touch panel
interface
• Communication interface (UART, CSI, IrDA
(SIR, MIR, FIR))
• LED controller
• Timer, counter
• Keyboard interface
• Communication
interface (UART,
CSI, IrDA (SIR,
interface
interface
• General-purpose port
• Keyboard interface
• Keyboard interface
• Clock generator
• Communication
• Communication
• Power management unit
MIR, FIR))
interface (UART,
interface (UART,
CSI, IrDA (SIR))
CSI, I2C, IrDA (SIR))
• CompactFlash
• Modem interface
interface
• CompactFlash
interface
• AC97/I2S audio
• Audio interface
• Audio interface
• LED controller
• LED controller
• DMA controller
• DMA controller
• DMA controller
• Timer, counter
• Timer, counter
• USB host/function
• Watchdog timer
• Watchdog timer
• General-purpose
• General-purpose
port
port
interface
controller
• PWM generator
• Timer, counter
• Clock generator
• Clock generator
• Watchdog timer
• Power management
• Power management
• General-purpose
unit
unit
port
• A/D converter
• A/D converter
• Clock generator
• D/A converter
• D/A converter
• Power management
unit
• A/D converter
• D/A converter
Other functions
−
• On-chip branch prediction function
• On-chip hardware debug function
−
• On-chip branch
prediction function
• On-chip hardware
debug function
18
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
1.2 CPU Core
Figure 1-1 shows the internal block diagram of the CPU core.
In addition to the conventional high-performance integer operation units, this CPU core has a full-associative
format translation lookaside buffer (TLB), which has 32 entries that provide mapping to 2-page pairs for one entry.
Moreover, it also has instruction and data caches, and a bus interface.
Figure 1-1. CPU Core Internal Block Diagram
Virtual address bus
Internal data bus
Control (o)
Bus
interface
Data cache
Instruction
cache
CP0
CPU
Control (i)
Address/data (o)
TLB
Address/data (i)
Clock
generator
Internal clock
(1) CPU
CPU is a block that performs integer calculations. This block includes a 64-bit integer data path, and productsum operator.
(2) Coprocessor 0 (CP0)
CP0 incorporates a memory management unit (MMU) and exception handling function.
The MMU checks
whether there is an access between different memory segments (user, supervisor, and kernel) by executing
address conversion. The translation lookaside buffer (TLB) converts virtual addresses to physical addresses.
(3) Instruction cache
The instruction cache employs virtual index and physical tag formats. It is managed with direct mapping format
in the VR4121, VR4122, VR4181, and VR4181A, or with 2-way set-associative format in the VR4131.
(4) Data cache
The data cache employs virtual index, physical tag, and writeback formats. It is managed with direct mapping
format in the VR4121, VR4122, VR4181, and VR4181A, or with 2-way set-associative format in the VR4131.
User’s Manual U15509EJ2V0UM
19
CHAPTER 1 INTRODUCTION
(5) CPU bus interface
The bus interface controls data transmission/reception between the CPU core and peripheral units. The bus
interface consists of two 32-bit multiplexed address/data buses (one for input, and the other for output), clock
signals, interrupt request signals, and various other control signals.
(6) Clock generator
The clock generator processes clock inputs and supplies them to internal units.
1.2.1 CPU registers
The CPU core has thirty-two 64-bit general-purpose registers (GPR).
In addition, it provides the following special registers:
• PC: Program counter (64 bits)
• HI register: Contains the integer multiply and divide higher doubleword result (64 bits)
• LO register: Contains the integer multiply and divide lower doubleword result (64 bits)
Two of the general-purpose registers are assigned the following functions:
• r0 is fixed to 0, and can be used as the target register for any instruction whose result is to be discarded. r0
can also be used as a source register when a zero value is needed.
• r31 is the link register used by link instructions such as JAL (jump and link) instructions. This register can be
used for other instructions. However, be careful that use of the register by a link instruction will not coincide
with use of the register for other operations.
The register group is provided within the CP0 (system control coprocessor), to process exceptions and to manage
addresses.
CPU registers can operate as either 32-bit or 64-bit registers, depending on the processor operation mode.
The operation of the CPU register differs depending on what instructions are executed: 32-bit instructions or
MIPS16 instructions. For details, refer to CHAPTER 3 MIPS16 INSTRUCTION SET.
The VR4100 Series processors have no program status word (PSW) register as such; this is covered by the
Status and Cause registers incorporated within the system control coprocessor (CP0). For details of CP0 registers,
refer to Table 1-2 CP0 Registers.
Figure 1-2 shows the CPU registers.
20
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
Figure 1-2. CPU Registers
General-purpose registers
63
0
Multiply and divide registers
63
r0 = 0
0
HI
r1
r2
0
63
LO
r29
Program counter
63
r30
r31 = Link address
0
PC
1.2.2 Coprocessors
MIPS ISA defines 4 types of coprocessors (CP0 to CP3).
• CP0 translates virtual addresses to physical addresses, switches the operating mode (Kernel, Supervisor, or
User mode), and manages exceptions. It also controls the cache subsystem to analyze a cause and to return
from the error state.
• CP1 is reserved for floating-point instructions.
• CP2 is reserved for future definition by MIPS.
• CP3 is no longer defined. CP3 instructions are reserved for future extensions.
The VR4100 Series implements the CP0 only.
1.2.3 System control coprocessor (CP0)
CP0 translates virtual addresses to physical addresses, switches the operating mode, controls the cache
memory, and manages exceptions. For detailed descriptions of these functions, refer to CHAPTER 5 MEMORY
MANAGEMENT SYSTEM and CHAPTER 6 EXCEPTION PROCESSING.
CP0 has thirty-two registers that have corresponding register number.
The register number is used as an
operand of instructions to specify a CP0 register to be accessed. Table 1-2 shows simple descriptions of each
register.
User’s Manual U15509EJ2V0UM
21
CHAPTER 1 INTRODUCTION
Table 1-2. CP0 Registers
Register
Register Name
Usage
Description
Number
0
Index
Memory management
Programmable pointer to TLB array
1
Random
Memory management
Pseudo-random pointer to TLB array (read only)
2
EntryLo0
Memory management
Lower half of TLB entry for even VPN
3
EntryLo1
Memory management
Lower half of TLB entry for odd VPN
4
Context
Exception processing
Pointer to virtual PTE table in 32-bit mode
5
PageMask
Memory management
Page size specification
6
Wired
Memory management
Number of wired TLB entries
7
−
−
8
BadVAddr
Exception processing
Virtual address where the most recent error occurred
9
Count
Exception processing
Timer count
10
EntryHi
Memory management
Upper half of TLB entry (including ASID)
11
Compare
Exception processing
Timer compare value
12
Status
Exception processing
Operation status
13
Cause
Exception processing
Cause of last exception
14
EPC
Exception processing
Exception program counter
15
PRId
Memory management
Processor revision identifier
16
Config
Memory management
Memory mode system specification
Note1
Reserved for future use
17
LLAddr
Memory management
Physical address for diagnostic purpose
18
WatchLo
Exception processing
Memory reference trap address lower bits
19
WatchHi
Exception processing
Memory reference trap address higher bits
20
Xcontext
Exception processing
Pointer to virtual PTE table in 64-bit mode
−
21 to 25
−
Note2
Reserved for future use
26
Parity Error
Exception processing
Cache parity bits
27
Cache ErrorNote2
Exception processing
Index and status of cache error
28
TagLo
Memory management
Cache tag register (low)
29
TagHi
Memory management
Cache tag register (high)
30
ErrorEPC
Exception processing
Error exception program counter
31
−
−
Reserved for future use
Notes 1. This register is defined to maintain compatibility with the VR4000
TM
TM
and VR4400 . The contents of this
register are meaningless in the normal operation.
TM
2. This register is defined to maintain compatibility with the VR4100 . This register is not used in the normal
operation.
Caution When accessing the CP0 registers, some instructions require consideration of the interval time
until the next instruction is executed, because there is a delay from when the contents of the CP0
register change to when this change is reflected in the CPU operation. This time lag is called a CP0
hazard. For details, refer to CHAPTER 11 COPROCESSOR 0 HAZARDS.
22
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
1.2.4 Floating-point unit (FPU)
The VR4100 Series does not support the floating-point unit (FPU). A coprocessor unusable exception will occur if
any FPU instructions are executed. If necessary, FPU instructions should be emulated by software in an exception
handler.
1.2.5 Cache memory
The VR4100 Series incorporates instruction and data caches, which are independent of each other. This
configuration enables high-performance pipeline operations. Both caches have a 64-bit data bus, enabling a oneclock access. These buses can be accessed in parallel.
The caches are managed with direct mapping format in the VR4121, VR4122, VR4181, and VR4181A, or with 2way set-associative format in the VR4131. The data cache of the VR4131 has also the line lock function.
A detailed description of caches is given in CHAPETER 7 CACHE MEMORY.
1.3 CPU Instruction Set Overview
There are two types of CPU instructions: 32-bit length instructions (MIPS III) and 16-bit length instructions
(MIPS16). Use of the MIPS16 instructions is enabled or disabled by setting MIPS16EN pin during a reset.
(1) MIPS III instructions
All the CPU instructions are 32-bit length when executing MIPS III instructions, and they are classified into three
instruction formats as shown in Figure 1-3: immediate (I type), jump (J type), and register (R type). The fields of
each instruction format are described in CHAPTER 2 CPU INSTRUCTION SET SUMMARY.
Figure 1-3. CPU Instruction Formats (32-bit Length Instruction)
31
I - type (Immediate)
26 25
op
31
J - type (Jump)
16 15
0
rt
immediate
26 25
0
op
31
R - type (Register)
21 20
rs
target
26 25
op
21 20
rs
16 15
rt
User’s Manual U15509EJ2V0UM
11 10
rd
6 5
sa
0
funct
23
CHAPTER 1 INTRODUCTION
The instruction set can be further divided into the following five groupings:
(a) Load and store instructions move data between the memory and the general-purpose registers. They are all
immediate (I-type) instructions, since the only addressing mode supported is base register plus 16-bit,
signed immediate offset.
(b) Computational instructions perform arithmetic, logical, shift, and multiply and divide operations on values in
registers. They include R-type (in which both the operands and the result are stored in registers) and I-type
(in which one operand is a 16-bit signed immediate value) formats.
(c) Jump and branch instructions change the control flow of a program. Jumps are made either to an absolute
address formed by combining a 26-bit target address with the higher bits of the program counter (J-type
format) or register-specified address (R-type format).
The format of the branch instructions is I type.
Branches have 16-bit offsets relative to the program counter. JAL instructions save their return address in
register 31.
(d) System control coprocessor (CP0) instructions perform operations on CP0 registers to control the memorymanagement and exception-handling facilities of the processor.
(e) Special instructions perform system calls and breakpoint exceptions, or cause a branch to the general
exception-handling vector based upon the result of a comparison. These instructions occur in both R-type
and I-type formats.
For the operation of each instruction, refer to CHAPTER 2 CPU INSTRUCTION SET SUMMARY and CHAPTER
9 CPU INSTRUCTION SET DETAILS.
(2) Additional instructions
All the sum-of-products instructions and power mode instructions are 32-bit length.
(3) MIPS16 instructions
All the CPU instructions except for JAL and JALX are 16-bit length when executing MIPS16 instructions, and they
are classified into thirteen instruction formats as shown in Figure 1-4.
The fields of each instruction format are described in CHAPTER 3 MIPS 16 INSTRUCTION SET.
24
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
Figure 1-4. CPU Instruction Formats (16-bit Length Instruction)
15
I-type
11 10
op
15
RI-type
immediate
11 10
op
15
RR-type
15
Shift-type
15
I8-type
15
I8_MOVR32-type
I64-type
3 2
5 4
r32(4:3)
r32(2:0)
0
rz
8 7
0
funct
11 10
I64
0
r32(4:0)
8 7
11 10
15
5 4
ry
funct
I64
F
immediate
8 7
11 10
15
Shamt
0
0
funct
I8
2 1
8 7
11 10
15
I8_MOV32R-type
0
immediate
5 4
funct
I8
F
F
ry
0
3
4
ry
rx
I8
rz
5
8 7
11 10
2 1
ry
rx
SHIFT
immediate
5 4
8 7
11 10
0
ry
8 7
11 10
15
5 4
rx
RRI-A
funct
8 7
11 10
0
ry
rx
RRR
RRI-A-type
5 4
rx
RRI
RRR-type
immediate
8 7
11 10
15
0
rx
op
RRI-type
8 7
11 10
15
RI64-type
0
immediate
8 7
funct
5 4
ry
0
immediate
JAL/JALX-type
31
11 10 9
16 15
Immediate(15:0)
JAL
User’s Manual U15509EJ2V0UM
X
5 4
0
Immediate(20:16) Immediate(25:21)
25
CHAPTER 1 INTRODUCTION
The instruction set can be further divided into the following four groupings:
(a) Load and store instructions move data between memory and general-purpose registers. They include RRI,
RI, I8, and RI64 types.
(b) Computational instructions perform arithmetic, logical, shift, and multiply and divide operations on values in
registers. They include RI-, RRIA, I8, RI64, I64, RR, RRR, I8_MOVR32, and I8_MOV32R types.
(c) Jump and branch instructions change the control flow of a program. They include JAL/JALX, RR, RI, I8, and I
types.
(d) Special instructions are SYSCALL, BREAK, and Extend instructions. The SYSCALL and BREAK instructions
transfer control to an exception handler. The Extend instruction extends the immediate field of the next
instruction. They are RR and I types. When extending the immediate field of the next instruction by using the
Extend instruction, one cycle is needed for executing the Extend instruction, and another cycle is needed for
executing the next instruction.
For more details of each instruction’s operation, refer to CHAPTER 3
MIPS16 INSTRUCTION SET and
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT.
1.4 Data Formats and Addressing
The VR4100 Series uses the following four data formats:
• Doubleword (64 bits)
• Word (32 bits)
• Halfword (16 bits)
• Byte (8 bits)
In the CPU core, if the data format is any one of halfword, word, or doubleword, the byte ordering can be set as
either big endian or little endian. In the VR4131, the setting of BIGENDIAN pin during a reset decides which byte
order is used. The VR4121, VR4122, VR4181, and VR4181A only support the little-endian order.
Endianness refers to the location of byte 0 within the multi-byte data structure. Figures 1-5 and 1-6 show the
configuration.
When configured as a big-endian system, byte 0 is always the most-significant (leftmost) byte, which is
compatible with MC68000TM and IBM370TM conventions.
When configured as a little-endian system, byte 0 is always the least-significant (rightmost) byte, which is
compatible with PentiumTM and DEC VAXTM conventions.
In this manual, bit designations are always little endian.
26
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
Figure 1-5. Byte Address in Big-Endian Byte Order
(a) Word data
31
High-order
address
Low-order
address
24 23
16 15
8 7
0
Word
address
12
13
14
15
12
8
9
10
11
8
4
5
6
7
4
0
1
2
3
0
(b) Doubleword data
Word
Halfword
63
High-order
address
Low-order
address
32 31
Byte
16 15
87
0
Doubleword
address
16
17
18
19
20
21
22
23
16
8
9
10
11
12
13
14
15
8
0
1
2
3
4
5
6
7
0
Remarks 1. The highest byte is the lowest address.
2. The address of word data is specified by the highest byte’s address.
User’s Manual U15509EJ2V0UM
27
CHAPTER 1 INTRODUCTION
Figure 1-6. Byte Address in Little-Endian Byte Order
(a) Word data
31
High-order
address
Low-order
address
24 23
16 15
8 7
0
Word
address
15
14
13
12
12
11
10
9
8
8
7
6
5
4
4
3
2
1
0
0
(b) Doubleword data
Word
Halfword
63
High-order
address
Low-order
address
32 31
Byte
16 15
87
0
Doubleword
address
23
22
21
20
19
18
17
16
16
15
14
13
12
11
10
9
8
8
7
6
5
4
3
2
1
0
0
Remarks 1. The lowest byte is the lowest address.
2. The address of word data is specified by the lowest byte’s address.
28
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
The CPU core uses the following byte boundaries for halfword, word, and doubleword accesses:
• Halfword:
An even byte boundary (0, 2, 4...)
• Word:
A byte boundary divisible by four (0, 4, 8...)
• Doubleword: A byte boundary divisible by eight (0, 8, 16...)
The following special instructions are used to load and store data that are not aligned on 4-byte (word) or 8-byte
(doubleword) boundaries:
• Word access:
LWL, LWR, SWL, SWR
• Doubleword access:
LDL, LDR, SDL, SDR
These instructions are used in pairs of L and R.
Accessing misaligned data requires one additional instruction cycle (1 PCycle) over that required for accessing
aligned data.
Figure 1-7 shows the access of a misaligned word that has byte address 3.
Figure 1-7. Misaligned Word Accessing (Little-Endian)
31
24 23
High-order address
Low-order address
16 15
6
8 7
5
0
4
3
Caution In the VR4131, data transfer to the internal I/O (register) space or to the PCI bus is performed with
data converted to little endian even during operation in big-endian mode. Therefore, the following
restrictions apply for access to these address spaces.
• Do not perform 3-byte access. When 3-byte access is executed, data is undefined.
• When 8-byte access is executed, the order of higher word and lower word is reversed.
• Do not use the LWR, LWL, LDR, and LDL instructions. Access by the LWR, LWL, LDR, or LDL
instruction causes erroneous data to be loaded.
User’s Manual U15509EJ2V0UM
29
CHAPTER 1 INTRODUCTION
1.5 Memory Management System
The VR4100 Series has a 32-bit physical addressing range of 4 GB. However, since it is rare for systems to
implement a physical memory space as large as that memory space, the CPU provides a logical expansion of
memory space by translating addresses composed in the large virtual address space into available physical memory
addresses.
A detailed description of these address spaces is given in CHAPTER 5 MEMORY MANAGEMENT SYSTEM.
1.5.1 Translation lookaside buffer (TLB)
Virtual memory mapping is performed using the translation lookaside buffer (TLB). The TLB converts virtual
addresses to physical addresses. It runs by a full-associative method and has 32 entries, each mapping a pair of
two consecutive pages. The page size is variable between 1 KB and 256 KB, in powers of 4.
(1) Joint TLB (JTLB)
The JTLB holds both instruction and data addresses.
For fast virtual-to-physical address decoding, the VR4100 Series uses a large, fully associative TLB (joint TLB)
that translates 64 virtual pages to their corresponding physical addresses. The TLB is organized as 32 pairs of
even-odd entries, and maps a virtual address and address space identifier (ASID) into the 4 GB physical
address space.
The page size can be configured, on a per-entry basis, to map a page size of 1 KB to 256 KB. A CP0 register
stores the size of the page to be mapped, and that size is entered into the TLB when a new entry is written.
Thus, operating systems can provide special purpose maps; for example, a typical frame buffer can be memorymapped using only one TLB entry.
Translating a virtual address to a physical address begins by comparing the virtual address from the processor
with the physical addresses in the TLB; there is a match when the virtual page number (VPN) of the address is
the same as the VPN field of the entry, and either the global (G) bit of the TLB entry is set, or the ASID field of
the virtual address is the same as the ASID field of the TLB entry.
This match is referred to as a TLB hit. If there is no match, a TLB miss exception is taken by the processor and
software is allowed to refill the TLB from a page table of virtual/physical addresses in memory.
1.5.2 Processor modes
(1) Operating modes
The VR4100 Series has three operating modes, User, Supervisor, and Kernel. The manner in which memory
addresses are mapped depends on these operating modes. Refer to CHAPTER 5 MEMORY MANAGEMENT
SYSTEM for details.
(2) Addressing modes
The VR4100 Series has two addressing modes, 64-bit and 32-bit. The manner in which memory addresses are
translated or mapped depends on these operating modes. Refer to CHAPTER 5 MEMORY MANAGEMENT
SYSTEM for details.
30
User’s Manual U15509EJ2V0UM
CHAPTER 1 INTRODUCTION
1.6 Instruction Pipeline
The VR4100 Series has a 5- to 7-stage instruction pipeline.
In the VR4121, VR4122, VR4181, and VR4181A, one instruction is issued each cycle under normal circumstances.
The VR4131 employs a 2-way superscalar mechanism so that two instructions can be executed simultaneously.
A detailed description of the pipeline is provided in CHAPTER 4 PIPELINE.
1.6.1 Branch prediction
The VR4122, VR4131, and VR4181A have a branch prediction mechanism to speed up branch operations. These
processors have a branch prediction table that holds branch instructions whose conditions were satisfied in the past,
and the target addresses of the instructions. If an instruction that is the same as the fetched instruction is in this
table (hit), execution branches without delay. If the corresponding branch instruction is not in the branch prediction
table (miss), the address of that instruction is loaded to the branch prediction table and then execution branches.
For the operations when a hit or a miss occurs, refer to CHAPTER 4 PIPELINE.
If the BP bit of the Config register of CP0 is cleared, branch prediction is performed. It is not performed if the BP
bit is set (1) or in the MIPS16 instruction mode.
User’s Manual U15509EJ2V0UM
31
CHAPTER 1 INTRODUCTION
1.7 Code Compatibility
The CPU cores of the VR4100 Series are designed in consideration of the program compatibility to other VRSeries processors. However since they have some differences from other processors on their architecture, they
cannot necessarily execute all programs that can be executed in other VR-Series processors, and also other VRSeries processors cannot necessarily execute all programs that can be executed in the VR4100 Series.
Matters that should be paid attention to when porting programs between the VR4100 Series and other VR-Series
processors are listed below.
• A 16-bit length MIPS16 instruction set is added in the VR4100 Series.
• Multiply-add instructions are added in the VR4100 Series.
• Instructions for power modes (HIBERNATE, STANDBY, SUSPEND) are added in the VR4100 Series to
support power modes.
• Operations to lock a cache are added to the CACHE instruction in the VR4131.
• The VR4100 Series does not support floating-point instructions since it has no Floating-Point Unit (FPU).
• The VR4100 Series does not have the LL bit to perform synchronization of multiprocessing. Therefore, it does
not support instructions that manipulate the LL bit (LL, LLD, SC, SCD).
• The CP0 hazards of the VR4100 Series are equally or less stringent than those of the VR4000 (see Chapter 11
for details).
For more information about each instruction, refer to Chapters 9 and 3, and user's manuals of each product other
than the VR4100 Series.
Instructions supported by each of the VR Series processors are listed below.
Table 1-3. List of Instructions Supported by VR Series Processors
Products
Supported instructions
VR4300 TM
VR5000 TM
VR5432 TM
VR10000TM
VR4122
VR4305 TM
VR5000A TM
VR5500 TM
VR12000TM
VR4181
VR4310 TM
VR4121
VR4131
VR4181A
MIPS I
A
A
A
A
A
A
MIPS II
A
A
A
A
A
A
MIPS III
A
A
A
A
A
A
N/A
N/A
A
A
A
A
MIPS IV
N/A
N/A
N/A
A
A
A
MIPS16
A
A
N/A
N/A
N/A
N/A
Multiply-add
A
A
N/A
N/A
A
N/A
Floating-point operation
N/A
N/A
A
A
A
A
Power mode transition
A
A
N/A
A
A
N/A
LL bit
manipulation
(VR5500)
32
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
This chapter is an overview of the CPU instruction set; refer to CHAPTER 9 CPU INSTRUCTION SET DETAILS
for detailed descriptions of individual CPU instructions.
2.1 Instruction Set Architecture
In the MIPS Instruction Set Architecture (ISA), five levels of instruction sets, from MIPS I through MIPS V, are
currently defined. An instruction set of larger level number includes that of smaller level number. In other words, a
processor implementing the MIPS IV instruction set is able to run MIPS I, MIPS II, or MIPS III binary programs without
change.
There are another instruction sets called ASE, Application-Specific Extension, that extend functions for specific
applications and MIPS16 is the one currently defined (refer to CHAPTER 3 MIPS16 INSTRUCTION SET for details).
The VR4100 Series implements MIPS III and MIPS16 instruction sets except for the following instructions:
(1) Synchronization support instructions
The VR4100 Series does not support a multiprocessor operating environment. Thus the instructions to support
synchronization of memory update defined in the MIPS II and MIPS III ISA - the load linked and store conditional
instructions - cause reserved instruction exception. The load link (LL) bit is eliminated.
Remark The SYNC instruction is handled as a NOP instruction since all load/store instructions in this processor
are executed in program order.
(2) Floating-point operation instructions
The VR4100 Series does not incorporate a floating-point unit (FPU).
Thus the FPU instructions cause a
coprocessor unusable exception. FPU instructions should be emulated by software in an exception handler if
necessary.
User’s Manual U15509EJ2V0UM
33
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.2 CPU Instruction Formats
Each MIPS III ISA CPU instruction consists of a single 32-bit word, aligned on a word boundary. There are three
instruction formats - immediate (I-type), jump (J-type), and register (R-type) - as shown in Figure 2-1. The use of a
small number of instruction formats simplifies instruction decoding, allowing the compiler to synthesize more
complicated and less frequently used instruction and addressing modes from these three formats as needed.
Figure 2-1. CPU Instruction Formats
31
I-type (immediate)
26 25
rs
op
31
J-type (jump)
21 20
16 15
rt
immediate
26 25
0
op
31
R-type (register)
0
target
26 25
op
21 20
rs
16 15
rt
11 10
rd
65
sa
0
funct
op:
6-bit operation code
rs:
5-bit source register specifier
rt:
5-bit target (source/destination) register specifier or branch condition
immediate: 16-bit immediate value, branch displacement, or address displacement
34
target:
26-bit unconditional branch target address
rd:
5-bit destination register specifier
sa:
5-bit shift amount
func:
6-bit function field
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.3 Instructions Added in the VR4100 Series
In the VR4100 Series, instructions such as power mode instructions or product-sum operation instructions, which
are suitable for potable information equipment and multimedia field, are added. These instructions are not included
in the standard MIPS III instruction set.
2.3.1 Product-sum operation instructions
These instructions add a value in an accumulator to the result of multiplication and store it into a destination
register, using the HI register and LO register as an accumulator. A 64-bit accumulator consists of the low-order 32
bits of the HI register as high-order bits and the low-order 32 bits of the LO register as low-order bits. No overflow or
no underflow occurs by executing these instructions, and therefore, no exception occurs.
Of product-sum operation instructions, those that perform saturation processing or store data into a generalpurpose register by specifying options are called MACC instructions.
Table 2-1. MACC Instructions (for VR4121, VR4122, VR4131, and VR4181A)
Instruction
Definition
MACC
Multiply and Add Accumulate
DMACC
Doubleword Multiply and Add Accumulate
Table 2-2. Product-Sum Operation Instructions (for VR4181)
Instruction
Definition
MADD16
Multiply and Add 16-bit Integer
DMADD16
Doubleowrd Multiply and Add 16-bit Integer
2.3.2 Power mode instructions
These instructions stop the internal clock of the processor and set the processor in a low power consumption
mode. Three low power consumption modes are available, each of which can be set by a dedicated instruction.
Table 2-3. Power Mode Instructions
Instruction
Definition
STANDBY
Standby
SUSPEND
Suspend
HIBERNATE
Hibernate
User’s Manual U15509EJ2V0UM
35
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.4 Instruction Overview
The CPU instructions are classified into five classes. The product-sum operation instructions and power mode
instructions added in the VR4100 Series are also included in one of the five classes.
2.4.1 Load and store instructions
Loads and stores are immediate (I-type) instructions that move data between memory and the general-purpose
registers. The only addressing mode that load and store instructions directly support is base register plus 16-bit
signed immediate offset.
Tables 2-5 and 2-6 list the ISA-defined load/store instructions and extended-ISA instructions, respectively.
(1) Scheduling a load delay slot
A load instruction that does not allow its result to be used by the instruction immediately following is called a
delayed load instruction. The instruction slot immediately following this delayed load instruction is referred to as
the load delay slot.
In the VR4100 Series, a load instruction can be followed directly by an instruction that accesses a register that is
loaded by the load instruction. In this case, however, an interlock occurs for a necessary number of cycles. Any
instruction can follow a load instruction, but the load delay slot should be scheduled appropriately for both
performance and compatibility with the VR Series microprocessors. For detail, see CHAPTER 4 PIPELINE.
(2) Store delay slot
When a store instruction is writing data to a cache, the data cache is kept busy at the DC and WB stages. If an
instruction (such as load) that follows directly the store instruction accesses the data cache in the DC stage, a
hardware-driven interlock occurs. To overcome this problem, the store delay slot should be scheduled.
Table 2-4. Number of Delay Slot Cycles Necessary for Load and Store Instructions
Instruction
Necessary number of PCycles
Load
1
Store
1
(3) Defining access types
Access type indicates the size of a processor data item to be loaded or stored, set by the load or store instruction
opcode. Access types and accessed byte are shown in Figure 2-2.
Regardless of access type or byte ordering (endianness), the address given specifies the least significant byte in
the addressed field. For a big-endian configuration, the high-order byte is the least-significant byte, and for a
little-endian configuration the low-order byte.
The access type, together with the three low-order bits of the address, defines the bytes accessed within the
addressed doubleword (shown in Figure 2-2). Only the combinations shown in Figure 2-2 are permissible; other
combinations cause address error exceptions.
36
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Figure 2-2. Byte Specification Related to Load and Store Instructions
Access type
(value)
Low-order
address
bits
Accessed byte
Accessed byte
(big-endian)
(little-endian)
2
1
0
63
Doubleword (7)
0
0
0
0
1
2
3
4
5
6
7-byte (6)
0
0
0
0
1
2
3
4
5
6
0
0
1
1
2
3
4
5
6
0
0
0
1
2
3
4
5
0
1
0
2
3
4
5
0
0
0
2
3
4
0
1
1
3
4
0
0
0
1
0
0
0
0
0
0
0
1
1
0
0
1
0
1
0
0
0
0
1
0
1
0
0
1
1
0
0
0
0
0
0
1
0
1
0
0
1
1
1
0
0
1
0
1
1
1
0
1
1
1
6-byte (5)
5-byte (4)
Word (3)
Triple byte (2)
Halfword (1)
Byte (0)
0
0
0
1
1
2
1
2
1
2
7
7
7
7
7
7
7
7
0
6
5
4
3
2
1
0
6
5
4
3
2
1
0
6
5
4
3
2
1
5
4
3
2
1
0
5
4
3
2
4
3
2
1
0
4
3
2
1
0
2
1
0
2
1
6
6
5
3
5
6
7
7
6
5
4
3
3
4
0
6
63
3
4
0
5
6
0
5
6
5
6
7
7
6
5
6
5
4
1
1
2
3
3
4
5
5
6
7
7
0
2
4
6
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
Remark The big-endian order is supported by the VR4131 only.
User’s Manual U15509EJ2V0UM
37
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-5. Load/Store Instruction
Instruction
Format and Description
op
base
rt
offset
Load Byte
LB rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The bytes of the memory location specified by the address are sign extended and loaded into register rt.
Load Byte Unsigned
LBU rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The bytes of the memory location specified by the address are zero extended and loaded into register rt.
Load Halfword
LH rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The halfword of the memory location specified by the address is sign extended and loaded to register rt.
Load Halfword
Unsigned
LHU rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The halfword of the memory location specified by the address is zero extended and loaded to register rt.
Load Word
LW rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The word of the memory location specified by the address is sign extended and loaded to register rt. In the
64-bit mode, it is further sign extended to 64 bits.
Load Word Left
LWL rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the left the word whose address is specified so that the address-specified byte is at the leftmost position of the word. The result of the shift operation is merged with the contents of register rt
and loaded to register rt. In the 64-bit mode, it is further sign extended to 64 bits.
Load Word Right
LWR rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the right the word whose address is specified so that the address-specified byte is at the rightmost position of the word. The result of the shift operation is merged with the contents of register rt and
loaded to register rt. In the 64-bit mode, it is further sign extended to 64 bits.
Store Byte
SB rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The least significant byte of register rt is stored to the memory location specified by the address.
Store Halfword
SH rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The least significant halfword of register rt is stored to the memory location specified by the address.
Store Word
SW rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The lower word of register rt is stored to the memory location specified by the address.
Store Word Left
SWL rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the right the contents of register rt so that the left-most byte of the word is in the position of the
address-specified byte. The result is stored to the lower word in memory.
Store Word Right
SWR rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the left the contents of register rt so that the right-most byte of the word is in the position of the
address-specified byte. The result is stored to the upper word in memory.
38
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-6. Load/Store Instruction (Extended ISA)
Instruction
Format and Description
op
base
rt
offset
Load Doubleword
LD rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The doubleword of the memory location specified by the address are loaded into register rt.
Load Doubleword Left
LDL rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the left the double word whose address is specified so that the address-specified byte is at the
left-most position of the double word. The result of the shift operation is merged with the contents of
register rt and loaded to register rt.
Load Doubleword
Right
LDR rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the right the double word whose address is specified so that the address-specified byte is at
the right-most position of the double word. The result of the shift operation is merged with the contents
of register rt and loaded to register rt.
Load Word Unsigned
LWU rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The word of the memory location specified by the address are zero extended and loaded into register rt
Store Doubleword
SD rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
The contents of register rt are stored to the memory location specified by the address.
Store Doubleword Left
SDL rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the right the contents of register rt so that the left-most byte of the double word is in the
position of the address-specified byte. The result is stored to the lower doubleword in memory.
Store Doubleword
Right
SDR rt, offset (base)
The offset is sign extended and then added to the contents of the register base to form the virtual address.
Shifts to the left the contents of register rt so that the right-most byte of the double word is in the
position of the address-specified byte. The result is stored to the upper doubleword in memory.
User’s Manual U15509EJ2V0UM
39
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.4.2 Computational instructions
Computational instructions perform arithmetic, logical, and shift operations on values in registers. Computational
instructions can be either in register (R-type) format, in which both operands are registers, or in immediate (I-type)
format, in which one operand is a 16-bit immediate.
Computational instructions are classified as:
(1) ALU immediate instructions
(2) Three-operand type instructions
(3) Shift instructions
(4) Multiply/divide instructions
In addition, product-sum operation instructions are added in the VR4100 Series.
To maintain data compatibility between the 64- and 32-bit modes, it is necessary to sign-extend 32-bit operands
correctly. If the sign extension is not correct, the 32-bit operation result is meaningless.
Table 2-7. ALU Immediate Instruction
Instruction
Format and Description
op
rs
rt
immediate
Add Immediate
ADDI rt, rs, immediate
The 16-bit immediate is sign extended and then added to the contents of register rs to form a 32-bit
result. The result is stored into register rt. In the 64-bit mode, the operand must be sign extended. An
exception occurs on the generation of 2’s complement overflow.
Add Immediate
Unsigned
ADDIU rt, rs, immediate
The 16-bit immediate is sign extended and then added to the contents of register rs to form a 32-bit
result. The result is stored into register rt. In the 64-bit mode, the operand must be sign extended. No
exception occurs on the generation of integer overflow.
Set On Less Than
Immediate
SLTI rt, rs, immediate
The 16-bit immediate is sign extended and then compared to the contents of register rt treating both
operands as signed integers. If rs is less than the immediate, the result is set to 1; otherwise, the result
is set to 0. The result is stored to register rt.
Set On Less Than
Immediate Unsigned
SLTIU rt, rs, immediate
The 16-bit immediate is sign extended and then compared to the contents of register rt treating both
operands as unsigned integers. If rs is less than the immediate, the result is set to 1; otherwise, the
result is set to 0. The result is stored to register rt.
AND Immediate
ANDI rt, rs, immediate
The 16-bit immediate is zero extended and then ANDed with the contents of the register. The result is
stored into register rt.
OR Immediate
ORI rt, rs, immediate
The 16-bit immediate is zero extended and then ORed with the contents of the register. The result is
stored into register rt.
Exclusive OR
Immediate
XORI rt, rs, immediate
The 16-bit immediate is zero extended and then Ex-ORed with the contents of the register. The result
is stored into register rt.
Load Upper
Immediate
LUI rt, immediate
The 16-bit immediate is shifted left by 16 bits to set the lower 16 bits of word to 0. The result is stored
into register rt. In the 64-bit mode, the operand must be sign extended.
40
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-8. ALU Immediate Instruction (Extended ISA)
Instruction
Format and Description
op
rs
rt
immediate
Doubleword Add
Immediate
DADDI rt, rs, immediate
The 16-bit immediate is sign extended to 64 bits and then added to the contents of register rs to form a
64-bit result. The result is stored into register rt.
An exception occurs on the generation of integer overflow.
Doubleword Add
Immediate Unsigned
DADDIU rt, rs, immediate
The 16-bit immediate is sign extended to 64 bits and then added to the contents of register rs to form a
64-bit result. The result is stored into register rt.
No exception occurs on the generation of overflow.
Table 2-9. Three-Operand Type Instruction
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Add
ADD rd, rs, rt
The contents of registers rs and rt are added together to form a 32-bit result. The result is stored into
register rd. In the 64-bit mode, the operand must be sign extended. An exception occurs on the
generation of integer overflow.
Add Unsigned
ADDU rd, rs, rt
The contents of registers rs and rt are added together to form a 32-bit result. The result is stored into
register rd. In the 64-bit mode, the operand must be sign extended. No exception occurs on the
generation of integer overflow.
Subtract
SUB rd, rs, rt
The contents of register rt are subtracted from the contents of register rs. The 32-bit result is stored
into register rd. In the 64-bit mode, the operand must be sign extended. An exception occurs on the
generation of integer overflow.
Subtract Unsigned
SUBU rd, rs, rt
The contents of register rt are subtracted from the contents of register rs. The 32-bit result is stored
into register rd. In the 64-bit mode, the operand must be sign extended. No exception occurs on the
generation of integer overflow.
Set On Less Than
SLT rd, rs, rt
The contents of registers rs and rt are compared, treating both operands as signed integers. If the
contents of register rs is less than that of register rt, the result is set to 1; otherwise, the result is set to
0. The result is stored to register rd.
Set On Less Than
Unsigned
SLTU rd, rs, rt
The contents of registers rs and rt are compared treating both operands as unsigned integers. If the
contents of register rs is less than that of register rt, the result is set to 1; otherwise, the result is set to
0. The result is stored to register rd.
AND
AND rd, rt, rs
The contents of register rs are logical ANDed with that of general register rt bit-wise. The result is
stored to register rd.
OR
OR rd, rt, rs
The contents of register rs are logical ORed with that of general register rt bit-wise. The result is stored
to register rd.
Exclusive OR
XOR rd, rt, rs
The contents of register rs are logical Ex-ORed with that of general register rt bit-wise. The result is
stored to register rd.
NOR
NOR rd, rt, rs
The contents of register rs are logical NORed with that of general register rt bit-wise. The result is
stored to register rd.
User’s Manual U15509EJ2V0UM
41
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-10. Three-Operand Type Instruction (Extended ISA)
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Doubleword Add
DADD rd, rt, rs
The contents of register rs are added to that of register rt. The 64-bit result is stored into register rd.
An exception occurs on the generation of integer overflow.
Doubleword Add
Unsigned
DADDU rd, rt, rs
The contents of register rs are added to that of register rt. The 64-bit result is stored into register rd. No
exception occurs on the generation of integer overflow.
Doubleword Subtract
DSUB rd, rt, rs
The contents of register rt are subtracted from that of register rs. The 64-bit result is stored into register
rd. An exception occurs on the generation of integer overflow.
Doubleword Subtract
Unsigned
DSUBU rd, rt, rs
The contents of register rt are subtracted from that of register rs. The 64-bit result is stored into register
rd. No exception occurs on the generation of integer overflow.
Table 2-11. Shift Instruction
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Shift Left Logical
SLL rd, rs, sa
The contents of register rt are shifted left by sa bits and zeros are inserted into the emptied lower bits.
The 32-bit result is stored into register rd. In the 64-bit mode, the operand must be sign extended.
Shift Right Logical
SRL rd, rs, sa
The contents of register rt are shifted right by sa bits and zeros are inserted into the emptied higher
bits. The 32-bit result is stored into register rd. In the 64-bit mode, the operand must be sign extended.
Shift Right Arithmetic
SRA rd, rt, sa
The contents of register rt are shifted right by sa bits and the emptied higher bits are sign extended.
The 32-bit result is stored into register rd. In the 64-bit mode, the operand must be sign extended.
Shift Left Logical
Variable
SLLV rd, rt, rs
The contents of register rt are shifted left and zeros are inserted into the emptied lower bits. The lower
five bits of register rs specify the shift count. The 32-bit result is stored into register rd. In the 64-bit
mode, the operand must be sign extended.
Shift Right Logical
Variable
SRLV rd, rt, rs
The contents of register rt are shifted right and zeros are inserted into the emptied higher bits. The
lower five bits of register rs specify the shift count. The 32-bit result is stored into register rd. In the 64bit mode, the operand must be sign extended.
Shift Right Arithmetic
Variable
SRAV rd, rt, rs
The contents of register rt are shifted right and the emptied higher bits are sign extended. The lower
five bits of register rs specify the shift count. The 32-bit result is stored into register rd. In the 64-bit
mode, the operand must be sign extended.
42
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-12. Shift Instruction (Extended ISA)
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Doubleword Shift Left
Logical
DSLL rd, rs, sa
The contents of register rt are shifted left by sa bits and zeros are inserted into the emptied lower bits.
The 64-bit result is stored into register rd.
Doubleword Shift
Right Logical
DSRL rd, rs, sa
The contents of register rt are shifted right by sa bits and zeros are inserted into the emptied higher
bits. The 64-bit result is stored into register rd.
Doubleword Shift
Right Arithmetic
DSRA rd, rt, sa
The contents of register rt are shifted right by sa bits and the emptied higher bits are sign extended.
The 64-bit result is stored into register rd.
Doubleword Shift Left
Logical Variable
DSLLV rd, rt, rs
The contents of register rt are shifted left and zeros are inserted into the emptied lower bits. The lower
six bits of register rs specify the shift count. The 64-bit result is stored into register rd.
Doubleword Shift
Right Logical Variable
DSRLV rd, rt, rs
The contents of register rt are shifted right and zeros are inserted into the emptied higher bits. The
lower six bits of register rs specify the shift count. The 64-bit result is stored into register rd.
Doubleword Shift
Right Arithmetic
Variable
DSRAV rd, rt, rs
The contents of register rt are shifted right and the emptied higher bits are sign extended. The lower six
bits of register rs specify the shift count. The 64-bit result is stored into register rd.
Doubleword Shift Left
Logical + 32
DSLL32 rd, rt, sa
The contents of register rt are shifted left by 32 + sa bits and zeros are inserted into the emptied lower
bits. The 64-bit result is stored into register rd.
Doubleword Shift
Right Logical + 32
DSRL32 rd, rt, sa
The contents of register rt are shifted right by 32 + sa bits and zeros are inserted into the emptied
higher bits. The 64-bit result is stored into register rd.
Doubleword Shift
Right Arithmetic + 32
DSRA32 rd, rt, sa
The contents of register rt are shifted right by 32 + sa bits and the emptied higher bits are sign
extended. The 64-bit result is stored into register rd.
User’s Manual U15509EJ2V0UM
43
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-13. Multiply/Divide Instructions
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Multiply
MULT rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. The
64-bit result is stored into special registers HI and LO. In the 64-bit mode, the operand must be sign
extended.
Multiply Unsigned
MULTU rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 32-bit unsigned integers.
The 64-bit result is stored into special registers HI and LO. In the 64-bit mode, the operand must be
sign extended.
Divide
DIV rs, rt
The contents of register rs are divided by that of register rt, treating both operands as 32-bit signed
integers. The 32-bit quotient is stored into special register LO, and the 32-bit remainder is stored into
special register HI. In the 64-bit mode, the operand must be sign extended.
Divide Unsigned
DIVU rs, rt
The contents of register rs are divided by that of register rt, treating both operands as 32-bit unsigned
integers. The 32-bit quotient is stored into special register LO, and the 32-bit remainder is stored into
special register HI. In the 64-bit mode, the operand must be sign extended.
Move from HI
MFHI rd
The contents of special register HI are loaded into register rd.
Move from LO
MFLO rd
The contents of special register LO are loaded into register rd.
Move to HI
MTHI rs
The contents of register rs are loaded into special register HI.
Move to LO
MTLO rs
The contents of register rs are loaded into special register LO.
Table 2-14. Multiply/Divide Instructions (Extended ISA)
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Doubleword Multiply
DMULT rs, rt
The contents of registers rt and rs are multiplied, treating both operands as signed integers. The 128bit result is stored into special registers HI and LO.
Doubleword Multiply
Unsigned
DMULTU rs, rt
The contents of registers rt and rs are multiplied, treating both operands as unsigned integers. The
128-bit result is stored into special registers HI and LO.
Doubleword Divide
DDIV rs, rt
The contents of register rs are divided by that of register rt, treating both operands as signed integers.
The 64-bit quotient is stored into special register LO, and the 64-bit remainder is stored into special
register HI.
Doubleword Divide
Unsigned
DDIVU rs, rt
The contents of register rs are divided by that of register rt, treating both operands as unsigned
integers. The 64-bit quotient is stored into special register LO, and the 64-bit remainder is stored into
special register HI.
44
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-15. Product-Sum Operation Instructions (for VR4121, VR4122, VR4131, and VR4181A)
Instruction
Format and Description
op
rs
rt
rd
funct
Multiply and Add
Accumulate
MACC{h}{u}{s} rd, rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. The
result is added to the combined value of special registers HI and LO. The 64-bit result is stored into
special registers HI and LO.
If h=0, the same data as that stored in register LO is also stored in register rd; if h=1, the same data as
that stored in register HI is also stored in register rd.
If u is specified, the operand is treated as unsigned data.
If s is specified, registers rs and rd are treated as a 16-bit value (32 bits sign- or zero-extended), and
the value obtained by combining registers HI and LO is treated as a 32-bit value (64 bits sign- or zeroextended). Moreover, saturation processing is performed for the operation result in the format specified
with u.
Doubleword Multiply
and Add Accumulate
DMACC{h}{u}{s} rd, rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. The
result is added to value of special register LO. The 64-bit result is stored into special register LO.
If h=0, the same data as that stored in register LO is also stored in register rd; if h=1, undefined data is
stored in register rd.
If u is specified, the operand is treated as unsigned data.
If s is specified, registers rs and rd are treated as a 16-bit value (32 bits sign- or zero-extended), and
register LO is treated as a 32-bit value (64 bits sign- or zero-extended). Moreover, saturation
processing is performed for the operation result in the format specified with u.
Table 2-16. Product-Sum Operation Instructions (for VR4181)
Instruction
Format and Description
op
rs
rt
rd
sa
funct
Multiply and Add 16bit Integer
MADD16 rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 16-bit signed integers (by
sign extending to 64 bits). The result is added to the combined value of special registers HI and LO.
The 64-bit result is stored into special registers HI and LO.
Doubleword Multiply
and Add 16-bit Integer
DMADD16 rs, rt
The contents of registers rt and rs are multiplied, treating both operands as 16-bit signed integers (by
sign extending to 64 bits). The result is added to value of special register LO. The 64-bit result is stored
into special register LO.
User’s Manual U15509EJ2V0UM
45
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
MFHI and MFLO instructions after a multiply or divide instruction generate interlocks to delay execution of the next
instruction, inhibiting the result from being read until the multiply or divide instruction completes.
Table 2-17 gives the number of processor cycles (PCycles) required to resolve interlock or stall between various
multiply or divide instructions and a subsequent MFHI or MFLO instruction.
Table 2-17. Number of Stall Cycles in Multiply and Divide Instructions
Instruction
46
Number of instruction cycles
MULT
1
MULTU
1
DIV
35
DIVU
35
DMULT
4
DMULTU
4
DDIV
67
DDIVU
67
MACC
0
DMACC
0
MADD16
1
DMADD16
1
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.4.3 Jump and branch instructions
Jump and branch instructions change the control flow of a program. All jump and branch instructions occur with a
delay of one instruction: that is, the instruction immediately following the jump or branch instruction (this is known as
the instruction in the delay slot) always executes while the target instruction is being fetched from memory.
For instructions involving a link (such as JAL and BLTZAL), the return address is saved in register r31.
(1)Overview of jump instructions
Subroutine calls in high-level languages are usually implemented with J or JAL instructions, both of which are Jtype instructions. In J-type format, the 26-bit target address shifts left 2 bits and combines with the high-order 4
bits of the current program counter to form a 32-bit or 64-bit absolute address.
Returns, dispatches, and cross-page jumps are usually implemented with the JR or JALR instructions. Both are
R-type instructions that take the 32-bit or 64-bit byte address contained in one of the general-purpose registers.
Table 2-18. Jump Instructions
Instruction
Format and Description
target
op
Jump
J target
The contents of 26-bit target address is shifted left by two bits and combined with the high-order four
bits of the PC. The program jumps to this calculated address with a delay of one instruction.
Jump and Link
JAL target
The contents of 26-bit target address is shifted left by two bits and combined with the high-order four
bits of the PC. The program jumps to this calculated address with a delay of one instruction. The
address of the instruction following the delay slot is stored into r31 (link register).
Instruction
Jump and Link
Exchange
Instruction
Format and Description
target
op
JALX target
The contents of 26-bit target address is shifted left by two bits and combined with the high-order four
bits of the PC. The program jumps to this calculated address with a delay of one instruction, and then
the ISA mode bit is reversed. The address of the instruction following the delay slot is stored into r31
(link register).
Format and Description
op
rs
rt
rd
sa
Jump Register
JR rs
The program jumps to the address specified in register rs with a delay of one instruction.
Jump snd Link
Register
JALR rs, rd
The program jumps to the address specified in register rs with a delay of one instruction.
The address of the instruction following the delay slot is stored into rd.
User’s Manual U15509EJ2V0UM
funct
47
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
(2) Overview of branch instructions
A branch instruction has a PC-related signed 16-bit offset.
All branch instruction target addresses are computed by adding the address of the instruction in the delay slot to
the 16-bit offset (shifted left by 2 bits and sign-extended to 64 bits). All branches occur with a delay of one
instruction.
Calculation of the target address is performed at the RF stage and the EX stage of the instruction. The target
instruction of the branch is fetched at the EX stage of the branch instruction.
If the branch condition does not meet in executing a Likely instruction, the instruction in its delay slot is nullified.
For all other branch instructions, the instruction in its delay slot is unconditionally executed.
Table 2-19. Branch Instructions (1/2)
Instruction
Format and Description
op
rs
rt
offset
Branch on Equal
BEQ rs, rt, offset
If the contents of register rs are equal to that of register rt, the program branches to the target address.
Branch on Not Equal
BNE rs, rt, offset
If the contents of register rs are not equal to that of register rt, the program branches to the target
address.
Branch on Less Than
or Equal to Zero
BLEZ rs, offset
If the contents of register rs are less than or equal to zero, the program branches to the target address.
Branch on Greater
Than Zero
BGTZ rs, offset
If the contents of register rs are greater than zero, the program branches to the target address.
Instruction
Format and Description
REGIMM
rs
sub
offset
Branch on Less Than
Zero
BLTZ rs, offset
If the contents of register rs are less than zero, the program branches to the target address.
Branch on Greater
Than or Equal to Zero
BGEZ rs, offset
If the contents of register rs are greater than or equal to zero, the program branches to the target
address.
Branch on Less Than
Zero and Link
BLTZAL rs, offset
The address of the instruction that follows delay slot is stored to register r31 (link register). If the
contents of register rs are less than zero, the program branches to the target address.
Branch on Greater
Than or Equal to Zero
and Link
BGEZAL rs, offset
The address of the instruction that follows delay slot is stored to register r31 (link register). If the
contents of register rs are greater than or equal to zero, the program branches to the target address.
Remark sub: Sub-operation code
48
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-19. Branch Instructions (2/2)
Instruction
Format and Description
COP0
BC
br
offset
Branch on
Coprocessor 0 True
BC0T offset
Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
instruction in the delay slot to calculate out the branch target address. If the conditional signal of the
coprocessor 0 is true, the program branches to the target address with one-instruction delay.
Branch on
Coprocessor 0 False
BC0F offset
Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
instruction in the delay slot to calculate out the branch target address. If the conditional signal of the
coprocessor 0 is false, the program branches to the target address with one-instruction delay.
Remark BC: BC sub-operation code
br: branch condition identifier
Table 2-20. Branch Instructions (Extended ISA) (1/2)
Instruction
Format and Description
op
rs
rt
offset
Branch on Equal
Likely
BEQL rs, rt, offset
If the contents of register rs are equal to that of register rt, the program branches to the target address.
If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Not Equal
Likely
BNEL rs, rt, offset
If the contents of register rs are not equal to that of register rt, the program branches to the target
address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Less Than
or Equal to Zero Likely
BLEZL rs, offset
If the contents of register rs are less than or equal to zero, the program branches to the target address.
If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Greater
Than Zero
BGTZL rs, offset
If the contents of register rs are greater than zero, the program branches to the target address. If the
branch condition is not met, the instruction in the delay slot is discarded.
User’s Manual U15509EJ2V0UM
49
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-20. Branch Instructions (Extended ISA) (2/2)
Instruction
Format and Description
REGIMM
rs
sub
offset
Branch on Less Than
Zero Likely
BLTZL rs, offset
If the contents of register rs are less than zero, the program branches to the target address. If the
branch condition is not met, the instruction in the delay slot is discarded.
Branch on Greater
Than or Equal to Zero
Likely
BGEZL rs, offset
If the contents of register rs are greater than or equal to zero, the program branches to the target
address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Less Than
Zero and Link Likely
BLTZALL rs, offset
The address of the instruction that follows delay slot is stored to register r31 (link register). If the
contents of register rs are less than zero, the program branches to the target address. If the branch
condition is not met, the instruction in the delay slot is discarded.
Branch on Greater
Than or Equal to Zero
and Link Likely
BGEZALL rs, offset
The address of the instruction that follows delay slot is stored to register r31 (link register). If the
contents of register rs are greater than or equal to zero, the program branches to the target address. If
the branch condition is not met, the instruction in the delay slot is discarded.
Remark sub: Sub-operation code
Instruction
Format and Description
COP0
BC
br
offset
Branch on
Coprocessor 0 True
Likely
BC0TL offset
Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
instruction in the delay slot to calculate out the branch target address. If the conditional signal of the
coprocessor 0 is true, the program branches to the target address with one-instruction delay. If the
branch condition is not met, the instruction in the delay slot is discarded.
Branch on
Coprocessor 0 False
Likely
BC0FL offset
Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
instruction in the delay slot to calculate out the branch target address. If the conditional signal of the
coprocessor 0 is false, the program branches to the target address with one-instruction delay. If the
branch condition is not met, the instruction in the delay slot is discarded.
Remark BC: BC sub-operation code
br: branch condition identifier
50
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
2.4.4 Special instructions
Special instructions generate software exceptions. Their formats are R-type (Syscall, Break). The Trap instruction
is available only for the products that support the MIPS III instruction set or later. All the other instructions are
available for all VR Series.
Table 2-21. Special Instructions
Instruction
Format and Description
SPECIAL
rs
rt
rd
sa
Synchronize
SYNC
Completes the load/store instruction executing in the current pipeline before the next load/store
instruction starts execution.
System Call
SYSCALL
Generates a system call exception, and then transits control to the exception handling program.
Breakpoint
BREAK
Generates a break point exception, and then transits control to the exception handling program.
funct
Remark SYNC instruction is handled as a NOP instruction in the VR4100 Series.
Table 2-22. Special Instructions (Extended ISA) (1/2)
Instruction
Format and Description
SPECIAL
rs
rt
rd
sa
funct
Trap If Greater Than
or Equal
TGE rs, rt
The contents of register rs are compared with that of register rt, treating both operands as signed
integers. If the contents of register rs are greater than or equal to that of register rt, an exception
occurs.
Trap If Greater Than
or Equal Unsigned
TGEU rs, rt
The contents of register rs are compared with that of register rt, treating both operands as unsigned
integers. If the contents of register rs are greater than or equal to that of register rt, an exception
occurs.
Trap If Less Than
TLT rs, rt
The contents of register rs are compared with that of register rt, treating both operands as signed
integers. If the contents of register rs are less than that of register rt, an exception occurs.
Trap If Less Than
Unsigned
TLTU rs, rt
The contents of register rs are compared with that of register rt, treating both operands as unsigned
integers. If the contents of register rs are less than that of register rt, an exception occurs.
Trap If Equal
TEQ rs, rt
If the contents of registers rs and rt are equal, an exception occurs.
Trap If Not Equal
TNE rs, rt
If the contents of registers rs and rt are not equal, an exception occurs.
User’s Manual U15509EJ2V0UM
51
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-22. Special Instructions (Extended ISA) (2/2)
Instruction
Format and Description
REGIMM
rs
sub
immediate
Trap If Greater Than
or Equal Immediate
TGEI rs, immediate
The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
operands as signed integers. If the contents of register rs are greater than or equal to 16-bit signextended immediate data, an exception occurs.
Trap If Greater Than
or Equal Immediate
Unsigned
TGEIU rs, immediate
The contents of register rs are compared with 16-bit zero-extended immediate data, treating both
operands as unsigned integers. If the contents of register rs are greater than or equal to 16-bit signextended immediate data, an exception occurs.
Trap If Less Than
Immediate
TLTI rs, immediate
The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
operands as signed integers. If the contents of register rs are less than 16-bit sign-extended immediate
data, an exception occurs.
Trap If Less Than
Immediate Unsigned
TLTIU rs, immediate
The contents of register rs are compared with 16-bit zero-extended immediate data, treating both
operands as unsigned integers. If the contents of register rs are less than 16-bit sign-extended
immediate data, an exception occurs.
Trap If Equal
Immediate
TEQI rs, immediate
If the contents of register rs and immediate data are equal, an exception occurs.
Trap If Not Equal
Immediate
TNEI rs, immediate
If the contents of register rs and immediate data are not equal, an exception occurs.
Remark sub: Sub-operation code
2.4.5 System control coprocessor (CP0) instructions
System control coprocessor (CP0) instructions perform operations specifically on the CP0 registers to manipulate
the memory management and exception handling facilities of the processor.
The power mode instructions added in the VR4100 Series are included in this instruction group.
Table 2-23. System Control Coprocessor (CP0) Instructions (1/2)
Instruction
Format and Description
COP0
sub
rt
rd
0
Move to System
Control Coprocessor
MTC0 rt, rd
The word data of general-purpose register rt in the CPU are loaded into general-purpose register rd in
the CP0.
Move from System
Control Coprocessor
MFC0 rt, rd
The word data of general-purpose register rd in the CP0 are loaded into general-purpose register rt in
the CPU.
Doubleword Move to
System Control
Coprocessor 0
DMTC0 rt, rd
The doubleword data of general-purpose register rt in the CPU are loaded into general-purpose register
rd in the CP0.
Doubleword Move
from System Control
Coprocessor 0
DMFC0 rt, rd
The doubleword data of general-purpose register rd in the CP0 are loaded into general-purpose
register rt in the CPU.
Remark sub: Sub-operation code
52
User’s Manual U15509EJ2V0UM
CHAPTER 2 CPU INSTRUCTION SET SUMMARY
Table 2-23. System Control Coprocessor (CP0) Instructions (2/2)
Instruction
Format and Description
COP0
funct
CO
Read Indexed TLB
Entry
TLBR
The TLB entry indexed by the Index register is loaded into the EntryHi, EntryLo0, EntryLo1, or
PageMask register.
Write Indexed TLB
Entry
TLBWI
The contents of the EntryHi, EntryLo0, EntryLo1, or PageMask register are loaded into the TLB entry
indexed by the Index register.
Write Random TLB
Entry
TLBWR
The contents of the EntryHi, EntryLo0, EntryLo1, or PageMask register are loaded into the TLB entry
indexed by the Random register.
Probe TLB For
Matching Entry
TLBP
The address of the TLB entry that matches with the contents of EntryHi register is loaded into the Index
register.
Return From
Exception
ERET
The program returns from exception, interrupt, or error trap.
Remark CO: Sub-operation identifier
Instruction
Format and Description
COP0
funct
CO
STANDBY
STANDBY
The processor’s operating mode is transited from Fullspeed mode to Standby mode.
SUSPEND
SUSPEND
The processor’s operating mode is transited from Fullspeed mode to Suspend mode.
HIBERNATE
HIBERNATE
The processor’s operating mode is transited from Fullspeed mode to Hibernate mode.
Remark CO: Sub-operation identifier
Instruction
Cache Operation
Format and Description
CACHE
base
op
offset
Cache op, offset (base)
The 16-bit offset is sign extended to 32 bits and added to the contents of the register base, to form
virtual address. This virtual address is translated to physical address with TLB. For this physical
address, cache operation that is indicated by 5-bit sub-opcode is performed.
User’s Manual U15509EJ2V0UM
53
CHAPTER 3 MIPS16 INSTRUCTION SET
3.1 Outline
If the MIPS16 ASE (Application-Specific Extension), which is an expanded function for MIPS ISA (Instruction Set
Architecture), is used, system costs can be considerably reduced by lowering the memory capacity requirement of
embedded hardware. MIPS16 is an instruction set that uses the 16-bit instruction length, and is compatible with
Note
MIPS I, II, III, IV, and V
instruction sets in any combination. Moreover, existing 32-bit instruction length binary
data can be executed with MIPS16 without change.
Note The VR4100 Series currently supports the MIPS I, II, and III instruction sets.
MIPS16 instruction set is enabled or disabled in the VR4100 Series according to the state of MIPS16EN pin during
a reset.
3.2 Features
• 16-bit length instruction format
• Reduces memory capacity requirements to lower overall system cost
• MIPS16 instructions can be used with MIPS instruction binary
• Compatibility with MIPS I, II, III, IV, and V instruction sets
• Used with switching between MIPS16 instruction length mode and 32-bit MIPS instruction length mode.
• Supports 8-bit, 16-bit, 32-bit, and 64-bit data formats
• Provides 8 general-purpose registers and special registers
• Improved code generation efficiency using special 16-bit dedicated instructions
54
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
3.3 Register Set
Tables 3-1 and 3-2 show the MIPS16 register sets. These register sets form part of the register sets that can be
accessed in 32-bit instruction length mode. MIPS16 instructions can directly access 8 of the 32 registers that can be
used in the 32-bit instruction length mode.
In addition to these 8 general-purpose registers, the special instructions of MIPS16 reference the stack pointer
register (sp), return address register (ra), condition code register (t8), and program counter (pc). sp and ra are
mapped by fixing to the general-purpose registers in the 32-bit instruction length mode.
MIPS16 has 2 move instructions that are used in addressing 32 general-purpose registers.
Table 3-1. General-purpose Registers
MIPS16 register
encoding
32-bit MIPS
register encoding
Symbol
Comment
0
16
s0
General-purpose register
1
17
s1
General-purpose register
2
2
v0
General-purpose register
3
3
v1
General-purpose register
4
4
a0
General-purpose register
5
5
a1
General-purpose register
6
6
a2
General-purpose register
7
7
a3
General-purpose register
N/A
24
t8
MIPS16 condition code register. BTEQZ, BTNEZ,
CMP, CMPI, SLT, SLTU, SLTI, and SLTIU instructions
are implicitly referenced.
N/A
29
sp
Stack pointer register
N/A
31
ra
Return address register
Remarks 1. The symbols are the general assembler symbols.
2. The MIPS register encoding numbers 0 to 7 correspond to the MIPS16 binary encoding of the
registers, and are used to show the relationship between this encoding and the MIPS registers. The
numbers 0 to 7 are not used to reference registers, except within binary MIPS16 instructions.
Registers are referenced from the assembler using the MIPS name ($16, $17, $2, etc.) or the
symbol name (s0, s1, v0, etc.). For example, when register number 17 is accessed with the register
file, the programmer references either $17 or s1 even if the MIPS16 encoding of this register is 001.
3. The general-purpose registers not shown in this table cannot be accessed with a MIPS16
instruction set other than the Move instruction. The Move instruction of MIPS16 can access all 32
general-purpose registers.
4. To reference the MIPS16 condition code registers with this manual, either T, t8, or $24 has to be
used, depending on the case. These three names reference the same physical register.
User’s Manual U15509EJ2V0UM
55
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-2. Special Registers
Symbol
Description
PC
Program counter. The PC-relative Add instruction and Load
instruction can access this register.
HI
The upper word of the multiply or divide result is inserted
LO
The lower word of the multiply or divide result is inserted
3.4 ISA Mode
MIPS16 instruction set supports procedure calling, and returns from the MIPS16 instruction mode or the 32-bit
instruction length mode to the MIPS16 instruction mode or the 32-bit instruction length mode.
• The JAL instruction supports calling to the same ISA.
• The JALX instruction supports calling that inverses ISA.
• The JALR instruction supports calling to either ISA.
• The JR instruction supports also returning to either ISA.
MIPS16 instruction set also supports a return operation from exception processing.
• The ERET instruction, which is defined only in 32-bit instruction length mode, supports returning to ISA when an
exception has not occurred.
The ISA mode bit defines the instruction length mode to be executed. If the ISA mode bit is 0, the processor
executes only 32-bit instructions. If the ISA mode bit is 1, the processor executes only MIPS16 instructions.
3.4.1 Changing ISA mode bit by software
Only the JALX, JR, and JALR instructions change the ISA mode bit between the MIPS16 instruction mode and the
32-bit instruction length mode. The ISA mode bit cannot be directly overwritten by software. The JALX changes the
ISA mode bit to select another ISA mode. The JR instruction and JALR instruction load the ISA mode bit from bit 0 of
the general-purpose register that holds the target address. Bit 0 is not a part of the target address. Bit 0 of the target
address is always 0, and no address exception is generated.
Moreover, the JAL, JALR, and JALX instructions save the ISA mode bit to bit 0 of the general-purpose register that
acquires the return address. The contents of this general-purpose register are later used by the JR and JALR
instruction for return and restoration of the ISA mode.
3.4.2 Changing ISA mode bit by exception
Even if an exception occurs, the ISA mode does not change. When an exception occurs, the ISA mode bit is
cleared to 0 so that the exception is serviced with 32-bit code. Then the ISA mode status before the exception
occurred is saved to the least significant bit of the EPC register or the ErrorEPC register. During return from an
exception, the ISA mode before the exception occurred is returned to by executing the JR or ERET instruction with
the contents of this register. Moreover, the ISA mode bit is cleared to 0 after cold reset and soft reset of the CPU
core, and the 32-bit instruction length mode returns to its initial state.
56
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
3.4.3 Enabling change ISA mode bit
Changing the ISA mode bit is valid only when MIPS16EN pin is set to active during the RTC reset, and the
MIPS16 instruction mode is enabled. The operation of the JALX, JALR, JR, and ERET instructions in the 32-bit
instruction mode, differs depending on whether the MIPS16 instruction mode is enabled or prohibited. If the MIPS16
instruction mode is prohibited, the JALX instruction generates a reserved instruction exception. The JR and JALR
instructions generate an address exception when bit 0 of the source register is 1. The ERET instruction generates an
address exception when bit 0 of the EPC or ErrorEPC register is 1. If the MIPS16 instruction mode is enabled, the
JALX instruction executes JAL, and the ISA mode bit is inverted. The JR and JALR instructions load the ISA mode
from bit 0 of the source register. The ERET instruction loads the ISA mode from bit 0 of the EPC or ErrorEPC
register. Bit 0 of the target address is always 0, and no address exception is generated even when bit 0 of the source
register is 1.
3.5 Types of Instructions
This section describes the different types of instructions, and indicates the MIPS16 instructions included in each
group.
Instructions are divided into the following types.
Load and Store instructions
: Move data between memory and the general-purpose registers.
Computational instructions
: Perform arithmetic operations, logical operations, and shift operations on values
in registers.
Jump and Branch instructions: Change the control flow of a program.
Special instructions
: SYSCALL, BREAK, and Extend instructions.
SYSCALL and BREAK transfer
control to an exception handler. Extend enlarges the immediate field of the next
instruction. Instructions that can be extended with Extend are indicated as Note 1
in Table 3-3 MIPS16 Instruction Set Outline.
Table 3-3. MIPS16 Instruction Set Outline (1/2)
Op
Description
Load and Store instructions
LB
Note 1
Note 1
LBU
Note 1
LH
Note 1
LHU
LW
Note 1
Notes 1, 2
Op
Description
Multiply/Divide instructions
Load Byte
MULT
Multiply
Load Byte Unsigned
MULTU
Multiply Unsigned
Load Halfword
DIV
Divide
Load Halfword Unsigned
DIVU
Divide Unsigned
Load Word
MFHI
Move From HI
LWU
Load Word Unsigned
MFLO
Move From LO
LDNotes 1, 2
Load Doubleword
DMULTNote 2
Doubleword Multiply
SBNote 1
Store Byte
DMULTUNote 2
Doubleword Multiply Unsigned
Note 1
SH
SW
Store Halfword
Note 1
Store Word
Notes 1, 2
SD
DDIV
Note 2
Note 2
DDIVU
Doubleword Divide
Doubleword Divide Unsigned
Store Doubleword
Notes 1. Extendable instruction. For details, see 3.8.2 Extend instruction.
2. Can be used in 64-bit mode and 32-bit Kernel mode.
User’s Manual U15509EJ2V0UM
57
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-3. MIPS16 Instruction Set Outline (2/2)
Op
Description
Op
Description
Arithmetic instructions: ALU immediate instructions
Jump/Branch instructions
LINote 1
Load Immediate
JAL
Jump and Link
Add Immediate Unsigned
JALX
Jump and Link Exchange
Doubleword Add Immediate Unsigned
JR
Jump Register
Set on Less Than Immediate
JALR
Note 1
ADDIU
Notes 1, 2
DADDIU
SLTI
Note 1
Note 1
SLTIU
CMPI
Note 1
Set on Less Than Immediate Unsigned
Jump and Link Register
Branch on Equal to Zero
Note 1
Branch on Not Equal to Zero
BEQZ
BNEZ
Compare Immediate
Note 1
Note 1
Branch on T Equal to Zero
Note 1
Branch on T Not Equal to Zero
BTEQZ
Arithmetic instructions: 2/3 operand register instructions
BTNEZ
ADDU
Add Unsigned
BNote 1
SUBU
Subtract Unsigned
DADDUNote 2
Note 2
DSUBU
SLT
Branch Unconditional
Doubleword Add Unsigned
Shift instructions
Doubleword Subtract Unsigned
SLLNote 1
Set on Less Than
Shift Left Logical
Note 1
Shift Right Logical
Note 1
Shift Right Arithmetic
SRL
SLTU
Set on Less Than Unsigned
SRA
CMP
Compare
SLLV
Shift Left Logical Variable
NEG
Negate
SRLV
Shift Right Logical Variable
AND
AND
SRAV
Shift Right Arithmetic Variable
OR
OR
XOR
Exclusive OR
NOT
Not
MOVE
Move
Notes 1, 2
Doubleword Shift Left Logical
Notes 1, 2
Doubleword Shift Right Logical
Notes 1, 2
Doubleword Shift Right Arithmetic
DSLL
DSRL
DSRA
DSLLV
Note 2
Doubleword Shift Left Logical Variable
Note 2
Doubleword Shift Right Logical Variable
DSRLV
Special instructions
EXTEND
Extend
BREAK
Breakpoint
SYCALL
System Call
DSRAVNote 2
Doubleword Shift Right Arithmetic Variable
Notes 1. Extendable instruction. For details, see 3.8.2 Extend instruction.
2. Can be used in 64-bit mode and 32-bit Kernel mode.
58
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
3.6 Instruction Format
The MIPS16 instruction set has a length of 16 bits and is located at the half-word boundary. One part of Jump
instructions and instructions for which the Extend instruction extends immediate become 32 bits in length, but
crossing the word boundary does not represent a problem.
The instruction format is shown below. Variable subfields are indicated with lower case letters (rx, ry, rz,
immediate, etc.).
In the case of special functions, constants are input to the two instruction subfields op and funct. These values
are indicated by upper case mnemonics. For example, in the case of the Load Byte instruction, op is LB, and in the
case of the Add instruction, op is SPECIAL, and function is ADD.
The constants of the fields used in the instruction formats are shown below.
Table 3-4. Field Definition
Field
Definition
op
5-bit major operation code
rx
3-bit source/destination register specification
ry
3-bit source/destination register specification
immediate or imm
4-bit, 5-bit, 8-bit, or 11-bit immediate value,
branch displacement, or address displacement
rz
3-bit source/destination register specification
Funct or F
Function field
I-type (immediate) instruction format
15
14
13
12
11
10
9
8
7
op
6
5
4
3
2
1
0
3
2
1
0
2
1
0
immediate
RI-type instruction format
15
14
13
12
11
10
op
9
8
7
6
5
rx
4
immediate
RR-type instruction format
15
14
13
op
12
11
10
9
rx
8
7
6
5
ry
User’s Manual U15509EJ2V0UM
4
3
Funct
59
CHAPTER 3
MIPS16 INSTRUCTION SET
RRI-type instruction format
15
14
13
12
11
10
op
9
8
7
rx
6
5
4
3
ry
2
1
0
immediate
RRR-type instruction format
15
14
13
12
11
10
RRR
9
8
7
rx
6
5
4
ry
3
2
1
0
rz
F
RRI-A type instruction format
15
14
13
12
11
10
RRI-A
9
8
7
rx
6
5
ry
4
3
2
F
1
0
immediate
SHIFT instruction format
15
14
13
12
11
10
SHIFT
9
8
7
rx
6
5
4
3
2
1
shamtNote
ry
0
F
Note The 3-bit shamt field can encode shift count numbers from 0 to 7. 0-bit shift (NOP) cannot be executed. 0
is regarded as shift count 8.
I8-type instruction format
15
14
13
12
11
10
I8
9
8
7
6
5
Funct
4
3
2
1
0
immediate
I8_MOVR32 instruction format (used only with MOVR32 instruction)
15
14
13
I8
60
12
11
10
9
Funct
8
7
6
5
ry
User’s Manual U15509EJ2V0UM
4
3
2
r32[4:0]
1
0
CHAPTER 3
MIPS16 INSTRUCTION SET
I8_MOV32R instruction format (used only with MOV32R instruction)
15
14
13
12
11
10
I8
9
8
7
6
5
4
3
2
1
r32[2:0, 4:3]Note
Funct
0
rz
Note The r32 field uses special bit encoding. For example, encoding of $7 (00111) is 11100 in the r32 field.
I64-type instruction format
15
14
13
12
11
10
I64
9
8
7
6
5
Funct
4
3
2
1
0
2
1
0
immediate
RI64-type instruction format
15
14
13
12
I64
11
10
9
Funct
8
7
6
5
4
3
ry
immediate
JAL and JALX instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
immediate(15:0)
JAL
X
8
7
6
5
4
3
2
1
0
immediate(20:16) immediate(25:21)
JAL in case of X = 0 instruction
JALX in case of X = 1 instruction
User’s Manual U15509EJ2V0UM
61
CHAPTER 3
MIPS16 INSTRUCTION SET
EXT-I instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
MAJOR
0
0
0
0
0
0
immediate(4:0)
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-RI instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
MAJOR
rx
0
0
0
immediate(4:0)
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-RRI instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
MAJOR
rx
immediate(4:0)
ry
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-RRI-A instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
RRI-A
62
rx
ry
F
immediate(3:0)
EXTEND
User’s Manual U15509EJ2V0UM
8
7
6
5
immediate(10:4)
4
3
2
1
0
immediate(14:11)
CHAPTER 3
MIPS16 INSTRUCTION SET
EXT-SHIFT instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
SHIFT
rx
0
ry
0
0
F
EXTEND
8
7
6
shamt(4:0)
5
S5
Note
4
3
2
1
0
0
0
0
0
0
Note Only in the case of DSLL, the S5 bit is the most significant bit of the 6-bit shift count field (shamt).
In the case of all 32-bit extended shifts, S5 must be 0. For a normal shift instruction, the display of shift
count 0 is considered as shift count 8, but the extended shift instruction does not perform such mapping
changes. Therefore, 0-bit shift using the extended format is possible.
EXT-I8 instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
I8
Funct
0
0
0
immediate(4:0)
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-I64 instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
I64
Funct
0
0
0
immediate(4:0)
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-RI64 instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
I64
Funct
ry
immediate(4:0)
EXTEND
8
7
6
5
immediate(10:5)
4
3
2
1
0
immediate(15:11)
EXT-SHIFT64 instruction format
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
RR
0
0
0
ry
Function
EXTEND
8
7
shamt(4:0)
6
4
3
2
1
0
S5Note 0
5
0
0
0
0
Note The S5 bit is the most significant bit of the 6-bit shift count field (shamt). In the case of a normal shift
instruction, the display of shift count 0 is considered as shift count 8, but the extended shift instruction
does not perform such mapping changes.
Therefore, 0-bit shift using the extended format is possible.
User’s Manual U15509EJ2V0UM
63
CHAPTER 3
MIPS16 INSTRUCTION SET
3.7 MIPS16 Operation Code Bit Encoding
This section describes encoding for major and minor opcode. Table 3-5 shows bit encoding of the MIPS16 major
operation code. Tables 3-6 to 3-11 show bit encoding of the minor operation code. The italic operation codes in the
tables are instructions for the extended ISA.
Table 3-5. Bit Encoding of Major Operation Code (op)
Instruction
bits
[15:14]
Instruction bits [13:11]
000
001
Note 1
addiusp
00
01
RRI-A
10
lb
11
sb
addiupc
010
Note 2
b
Note 4
addiu8
011
Note 3
jal(x)
100
101
110
111
beqz
bnez
SHIFT
ld
slti
sltiu
l8
li
cmpi
sd
lh
lwsp
lw
lbu
lhu
lwpc
lwu
sh
swsp
sw
RRR
RR
extend
l64
Notes 1.
addiusp : addiu rx, sp, immediate
2.
addiupc : addiu rx, pc, immediate
3.
jal(x)
4.
addiu8 : aadiu rx, immediate
: jal instruction and jalx instruction
Table 3-6. RR Minor Operation Code (RR-Type Instruction)
Instruction
bits
[4:3]
Instruction bits [2:0]
000
001
010
011
100
101
110
111
Note 1
∗
slt
sltu
sllv
break
srlv
srav
Note 2
syscall
cmp
neg
and
or
xor
not
dsllv
∗
dsrlv
dsrav
dmult
dmultu
ddiv
ddivu
j(al)r
00
dsrl
01
10
Mfhi
∗
mflo
11
mult
Multu
div
Notes 1.
Note 2
dsra
divu
J(al)r: jr rx instruction (ry = 000)
jr ra instruction (ry = 001, rx = 000)
jalr ra, rx instruction (ry = 010)
2.
dsrl and dsra use the rx register field to encode the shift count (8-digit shift for 0). In the case of the
extended version of these two instructions, the EXT-SHIFT64 format is used. Only these two RR
instructions can be extended.
Remarks The symbols in the figures have the following meaning.
∗ : Execution of operation code with an asterisk on the current VR4100 Series causes a reserved
instruction exception to be generated. This code is reserved for future extension.
64
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-7. RRR Minor Operation Code (RRR-Type Instruction)
Instruction bits [1:0]
00
01
10
11
daddu
addu
dsubu
subu
Table 3-8. RRI-A Minor Operation Code (RRI-Type ADD Instruction)
Instruction bit [4]
0
1
Note 1
addiu
daddiuNote 2
Notes 1. addiu : addiu ry, rx, immediate
2. daddiu : daddiu ry, rx immediate
Table 3-9. SHIFT Minor Operation Code (SHIFT-Type Instruction)
Instruction bits [1:0]
00
01
10
11
sll
Dsll
srl
sra
Table 3-10. I8 Minor Operation Code (I8-Type Instruction)
Instruction bits [10:8]
000
001
bteqz
Notes 1.
btnez
010
swrasp
011
Note 2
adjsp
100
∗
101
mov32r
Note 3
110
111
∗
movr32Note 4
swrasp : sw ra, immediate(sp)
2.
adjsp
3.
mov32r: move r32, rz
4.
movr32: move ry, r32
Remark
Note 1
: addiu sp, immediate
The symbols used in the figures have the following meaning.
∗ : Execution of operation code with an asterisk on the current VR4100 Series causes a reserved
instruction exception to be generated. This code is reserved for future extension.
User’s Manual U15509EJ2V0UM
65
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-11. I64 Minor Operation Code (64-bit Only, I64-Type Instruction)
Instruction bits [10:8]
000
001
Note 1
Note 2
ldsp
66
sdsp
010
011
Note 3
Note 4
sdrasp
dadjsp
Notes 1.
ldsp
: ld ry, immediate
2.
sdsp
: sd ry, immediate
3.
sdrasp : sd ra, immediate
4.
dadjsp : daddiu sp, immediate
5.
ldpc
6.
daddiu5 : daddiu ry, immediate
7.
dadiupc : daddiu ry, pc, immediate
8.
dadiusp : daddiu ry, sp, immediate
100
ldpc
Note 5
: ld ry, immediate
User’s Manual U15509EJ2V0UM
101
110
Note 6
daddiu5
dadiupc
111
Note 7
dadiuspNote 8
CHAPTER 3
MIPS16 INSTRUCTION SET
3.8 Outline of Instructions
This section describes the assembler syntax and defines each instruction. Instructions can be divided into the
following four types.
• Load and Store instructions
• Computational instructions
• Jump and Branch instructions
• Special instructions
3.8.1 PC-relative instructions
PC-relative instructions is the instruction format first defined among the MIPS16 instruction set. MIPS16 supports
both extension and non-extension through the Extend instruction for four PC-relative instructions.
Load Word
LW rx, offset(pc)
Load Doubleword
LD ry, offset(pc)
Add Immediate Unsigned
ADDIU rx, pc, immediate
Doubleword Add Immediate Unsigned
DADDIU ry, pc, immediate
All these instructions calculate the PC value of a PC-relative instruction or the PC value of the instruction
immediately preceding as the base address. The address calculation base using various function combinations is
shown next.
Table 3-12. Base PC Address Setting
Instruction
Base PC value
Non-extension PC-relative instructions
not located in Jump delay slot
PC of instruction
Extension PC-relative instruction
PC of Extend instruction
Non-extension PC-relative instruction in
Jump delay slot of JR or JALR
PC of JR instruction or JALR instruction
Non-extension PC-relative instruction in
Jump delay slot of JAL or JALX
PC of initial halfword of JAL or JALXNote
Note Because the JAL and JALX instruction length is 32 bits.
The PC value used as the base for address calculation for the PC-relative instruction outlines shown in tables 3-14
and 3-15 is called base PC value. The base PC value is defined so as to be equivalent to the exception program
counter (EPC) value related to the PC-relative instruction.
User’s Manual U15509EJ2V0UM
67
CHAPTER 3
MIPS16 INSTRUCTION SET
3.8.2 Extend instruction
The Extend instruction can extend the immediate fields of MIPS16 instructions, which have fewer immediate fields
than equivalent 32-bit MIPS instructions.
The Extend instruction must always precede (by one instruction) the
instruction whose immediate field you want to extend. Every extended instruction consumes four bytes in program
memory instead of two bytes (two bytes for Extend and two bytes for the instruction being extended), and it can cross
a word boundary.
For example, the MIPS16 instruction
LW ry, offset (rx)
contains a five-bit immediate. The immediate expands to 16 bits (000000000 || offset || 00) before execution in the
pipeline. This allows 32 different offset values of 0, 4, 8, and up through 124. Once extended, this instruction can
hold any of the normal 65,536 values in the range –32768 through 32767.
Shift instructions are extended to 5-bit unsigned immediate values. All other immediate instructions expand to
either signed or unsigned 16-bit immediate values. The only exceptions are
ADDIU ry, rx, immediate
DADDIU ry, rx, immediate
which can be extended only to a 15-bit signed immediate.
There is only one restriction. Extended instructions should not be placed in jump delay slots. Otherwise, the
results are unpredictable because the pipeline would attempt to execute one half the instruction.
Table 3-13 lists the MIPS16 extendable instructions, the size of their immediate, and how much each immediate
can be extended when preceded with the Extend instruction.
For the instruction format of the Extend instruction, see 3.6 Instruction Format.
68
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-13. Extendable MIPS16 Instructions
MIPS16 Immediate
Instruction Format
Extended
Immediate
Load Byte
5
RRI
16
EXT-RRI
Load Byte Unsigned
5
RRI
16
EXT-RRI
Load Halfword
5
RRI
16
EXT-RRI
Load Halfword Unsigned
5
RRI
16
EXT-RRI
Load Word
5
8
RRI
RI
16
16
EXT-RRI
EXT-RI
Load Word Unsigned
5
RRI
16
EXT-RRI
Load Doubleword
5
RRI
16
EXT-RRI
Store Byte
5
RRI
16
EXT-RRI
Store Halfword
5
RRI
16
EXT-RRI
Store Word
5 (Other)
8 (SW rx, offset(sp))
8 (SW ra, offset(sp))
RRI
RI
I8
16
16
16
EXT-RRI
EXT-RI
EXT-I8
Store Doubleword
5 (SD ry, offset(rx))
8 (Other)
RRI
I64
16
16
EXT-RRI
EXT-I64
8
RI
16
EXT-RI
4 (ADDIU ry, rx, imm)
8 (ADDIU sp, imm)
8 (Other)
RRI-A
I8
RI
15
16
16
EXT-RRI-A
EXT-I8
EXT-RI
4 (DADDIU ry, rx, imm)
5 (DADDIU ry, pc, imm)
8 (Other)
RRI-A
RI64
I64
15
16
16
EXT-RRI-A
EXT-RI64
EXT-I64
Set on Less Than Immediate
8
RI
16
EXT-RI
Set on Less Than Immediate Unsigned
8
RI
16
EXT-RI
Compare Immediate
8
RI
16
EXT-RI
Shift Left Logical
3
SHIFT
5
EXT-SHIFT
Shift Right Logical
3
SHIFT
5
EXT-SHIFT
Shift Right Arithmetic
3
SHIFT
5
EXT-SHIFT
Doubleword Shift Left Logical
3
SHIFT
6
EXT-SHIFT
Doubleword Shift Right Logical
3
RR
6
EXT- SHIFT64
Doubleword Shift Right Arithmetic
3
RR
6
EXT- SHIFT64
Branch on Equal to Zero
8
RI
16
EXT-RI
Branch on Not Equal to Zero
8
RI
16
EXT-RI
Branch on T Equal to Zero
8
I8
16
EXT-I8
Branch on T Not Equal to Zero
8
I8
16
EXT-I8
Branch Unconditional
11
I
16
EXT-I
MIPS16 Instruction
Load Immediate
Add Immediate Unsigned
Doubleword Add Immediate Unsigned
User’s Manual U15509EJ2V0UM
Instruction
Format
69
CHAPTER 3
MIPS16 INSTRUCTION SET
3.8.3 Delay slots
MIPS16 instructions normally execute in one cycle. However, some instructions have special requirements that
must be met to assure optimum instruction flow. The instructions include All Load, Branch, and Multiply/Divide
instructions.
(1) Load delay slots
MIPS16 operates with delayed loads. This is similar to the method used by 32-bit length instruction sets. If
another instruction references the load destination register before the load operation is completed, one cycle
occurs automatically. To assure the best performance, the compiler should always schedule load delay slots as
early as possible.
(2) Branch delay slots not supported
Unlike for 32-bit length instructions, there are no branch delay slots for branch instructions in MIPS16. If a
branch is taken, the instruction that immediately follows the branch (instruction corresponding to 32-bit length
instruction's delay slot) is cancelled. There are no restrictions on the instruction that follows a branch instruction,
and such instruction is executed only when a branch is not taken. Branches, jumps, and extended instructions
are permitted in the instruction slot after a branch.
(3) Jump delay slots
With MIPS16, there is a delay of one cycle after each jump instruction. The processor executes any instruction
in the jump delay slot before it executes the jump target instruction. Two restrictions apply to any instruction
placed in the jump delay slot:
1.
Do not specify a branch or jump in the delay slot.
2.
Do not specify an extended instruction (32 bits) in the delay slot.
Doing so will make the results
unpredictable.
(4) Multiply and divide scheduling
Multiply and divide latency depends on the hardware implementation. If an MFLO or MFHI instruction references
the Multiply or Divide result registers before the result is ready, the pipeline stalls until the operation is complete
and the result is available. However, to assure the best performance, the compiler should always schedule
Multiply and Divide instructions as early as possible.
MIPS16 requires that all MFHI and MFLO instructions be followed by two instructions that do not write to the HI or
LO registers. Otherwise, the data read by MFLO or MFHI will be undefined. The Extend instruction is counted
singly as one instruction.
70
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
3.8.4 Instruction details
(1) Load and store instructions
Load and Store instructions move data between memory and the general-purpose registers.
The only
addressing mode that is supported is the mode for adding immediate offset to the base register.
Table 3-14. Load and Store Instructions (1/3)
Instruction
Format and Description
Load Byte
LB ry, offset (rx)
The 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to
form the virtual address. The bytes of the memory location specified by the address are sign extended
and loaded into general-purpose register ry.
Load Byte Unsigned
LBU ry, offset (rx)
The 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to
form the virtual address. The bytes of the memory location specified by the address are zero extended
and loaded into general-purpose register ry
Load Halfword
LH ry, offset (rx)
The 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of generalpurpose register rx to form the virtual address. The halfword of the memory location specified by the
address is sign extended and loaded to general-purpose register ry.
If the least significant bit of the address is not 0, an address error exception is generated.
Load Halfword
Unsigned
LHU ry, offset (rx)
The 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of generalpurpose register rx to form the virtual address. The halfword of the memory location specified by the
address is zero extended and loaded to general-purpose register ry.
If the least significant bit of the address is not 0, an address error exception is generated.
Load Word
LW ry, offset (rx)
The 5-bit immediate is shifted left two bits, zero extended, and then added to the contents of generalpurpose register rx to form the virtual address. The word of the memory location specified by the
address is loaded to general-purpose register ry. In the 64-bit mode, it is further sign extended to 64
bits.
If either of the lower two bits is not 0, an address error exception is generated.
LW rx, offset (pc)
The two lower bits of the BasePC value associated with the instruction are cleared to form the masked
BasePC value. The 8-bit immediate is shifted left two bits, zero extended, and then added to the
masked BasePC to form the virtual address. The contents of the word at the memory location specified
by the address are loaded to general-purpose register rx. In the 64-bit mode, it is further sign extended
to 64 bits.
LW rx, offset (sp).
The 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of generalpurpose register sp to form the virtual address. The contents of the word at the memory location
specified by the address are loaded to general-purpose register rx. In the 64-bit mode, it is further sign
extended to 64 bits.
If either of the two lower bits of the address is 0, an address error exception is generated.
User’s Manual U15509EJ2V0UM
71
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-14. Load and Store Instructions (2/3)
Instruction
Format and Description
Load Word Unsigned
LWU ry, offset (rx)
The 5-bit immediate is shifted left two bits, zero extended to 64 bits, and then added to the contents of
general-purpose register rx to form the virtual address. The word of the memory location specified by
the address is zero extended and loaded to general-purpose register ry.
If either of the two lower bits of the address is not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Load Doubleword
LD ry, offset (rx)
The 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents
of general-purpose register rx to form the virtual address. The 64-bit doubleword of the memory
location specified by the address is loaded to general-purpose register ry.
If any of the lower three bits of the address is not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
LD ry, offset (pc)
The lower three bits of the base PC value related to the instruction are cleared to form the masked
BasePC value.
The 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the masked
BasePC to form the virtual address. The 64-bit doubleword at the memory location specified by the
address is loaded to general-purpose register ry.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
LD ry, offset (sp)
The 5-bit immediate is shifted left three bits, zero extended to 64 bits, and added to the contents of
general-purpose register sp to form the virtual address. The 64-bit doubleword at the memory location
specified by the address is loaded to general-purpose register ry.
If any of the three lower bits of the address is not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
72
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-14. Load and Store Instructions (3/3)
Instruction
Format and Description
Store Byte
SB ry, offset (rx)
The 5-bit immediate is zero extended and then added to the contents of general-purpose register rx to
form the virtual address. The least significant byte of general-purpose register ry is stored to the
memory location specified by the address.
Store Halfword
SH ry, offset (rx)
The 5-bit immediate is shifted left one bit, zero extended, and then added to the contents of generalpurpose register rx to form the virtual address. The lower halfword of general-purpose register ry is
stored to the memory location specified by the address.
If the least significant bit of the address is not 0, an address error exception is generated.
Store Word
SW ry, offset (rx)
The 5-bit immediate is shifted left two bits, zero extended, and then added to the contents of generalpurpose register rx to form a virtual address. The contents of general-purpose register ry are stored to
the memory location specified by the address. If either of the two lower bits of the address is not 0, an
address error exception is generated.
SW rx, offset (sp)
The 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of generalpurpose register sp to form the virtual address. The contents of general-purpose register rx are stored
to the memory location specified by the address. If either of the two lower bits of the address is not 0,
and address error exception is generated.
SW ra, offset (sp)
The 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of generalpurpose register sp to form the virtual address. The contents of general-purpose register ra are stored
to the memory location specified by the address. If either of the two lower bits of the address is not 0,
an address error exception is generated.
Store Doubleword
SD ry, offset (rx)
The 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents
of general-purpose register rx to form the virtual address. The 64 bits of general-purpose register ry are
stored to the memory location specified by the address. If any of the lower three bits of the address is
not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
SD ry, offset (sp)
The 5-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents
of general-purpose register sp to form the virtual address. The 64 bits of general-purpose register ry
are stored to the memory location specified by the address.
If any of the lower three bits of the address is not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
SD ra, offset (sp).
The 8-bit immediate is shifted left three bits, zero extended to 64 bits, and then added to the contents
of general-purpose register sp to form the virtual address. The 64 bits of general-purpose register ra
are stored to the memory location specified by the memory. If any of the three lower bits of the address
is not 0, an address error exception is generated.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
User’s Manual U15509EJ2V0UM
73
CHAPTER 3
MIPS16 INSTRUCTION SET
(2) Computational instructions
Computational instructions perform arithmetic, logical, and shift operations on values in registers. There are four
categories of Computational instructions: ALU Immediate, Two/Three-Operand Register-Type, Shift, and
Multiply/Divide.
Table 3-15. ALU Immediate Instructions (1/2)
Instruction
Format and Description
Load Immediate
LI rx, immediate
The 8-bit immediate is zero extended and loaded to general-purpose register rx.
Add Immediate
Unsigned
ADDIU ry, rx, immediate
The 4-bit immediate is sign extended and then added to the contents of general-purpose register rx to
form a 32-bit result. The result is placed into general-purpose register ry. No integer overflow exception
occurs under any circumstances. In the 64-bit mode, the operand must be a 64-bit value formed by
sign-extending a 32-bit value.
ADDIU rx, immediate
The 8-bit immediate is sign extended and then added to the contents of general-purpose register rx to
form a 32-bit result. The result is placed into general-purpose register rx. No integer overflow exception
occurs under any circumstances. In the 64-bit mode, the operand must be a 64-bit value formed by
sign-extending a 32-bit value.
ADDIU sp, immediate
The 8-bit immediate is shifted left three bits, sign extended, and then added to the contents of generalpurpose register sp to form a 32-bit result. The result is placed into general-purpose register sp. No
integer overflow exception occurs under any circumstances. In the 64-bit mode, the operand must be a
64-bit value formed by sign-extending a 32-bit value.
ADDIU rx, pc, immediate
The two lower bits of the BasePC value associated with the instruction are cleared to form the masked
BasePC value. The 8-bit immediate is shifted left two bits, zero extended, and then added to the
masked BasePC value to form the virtual address. This address is placed into general-purpose register
rx. No integer overflow exception occurs under any circumstances.
ADDIU rx, sp, immediate
The 8-bit immediate is shifted left two bits, zero extended, and then added to the contents of register
sp to form a 32-bit result. The result is placed into general-purpose register rx. No integer overflow
exception occurs under any circumstance. In the 64-bit mode, the operand must be a 64-bit value
formed by sign-extending a 32-bit value.
74
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-15. ALU Immediate Instructions (2/2)
Instruction
Doubleword Add
Immediate Unsigned
Format and Description
DADDIU ry, rx, immediate
The 4-bit immediate is sign extended to 64 bits, and then added to the contents of register rx to form a
64-bit result. The result is placed into general-purpose register ry. No integer overflow exception occurs
under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
DADDIU ry, immediate
The 5-bit immediate is sign extended to 64 bits, and then added to the contents of register ry to form a
64-bit result. The result is placed into general-purpose register ry. No integer overflow exception occurs
under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
DADDIU sp, immediate
The 8-bit immediate is shifted left three bits, sign extended to 64 bits, and then added to the contents
of register sp to form a 64-bit result. The result is placed into general-purpose register sp. No integer
overflow exception occurs under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
DADDIU ry, pc, immediate
The two lower bits of the BasePC value associated with the instruction are cleared to form the masked
BasePC value. The 5-bit immediate is shifted left two bits, zero extended, and added to the masked
BasePC value to form the virtual address. This address is placed into general-purpose register ry. No
integer overflow exception occurs under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
DADDIU ry, sp, immediate
The 5-bit immediate is shifted left two bits, zero extended to 64 bits, and then added to the contents of
register sp to form a 64-bit result. This result is placed into register ry. No integer overflow exception
occurs under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Set on Less Than
Immediate
SLTI rx, immediate
The 8-bit immediate is zero extended and subtracted from the contents of general-purpose register rx.
Considering both quantities as signed integers, if rx is less than the zero-extended immediate, the
result is set to 1; otherwise, the result is set to 0. The result is placed into register T ($24).
Set on Less Than
Immediate Unsigned
SLTIU rx, immediate
The 8-bit immediate is zero extended and subtracted from the contents of general-purpose register rx.
Considering both quantities as signed integers, if rx is less than the zero-extended immediate, the
result is set to 1; otherwise, the result is set to 0. The result is placed into register T ($24).
Compare Immediate
CMPI rx, immediate
The 8-bit immediate is zero extended and exclusive ORed in 1-bit units with the contents of generalpurpose register rx. The result is placed into register T ($24).
User’s Manual U15509EJ2V0UM
75
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-16. Two-/Three-Operand Register Type (1/2)
Instruction
Format and Description
Add Unsigned
ADDU rz, rx, ry
The contents of general-purpose registers rx and ry are added together to form a 32-bit result. The
result is placed into general-purpose register rz. No integer overflow exception occurs under any
circumstances. In the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32bit value.
Subtract Unsigned
SUBU rz, rx, ry
The contents of general-purpose register ry are subtracted from the contents of general-purpose
register rx. The 32-bit result is placed into general-purpose register rz. No integer overflow exception
occurs under any circumstances. In the 64-bit mode, the operand must be a 64-bit value formed by
sign-extending a 32-bit value.
Doubleword Add
Unsigned
DADDU rz, rx, ry
The contents of general-purpose register ry are added to the contents of general-purpose register rx.
The 64-bit result is placed into register rz. No integer overflow exception occurs under any
circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Subtract
Unsigned
DSUBU rz, rx, ry
The contents of general-purpose register ry are subtracted from the contents of general-purpose
register rx. The 64-bit result is placed into general-purpose register rz. No integer overflow exception
occurs under any circumstances.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Set on Less Than
SLT rx, ry
The contents of general-purpose register ry are subtracted from the contents of general-purpose
register rx. Considering both quantities as signed integers, if the contents of rx are less than the
contents of ry, the result is set to 1; otherwise, the result is set to 0. The result is placed into register T
($24).
No integer overflow exception occurs. The comparison is valid even if the subtraction overflows.
Set on Less Than
Unsigned
SLTU rx, ry
The contents of general-purpose register ry are subtracted from the contents of general-purpose
register rx. Considering both quantities as unsigned integers, if the contents of rx are less than the
contents of ry, the result is set to 1; otherwise, the result it set to 0. The result is place in register T
($24).
No integer overflow exception occurs. The comparison is valid even if the subtraction overflows.
76
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-16. Two-/Three-Operand Register Type (2/2)
Instruction
Format and Description
Compare
CMP rx, ry
The contents of general-purpose register ry are Exclusive-ORed with the contents of general-purpose
register rx. The result is placed into register T ($24).
Negate
NEG rx, ry
The contents of general-purpose register ry are subtracted from zero to form a 32-bit result. The result
is placed in general-purpose register rx.
AND
AND rx, ry
The contents of general-purpose register ry are logical ANDed with the contents of general-purpose
register rx in 1-bit units. The result is placed in general-purpose register rx.
OR
OR rx, ry
The contents of general-purpose register ry are logical ORed with the contents of general-purpose
register ry. The result is placed in general-purpose register rx.
Exclusive OR
XOR rx, ry
The contents of general-purpose register ry are Exclusive-ORed with the contents of general-purpose
register rx in 1-bit units. The result is placed in general-purpose register rx.
NOT
NOT rx, ry
The contents of general-purpose register ry are inverted in 1-bit units and placed in general-purpose
register rx.
Move
MOVE ry, r32
The contents of general-purpose register r32 are moved to general-purpose register ry. R32 can
specify any one of the 32 general-purpose registers.
MOVE r32, rz
The contents of general-purpose register rz are moved to general-purpose register r32. r32 can specify
any one of the 32 general-purpose registers
User’s Manual U15509EJ2V0UM
77
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-17. Shift Instructions (1/2)
Instruction
Format and Description
Shift Left Logical
SLL rx, ry, immediate
The 32-bit contents of general-purpose register ry are shifted left and zeros are inserted into the
emptied low-order bits. The 3-bit immediate specifies the shift count. A shift count of 0 is interpreted as
a shift count of 8. The result is placed in general-purpose register rx. In the 64-bit mode, the value that
is formed by sign-extending shifted 32-bit value is stored as the result.
Shift Right Logical
SLR rx, ry, immediate
The 32-bit contents of general-purpose register ry are shifted right, and zeros are inserted into the
emptied high-order bits. The 3-bit immediate specifies the shift count. A shift count of 0 is interpreted
as a shift count of 8. The result is placed in general-purpose register rx. In the 64-bit mode, the value
that is formed by sign-extending shifted 32-bit value is stored as the result.
Shift Right Arithmetic
SRA rx, ry, immediate
The 32-bit contents of general-purpose register ry are shifted right and the emptied high-order bits are
sign extended. The 3-bit immediate specifies the shift count. A shift count of 0 is interpreted as a shift
count of 8. In the 64-bit mode, the value that is formed by sign-extending shifted 32-bit value is stored
as the result.
Shift Left Logical
Variable
SLLV ry, rx
The 32-bit contents of general-purpose register ry are shifted left, and zeros are inserted into the
emptied low-order bits. The five low-order bits of general-purpose register rx specify the shift count.
The result is placed in general-purpose register ry. In the 64-bit mode, the value that is formed by signextending shifted 32-bit value is stored as the result.
Shift Right Logical
Variable
SRLV ry, rx
The 32-bit contents of general-purpose register ry are shifted right, and the emptied high-order bits are
sign extended. The five lower-order bits of general-purpose register rx specify the shift count. The
register is placed in general-purpose register ry. In the 64-bit mode, the value that is formed by signextending shifted 32-bit value is stored as the result.
Shift Right Arithmetic
Variable
SRAV ry, rx
The 32-bit contents of general-purpose register ry are shifted right, and the emptied high-order bits are
sign extended. The five low-order bits of general-purpose register rx specify the shift count. The result
is placed in general-purpose register ry. In the 64-bit mode, the value that is formed by sign-extending
shifted 32-bit value is stored as the result.
78
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-17. Shift Instructions (2/2)
Instruction
Format and Description
Doubleword Shift Left
Logical
DSLL rx, ry, immediate
The 64-bit doubleword contents of general-purpose register ry are shifted left, and zeros are inserted
into the emptied low-order bits. The 3-bit immediate specifies the shift count. A shift count of 0 is
interpreted as a shift count of 8. The 64-bit result is placed in general-purpose register rx.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Shift
Right Logical
DSRL ry, immediate
The 64-bit doubleword contents of general-purpose register ry are shifted right, and zeros are inserted
into the emptied high-order bits. The 3-bit immediate specifies the shift count. A shift count of 0 is
interpreted as a shift count of 8.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Shift
Right Arithmetic
DSRA ry, immediate
The 64-bit doubleword contents of general-purpose register ry are shifted right, and the emptied highorder bits are sign extended. The 3-bit immediate specifies the shift count. A shift count of 0 is
interpreted as a shift count of 8.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Shift Left
Logical Variable
DSLLV ry, rx
The 64-bit doubleword contents of general-purpose register ry are shifted left, and zeros are inserted
into the emptied low-order bits. The six low-order bits of general-purpose register rx specify the shift
count. The result is placed in general-purpose register ry.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Shift
Right Logical Variable
DSRLV ry, rx
The 64-bit doubleword contents of general-purpose register ry are shifted right, and zeros are inserted
into the emptied high-order bits. The six low-order bits of general-purpose register rx specify the shift
count. The result is placed in general-purpose register ry.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Shift
Right Arithmetic
Variable
DSRAV ry, rx
The 64-bit doubleword contents of general-purpose register ry are shifted right, and the emptied highorder bits are sign extended. The six low-order bits of general-purpose register rx specify the shift
count. The result is placed in general-purpose register ry.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
User’s Manual U15509EJ2V0UM
79
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-18. Multiply/Divide Instructions (1/2)
Instruction
Format and Description
Multiply
MULT rx, ry
The contents of general-purpose registers rx and ry are multiplied, treating both operands as 32-bit
two's complement values. No integer overflow exception occurs.
In the 64-bit mode, the operand must be a 64-bit value formed by sign-extending a 32-bit value.
The low-order 32-bit word of the result are placed in special register LO, and the high-order 32-bit word
is placed in special register HI. In the 64-bit mode, each result is sign extended and then stored.
If either of the two immediately preceding instructions is MFHI or MFLO, their transfer instruction
execution result becomes undefined. To obtain the correct result, insert two or more other instructions
between the MFHI, MFLO instructions, and the MULT instruction.
Multiply Unsigned
MULTU rx, ry
The contents of general-purpose registers rx and ry are multiplied, treating both operands as 32-bit
unsigned values. No integer overflow exception occurs. In the 64-bit mode, the operand must be a 64bit value formed by sign-extending a 32-bit value. The low-order 32-bit word of the result is placed in
special register LO, and the high-order 32-bit word is placed in special register HI. In the 64-bit mode,
each result is sign extended and stored.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the MULTU instruction.
Divide
DIV rx, ry
The contents of general-purpose register rx are divided by the contents of general-purpose register ry,
treating both operands as 32-bit two's complement values. No integer overflow exception occurs. The
result when the divisor is 0 is undefined. The 32-bit quotient is placed in special register LO, and the
32-bit remainder is placed in special register HI. In the 64-bit mode, the result is sign extended.
Normally, this instruction is executed after instructions checking for division by zero and overflow.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DIV instruction.
Divide Unsigned
DIVU rx, ry
The contents of general-purpose register rx are divided by the contents of general-purpose register ry,
treating both operands as unsigned values. No integer overflow exception occurs. The result when the
divisor is 0 is undefined. The 32-bit quotient is placed in special register LO, and the 32-bit remainder is
placed in special register HI. In the 64-bit mode, the result is sign extended.
Normally, this instruction is executed after instructions checking for division by zero.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DIVU instruction.
Move from HI
MFHI rx
The contents of special register HI are loaded into general-purpose register rx.
To ensure correct operation when an interrupt occurs, do not use an instruction that changes the HI
register (MULT, MULTU, DIV, DIVU, DMULT, DMULTU, DDIV, DDIVU) for the two instructions after
the MFHI instruction.
80
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-18. Multiply/Divide Instructions (2/2)
Instruction
Format and Description
Move from LO
MFLO rx
The contents of special register LO are loaded into general-purpose register rx.
To ensure correct operation when an interrupt occurs, do not use an instruction that changes the HI
register (MULT, MULTU, DIV, DIVU, DMULT, DMULTU, DDIV, DDIVU) for the two instructions after
the MFLO instruction.
Doubleword Multiply
DMULT rx, ry
The 64-bit contents of general-purpose register rx and ry are multiplied, treating both operands as two's
complement values. No integer overflow exception occurs. The low-order 64 bits of the result are
placed in special register LO, and the high-order 64 bits are placed in special register HI.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DMULT instruction.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Multiply
Unsigned
DMULTU rx, ry
The 64-bit contents of general-purpose registers rx and ry are multiplied, treating both operands as
unsigned values. No integer overflow exception occurs. The low-order 64 bits of the result are placed in
special register LO, and the high-order 64 bits of the result are placed in special register HI.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DMULTU instruction.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword divide
DDIV rx, ry
The 64-bit contents of general-purpose registers rx are divided by the contents of general-purpose
register ry, treating both operands as two's complement values. No integer overflow exception occurs.
The result when the divisor is 0 is undefined. The 64-bit quotient is placed in special register LO, and
the 64-bit remainder is placed in special register HI. Normally, this instruction is executed after
instructions checking for division by zero and overflow.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DDIV instruction.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
Doubleword Divide
Unsigned
DDIVU rx, ry
The 64-bit contents of general-purpose register rx are divided by the contents of general-purpose
register ry, treating both operands as unsigned values. No integer overflow exception occurs. The
result when the divisor is 0 is undefined. The 64-bit quotient is placed in special register LO, and the
64-bit remainder is placed in special register HI. Normally, this instruction is executed after an
instruction checking for division by zero.
If either of the two immediately preceding instructions is MFHI or MFLO, the result of execution of
these transfer instructions is undefined. To obtain the correct result, insert two or more other
instructions between the MFHI, MFLO instructions and the DDIVU instruction.
This operation is defined in the 64-bit mode and the 32-bit kernel mode. When this instruction is
executed in the 32-bit user/supervisor mode, a reserved instruction exception is generated.
User’s Manual U15509EJ2V0UM
81
CHAPTER 3
MIPS16 INSTRUCTION SET
(3) Jump and branch instructions
Jump and Branch instructions change the control flow of a program.
All Jump instructions occur with a one-instruction delay. That is, the instruction immediately following the jump is
always executed.
Branch instructions do not have a delay slot. If a branch is taken, the instruction immediately following the branch
is never executed.
If the branch is not taken, the instruction immediately following the branch is always
executed.
Table 3-19 shows the MIPS16 Jump and Branch instructions.
Table 3-19. Jump and Branch Instructions (1/2)
Instruction
Format and Description
Jump and Link
JAL target
The 26-bit target address is shifted left two bits and combined with the high-order four bits of the
address of the delay slot. The program unconditionally jumps to this calculated address with a delay of
one instruction. The address of the instruction immediately following the delay slot is placed in register
ra. The ISA Mode bit is left unchanged. The value stored in ra bit 0 will reflect the current ISA Mode bit.
Jump and Link
Exchange
JALX target
The 26-bit target address is shifted left two bits and combined with the high-order four bits of the
address of the delay slot. The program unconditionally jumps to this calculated address with a delay of
one instruction. The address of the instruction immediately following the delay slot is placed in register
ra. The ISA Mode bit is inverted with a delay of one instruction. The value stored in ra bit 0 will reflect
the ISA Mode bit before execution of the Jump execution.
Jump Register
JR rx
The program unconditionally jumps to the address specified in general-purpose register rx, with a delay
of one instruction. The instruction sets the ISA Mode bit to the value in rx bit 0. If the Jump target
address is in the MIPS16 instruction length mode, no address exception occurs when bit 0 of the
source register is 1 because bit 0 of the target address is 0 so that the instruction is located at the
halfword boundary.
If the 32-bit length instruction mode is changed, an address exception occurs when the jump target
address is fetched if the two low-order bits of the target address are not 0.
JR ra
The program unconditionally jumps to the address specified in register ra, with a delay of one
instruction. The instruction sets the ISA Mode bit to the value in ra bit 0. If the Jump target address is in
the MIPS16 instruction length mode, no address exception occurs when bit 0 of the source register is 1
because bit 0 of the target address is 0 so that the instruction is located at the halfword boundary.
If the 32-bit length instruction mode is changed, an address exception occurs when the jump target
address is fetched if the two low-order bits of the target address are not 0.
Jump and Link
Register
82
JALR ra, rx
The program unconditionally jumps to the address contained in register rx, with a delay of one
instruction. This instruction sets the ISA Mode bit to the value in rx bit 0. The address of the instruction
immediately following the delay slot is placed in register ra. The value stored in ra bit 0 will reflect the
ISA mode bit before the jump execution is executed.
If the Jump target address is in the MIPS16 instruction length mode, no address exception occurs
when bit 0 of the source register is 1 because bit 0 of the target address is 0 so that the instruction is
located at the halfword boundary.
If the 32-bit length instruction mode is changed, an address exception occurs when the jump target
address is fetched if the two low-order bits of the target address are not 0.
User’s Manual U15509EJ2V0UM
CHAPTER 3
MIPS16 INSTRUCTION SET
Table 3-19. Jump and Branch Instructions (2/2)
Instruction
Format and Description
Branch on Equal to
Zero
BEQZ rx, immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the
instruction after the branch to form the target address. If the contents of general-purpose register rx are
equal to zero, the program branches to the target address. No delay slot is generated.
Branch on Not Equal
to Zero
BNEZ rx, immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the
instruction after the branch to form the target address. If the contents of general-purpose register rx are
not equal to zero, the program branches to the target address. No delay slot is generated.
Branch on T Equal to
Zero
BTEQZ immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the
instruction after the branch to form the target address. If the contents of special register T ($24) are not
equal to zero, the program branches to the target address. No delay slot is generated.
Branch on T Not
Equal to Zero
BTNEZ immediate
The 8-bit immediate is shifted left one bit, sign extended, and then added to the address of the
instruction after the branch to form the target address. If the contents of special register T ($24) are not
equal to zero, the program branches to the target address. No delay slot is generated.
Branch Unconditional
B immediate
The 11-bit immediate is shifted left one bit, sign extended, and then added to the address of the
instruction after the branch to form the target address. The program branches to the target address
unconditionally.
(4) Special instructions
Special instructions unconditionally perform branching to general exception vectors. Special instructions are of
the R type. Table 3-20 shows three special instructions.
Table 3-20. Special Instructions
Instruction
Format and Description
Breakpoint
BREAK immediate
A breakpoint trap occurs, immediately and unconditionally transferring control to the exception handler.
By using a 6-bit code area, parameters can be sent to the exception handler. If the exception handler
uses this parameter, the contents of memory including instructions must be loaded as data.
Extend
EXTEND immediate
The 11-bit immediate is combined with the immediate in the next instruction to form a larger immediate
equivalent to 32-bit MIPS. The Extend instruction must always precede (by one instruction) the
instruction whose immediate field you want to extend. Every extended instruction consumes four bytes
in program memory instead of two bytes (two bytes for Extend and two bytes for the instruction being
extended), and it can cross a word boundary. (For details, see 3.8.2 Extend instruction.)
System Call
SYSCALL
A system call trap occurs, immediately and unconditionally transferring control to the exception
handler.
User’s Manual U15509EJ2V0UM
83
CHAPTER 4 PIPELINE
This chapter describes the basic operation of the VR4100 Series processor pipeline, which includes descriptions
of the delay slots (instructions that follow a branch or load instruction in the pipeline), and interrupts to the pipeline
flow caused by interlocks and exceptions.
4.1 Pipeline Stages
In the VR Series, an instruction execution system called a pipeline is adopted.
In the pipeline, instruction
execution processing is delimited into several stages. Instruction execution is complete when each stage is passed.
When processing of one instruction in one stage of the pipeline is complete, the next instruction enters that stage.
When the pipeline is full, it means that instructions equaling the number of pipeline stages are being executed
simultaneously.
The pipeline clock is called the PClock. Each cycle of the PClock is called a PCycle. Instructions are read in
synchronization with the PClock. Each stage of the pipeline is executed in one PCycle. Therefore, executing an
instruction requires as many PCycles as the number of pipeline stages. When the required data has not been cached
and must instead be fetched from the main memory, the execution requires more cycles than the number of pipeline
stages.
4.1.1 VR4121, VR4122, VR4181A
The pipeline of the VR4121, VR4122, or VR4181A has five stages in the MIPS III (32-bit length) instruction mode,
or six stages in the MIPS16 (16-bit length) instruction mode.
The name and meanings of each stage are as follows.
• IF - Instruction cache fetch
• IT - Instruction translation (in MIPS16 instruction mode only)
• RF - Register fetch
• EX - Execution
• DC - Data cache fetch
• WB - Writeback
84
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
Figure 4-1. Pipeline Stages (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
IF
Stage
RF
EX
DC
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
EX
DC
WB
Figure 4-2 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each
instruction, and a column indicates the processes executed simultaneously.
User’s Manual U15509EJ2V0UM
85
CHAPTER 4 PIPELILNE
Figure 4-2. Instruction Execution in the Pipeline (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
(5-deep)
PCycle
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
Current CPU
cycle
(b) MIPS16 instruction mode
(6-deep)
PCycle
IF
IT
RF
EX
DC
WB
IF
IF
IT
RF
EX
DC
WB
IF
IF
IT
RF
EX
DC
WB
IF
IF
IT
RF
EX
DC
WB
IF
IF
IT
RF
EX
DC
WB
IF
IF
IT
RF
EX
DC
Current CPU
cycle
86
User’s Manual U15509EJ2V0UM
WB
CHAPTER 4 PIPELILNE
4.1.2 VR4131
The pipeline of the VR4131 employs the 2-way superscalar mechanism that can execute two instructions each in
the same stage. Each pipeline has six stages in the MIPS III (32-bit length) instruction mode, or seven stages in the
MIPS16 (16-bit length) instruction mode.
The name and meanings of each stage are as follows.
• IF - Instruction cache fetch
• IT - Instruction translation (in MIPS16 instruction mode only)
• RF - Register fetch
• EX - Execution
• DC1 - Data cache fetch
• DC2 - Data read
• WB - Writeback
Figure 4-3. Pipeline Stages (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
DC1
DC2
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
EX
DC1
DC2
WB
Figure 4-4 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each
instruction, and a column indicates the processes executed simultaneously.
User’s Manual U15509EJ2V0UM
87
CHAPTER 4 PIPELILNE
Figure 4-4. Instruction Execution in the Pipeline (VR4131)
(a) MIPS III instruction mode
PCycle
(6-deep)
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
WB
Current CPU
cycle
(b) MIPS16 instruction mode
(7-deep)
PCycle
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
IF
IT
RF
EX
DC1
DC2
WB
Current CPU
cycle
88
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.1.3 VR4181
The pipeline of the VR4181 has five stages regardless the instruction set modes. Each stage has two phases: Φ1
and Φ2.
The name and meanings of each stage are as follows.
• IF - Instruction cache fetch
• RF - Register fetch
• EX - Execution
• DC - Data cache fetch
• WB - Write back
Figure 4-5. Pipeline Stages (VR4181)
PCycle
PClock
Φ1
Phase
Φ2
Φ1
IF
Stage
Φ2
Φ1
RF
Φ2
Φ1
EX
Φ2
Φ1
DC
Φ2
WB
Figure 4-6 shows instruction execution in the pipeline. In this figure, a row indicates the execution process of each
instruction, and a column indicates the processes executed simultaneously.
Figure 4-6. Instruction Execution in the Pipeline (VR4181)
PCycle
IF1
IF2
(5-deep)
RF1
RF2
EX1
EX2
DC1
DC2 WB1 WB2
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2 WB1 WB2
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2 WB1 WB2
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2 WB1 WB2
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2 WB1 WB2
Current CPU
cycle
User’s Manual U15509EJ2V0UM
89
CHAPTER 4 PIPELILNE
4.2 Branch Delay
During a VR4100 Series' pipeline operation, a branch delay occurs when:
• Target address is calculated by a Jump instruction
• Branch condition of branch instruction is met and then logical operation starts for branch-destination
comparison
The instruction location immediately following a Jump/Branch instruction is referred to as the branch delay slot.
4.2.1 VR4121, VR4122, VR4181A
The instruction address generated at the EX stage in the Jump/Branch instruction is available in the IF stage two
instructions later.
In the VR4121, VR4122, and VR4181A, two cycles of branch delay occurs during MIPS III (32-bit length) instruction
mode, or three cycles during MIPS16 (16-bit length) instruction mode, when a branch condition is met. An instruction
in the branch delay slot is executed during MIPS III instruction mode (except for Branch Likely instructions), though it
is discarded during MIPS16 instruction mode.
Figure 4-7 illustrates the branch delay and the location of the branch delay slot.
Figure 4-7. Branch Delay (VR4121, VR4122, VR4181A)
(a) MIPS III Instruction mode
PCycle
Jump/Branch
IF
(Branch delay slot)
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
Target
DC
WB
Branch delay
(b) MIPS16 instruction mode
PCycle
Jump/Branch
(Branch delay slot)
IF
IT
RF
EX
DC
WB
IF
IT
RF
EX
DC
WB
IF
IT
RF
Target
Branch delay
90
User’s Manual U15509EJ2V0UM
EX
DC
WB
CHAPTER 4 PIPELILNE
4.2.2 VR4131
The instruction address prefetched at the RF stage in the Jump/Branch instruction is available in the IF stage two
instructions later.
Since the VR4131 employs the 2-way superscalar mechanism, the manipulation of succeeding instructions differs
depending that the address of a Jump/Branch instruction is higher or not than that of the instruction in the other way
when it is fetched.
(1) MIPS III instruction mode
In the VR4131, two cycles of branch delay occurs when a branch condition is met. An instruction in the branch
delay slot is executed (except for Branch Likely instructions).
Figure 4-8 illustrates the branch delay and the location of the branch delay slot.
Figure 4-8. Branch Delay (VR4131, MIPS III Instruction Mode)
(a) When Jump/Branch has lower address
PCycle
Jump/Branch
0
IF
RF
EX
DC1
DC2
WB
(Branch delay slot)
4
IF
RF
EX
DC1
DC2
WB
8
IF
RF
C
IF
RF
0
IF
RF
EX
DC1
DC2
WB
4
IF
RF
EX
DC1
DC2
WB
Target
(b) When Jump/Branch has higher address
PCycle
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
0
IF
RF
EX
DC1
DC2
WB
4
IF
RF
EX
DC1
DC2
WB
4
(Branch delay slot)
Target
IF
RF
Jump/Branch
User’s Manual U15509EJ2V0UM
91
CHAPTER 4 PIPELILNE
(2) MIPS16 instruction mode
In the VR4131, three cycles of branch delay occurs when a branch condition is met. An instruction in the branch
delay slot is discarded.
Figure 4-9 illustrates the branch delay and the location of the branch delay slot.
Figure 4-9. Branch Delay (VR4131, MIPS16 Instruction Mode)
(a) When Jump/Branch has lower address
PCycle
Jump/Branch
(Branch delay slot)
Target
1
IF
IT
RF
EX
3
IF
IT
RF
EX
5
IF
IT
RF
7
IF
IT
RF
9
IF
IT
B
IF
IT
DC1
DC2
WB
1
IF
IT
RF
EX
DC1
DC2
WB
3
IF
IT
RF
EX
DC1
DC2
WB
(b) When Jump/Branch has higher address
PCycle
RF
EX
DC1
DC2
WB
5
IF
IT
RF
7
IF
IT
RF
9
IF
IT
B
IF
IT
1
IF
IT
RF
EX
DC1
DC2
WB
3
IF
IT
RF
EX
DC1
DC2
WB
3
(Branch delay slot)
Target
92
IF
IT
Jump/Branch
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.2.3 VR4181
The instruction address generated at the RF stage in the Jump/Branch instruction are available in the IF stage,
two instructions later.
In the VR4181, one cycle of branch delay occurs when a branch condition is met in MIPS III instruction mode. An
instruction in the branch delay slot is executed (except for Branch Likely instructions).
No branch delay due to a branch instruction occurs in MIPS16 instruction mode. When a branch condition is met,
the instruction representing a delay slot is discarded.
Figure 4-10 illustrates the branch delay and the location of the branch delay slot.
Figure 4-10. Branch Delay (VR4181)
PCycle
Jump/Branch
(Branch delay slot)
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
Target
WB
Branch delay
User’s Manual U15509EJ2V0UM
93
CHAPTER 4 PIPELILNE
4.3 Branch Prediction
The VR4122, VR4131, and VR4181A have a branch prediction mechanism to speed up branch instruction
processing.
The VR4122, VR4131, and VR4181A have a full-associative virtual address cache called a branch prediction table.
This table holds the history of the branches that have been satisfied recently, using the address of the Branch
instruction as a tag and the branch destination address as data.
The VR4122, VR4131, and VR4181A reference the branch prediction table when they fetch a Branch instruction. If
the same Branch instruction is in the table (hit), they branch to the branch destination address in the table rather than
calculating the branch destination address. If the corresponding Branch instruction is not in the table (miss), they
recalculate the branch destination address. If the condition of a missed Branch instruction is satisfied, that Branch
instruction and the address of the branch destination are stored in the branch prediction table. New history is written
over the entry stored earliest (LRU (least recently used) algorithm).
The branch prediction table of the VR4122 and VR4181A can hold four entries, and that of the VR4131 can hold
eight entries.
Whether the branch prediction mechanism is to be used can be specified by using the BP bit of the Config register
of CP0. Branch prediction is executed when the BP bit is cleared to 0; it is not executed when the bit is set to 1. The
BP bit is cleared to 0 by default.
Branch prediction is not executed in the MIPS16 instruction mode and debug mode. The BP bit is automatically
set to 1.
Because the branch prediction table is a virtual address cache, it is invalid if the contents of a physical address
corresponding to a virtual address change. When performing an operation that rewrites the text area (such as
changing the bank or downloading), therefore, either disable branch prediction (by setting the BP bit to 1) or clear the
history of the branch prediction table immediately before. Clear the history regardless of whether the VR4122,
VR4131, or VR4181A operates in the virtual address mode. The VR4122, VR4131, and VR4181A clear the history of
the branch prediction table in the following cases.
- Writing to EntryHi register
- Writing to Config register (VR4131 only)
- Execution of TLBWI instruction
- Execution of TLBWR instruction
- Execution of TLBR instruction
94
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.3.1 VR4122, VR4181A
The VR4122 and VR4181A reference the branch prediction table in the IF stage of a Branch instruction. If a hit
occurs when the branch condition is decoded in the RF stage, the instruction at the branch destination address
output from the branch prediction table is fetched.
When the branch condition is checked in the EX stage and it has been ascertained that a branch is to occur, the
pipeline processing of the instruction at the branch destination continues. If it has been found that a branch is not to
occur, the processing of the instruction at the branch destination is stopped, and the next instruction in the branch
delay slot is fetched in the DC stage.
If it is found that the condition of a Branch instruction missed in the branch prediction table is satisfied and that a
branch is to occur, the branch prediction table is updated in the DC stage.
The figure below illustrates the pipeline operation when branch prediction is performed.
Figure 4-11. Pipeline on Branch Prediction (VR4122, VR4181A) (1/2)
(a) When branch prediction misses and no branch is to occur
PCycle
Branch
IF
(Branch delay slot)
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
Instruction following
branch delay slot
WB
(b) When branch prediction misses and branch is to occur
PCycle
Branch
(Branch delay slot)
Target
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
User’s Manual U15509EJ2V0UM
DC
WB
95
CHAPTER 4 PIPELILNE
Figure 4-11. Pipeline on Branch Prediction (VR4122, VR4181A) (2/2)
(c) When branch prediction hits and no branch is to occur
PCycle
Branch
IF
(Branch delay slot)
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
Target
IF
Instruction following
branch delay slot
DC
WB
(d) When branch prediction hits and branch is to occur
PCycle
Branch
(Branch delay slot)
Target
96
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
User’s Manual U15509EJ2V0UM
WB
CHAPTER 4 PIPELILNE
4.3.2 VR4131
The VR4131 references the branch prediction table in the IF stage of a Branch instruction. If a hit occurs, the
instruction at the branch destination address output from the branch prediction table is fetched.
When the branch condition is checked in the EX stage and it has been ascertained that a branch is to occur, the
pipeline processing of the instruction at the branch destination continues. If it has been found that a branch is not to
occur, the processing of the instruction at the branch destination is stopped, and the next instruction in the branch
delay slot is fetched in the DC stage.
If it is found that the condition of a Branch instruction missed in the branch prediction table is satisfied and that a
branch is to occur, the branch prediction table is updated in the DC stage.
The figure below illustrates the pipeline operation when branch prediction is performed.
Figure 4-12. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Lower Address) (1/2)
(a) When branch prediction misses and no branch is to occur
PCycle
Branch
0
IF
RF
EX
DC1
DC2
WB
(Branch delay slot)
4
IF
RF
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
EX
DC1
DC2
WB
(b) When branch prediction misses and branch is to occur
PCycle
Branch
0
IF
RF
EX
DC1
DC2
WB
(Branch delay slot)
4
IF
RF
EX
DC1
DC2
WB
0
IF
RF
EX
DC1
DC2
WB
4
IF
RF
EX
DC1
DC2
WB
Target
8
IF
RF
C
IF
RF
10
IF
14
IF
User’s Manual U15509EJ2V0UM
97
CHAPTER 4 PIPELILNE
Figure 4-12. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Lower Address) (2/2)
(c) When branch prediction hits and no branch is to occur
PCycle
0
IF
RF
EX
DC1
DC2
WB
(Branch delay slot)
4
IF
RF
EX
DC1
DC2
WB
Target
4
IF
RF
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
EX
DC1
DC2
WB
Branch
IF
8
Instruction following
branch delay slot
(d) When branch prediction hits and branch is to occur
PCycle
0
IF
RF
EX
DC1
DC2
WB
(Branch delay slot)
4
IF
RF
EX
DC1
DC2
WB
Target
4
IF
RF
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
EX
DC1
DC2
WB
Branch
98
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
Figure 4-13. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Higher Address) (1/2)
(a) When branch prediction misses and no branch is to occur
PCycle
IF
RF
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
EX
DC1
DC2
WB
Branch
4
(Branch delay slot)
(b) When branch prediction misses and branch is to occur
PCycle
Branch
(Branch delay slot)
Target
IF
RF
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
RF
0
IF
RF
EX
DC1
DC2
WB
4
IF
RF
EX
DC1
DC2
WB
4
User’s Manual U15509EJ2V0UM
99
CHAPTER 4 PIPELILNE
Figure 4-13. Pipeline on Branch Prediction (VR4131, When the Branch Is in the Higher Address) (2/2)
(c) When branch prediction hits and no branch is to occur
PCycle
Branch
(Branch delay slot)
Instruction following
branch delay slot
IF
RF
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
IF
RF
EX
DC1
DC2
WB
IF
RF
EX
DC1
DC2
4
C
0
WB
(d) When branch prediction hits and branch is to occur
PCycle
EX
DC1
DC2
WB
8
IF
RF
EX
DC1
DC2
WB
C
IF
0
IF
RF
EX
DC1
DC2
WB
4
IF
RF
EX
DC1
DC2
WB
4
(Branch delay slot)
Target
100
IF
RF
Branch
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.4 Load Delay
The instruction location immediately following a load instruction is referred to as the load delay slot.
The instruction in a load delay slot can use the contents of the loaded register, however in such cases hardware
interlocks insert additional delay cycles.
Consequently, scheduling load delay slots can be desirable, both for
performance and VR-Series processor compatibility.
In the VR4121, VR4122, and VR4181A, two cycles of DC stage are necessary during a load instruction execution
for data read from the data cache and data alignment, and therefore hardware automatically causes interlock.
4.5 Instruction Streaming
If a miss occurs in the instruction cache, a cycle to refill instructions from the main memory to the instruction
cache is started. At this time, the VR4122, VR4131, and VR4181A continue pipeline processing while writing data
(instructions) to the instruction cache and bypassing the data (instructions) to the instruction decoder of the CPU.
Therefore, processing can be resumed earlier from a stall that takes place if a miss occurs in the instruction cache.
This instruction data bypassing function is called streaming.
The instruction streaming function is enabled or disabled by the IS bit of the Config register of CP0. Instruction
streaming is executed when the IS bit is cleared to 0; it is not executed when the bit is set to 1. The IS bit is cleared
to 0 by default.
If instruction streaming is not executed, the pipeline is stalled until refilling the instruction cache has been
completed.
User’s Manual U15509EJ2V0UM
101
CHAPTER 4 PIPELILNE
4.6 Pipeline Activities
Figure 4-14 shows the activities that can occur during each pipeline stage; Table 4-1 describes these pipeline
activities.
Figure 4-14. Pipeline Activities (1/2)
(a) VR4121, VR4122, and VR4181A
PCycle
PClock
Stage
Instruction fetch
IF
IT
ICA
ITC
RF
EX
DC
DC
WB
DCA
DLA
DTLB
DTC
WB
DSA
DTD
DCW
ITCNote
ITLB
Instruction translation
& decode
ITR
IDEC
RF
EX
ALU
DVA
Load/Store
Branch
WB
BAC
(b) VR4131
PCycle
PClock
Stage
Instruction fetch
IF
IT
RF
ICA
ITC
ITCNote
EX
DC1
DC2
WB
ITLB
Decode
ITR
IDEC
RF
EX
ALU
DVA
Load/Store
Branch
BAC
Note When MIPS III instruction mode
102
User’s Manual U15509EJ2V0UM
WB
DCA
DLA
DTLB
DTC
WB
DSA
DTD
DCW
CHAPTER 4 PIPELILNE
Figure 4-14. Pipeline Activities (2/2)
(c) VR4181
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
Instruction fetch &
IDC
ICA
IDEC
RF
decode
ITLB
ITC
ALU
EX
DVA
Load/Store
DCA
DLA
DTLB
DTC
SA
DTD
WB
DCW
BAC
Branch
WB
Table 4-1. Description of Pipeline Activities during Each Stage
Mnemonic
Description
IDC
Instruction cache address decode
ITLB
Instruction address translation
ICA
Instruction cache array access
ITR
Instruction translation
ITC
Instruction tag check
IDEC
Instruction decode
RF
Register operand fetch
BAC
Branch address calculation
EX
Execution stage
DVA
Data virtual address calculation
SA/DSA
Store align
DCA
Data cache address decode/array access
DTLB
Data address translation
DLA
Data cache load align
DTC
Data tag check
DTD
Data transfer to data cache
DCW
Data cache write
WB
Write back to register file
The operation of the pipeline is illustrated by the following examples that describe how typical instructions are
executed. Each instruction is taken through the pipeline and the operations that occur in each relevant stage are
described.
User’s Manual U15509EJ2V0UM
103
CHAPTER 4 PIPELILNE
(1) Add instruction (ADD rd, rs, rt)
IF stage
The eleven least-significant bits of the virtual address are used to access the instruction cache.
Then the cache index is compared with the page frame number and the cache data is read out.
The virtual PC is incremented by 4 so that the next instruction can be fetched.
IT stage
A MIPS16 instruction is translated into a 32-bit length instruction (VR4121, VR4122, VR4131, and
VR4181A only).
RF stage
The 2-port register file is addressed with the rs and rt fields and the register data is valid at the
register file output. At the same time, bypass multiplexers select inputs from either the EX- or DCstage output in addition to the register file output, depending on the need for an operand bypass.
EX stage
The operands flow into the ALU inputs, and the ALU operation is started. The result of the ALU
operation is latched into the ALU output latch.
DC stage
This stage is a NOP for this instruction. The data from the output of the EX stage (the ALU) is
moved into the output latch of the DC.
WB stage
The WB latch feeds the data to the inputs of the register file, which is accessed by the rd field. The
data is written into the file.
Figure 4-15. ADD Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
EX
RF
DC
WB
WB
EX
ITC
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
RF
EX
EX
ITC
104
User’s Manual U15509EJ2V0UM
DC
WB
WB
CHAPTER 4 PIPELILNE
Figure 4-16. ADD Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
RF
EX
DC1
DC2
WB
EX
WB
ITC
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
EX
RF
DC1
DC2
WB
EX
WB
ITC
Figure 4-17. ADD Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
RF
EX
ITLB
ITC IDEC
User’s Manual U15509EJ2V0UM
WB
105
CHAPTER 4 PIPELILNE
(2) Jump and Link Register instruction (JALR rd, rs)
IF stage
Same as the IF stage for the ADD instruction.
IT stage
Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage
A register specified in the rs field is read from the file, and the value read from the rs register is
input to the virtual PC latch synchronously. This value is used to fetch an instruction at the jump
destination. The value of the virtual PC incremented during the IF stage is incremented again to
produce the link address PC + 8 (PC + 4 in MIPS16 instruction mode) where PC is the address of
the JALR instruction. The resulting value is the PC to which the program will eventually return.
This value is placed in the Link output latch of the Instruction Address unit.
EX stage
The PC + 8 (PC + 4 in MIPS16 instruction mode) value is moved from the Link output latch to the
output latch of the EX stage.
DC stage
The PC + 8 (PC + 4 in MIPS16 instruction mode) value is moved from the output latch of the EX
stage to the output latch of the DC stage.
WB stage
Refer to the ADD instruction. Note that if no value is explicitly provided for rd then register 31 is
used as the default. If rd is explicitly specified, it cannot be the same register addressed by rs; if it
is, the result of executing such an instruction is undefined.
Figure 4-18. JALR Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
ICA
IDEC
ITLB
RF
EX
ITC
BAC
DC
WB
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
RF
ITC
106
EX
EX
BAC
User’s Manual U15509EJ2V0UM
DC
WB
WB
CHAPTER 4 PIPELILNE
Figure 4-19. JALR Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
ICA
IDEC
ITLB
RF
EX
ITC
BAC
DC1
DC2
WB
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
EX
RF
DC1
DC2
WB
EX
ITC
WB
BAC
Figure 4-20. JALR Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
RF
EX
ITLB
ITC IDEC
WB
BAC
User’s Manual U15509EJ2V0UM
107
CHAPTER 4 PIPELILNE
(3) Branch on Equal instruction (BEQ rs, rt, offset)
IF stage
Same as the IF stage for the ADD instruction.
IT stage
Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage
The register file is addressed with the rs and rt fields. A check is performed to determine if each
corresponding bit position of these two operands has equal values. If they are equal, the PC is
set to PC + target, where target is the sign-extended offset field. If they are not equal, the PC is
set to PC + 4.
EX stage
The next PC resulting from the branch comparison is valid at the beginning of instruction fetch.
DC stage
This stage is a NOP for this instruction.
WB stage
This stage is a NOP for this instruction.
Figure 4-21. BEQ Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
EX
RF
EX
ITC
BAC
DC
WB
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
RF
ITC
108
EX
EX
BAC
User’s Manual U15509EJ2V0UM
DC
WB
WB
CHAPTER 4 PIPELILNE
Figure 4-22. BEQ Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
ICA
IDEC
ITLB
RF
EX
ITC
BAC
DC1
DC2
WB
WB
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
ITR
IDEC
ITLB
EX
RF
DC1
DC2
WB
EX
ITC
WB
BAC
Figure 4-23. BEQ Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
RF
EX
ITLB
ITC IDEC
BAC
User’s Manual U15509EJ2V0UM
109
CHAPTER 4 PIPELILNE
(4) Trap if Less Than instruction (TLT rs, rt)
Remark TLT instruction is not included in the MIPS16 instruction set.
IF stage
Same as the IF stage for the ADD instruction.
RF stage
Same as the RF stage for the ADD instruction.
EX stage
ALU controls are set to do an A – B operation. The operands flow into the ALU inputs, and the
ALU operation is started. The result of the ALU operation is latched into the ALU output latch.
The sign bits of operands and of the ALU output latch are checked to determine if a less than
condition is true. If this condition is true, a Trap exception occurs. The value in the PC register is
used as an exception vector value, and from now on any instruction will be invalid.
DC stage
This stage is a NOP for this instruction.
WB stage
The EPC register is loaded with the value of the PC if the less than condition was met in the EX
stage. The Cause register ExCode field and BD bit are updated appropriately, as is the EXL bit of
the Status register. If the less than condition was not met in the EX stage, no activity occurs in
the WB stage.
Figure 4-24. TLT Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
RF
EX
EX
ITC
110
User’s Manual U15509EJ2V0UM
DC
WB
CHAPTER 4 PIPELILNE
Figure 4-25. TLT Instruction Pipeline Activities (VR4131)
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
RF
EX
DC1
DC2
WB
EX
ITC
Figure 4-26. TLT Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
RF
EX
ITLB
ITC IDEC
User’s Manual U15509EJ2V0UM
111
CHAPTER 4 PIPELILNE
(5) Load Word instruction (LW rt, offset (base))
IF stage
Same as the IF stage for the ADD instruction.
IT stage
Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage
Same as the RF stage for the ADD instruction. Note that the base field is in the same position as
the rs field.
EX stage
Refer to the EX stage for the ADD instruction. For LW, the inputs to the ALU come from
GPR[base] through the bypass multiplexer and from the sign-extended offset field. The result of
the ALU operation that is latched into the ALU output latch represents the effective virtual address
of the operand (DVA).
DC stage
The cache tag field is compared with the Page Frame Number (PFN) field of the TLB entry. After
passing through the load aligner, aligned data is placed in the DC output latch.
DC2 stage
After passing through the load aligner, aligned data is placed in the DC2 output latch (VR4121,
VR4122, VR4131, and VR4181A only).
WB stage
The cache read data is written into the register file addressed by the rt field.
Figure 4-27. LW Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
ICA
IDEC
ITLB
EX
DC
DC2
WB
WB
RF
EX
DCA
DLA
ITC
DVA
DTLB
DTC
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
ICA
EX
DC
DC2
WB
WB
IDEC
ITLB
RF
ITC
112
RF
EX
DCA
DLA
DVA
DTLB
DTC
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
Figure 4-28. LW Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
DC1
DC2
WB
ICA
IDEC
ITLB
RF
EX
DCA
DLA
WB
ITC
DVA
DTLB
DTC
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
IDEC
ITLB
RF
ITC
EX
DC1
DC2
WB
EX
DCA
DLA
WB
DVA
DTLB
DTC
Figure 4-29. LW Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
ITLB
ITC IDEC
RF
EX
DCA
DLA
DVA
DTLB
DTC
User’s Manual U15509EJ2V0UM
WB
113
CHAPTER 4 PIPELILNE
(6) Store Word instruction (SW rt, offset (base))
IF stage
Same as the IF stage for the ADD instruction.
IT stage
Same as the IT stage for the ADD instruction (VR4121, VR4122, VR4131, and VR4181A only).
RF stage
Same as the RF stage for the LW instruction.
EX stage
Refer to the LW instruction for a calculation of the effective address. From the RF output latch,
the GPR[rt] is sent through the bypass multiplexer and into the main shifter. The results of the
ALU are latched in the output latches.
DC stage
Refer to the LW instruction for a description of the cache access. The store data is aligned.
DC2 stage
Refer to the LW instruction for a description of the cache access (VR4121, VR4122, VR4131, and
VR4181A only).
WB stage
If there was a cache hit, the content of the store data output latch is written into the data cache at
the appropriate word location.
Note that all store instructions use the data cache for two consecutive PCycles. If the following
instruction requires use of the data cache, the pipeline is slipped for one PCycle to complete the
writing of an aligned store data.
Figure 4-30. SW Instruction Pipeline Activities (VR4121, VR4122, VR4181A)
(a) MIPS III instruction mode
PCycle
PClock
IF
RF
ICA
IDEC
ITLB
Stage
EX
DC
DC2
WB
RF
EX
DCA
DLA
WB
ITC
DVA
DTLB
DTC
DSA
DTD
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
ICA
EX
DC
DC2
WB
WB
IDEC
ITLB
RF
ITC
114
RF
EX
DCA
DLA
DVA
DTLB
DTC
DSA
DTD
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
Figure 4-31. SW Instruction Pipeline Activities (VR4131)
(a) MIPS III instruction mode
PCycle
PClock
Stage
IF
RF
EX
DC1
DC2
WB
ICA
IDEC
ITLB
RF
EX
DCA
DLA
WB
ITC
DVA
DTLB
DTC
DSA
DTD
(b) MIPS16 instruction mode
PCycle
PClock
Stage
IF
IT
RF
ICA
IDEC
ITLB
RF
ITC
EX
DC1
DC2
WB
EX
DCA
DLA
WB
DVA
DTLB
DTC
DSA
DTD
Figure 4-32. SW Instruction Pipeline Activities (VR4181)
PCycle
PClock
Phase
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Φ1
Φ2
Stage
IF1
IF2
RF1
RF2
EX1
EX2
DC1
DC2
WB1
WB2
IDC
ICA
ITLB
ITC IDEC
RF
EX
DVA
DTC
DTLB
SA
DTD
User’s Manual U15509EJ2V0UM
DCW
115
CHAPTER 4 PIPELILNE
4.7 Interlock and Exception
Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when data dependencies are
detected. Interruptions handled using hardware, such as cache misses, are referred to as interlocks, while those that
are handled using software are called exceptions. As shown in Figure 4-33, all interlock and exception conditions are
collectively referred to as faults.
Figure 4-33. Interlocks, Exceptions, and Faults
Faults
Software
Hardware
Exceptions
Interlocks
Abort
Stall
Slip
At each cycle, exception and interlock conditions are checked for all active instructions.
Because each exception or interlock condition corresponds to a particular pipeline stage, a condition can be
traced back to the particular instruction in the exception/interlock stage, as shown in Table 4-2. For instance, an LDI
Interlock is raised in the Register Fetch (RF) stage.
Tables 4-3 and 4-4 describe the pipeline interlocks and exceptions listed in Table 4-2.
116
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
Table 4-2. Correspondence of Pipeline Stage to Interlock and Exception Conditions
Stage
IF
RF
Status
Interlock
EX
DC
WB
(IT)
−
Stall
−
ITM
ICM
−
DTM
DCM
DCB
−
Slip
−
LDI
−
−
MDI
SLI
CP0
Exception
IAErr
NMI
Trap
Reset
ITLB
OVF
DTLB
INTr
DAErr
DTMod
IBE
WAT
SYSC
DBE
BP
NMI (VR4131)
CUn
INTr (VR4131)
−
RSVD
Remark In the above table, exception conditions are listed up in higher priority order.
User’s Manual U15509EJ2V0UM
117
CHAPTER 4 PIPELILNE
Table 4-3. Pipeline Interlock
Interlock
Description
ITM
Interrupt TLB Miss
ICM
Interrupt Cache Miss
LDI
Load Data Interlock
MDI
MD Busy Interlock
SLI
Store-Load Interlock
CP0
Coprocessor 0 Interlock
DTM
Data TLB Miss
DCM
Data Cache Miss
DCB
Data Cache Busy
Table 4-4. Description of Pipeline Exception
Exception
118
Description
IAErr
Instruction Address Error exception
NMI
Non-maskable Interrupt exception
ITLB
ITLB exception
INTr
Interrupt exception
IBE
Instruction Bus Error exception
SYSC
System Call exception
BP
Breakpoint exception
CUn
Coprocessor Unusable exception
RSVD
Reserved Instruction exception
Trap
Trap exception
OVF
Overflow exception
DAErr
Data Address Error exception
Reset
Reset exception
DTLB
DTLB exception
DTMod
DTLB Modified exception
WAT
Watch exception
DBE
Data Bus Error exception
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.7.1 Exception conditions
When an exception condition occurs, the relevant instruction and all those that follow it in the pipeline are
cancelled.
Accordingly, any stall conditions and any later exception conditions that may have referenced this
instruction are inhibited; there is no benefit in servicing stalls for a cancelled instruction.
When an exceptional condition is detected for an instruction, the VR4100 Series will kill it and all following
instructions. When this instruction reaches the WB stage, the exception flag and various information items are
written to CP0 registers. The current PC is changed to the appropriate exception vector address and the exception
bits of earlier pipeline stages are cleared.
This implementation allows all preceding instructions to complete execution and prevents all subsequent
instructions from completing. Thus the value in the EPC is sufficient to restart execution. It also ensures that
exceptions are taken in the order of execution; an instruction taking an exception may itself be killed by an instruction
further down the pipeline that takes an exception in a later cycle.
Figure 4-34. Exception Detection
Instruction
causing exception
1
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
IF
RF
EX
DC
2
Exception vector
WB
: Killed stage
: Cancellation
User’s Manual U15509EJ2V0UM
119
CHAPTER 4 PIPELILNE
4.7.2 Stall conditions
Stalls are used to stop the pipeline for conditions detected after the RF stage. When a stall occurs, the processor
will resolve the condition and then the pipeline will continue. Figure 4-35 shows a data cache miss stall, and Figure
4-36 shows a CACHE instruction stall.
Figure 4-35. Data Cache Miss Stall
IF
RF
IF
EX
DC
WB
WB
<1>
<2>
WB
WB
WB
<3>
RF
EX
DC
DC
DC
DC
DC
WB
IF
RF
EX
EX
EX
EX
EX
DC
WB
IF
RF
RF
RF
RF
RF
EX
DC
WB
<1> Data cache miss
<2> Start moving data cache line to write buffer
<3> Get last word into cache and restart pipeline
If the cache line to be replaced is dirty  the W bit is set  the data is moved to the internal write buffer in the
next cycle. The write-back data is returned to memory. The last word in the data is returned to the cache at <3>, and
pipelining restarts.
Figure 4-36. CACHE Instruction Stall
IF
RF
EX
DC
WB
WB
WB
WB
<1>
IF
WB
<2>
RF
EX
DC
DC
DC
DC
DC
WB
IF
RF
EX
EX
EX
EX
EX
DC
WB
IF
RF
RF
RF
RF
RF
EX
DC
WB
<1> CACHE instruction start
<2> CACHE instruction complete
When the CACHE instruction enters the DC pipe-stage, the pipeline stalls while the CACHE instruction is
executed. The pipeline begins running again when the CACHE instruction is completed, allowing the instruction fetch
to proceed.
120
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.7.3 Slip conditions
During the RF stage and the EX stage, internal logic will determine whether it is possible to start the current
instruction in this cycle. If all of the source operands are available (either from the register file or via the internal
bypass logic) and all the hardware resources necessary to complete the instruction will be available whenever
required, then the instruction “run”; otherwise, the instruction will “slip”.
Slipped instructions are retired on
subsequent cycles until they issue. The backend of the pipeline (stages DC and WB) will advance normally during
slips in an attempt to resolve the conflict. NOPs will be inserted into the bubble in the pipeline. Instructions killed by
branch likely instructions, ERET or exceptions will not cause slips.
Figure 4-37. Load Data Interlock
(a) VR4121, VR4122, VR4131, VR4181A
Load A
IF
Load B
RF
EX
DC
DC2 WB
IF
RF
EX
DC
IF
RF
RF
RF
<1>
<1>
<2>
DC2 WB
Bypass
Add A,B
IF
EX
DC
WB
RF
EX
DC
WB
(b) VR4181
Load A
Load B
IF
RF
EX
DC
WB
IF
RF
EX
DC
WB
Bypass
Add A,B
IF
RF
RF
<1>
<2>
IF
EX
DC
WB
RF
EX
DC
WB
<1> Detect load data interlock
<2> Get target data
Load Data Interlock is detected in the RF stage and also the pipeline slips in the stage. Load Data Interlock
occurs when data fetched by a load instruction and data moved from HI, LO or CP0 register is required by the next
immediate instruction. The pipeline begins running again at the clock after the target of the load is read from the data
cache, HI, LO and CP0 registers. The data returned at the end of the DC stage is input into the end of the RF stage,
using the bypass multiplexers.
User’s Manual U15509EJ2V0UM
121
CHAPTER 4 PIPELILNE
Figure 4-38. MD Busy Interlock
(a) VR4121, VR4122, VR4131, VR4181A
MULT/DIV
IF
RF
EX
DC
WB
Multiply/Divide
Bypass
MFHI/MFLO
IF
RF
IF
EX
EX
EX
EX
DC
<1>
<1>
<1>
<1>
<2>
RF
RF
RF
RF
EX
WB
DC
WB
(b) VR4181
IF
RF
EX
DC
WB
Bypass
IF
MFHI/MFLO
RF
RF
<1>
<2>
IF
EX
DC
WB
RF
EX
DC
WB
<1> Detect MD busy interlock
<2> Get target data
MD Busy Interlock occurs when HI/LO register is required by MFHI/MFLO instruction before finishing
Multiply/Divide execution. The pipeline begins running again at the clock after finishing Multiply/Divide execution.
In the VR4121, VR4122, VR4131, and VR4181A, MD Busy Interlock is detected in the EX stage and also the
pipeline slips in the stage. The data returned from the HI/LO register at the end of the DC stage is input into the end
of the EX stage, using the bypass multiplexer.
In the VR4181, MD Busy Interlock is detected in the RF stage and also the pipeline slips in the stage. The data
returned from the HI/LO register at the end of the DC stage is input into the end of the RF stage, using the bypass
multiplexer.
Store-Load Interlock is detected in the EX stage and the pipeline slips in the RF stage. Store-Load Interlock
occurs when store instruction followed by load instruction is detected. The pipeline begins running again one clock
later.
Coprocessor 0 Interlock is detected in the EX stage and the pipeline slips in the RF stage. Coprocessor Interlock
occurs when an MTC0 instruction for the Config or Status register is detected. The pipeline begins running again one
clock later.
122
User’s Manual U15509EJ2V0UM
CHAPTER 4 PIPELILNE
4.7.4 Bypassing
In some cases, data and conditions produced in the EX, DC and WB stages of the pipeline are made available to
the EX stage (only) through the bypass data path.
Operand bypass allows an instruction in the EX stage to continue without having to wait for data or conditions to
be written to the register file at the end of the WB stage. Instead, the Bypass Control Unit is responsible for ensuring
data and conditions from later pipeline stages are available at the appropriate time for instructions earlier in the
pipeline.
The Bypass Control Unit is also responsible for controlling the source and destination register addresses supplied
to the register file.
User’s Manual U15509EJ2V0UM
123
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
The VR4100 Series provides a memory management unit (MMU) which uses a translation lookaside buffer (TLB)
to translate virtual addresses into physical addresses. This chapter describes the virtual and physical address
spaces, the virtual-to-physical address translation, the operation of the TLB in making these translations, and the
CP0 registers that provide the software interface to the TLB.
5.1 Processor Modes
5.1.1 Operating mode
The processor has three operating modes, and accessible address spaces are determined by these modes.
• User mode
• Supervisor mode
• Kernel mode
User and Kernel modes are common to all VR-Series processors. Generally, Kernel mode is used to executing the
operating system, while User mode is used to run application programs. The VR4000 Series
TM
and later processors
have a third mode, which is called Supervisor mode and categorized in between User and Kernel modes. This mode
is used to configure a high-security system.
When an exception occurs, the CPU enters Kernel mode, and remains in this mode until an exception return
instruction (ERET) is executed. The ERET instruction brings back the processor to the mode in which it was just
before the exception occurs.
Access to the kernel address space is allowed when the processor is in Kernel mode.
Access to the supervisor address space is allowed when the processor is in Kernel or Supervisor mode.
Access to the user address space is allowed in any of the three operating modes.
5.1.2 Addressing mode
In the VR4100 Series, 32- or 64-bit mode is independently selectable for User, Supervisor, and Kernel operating
modes. A processor in 64-bit mode translates 64-bit addresses and processes data in 64-bit unit.
124
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.2 Translation Lookaside Buffer (TLB)
Virtual addresses are translated into physical addresses using an on-chip TLB. The on-chip TLB is a fullyassociative memory that holds 32 entries, which provide mapping to 32 odd/even page pairs for one entry. The
pages can have five different sizes, 1 K, 4 K, 16 K, 64 K, and 256 K, and can be specified in each entry. If it is
supplied with a virtual address, each of the 32 TLB entries is checked simultaneously to see whether they match the
virtual addresses that are provided with the ASID field and saved in the EntryHi register.
If there is a virtual address match, or “hit,” in the TLB, the physical page number is extracted from the TLB and
concatenated with the offset to form the physical address.
If no match occurs (TLB “miss”), an exception is taken and software refills the TLB from the page table resident in
memory. The software writes to an entry selected using the Index register or a random entry indicated in the
Random register.
If more than one entry in the TLB matches the virtual address being translated, TLB operations are not performed
correctly. In the VR4181, the TLB-Shutdown (TS) bit of the Status register is set to 1, and the TLB becomes unusable
(an attempt to access the TLB results in a TLB Refill exception regardless of whether there is an entry that hits). The
TS bit can be cleared only by a reset. The VR4121, VR4122, VR4131, and VR4181A have no TS bit, and their
operation is undefined if more than one entry in the TLB matches.
Note that virtual addresses may be converted to physical addresses without using a TLB, depending on the
address space that is being subjected to address translation. For example, address translation for the kseg0 or
kseg1 address space does not use mapping. The physical addresses of these address spaces are determined by
subtracting the base address of the address space from the virtual addresses.
5.2.1 Format of a TLB entry
Each TLB entry has fields corresponding to the EntryHi, EntryLo0, EbtryLo1, and PageMask registers. The format
of the EntryHi, EntryLo0, EbtryLo1, and PageMask registers are nearly the same as the TLB entry. However, the bit
in the EntryHi register that corresponds to the TLB G bit is a reserved bit (0), and the bit in the TLB entry that
corresponds to the G bit of the EntryLo register is reserved to 0. For details about other bits, refer to the descriptions
of each register.
Figure 5-1 shows the TLB entry formats for both 32- and 64-bit modes.
User’s Manual U15509EJ2V0UM
125
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-1. Format of a TLB Entry
(a) 32-bit Mode
115 114
127
107 106
0
96
0
MASK
95
75 74 73
VPN2
63
G
60 59
PFN
6
0
35 34 33 32
C
28 27
5
3
PFN
64
ASID
38 37
0
31
72 71
0
C
D
V
0
2
1
0
D
V
0
(b) 64-bit Mode
255
211 210
0
191
190 189
R
168 167
0
139 138 137 136 135
VPN2
92 91
0
63
192
MASK
0
127
203 202
G
0
70 69
PFN
28 27
0
67 66 65 64
C
6 5
PFN
128
ASID
D
3
C
V
0
2
1
0
D
V
0
5.2.2 Manipulation of TLB
The contents of each TLB entry can be read or written through the EntryHi, EntryLo0, EbtryLo1, and PageMask
registers with TLB manipulation instructions, as shown in Figure 5-2. An entry specified through the Index register or
indicated in the Random register is used as a target.
The TLB must also be initialized and set after reset. Refer to VR Series Programming Guide Application Note
for details about procedures and program examples of initialization.
126
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-2. TLB Manipulation Overview
PageMask
EntryHi
EntryLo1
EntryLo0
31
TLB entry specified
by Index register or
Random register
TLB
0
127/255
0
5.2.3 TLB instructions
The instructions used for TLB control are described below. Refer to Chapter 9 for details about each instruction.
(1) Translation lookaside buffer probe (TLBP)
The translation lookaside buffer probe (TLBP) instruction loads the Index register with a TLB number that
matches the content of the EntryHi register. If there is no TLB number that matches the TLB entry, the highestorder bit of the Index register is set.
(2) Translation lookaside buffer read (TLBR)
The translation lookaside buffer read (TLBR) instruction loads the EntryHi, EntryLo0, EntryLo1, and PageMask
registers with the content of the TLB entry indicated by the content of the Index register.
(3) Translation lookaside buffer write index (TLBWI)
The translation lookaside buffer write index (TLBWI) instruction writes the contents of the EntryHi, EntryLo0,
EntryLo1, and PageMask registers to the TLB entry indicated by the content of the Index register.
(4) Translation lookaside buffer write random (TLBWR)
The translation lookaside buffer write random (TLBWR) instruction writes the contents of the EntryHi, EntryLo0,
EntryLo1, and PageMask registers to the TLB entry indicated by the content of the Random register.
5.2.4 TLB exceptions
If there is no TLB entry that matches the virtual address, a TLB Refill exception occurs. If the access control bits
(D and V) indicate that the access is not valid, a TLB Modified or TLB Invalid exception occurs. If the C bit is 010, the
retrieved physical address directly accesses main memory, bypassing the cache.
See Chapter 6 for details of the TLB Miss exception.
User’s Manual U15509EJ2V0UM
127
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.3 Virtual-to-Physical Address Translation
Converting a virtual address to a physical address begins by comparing the virtual address from the processor
with the virtual addresses of all entries in the TLB. Either of the following comparisons is performed for the virtual
page number (VPN):
• In 32-bit mode, the high-order bits
Note
of the 32-bit virtual address are compared to the contents of the VPN2
(virtual page number divided by two) of each TLB entry.
• In 64-bit mode, the high-order bits
Note
of the 64-bit virtual address are compared to the contents of the VPN2
(virtual page number divided by two) and R of each TLB entry.
Note The number of bits differs from page sizes. The table below shows the examples of high-order bits of
the virtual address in page size of 256 KB and 1 KB.
Page size
256 KB
1 KB
Mode
32-bit mode
bits 31 to 19
bits 31 to 11
64-bit mode
bits 63, 62, 39 to 19
bits 63, 62, 39 to 11
It is a match when there is an entry whose VPN field is the same as that of virtual address, and either:
• the Global (G) bit of the TLB entry is set to 1, or
• the ASID field of the virtual address is the same as the ASID field of the TLB entry.
This match is referred to as a TLB hit.
If a TLB entry matches, the physical address and access control bits (C, D, and V) are retrieved from the matching
TLB entry. While the V bit of the entry must be set to 1 for a valid address translation to take place, it is not involved
in the determination of a matching TLB entry. The offset is concatenated to the retrieved physical address. An
offset, which indicates an address within the page frame space, is the low-order bits of the virtual address and is
output without passing through the TLB.
If there is no match, a TLB Refill exception is taken by the processor and software is allowed to refill the TLB from
a page table of virtual/physical addresses in memory.
Figure 5-3 illustrates an outline of the address translation, and Figure 5-4 illustrates the TLB address translation
flow.
128
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-3. Virtual-to-Physical Address Translation
Virtual address
1 VPN (virtual page number, high-order
bits of virtual address) is compared
with that in TLB.
2 If there is a match, PFN (page frame
number, high-order bits of physical
address) is output from TLB.
ASID
G
VPN
ASID
Offset
VPN
TLB
entry
PFN
TLB
3 The offset, which does not pass
through TLB, is concatenated to PFN.
Offset
PFN
Physical address
User’s Manual U15509EJ2V0UM
129
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-4. Address Translation in TLB
Virtual address
input
User mode?
No
Yes
No
Address Error
exception
Supervisor
mode?
Address
OK?
Yes
Yes
No
No
Address
OK?
Mapped
adderss?
Yes
Address
OK?
No
No
Address Error
exception
Yes
Address Error
exception
Yes
VPN match?
No
Yes
G bit = 1?
No
Yes
ASID match?
V bit = 1?
No
Yes
Yes
TLB Refill
exception
Write?
No
No
Yes
130
No
No
Yes
Uncached?
32-bit
address?
TLB Invalid
exception
Yes
D bit = 1?
No
Physical address
output
Physical address
output
Access main memory
Access cache
User’s Manual U15509EJ2V0UM
Yes
TLB Modified
exception
XTLB Refill
exception
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.3.1 32-bit mode address translation
Figure 5-5 shows the virtual-to-physical-address translation of a 32-bit mode address. The pages can have five
different sizes between 1 KB (10 bits) and 256 KB (18 bits), each being 4 times as large as the preceding one in
ascending order, that is 1 K, 4 K, 16 K, 64 K, and 256 K. This figure illustrates the two possible page sizes: a 1 KB
page (10 bits) and a 256 KB page (18 bits).
• Shown at the top of Figure 5-5 is the virtual address space in which the page size is 1 KB and the offset is 10
bits. The 22 bits excluding the ASID field represents the virtual page number (VPN), enabling selecting a
page table of 4 M entries.
• Shown at the bottom of Figure 5-5 is the virtual address space in which the page size is 256 KB and the offset
is 18 bits. The 14 bits excluding the ASID field represents the VPN, enabling selecting a page table of 16 K
entries.
Figure 5-5. 32-bit Mode Virtual Address Translation
32 31 29 28
39
Virtual address with
4M (222) 1KB pages
ASID
10 9
VPN
Note
0
Offset
22 bits = 4M pages
Offset passed unchanged
and used for physical address
Virtual-to-physical address
translation in TLB
TLB
31
0
32-bit physical
address
PFN
Offset
TLB
Virtual-to-physical address
translation in TLB
39
Virtual address with
16K (214) 256KB pages
32 31 29 28
ASID
Note
Offset passed unchanged
and used for physical address
18 17
VPN
0
Offset
14 bits = 16K pages
Note Bits 31 to 29 of the virtual address select user, supervisor, or kernel address spaces.
User’s Manual U15509EJ2V0UM
131
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.3.2 64-bit mode address translation
Figure 5-6 shows the virtual-to-physical-address translation of a 64-bit mode address. The pages can have five
different sizes between 1 KB (10 bits) and 256 KB (18 bits), each being 4 times as large as the preceding one in
ascending order, that is 1 K, 4 K, 16 K, 64 K, and 256 K. This figure illustrates the two possible page sizes: a 1 KB
page (10 bits) and a 256 KB page (18 bits).
• Shown at the top of Figure 5-6 is the virtual address space in which the page size is 1 KB and the offset is 10
bits. The 30 bits excluding the ASID field represents the virtual page number (VPN), enabling selecting a
page table of 1 G entry.
• Shown at the bottom of Figure 5-6 is the virtual address space in which the page size is 256 KB and the offset
is 18 bits. The 22 bits excluding the ASID field represents the VPN, enabling selecting a page table of 4 M
entries.
Figure 5-6. 64-bit Mode Virtual Address Translation
71
Virtual address with
1G (230) 1KB pages
64 63 62 61
ASID
Note
40 39
10 9
0 or -1
0
VPN
Offset
30 bits = 1G pages
Offset passed unchanged
and used for physical address
Virtual-to-physical address
translation in TLB
TLB
31
0
32-bit physical
address
PFN
Offset
TLB
Virtual-to-physical address
translation in TLB
Virtual address with
4 M (222) 256KB pages
71
64 63 62 61
ASID
40 39
Note 0 or -1
Offset passed unchanged
and used for physical address
18 17
VPN
0
Offset
22 bits = 4M pages
Note Bits 63 and 62 of the virtual address select user, supervisor, or kernel address spaces.
132
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.4 Address Spaces
The address space of the CPU is extended in memory management system, by converting (translating) huge
virtual memory addresses into physical addresses.
The physical address space of the VR4100 Series is 4 GB and 32-bit width addresses are used.
For the virtual address space, up to 2 GB (2
31
bytes) are provided as a user’s area and 32-bit width addresses are
used in the 32-bit mode. In the 64-bit mode, up to 1 TB (2
40
bytes) is provided as a user’s area and 64-bit width
addresses are used. For the format of the TLB entry in each mode, refer to 5.2.1.
As shown in Figures 5-5 and 5-6, the virtual address is extended with an address space identifier (ASID), which
reduces the frequency of TLB flushing when switching contexts. This 8-bit ASID is in the CP0 EntryHi register, and
the Global (G) bit is in the EntryLo0 and EntryLo1 registers, described later in this chapter.
5.4.1 User mode virtual address space
During User mode, a 2 GB (2
mode, a 1 TB (2
40
31
bytes) virtual address space (useg) can be used in the 32-bit mode. In the 64-bit
bytes) virtual address space (xuseg) can be used.
As shown in Tables 5-5 and 5-6, each virtual address is extended independently as another virtual address by
setting an 8-bit address space ID area (ASID), to support user processes of up to 256. The contents of TLB can be
retained after context switching by allocating each process by ASID. useg and xuseg can be referenced via TLB.
Whether a cache is used or not is determined for each page by the TLB entry (depending on the C bit setting in the
TLB entry).
The User segment starts at address 0 and the current active user process resides in either useg (in 32-bit mode)
or xuseg (in 64-bit mode). The TLB identically maps all references to useg/xuseg from all modes, and controls cache
accessibility.
The processor operates in User mode when the Status register contains the following bit-values:
• KSU = 10
• EXL = 0
• ERL = 0
In conjunction with these bits, the UX bit in the Status register selects 32- or 64-bit User mode addressing as
follows:
• When UX = 0, 32-bit useg space is selected.
• When UX = 1, 64-bit xuseg space is selected.
Figure 5-7 shows the address mapping for the User mode, and Table 5-1 lists the characteristics of each user
segment (useg and xuseg).
User’s Manual U15509EJ2V0UM
133
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-7. User Mode Address Space
32-bit ModeNote
64-bit Mode
0xFFFF FFFF FFFF FFFF
0xFFFF FFFF
Address Error
Address Error
0x8000 0000
0x7FFF FFFF
0x0000 0100 0000 0000
0x0000 00FF FFFF FFFF
2GB
TLB Mapped
1TB
TLB Mapped
useg
0x0000 0000
Note
xuseg
0x0000 0000 0000 0000
The VR4100 Series uses 64-bit addresses within it. When the processor is running in Kernel mode, it
saves the contents of each register or restores their previous contents to initialize them before
switching the context. For 32-bit mode addressing, bit 31 is sign-extended to bits 32 to 63, and the
resulting 32 bits are used for addressing.
generate invalid addresses.
Usually, it is impossible for 32-bit mode programs to
If context switching occurs and the processor enters Kernel mode,
however, an attempt may be made to save an address other than the sign-extended 32-bit address
mentioned above to a 64-bit register. In this case, user-mode programs are likely to generate an
invalid address.
Table 5-1. User Mode Segments
Mode
32-bit
Address bit
value
EXL
ERL
UX
Segment
name
Address range
KSU
Status register bit value
A31 = 0
10
0
0
0
useg
0x0000 0000
to
Size
2 GB
31
(2
bytes)
0x7FFF FFFF
64-bit
A(63:40) = 0
10
0
0
1
xuseg
0x0000 0000 0000 0000
to
0x0000 00FF FFFF FFFF
134
User’s Manual U15509EJ2V0UM
1 TB
40
(2
bytes)
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(1) useg (32-bit mode)
In User mode, when UX = 0 in the Status register and the most significant bit of the virtual address is 0, this
virtual address space is labeled useg.
Any attempt to reference an address with the most-significant bit set while in User mode causes an Address
Error exception (see CHAPTER 6 EXCEPTION PROCESSING).
The TLB Refill exception vector is used for TLB misses.
(2) xuseg (64-bit mode)
In User mode, when UX = 1 in the Status register and bits 63 to 40 of the virtual address are all 0, this virtual
address space is labeled xuseg.
Any attempt to reference an address with bits 63:40 equal to 1 causes an Address Error exception (see
CHAPTER 6 EXCEPTION PROCESSING).
The XTLB Refill exception vector is used for TLB misses.
5.4.2 Supervisor mode virtual address space
Supervisor mode is designed for layered operating systems in which a true kernel runs in Kernel mode, and the
rest of the operating system runs in Supervisor mode.
All of the suseg, sseg, xsuseg, xsseg, and csseg spaces are referenced via TLB. Whether cache can be used or
not is determined by bit C of each page’s TLB entry.
The processor operates in Supervisor mode when the Status register contains the following bit-values:
• KSU = 01
• EXL = 0
• ERL = 0
In conjunction with these bits, the SX bit in the Status register selects 32- or 64-bit Supervisor mode addressing as
follows:
• When SX = 0, 32-bit supervisor space is selected.
• When SX = 1, 64-bit supervisor space is selected.
Figure 5-8 shows the supervisor mode address space, and Table 5-2 lists the characteristics of the Supervisor
mode segments.
User’s Manual U15509EJ2V0UM
135
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-8. Supervisor Mode Address Space
32-bit ModeNote
64-bit Mode
0xFFFF FFFF
0xFFFF FFFF FFFF FFFF
Address Error
0xE000 0000
0xDFFF FFFF
0xC000 0000
0xBFFF FFFF
0.5GB
TLB Mapped
Address Error
0xFFFF FFFF E000 0000
0xFFFF FFFF DFFF FFFF
sseg
0xFFFF FFFF C000 0000
0xFFFF FFFF BFFF FFFF
Address Error
0x4000 0100 0000 0000
0x4000 00FF FFFF FFFF
0x4000 0000 0000 0000
0x3FFF FFFF FFFF FFFF
suseg
Note
1TB
TLB Mapped
xsseg
Address Error
0x0000 0100 0000 0000
0x0000 00FF FFFF FFFF
0x0000 0000
csseg
Address Error
0x8000 0000
0x7FFF FFFF
2GB
TLB Mapped
0.5GB
TLB Mapped
0x0000 0000 0000 0000
The VR4100 Series uses 64-bit addresses within it.
1TB
TLB Mapped
xsuseg
For 32-bit mode addressing, bit 31 is sign-
extended to bits 32 to 63, and the resulting 32 bits are used for addressing. Usually, it is impossible
for 32-bit mode programs to generate invalid addresses. In an operation of base register + offset for
addressing, however, a two’s complement overflow may occur, causing an invalid address. Note that
the result becomes undefined. Two factors that can cause a two’s complement follow:
• When offset bit 15 is 0, base register bit 31 is 0, and bit 31 of the operation “base register + offset”
is 1
• When offset bit 15 is 1, base register bit 31 is 1, and bit 31 of the operation “base register + offset”
is 0
136
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Table 5-2. 32-bit and 64-bit Supervisor Mode Segments
Mode
32-bit
Address bit
Status register bit value
Segment
value
KSU
EXL
ERL
SX
name
A31 = 0
01
0
0
0
suseg
Address range
Size
0x0000 0000
to
2 GB
31
(2
bytes)
0x7FFF FFFF
32-bit
A(31:29) = 110
01
0
0
0
sseg
0xC000 0000
to
512 MB
29
(2
bytes)
0xDFFF FFFF
64-bit
A(63:62) = 00
01
0
0
1
xsuseg
0x0000 0000 0000 0000
to
1 TB
40
(2
bytes)
0x0000 00FF FFFF FFFF
64-bit
A(63:62) = 01
01
0
0
1
xsseg
0x4000 0000 0000 0000
to
1 TB
40
(2
bytes)
0x4000 00FF FFFF FFFF
64-bit
A(63:62) = 11
01
0
0
1
csseg
0xFFFF FFFF C000 0000
to
512 MB
29
(2
bytes)
0xFFFF FFFF DFFF FFFF
(1) suseg (32-bit Supervisor mode, user space)
When SX = 0 in the Status register and the most-significant bit of the virtual address space is set to 0, the suseg
virtual address space is selected; it covers 2 GB (2
31
bytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual address. This mapped
space starts at virtual address 0x0000 0000 and runs through 0x7FFF FFFF.
(2) sseg (32-bit Supervisor mode, supervisor space)
When SX = 0 in the Status register and the three most-significant bits of the virtual address space are 110, the
sseg virtual address space is selected; it covers 512 MB (2
29
bytes) of the current supervisor virtual address
space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.
This mapped space begins at virtual address 0xC000 0000 and runs through 0xDFFF FFFF.
(3) xsuseg (64-bit Supervisor mode, user space)
When SX = 1 in the Status register and bits 63 and 62 of the virtual address space are set to 00, the xsuseg
virtual address space is selected; it covers 1 TB (2
40
bytes) of the current user address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual address. This mapped
space starts at virtual address 0x0000 0000 0000 0000 and runs through 0x0000 00FF FFFF FFFF.
User’s Manual U15509EJ2V0UM
137
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(4) xsseg (64-bit Supervisor mode, current supervisor space)
When SX = 1 in the Status register and bits 63 and 62 of the virtual address space are set to 01, the xsseg virtual
address space is selected; it covers 1 TB (2
40
bytes) of the current supervisor virtual address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual address. This mapped
space begins at virtual address 0x4000 0000 0000 0000 and runs through 0x4000 00FF FFFF FFFF.
(5) csseg (64-bit Supervisor mode, separate supervisor space)
When SX = 1 in the Status register and bits 63 and 62 of the virtual address space are set to 11, the csseg
virtual address space is selected; it covers 512 MB (2
29
bytes) of the separate supervisor virtual address space.
The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address. This
mapped space begins at virtual address 0xFFFF FFFF C000 0000 and runs through 0xFFFF FFFF DFFF FFFF.
5.4.3 Kernel mode virtual address space
If the Status register satisfies any of the following conditions, the processor runs in Kernel mode.
• KSU = 00
• EXL = 1
• ERL = 1
The addressing width in Kernel mode varies according to the state of the KX bit of the Status register, as follows:
• When KX = 0, 32-bit kernel space is selected.
• When KX = 1, 64-bit kernel space is selected.
The processor enters Kernel mode whenever an exception is detected and it remains in Kernel mode until an
exception return (ERET) instruction is executed and results in ERL and/or EXL = 0. The ERET instruction restores
the processor to the mode existing prior to the exception.
Kernel mode virtual address space is divided into regions differentiated by the high-order bits of the virtual
address, as shown in Figure 5-9. Table 5-3 lists the characteristics of the 32-bit Kernel mode segments, and Table
5-4 lists the characteristics of the 64-bit Kernel mode segments.
138
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-9. Kernel Mode Address Space
32-bit ModeNote1
64-bit Mode
0xFFFF FFFF FFFF FFFF
0xFFFF FFFF
0.5 GB
TLB Mapped
kseg3
0xE000 0000
0xDFFF FFFF
0.5 GB
TLB Mapped
0xC000 0000
0xBFFF FFFF
0xA000 0000
0x9FFF FFFF
ksseg
0.5 GB
TLB Unmapped
Uncached
kseg1
0.5 GB
TLB Unmapped
CacheableNote2
kseg0
0xFFFF FFFF E000 0000
0xFFFF FFFF DFFF FFFF
0xFFFF FFFF C000 0000
0xFFFF FFFF BFFF FFFF
0xFFFF FFFF A000 0000
0xFFFF FFFF 9FFF FFFF
0xFFFF FFFF 8000 0000
0xFFFF FFFF 7FFF FFFF
0.5 GB
TLB Mapped
ckseg3
0.5 GB
TLB Mapped
cksseg
0.5 GB
TLB Unmapped
Uncached
ckseg1
0.5 GB
TLB Unmapped
CacheableNote2
ckseg0
Address Error
0x8000 0000
0x7FFF FFFF
0xC000 00FF 8000 0000
0xC000 00FF 7FFF FFFF
TLB Mapped
0xC000 0000 0000 0000
0xBFFF FFFF FFFF FFFF
0x8000 0000 0000 0000
0x7FFF FFFF FFFF FFFF
xkseg
TLB Unmapped
xkphys
(Refer to Figure 5-10)
Address Error
2 GB
TLB Mapped
kuseg
0x4000 0100 0000 0000
0x4000 00FF FFFF FFFF
0x4000 0000 0000 0000
0x3FFF FFFF FFFF FFFF
1 TB
TLB Mapped
xksseg
Address Error
0x0000 0100 0000 0000
0x0000 00FF FFFF FFFF
0x0000 0000
0x0000 0000 0000 0000
1 TB
TLB Mapped
xkuseg
Notes 1. The VR4100 Series uses 64-bit addresses within it. For 32-bit mode addressing, bit 31 is signextended to bits 32 to 63, and the resulting 32 bits are used for addressing. Usually, a 64-bit
instruction is used for the program in 32-bit mode. In an operation of base register + offset for
addressing, however, a two’s complement overflow may occur, causing an invalid address. Note
that the result becomes undefined. Two factors that can cause a two’s complement follow:
• When offset bit 15 is 0, base register bit 31 is 0, and bit 31 of the operation “base register +
offset” is 1
• When offset bit 15 is 1, base register bit 31 is 1, and bit 31 of the operation “base register +
offset” is 0
2. The K0 field of the Config register controls cacheability of kseg0 and ckseg0.
User’s Manual U15509EJ2V0UM
139
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-10. xkphys Area Address Space
0xBFFF FFFF FFFF FFFF
Address Error
0xB800 0001 0000 0000
0xB800 0000 FFFF FFFF
0xB800 0000 0000 0000
0xB7FF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0xB000 0001 0000 0000
0xB000 0000 FFFF FFFF
0xB000 0000 0000 0000
0xAFFF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0xA800 0001 0000 0000
0xA800 0000 FFFF FFFF
0xA800 0000 0000 0000
0xA7FF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0xA000 0001 0000 0000
0xA000 0000 FFFF FFFF
0xA000 0000 0000 0000
0x9FFF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0x9800 0001 0000 0000
0x9800 0000 FFFF FFFF
0x9800 0000 0000 0000
0x97FF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0x9000 0001 0000 0000
0x9000 0000 FFFF FFFF
0x9000 0000 0000 0000
0x8FFF FFFF FFFF FFFF
4 GB
TLB Unmapped
Uncached
Address Error
0x8800 0001 0000 0000
0x8800 0000 FFFF FFFF
0x8800 0000 0000 0000
0x87FFF FFFF FFFF FFFF
4 GB
TLB Unmapped
Cacheable
Address Error
0x8000 0001 0000 0000
0x8000 0000 FFFF FFFF
0x8000 0000 0000 0000
140
4 GB
TLB Unmapped
Cacheable
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Table 5-3. 32-bit Kernel Mode Segments
Address bit value
Status register bit value
KSU
A31 = 0
A(31:29) = 100
EXL
ERL
KSU = 00
Segment
KX
name
0
kuseg
Virtual address
0x0000 0000
to
EXL = 1
0x7FFF FFFF
0
kseg0
0
A(31:29) = 110
0
kseg1
ksseg
TLB map
2 GB
31
(2
0x8000 0000
0x0000 0000
to
to
0x9FFF FFFF
0x1FFF FFFF
0xA000 0000
0x0000 0000
to
to
0xBFFF FFFF
0x1FFF FFFF
0xC000 0000
TLB map
ERL = 1
A(31:29) = 101
Size
Address
or
or
Physical
bytes)
512 MB
29
(2
bytes)
512 MB
29
(2
bytes)
512 MB
29
to
(2
bytes)
0xDFFF FFFF
A(31:29) = 111
0
kseg3
0xE000 0000
TLB map
512 MB
29
to
(2
bytes)
0xFFFF FFFF
(1) kuseg (32-bit Kernel mode, user space)
When KX = 0 in the Status register, and the most-significant bit of the virtual address space is 0, the kuseg
31
virtual address space is selected; it is the current 2 GB (2 -byte) user address space.
The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.
References to kuseg are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
If the ERL bit of the Status register is 1, the user address space is assigned 2 GB (2
31
bytes) without TLB
mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached so
that the cache error handler can use it. This allows the Cache Error exception code to operate uncached using
r0 as a base register.
(2) kseg0 (32-bit Kernel mode, kernel space 0)
When KX = 0 in the Status register and the most-significant three bits of the virtual address space are 100, the
29
kseg0 virtual address space is selected; it is the current 512 MB (2 -byte) physical space.
References to kseg0 are not mapped through TLB; the physical address selected is defined by subtracting
0x8000 0000 from the virtual address.
The K0 field of the Config register controls cacheability (refer to 5.5.8).
User’s Manual U15509EJ2V0UM
141
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(3) kseg1 (32-bit Kernel mode, kernel space 1)
When KX = 0 in the Status register and the most-significant three bits of the virtual address space are 101, the
29
kseg1 virtual address space is selected; it is the current 512 MB (2 -byte) physical space.
References to kseg1 are not mapped through TLB; the physical address selected is defined by subtracting
0xA000 0000 from the virtual address.
Caches are disabled for accesses to these addresses, and main memory (or memory-mapped I/O device
registers) is accessed directly.
(4) ksseg (32-bit Kernel mode, supervisor space)
When KX = 0 in the Status register and the most-significant three bits of the virtual address space are 110, the
29
ksseg virtual address space is selected; it is the current 512 MB (2 -byte) virtual address space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual address.
References to ksseg are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
(5) kseg3 (32-bit Kernel mode, kernel space 3)
When KX = 0 in the Status register and the most-significant three bits of the virtual address space are 111, the
29
kseg3 virtual address space is selected; it is the current 512 MB (2 -byte) kernel virtual space. The virtual
address is extended with the contents of the 8-bit ASID field to form a unique virtual address.
References to kseg3 are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
142
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Table 5-4. 64-bit Kernel Mode Segments
Address bit
value
Status register bit value
KSU
A(63:62) = 00
EXL
ERL
Virtual address
KX
Segment
name
Physical
address
1
xkuseg
0x0000 0000 0000 0000
TLB map
KSU = 00
or
to
EXL = 1
0x0000 00FF FFFF FFFF
or
A(63:62) = 01
1
xksseg
0x4000 0000 0000 0000
ERL = 1
Size
1 TB
40
(2
TLB map
bytes)
1 TB
40
to
(2
bytes)
0x4000 00FF FFFF FFFF
A(63:62) = 10
1
A(63:62) = 11
xkphys
1
xkseg
0x8000 0000 0000 0000
0x0000 0000
4 GB
32
to
to
0xBFFF FFFF FFFF FFFF
0xFFFF FFFF
0xC000 0000 0000 0000
TLB map
(2
40
2
bytes)
31
-2
bytes
to
0xC000 00FF 7FFF FFFF
A(63:62) = 11
1
ckseg0
0xFFFF FFFF 8000 0000
0x0000 0000
to
to
0xFFFF FFFF 9FFF FFFF
0x1FFF FFFF
0xFFFF FFFF A000 0000
0x0000 0000
to
to
0xFFFF FFFF BFFF FFFF
0x1FFF FFFF
0xFFFF FFFF C000 0000
TLB map
A(63:31) = -1
A(63:62) = 11
1
ckseg1
A(63:31) = -1
A(63:62) = 11
1
cksseg
A(63:31) = -1
512 MB
29
(2
512 MB
29
(2
bytes)
512 MB
29
to
bytes)
(2
bytes)
0xFFFF FFFF DFFF FFFF
A(63:62) = 11
1
ckseg3
0xFFFF FFFF E000 0000
A(63:31) = -1
TLB map
512 MB
29
to
(2
bytes)
0xFFFF FFFF FFFF FFFF
(6) xkuseg (64-bit Kernel mode, user space)
When KX = 1 in the Status register and bits 63 and 62 of the virtual address space are 00, the xkuseg virtual
40
address space is selected; it is the 1 TB (2 -byte) current user address space. The virtual address is extended
with the contents of the 8-bit ASID field to form a unique virtual address.
References to xkuseg are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
If the ERL bit of the Status register is 1, the user address space is assigned 2 GB (2
31
bytes) without TLB
mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached so
that the cache error handler can use it. This allows the Cache Error exception code to operate uncached using
r0 as a base register.
(7) xksseg (64-bit Kernel mode, current supervisor space)
When KX = 1 in the Status register and bits 63 and 62 of the virtual address space are 01, the xksseg address
40
space is selected; it is the 1 TB (2 -byte) current supervisor address space. The virtual address is extended
with the contents of the 8-bit ASID field to form a unique virtual address.
References to xksseg are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
User’s Manual U15509EJ2V0UM
143
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(8) xkphys (64-bit Kernel mode, physical spaces)
When the KX = 1 in the Status register and bits 63 and 62 of the virtual address space are 10, the virtual address
space is called xkphys and selected as either cached or uncached. If any of bits 58 to 32 of the address is 1, an
attempt to access that address results in an address error.
Whether cache can be used or not is determined by bits 59 to 61 of the virtual address. Table 5-5 shows
cacheability corresponding to 8 address spaces.
Table 5-5. Cacheability and the xkphys Address Space
Bits 61 to 59
Cacheability
Address range
0
Cached
0x8000 0000 0000 0000
to
0x8000 0000 FFFF FFFF
1
Cached
0x8800 0000 0000 0000
to
0x8800 0000 FFFF FFFF
2
Uncached
0x9000 0000 0000 0000
to
0x9000 0000 FFFF FFFF
3
Cached
0x9800 0000 0000 0000
to
0x9800 0000 FFFF FFFF
4
Cached
0xA000 0000 0000 0000
to
0xA000 0000 FFFF FFFF
5
Cached
0xA800 0000 0000 0000
to
0xA800 0000 FFFF FFFF
6
Cached
0xB000 0000 0000 0000
to
0xB000 0000 FFFF FFFF
7
Cached
0xB800 0000 0000 0000
to
0xB800 0000 FFFF FFFF
(9) xkseg (64-bit Kernel mode, kernel spaces)
When the KX = 1 in the Status register and bits 63 and 62 of the virtual address space are 11, the virtual address
space is called xkseg and selected as either of the following:
• Kernel virtual space, xkseg, the current kernel virtual space; the virtual address is extended with the
contents of the 8-bit ASID field to form a unique virtual address
References to xkseg are mapped through TLB. Whether cache can be used or not is determined by bit C
of each page’s TLB entry.
• one of the four 32-bit kernel compatibility spaces, as described in the next section.
144
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(10) 64-bit Kernel mode compatible spaces (ckseg0, ckseg1, cksseg, and ckseg3)
If the conditions listed below are satisfied in Kernel mode, ckseg0, ckseg1, cksseg, or ckseg3 (each having 512
Mbytes) is selected as a compatible space according to the state of the bits 30 and 29 (two low-order bits) of the
address.
• The KX bit of the Status register is 1.
• Bits 63 and 62 of the 64-bit virtual address are 11.
• Bits 61 to 31 of the virtual address are all 1.
(a) ckseg0
This space is an unmapped region, compatible with the 32-bit mode kseg0 space. The K0 field of the Config
register controls cacheability and coherency (refer to 5.5.8).
(b) ckseg1
This space is an unmapped and uncached region, compatible with the 32-bit mode kseg1 space.
(c) cksseg
This space is the current supervisor virtual space, compatible with the 32-bit mode ksseg space.
References to cksseg are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
(d) ckseg3
This space is the current supervisor virtual space, compatible with the 32-bit mode kseg3 space.
References to ckseg3 are mapped through TLB. Whether cache can be used or not is determined by bit C of
each page’s TLB entry.
User’s Manual U15509EJ2V0UM
145
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5 Memory Management Registers
This section describes the CP0 registers that are accessed by the memory management system and software.
Table 5-6 lists the CP0 registers. About the exception processing registers of the CP0 registers, refer to CHAPTER 6
EXCEPTION PROCESSING.
Table 5-6 CP0 Registers
(a) Memory Management Registers
Register name
Register number
(b) Exception Processing Registers
Register name
Register number
Index register
0
Context register
4
Random register
1
BadVAddr register
8
EntryLo0 register
2
Count register
9
EntryLo1 register
3
Compare register
11
PageMask register
5
Status register
12
Wired register
6
Cause register
13
EntryHi register
10
EPC register
14
PRId register
15
WatchLo register
18
Config register
LLAddr register
TagLo register
Note1
16
WatchHi register
19
17
XContext register
20
28
Parity Error register
Note2
TagHi register
29
Cache Error register
−
−
ErrorEPC register
Note2
26
27
30
Notes 1. This register is defined to maintain compatibility with the VR4000 and VR4400. The
content of this register is meaningless in the normal operation.
2. This register is defined to maintain compatibility with the VR4100. This register is
not used in the normal operation.
Details about each register are explained below.
The parenthesized number in section titles is the register
number (refer to 1.2.3).
146
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.1 Index register (0)
The Index register is a 32-bit, read/write register containing five low-order bits to index an entry in the TLB. The
most-significant bit of the register shows the success or failure of a TLB probe (TLBP) instruction.
The Index register also specifies the TLB entry affected by TLB read (TLBR) or TLB write index (TLBWI)
instructions.
The contents of the Index register after reset are undefined so that it must be initialized by software.
Figure 5-11. Index Register
31 30
5
P
4
0
0
Index
P
: Indicates whether probing is successful or not. It is set to 1 if the latest TLBP instruction fails. It is
Index
: Specifies an index to a TLB entry that is a target of the TLBR or TLBWI instruction.
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
cleared to 0 when the TLBP instruction is successful.
5.5.2 Random register (1)
The Random register is a read-only register. The low-order 5 bits are used in referencing a TLB entry. This
register is decremented each time an instruction is executed. The values that can be set in the register are as
follows:
• The lower bound is the content of the Wired register.
• The upper bound is 31.
The Random register specifies the entry in the TLB that is affected by the TLBWR instruction. The register is
readable to verify proper operation of the processor.
The Random register is set to the value of the upper bound upon Cold Reset. This register is also set to the upper
bound when the Wired register is written. Figure 5-12 shows the format of the Random register.
Figure 5-12. Random Register
31
5
0
4
0
Random
Random : TLB random index
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
147
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.3 EntryLo0 (2) and EntryLo1 (3) registers
The EntryLo register consists of two registers that have identical formats: EntryLo0, used for even virtual pages
and EntryLo1, used for odd virtual pages. The EntryLo0 and EntryLo1 registers are both read-/write-accessible.
They are used to access the built-in TLB. When a TLB read/write operation is carried out, the EntryLo0 and EntryLo1
registers hold the contents of the low-order 32 bits of TLB entries at even and odd addresses, respectively.
The contents of these registers after reset are undefined so that they must be initialized by software.
Figure 5-13. EntryLo0 and EntryLo1 Registers
(a) 32-bit Mode
31
EntryLo0
28 27
6
0
31
EntryLo1
5
PFN
C
28 27
6
0
3
5
PFN
3
C
2
1
0
D
V
G
2
1
0
D
V
G
(b) 64-bit Mode
63
EntryLo0
28 27
0
63
EntryLo1
6
5
PFN
28 27
0
3
C
6
5
PFN
3
C
2
1
0
D
V
G
2
1
0
D
V
G
PFN
: Page frame number; high-order bits of the physical address.
C
: Specifies the TLB page attribute (see Table 5-7).
D
: Dirty. If this bit is set to 1, the page is marked as dirty and, therefore, writable. This bit is actually
V
: Valid. If this bit is set to 1, it indicates that the TLB entry is valid; otherwise, a TLB Invalid
G
: Global. If this bit is set in both EntryLo0 and EntryLo1, then the processor ignores the ASID during
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
a write-protect bit that software can use to prevent alteration of data.
exception (TLBL or TLBS) occurs.
TLB lookup.
The coherency attribute (C) bits are used to specify whether to use the cache in referencing a page. When the
cache is used, whether the page attribute is “cached” or “uncached” is selected by algorithm.
Table 5-7 lists the page attributes selected according to the value in the C bits.
148
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Table 5-7. Cache Algorithm
C bit value
Cache algorithm
0
Cached
1
Cached
2
Uncached
3
Cached
4
Cached
5
Cached
6
Cached
7
Cached
5.5.4 PageMask register (5)
The PageMask register is a read/write register used for reading from or writing to the TLB; it holds a comparison
mask that sets the page size for each TLB entry, as shown in Table 5-8. Page sizes must be from 1 KB to 256 KB.
TLB read and write instructions use this register as either a source or a destination; Bits 18 to 11 that are targets
of comparison are masked during address translation.
The contents of the PageMask register after reset are undefined so that it must be initialized by software.
Figure 5-14. PageMask Register
31
19 18
11 10
0
0
MASK
0
MASK
: Page comparison mask, which determines the virtual page size for the corresponding entry.
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
Table 5-8 lists the mask pattern for each page size. If the mask pattern is one not listed below, the TLB behaves
unexpectedly.
Table 5-8. Mask Values and Page Sizes
Page size
Bit
18
17
16
15
14
13
12
11
1 KB
0
0
0
0
0
0
0
0
4 KB
0
0
0
0
0
0
1
1
16 KB
0
0
0
0
1
1
1
1
64 KB
0
0
1
1
1
1
1
1
256 KB
1
1
1
1
1
1
1
1
User’s Manual U15509EJ2V0UM
149
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.5 Wired register (6)
The Wired register is a read/write register that specifies the lower boundary of the random entry of the TLB as
shown in Figure 5-15.
Wired entries cannot be overwritten by a TLBWR instruction.
They can, however, be
overwritten by a TLBWI instruction. Random entries can be overwritten by both instructions.
Figure 5-15. Positions Indicated by the Wired Register
TLB
31
Range specified
by Random register
Wired register value
Range of wired entries
0
The Wired register is set to 0 upon Cold Reset. Writing this register also sets the Random register to the value of
its upper bound (see 5.5.2 Random register (1)). Figure 5-16 shows the format of the Wired register.
Figure 5-16. Wired Register
31
5 4
0
0
Wired
Wired
: TLB wired boundary
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
150
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.6 EntryHi register (10)
The EntryHi register is write-accessible. It is used to access the built-in TLB. The EntryHi register holds the highorder bits of a TLB entry for TLB read and write operations. If a TLB Refill, TLB Invalid, or TLB Modified exception
occurs, the EntryHi register holds the high-order bit of the TLB entry. The EntryHi register is also set with the virtual
page number (VPN2) for a virtual address where an exception occurred and the ASID. See Chapter 6 for details of
the TLB exception.
The ASID is used to read from or write to the ASID field of the TLB entry. It is also checked with the ASID of the
TLB entry as the ASID of the virtual address during address translation.
The EntryHi register is accessed by the TLBP, TLBWR, TLBWI, and TLBR instructions.
The contents of the EntryHi register after reset are undefined so that it must be initialized by software.
Figure 5-17. EntryHi Register
(a) 32-bit Mode
31
11 10
VPN2
8
7
0
0
ASID
(b) 64-bit Mode
63
62 61
R
40 39
Fill
11 10
VPN2
8 7
0
0
ASID
VPN2
: Virtual page number divided by two (mapping to two pages)
ASID
: Address space ID. An 8-bit ASID field that lets multiple processes share the TLB; each process
has a distinct mapping of otherwise identical virtual page numbers.
R
: Space type (00 → user, 01 → supervisor, 11 → kernel). Matches bits 63 and 62 of the virtual
address.
Fill
: Reserved. Ignored on write. When read, returns zero.
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
151
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.7 Processor Revision Identifier (PRId) register (15)
The 32-bit, read-only Processor Revision Identifier (PRId) register contains information identifying the
implementation and revision level of the CPU and CP0. Figure 5-18 shows the format of the PRId register.
Figure 5-18. PRId Register
31
16 15
0
8
7
Imp
0
Rev
Imp
: CPU core processor ID number (0x0C for the VR4100 Series)
Rev
: CPU core processor revision number
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
The processor revision number is stored as a value in the form y.x, where y is a major revision number in bits 7 to
4 and x is a minor revision number in bits 3 to 0.
The processor revision number identifies the revision of a CPU core. The major revision number (bits 7 to 4)
identifies the VR4100 Series processors as follows:
Processor
Rev field
VR4121
0110xxxx
VR4122
0111xxxx (xxxx may be 0010 or less)
VR4131
1000xxxx
VR4181
0101xxxx
VR4181A
0111xxxx (xxxx may be 0011 or greater)
The minor revision number (bits 3 to 0) may be different even though the same processor names.
There is no guarantee that changes to the CPU core will necessarily be reflected in the PRId register, or changes
to the revision number necessarily reflect real CPU core changes. Therefore, create a program that does not depend
on the processor revision number field.
152
User’s Manual U15509EJ2V0UM
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.8 Config register (16)
The Config register specifies various configuration options selected on VR4100 Series processors.
Some configuration options, as defined by the EC, M16, and BE fields, are set by the hardware during Cold Reset
and are included in the Config register as read-only status bits for the software to access. Other configuration
options are read/write (AD, EP, and K0 fields) and controlled by software; on Cold Reset these fields are undefined.
Since only a subset of the VR4000 Series options are available in the VR4100 Series, some bits are set to constants
(e.g., bits 14 to 13) that were variable in the VR4000 Series. The Config register should be initialized by software
before caches are used. Figure 5-19 shows the format of the Config register.
The contents of writable fields except for IS and BP bits in the Config register after reset are undefined so that they
must be initialized by software.
Figure 5-19. Config Register (1/2)
(a) VR4121, VR4181
31 30
0
28 27
EC
24 23 22 21 20 19 18 17 16 15 14 13 12 11
EP
AD
0
M16
0
1 0 BE
10
CS
9 8
IC
6 5
DC
3 2
0
0
K0
(b) VR4122
31 30
IS
28 27
EC
24 23 22 21 20 19 18 17 16 15 14 13 12 11
EP
AD
0
M16
0
1 BP BE
10
CS
9 8
IC
6 5 4 3 2
DC
0
IB
0
K0
(c) VR4131, VR4181A
31 30
IS
IS
28 27
EC
24 23 22 21 20 19 18 17 16 15 14 13 12 11
EP
AD
0
M16
0
1 BP BE
10
CS
9 8
IC
6 5
DC
4
3 2
IB DB 0
0
K0
: Instruction streaming function (VR4122, VR4131, VR4181A only)
0 → ON (default value)
1 → OFF
EC
: System clock ratio (see Table 5-9)
EP
: Transfer data pattern (cache write-back pattern) setting
0 → DD: 1 word/1 cycle
Others → Reserved
AD
: Accelerate data mode
0 → VR4000 Series compatible mode
1 → Reserved
M16
: MIPS16 ISA mode enable/disable indication (read only)
0 → MIPS16 instruction cannot be executed
1 → MIPS16 instruction can be executed
BE
: Endian mode of memory and a kernel.
0 → Little endian
1 → Big endian (VR4131 only)
User’s Manual U15509EJ2V0UM
153
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
Figure 5-19. Config Register (2/2)
: Cache size mode indication (n = IC, DC). Fixed to 1 in the VR4100 Series.
CS
0 → Reserved
1→2
IC
(n+10)
bytes
: Instruction cache size indication. 2
(IC+10)
(DC+10)
bytes in the VR4100 Series (see Table 5-10).
DC
: Data cache size indication. 2
IB
: Instruction cache refill size setting (VR4122, VR4131, and VR4181A only, and fixed to 1 in the
bytes in the VR4100 Series (see Table 5-11).
VR4181A).
0 → 4 words (16 bytes)
1 → 8 words (32 bytes)
DB
: Data cache refill size setting (VR4131 and VR4181A only, and fixed to 1 in the VR4181A).
0 → 4 words (16 bytes)
1 → 8 words (32 bytes)
K0
: kseg0 cache coherency algorithm
010 → Uncached
Others → Cached
1
: 1 is returned when read.
0
: 0 is returned when read.
Caution
Be sure to set the EP field and the AD bit to 0. If they are set with any other values, the
processor may behave unexpectedly.
(1) Instruction streaming function (VR4122, VR4131, and VR4181A only)
Instruction streaming can shorten the period during which the pipeline is stalled. Usually, the pipeline is stalled
until the cache line is refilled if an instruction cache miss occurs. With the VR4122, VR4131, and VR4181A,
however, the stalled pipeline is resumed, even if refilling is not completed, as soon as the instruction to be
fetched has been read from the external memory.
(2) Indication of clock frequency ratio
The EC area indicates the ratio of the internal peripheral function operating clock frequency to the pipeline clock
(PClock) frequency. The frequency ratio to be indicated differs depending on the processor, as follows.
Table 5-9 System Interface Clock Ratio (to PClock)
EC field
154
VR4121
VR4122
VR4131
Reserved
VR4181
VR4181A
0
1/1.5
1
1/2
2
1/2.5
3
1/3
4
1/4
Reserved
1/4
5
1/5
Reserved
1/5
6
1/6
Reserved
1/6
7
1/1
Reserved
Reserved
User’s Manual U15509EJ2V0UM
1/2
Reserved
1/3
1/2
1/4
Reserved
Reserved
1/3
1/1
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
(3) Branch prediction function (VR4122, VR4131, and VR4181A only)
Usually, a branch delay of at least 1 clock occurs in order to check the branch condition and calculate the branch
destination address when a branch instruction is fetched. The VR4122, VR4131, and VR4181A can reduce the
occurrence of this delay using branch prediction.
The VR4122, VR4131, and VR4181A have a branch prediction table to which branch instructions whose branch
conditions have been satisfied and their branch destination addresses are registered. When the next branch
instruction is fetched, this branch prediction table is referenced. If the same branch instruction is in the table
(hit), an instruction is fetched from the branch destination address in the table.
This branch prediction is
performed and branch instructions can be executed without delay if the BP bit is cleared to 0.
(4) Indication of cache size
The IC and DC fields indicate the respective capacities of the instruction cache and data cache. Because the
capacities of the caches differ depending on the processor, these fields are fixed to the value corresponding to
the processor.
Table 5-10 Instruction Cache Sizes
Processor
Size
IC field
VR4121
16 KB
4
VR4122
32 KB
5
VR4131
16 KB
4
VR4181
4 KB
2
VR4181A
8 KB
3
Table 5-11 Data Cache Sizes
Processor
Size
DC field
VR4121
8 KB
3
VR4122
16 KB
4
VR4131
16 KB
4
VR4181
4 KB
2
VR4181A
8 KB
3
5.5.9 Load Linked Address (LLAddr) register (17)
The read/write Load Linked Address (LLAddr) register is not used with the VR4100 Series processor except for
diagnostic purpose, and serves no function during normal operation.
LLAddr register is implemented just for compatibility between the VR4100 Series and VR4000/VR4400.
The contents of the LLAddr register after reset are undefined.
Figure 5-20. LLAddr Register
31
0
PAddr
PAddr
: 32-bit physical address
User’s Manual U15509EJ2V0UM
155
CHAPTER 5 MEMORY MANAGEMENT SYSTEM
5.5.10 TagLo (28) and TagHi (29) registers
The TagLo and TagHi registers are 32-bit read/write registers that hold the primary cache tag during cache
initialization, cache diagnostics, or cache error processing. The TagLo and TagHi registers are written by the CACHE
and MTC0 instructions.
Figures 5-21 and 5-22 show the format of these registers.
The contents of these registers after reset are undefined.
Figure 5-21. TagLo Register
(a) VR4121, VR4122, VR4181, VR4181A
31
For data cache
10
PTagLo
31
For instruction cache
10
PTagLo
9
8
7
V
D
W
9
8
0
6
0
0
V
0
(b) VR4131
31
For data cache
10
PTagLo
31
For instruction cache
10
PTagLo
9
8
7
6
5
4
V
D
W
0
L
R
9
8
6
5
4
L
R
V
0
0
3
0
0
3
0
PTagLo : Specifies physical address bits 31 to 10.
V
: Valid bit
D
: Dirty bit. However, this bit is defined only for the compatibility with the VR4000 Series processors,
and does not indicate the status of cache memory in spite of its readability and writability. This bit
cannot change the status of cache memory. In the VR4131, a write to this bit is ignored and the
same value as the V bit is read on read.
W
: Write-back bit (set if cache line has been updated)
L
: Lock bit. If this bit is set, the cache line is not refilled on cache misses.
R
: LRU bit. Indicates the way to be refilled on cache misses.
0
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
Figure 5-22. TagHi Register
31
0
0
0
156
: Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
This chapter describes CPU exception processing, including an explanation of hardware that processes
exceptions, followed by the format and use of each CPU exception register.
6.1 Exception Processing Overview
The processor receives exceptions from a number of sources, including translation lookaside buffer (TLB) misses,
arithmetic overflows, I/O interrupts, and system calls. When the CPU detects an exception, the normal sequence of
instruction execution is suspended and the processor enters Kernel mode (see Chapter 5 for a description of system
operating modes). If an exception occurs while executing a MIPS16 instruction, the processor stops the MIPS16
instruction execution, and shifts to the 32-bit instruction execution mode. The processor then disables interrupts and
transfers control for execution to the exception handler (located at a specific address as an exception handling
routine implemented by software). The handler saves the context of the processor, including the contents of the
program counter, the current operating mode (User or Supervisor), statuses, and interrupt enabling. This context is
saved so it can be restored when the exception has been serviced.
When an exception occurs, the CPU loads the Exception Program Counter (EPC) register with a location where
execution can restart after the exception has been serviced. The restart location in the EPC register is the address of
the instruction that caused the exception or, if the instruction was executing in a branch delay slot, the address of the
branch instruction immediately preceding the delay slot. Note that no branch delay slot generated by executing a
branch instruction exists when the processor operates in the MIPS16 mode.
When MIPS16 instructions are enabled to be executed, bit 0 of the EPC register indicates the operating mode in
which an exception occurred. It indicates 1 when in the MIPS16 instruction mode, and indicates 0 when in the MIPS
III instruction mode.
The VR4100 Series processors have registers other than above that retain address, cause, or status information
during exception processing. Details about these registers are described in 6.2 Exception Processing Registers.
For detailed descriptions about exception processing, refer to 6.4 Details of Exceptions.
6.1.1 Precision of exceptions
VR4100 Series exceptions are logically precise; the instruction that causes an exception and all those that follow it
are aborted and can be re-executed after servicing the exception.
When succeeding instructions are killed,
exceptions associated with those instructions are also killed. Exceptions are not taken in the order detected, but in
instruction fetch order.
The exception handler can still determine exception and its origin. The cause of the program can be restarted by
rewriting the destination register - not automatically, however, as in the case of all the other precise exceptions where
no status change occurs.
User’s Manual U15509EJ2V0UM
157
CHAPTER 6 EXCEPTION PROCESSING
6.2 Exception Processing Registers
This section describes the CP0 registers that are used in exception processing. Table 6-1 lists the CP0 registers.
About the memory management registers of the CP0 registers, refer to CHAPTER 5 MEMORY MANAGEMENT
SYSTEM.
Table 6-1. CP0 Registers
(a) Exception Processing Registers
Register name
Register
number
(b) Memory Management Registers
Register name
Register
number
Context register
4
Index register
0
BadVAddr register
8
Random register
1
Count register
9
EntryLo0 register
2
Compare register
11
EntryLo1 register
3
Status register
12
PageMask register
5
Cause register
13
Wired register
6
EPC register
14
EntryHi register
10
WatchLo register
18
PRId register
15
WatchHi register
19
Config register
XContext register
Parity Error register
Note1
Cache Error register
ErrorEPC register
Note1
16
Note2
20
LLAddr register
17
26
TagLo register
28
27
TagHi register
29
−
30
−
Notes 1. This register is defined to maintain compatibility with the VR4100. This register is
not used in the normal operation.
2. This register is defined to maintain compatibility with the VR4000 and VR4400. The
content of this register is meaningless in the normal operation.
Software examines the CP0 registers during exception processing to determine the cause of the exception and the
state of the CPU at the time the exception occurred.
Details about each register are explained below.
The parenthesized number in section titles is the register
number (refer to 1.2.3).
158
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.1 Context register (4)
The Context register is a read/write register containing the pointer to an entry in the page table entry (PTE) array
on the memory; this array is a table that stores virtual-to-physical address translations. When there is a TLB miss,
the operating system loads the unsuccessfully translated entry from the PTE array to the TLB. The Context register
is used by the TLB Refill exception handler for loading TLB entries. The Context register duplicates some of the
information provided in the BadVAddr register, but the information is arranged in a form that is more useful for a
software TLB exception handler. Figure 6-1 shows the format of the Context register.
Figure 6-1. Context Register
(a) 32-bit Mode
31
25 24
PTEBase
4
3
BadVPN2
0
0
(b) 64-bit Mode
63
25 24
PTEBase
4
BadVPN2
3
0
0
PTEBase: The PTEBase field is a base address of the PTE entry table.
BadVPN2: The BadVPN2 field is written by hardware if a TLB miss occurs. This field holds the value (VPN2)
obtained by halving the virtual page number of the most recent virtual address for which
translation failed.
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
The PTEBase field is used by software as the pointer to the base address of the PTE table in the current user
address space.
The 21-bit BadVPN2 field contains bits 31 to 11 of the virtual address that caused the TLB miss; bit 10 is excluded
because a single TLB entry maps to an even-odd page pair. For a 1 KB page size, this format can directly address
the pair-table of 8-byte PTEs. When the page size is 4 KB or more, shifting or masking this value produces the
correct PTE reference address.
User’s Manual U15509EJ2V0UM
159
CHAPTER 6 EXCEPTION PROCESSING
6.2.2 BadVAddr register (8)
The Bad Virtual Address (BadVAddr) register is a read-only register that saves the most recent virtual address that
failed to have a valid translation, or that had an addressing error. Figure 6-2 shows the format of the BadVAddr
register.
Caution
This register saves no information after a bus error exception, because it is not an address error
exception.
Figure 6-2. BadVAddr Register
(a) 32-bit Mode
31
0
BadVAddr
(b) 64-bit Mode
63
0
BadVAddr
BadVAddr: Most recent virtual address for which an addressing error occurred, or for which address
translation failed.
6.2.3 Count register (9)
The read/write Count register acts as a timer. It is incremented in synchronization with the MasterOut clock
(internal clock), regardless of whether instructions are being executed, retired, or any forward progress is actually
made through the pipeline.
This register is a free-running type. When the register reaches all ones, it rolls over to zero and continues
counting. This register is used for self-diagnostic test, system initialization, or the establishment of inter-process
synchronization.
Figure 6-3 shows the format of the Count register.
Figure 6-3. Count Register
31
0
Count
Count: 32-bit up-date count value that is compared with the value of the Compare register.
160
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.4 Compare register (11)
The Compare register causes a timer interrupt; it maintains a stable value that does not change on its own.
When the value of the Count register (see 6.2.3) equals the value of the Compare register, the IP7 bit in the
Cause register is set. This causes an interrupt as soon as the interrupt is enabled.
Writing a value to the Compare register, as a side effect, clears the timer interrupt request.
For diagnostic purposes, the Compare register is a read/write register. Normally, this register should be only used
for a write. Figure 6-4 shows the format of the Compare register.
Figure 6-4. Compare Register
31
0
Compare
Compare: Value that is compared with the count value of the Count register.
6.2.5 Status register (12)
The Status register is a read/write register that contains the operating mode, interrupt enabling, and the diagnostic
states of the processor. Figure 6-5 shows the format of the Status register.
Figure 6-5. Status Register (1/2)
(a) VR4121, VR4122, VR4181, VR4181A
31
29 28 27 26 25 24
0
CU0
0
RE
16 15
DS
8
IM
7
6
5
KX SX UX
4
3
KSU
2
1
0
ERL EXL IE
(b) VR4131
31 30 29 28 27 26 25 24
XX
0
CU0
0
RE
16 15
DS
8
IM
7
6
5
KX SX UX
XX:
Write 0 in a write operation. When this bit is read, 0 is read (VR4131 only).
CU0:
Enables/disables the use of the coprocessor (1 → Enabled, 0 → Disabled).
4
3
KSU
2
1
0
ERL EXL IE
CP0 can be used by the kernel at all times.
RE:
Enables/disables reversing of the endian setting in User mode (0 → Disabled, 1 → Enabled). This bit
DS:
Diagnostic Status field (see Figure 6-6).
must be set to 0 in the VR4100 Series.
User’s Manual U15509EJ2V0UM
161
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-5. Status Register (2/2)
IM:
Interrupt Mask field used to enable/disable interrupts (0 → Disabled, 1 → Enabled). This field consists
of 8 bits that are used to control eight interrupts. The bits are assigned to interrupts as follows:
IM7:
Masks a timer interrupt.
Note
IM(6:2): Mask ordinary interrupts (Int(4:0)
only, and Int4
Note
). However, Int3
Note
occurs in the VR4121 and VR4181A
in the VR4181A only.
IM(1:0): Mask software interrupts.
Note Int(4:0) are internal signals of the CPU core. For details about connection to the on-chip
peripheral units, refer to Hardware User's Manual of each processor.
KX:
Enables 64-bit addressing in Kernel mode (0 → 32-bit, 1 → 64-bit).
SX:
Enables 64-bit addressing and operation in Supervisor mode (0 → 32-bit, 1 → 64-bit).
UX:
Enables 64-bit addressing and operation in User mode (0 → 32-bit, 1 → 64-bit).
KSU:
Sets and indicates the operating mode (00 → Kernel, 01 → Supervisor, 10 → User).
ERL:
Sets and indicates the error level (0 → Normal, 1 → Error).
EXL:
Sets and indicates the exception level (0 → Normal, 1 → Exception).
IE:
Sets and indicates interrupt enabling/disabling (0 → Disabled, 1 → Enabled).
0:
Reserved for future use. Write 0 in a write operation. When this bit is read, 0 is read.
162
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-6 shows the details of the Diagnostic Status (DS) field. All DS field bits other than the TS bit are readable
and writable.
Figure 6-6. Status Register Diagnostic Status Field
(a) VR4181
24
23
0
22
21
20
19
18
17
16
BEV
TS
SR
0
CH
CE
DE
(b) VR4121, VR4122, VR4131, VR4181A
24
23
0
BEV:
22
21
20
19
18
17
16
BEV
0
SR
0
CH
CE
DE
Specifies the base address of a TLB Refill exception vector and common exception vector (0 →
Normal, 1 → Bootstrap).
TS:
Occurs the TLB to be shut down (VR4181 only) (0 → Not shut down, 1 → Shut down). This bit is
read only and used to avoid any problems that may occur when multiple TLB entries match the same
virtual address. After the TLB has been shut down, reset the processor to enable restart. Note that
the TLB is shut down even if a TLB entry matching a virtual address is marked as being invalid (with
the V bit cleared).
SR:
Occurs a Soft Reset or NMI exception (0 → Not occurred, 1 → Occurred).
CH:
CP0 condition bit (0 → False, 1 → True). This bit can be read and written by software only; it cannot
be accessed by hardware.
CE, DE: These are prepared to maintain compatibility with the VR4100, and are not used in the VR4100
Series hardware.
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
163
CHAPTER 6 EXCEPTION PROCESSING
The Status register has the following fields where the modes and access status are set.
(1) Interrupt enable
Interrupts are enabled when all of the following conditions are true:
• IE bit is set to 1.
• EXL bit is cleared to 0.
• ERL bit is cleared to 0.
• The appropriate bit of the IM field is set to 1.
(2) Operating modes
The following Status register bit settings are required for User, Kernel, and Supervisor modes.
• The processor is in User mode when KSU field = 10, EXL bit = 0, and ERL bit = 0.
• The processor is in Supervisor mode when KSU field = 01, EXL bit = 0, and ERL bit = 0.
• The processor is in Kernel mode when KSU field = 00, EXL bit = 1, or ERL bit = 1.
Access to the kernel address space is allowed when the processor is in Kernel mode.
Access to the supervisor address space is allowed when the processor is in Supervisor or Kernel mode.
Access to the user address space is allowed in any of the three operating modes.
(3) Addressing modes
The following Status register bit settings select 32- or 64-bit operation for User, Kernel, and Supervisor
operating modes. Enabling 64-bit operation permits the execution of 64-bit opcodes and translation of 64-bit
addresses. 64-bit operation for User, Kernel and Supervisor modes can be set independently.
• 64-bit addressing for Kernel mode is enabled when KX bit = 1. If this bit is set, an XTLB Refill exception
occurs if a TLB miss occurs in the Kernel mode address space. 64-bit operations are always valid in Kernel
mode.
• 64-bit addressing and operations are enabled for Supervisor mode when SX bit = 1. If this bit is set, an
XTLB Refill exception occurs if a TLB miss occurs in the Supervisor mode address space.
• 64-bit addressing and operations are enabled for User mode when UX bit = 1. If this bit is set, an XTLB
Refill exception occurs if a TLB miss occurs in the User mode address space.
(4) Status after reset
The contents of the Status register are undefined after Cold resets, except for the following bits in the
diagnostic status field.
• TS bit is cleared to 0 (VR4181 only).
• SR bit is cleared to 0.
SR bit is 0 after Cold reset, and is 1 after Soft reset or NMI exception.
• ERL and BEV bits are both set to 1.
Remark Cold reset and Soft reset are CPU core reset. For details, refer to Hardware User's Manual of
each processor.
164
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.6 Cause register (13)
The 32-bit read/write Cause register holds the cause of the most recent exception.
A 5-bit exception code
indicates one of the causes (see Table 6-2). Other bits holds the detailed information of the specific exception. All
bits in the Cause register, with the exception of the IP1 and IP0 bits, are read-only; IP1 and IP0 are used for software
interrupts. Figure 6-7 shows the fields of this register; Table 6-2 describes the Cause register codes.
Figure 6-7. Cause Register
31 30 29 28 27
BD
BD:
0
16 15
CE
0
8
IP
7
6
2
0
ExcCode
1
0
0
Indicates whether the most recent exception occurred in the branch delay slot (1 → In delay slot, 0
→ Normal).
CE:
Indicates the coprocessor number in which a Coprocessor Unusable exception occurred.
This field will remain undefined for as long as no exception occurs.
IP:
Indicates whether an interrupt is pending (1 → Interrupt pending, 0 → No interrupt pending).
The bits are assigned to interrupts as follows:
IM7:
A timer interrupt.
Note
IM(6:2): Ordinary interrupts (Int(4:0)
only, and Int4
Note
). However, Int3
Note
occurs in the VR4121 and VR4181A
in the VR4181A only.
IM(1:0): Software interrupts. Only these bits cause an interrupt exception, when they are set to
1 by means of software.
Note Int(4:0) are internal signals of the CPU core. For details about connection to the on-chip
peripheral units, refer to Hardware User's Manual of each processor.
ExcCode: Exception code field (see Table 6-2).
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
165
CHAPTER 6 EXCEPTION PROCESSING
Table 6-2. Cause Register Exception Code Field
Exception code
Mnemonic
Description
0
Int
Interrupt exception
1
Mod
TLB Modified exception
2
TLBL
TLB Refill exception (load or fetch)
3
TLBS
TLB Refill exception (store)
4
AdEL
Address Error exception (load or fetch)
5
AdES
Address Error exception (store)
6
IBE
Bus Error exception (instruction fetch)
7
DBE
Bus Error exception (data load or store)
8
Sys
System Call exception
9
Bp
Breakpoint exception
10
RI
Reserved Instruction exception
11
CpU
Coprocessor Unusable exception
12
Ov
Integer Overflow exception
13
Tr
Trap exception
14 to 22

Reserved for future use
23
WATCH
Watch exception
24 to 31

Reserved for future use
The VR4100 Series has eight interrupt request sources, IP7 to IP0, that are used for the following purpose. For
the detailed description of interrupts, refer to Chapter 8.
(1) IP7
This bit indicates whether there is a timer interrupt request.
It is set when the values of Count register and Compare register match.
(2) IP6 to IP2
IP6 to IP2 reflect the state of the interrupt request signal of the CPU core.
(3) IP1 and IP0
These bits are used to set/clear a software interrupt request.
166
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.7 Exception Program Counter (EPC) register (14)
The Exception Program Counter (EPC) is a read/write register that contains the address at which processing
resumes after an exception has been serviced. The contents of this register change depending on whether execution
of MIPS16 instructions is enabled or disabled.
Setting the MIPS16EN pin after RTC reset specifies whether
execution of the MIPS16 instructions is enabled or disabled.
When the MIPS16 instruction execution is disabled, the EPC register contains either:
• Virtual address of the instruction that caused the exception, or
• Virtual address of the immediately preceding branch or jump instruction (when the instruction associated with
the exception is in a branch delay slot, and the BD bit in the Cause register is set to 1).
When the MIPS16 instruction execution is enabled, the EPC register contains either:
• Virtual address of the instruction that caused the exception and ISA mode at which an exception occurs, or
• Virtual address of the immediately preceding branch or jump instruction and ISA mode at which an exception
occurs (when the instruction associated with the exception is in a branch delay slot of the jump instruction, and
the BD bit in the Cause register is set to 1).
When the 16-bit instruction is executed, the EPC register contains either:
• Virtual address of the instruction that caused the exception and ISA mode at which an exception occurs, or
• Virtual address of the immediately preceding Extend or jump instruction and ISA mode at which an exception
occurs (when the instruction associated with the exception is in a branch delay slot of the jump instruction or in
the instruction following the Extend instruction, and the BD bit in the Cause register is set to 1).
The EXL bit in the Status register is set to 1 to keep the processor from overwriting the address of the exceptioncausing instruction contained in the EPC register in the event of another exception.
The EPC register never indicates the address of the instruction in branch delay slot.
Figure 6-8 shows the EPC register format when MIPS16 ISA is disabled, and Figure 6-9 shows the EPC register
format when MIPS16 ISA is enabled.
Figure 6-8. EPC Register (When MIPS16 ISA Is Disabled)
(a) 32-bit Mode
31
0
EPC
(b) 64-bit Mode
63
0
EPC
EPC: Restart address after exception processing.
User’s Manual U15509EJ2V0UM
167
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-9. EPC Register (When MIPS16 ISA Is Enabled)
31
1
EPC
0
EIM
EPC: Bits 31 to 1 of restart address after exception processing.
EIM: ISA mode at which an exception occurs.
(1 → when MIPS16 SIA instruction is executed, 0 → when MIPS III ISA instruction is executed.)
63
1
EPC
0
EIM
EPC: Bits 63 to 1 of restart address after exception processing.
EIM: ISA mode at which an exception occurs.
(1 → when MIPS16 SIA instruction is executed, 0 → when MIPS III ISA instruction is executed.)
6.2.8 WatchLo (18) and WatchHi (19) registers
The VR4100 Series processor provides a debugging feature to detect references to a selected physical address;
load and store instructions to the location specified by the WatchLo and WatchHi registers cause a Watch exception.
Figures 6-10 and 6-11 show the format of the WatchLo and WatchHi registers.
The contents of these registers after reset are undefined so that they must be initialized by software.
Figure 6-10. WatchLo Register
31
3
PAddr0
2
1
0
0
R
W
PAddr0: Specifies physical address bits 31 to 3.
R:
Specifies detection of watch address references when load instructions are executed (1 →
Detect, 0 → Not detect).
W:
Specifies detection of watch address references when store instructions are executed (1 →
Detect, 0 → Not detect).
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
Figure 6-11. WatchHi Register
31
0
0
0:
168
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.9 XContext register (20)
The read/write XContext register contains a pointer to an entry in the page table entry (PTE) array, an operating
system data structure that stores virtual-to-physical address translations. If a TLB miss occurs, the operating system
loads the untranslated data from the PTE into the TLB to handle the software error.
The XContext register is used by the XTLB Refill exception handler to load TLB entries in 64-bit addressing mode.
The XContext register duplicates some of the information provided in the BadVAddr register, and puts it in a form
useful for the XTLB exception handler.
This register is included solely for operating system use. The operating system sets the PTEBase field in the
register, as needed. Figure 6-12 shows the format of the XContext register.
Figure 6-12. XContext Register
63
35 34 33 32
PTEBase
R
4
BadVPN2
3
0
0
PTEBase:
The PTEBase field is a base address of the PTE entry table.
R:
Space type (00 → User, 01→ Supervisor, 11 → Kernel). The setting of this field matches virtual
address bits 63 and 62.
BadVPN2:
This field holds the value (VPN2) obtained by halving the virtual page number of the most recent
virtual address for which translation failed.
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
The 29-bit BadVPN2 field has bits 39 to 11 of the virtual address that caused the TLB miss; bit 10 is excluded
because a single TLB entry maps to an even-odd page pair. For a 1 KB page size, this format may be used directly
to address the pair-table of 8-byte PTEs. For 4 KB-or-more page and PTE sizes, shifting or masking this value
produces the appropriate address.
User’s Manual U15509EJ2V0UM
169
CHAPTER 6 EXCEPTION PROCESSING
6.2.10 Parity Error register (26)
The Parity Error (PErr) register is a readable/writable register. This register is defined to maintain softwarecompatibility with the VR4100, and is not used in hardware because the VR4100 Series has no parity.
Figure 6-13 shows the format of the PErr register.
Figure 6-13. Parity Error Register
31
8 7
0
0
Diagnostic
Diagnostic: 8-bit self diagnostic field.
0:
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
6.2.11 Cache Error register (27)
The Cache Error register is a readable/writable register. This register is defined to maintain software-compatibility
with the VR4100, and is not used in hardware because the VR4100 Series has no parity.
Figure 6-14 shows the format of the Cache Error register.
Figure 6-14. Cache Error Register
31
0
0
0:
170
Reserved for future use. Write 0 in a write operation. When this field is read, 0 is read.
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.2.12 ErrorEPC register (30)
The Error Exception Program Counter (ErrorEPC) register is similar to the EPC register. It is used to store the
Program Counter value at which the Cold Reset, Soft Reset, or NMI exception has been serviced.
The read/write ErrorEPC register contains the virtual address at which instruction processing can resume after
servicing an error. The contents of this register change depending on whether execution of MIPS16 instructions is
enabled or disabled.
Setting the MIPS16EN pin after RTC reset specifies whether the execution of MIPS16
instructions is enabled or disabled.
When the MIPS16 ISA is disabled, this address can be:
• Virtual address of the instruction that caused the exception, or
• Virtual address of the immediately preceding branch or jump instruction, when the instruction associated with
the error exception is in a branch delay slot.
When the MIPS16 instruction execution is enabled during a 32-bit instruction execution, this address can be:
• Virtual address of the instruction that caused the exception and ISA mode at which an exception occurs, or
• Virtual address of the immediately preceding branch or jump instruction and ISA mode at which an exception
occurs when the instruction associated with the exception is in a branch delay slot.
When the MIPS16 instruction execution is enabled during a 16-bit instruction execution, this address can be:
• Virtual address of the instruction that caused the exception and ISA mode at which an exception occurs, or
• Virtual address of the immediately preceding jump instruction or Extend instruction and ISA mode at which an
exception occurs when the instruction associated with the exception is in a branch delay slot of the jump
instruction or is the instruction following the Extend instruction.
The contents of the ErrorEPC register do not change when the ERL bit of the Status register is set to 1. This
prevents the processor when other exceptions occur from overwriting the address of the instruction in this register
which causes an error exception.
There is no branch delay slot indication for the ErrorEPC register.
Figure 6-15 shows the format of the ErrorEPC register when the MIPS16ISA is disabled. Figure 6-16 shows the
format of the ErrorEPC register when the MIPS16ISA is enabled.
User’s Manual U15509EJ2V0UM
171
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-15. ErrorEPC Register (When MIPS16 ISA Is Disabled)
(a) 32-bit Mode
31
0
ErrorEPC
(b) 64-bit Mode
63
0
ErrorEPC
ErrorEPC: Program counter that indicates the restart address after Cold reset, Soft reset, or NMI
exception.
Figure 6-16. ErrorEPC Register (When MIPS16 ISA Is Enabled)
(a) 32-bit mode
31
1
ErrorEPC
0
ErIM
ErrorEPC: Bits 31 to 1 of virtual restart address after Cold reset, Soft reset, or NMI exception.
ISA mode at which an error exception occurs (1 → MIPS16 ISA, 0 → MIPS III ISA).
ErIM:
(b) 64-bit mode
63
1
ErrorEPC
0
ErIM
ErrorEPC: Bits 63 to 1 of virtual restart address after Cold reset, Soft reset, or NMI exception.
ErIM:
172
ISA mode at which an error exception occurs (1 → MIPS16 ISA, 0 → MIPS III ISA).
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.3 Overview of Exceptions
When the processor takes an exception, the EXL bit is set to 1, meaning the system is in Kernel mode. After
saving the appropriate state, the exception handler typically resets the EXL bit back to 0. The exception handler sets
the EXL bit to 1 so that the saved state is not lost upon the occurrence of another exception while the saved state is
being restored.
Returning from an exception also resets the EXL bit to 0. For details, see CHAPTER 9 CPU INSTRUCTION SET
DETAILS.
Remark When the EXL and ERL bits in the Status register are 0, either User, Supervisor, or Kernel operating
mode is specified by the KSU bits in the Status register. When either the EXL or ERL bit is set to 1,
the processor is in Kernel mode.
6.3.1 Exception types
Exceptions are classified to as follows according to the internal status of the processor retained at the occurrence
of an exception.
• Cold Reset
• Soft Reset, NMI
• Remaining processor exceptions (common exceptions)
6.3.2 Exception vector locations
When an exception occurs, the exception vector address is set to the program counter and the processing
branches to there from the main program. A program called exception handler that processes exceptions must be
placed at the location of the exception vector address.
A vector address is calculated by adding a vector offset to a base address. Each exception type has a different
vector address.
64-/32-bit mode exception vectors and their offsets are shown below.
User’s Manual U15509EJ2V0UM
173
CHAPTER 6 EXCEPTION PROCESSING
Table 6-3. 32-Bit Mode Exception Vector Base Addresses
Exception
Vector base address (virtual)
Vector offset
Cold Reset
Soft Reset
NMI
0xBFC0 0000
0x0000
TLB Refill (EXL = 0)
0x8000 0000 (BEV = 0)
0x0000
XTLB Refill (EXL = 0)
0xBFC0 0200 (BEV = 1)
0x0080
(BEV is automatically set to 1)
Others
0x0180
Table 6-4. 64-Bit Mode Exception Vector Base Addresses
Exception
Vector base address (virtual)
Vector offset
Cold Reset
Soft Reset
NMI
0xFFFF FFFF BFC0 0000
0x0000
TLB Refill (EXL = 0)
0xFFFF FFFF 8000 0000 (BEV = 0)
0x0000
XTLB Refill (EXL = 0)
0xFFFF FFFF BFC0 0200 (BEV = 1)
0x0080
(BEV is automatically set to 1)
Others
0x0180
(1) Vector of Cold Reset, Soft Reset, and NMI exceptions
The Cold Reset, Soft Reset, and NMI exceptions are always branched to the following reset exception vector
address (virtual). This address is in an uncached, unmapped space.
• 0xBFC0 0000 in 32-bit mode
• 0xFFFF FFFF BFC0 0000 in 64-bit mode
(2) TLB Refill exception vector
When BEV bit = 0, the vector base address (virtual) for the TLB Refill exception is in kseg0 (unmapped) space.
• 0x8000 0000 in 32-bit mode
• 0xFFFF FFFF 8000 0000 in 64-bit mode
When BEV bit = 1, the vector base address (virtual) for the TLB Refill exception is in kseg1 (uncached,
unmapped) space.
• 0xBFC0 0200 in 32-bit mode
• 0xFFFF FFFF BFC0 0200 in 64-bit mode
This is an uncached, non-TLB-mapped space, allowing the exception handler to bypass the cache and TLB.
(3) Common exception vector
Addresses for the remaining exceptions are a combination of a vector offset and a base address.
174
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.3.3 Priority of exceptions
While more than one exception can occur for a single instruction, only the exception with the highest priority is
reported. Table 6-5 lists the priorities.
Table 6-5. Exception Priority Order
Priority
Exceptions
High
Cold Reset
↑
Soft Reset

NMI

Address Error (instruction fetch)

TLB/XTLB Refill (instruction fetch)

TLB Invalid (instruction fetch)

Bus Error (instruction fetch)

System Call

Breakpoint

Coprocessor Unusable

Reserved Instruction

Trap

Integer Overflow

Address Error (data access)

TLB/XTLB Refill (data access)

TLB Invalid (data access)

TLB Modified (data write)

Watch
↓
Bus Error (data access)
Low
Interrupt (other than NMI)
Hereafter, handling exceptions by hardware is referred to as “process”, and handling exception by software is
referred to as “service”.
User’s Manual U15509EJ2V0UM
175
CHAPTER 6 EXCEPTION PROCESSING
6.4 Details of Exceptions
6.4.1 Cold Reset exception
Cause
The Cold Reset exception occurs when the ColdReset# signal (internal) is asserted and then deasserted. This
exception is not maskable. The Reset# signal (internal) must be asserted along with the ColdReset# signal (for
details, see Hardware User's Manual of each processor).
Processing
The CPU provides a special interrupt vector for this exception:
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode
The Cold Reset vector resides in unmapped and uncached CPU address space, so the hardware need not
initialize the TLB or the cache to process this exception. It also means the processor can fetch and execute
instructions while the caches and virtual memory are in an undefined state.
The contents of all registers in the CPU are undefined when this exception occurs, except for the following register
fields:
• When the MIPS16 instruction execution is disabled while the ERL of Status register is 0, the PC value at
which an exception occurs is set to the ErrorEPC register.
When the MIPS16 instruction execution is enabled while the ERL of Status register is 0, the PC value at
which an exception occurs is set to the ErrorEPC register and the ISA mode in which an exception occurs is
set to the least significant bit of the ErrorEPC register.
• TS (VR4181 only) and SR of the Status register are cleared to 0.
• ERL and BEV of the Status register are set to 1.
• The Random register is initialized to the value of its upper bound (31).
• The Wired register and the Count register are initialized to 0.
• R and W of the WatchLo register are cleared to 0 (other than VR4181).
• IS and BP of the Config register are cleared to 0 (VR4122, VR4131, and VR4181A only).
• In the VR4121 and VR4181, bits 31 to 28 and bits 22 to 3 of the Config register are set to fixed values.
• In the VR4122, bits 30 to 28, bits 22 to 17, bits 15 to 6, bit 4, and bit 3 of the Config register are set to fixed
values.
• In the VR4131 and VR4181A, bits 30 to 28, bits 22 to 17, bits 15 to 6, and bit 3 of the Config register are set to
fixed values.
• All other bits are undefined.
Servicing
The Cold Reset exception is serviced by:
• Initializing all processor registers, coprocessor registers, TLB, caches, and the memory system
• Performing diagnostic tests
• Bootstrapping the operating system
176
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.2 Soft Reset exception
Cause
A Soft Reset (sometimes called Warm Reset) occurs when the ColdReset# signal remains deasserted while the
Reset# signal goes from assertion to deassertion (for details, see Hardware User's Manual of each processor).
A Soft Reset immediately resets all state machines, and sets the SR bit of the Status register. Execution begins at
the reset vector when the Reset# is deasserted. This exception is not maskable.
Caution In the VR4100 Series, a Soft Reset never occurs.
Processing
The CPU provides a special interrupt vector for this exception (same location as Cold Reset):
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode
This vector is located within unmapped and uncached address space, so that the cache and TLB need not be
initialized to process this exception. The SR bit of the Status register is set to 1 to distinguish this exception from
a Cold Reset exception.
When this exception occurs, the contents of all registers are preserved except for the following registers:
• When the MIPS16 instruction execution is disabled, the PC value at which an exception occurs is set to the
ErrorEPC register.
When the MIPS16 instruction execution is enabled, the PC value at which an exception occurs is set to the
ErrorEPC register and the ISA mode in which an exception occurs is set to the least significant bit of the
ErrorEPC register.
• TS bit of the Status register is cleared to 0 (VR4181 only).
• ERL, SR, and BEV bits of the Status register are set to 1.
• R and W of the WatchLo register are cleared to 0 (other than VR4181).
During a Soft Reset, access to the operating cache or system interface may be aborted. This means that the
contents of the cache and memory will be undefined if a Soft Reset occurs.
Servicing
The Soft Reset exception is serviced by:
• Preserving the current processor states for diagnostic tests
• Reinitializing the system in the same way as for a Cold Reset exception
User’s Manual U15509EJ2V0UM
177
CHAPTER 6 EXCEPTION PROCESSING
6.4.3 NMI exception
Cause
The Nonmaskable Interrupt (NMI) exception occurs when the NMI signal (internal) becomes active. This interrupt
is not maskable; it occurs regardless of the settings of the EXL, ERL, and the IE bits in the Status register (for
details, see CHAPTER 8 CPU CORE INTERRUPTS).
Processing
The CPU provides a special interrupt vector for this exception:
• 0xBFC0 0000 (virtual) in 32-bit mode
• 0xFFFF FFFF BFC0 0000 (virtual) in 64-bit mode
This vector is located within unmapped and uncached address space so that the cache and TLB need not be
initialized to process an NMI interrupt. The SR bit of the Status register is set to 1 to distinguish this exception
from a Cold Reset exception.
Unlike Cold Reset and Soft Reset, but like other exceptions, NMI is taken only at instruction boundaries. The
states of the caches and memory system are preserved by this exception.
When this exception occurs, the contents of all registers are preserved except for the following registers:
• When the MIPS16 instruction execution is disabled, the PC value at which an exception occurs is set to the
ErrorEPC register.
When the MIPS16 instruction execution is enabled, the PC value at which an exception occurs is set to the
ErrorEPC register and the ISA mode in which an exception occurs is set to the least significant bit of the
ErrorEPC register.
• The TS bit of the Status register is cleared to 0 (VR4181 only).
• The ERL, SR, and BEV bits of the Status register are set to 1.
Servicing
The NMI exception is serviced by:
• Preserving the current processor states for diagnostic tests
• Reinitializing the system in the same way as for a Cold Reset exception
178
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.4 Address Error exception
Cause
The Address Error exception occurs when an attempt is made to execute one of the following. This exception is
not maskable.
• Execution of the LW, LWU, SW, or CACHE instruction for word data that is not located on a word boundary
• Execution of the LH, LHU, or SH instruction for half-word data that is not located on a half-word boundary
• Execution the LD or SD instruction for double-word data that is not located on a double-word boundary
• Referencing the kernel address space in User or Supervisor mode
• Referencing the supervisor space in User mode
• Referencing an address that does not exist in the kernel, user, or supervisor address space in 64-bit Kernel,
User, or Supervisor mode
• Branching to an address that was not located on a ward boundary when the MIPS16 instruction is disabled
• Branching to address whose least-significant 2 bits are 10 when the MIPS16 instruction is enabled
Processing
The common exception vector is used for this exception. The AdEL or AdES code in the Cause register is set. If
this exception has been caused by an instruction reference or load operation, AdEL is set. If it has been caused
by a store operation, AdES is set.
When this exception occurs, the BadVAddr register stores the virtual address that was not properly aligned or was
referenced in protected address space. The contents of the VPN field of the Context and EntryHi registers are
undefined, as are the contents of the EntryLo register.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
TM
The kernel reports the UNIX
SIGSEGV (segmentation violation) signal to the current process, and this exception
is usually fatal.
User’s Manual U15509EJ2V0UM
179
CHAPTER 6 EXCEPTION PROCESSING
6.4.5 TLB exceptions
Three types of TLB exceptions can occur:
• A TLB Refill exception occurs when there is no TLB entry that matches a referenced address.
• A TLB Invalid exception occurs when a TLB entry that matches a referenced virtual address is marked as
being invalid (with the V bit set to 0).
• A TLB Modified exception occurs when a TLB entry that matches a virtual address referenced by the store
instruction is marked as being valid (with the V bit set to 1) though a write to it is disabled (with the D bit set to
0).
The following three sections describe these TLB exceptions.
(1) TLB Refill exception (32-bit space mode)/XTLB Refill exception (64-bit space mode)
Cause
The TLB Refill exception occurs when there is no TLB entry to match a reference to a mapped address space.
This exception is not maskable.
Processing
There are two special exception vectors for this exception; one for references to 32-bit address spaces, and one
for references to 64-bit address spaces. The UX, SX, and KX bits of the Status register determine whether the
user, supervisor or kernel address spaces referenced are 32-bit or 64-bit spaces. When the EXL bit of the Status
register is set to 0, either of these two special vectors is referenced. When the EXL bit is set to 1, the common
exception vector is referenced.
This exception sets the TLBL or TLBS code in the ExcCode field of the Cause register. If this exception has been
caused by an instruction reference or load operation, TLBL is set. If it has been caused by a store operation,
TLBS is set.
When this exception occurs, the BadVAddr, Context, XContext and EntryHi registers hold the virtual address that
failed address translation. The EntryHi register also contains the ASID from which the translation fault occurred.
The Random register normally contains a valid location in which to place the replacement TLB entry.
The
contents of the EntryLo register are undefined.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
180
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
Servicing
To service this exception, the contents of the Context or XContext register are used as a virtual address to fetch
memory words containing the physical page frame and access control bits for a pair of TLB entries. The memory
word is written into the TLB entry by using the EntryLo0, EntryLo1, or EntryHi register.
It is possible that the physical page frame and access control bits are placed in a page where the virtual address
is not resident in the TLB. This condition is processed by allowing a TLB Refill exception in the TLB Refill
exception handler. In this case, the common exception vector is used because the EXL bit of the Status register is
set to 1.
(2) TLB Invalid exception
Cause
The TLB Invalid exception occurs when the TLB entry that matches with the virtual address to be referenced is
invalid (the V bit is set to 0). This exception is not maskable.
Processing
The common exception vector is used for this exception. The TLBL or TLBS code in the ExcCode field of the
Cause register is set. If this exception has been caused by an instruction reference or load operation, TLBL is set.
If it has been caused by a store operation, TLBS is set.
When this exception occurs, the BadVAddr, Context, XContext, and EntryHi registers contain the virtual address
that failed address translation.
The EntryHi register also contains the ASID from which the translation fault
occurred. The Random register normally stores a valid location in which to place the replacement TLB entry. The
contents of the EntryLo register are undefined.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
Usually, the V bit of a TLB entry is cleared in the following cases:
• When the virtual address does not exist
• When the virtual address exists, but is not in main memory (a page fault)
• When a trap is required on any reference to the page (for example, to maintain a reference bit)
After servicing the cause of a TLB Invalid exception, the TLB entry is located with a TLBP (TLB Probe) instruction,
and replaced by an entry with its V bit set to 1.
User’s Manual U15509EJ2V0UM
181
CHAPTER 6 EXCEPTION PROCESSING
(3) TLB Modified exception
Cause
The TLB Modified exception occurs when the TLB entry that matches with the virtual address referenced by the
store instruction is valid (bit V is 1) but is not writable (bit D is 0). This exception is not maskable.
Processing
The common exception vector is used for this exception, and the Mod code in the ExcCode field of the Cause
register is set.
When this exception occurs, the BadVAddr, Context, XContext, and EntryHi registers contain the virtual address
that failed address translation.
The EntryHi register also contains the ASID from which the translation fault
occurred. The contents of the EntryLo register are undefined.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
The kernel uses the failed virtual address or virtual page number to identify the corresponding access control bits.
The page identified may or may not permit write accesses; if writes are not permitted, a write protection violation
occurs.
If write accesses are permitted, the page frame is marked dirty (i.e. writable) by the kernel in its own data
structures.
The TLBP instruction places the index of the TLB entry that must be altered into the Index register. The word data
containing the physical page frame and access control bits (with the D bit set to 1) is loaded to the EntryLo
register, and the contents of the EntryHi and EntryLo registers are written into the TLB.
182
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.6 Bus Error exception
Cause
A Bus Error exception is raised by board-level circuitry for events such as bus time-out, local bus parity errors, and
invalid physical memory addresses or access types. This exception is not maskable.
A Bus Error exception occurs only when a cache miss refill, uncached reference, or unbuffered write occurs
synchronously.
Processing
The common interrupt vector is used for a Bus Error exception. The IBE or DBE code in the ExcCode field of the
Cause register is set, signifying whether the instruction caused the exception by an instruction reference, load
operation, or store operation.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Note that the EPC register may indicate a succeeding instruction instead of the instruction that caused the
exception if the Instruction Streaming function is on in the VR4122, VR4131, and VR4181A.
Servicing
The physical address at which the fault occurred can be computed from information available in the System
Control Coprocessor (CP0) registers.
• If the IBE code in the Cause register is set (indicating an instruction fetch), the virtual address is contained in
the EPC register.
• If the DBE code is set (indicating a load or store), the virtual address of the instruction that caused the
exception is saved to the EPC register.
The virtual address of the load and store instruction can then be obtained by interpreting the instruction. The
physical address can be obtained by using the TLBP instruction and reading the EntryLo register to compute the
physical page number.
At the time of this exception, the kernel reports the UNIX SIGBUS (bus error) signal to the current process, but the
exception is usually fatal.
User’s Manual U15509EJ2V0UM
183
CHAPTER 6 EXCEPTION PROCESSING
6.4.7 System Call exception
Cause
A System Call exception occurs during an attempt to execute the SYSCALL instruction. This exception is not
maskable.
Processing
The common exception vector is used for this exception, and the Sys code in the ExcCode field of the Cause
register is set.
The EPC register contains the address of the SYSCALL instruction unless it is in a branch delay slot, in which
case the EPC register contains the address of the preceding branch instruction.
If the SYSCALL instruction is in a branch delay slot, the BD bit of the Status register is set to 1; otherwise this bit is
cleared.
Servicing
When this exception occurs, control is transferred to the applicable system routine.
To resume execution, the EPC register must be altered so that the SYSCALL instruction does not re-execute; this
is accomplished by adding a value of 4 to the EPC register before returning.
If a SYSCALL instruction is in a branch delay slot, interpretation of the branch instruction is required to resume
execution.
184
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.8 Breakpoint exception
Cause
A Breakpoint exception occurs when an attempt is made to execute the BREAK instruction. This exception is not
maskable.
Processing
The common exception vector is used for this exception, and the BP code in the ExcCode field of the Cause
register is set.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
If the BREAK instruction is in a branch delay slot, the BD bit of the Status register is set to 1; otherwise this bit is
cleared.
Servicing
When the Breakpoint exception occurs, control is transferred to the applicable system routine.
Additional
distinctions can be made by analyzing the unused bits of the BREAK instruction (bits 25 to 6), and loading the
contents of the instruction whose address the EPC register contains. A value of 4 must be added to the contents
of the EPC register to locate the instruction if it resides in a branch delay slot.
To resume execution, the EPC register must be altered so that the BREAK instruction does not re-execute; this is
accomplished by adding a value of 4 to the EPC register before returning.
When a Breakpoint exception occurs while executing the MIPS16 instruction, a valve of 2 should be added to the
EPC register before returning.
If a BREAK instruction is in a branch delay slot, interpretation (decoding) of the branch instruction is required to
resume execution.
User’s Manual U15509EJ2V0UM
185
CHAPTER 6 EXCEPTION PROCESSING
6.4.9 Coprocessor Unusable exception
Cause
The Coprocessor Unusable exception occurs when an attempt is made to execute a coprocessor instruction for
either:
• a corresponding coprocessor unit that has not been marked usable (Status register bit, CU0 = 0), or
• CP0 instructions, when the unit has not been marked usable (Status register bit, CU0 = 0) and the process
executes in User or Supervisor mode.
This exception is not maskable.
Processing
The common exception vector is used for this exception, and the CpU code in the ExcCode field of the Cause
register is set. The CE bit of the Cause register indicates which of the four coprocessors was referenced.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
The coprocessor unit to which an attempted reference was made is identified by the CE bit of the Cause register.
One of the following processing is performed by the handler:
• If the process is entitled access to the coprocessor, the coprocessor is marked usable and the corresponding
state is restored to the coprocessor.
• If the process is entitled access to the coprocessor, but the coprocessor does not exist or has failed,
interpretation of the coprocessor instruction is possible.
• If the BD bit in the Cause register is set to 1, the branch instruction must be interpreted; then the coprocessor
instruction can be emulated and execution resumed with the EPC register advanced past the coprocessor
instruction.
• If the process is not entitled access to the coprocessor, the kernel reports UNIX SIGILL/ILL_PRIVIN_FAULT
(illegal instruction/privileged instruction fault) signal to the current process, and this exception is fatal.
186
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.10 Reserved Instruction exception
Cause
The Reserved Instruction exception occurs when an attempt is made to execute one of the following instructions:
• Instruction with an undefined major opcode (bits 31 to 26)
• SPECIAL instruction with an undefined minor opcode (bits 5 to 0)
• REGIMM instruction with an undefined minor opcode (bits 20 to 16)
• 64-bit instructions in 32-bit User or Supervisor mode
• RR instruction with an undefined minor op code (bits 4 to 0) when executing the MIPS16 instruction
• I8 instruction with an undefined minor op code (bits 10 to 8) when executing the MIPS16 instruction
64-bit operations are always valid in Kernel mode regardless of the value of the KX bit in the Status register. This
exception is not maskable.
Processing
The common exception vector is used for this exception, and the RI code in the ExcCode field of the Cause
register is set.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
All currently defined MIPS ISA instructions can be executed. The process executing at the time of this exception
is handled by a UNIX SIGILL/ILL_RESOP_FAULT (illegal instruction/reserved operand fault) signal. This error is
usually fatal.
User’s Manual U15509EJ2V0UM
187
CHAPTER 6 EXCEPTION PROCESSING
6.4.11 Trap exception
Cause
The Trap exception occurs when a TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEUI, TLTI, TLTUI, TEQI, or TNEI
instruction results in a TRUE condition. This exception is not maskable.
Processing
The common exception vector is used for this exception, and the Tr code in the ExcCode field of the Cause
register is set.
The EPC register contains the address of the trap instruction causing the exception unless the instruction is in a
branch delay slot, in which case the EPC register contains the address of the preceding branch instruction and the
BD bit of the Cause register is set to 1.
Servicing
At the time of a Trap exception, the kernel reports the UNIX SIGFPE/FPE_INTOVF_TRAP (floating-point
exception/integer overflow) signal to the current process, but the exception is usually fatal.
6.4.12 Integer Overflow exception
Cause
An Integer Overflow exception occurs when an ADD, ADDI, SUB, DADD, DADDI, or DSUB instruction results in a
2’s complement overflow. This exception is not maskable.
Processing
The common exception vector is used for this exception, and the Ov code in the ExcCode field of the Cause
register is set.
The EPC register contains the address of the instruction that caused the exception unless the instruction is in a
branch delay slot, in which case the EPC register contains the address of the preceding branch instruction and the
BD bit of the Cause register is set to 1.
Servicing
At the time of the exception, the kernel reports the UNIX SIGFPE/FPE_INTOVF_TRAP (floating-point
exception/integer overflow) signal to the current process, and this exception is usually fatal.
188
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.4.13 Watch exception
Cause
A Watch exception occurs when a load or store instruction references the physical address specified by the
WatchLo/WatchHi registers. The WatchLo/WatchHi registers specify whether a load or store or both could have
initiated this exception.
• When the R bit of the WatchLo register is set to 1: Load instruction
• When the W bit of the WatchLo register is set to 1: Store instruction
• When both the R bit and W bit of the WatchLo register are set to 1: Load instruction or store instruction
The CACHE instruction never causes a Watch exception.
The Watch exception is postponed while the EXL bit in the Status register is set to 1, and Watch exception is
maskable by setting the EXL bit in the Status register to 1 or by setting the R or W bit in the WatchLo register to 0.
Processing
The common exception vector is used for this exception, and the WATCH code in the ExcCode field of the Cause
register is set.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
The Watch exception is a debugging aid; typically the exception handler transfers control to a debugger, allowing
the user to examine the situation. To continue, once the Watch exception must be disabled to execute the faulting
instruction. The Watch exception must then be reenabled. The faulting instruction can be executed either by the
debugger or by setting breakpoints.
The contents of the WatchLo/WatchHi register after reset are undefined so that they, especially the R and W bits,
must be initialized by software, otherwise a Watch exception may occur after reset.
User’s Manual U15509EJ2V0UM
189
CHAPTER 6 EXCEPTION PROCESSING
6.4.14 Interrupt exception
Cause
The Interrupt exception occurs when one of the eight interrupt conditions
Note
is asserted. In the VR4100 Series,
interrupt requests from internal peripheral units first enter the ICU and are then notified to the CPU core via one of
five interrupt sources (Int(4:0)) or NMI.
Each of the eight interrupts can be masked by clearing the corresponding bit in the IM field of the Status register,
and all of the eight interrupts can be masked at once by clearing the IE bit of the Status register or setting the
EXL/ERL bit.
Note They are 1 timer interrupt, 5 ordinary interrupts, and 2 software interrupts.
Of the five ordinary interrupts, Int3 becomes active in the VR4121 and VR4181A only, and Int4 in the VR4181A
only.
For details about the Interrupt Control Unit (ICU), refer to Hardware User's Manual of each processor.
Processing
The common exception vector is used for this exception, and the Int code in the ExcCode field of the Cause
register is set.
The IP field of the Cause register indicates current interrupt requests. It is possible that more than one of the bits
can be simultaneously set (or cleared) if the interrupt request signal is asserted (or deasserted) before this register
is read.
When the MIPS16 instruction is disabled, the EPC register contains the address of the instruction that caused the
exception. However, if this instruction is in a branch delay slot, the EPC register contains the address of the
preceding jump or branch instruction, and the BD bit of the Cause register is set to 1.
When the MIPS16 instruction is enabled, the EPC register contains the address of the instruction that caused the
exception, and the least significant bit stores the ISA mode in which an exception occurs. However, if this
instruction is in a branch delay slot or is the instruction following the Extend instruction, the EPC register contains
the address of the preceding jump or Extend instruction, and the BD bit of the Cause register is set to 1.
Servicing
If the interrupt is caused by one of the two software-generated exceptions, the interrupt condition is cleared by
setting the corresponding Cause register bit to 0.
If the interrupt is caused by hardware, the interrupt condition is cleared by deactivating the corresponding interrupt
request signal.
190
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
6.5 Exception Processing and Servicing Flowcharts
The remainder of this chapter contains flowcharts for the following exceptions and guidelines for their handlers:
• Common exceptions and a guideline to their exception handler
• TLB/XTLB Refill exception and a guideline to their exception handler
• Cold Reset, Soft Reset and NMI exceptions, and a guideline to their handler.
User’s Manual U15509EJ2V0UM
191
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-17. Common Exception Handling (1/2)
(a) Processing by hardware
Start
EntryHi, XContext/Context registers
are set when a TLB Refill, TLB Invalid,
or TLB Modified exception occurs.
EntryHi ← VPN2, ASID
XContext/Context ← VPN2
Set ExcCode, CE fields
Check for multiple exceptions
No
EXL bit = 0?
Yes
M16 bit = 0?
No
Yes
Instruction
in branch delay
slot?
Instruction
in branch delay
slot?
No
No
Yes
Yes
BD bit ← 1
EPC ← PC−4
BD bit ← 0
EPC ← PC
BD bit ← 1
EPC ← PC−4Note1
EIM bit ← 0/1
BD bit ← 0
EPC ← PCNote2
EIM bit ← 0/1
Kernel mode is set and interrupts
are disabled.
EXL bit ← 1
BadVAddr register is set only when
a TLB Refill, TLB Invalid, or TLB
Modified exception occurs
(it is not set when a Bus Error
exception occurs).
No
BEV bit = 0?
Yes
Normal
PC ← 0xFFFF FFFF 8000 0000+180
(Unmapped, cacheable)
Bootstrap
PC ← 0xFFFF FFFF BFC0 0200+180
(Unmapped, uncacheable)
A
Notes 1. PC – 2 when the JR or JALR instruction of MIPS16 instructions
2. PC – 2 when the Extend instruction of MIPS16 instructions
Remark
The interrupts can be masked by setting the IE or IM bit. The Watch exception can be set to
pending state by setting the EXL bit to 1.
192
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-17. Common Exception Handling (2/2)
(b) Servicing by software
A
The occurrence of TLB Refill, TLB Invalid, and TLB Modified
exceptions is disabled by using an unmapped space.
The occurrence of the Watch and Interrupt exceptions is
disabled setting EXL = 1.
Execute MFC0 instruction
XContext/Context register
EPC register
Status register
Cause register
The Cold Reset, Soft Reset, and NMI exceptions are
enabled.
Other exceptions are avoided in the OS programs.
Execute MTC0 instruction
(Status register setting)
KSU bits ← 00
EXL bit ← 0
IE bit ← 1
In Kernel mode, interrupts are enabled.
After EXL = 0 is set, all exceptions are enabled (although
the interrupt exception can be masked by the IE and IM bits).
Check the Cause register,
and jump to each routine
TS bit = 0?
No
Yes
Servicing by each exception routine
VR4181 only.
The processor is reset.
The register files are saved.
EXL bit = 1
Execute MTC0 instruction
EPC register
Status register
Execute ERET instruction
The execution of the ERET instruction is disabled in the
delay slots for the other jump instructions.
The processor does not execute an instruction in the branch
delay slot for the ERET instruction.
PC ← EPC register, EXL bit ← 0
End
User’s Manual U15509EJ2V0UM
193
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-18. TLB/XTLB Refill Exception Handling (1/2)
(a) Processing by hardware
Start
EntryHi ← VPN2, ASID
XContext/Context ← VPN2
Sets ExcCode, CE fields
EXL bit = 0?
Check for multiple exceptions
No
Yes
M16 bit = 0?
No
Yes
Instruction
in branch delay
slot?
Instruction
in branch delay
slot?
No
Yes
Yes
BD bit ← 1
EPC ← PC−4
XTLB
exception?
No
BD bit ← 0
EPC ← PC
BD bit ← 1
EPC ← PC−4Note1
EIM bit ← 0/1
BD bit ← 0
EPC ← PCNote2
EIM bit ← 0/1
No
Yes
XTLB Refill
Vector offset = 0x080
TLB Refill
Vector offset = 0x000
EXL bit ← 1
BEV bit = 0?
Kernel mode is set and interrupts
are disabled.
No
Normal
Bootstrap
PC ← 0xFFFF FFFF 8000 0000 + Vector offset
(Unmapped, cacheable)
PC ← 0xFFFF FFFF BFC0 0200 + Vector offset
(Unmapped, uncacheable)
Yes
B
Notes 1. PC – 2 when the JR or JALR instruction of MIPS16 instructions
2. PC – 2 when the Extend instruction of MIPS16 instructions
194
TLB Refill
Vector offset = 0x180
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-18. TLB/XTLB Refill Exception Handling (2/2)
(b) Servicing by software
B
Execute MFC0 instruction
XContext/Context register
Servicing
by each exception routineNote
Execute ERET instruction
The occurrence of TLB Refill, TLB Invalid, and TLB
Modified exceptions is disabled by using an unmapped space.
The occurrence of the Watch and Interrupt exceptions is
disabled by setting EXL = 1.
However, the Cold Reset, Soft Reset, and NMI exceptions
are enabled.
Other exceptions are avoided in the OS programs.
The physical address for a virtual address that is loaded into
the Context register is loaded into the EntryLo register and written
to the TLB.
The execution of the ERET instruction is not allowed in the
branch delay slots for other jump instructions.
The processor does not execute an instruction in the branch
delay slot for the ERET instruction.
PC ← EPC register, EXL bit ← 0
End
Note As long as a data/instruction address exists in the mapping space, another TLB Refill exception may
occur. In such a case, EXL = 1 is set, causing a jump to the common exception vector. In this case, the
common exception handler handles the TLB miss, the ERET instruction returns control to the user
program, then a TLB Refill exception is generated again.
User’s Manual U15509EJ2V0UM
195
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-19. Cold Reset Exception Handling
Start
Hardware
No
ERL bit = 0?
Yes
No
M16 bit = 0?
Instruction
in branch delay
slot?
Yes
No
Yes
Instruction
in branch delay
slot?
No
BD bit ← 1
ErrorEPC ← PC−4Note1
ErIM bit ← 0/1
Yes
BD bit ← 1
ErrorEPC ← PC−4
BD bit ← 0
ErrorEPC ← PCNote2
ErIM bit ← 0/1
BD bit ← 0
ErrorEPC ← PC
Random register ← 31
Wired register ← 0
Count register ← 0
Update Config register bits
Set WatchLo register
R bit ← 0
W bit ← 0
Set Status register
BEV bit ← 1
SR bit ← 0
TS bit ← 0
ERL bit ← 1
Refer to 6. 4. 1 about Config register
bits to be updated.
Setting WatchLo register is for
processors other than VR4181.
Manipulation of TS bit is for VR4181 only.
PC ← 0xFFFF FFFF BFC0 0000
Software
The processor provides no means
of distinguishing between an NMI
exception and Soft Reset exception,
so that this must be determined at
the system level.
No
NMI?
Yes
SR bit = 1?
Servicing by NMI
exception routine
No
Yes
Execute ERET instruction
Servicing by Soft Reset
exception routine
Servicing by Cold Reset
exception routine
End
Notes 1. PC – 2 when the JR or JALR instruction of MIPS16 instructions
2. PC – 2 when the Extend instruction of MIPS16 instructions
196
User’s Manual U15509EJ2V0UM
CHAPTER 6 EXCEPTION PROCESSING
Figure 6-20. Soft Reset and NMI Exception Handling
Start
Hardware
No
ERL bit = 0?
Yes
No
M16 bit = 0?
Instruction
in branch delay
slot?
Yes
No
Yes
Instruction
in branch delay
slot?
No
Yes
BD bit ← 1
ErrorEPC ← PC−4
BD bit ← 1
ErrorEPC ← PC−4Note1
ErIM bit ← 0/1
BD bit ← 0
ErrorEPC ← PCNote2
ErIM bit ← 0/1
BD bit ← 0
ErrorEPC ← PC
Setting WatchLo register (only for Soft
Reset) is for processors other than
VR4181.
Set WatchLo register
R bit ← 0
W bit ← 0
Set Status register
BEV bit ← 1
SR bit ← 1
TS bit ← 0
ERL bit ← 1
Manipulation of TS bit is for VR4181 only.
PC ← 0xFFFF FFFF BFC0 0000
Software
The processor provides no means of
distinguishing between an NMI
exception and Soft Reset exception,
so that this must be determined at the
system level.
No
NMI?
Yes
No
SR bit = 1?
Servicing by NMI
exception routine
Yes
Execute ERET instruction
Servicing by Soft Reset
exception routine
Servicing by Cold Reset
exception routine
End
Notes 1. PC – 2 when the JR or JALR instruction of MIPS16 instructions
2. PC – 2 when the Extend instruction of MIPS16 instructions
User’s Manual U15509EJ2V0UM
197
CHAPTER 7 CACHE MEMORY
This chapter describes in detail the cache memory of the VR4100 Series: its place in the CPU core memory
organization, and individual organization of the caches.
This chapter uses the following terminology:
• The data cache may also be referred to as the D-cache.
• The instruction cache may also be referred to as the I-cache.
These terms are used interchangeably throughout this book.
7.1 Memory Organization
Figure 7-1 shows the CPU core system memory hierarchy. In the logical memory hierarchy, the caches lie
between the CPU and main memory. They are designed to make the speedup of memory accesses transparent to
the user.
Each functional block in Figure 7-1 has the capacity to hold more data than the block above it. For instance,
physical main memory has a larger capacity than the caches. At the same time, each functional block takes longer to
access than any block above it. For instance, it takes longer to access data in main memory than in the CPU on-chip
registers.
Figure 7-1. Logical Hierarchy of Memory
CPU core
Register
Register
Instruction
cache
Data
cache
Register
Cache
Faster access
time
Main memory
Disks, CD-ROMs,
tapes, etc.
198
Memory
Media
User’s Manual U15509EJ2V0UM
Increasing data
capacity
CHAPTER 7 CACHE MEMORY
7.1.1 On-chip caches
The CPU core has two on-chip caches: one holds instructions (the instruction cache), the other holds data (the
data cache). The instruction and data caches can be read in one PClock cycle.
2 PCycles are needed to write data. However, data writes are pipelined and can complete at a rate of one per
PClock cycle. In the first stage of the cycle, the store address is translated and the tag is checked; in the second
stage, the data is written into the data RAM.
Figure 7-2 provides a relationship between cache and memory.
Figure 7-2. On-chip Caches and Main Memory
CPU core
Main memory
Cache controller
Instruction
cache
Data
cache
On-chip caches have the following characteristics:
• indexed with a virtual address
• holds physical address with a tag
• maintains coherency to memory with writeback
The cache data of the VR4121, VR4122, VR4181, and VR4181A are directly mapped; on the other hand those of
the VR4131 are mapped in 2-way set associative format. In addition, the caches of the VR4131 have line lock
function.
User’s Manual U15509EJ2V0UM
199
CHAPTER 7 CACHE MEMORY
7.2 Cache Organization
This section describes the organization of the on-chip data and instruction caches.
A cache consists of blocks called cache lines, which is the smallest unit of information that can be fetched from
main memory to a cache. A cache line itself has tag and data fields. Two types of line size can be selectable by
setting the Config register of the CP0 for the instruction cache line of the VR4122 and for the instruction/data cache
line of the VR4131.
7.2.1 Instruction cache line
Figure 7-3 shows the format of a 4-word (16-byte) I-cache line.
Figure 7-3. Instruction Cache Line Format
(a) VR4121, VR4122, VR4181
22 21
Tag
0
V
PTag
96 95
127
Data
Data
64 63
Data
32 31
Data
0
Data
(b) VR4131
23
Tag
V
22 21
L
PTag
96 95
127
Data
0
Data
64 63
Data
V
: Valid bit (line status)
L
: Lock bit (line lock status)
Ptag
: Physical tag (bits 31 to 10 of physical address)
Data
: Cache data
32 31
Data
0
Data
Remarks 1. In the VR4181A, the data field has 256 bits since the line size is 8 words (32 bytes), though the tag
format is the same as that of the VR4121, VR4122, and VR4181.
2. When the line size is specified as 8 words (32 bytes) in the VR4122 or VR4131, the data field
becomes 256 bits wide.
200
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
7.2.2 Data cache line
Figure 7-4 shows the format of a 4-word (16-byte) D-cache line.
Figure 7-4. Data Cache Line Format
(a) VR4121, VR4122, VR4181
Tag
24
23
22 21
W
V
D
0
PTag
64 63
127
Data
0
Data
Data
(b) VR4131
Tag
24
23
22 21
W
V
L
0
PTag
64 63
127
Data
W
Data
0
Data
: Write-back bit (set if cache line has been written)
V
: Valid bit (line status)
D
: Dirty bit (write status)
L
: Lock bit (line lock status)
Ptag
: Physical tag (bits 31 to 10 of physical address)
Data
: D-cache data
Remarks 1. In the VR4181A, the data field has 256 bits since the line size is 8 words (32 bytes), though the tag
format is the same as that of the VR4121, VR4122, and VR4181.
2. When the line size is specified as 8 words (32 bytes) in the VR4131, the data field becomes 256 bits
wide.
User’s Manual U15509EJ2V0UM
201
CHAPTER 7 CACHE MEMORY
7.2.3 Placement of cache data
The cache data of the VR4121, VR4122, VR4181, and VR4181A are directly mapped; on the other hand those of
the VR4131 are mapped in 2-way set associative format.
(1) Direct mapping
In this format, a cache is dealt with one block of memory space, and cache lines are placed linearly.
(2) 2-way set associative
In this format, the memory space of a cache is divided into two blocks (ways), and two cache lines are placed in
the same index (of different ways).
7.3 Cache Operations
As described earlier, caches provide fast temporary data storage, and they make the speedup of memory
accesses transparent to the user. In general, the CPU core accesses cache-resident instructions or data through the
following procedure:
1.
The CPU core, through the on-chip cache controller, attempts to access the next instruction or data in the
appropriate cache.
2.
The cache controller checks to see if this instruction or data is present in the cache.
• If the instruction/data is present, the CPU core retrieves it. This is called a cache hit.
• If the instruction/data is not present in the cache, the cache controller must retrieve it from memory. This is
called a cache miss.
3.
The CPU core retrieves the instruction/data from the cache and operation continues.
It is possible for the same data to be in two places simultaneously: main memory and cache. This data is kept
consistent through the use of a writeback methodology; that is, modified data is not written back to memory until the
cache line is to be replaced.
202
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
7.3.1 Cache data coherency
The CPU core of the VR4100 Series manages its data cache by using a writeback policy; that is, it stores write
data into the cache, instead of writing it directly to memory. Some time later this data is independently written into
memory. In the VR4100 Series implementation, a modified cache line is not written back to memory until the cache
line is to be replaced.
When the CPU core writes a cache line back to memory, it does not ordinarily retain a copy of the cache line, and
the state of the cache line is changed to invalid.
Remark
Contrary to the writeback, the write-through cache policy stores write data into the memory and cache
simultaneously.
(1) VR4121, VR4122, VR4181, and VR4181A
On a store miss writeback, data tag is checked and data is transferred to the write buffer. If an error is detected
in the data field, the writeback is not terminated; the erroneous data is still written out to main memory. If an
error is detected in the tag field, the writeback bus cycle is not issued.
The cache data may not be checked during CACHE operation.
(2) VR4131
On a store miss writeback, data tag is checked, a refill request is issued, and data is transferred to the write
buffer. The writeback is performed after the refill is completed.
7.3.2 Replacement of cache line
When a cache miss occurs or when the Fill operation (for instruction cache only) or the Fetch_and_Lock operation
(for VR4131 only) of CACHE instruction is executed, one of the cache lines is overwritten with data that is read from
main memory. Such an overwriting is called replacement of a cache line.
The on-chip caches of the VR4131 are 2-way set associative memory where two cache lines are placed to one
index. When a cache miss occurs, the way to be replaced is determined by the LRU (Least recently used) algorithm.
It is indicated in the TagLo register of the CP0.
The on-chip caches of the VR4131 also have the line lock function. If a line is set locked on its placement, it will
not be replaced even when a cache miss occurs. Cache line locking is set or cancelled with CACHE instruction, and
locking status is indicated in the TagLo register of the CP0.
User’s Manual U15509EJ2V0UM
203
CHAPTER 7 CACHE MEMORY
7.3.3 Accessing the caches
CACHE instruction is used to change cache line states or to write back cache data (for details, refer to CHAPTER
9 CPU INSTRUCTION SET DETAILS).
Some bits of the virtual address (VA) are used to index into the caches. The number of virtual address bits used
to index the instruction and data caches depends on the cache size. In addition, bit 13 of the virtual address
specifies the way to be accessed in the VR4131.
Table 7-1. Cache Size, Line Size, and Index
Processor
VR4121
VR4122
VR4131
VR4181
VR4181A
Cache
Cache size
Line size
Index
Instruction
16 KB
4 words
VA(13:4)
Data
8 KB
4 words
VA(12:4)
Instruction
32 KB
4 words or 8 words
VA(14:4)
Data
16 KB
4 words
VA(13:4)
Instruction
16 KB
4 words or 8 words
VA(12:4)
Data
16 KB
4 words or 8 words
VA(12:4)
Instruction
4 KB
4 words
VA(11:4)
Data
4 KB
4 words
VA(11:4)
Instruction
8 KB
8 words
VA(12:5)
Data
8 KB
8 words
VA(12:5)
Figure 7-5 shows index into caches and data output.
Figure 7-5. Cache Index and Data Output
Internal address bus
Cache memory
Tag line
Data line
Cache index
PTag D
L
V
W
Internal data bus
204
User’s Manual U15509EJ2V0UM
Data
64 (data cache)/
32 (instruction cache)
CHAPTER 7 CACHE MEMORY
7.4
Cache States
There are three cache line states that indicate validity and consistency with main memory of line data.
(1) Instruction cache
The instruction cache supports two cache states:
• Invalid: a cache line that does not contain valid information must be marked invalid, and cannot be used.
• Valid: a cache line that contains valid data.
(2) Data cache
The data cache supports three cache states:
• Invalid: a cache line that does not contain valid information must be marked invalid, and cannot be used.
• Valid clean: a cache line that contains data that has not changed since it was loaded from memory.
• Valid dirty: a cache line containing data that has changed since it was loaded from memory.
The state of a valid cache line may be modified when the processor executes some operations of CACHE
instruction. CACHE instruction and its operations are described in CHAPTER 9 CPU INSTRUCTION SET DETAILS.
User’s Manual U15509EJ2V0UM
205
CHAPTER 7 CACHE MEMORY
7.4.1 Cache state transition diagrams
The following section describes the cache state diagrams for the data and instruction cache lines. These state
diagrams do not cover the initial state of the system, since the initial state is system-dependent.
(1) Instruction cache state transition
The following diagram illustrates the instruction cache state transition sequence.
• Read (1) indicates a read operation from main memory to cache, inducing a cache state transition.
• Read (2) indicates a read operation from cache to the CPU core, which induces no cache state transition.
Figure 7-6. Instruction Cache State Diagram
CACHE instruction
Read (2)
Valid
Invalid
Read (1)
(2) Data cache state transition
The following diagram illustrates the data cache state transition sequence. A load or store operation may include
one or more of the atomic read and/or write operations shown in the state diagram below, which may cause cache
state transitions.
• Read (1) indicates a read operation from main memory to cache, inducing a cache state transition.
• Write (1) indicates a write operation from CPU core to cache, inducing a cache state transition.
• Read (2) indicates a read operation from cache to the CPU core, which induces no cache state transition.
• Write (2) indicates a write operation from CPU core to cache, which induces no cache state transition.
Figure 7-7. Data Cache State Diagram
CACHE instruction
Write (1)
Read (2)
Write (2)
Valid
Dirty
CACHE instruction
Invalid
Read (1)
Write (1)
CACHE instruction
Write-back
206
User’s Manual U15509EJ2V0UM
Read (2)
Valid
Clean
CHAPTER 7 CACHE MEMORY
7.5 Cache Access Flow
Figures 7-8 to 7-23 show operation flows for various cache accesses.
Figure 7-8. Flow on Instruction Fetch
(a) VR4121, VR4122, VR4181, VR4181A
Start
Tag check
(b) VR4131
Start
Hit
Miss
Tag check
Hit
Miss
Refill
(see Figure 7-22)
R bit check
Data fetch
Refill
(see Figure 7-22)
End
R bit update
Data fetch
End
User’s Manual U15509EJ2V0UM
207
CHAPTER 7 CACHE MEMORY
Figure 7-9. Flow on Load Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Tag check
Miss or
Invalid
V bit,
W bit
Start
Hit
Tag check
Hit
Miss or
invalid
V = 0 (Invalid) or
W = 0 (Clean)
R bit check
V = 1 (Valid) and
W = 1 (Dirty)
Writeback and refill
(see Figure 7-23)
Refill
(see Figure7-22)
V bit,
W bit
V = 0 (Invalid) or
W = 0 (Clean)
V = 1 (Valid) and
W = 1 (Dirty)
Data write
to register
Writeback and refill
(see Figure 7-23)
End
R bit update
Data write
to register
End
208
User’s Manual U15509EJ2V0UM
Refill
(see Figure 7-22)
CHAPTER 7 CACHE MEMORY
Figure 7-10. Flow on Store Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Tag check
Start
Hit
Tag check
Miss
V bit,
W bit
Hit
Miss
V = 0 (Invalid) or
W = 0 (Clean)
R bit check
V = 1 (Valid) and
W = 1 (Dirty)
Writeback and refill
(see Figure 7-23)
Refill
(see Figure 7-22)
V bit,
W bit
V = 0 (Invalid) or
W = 0 (Clean)
V = 1 (Valid) and
W = 1 (Dirty)
Data write
to data cache
Writeback and refill
(see Figure 7-23)
End
Refill
(see Figure7-22)
R bit update
Data write
to data cache
End
User’s Manual U15509EJ2V0UM
209
CHAPTER 7 CACHE MEMORY
Figure 7-11. Flow on Index_Invalidate Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Start
V bit clear
V bit clear
End
R bit update
End
210
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
Figure 7-12. Flow on Index_Writeback_Invalidate Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Start
= 0 (Invalid)
V bit
= 1 (Valid)
V bit
= 0 (Invalid)
= 1 (Valid)
= 0 (Clean)
= 0 (Clean)
W bit
W bit
= 1 (Dirty)
= 1 (Dirty)
Writeback
(see Figure 7-21)
Writeback
(see Figure 7-21)
V bit and W bit
clear
V bit and W bit
clear
End
R bit update
End
Figure 7-13. Flow on Index_Load_Tag Operations
Start
Tag read
to TagLo
W bit read
to TagLo
For data cache
End
User’s Manual U15509EJ2V0UM
211
CHAPTER 7 CACHE MEMORY
Figure 7-14. Flow on Index_Store_Tag Operations
Start
Tag write
from TagLo
End
Figure 7-15. Flow on Create_Dirty Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Start
Hit
Hit
Tag check
Tag check
Miss
Miss
V bit,
W bit
V = 1 (Valid) and
W = 1 (Dirty)
Writeback
(see Fitgure 7-21)
V = 0 (Invalid) or
W = 0 (Clean)
R bit check
V bit,
W bit
V = 1 (Valid) and
W = 1 (Dirty)
V bit and W bit set
Tag write
Writeback
(see Figure 7-21)
End
V bit and W bit set
Tag write
End
212
User’s Manual U15509EJ2V0UM
V = 0 (Invalid) or
W = 0 (Clean)
CHAPTER 7 CACHE MEMORY
Figure 7-16. Flow on Hit_Invalidate Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Tag check
Start
Miss or invalid
Hit
Tag check
Miss or invalid
Hit
V bit clear
V bit clear
End
R bit update
End
User’s Manual U15509EJ2V0UM
213
CHAPTER 7 CACHE MEMORY
Figure 7-17. Flow on Hit_Writeback_Invalidate Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Tag check
Start
Miss or invalid
Hit
Tag check
Hit
= 0 (Clean)
= 0 (Clean)
W bit
W bit
= 1 (Dirty)
= 1 (Dirty)
Writeback
(see Figure 7-21)
Writeback
(see Figure 7-21)
V bit clear
V bit clear
End
R bit update
End
214
Miss or invalid
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
Figure 7-18. Flow on Fill Operations
(a) VR4121, VR4122, VR4181, VR4181A
(b) VR4131
Start
Start
Refill
(see Figure 7-22)
Tag check
Hit
Miss or invalid
R bit check
End
Refill
(see Figure 7-22)
R bit update
End
User’s Manual U15509EJ2V0UM
215
CHAPTER 7 CACHE MEMORY
Figure 7-19. Flow on Hit_Writeback Operations
Start
Tag check
Miss or invalid
Hit
= 0 (Clean)
W bit
For data cache
= 1 (Dirty)
Writeback
(see Figure 7-21)
W bit clear
For data cache
End
216
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
Figure 7-20. Flow on Fetch_and_Lock Operations (VR4131 only)
Start
Tag check
Hit
Miss or invalid
R bit check
= 0 (Clean)
W bit
For data cache
= 1 (Dirty)
Writeback
(see Figure 7-21)
For data cache
W bit clear
For data cache
Refill
(see Figure 7-22)
L bit set
R bit update
End
User’s Manual U15509EJ2V0UM
217
CHAPTER 7 CACHE MEMORY
Figure 7-21. Writeback Flow
Writeback
to memory
No
EOD?
Yes
Figure 7-22. Refill Flow
Data write
to cache
EOD?
No
Yes
Erroneous bit
Error
No error
Cache line
invalidate
Bus Error exception
218
User’s Manual U15509EJ2V0UM
CHAPTER 7 CACHE MEMORY
Figure 7-23. Writeback & Refill Flow
(a) VR4121, VR4122, VR4181, VR4181A
Writeback
to memory
EOD?
(b) VR4131
Refill request
No
Data write
to cache
Yes
EOD?
Refill start
No
Yes
Writeback
to memory
Data write
to cache
EOD?
No
EOD?
Yes
Yes
Erroneous bit
No
Error
Erroneous bit
Error
No error
No error
Cache line
invalidate
Cache line
invalidate
Bus Error exception
Bus Error exception
User’s Manual U15509EJ2V0UM
219
CHAPTER 7 CACHE MEMORY
7.6 Manipulation of the Caches by an External Agent
The VR4100 Series does not provide any mechanisms for an external agent to examine and manipulate the state
and contents of the caches.
7.7 Initialization of the Caches
The caches of the VR4100 Series also need an initialization on reset or such cases. For procedures and program
examples of initialization, refer to VR Series Programming Guide Application Note.
220
User’s Manual U15509EJ2V0UM
CHAPTER 8 CPU CORE INTERRUPTS
Four types of interrupt are available on the CPU core of the VR4100 Series. These are:
• one non-maskable interrupt, NMI
• five ordinary interrupts
• two software interrupts
• one timer interrupt
For the interrupt request input to the CPU core from on-chip peripheral units, see Hardware User's Manual of
each product.
8.1 Types of Interrupt Request
8.1.1 Non-maskable interrupt (NMI)
The non-maskable interrupt is acknowledged by asserting the NMI signal (internal), forcing the processor to
branch to the Reset Exception vector. This signal is latched into an internal register at the rising edge of MasterOut
(internal), as shown in Figure 8-1.
NMI only takes effect when the processor pipeline is running.
This interrupt cannot be masked.
Figure 8-1 shows the internal service of the NMI signal. The NMI signal is latched into an internal register by the
rising edge of MasterOut. The latched signal is inverted to be transferred to inside the device as an NMI request.
Figure 8-1. Non-maskable Interrupt Signal
(Internal register)
NMI
NMI request
MasterOut
8.1.2 Ordinary interrupts
Ordinary interrupts are acknowledged by asserting the Int(4:0) signals (internal). However, Int3 occurs in the
VR4121 and VR4181A only, and Int4 in the VR4181A only.
This interrupt request can be masked with the IM (6:2), IE, EXL, and ERL fields of the Status register.
User’s Manual U15509EJ2V0UM
221
CHAPTER 8 CPU CORE INTERRUPTS
8.1.3 Software interrupts generated in CPU core
Software interrupts generated in the CPU core use bits 1 and 0 of the IP (interrupt pending) field in the Cause
register. These may be written by software, but there is no hardware mechanism to set or clear these bits.
After the processing of a software interrupt exception, corresponding bit of the IP field in the Cause register must
be cleared before enabling multiple interrupts or until the operation returns to normal routine.
This interrupt request is maskable through the IM (1:0), IE, EXL, and ERL fields of the Status register.
8.1.4 Timer interrupt
The timer interrupt uses bit 7 of the IP (interrupt pending) field of the Cause register. This bit is set automatically
whenever the value of the Count register equals the value of the Compare register, and an interrupt request is
acknowledged.
This interrupt is maskable through IM7, IE, EXL, and ERL fields of the Status register.
8.2 Acknowledging Interrupts
8.2.1 Detecting hardware interrupts
Figure 8-2 shows how the hardware interrupts are readable through the Cause register.
• The timer interrupt signal of the CPU core is directly readable as bit 15 (IP7) of the Cause register.
• The Int(4:0) signals are directly readable as bits 14 to 10 (IP(6:2)) of the Cause register.
IP(1:0) of the Cause register are used for software interrupt requests. There is no hardware mechanism for setting
or clearing the software interrupts.
Figure 8-2. Hardware Interrupt Signals
(Internal register)
Int0
Int1
Int2
Int3
0
IP2
10
IP3
11
IP4
12
IP5
13
IP6
14
IP7
15
1
2
See Figure 8-3
3
4
Int4
MasterOut
Remark
222
Timer interrupt
Cause register
bits 15 to 10
Int3 occurs in the VR4121 and VR4181A only, and Int4 in the VR4181A only.
User’s Manual U15509EJ2V0UM
CHAPTER 8 CPU CORE INTERRUPTS
8.2.2 Masking interrupt signals
Figure 8-3 shows the masking of the CPU core interrupt signals.
• Cause register bits 15 to 8 (IP(7:0)) are AND-ORed with Status register interrupt mask bits 15 to 8 (IM(7:0)) to
mask individual interrupts.
• Status register bit 0 is a global Interrupt Enable (IE) bit. It is ANDed with the output of the AND-OR logic to
produce the CPU core interrupt signal. The EXL bit in the Status register also enables these interrupts.
Figure 8-3. Masking of the Interrupt Request Signals
Status register
bit 0
IE
Status register
bits 15 to 8
IM0
IM1
IM2
IM3
IM4
IM5
IM6
IM7
8
9
10
11
12
13
14
15
8
CPU core interrupt
1
1
8
9
10
8
11
Ordinary interrupts
12
13
14
Timer interrupt
15
Cause register
bits 15 to 8
Software interrupts
of CPU core
Bit
IE
IP0
IP1
IP2
IP3
IP4
IP5
IP6
IP7
AND block
AND-OR
block
Function
Whole interrupts enable
Setting
1 : Enable
0 : Disable
IM(7:0)
Interrupt mask
Each bit
1 : Enable
0 : Disable
IP(7:0)
Interrupt request
Each bit
1 : Pending
0 : Not pending
User’s Manual U15509EJ2V0UM
223
CHAPTER 9 CPU INSTRUCTION SET DETAILS
This chapter provides a detailed description of the operation of each VR4100 Series instruction in both 32- and 64bit modes. The instructions are listed in alphabetical order.
9.1 Instruction Notation Conventions
In this chapter, all variable subfields in an instruction format (such as rs, rt, immediate, etc.) are shown in
lowercase names.
For the sake of clarity, we sometimes use an alias for a variable subfield in the formats of specific instructions.
For example, we use rs = base in the format for load and store instructions. Such an alias is always lower case,
since it refers to a variable subfield.
Figures with the actual bit encoding for all the mnemonics are located at the end of this chapter (9.4 CPU
Instruction Opcode Bit Encoding), and the bit encoding also accompanies each instruction.
In the instruction descriptions that follow, the Operation section describes the operation performed by each
instruction using a high-level language notation.
The VR4100 Series can operate as either a 32- or 64-bit
microprocessor and the operation for both modes is included with the instruction description.
Special symbols used in the notation are described in Table 9-1.
224
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
Table 9-1. CPU Instruction Operation Notations
Symbol
Meaning
<-
Assignment.
||
Bit string concatenation.
y
x
Replication of bit value x into a y-bit string. x is always a single-bit value.
xy…z
Selection of bits y through z of bit string x. Little-endian bit notation is always used. If y is less than z, this
expression is an empty (zero length) bit string.
+
2’s complement or floating-point addition.
-
2’s complement or floating-point subtraction.
*
2’s complement or floating-point multiplication.
div
2’s complement integer division.
mod
2’s complement modulo.
/
Floating-point division.
<
2’s complement less than comparison.
and
Bit-wise logical AND.
or
Bit-wise logical OR.
xor
Bit-wise logical XOR.
nor
Bit-wise logical NOR.
GPR [x]
General-Register x. The content of GPR [0] is always zero. Attempts to alter the content of GPR [0] have
no effect.
CPR [z, x]
Coprocessor unit z, general register x.
CCR [z, x]
Coprocessor unit z, control register x.
COC [z]
Coprocessor unit z condition signal.
BigEndianMem
Big-endian mode as configured at reset (0 → Little, 1 → Big). Specifies the endianness of the memory
interface (see Table 9-2), and the endianness of Kernel and Supervisor mode execution.
However, this value is always 0 in the VR4121, VR4122, VR4181, and VR4181A since they support the little
endian order only.
ReverseEndian
Signal to reverse the endianness of load and store instructions. This feature is available in User mode
only, and is effected by setting the RE bit of the Status register. Thus, ReverseEndian may be computed
as (SR25 and User mode).
However, this value is always 0 since the VR4100 Series does not support the reverse of the endianness.
BigEndianCPU
The endianness for load and store instructions (0 → Little, 1 → Big). In User mode, this endianness may
be reversed by setting SR25. Thus, BigEndianCPU may be computed as BigEndianMem XOR
ReverseEndian.
However, this value is always 0 in the VR4121, VR4122, VR4181, and VR4181A since they support the little
endian order only.
T+i:
Indicates the time steps between operations. Each of the statements within a time step are defined to be
executed in sequential order (as modified by conditional and loop constructs). Operations which are
marked T + i : are executed at instruction cycle i relative to the start of execution of the instruction. Thus,
an instruction which starts at time j executes operations marked T + i : at time i + j. The interpretation of
the order of execution between two instructions or two operations that execute at the same time should be
pessimistic; the order is not defined.
User’s Manual U15509EJ2V0UM
225
CHAPTER 9 CPU INSTRUCTION SET DETAILS
The following examples illustrate the application of some of the instruction notation conventions:
Example #1:
GPR [rt] ← immediate || 0
16
Sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string (with
the lower 16 bits set to zero) is assigned to General-purpose register rt.
Example #2:
(immediate15)
16
|| immediate15...0
Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated
with bits 15 through 0 of the immediate value to form a 32-bit sign extended value.
9.2 Notes on Using CPU Instructions
9.2.1 Load and Store instructions
In the VR4100 Series implementation, the instruction immediately following a Load may use the loaded contents of
the register. In such cases, the hardware interlocks, requiring additional real cycles, so scheduling load delay slots is
still desirable, although not required for functional code.
In the Load and Store descriptions, the functions listed in Table 9-2 are used to summarize the handling of virtual
addresses and physical memory.
Table 9-2. Load and Store Common Functions
Function
Meaning
Address Translation
Uses the TLB to find the physical address given the virtual address. The function fails and an
exception is taken if the required translation is not present in the TLB.
Load Memory
Uses the cache and main memory to find the contents of the word containing the specified
physical address. The low-order three bits of the address and the Access Type field indicate which
of each of the four bytes within the data word need to be returned. If the cache is enabled for this
access, the entire word is returned and loaded into the cache. If the specified data is short of word
length, the data position to which the contents of the specified data is stored is determined
considering the endian mode and reverse endian mode.
Store Memory
Uses the cache, write buffer, and main memory to store the word or part of word specified as data
in the word containing the specified physical address. The low-order three bits of the address and
the Access Type field indicate which of each of the four bytes within the data word should be
stored. If the specified data is short of word length, the data position to which the contents of the
specified data is stored is determined considering the endian mode and reverse endian mode.
226
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
As shown in Table 9-3, the Access Type field indicates the size of the data item to be loaded or stored.
Regardless of access type or byte-numbering order (endianness), the address specifies the byte that has the
smallest byte address in the addressed field. For a big-endian machine, this is the leftmost byte and contains the
sign for a 2's complement number; for a little-endian machine, this is the rightmost byte.
Table 9-3. Access Type Specifications for Loads/Stores
Access type mnemonic
Meaning
Value in
internal
command
DOUBLEWORD
7
8 bytes (64 bits)
SEPTIBYTE
6
7 bytes (56 bits)
SEXTIBYTE
5
6 bytes (48 bits)
QUINTIBYTE
4
5 bytes (40 bits)
WORD
3
4 bytes (32 bits)
TRIPLEBYTE
2
3 bytes (24 bits)
HALFWORD
1
2 bytes (16 bits)
BYTE
0
1 byte (8 bits)
The bytes within the addressed doubleword that are used can be determined directly from the access type and the
three low-order bits of the address.
9.2.2 Jump and Branch instructions
All Jump and Branch instructions have an architectural delay of exactly one instruction. That is, the instruction
immediately following a Jump or Branch (that is, occupying the delay slot) is always executed while the target
instruction is being fetched from storage. A delay slot may not itself be occupied by a Jump or Branch instruction;
however, this error is not detected and the results of such an operation are undefined.
If an exception or interrupt prevents the completion of a legal instruction during a delay slot, the hardware sets the
EPC register to point at the Jump or Branch instruction that precedes it. When the code is restarted, both the Jump
or Branch instructions and the instruction in the delay slot are reexecuted.
Because Jump and Branch instructions may be restarted after exceptions or interrupts, they must be restartable.
Therefore, when a Jump or Branch instruction stores a return link value, register r31 (the register in which the link is
stored) may not be used as a source register.
Since instructions must be word-aligned, a Jump Register or Jump and Link Register instruction must use a
register which contains an address whose two low-order bits (low-order one bit in the 16-bit mode) are zero. If these
low-order bits are not zero, an address exception will occur when the jump target instruction is subsequently fetched.
User’s Manual U15509EJ2V0UM
227
CHAPTER 9 CPU INSTRUCTION SET DETAILS
9.2.3 System control coprocessor (CP0) instructions
There are some special limitations imposed on operations involving CP0 that is incorporated within the CPU.
Although Load and Store instructions to transfer data to/from coprocessors and to move control to/from coprocessor
instructions are generally permitted by the MIPS architecture, CP0 is given a somewhat protected status since it has
responsibility for exception handling and memory management.
Therefore, the move to/from coprocessor
instructions are the only valid mechanism for writing to and reading from the CP0 registers.
Several CP0 instructions are defined to directly read, write, and probe TLB entries and to modify the operating
modes in preparation for returning to User mode or interrupt-enabled states.
9.3 CPU Instructions
This section describes the functions of CPU instructions in detail for both 32-bit address mode and 64-bit address
mode.
The exception that may occur by executing each instruction is shown in the last of each instruction’s description.
For details of exceptions and their processes, see CHAPTER 6 EXCEPTION PROCESSING.
228
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ADD
ADD
Add
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
ADD
100000
Format:
ADD rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result
is placed into general register rd. In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
An overflow exception occurs if the carries out of bits 30 and 31 differ (2’s complement overflow).
The
destination register rd is not modified when an integer overflow exception occurs.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T:
GPR [rd] ← GPR [rs] + GPR [rt]
64
T:
temp ← GPR [rs] + GPR [rt]
GPR [rd] ← (temp31)
32
|| temp31…0
Exceptions:
Integer overflow exception
User’s Manual U15509EJ2V0UM
229
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ADDI
ADDI
Add Immediate
31
26 25
ADDI
001000
21 20
rs
16 15
rt
0
immediate
Format:
ADDI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The
result is placed into general register rt. In 64-bit mode, the operand must be valid sign-extended, 32-bit values.
An overflow exception occurs if carries out of bits 30 and 31 differ (2’s complement overflow). The destination
register rt is not modified when an integer overflow exception occurs.
Restrictions:
If the value of general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
GPR [rt] ← GPR [rs] + (immediate15)
64
T:
temp ← GPR [rs] + (immediate15)
GPR [rt] ← (temp31)
32
48
16
|| immediate15…0
|| immediate15…0
|| temp31…0
Exceptions:
Integer overflow exception
230
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ADDIU
ADDIU
Add Immediate Unsigned
31
26 25
ADDI
001001
21 20
rs
16 15
rt
0
immediate
Format:
ADDIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The
result is placed into general register rt. No integer overflow exception occurs under any circumstances. In 64-bit
mode, the operand must be valid sign-extended, 32-bit values.
The only difference between this instruction and the ADDI instruction is that ADDIU never causes an integer
overflow exception.
Restrictions:
If the value of general register rs is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
GPR [rt] ← GPR [rs] + (immediate15)
64
T:
temp ← GPR [rs] + (immediate15)
GPR [rt] ← (temp31)
32
48
16
|| immediate15…0
|| immediate15…0
|| temp31…0
Exceptions:
None
User’s Manual U15509EJ2V0UM
231
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ADDU
ADDU
Add Unsigned
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
ADDU
100001
Format:
ADDU rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result
is placed into general register rd. No integer overflow exception occurs under any circumstances. In 64-bit
mode, the operands must be valid sign-extended, 32-bit values.
The only difference between this instruction and the ADD instruction is that ADDU never causes an integer
overflow exception.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T:
64
T:
GPR [rt] ← GPR [rs] + GPR [rt]
temp ← GPR [rs] + GPR [rt]
GPR [rd] ← (temp31)
32
|| temp31…0
Exceptions:
None
232
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
AND
AND
AND
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
AND
100100
Format:
AND rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in a bit-wise logical AND
operation. The result is placed into general register rd.
Operation:
32
T:
GPR [rd] ← GPR [rs] and GPR [rt]
64
T:
GPR [rd] ← GPR [rs] and GPR [rt]
Exceptions:
None
User’s Manual U15509EJ2V0UM
233
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ANDI
ANDI
AND Immediate
31
26 25
ANDI
001100
21 20
rs
16 15
rt
0
immediate
Format:
ANDI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical
AND operation. The result is placed into general register rt.
Operation:
32
T:
GPR [rt] ← 0
16
|| (immediate and GPR [rs]15…0)
64
T:
GPR [rt] ← 0
48
|| (immediate and GPR [rs]15…0)
Exceptions:
None
234
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BC0F
BC0F
Branch on Coprocessor 0 False
31
26 25
COPz
Note
0100XX
21 20
BC
01000
16 15
0
BCF
00000
offset
Format:
BC0F offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the Coprocessor 0’s condition signal (CpCond), as sampled
during the previous instruction, is false, then the program branches to the target address with a delay of one
instruction.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction
between this instruction and a coprocessor instruction that changes the condition signal.
Operation:
32
T–1: condition ← not SR18
T:
target ← (offset15)
14
|| offset || 0
2
T+1: if condition then
PC ← PC + target
endif
64
T–1: condition ← not SR18
T:
target ← (offset15)
46
|| offset || 0
2
T+1: if condition then
PC ← PC + target
endif
Exceptions:
Coprocessor unusable exception
Note See the opcode table below, or 9.4 CPU Instruction Opcode Bit Encoding.
Opcode Table:
BC0F
31
30
29
28
27
26
25
24
23
22
21
20
19
18
17
16
0
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
Opcode
Coprocessor
number
BC sub-opcode
User’s Manual U15509EJ2V0UM
0
Branch condition
235
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BC0FL
BFC0FL
Branch on Coprocessor 0 False Likely
31
26 25
COPz
Note
0100XX
21 20
BC
01000
16 15
0
BCFL
00010
offset
Format:
BC0FL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the Coprocessor 0’s condition signal (CpCond), as sampled
during the previous instruction, is false, the target address is branched to with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction
between this instruction and a coprocessor instruction that changes the condition signal.
Operation:
32
T–1: condition ← not SR18
T:
target ← (offset15)
14
|| offset || 0
2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T–1: condition ← not SR18
T:
target ← (offset15)
46
|| offset || 0
2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
Coprocessor unusable exception
Note See the opcode table below, or 9.4 CPU Instruction Opcode Bit Encoding.
Opcode Table:
BC0FL
31
30
29
28
27
26
25
24
23
22
21
20
19
18
17
16
0
1
0
0
0
0
0
1
0
0
0
0
0
0
1
0
Opcode
236
Coprocessor
number
BC sub-opcode
User’s Manual U15509EJ2V0UM
Branch condition
0
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BC0T
BC0T
Branch on Coprocessor 0 True
31
26 25
COPz
Note
0100XX
21 20
BC
01000
16 15
0
BCT
00001
offset
Format:
BC0T offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the Coprocessor 0’s condition signal (CpCond), as sampled
during the previous instruction, is true, then the program branches to the target address, with a delay of one
instruction.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction
between this instruction and a coprocessor instruction that changes the condition signal.
Operation:
32
T–1: condition ← SR18
T:
target ← (offset15)
14
|| offset || 0
2
T+1: if condition then
PC ← PC + target
endif
64
T–1: condition ← SR18
T:
target ← (offset15)
46
|| offset || 0
2
T+1: if condition then
PC ← PC + target
endif
Exceptions:
Coprocessor unusable exception
Note See the opcode table below, or 9.4 CPU Instruction Opcode Bit Encoding.
Opcode Table:
BC0T
31
30
29
28
27
26
25
24
23
22
21
20
19
18
17
16
0
1
0
0
0
0
0
1
0
0
0
0
0
0
0
1
Opcode
Coprocessor
number
BC sub-opcode
User’s Manual U15509EJ2V0UM
0
Branch condition
237
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BC0TL
BC0TL
Branch on Coprocessor 0 True Likely
31
26 25
COPz
Note
0100XX
21 20
BC
01000
16 15
0
BCTL
00011
offset
Format:
BC0TL offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the Coprocessor 0’s condition signal (CpCond), as sampled
during the previous instruction, is true, the target address is branched to with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Because the condition signal is sampled during the previous instruction, there must be at least one instruction
between this instruction and a coprocessor instruction that changes the condition signal.
Operation:
32
T–1: condition ← SR18
T:
target ← (offset15)
14
|| offset || 0
2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T–1: condition ← SR18
T:
target ← (offset15)
46
|| offset || 0
2
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
Coprocessor unusable exception
Note See the opcode table below, or 9.4 CPU Instruction Opcode Bit Encoding.
Opcode Table:
BC0TL
31
30
29
28
27
26
25
24
23
22
21
20
19
18
17
16
0
1
0
0
0
0
0
1
0
0
0
0
0
0
1
1
Opcode
238
Coprocessor
number
BC sub-opcode
User’s Manual U15509EJ2V0UM
Branch condition
0
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BEQ
BEQ
Branch on Equal
31
26 25
BEQ
000100
21 20
rs
16 15
rt
0
offset
Format:
BEQ rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general
register rt are compared. If the two registers are equal, then the program branches to the target address, with a
delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs] = GPR [rt])
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs] = GPR [rt])
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
239
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BEQL
BEQL
Branch on Equal Likely
31
26 25
BEQL
010100
21 20
16 15
rs
rt
0
offset
Format:
BEQL rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general
register rt are compared. If the two registers are equal, then the program branches to the target address, with a
delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs] = GPR [rt])
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs] = GPR [rt])
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
240
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGEZ
Branch on Greater than or Equal to Zero
31
26 25
REGIMM
000001
21 20
16 15
BGEZ
00001
rs
BGEZ
0
offset
Format:
BGEZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit cleared,
then the program branches to the target address, with a delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 0)
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 0)
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
241
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGEZAL
31
Branch on Greater than or Equal to Zero And Link
26 25
REGIMM
000001
21 20
16 15
BGEZAL
10001
rs
BGEZAL
0
offset
Format:
BGEZAL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay
slot is placed in the link register, r31. If the contents of general register rs have the sign bit cleared, then the
program branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to
execute such an instruction is not trapped, however.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 0)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 0)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
242
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGEZALL
31
Branch on Greater than or Equal to Zero And Link Likely
26 25
REGIMM
000001
21 20
16 15
BGEZALL
10011
rs
BGEZALL
0
offset
Format:
BGEZALL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay
slot is placed in the link register, r31. If the contents of general register rs have the sign bit cleared, then the
program branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to
execute such an instruction is not trapped, however. If the conditional branch is not taken, the instruction in the
branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 0)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 0)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
243
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGEZL
Branch on Greater than or Equal to Zero Likely
31
26 25
REGIMM
000001
21 20
16 15
BGEZL
00011
rs
BGEZL
0
offset
Format:
BGEZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit cleared,
then the program branches to the target address, with a delay of one instruction. If the conditional branch is not
taken, the instruction in the branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 0)
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 0)
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
244
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGTZ
BGTZ
Branch on Greater than Zero
31
26 25
BGTZ
000111
21 20
16 15
0
0
00000
rs
offset
Format:
BGTZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the
contents of general register rs have the sign bit cleared and are not equal to zero, then the program branches to
the target address, with a delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
32
condition ← (GPR [rs]31 = 0) or (GPR [rs] ≠ 0 )
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
64
condition ← (GPR [rs]63 = 0) or (GPR [rs] ≠ 0 )
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
245
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BGTZL
Branch on Greater than Zero Likely
31
26 25
BGTZL
010111
21 20
16 15
0
0
00000
rs
BGTZL
offset
Format:
BGTZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the
contents of general register rs have the sign bit cleared and are not equal to zero, then the program branches to
the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the
branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
32
condition ← (GPR [rs]31 = 0) or (GPR [rs] ≠ 0 )
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
64
condition ← (GPR [rs]63 = 0) or (GPR [rs] ≠ 0 )
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
246
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLEZ
Branch on Less than or Equal to Zero
31
26 25
BLEZ
000110
21 20
16 15
0
0
00000
rs
BLEZ
offset
Format:
BLEZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs are compared to zero. If the
contents of general register rs have the sign bit set or are equal to zero, then the program branches to the target
address, with a delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
32
condition ← (GPR [rs]31 = 1) or (GPR [rs] = 0 )
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
64
condition ← (GPR [rs]63 = 1) or (GPR [rs] = 0 )
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
247
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLEZL
Branch on Less than or Equal to Zero Likely
31
26 25
BLEZL
010110
21 20
16 15
0
0
00000
rs
BLEZL
offset
Format:
BLEZL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs is compared to zero. If the
contents of general register rs have the sign bit set or are equal to zero, then the program branches to the target
address, with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
32
condition ← (GPR [rs]31 = 1) or (GPR [rs] = 0 )
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
64
condition ← (GPR [rs]63 = 1) or (GPR [rs] = 0 )
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
248
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLTZ
BLTZ
Branch on Less than Zero
31
26 25
REGIMM
000001
21 20
16 15
BLTZ
00000
rs
0
offset
Format:
BLTZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit set, then
the program branches to the target address, with a delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 1)
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 1)
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
249
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLTZAL
31
Branch on Less than Zero and Link
26 25
REGIMM
000001
21 20
16 15
BLTZAL
10000
rs
BLTZAL
0
offset
Format:
BLTZAL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay
slot is placed in the link register, r31. If the contents of general register rs have the sign bit set, then the program
branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to
execute such an instruction is not trapped, however.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 1)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 1)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
250
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLTZALL
31
Branch on Less than Zero and Link Likely
26 25
REGIMM
000001
21 20
16 15
BLTZALL
10010
rs
BLTZALL
0
offset
Format:
BLTZALL rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay
slot is placed in the link register, r31. If the contents of general register rs have the sign bit set, then the program
branches to the target address, with a delay of one instruction.
General register rs may not be general register r31, because such an instruction is not restartable. An attempt to
execute such an instruction is not trapped, however. If the conditional branch is not taken, the instruction in the
branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 1)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 1)
GPR [31] ← PC + 8
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
251
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BLTZL
Branch on Less than Zero Likely
31
26 25
REGIMM
000001
21 20
16 15
BLTZL
00010
rs
BLTZL
0
offset
Format:
BLTZ rs, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. If the contents of general register rs have the sign bit set, then
the program branches to the target address, with a delay of one instruction. If the conditional branch is not
taken, the instruction in the branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs]31 = 1)
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs]63 = 1)
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
252
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BNE
BNE
Branch on Not Equal
31
26 25
BNE
000101
21 20
rs
16 15
rt
0
offset
Format:
BNE rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general
register rt are compared. If the two registers are not equal, then the program branches to the target address,
with a delay of one instruction.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs] ≠ GPR [rt])
T+1: if condition then
PC ← PC + target
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs] ≠ GPR [rt])
T+1: if condition then
PC ← PC + target
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
253
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BNEL
BNEL
Branch on Not Equal Likely
31
26 25
BNEL
010101
21 20
16 15
rs
rt
0
offset
Format:
BNEL rs, rt, offset
Description:
A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16bit offset, shifted left two bits and sign-extended. The contents of general register rs and the contents of general
register rt are compared. If the two registers are not equal, then the program branches to the target address,
with a delay of one instruction.
If the conditional branch is not taken, the instruction in the branch delay slot is nullified.
Operation:
32
T:
target ← (offset15)
14
|| offset || 0
2
condition ← (GPR [rs] ≠ GPR [rt])
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
64
T:
target ← (offset15)
46
|| offset || 0
2
condition ← (GPR [rs] ≠ GPR [rt])
T+1: if condition then
PC ← PC + target
else
NullifyCurrentInstruction
endif
Exceptions:
None
254
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
BREAK
BREAK
Breakpoint
31
26 25
SPECIAL
000000
6 5
code
0
BREAK
001101
Format:
BREAK
Description:
A breakpoint trap occurs, immediately and unconditionally transferring control to the exception handler.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
BreakpointException
Exceptions:
Breakpoint exception
User’s Manual U15509EJ2V0UM
255
CHAPTER 9 CPU INSTRUCTION SET DETAILS
CACHE
31
CACHE
Cache Operation
26 25
CACHE
101111
21 20
base
16 15
op
0
offset
Format:
CACHE op, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The 5-bit sub-opcode op specifies a cache operation for that address.
If CP0 is not usable (User or Supervisor mode) and the CP0 enable bit in the Status register is cleared, a
coprocessor unusable exception is taken. The operation of this instruction on any operation/cache combination
not listed below, or on a secondary cache, is undefined. The operation of this instruction on uncached addresses
is also undefined.
The Index operation uses part of the virtual address to specify a cache block. For a cache of 2
with 2
LINEBITS
CACHEBITS
bytes
bytes per tag, vAddrCACHEBITS...LINEBITS in the VR4121, VR4122, VR4181, and VR4181A or
vAddrCACHEBITS−2...LINEBITS in the VR4131 specifies the block. In the VR4131, bit 31 of the virtual address indicates
the way of cache to be used.
The Hit operation translates the virtual address to a physical address using the TLB, accesses the specified
cache as normal data references, and performs the specified operation if the cache block contains valid data with
the specified physical address (a hit). If the cache block is invalid or contains a different address (a miss), no
operation is performed.
256
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
CACHE
CACHE
Cache
(Continued)
Write back from a primary cache goes to memory. The address to be written is specified by the cache tag and
not the translated physical address.
TLB Refill and TLB Invalid exceptions can occur on any operation. For Index operations (where the physical
address is used to index the cache but need not match the cache tag) to unmapped addresses may be used to
avoid TLB exceptions. This operation never causes a TLB Modified exception.
Bits 17 and 16 (op1..0) of the instruction code specify the cache as follows:
op1..0
Name
Cache
0
I
Instruction cache
1
D
Data cache
2

Reserved
3

Reserved
User’s Manual U15509EJ2V0UM
257
CHAPTER 9 CPU INSTRUCTION SET DETAILS
CACHE
CACHE
Cache
(Continued)
Bits 20 to 18 (op4..2) of the instruction specify the operation as follows:
op4..2
Cache
Name
0
I
Index_Invalidate
Set the cache state of the cache block to Invalid. This operation can also be used
to cancel lock of a cache block in the VR4131.
0
D
Index_Write_
Back_Invalidate
Examine the cache state and W bit of the primary data cache block at the index
specified by the virtual address. If the state is not Invalid and the W bit is set, then
write back the block to memory. The address to write is taken from the primary
cache tag. Set cache state of primary cache block to Invalid. This operation can
also be used to cancel lock of a cache block in the VR4131.
1
I, D
Index_Load_Tag
Read the tag for the cache block at the specified index and place it into the TagLo
register of the CP0.
2
I, D
Index_Store_
Tag
Write the tag for the cache block at the specified index from the TagLo register of
the CP0.
3
D
Create_Dirty_
Exclusive
This operation is used to avoid loading data needlessly from memory when writing
new contents into an entire cache block. If the cache block does not contain the
specified address, and the block is dirty, write it back to the memory. In all cases,
set the cache state to Dirty.
4
I, D
Hit_Invalidate
If the cache block contains the specified address, mark the cache block Invalid.
This operation can also be used to cancel lock of a cache block in the VR4131.
5
D
Hit_Write_Back
Invalidate
If the cache block contains the specified address, write back the data if it is dirty,
and mark the cache block Invalid.
5
I
Fill
Fill the primary instruction cache block from memory. This operation can also be
used to cancel lock of a cache block in the VR4131.
6
D
Hit_Write_Back
If the cache block contains the specified address, and the W bit is set, write back
the data to memory and clear the W bit.
6
I
Hit_Write_Back
If the cache block contains the specified address, write back the data
unconditionally.
7
I, D
Fetch_and_Lock
For the VR4131 only. If the cache block contains the specified address, fill the
cache block from memory. Locks the cache line regardless of refilling the cache
block.
258
Operation
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
CACHE
Cache
(Continued)
CACHE
Operation:
32, 64 T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
CacheOp (op, vAddr, pAddr)
Exceptions:
Coprocessor unusable exception
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
259
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DADD
DADD
Doubleword Add
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
DADD
101100
Format:
DADD rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result
is placed into general register rd.
An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2’s complement overflow). The
destination register rd is not modified when an integer overflow exception occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] + GPR [rt]
Exceptions:
Integer overflow exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
260
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DADDI
Doubleword Add Immediate
31
26 25
DADDI
011000
21 20
rs
16 15
rt
DADDI
0
immediate
Format:
DADDI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The
result is placed into general register rt.
An integer overflow exception occurs if carries out of bits 62 and 63 differ (2’s complement overflow). The
destination register rt is not modified when an integer overflow exception occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
GPR [rt] ← GPR [rs] + (immediate15)
48
|| immediate15…0
Exceptions:
Integer overflow exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
261
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DADDIU
31
Doubleword Add Immediate Unsigned
26 25
DADDIU
011001
21 20
rs
16 15
rt
DADDIU
0
immediate
Format:
DADDIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and added to the contents of general register rs to form the result. The
result is placed into general register rt.
The only difference between this instruction and the DADDI instruction is that DADDIU never causes an overflow
exception.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
64
T:
GPR [rt] ← GPR [rs] + (immediate15)
48
|| immediate15…0
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
262
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DADDU
DADDU
Doubleword Add Unsigned
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
DADDU
101101
Format:
DADDU rd, rs, rt
Description:
The contents of general register rs and the contents of general register rt are added to form the result. The result
is placed into general register rd.
The only difference between this instruction and the DADD instruction is that DADDU never causes an overflow
exception.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
64
T:
GPR [rd] ← GPR [rs] + GPR [rt]
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
263
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DDIV
DDIV
Doubleword Divide
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
6 5
0
0000000000
0
DDIV
011110
Format:
DDIV rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt, treating both operands as
2’s complement values. No overflow exception occurs under any circumstances, and the result of this operation
is undefined when the divisor is zero.
This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the
doubleword remainder of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by two or more instructions.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
HI
T:
← undefined
LO ← GPR [rs] div GPR [rt]
HI
← GPR [rs] mod GPR [rt]
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
264
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DDIVU
DDIVU
Doubleword Divide Unsigned
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
0
0000000000
0
DDIVU
011111
Format:
DDIVU rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt, treating both operands as
unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation
is undefined when the divisor is zero.
This instruction may be followed by additional instructions to check for a zero divisor.
When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the
doubleword remainder of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by two or more instructions.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
64
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
HI
T:
← undefined
LO ← (0 || GPR [rs]) div (0 || GPR [rt])
HI
← (0 || GPR [rs]) mod (0 || GPR [rt])
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
265
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DIV
DIV
Divide
31
26 25
21 20
SPECIAL
000000
16 15
rs
rt
6 5
0
0000000000
0
DIV
011010
Format:
DIV rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt, treating both operands as
2’s complement values. No overflow exception occurs under any circumstances, and the result of this operation
is undefined when the divisor is zero.
In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the
doubleword remainder of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by two or more instructions.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
HI
T:
LO ← GPR [rs] div GPR [rt]
HI
64
← undefined
← GPR [rs] mod GPR [rt]
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
q
← GPR [rs] 31…0 div GPR [rt] 31…0
r
← GPR [rs] 31…0 mod GPR [rt] 31…0
LO ← (q 31)
HI
← (r 31)
32
32
|| q 31…0
|| r 31…0
Exceptions:
None
266
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DIVU
DIVU
Divide Unsigned
31
26 25
21 20
SPECIAL
000000
rs
16 15
rt
6 5
0
0000000000
0
DIVU
011011
Format:
DIVU rs, rt
Description:
The contents of general register rs are divided by the contents of general register rt, treating both operands as
unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation
is undefined when the divisor is zero.
In 64-bit mode, the operands must be valid sign-extended, 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor.
When the operation completes, the doubleword quotient of the result is loaded into special register LO, and the
doubleword remainder of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by two or more instructions.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
HI
T:
LO ← (0 || GPR [rs]) div (0 || GPR [rt])
HI
64
← undefined
← (0 || GPR [rs]) mod (0 || GPR [rt])
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
q
← (0 || GPR [rs] 31…0 ) div (0 || GPR [rt] 31…0)
r
← (0 || GPR [rs] 31…0 ) mod (0 || GPR [rt] 31…0)
LO ← (q 31)
HI
← (r 31)
32
32
|| q 31…0
|| r 31…0
Exceptions:
None
User’s Manual U15509EJ2V0UM
267
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMACC
31
DMACC
Doubleword Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
26 25
SPECIAL
000000
21 20
rs
16 15
11 10 9
rd
rt
8 7 6
sat hi 0 0 us
5
0
DMACC
101001
Format:
DMACC
rd, rs, rt
DMACCU
rd, rs, rt
DMACCHI
rd, rs, rt
DMACCHIU rd, rs, rt
DMACCS
rd, rs, rt
DMACCUS
rd, rs, rt
DMACCHIS rd, rs, rt
DMACCHIUSrd, rs, rt
Description:
The mnemonics of the DMACC instruction differ as shown in the table below by the setting of the sat, hi, or us
bits.
Mnemonic
sat
hi
us
DMACC
0
0
0
DMACCU
0
0
1
DMACCHI
0
1
0
DMACCHIU
0
1
1
DMACCS
1
0
0
DMACCUS
1
0
1
DMACCHIS
1
1
0
DMACCHIUS
1
1
1
The number of valid bits in the operands differs depending on whether saturation processing is executed (sat =
1) or not (sat = 0).
• When saturation processing is executed (sat = 1): DMACCS, DMACCUS, DMACCHIS, and DMACCHIUS
instructions
The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents
of both operands are handled as 16-bit unsigned data. If us = 0, the contents are handled as 16-bit signed
integers. Sign/zero extension by software is required for bits 16 to 31 in the operands.
268
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMACC
Doubleword Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
DMACC
The product of this multiply operation is added to the value in special register LO. If us = 1, this add operation
handles the values being added as 32-bit unsigned data. If us = 0, the values are handled as 32-bit signed
integers. Sign/zero extension by software is required for bits 32 to 63 in special register LO.
After saturation processing of 32 bits has been performed (refer to the table below), the sum from this add
operation is loaded to special register LO. When hi = 1, data that is the same as the data loaded to special
register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded to
special register LO is also loaded to general register rd. Overflow exceptions do not occur.
• When saturation processing is not executed (sat = 0): DMACC, DMACCU, DMACCHI, and DMACCHIU
instructions
The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents
of both operands are handled as 32-bit unsigned data. If us = 0, the contents are handled as 32-bit signed
integers. Sign/zero extension by software is required for bits 32 to 63 in the operands.
The product of this multiply operation is added to the value in special register LO. If us = 1, this add operation
handles the values being added as 64-bit unsigned data. If us = 0, the values are handled as 64-bit signed
integers.
The sum from this add operation is loaded to special register LO. When hi = 1, data that is the same as the
data loaded to special register HI is also loaded to general register rd. When hi = 0, data that is the same as
the data loaded to special register LO is also loaded to general register rd. Overflow exceptions do not occur.
These operations are defined for 64-bit mode and 32-bit Kernel mode. A reserved instruction exception occurs if
one of these instructions is executed during 32-bit User/Supervisor mode.
User’s Manual U15509EJ2V0UM
269
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMACC
DMACC
Doubleword Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
The correspondence of us and sat settings and values stored during saturation processing is shown below, along
with the hazard cycles required between execution of the instruction for manipulating the HI and LO registers and
execution of the DMACC instruction.
Values Stored During Saturation Processing
Overflow
Hazard Cycle Counts
us
sat
Underflow
Instruction
0
0
Store calculation result as is
Store calculation result as is
1
0
Store calculation result as is
Store calculation result as is
0
1
0x0000 0000 7FFF FFFF
0xFFFF FFFF 8000 0000
1
1
0xFFFF FFFF FFFF FFFF
None
Cycle Count
MULT, MULTU
DMULT, DMULTU
DIV, DIVU
DDIV, DDIVU
MFHI, MFLO
MTHI, MTLO
MACC
DMACC
Note1
3
36
68
Note2
0
0
0
Notes 1. VR4121, VR4122 … 1
VR4131 … 0
VR4181A … 1
2. VR4121, VR4122 … 2
VR4131 … 0
VR4181A … 2
Operation:
32, 64, sat = 0, hi = 0, us = 0 (DMACC instruction)
T:
temp1 ← ((GPR[rs]31)
32
|| GPR [rs]) * ((GPR[rt]31)
32
|| GPR [rt])
32
|| GPR [rt])
temp2 ← temp1 + LO
LO ← temp2
GPR[rd] ← LO
32, 64, sat = 0, hi = 0, us = 1 (DMACCU instruction)
T:
temp1 ← (0
32
|| GPR [rs]) * (0
32
|| GPR [rt])
temp2 ← temp1 + LO
LO ← temp2
GPR[rd] ← LO
32, 64, sat = 0, hi = 1, us = 0 (DMACCHI instruction)
T:
temp1 ← ((GPR[rs]31)
32
|| GPR [rs]) * ((GPR[rt]31)
temp2 ← temp1 + LO
LO ← temp2
GPR[rd] ← HI
270
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMACC
Doubleword Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
DMACC
32, 64, sat = 0, hi = 1, us = 1 (DMACCHIU instruction)
T:
temp1 ← (0
32
|| GPR [rs]) * (0
32
|| GPR [rt])
temp2 ← temp1 + LO
LO ← temp2
GPR[rd] ← HI
32, 64, sat = 1, hi = 0, us = 0 (DMACCS instruction)
T:
temp1 ← ((GPR[rs]31)
32
|| GPR [rs]) * ((GPR[rt]31)
32
|| GPR [rt])
32
|| GPR [rt])
temp2 ← saturation(temp1 + LO)
LO ← temp2
GPR[rd] ← LO
32, 64, sat = 1, hi = 0, us = 1 (DMACCUS instruction)
T:
temp1 ← (0
32
|| GPR [rs]) * (0
32
|| GPR [rt])
temp2 ← saturation(temp1 + LO)
LO ← temp2
GPR[rd] ← LO
32, 64, sat = 1, hi = 1, us = 0 (DMACCHIS instruction)
T:
temp1 ← ((GPR[rs]31)
32
|| GPR [rs]) * ((GPR[rt]31)
temp2 ← saturation(temp1 + LO)
LO ← temp2
GPR[rd] ← HI
32, 64, sat = 1, hi = 1, us = 1 (DMACCHIUS instruction)
T:
temp1 ← (0
32
|| GPR [rs]) * (0
32
|| GPR [rt])
temp2 ← saturation(temp1 + LO)
LO ← temp2
GPR[rd] ← HI
Exceptions:
Reserved instruction exception (in 32-bit User/Supervisor mode)
User’s Manual U15509EJ2V0UM
271
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMADD16
31
DMADD16
Doubleword Multiply and Add 16-bit Integer
(for VR4181 only)
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
0
0000000000
0
DMADD16
101001
Format:
DMADD16 rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as 16-bit 2’s complement
values. Bits 62 to 15 of the operand must be sign-extended values.
This multiplied result and the contents of special register LO are added to form the result as a signed integer.
When the operation completes, the doubleword result is loaded into special register LO.
No integer overflow exception occurs under any circumstances.
This operation is defined for the VR4181 operating in 64-bit mode or in 32-bit Kernel mode. Execution of this
instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
The following table shows hazard cycles between DMADD16 and other instructions.
Instruction sequence
No. of cycles
MULT/MULTU → DMADD16
1 Cycle
DMULT/DMULTU → DMADD16
4 Cycles
DIV/DIVU → DMADD16
36 Cycles
DDIV/DDIVU → DMADD16
68 Cycles
MFHI/MFLO → DMADD16
2 Cycles
MADD16 → DMADD16
0 Cycles
DMADD16 → DMADD16
0 Cycles
Operation:
32, 64 T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
HI
T:
← undefined
temp ← GPR [rs] * GPR [rt]
temp ← temp + LO
LO ← temp
HI
← undefined
Exceptions:
Reserved instruction exception (VR4181 in 32-bit User mode, VR4181 in 32-bit Supervisor mode)
272
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMFC0
Doubleword Move from System Control Coprocessor
31
26 25
COP0
010000
21 20
DMF
00001
16 15
rt
DMFC0
11 10
rd
0
0
00000000000
Format:
DMFC0 rt, rd
Description:
The contents of coprocessor register rd of the CP0 are loaded into general register rt.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
All 64-bits of the general register destination are written from the coprocessor register source. The operation of
DMFC0 on a 32-bit Coprocessor 0 register is undefined.
Operation:
32, 64 T:
data ← CPR [0, rd]
T+1: GPR [rt] ← data
Exceptions:
Coprocessor unusable exception (User and Supervisor mode if CP0 not enabled)
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
273
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMTC0
Doubleword Move to System Control Coprocessor
31
26 25
COP0
010000
21 20
DMT
00101
16 15
rt
DMTC0
11 10
rd
0
0
00000000000
Format:
DMTC0 rt, rd
Description:
The contents of general register rt are loaded into coprocessor register rd of the CP0.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
All 64-bits of the coprocessor register destination are written from the general register source. The operation of
DMTC0 on a 32-bit Coprocessor 0 register is undefined.
Because the state of the virtual address translation system may be altered by this instruction, the operation of
load instructions, store instructions, and TLB operations immediately prior to and after this instruction are
undefined.
Operation:
32, 64 T:
data ← GPR [rt]
T+1: CPR [0, rd] ← data
Exceptions:
Coprocessor unusable exception (User and Supervisor mode if CP0 not enabled)
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
274
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMULT
DMULT
Doubleword Multiply
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
0
0000000000
0
DMULT
011100
Format:
DMULT rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as 2’s complement values. No
integer overflow exception occurs under any circumstances.
When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the
high-order doubleword of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by a minimum of two other instructions.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← GPR [rs] * GPR [rt]
LO ← t 63…0
HI
← t 127…64
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
275
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DMULTU
31
DMULTU
Doubleword Multiply Unsigned
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
0
0000000000
0
DMULTU
011101
Format:
DMULTU rs, rt
Description:
The contents of general register rs and the contents of general register rt are multiplied, treating both operands
as unsigned values. No overflow exception occurs under any circumstances.
When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the
high-order doubleword of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by a minimum of two instructions.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← (0 || GPR [rs]) * (0 || GPR [rt])
LO ← t 63…0
HI
← t 127…64
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
276
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSLL
DSLL
Doubleword Shift Left Logical
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
rt
11 10
rd
6 5
sa
0
DSLL
111000
Format:
DSLL rd, rt, sa
Description:
The contents of general register rt are shifted left by sa bits, inserting zeros into the low-order bits. The result is
placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 0 || sa
GPR [rd] ← GPR [rt] 63 − s…0 || 0
s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
277
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSLLV
DSLLV
Doubleword Shift Left Logical Variable
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
DSLLV
010100
Format:
DSLLV rd, rt, rs
Description:
The contents of general register rt are shifted left by the number of bits specified by the low-order six bits
contained in general register rs, inserting zeros into the low-order bits. The result is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← GPR [rs] 5…0
GPR [rd] ← GPR [rt] 63 − s…0 || 0
s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
278
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSLL32
31
DSLL32
Doubleword Shift Left Logical + 32
26 25
SPECIAL
000000
21 20
0
00000
16 15
rt
11 10
rd
6 5
sa
0
DSLL32
111100
Format:
DSLL32 rd, rt, sa
Description:
The contents of general register rt are shifted left by 32 + sa bits, inserting zeros into the low-order bits. The
result is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 1 || sa
GPR [rd] ← GPR [rt] 63 − s…0 || 0
s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
279
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRA
DSRA
Doubleword Shift Right Arithmetic
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
rt
11 10
rd
6 5
sa
0
DSRA
111011
Format:
DSRA rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, sign-extending the high-order bits. The result is
placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 0 || sa
s
GPR [rd] ← (GPR [rt]63) || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
280
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRAV
Doubleword Shift Right Arithmetic Variable
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
DSRAV
6 5
0
00000
0
DSRAV
010111
Format:
DSRAV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the low-order six bits of
general register rs, sign-extending the high-order bits. The result is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← GPR [rs]5…0
s
GPR [rd] ← (GPR [rt]63) || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
281
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRA32
31
DSRA32
Doubleword Shift Right Arithmetic + 32
26 25
SPECIAL
000000
21 20
0
00000
16 15
rt
11 10
rd
6 5
sa
0
DSRA32
111111
Format:
DSRA32 rd, rt, sa
Description:
The contents of general register rt are shifted right by 32 + sa bits, sign-extending the high-order bits. The result
is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 1 || sa
s
GPR [rd] ← (GPR [rt]63) || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
282
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRL
DSRL
Doubleword Shift Right Logical
31
26 25
SPECIAL
000000
21 20
16 15
0
00000
rt
11 10
rd
6 5
sa
0
DSRL
111010
Format:
DSRL rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, inserting zeros into the high-order bits. The result
is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 0 || sa
s
GPR [rd] ← 0 || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
283
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRLV
DSRLV
Doubleword Shift Right Logical Variable
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
DSRLV
010110
Format:
DSRLV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the low-order six bits of
general register rs, inserting zeros into the high-order bits. The result is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← GPR [rs]5…0
s
GPR [rd] ← 0 || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
284
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSRL32
31
DSRL32
Doubleword Shift Right Logical + 32
26 25
SPECIAL
000000
21 20
16 15
0
00000
rt
11 10
rd
6 5
sa
0
DSRL32
111110
Format:
DSRL32 rd, rt, sa
Description:
The contents of general register rt are shifted right by 32 + sa bits, inserting zeros into the high-order bits. The
result is placed in general register rd.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
s ← 1 || sa
s
GPR [rd] ← 0 || GPR [rt]63…s
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
285
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSUB
DSUB
Doubleword Subtract
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
DSUB
101110
Format:
DSUB rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to form a result. The
result is placed into general register rd.
An integer overflow exception takes place if the carries out of bits 62 and 63 differ (2’s complement overflow).
The destination register rd is not modified when an integer overflow exception occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] − GPR [rt]
Exceptions:
Integer overflow exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
286
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
DSUBU
DSUBU
Doubleword Subtract Unsigned
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
DSUBU
101111
Format:
DSUBU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to form a result. The
result is placed into general register rd.
The only difference between this instruction and the DSUB instruction is that DSUBU never traps on overflow.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] − GPR [rt]
Exceptions:
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
287
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ERET
ERET
Exception Return
31
26 25 24
COP0
010000
CO
1
6 5
0
0000000000000000000
0
ERET
011000
Format:
ERET
Description:
ERET is the instruction for returning from an interrupt, exception, or error trap.
Unlike a Branch or Jump
instruction, ERET does not execute the next instruction.
ERET must not itself be placed in a branch delay slot.
If the processor is servicing an error trap (SR2 = 1), then load the PC from the ErrorEPC register and clear the
ERL bit of the Status register (SR2 = 0). Otherwise (SR2 = 0), load the PC from the EPC register, and clear the
EXL bit of the Status register (SR1 = 0).
When MIPS16 instructions are enabled, the value of clearing the least significant bit of the EPC or ErrorEPC
register to 0 is loaded to PC. This means the content of the least significant bit is reflected on the ISA mode bit
(internal).
Operation:
32, 64 T:
if SR2 = 1 then
if MIPS16EN = 1 then
PC ← ErrorEPC63…1 || 0
ISA MODE ← ErrorEPC0
else
PC ← ErrorEPC
endif
SR ← SR31…3 || 0 || SR1…0
else
if MIPS16EN = 1 then
PC ← EPC63…1 || 0
ISA MODE ← EPC0
else
PC ← EPC
endif
SR ← SR31…2 || 0 || SR0
endif
Exceptions:
Coprocessor unusable exception
288
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
HIBERNATE
31
Hibernate
26 25 24
COP0
010000
CO
1
HIBERNATE
6 5
0
0000000000000000000
0
HIBERNATE
100011
Format:
HIBERNATE
Description:
HIBERNATE instruction starts mode transition from Fullspeed mode to Hibernate mode.
When the HIBERNATE instruction finishes the WB stage, the VR4100 Series wait by the SysAD bus is idle state,
and then fix the all clocks generated by the CPU core to high level, thus freezing the pipeline.
Once the VR4100 Series is in Hibernate mode, the Cold Reset sequence will cause the VR4100 Series to exit
Hibernate mode and to enter Fullspeed mode.
Operation:
32, 64 T:
T+1: Hibernate operation ( )
Exceptions:
Coprocessor unusable exception
Remark
Refer to Hardware User's Manual of each product for details about the operation of the peripheral
units at mode transition.
User’s Manual U15509EJ2V0UM
289
CHAPTER 9 CPU INSTRUCTION SET DETAILS
J
J
Jump
31
26 25
0
J
000010
target
Format:
J target
Description:
The 26-bit target address is shifted left by two bits and combined with the high-order four bits of the address of
the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction.
Operation:
32
T:
temp ← target
T+1: PC ← PC31…28 || temp || 0
64
T:
temp ← target
T+1: PC ← PC63…28 || temp || 0
2
2
Exceptions:
None
290
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
JAL
Jump And Link
31
26 25
JAL
0
JAL
000011
target
Format:
JAL target
Description:
The 26-bit target address is shifted left by two bits and combined with the high-order four bits of the address of
the delay slot. The program unconditionally jumps to this calculated address with a delay of one instruction. The
address of the instruction immediately after a delay slot is placed in the link register (r31). When MIPS16
instructions are enabled, the value of bit 0 of r31 indicates the ISA mode bit (internal) before jump.
Operation:
32
T:
temp ← target
if MIPS16EN = 1 then
GPR [31] ← (PC + 8)31…1 || ISA MODE
else
GPR [31] ← PC + 8
endif
T+1: PC ← PC31…28 || temp || 0
64
T:
2
temp ← target
if MIPS16EN = 1 then
GPR [31] ← (PC + 8)63…1 || ISA MODE
else
GPR [31] ← PC + 8
endif
T+1: PC ← PC63…28 || temp || 0
2
Exceptions:
None
User’s Manual U15509EJ2V0UM
291
CHAPTER 9 CPU INSTRUCTION SET DETAILS
JALR
JALR
Jump And Link Register
31
26 25
SPECIAL
000000
21 20
rs
16 15
0
00000
11 10
rd
6 5
0
00000
0
JALR
001001
Format:
JALR rs
JALR rd, rs
Description:
The program unconditionally jumps to the address contained in general register rs, with a delay of one
instruction.
When MIPS16 instructions are enabled, the program unconditionally jumps with a delay of one instruction to the
address indicated by the value of clearing the least significant bit of the general register rs to 0. Then, the
content of the least significant bit of the general register rs is set to the ISA mode bit (internal).
The address of the instruction immediately after the delay slot is placed in general register rd. The default value
of rd, if omitted in the assembly language instruction, is 31. When MIPS16 instructions are enabled, the value of
bit 0 of rd indicates the ISA mode bit before jump.
Register specifiers rs and rd should not be equal since such an instruction does not have the same effect when
re-executed because storing a link address destroys the contents of rs if they are equal. However, an attempt to
execute this instruction is not trapped, and the result of executing such an instruction is undefined.
Since 32-bit length instructions must be word-aligned, a Jump and Link Register (JALR) instruction must specify
a target register (rs) that contains an address whose two low-order bits are zero when MIPS16 instructions are
enabled.
If these low-order bits are not zero, an address error exception will occur when the jump target
instruction is subsequently fetched.
Operation:
32, 64 T:
temp ← GPR [rs]
if MIPS16EN = 1 then
GPR [rd] ← (PC + 8)63…1 || ISA MODE
else
GPR [rd] ← PC + 8
endif
T+1: if MIPS16EN = 1 then
PC ← temp63…1 || 0
ISA MODE ← temp0
else
PC ← temp
endif
Exceptions:
None
292
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
JALX
Jump And Link Exchange
31
26 25
JALX
0
JALX
011101
target
Format:
JALX target
Description:
When MIPS16 instructions are enabled, a 26-bit target address is shifted to left by two bits and combined with
the high-order four bits of the address or the delay slot. The program unconditionally jumps to the calculated
address with a delay of one instruction. The address of the instruction immediately after a delay slot is placed in
the link register (r31). The ISA mode bit is inverted with a delay of one instruction. The value of bit 0 of the link
register (r31) indicates the ISA mode bit (internal) before jump.
Operation:
32
T:
temp ← target
GPR [31] ← (PC + 8)31…1 || ISA MODE
T+1: PC ← PC31…28 || temp || 0
2
ISA MODE toggle
64
T:
temp ← target
GPR [31] ← (PC + 8)63…1 || ISA MODE
T+1: PC ← PC63…28 || temp || 0
2
ISA MODE toggle
Exceptions:
Reserved instruction exception (when MIPS16 instruction execution disabled)
User’s Manual U15509EJ2V0UM
293
CHAPTER 9 CPU INSTRUCTION SET DETAILS
JR
JR
Jump Register
31
26 25
SPECIAL
000000
21 20
rs
6 5
0
000000000000000
0
JR
001000
Format:
JR rs
Description:
The program unconditionally jumps to the address contained in general register rs, with a delay of one
instruction.
When MIPS16 instructions are enabled, the program unconditionally jumps with a delay of one instruction to the
address indicated by the value of clearing the least significant bit of the general register rs to 0. Then, the
content of the least significant bit of the general register rs is set to the ISA mode bit (internal).
Since 32-bit length instructions must be word-aligned, a Jump Register (JR) instruction must specify a target
register (rs) that contains an address whose two low-order bits are zero when MIPS16 instructions are enabled.
If these low-order bits are not zero, an address error exception will occur when the jump target instruction is
subsequently fetched.
Operation:
32, 64 T:
temp ← GPR [rs]
T+1: if MIPS16EN = 1 then
PC ← temp63…1 || 0
ISA MODE ← temp0
else
PC ← temp
endif
Exceptions:
None
294
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LB
31
26 25
LB
100000
Load Byte
LB
16 15
0
21 20
base
rt
offset
Format:
LB rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the byte at the memory location specified by the effective address are sign-extended and loaded
into general register rt.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr2…0 xor BigEndianCPU
GPR [rt] ← (mem7 + 8*byte)
64
T:
vAddr ← ((offset15)
48
24
3
|| mem7 + 8*byte…8*byte
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr2…0 xor BigEndianCPU
GPR [rt] ← (mem7 + 8*byte)
56
3
|| mem7 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
295
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LBU
LBU
Load Byte Unsigned
31
26 25
LBU
100100
21 20
base
16 15
0
rt
offset
Format:
LBU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the byte at the memory location specified by the effective address are zero-extended and loaded
into general register rt.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr2…0 xor BigEndianCPU
GPR [rt] ← 0
64
T:
24
3
|| mem7 + 8*byte…8*byte
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
byte ← vAddr2…0 xor BigEndianCPU
GPR [rt] ← 0
56
3
|| mem7 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
296
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LD
LD
Load Doubleword
31
26 25
LD
110111
21 20
base
16 15
rt
0
offset
Format:
LD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the 64-bit doubleword at the memory location specified by the effective address are loaded into
general register rt.
If any of the three least-significant bits of the effective address are non-zero, an address error exception occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
GPR [rt] ← data
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
data ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
GPR [rt] ← data
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
297
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDL
LDL
Load Doubleword Left
31
26 25
21 20
LDL
011010
16 15
base
0
rt
offset
Format:
LDL rt, offset (base)
Description:
This instruction can be used in combination with the LDR instruction to load a register with eight consecutive
bytes from memory, when the bytes cross a doubleword boundary. LDL loads the left portion of the register with
the appropriate part of the high-order doubleword in memory; LDR loads the right portion of the register with the
appropriate part of the low-order doubleword.
The LDL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that can specify an arbitrary byte. It reads bytes only from the doubleword in memory that contains the
specified starting byte, and places them in the high-order part of general register rt.
The contents of the
remaining part of general register rt is retained. From one to eight bytes will be loaded, depending on the starting
byte specified.
Conceptually, it starts at the specified byte in memory and loads that byte into the high-order (left-most) byte of
the register; then it loads bytes from memory into the register until it reaches the low-order byte of the
doubleword in memory. The least-significant (right-most) byte(s) of the register will not be changed.
Memory (little endian)
address 8
15 14 13 12 11 10
9
8
address 0
7
1
0
6
5
4
3
2
Register
before
C
D
E
F
G
H
$24
12 11 10
9
8
F
G
H
$24
A
B
LDL $24, 12 ($0)
after
298
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDL
Load Doubleword Left
(Continued)
LDL
The contents of general register rt are internally bypassed within the processor so that no NOP is needed
between an immediately preceding load instruction which specifies register rt and a following LDL (or LDR)
instruction which also specifies register rt.
No address error exceptions due to alignment are possible.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
3
mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR [rt] ← mem7 + 8*byte…0 || GPR [rt]55 – 8*byte…0
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
3
mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
GPR [rt] ← mem7 + 8*byte…0 || GPR [rt]55 – 8*byte…0
User’s Manual U15509EJ2V0UM
299
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDL
LDL
Load Doubleword Left
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of LDL is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
P BCDE FGH
0
0
7
I J K LMNOP
7
0
0
1
OPCDE FGH
1
0
6
J K LMNOP H
6
0
1
2
NOPDE FGH
2
0
5
K LMNOPGH
5
0
2
3
MNO P E F G H
3
0
4
L MNOP F GH
4
0
3
4
L MNOP F GH
4
0
3
MNO P E F G H
3
0
4
5
K LMNOPGH
5
0
2
NOPDE FGH
2
0
5
6
J K LMNOP H
6
0
1
OPCDE FGH
1
0
6
7
I J K LMNOP
7
0
0
P BCDE FGH
0
0
7
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
300
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDR
LDR
Load Doubleword Right
31
26 25
21 20
LDR
011011
16 15
base
0
rt
offset
Format:
LDR rt, offset (base)
Description:
This instruction can be used in combination with the LDL instruction to load a register with eight consecutive
bytes from memory, when the bytes cross a doubleword boundary. LDR loads the right portion of the register
with the appropriate part of the low-order doubleword in memory; LDL loads the left portion of the register with
the appropriate part of the high-order doubleword.
The LDR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that can specify an arbitrary byte. It reads bytes only from the doubleword in memory that contains the
specified starting byte, and places them in the low-order part of general register rt. The contents of the remaining
part of general register rt is retained. From one to eight bytes will be loaded, depending on the starting byte
specified.
Conceptually, it starts at the specified byte in memory and loads that byte into the low-order (right-most) byte of
the register; then it loads bytes from memory into the register until it reaches the high-order byte of the
doubleword in memory. The most significant (left-most) byte(s) of the register will not be changed.
Memory (little endian)
address 8
15 14 13 12 11 10
9
8
address 0
7
1
0
6
5
4
3
2
Register
before
A
B
C
D
E
F
G
H
$24
after
A
B
C
D
E
7
6
5
$24
LDR $24, 5 ($0)
User’s Manual U15509EJ2V0UM
301
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDR
Load Doubleword Right
(Continued)
LDR
The contents of general register rt are internally bypassed within the processor so that no NOP is needed
between an immediately preceding load instruction which specifies register rt and a following LDR (or LDL)
instruction which also specifies register rt.
No address error exceptions due to alignment are possible.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
3
mem ← LoadMemory (uncached, DOUBLEWORD-byte, pAddr, vAddr, DATA)
GPR [rt] ← GPR [rt]63…64 – 8*byte || mem63… 8*byte
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
3
mem ← LoadMemory (uncached, DOUBLEWORD-byte, pAddr, vAddr, DATA)
GPR [rt] ← GPR [rt]63…64 – 8*byte || mem63… 8*byte
302
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LDR
LDR
Load Doubleword Right
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of LDR is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
I J K LMNOP
7
0
0
A BCDE FG I
0
7
0
1
A I J K LMNO
6
1
0
A BCDE F I J
1
6
0
2
A B I J K LMN
5
2
0
A BCDE I J K
2
5
0
3
A BC I J K LM
4
3
0
A BCD I J K L
3
4
0
4
A BCD I J K L
3
4
0
A BC I J K LM
4
3
0
5
A BCDE I J K
2
5
0
A B I J K LMN
5
2
0
6
A BCDE F I J
1
6
0
A I J K LMNO
6
1
0
7
A BCDE FG I
0
7
0
I J K LMNOP
7
0
0
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
303
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LH
31
26 25
LH
100001
Load Halfword
LH
16 15
0
21 20
base
rt
offset
Format:
LH rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the halfword at the memory location specified by the effective address are sign-extended and
loaded into general register rt.
If the least-significant bit of the effective address is non-zero, an address error exception occurs.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
GPR [rt] ← (mem15 + 8*byte)
64
T:
vAddr ← ((offset15)
48
16
|| mem15 + 8*byte…8*byte
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
GPR [rt] ← (mem15 + 8*byte)
48
|| mem15 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
304
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LHU
LHU
Load Halfword Unsigned
31
26 25
LHU
100101
21 20
base
16 15
0
rt
offset
Format:
LHU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the halfword at the memory location specified by the effective address are zero-extended and
loaded into general register rt.
If the least-significant bit of the effective address is non-zero, an address error exception occurs.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
GPR [rt] ← 0
64
T:
16
|| mem15 + 8*byte…8*byte
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
GPR [rt] ← 0
48
|| mem15 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
305
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LUI
LUI
Load Upper Immediate
31
26 25
LUI
001111
21 20
16 15
0
00000
0
rt
immediate
Format:
LUI rt, immediate
Description:
The 16-bit immediate is shifted left by 16 bits and concatenated to 16 bits of zeros. The result is placed into
general register rt. In 64-bit mode, the loaded word is sign-extended.
Operation:
32
T:
GPR [rt] ← immediate || 0
64
T:
GPR [rt] ← (immediate15)
16
32
|| immediate || 0
16
Exceptions:
None
306
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LW
31
26 25
LW
100011
Load Word
LW
16 15
0
21 20
base
rt
offset
Format:
LW rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the word at the memory location specified by the effective address are loaded into general
register rt. In 64-bit mode, the loaded word is sign-extended.
If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
GPR [rt] ← mem31 + 8*byte…8*byte
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
GPR [rt] ← (mem31 + 8*byte)
32
|| mem31 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
307
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWL
31
LWL
Load Word Left
26 25
LWL
100010
21 20
16 15
base
0
rt
offset
Format:
LWL rt, offset (base)
Description:
This instruction can be used in combination with the LWR instruction to load a register with four consecutive
bytes from memory, when the bytes cross a word boundary. LWL loads the left portion of the register with the
appropriate part of the high-order word in memory; LWR loads the right portion of the register with the
appropriate part of the low-order word.
The LWL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that can specify an arbitrary byte. It reads bytes only from the word in memory that contains the
specified starting byte, and places them in the high-order part of general register rt.
The contents of the
remaining part of general register rt are retained. From one to four bytes will be loaded, depending on the
starting byte specified. In 64-bit mode, the loaded word is sign-extended.
Conceptually, it starts at the specified byte in memory and loads that byte into the high-order (left-most) byte of
the register; then it loads bytes from memory into the register until it reaches the low-order byte of the word in
memory. The least-significant (right-most) byte(s) of the register will not be changed.
Memory (little endian)
address 4
7
6
5
4
address 0
3
2
1
0
Register
before
A
B
C
D
$24
after
4
B
C
D
$24
LWL $24, 4 ($0)
308
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWL
LWL
Load Word Left
(Continued)
The contents of general register rt are internally bypassed within the processor so that no NOP is needed
between an immediately preceding load instruction which specifies register rt and a following LWL (or LWR)
instruction which also specifies register rt.
No address error exceptions due to alignment are possible.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
word ← vAddr2 xor BigEndianCPU
mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
temp ← mem32*word + 8*byte + 7…32*word || GPR [rt]23 – 8*byte…0
GPR [rt] ← temp
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
word ← vAddr2 xor BigEndianCPU
mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
temp ← mem32*word + 8*byte + 7…32*word || GPR [rt]23 – 8*byte…0
GPR [rt] ← (temp31)
32
|| temp
User’s Manual U15509EJ2V0UM
309
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWL
LWL
Load Word Left
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of LWL is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
offset
LEM
BEM
0
S SSSP FGH
0
0
7
S SSS I J K L
3
4
0
1
S SSSOPGH
1
0
6
S SSS J K L H
2
4
1
2
S SSSNOPH
2
0
5
S SSSK LGH
1
4
2
3
S S S SMNOP
3
0
4
S SSS L FGH
0
4
3
4
S SSS L FGH
0
4
3
S S S SMNOP
3
0
4
5
S SSSK LGH
1
4
2
S SSSNOPH
2
0
5
6
S SSS J K L H
2
4
1
S SSSOPGH
1
0
6
7
S SSS I J K L
3
4
0
S SSSP FGH
0
0
7
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
S:
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
sign-extend of destination31
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
310
type
Note
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWR
31
LWR
Load Word Right
26 25
LWR
100110
21 20
16 15
base
0
rt
offset
Format:
LWR rt, offset (base)
Description:
This instruction can be used in combination with the LWL instruction to load a register with four consecutive
bytes from memory, when the bytes cross a word boundary. LWR loads the right portion of the register with the
appropriate part of the low-order word in memory; LWL loads the left portion of the register with the appropriate
part of the high-order word.
The LWR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that can specify an arbitrary byte. It reads bytes only from the word in memory that contains the
specified starting byte, and places them in the low-order part of general register rt. The contents of the remaining
part of general register rt are retained. From one to four bytes will be loaded, depending on the starting byte
specified. In 64-bit mode, the loaded word is sign-extended.
Conceptually, it starts at the specified byte in memory and loads that byte into the low-order (right-most) byte of
the register; then it loads bytes from memory into the register until it reaches the high-order byte of the word in
memory. The most significant (left-most) byte(s) of the register will not be changed.
Memory (little endian)
address 4
7
6
5
4
address 0
3
2
1
0
Register
before
A
B
C
D
$24
after
A
3
2
1
$24
LWR $24, 1 ($0)
User’s Manual U15509EJ2V0UM
311
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWR
LWR
Load Word Right
(Continued)
The contents of general register rt are internally bypassed within the processor so that no NOP is needed
between an immediately preceding load instruction which specifies register rt and a following LWR (or LWL)
instruction which also specifies register rt.
No address error exceptions due to alignment are possible.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr1…0 xor BigEndianCPU
2
word ← vAddr2 xor BigEndianCPU
mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
temp ← GPR [rt]31…32 – 8*byte || mem31 + 32*word…32*word + 8*byte
GPR [rt] ← temp
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr1…0 xor BigEndianCPU
2
word ← vAddr2 xor BigEndianCPU
mem ← LoadMemory (uncached, WORD-byte, pAddr, vAddr, DATA)
temp ← GPR [rt]31…32 – 8*byte || mem31 + 32*word…32*word + 8*byte
GPR [rt] ← (temp31)
312
32
|| temp
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWR
LWR
Load Word Right
(Continued)
Given a word in a register and a word in memory, the operation of LWR is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
S S S SMNOP
3
0
4
S SSSE FG I
0
7
0
1
S S S S EMNO
2
1
4
S SSSE F I J
1
6
0
2
S S S S E FMN
1
2
4
S SSSE I J K
2
5
0
3
S S S S E F GM
0
3
4
S SSS I J K L
3
4
0
4
S SSS I J K L
3
4
0
S S S S E F GM
0
3
4
5
S SSSE I J K
2
5
0
S S S S E FMN
1
2
4
6
S SSSE F I J
1
6
0
S S S S EMNO
2
1
4
7
S SSSE FG I
0
7
0
S S S SMNOP
3
0
4
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
S:
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
sign-extend of destination31
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
313
CHAPTER 9 CPU INSTRUCTION SET DETAILS
LWU
LWU
Load Word Unsigned
31
26 25
LWU
100111
21 20
base
16 15
0
rt
offset
Format:
LWU rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of the word at the memory location specified by the effective address are loaded into general
register rt. The loaded word is zero-extended.
If either of the two least-significant bits of the effective address is non-zero, an address error exception occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
GPR [rt] ← 0
64
T:
32
|| mem 31 + 8*byte…8*byte
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
GPR [rt] ← 0
32
|| mem 31 + 8*byte…8*byte
Exceptions:
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
314
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
31
26 25
SPECIAL
000000
21 20
rs
16 15
11 10 9
rd
rt
8 7 6
sat hi 0 0 us
5
0
MACC
101000
Format:
MACC
rd, rs, rt
MACCU
rd, rs, rt
MACCHI
rd, rs, rt
MACCHIU
rd, rs, rt
MACCS
rd, rs, rt
MACCUS
rd, rs, rt
MACCHIS
rd, rs, rt
MACCHIUS rd, rs, rt
Description:
The mnemonics of the MACC instruction differ as shown in the table below by the setting of the sat, hi, or us bits.
Mnemonic
sat
hi
us
MACC
0
0
0
MACCU
0
0
1
MACCHI
0
1
0
MACCHIU
0
1
1
MACCS
1
0
0
MACCUS
1
0
1
MACCHIS
1
1
0
MACCHIUS
1
1
1
The number of valid bits in the operands differs depending on whether saturation processing is executed (sat =
1) or not (sat = 0).
• When saturation processing is executed (sat = 1):
MACCS, MACCUS, MACCHIS, and MACCHIUS
instructions
The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents
of both operands are handled as 16-bit unsigned data. If us = 0, the contents are handled as 16-bit signed
integers. Sign/zero extension by software is required for bits 16 to 31 in the operands.
User’s Manual U15509EJ2V0UM
315
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
MACC
The product of this multiply operation is added to the 64-bit value (of which only the low-order 32 bits are valid)
formed by concatenating special registers HI and LO. If us = 1, this add operation handles the values being
added as 32-bit unsigned data.
If us = 0, the values are handled as 32-bit signed integers.
Sign/zero
extension by software is required for bits 32 to 63 of the value formed by concatenating special registers HI and
LO.
After saturation processing of 32 bits has been performed (refer to the table below), the sum from this add
operation is loaded to special registers HI and LO. When hi = 1, data that is the same as the data loaded to
special register HI is also loaded to general register rd. When hi = 0, data that is the same as the data loaded
to special register LO is also loaded to general register rd. Overflow exceptions do not occur.
• When saturation processing is not executed (sat = 0):
MACC, MACCU, MACCHI, and MACCHIU
instructions
The contents of general register rs are multiplied by the contents of general register rt. If us = 1, the contents
of both operands are handled as 32-bit unsigned data. If us = 0, the contents are handled as 32-bit signed
integers. Sign/zero extension by software is required for bits 32 to 63 in the operands.
The product of this multiply operation is added to the 64-bit value formed by concatenating special registers HI
and LO. If us = 1, this add operation handles the values being added as 64-bit unsigned data. If us = 0, the
values are handled as 64-bit signed integers.
The low-order word of the sum from this add operation is loaded to special register LO, and the high-order word
to special register HI. When hi = 1, data that is the same as the data loaded to special register HI is also
loaded to general register rd. When hi = 0, data that is the same as the data loaded to special register LO is
also loaded to general register rd. Overflow exceptions do not occur.
316
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
The correspondence of us and sat settings and values stored during saturation processing is shown below, along
with the hazard cycles required between execution of the instruction for manipulating the HI and LO registers and
execution of the MACC instruction.
Values Stored During Saturation Processing
us
sat
Overflow
Underflow
0
0
Store calculation result as is
Store calculation result as is
1
0
Store calculation result as is
Store calculation result as is
0
1
0x0000 0000 7FFF FFFF
0xFFFF FFFF 8000 0000
1
1
0xFFFF FFFF FFFF FFFF
None
Hazard Cycle Counts
Instruction
Cycle Count
MULT, MULTU
DMULT, DMULTU
DIV, DIVU
DDIV, DDIVU
MFHI, MFLO
MTHI, MTLO
MACC
DMACC
Note1
3
36
68
Note2
0
0
0
Notes 1. VR4121, VR4122 … 1
VR4131 … 0
VR4181A … 1
2. VR4121, VR4122 … 2
VR4131 … 0
VR4181A … 2
Operation:
32, sat = 0, hi = 0, us = 0 (MACC instruction)
T:
temp1 ← GPR[rs] * GPR[rt]
temp2 ← temp1 + (HI || LO)
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← LO
32, sat = 0, hi = 0, us = 1 (MACCU instruction)
T:
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← temp1 + ((0 || HI) || (0 || LO))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← LO
User’s Manual U15509EJ2V0UM
317
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
32, sat = 0, hi = 1, us = 0 (MACCHI instruction)
T:
temp1 ← GPR[rs] * GPR[rt]
temp2 ← temp1 + (HI || LO)
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← HI
32, sat = 0, hi = 1, us = 1 (MACCHIU instruction)
T:
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← temp1 + ((0 || HI) || (0 || LO))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← HI
32, sat = 1, hi = 0, us = 0 (MACCS instruction)
T:
temp1 ← GPR[rs] * GPR[rt]
temp2 ← saturation(temp1 + (HI || LO))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← LO
32, sat = 1, hi = 0, us = 1 (MACCUS instruction)
T:
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← saturation(temp1 + ((0 || HI) || (0 || LO)))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← LO
32, sat = 1, hi = 1, us = 0 (MACCHIS instruction)
T:
temp1 ← GPR[rs] * GPR[rt]
temp2 ← saturation(temp1 + (HI || LO))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← HI
32, sat = 1, hi = 1, us = 1 (MACCHIUS instruction)
T:
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← saturation(temp1 + ((0 || HI) || (0 || LO)))
LO ← temp263..32
HI ← temp231..0
GPR[rd] ← HI
318
User’s Manual U15509EJ2V0UM
MACC
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
MACC
64, sat = 0, hi = 0, us = 0 (MACC instruction)
T:
32
32
temp1 ← ((GPR[rs]31 ) || GPR[rs]) * ((GPR[rt]31 ) || GPR[rt])
temp2 ← temp1 + (HI31..0 || LO31..0 )
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← LO
64, sat = 0, hi = 0, us = 1 (MACCU instruction)
T:
32
32
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← temp1 + (HI31..0 || LO31..0 )
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← LO
64, sat = 0, hi = 1, us = 0 (MACCHI instruction)
T:
32
32
temp1 ← ((GPR[rs]31 ) || GPR[rs]) * ((GPR[rt]31 ) || GPR[rt])
temp2 ←temp1 + (HI31..0 || LO31..0 )
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← HI
64, sat = 0, hi = 1, us = 1 (MACCHIU instruction)
T:
32
32
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← temp1 + (HI31..0 || LO31..0 )
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← HI
64, sat = 1, hi = 0, us = 0 (MACCS instruction)
T:
32
32
temp1 ← ((GPR[rs]31 ) || GPR[rs]) * ((GPR[rt]31 ) || GPR[rt])
temp2 ← saturation(temp1 + (HI31..0 || LO31..0 ))
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← LO
64, sat = 1, hi = 0, us = 1 (MACCUS instruction)
T:
32
32
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ←saturation(temp1 + (HI31..0 || LO31..0 ))
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← LO
User’s Manual U15509EJ2V0UM
319
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MACC
Multiply and Add Accumulate
(for VR4121, VR4122, VR4131, and VR4181A)
(Continued)
64, sat = 1, hi = 1, us = 0 (MACCHIS instruction)
T:
32
32
temp1 ← ((GPR[rs]31 ) || GPR[rs]) * ((GPR[rt]31 ) || GPR[rt])
temp2 ← saturation(temp1 + (HI31..0 || LO31..0 ))
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← HI
64, sat = 1, hi = 1, us = 1 (MACCHIUS instruction)
T:
32
32
temp1 ← (0 || GPR[rs]) * (0 || GPR[rt])
temp2 ← saturation(temp1 + (HI31..0 || LO31..0 )
32
LO ← ((temp263 ) || temp263..32 )
32
HI ← ((temp231 ) || temp231..0 )
GPR[rd] ← HI
Exceptions:
None
320
User’s Manual U15509EJ2V0UM
MACC
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MADD16
31
26 25
SPECIAL
000000
MADD16
Multiply and Add 16-bit integer
(for VR4181 only)
21 20
16 15
rs
rt
6 5
0
0000000000
0
MADD16
101000
Format:
MADD16 rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as 16-bit 2’s complement
values. Bits 62 to 15 of the operand must be valid sign-extended values. If not, the result is unpredictable.
This multiplied result and the 64-bit data joined special register HI to LO are added to form the result. When the
operation completes, the low-order word of the result is loaded into special register LO, and the high-order word
of the result is loaded into special register HI.
No integer overflow exception occurs under any circumstances.
Hazard cycles required between MADD16 and other instructions are as follows.
Instruction sequence
No. of cycles
MULT/MULTU → MADD16
1 Cycle
DMULT/DMULTU → MADD16
4 Cycles
DIV/DIVU → MADD16
36 Cycles
DDIV/DDIVU → MADD16
68 Cycles
MFHI/MFLO → MADD16
2 Cycles
DMADD16 → MADD16
0 Cycles
MADD16 → MADD16
0 Cycles
Operation:
32, 64 T:
temp1 ← GPR [rs] * GPR [rt]
temp2 ← temp1 + (HI31…0 || LO31…0)
LO ← (temp2 31)
HI
← (temp2 63)
32
|| temp2 31…0
32
|| temp2 63…32
Exceptions:
None
User’s Manual U15509EJ2V0UM
321
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MFC0
Move from System Control Coprocessor
31
26 25
COP0
010000
21 20
MF
00000
16 15
rt
11 10
rd
0
0
00000000000
Format:
MFC0 rt, rd
Description:
The contents of coprocessor register rd of the CP0 are loaded into general register rt.
Operation:
32
T:
data ← CPR [0, rd]
T+1: GPR [rt] ← data
64
T:
data ← CPR [0, rd]
T+1: GPR [rt] ← (data31)
32
|| data31…0
Exceptions:
Coprocessor unusable exception (User and Supervisor mode if CP0 not enabled)
322
MFC0
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MFHI
MFHI
Move from HI
31
26 25
SPECIAL
000000
16 15
0
0000000000
11 10
rd
6 5
0
00000
0
MFHI
010000
Format:
MFHI rd
Description:
The contents of special register HI are loaded into general register rd.
To ensure proper operation in the event of interruptions, the two instructions which follow a MFHI instruction may
not be any of the instructions which modify the HI register: MACC, DMACC, MADD16, DMADD16, MULT,
MULTU, DIV, DIVU, MTHI, DMULT, DMULTU, DDIV, DDIVU.
Operation:
32, 64 T:
GPR [rd] ← HI
Exceptions:
None
User’s Manual U15509EJ2V0UM
323
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MFLO
MFLO
Move from LO
31
26 25
SPECIAL
000000
16 15
0
0000000000
11 10
rd
6 5
0
00000
0
MFLO
010010
Format:
MFLO rd
Description:
The contents of special register LO are loaded into general register rd.
To ensure proper operation in the event of interruptions, the two instructions which follow a MFLO instruction may
not be any of the instructions which modify the LO register: MACC, DMACC, MADD16, DMADD16, MULT,
MULTU, DIV, DIVU, MTLO, DMULT, DMULTU, DDIV, DDIVU.
Operation:
32, 64 T:
GPR [rd] ← LO
Exceptions:
None
324
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MTC0
MTC0
Move to Coprocessor0
31
26 25
COP0
010000
21 20
MT
00100
16 15
rt
11 10
rd
0
0
00000000000
Format:
MTC0 rt, rd
Description:
The contents of general register rt are loaded into coprocessor register rd of CP0.
Because the state of the virtual address translation system may be altered by this instruction, the operation of
load instructions, store instructions, and TLB operations immediately prior to and after this instruction are
undefined.
When using a register used by the MTC0 by means of instructions before and after it, refer to CHAPTER 11
COPROCESSOR 0 HAZARDS and place the instructions in the appropriate location.
Operation:
32, 64 T:
data ← GPR [rt]
T+1: CPR [0, rd] ← data
Exceptions:
Coprocessor unusable exception (User and Supervisor mode if CP0 not enabled)
User’s Manual U15509EJ2V0UM
325
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MTHI
MTHI
Move to HI
31
26 25
SPECIAL
000000
21 20
rs
6 5
0
000000000000000
0
MTHI
010001
Format:
MTHI rs
Description:
The contents of general register rs are loaded into special register HI.
Restrictions:
The operation results written to the HI/LO register pair via a DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MULT,
or MULTU instruction should be read by the MFHI or MFLO instruction before another result is written to either of
the registers. If the MTHI instruction is executed prior to the MFLO or MFHI instruction following the execution of
any one of the arithmetic instructions, the contents of the LO register are undefined as shown in the example
below.
MULT
r2, r4 # start operation that will eventually write to HI, LO
…
MTHI
# code not containing MFHI or MFLO
r6
…
MFLO
# code not containing MFLO
r3
# this MFLO would get an undefined value
Operation:
32, 64 T−2: HI
← undefined
T−1: HI
← undefined
T:
← GPR [rs]
HI
Exceptions:
None
326
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MTLO
MTLO
Move to LO
31
26 25
SPECIAL
000000
21 20
rs
6 5
0
000000000000000
0
MTLO
010011
Format:
MTLO rs
Description:
The contents of general register rs are loaded into special register LO.
Restrictions:
The operation results written to the HI/LO register pair via a DDIV, DDIVU, DIV, DIVU, DMULT, DMULTU, MULT,
or MULTU instruction should be read by the MFHI or MFLO instruction before another result is written to either of
the registers. If the MTLO instruction is executed prior to the MFLO or MFHI instruction following the execution of
any one of the arithmetic instructions, the contents of the HI register are undefined as shown in the example
below.
MULT
r2, r4 # start operation that will eventually write to HI, LO
…
MTLO
# code not containing MFHI or MFLO
r6
…
MFHI
# code not containing MFHI
r3
# this MFHI would get an undefined value
Operation:
32, 64 T−2: LO ← undefined
T−1: LO ← undefined
T:
LO ← GPR [rs]
Exceptions:
None
User’s Manual U15509EJ2V0UM
327
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MULT
MULT
Multiply
31
26 25
21 20
SPECIAL
000000
rs
16 15
rt
6 5
0
0000000000
0
MULT
011000
Format:
MULT rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as signed 32-bit integer. No
integer overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit,
sign-extended values.
When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the
high-order doubleword of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by a minimum of two other instructions.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← GPR [rs] * GPR [rt]
LO ← t 31…0
HI
64
← t 63…32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← GPR [rs] 31…0 * GPR [rt] 31…0
LO ← (t 31)
HI
← (t 63)
32
|| t 31…0
32
|| t 63…32
Exceptions:
None
328
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
MULTU
MULTU
Multiply Unsigned
31
26 25
21 20
SPECIAL
000000
rs
16 15
rt
6 5
0
0000000000
0
MULTU
011001
Format:
MULTU rs, rt
Description:
The contents of general registers rs and rt are multiplied, treating both operands as unsigned values.
No
overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit, signextended values.
When the operation completes, the low-order doubleword of the result is loaded into special register LO, and the
high-order doubleword of the result is loaded into special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined.
Correct operation requires separating reads of HI or LO from writes by a minimum of two instructions.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← (0 || GPR [rs]) * (0 || GPR [rt])
LO ← t 31…0
HI
64
← t 63…32
T−2: LO ← undefined
HI
← undefined
T−1: LO ← undefined
T:
HI
← undefined
t
← (0 || GPR [rs] 31…0 ) * (0 || GPR [rt] 31…0)
LO ← (t 31)
HI
← (t 63)
32
|| t 31…0
32
|| t 63…32
Exceptions:
None
User’s Manual U15509EJ2V0UM
329
CHAPTER 9 CPU INSTRUCTION SET DETAILS
NOR
NOR
NOR
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
NOR
100111
Format:
NOR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in a bit-wise logical NOR
operation. The result is placed into general register rd.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] nor GPR [rt]
Exceptions:
None
330
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
OR
OR
OR
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
OR
100101
Format:
OR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in a bit-wise logical OR
operation. The result is placed into general register rd.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] or GPR [rt]
Exceptions:
None
User’s Manual U15509EJ2V0UM
331
CHAPTER 9 CPU INSTRUCTION SET DETAILS
ORI
31
26 25
ORI
001101
OR Immediate
ORI
16 15
0
21 20
rs
rt
immediate
Format:
ORI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical
OR operation. The result is placed into general register rt.
Operation:
32
T:
GPR [rt] ← GPR [rs] 31…16 || (immediate or GPR [rs] 15…0)
64
T:
GPR [rt] ← GPR [rs] 63…16 || (immediate or GPR [rs] 15…0)
Exceptions:
None
332
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SB
31
26 25
SB
101000
Store Byte
SB
16 15
0
21 20
base
rt
offset
Format:
SB rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The least-significant byte of register rt is stored at the effective address.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
byte ← vAddr2…0 xor BigEndianCPU
data ← GPR [rt]63 – 8*byte…0 || 0
3
8*byte
StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
byte ← vAddr2…0 xor BigEndianCPU
data ← GPR [rt]63 – 8*byte…0 || 0
3
8*byte
StoreMemory (uncached, BYTE, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
333
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SD
SD
Store Doubleword
31
26 25
SD
111111
21 20
base
16 15
rt
0
offset
Format:
SD rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of general register rt are stored at the memory location specified by the effective address.
If either of the three least-significant bits of the effective address are non-zero, an address error exception
occurs.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
data ← GPR [rt]
StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
data ← GPR [rt]
StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
334
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDL
SDL
Store Doubleword Left
31
26 25
SDL
101100
21 20
16 15
base
0
rt
offset
Format:
SDL rt, offset (base)
Description:
This instruction can be used with the SDR instruction to store the contents of a register into eight consecutive
bytes of memory, when the bytes cross a doubleword boundary. SDL stores the left portion of the register into
the appropriate part of the high-order doubleword in memory; SDR stores the right portion of the register into the
appropriate part of the low-order doubleword.
The SDL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that may specify an arbitrary byte. It alters only the doubleword in memory that contains the specified
starting byte, with the high-order part of general register rt. From one to eight bytes will be stored, depending on
the starting byte specified.
Conceptually, it starts at the most-significant (leftmost) byte of the register and copies it to the specified byte in
memory; then it copies bytes from register to memory until it reaches the low-order byte of the doubleword in
memory.
Memory (little endian)
address 8
15 14 13 12 11 10
9
8
address 0
7
2
1
0
address 8
15 14 13 12 11 10
9
A
address 0
7
1
0
6
5
4
3
before
Register
A
B
C
D
E
F
G
H
$24
SDL $24, 8 ($0)
6
5
4
3
2
after
User’s Manual U15509EJ2V0UM
335
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDL
Store Doubleword Left
(Continued)
SDL
No address error exceptions due to alignment are possible.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
data ← 0
56 – 8*byte
3
|| GPR [rt]63…56 – 8*byte
StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
data ← 0
56 – 8*byte
3
|| GPR [rt]63…56 – 8*byte
StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
336
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDL
SDL
Store Doubleword Left
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of SDL is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
I J K LMNOA
0
0
7
A BCDE FGH
7
0
0
1
I J K LMNA B
1
0
6
I ABCDE FG
6
0
1
2
I J K LMA BC
2
0
5
I J ABCDE F
5
0
2
3
I J K L ABCD
3
0
4
I J KABCDE
4
0
3
4
I J KABCDE
4
0
3
I J K L ABCD
3
0
4
5
I J ABCDE F
5
0
2
I J K LMA BC
2
0
5
6
I ABCDE FG
6
0
1
I J K LMNA B
1
0
6
7
A BCDE FGH
7
0
0
I J K LMNOA
0
0
7
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
User’s Manual U15509EJ2V0UM
337
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDR
SDR
Store Doubleword Right
31
26 25
21 20
SDR
101101
16 15
base
0
rt
offset
Format:
SDR rt, offset (base)
Description:
This instruction can be used with the SDL instruction to store the contents of a register into eight consecutive
bytes of memory, when the bytes cross a doubleword boundary. SDR stores the right portion of the register into
the appropriate part of the low-order doubleword in memory; SDL stores the left portion of the register into the
appropriate part of the high-order doubleword.
The SDR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that may specify an arbitrary byte. It alters only the doubleword in memory that contains the specified
starting byte, with the low-order part of general register rt. From one to eight bytes will be stored, depending on
the starting byte specified.
Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in
memory; then it copies bytes from register to memory until it reaches the high-order byte of the doubleword in
memory.
Memory (little endian)
address 8
15 14 13 12 11 10
9
8
address 0
7
2
1
0
address 8
15 14 13 12 11 10
9
8
address 0
B
H
0
6
5
4
3
before
Register
A
B
SDR $24, 1 ($0)
338
C
D
E
F
G
after
User’s Manual U15509EJ2V0UM
C
D
E
F
G
H
$24
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDR
Store Doubleword Right
(Continued)
SDR
No address error exceptions due to alignment are possible.
This operation is defined for the VR4100 Series operating in 64-bit mode or in 32-bit Kernel mode. Execution of
this instruction in 32-bit User or Supervisor mode causes a reserved instruction exception.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
data ← GPR [rt]63 – 8*byte || 0
3
8*byte
StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…3 || 0
3
endif
byte ← vAddr2…0 xor BigEndianCPU
data ← GPR [rt]63 – 8*byte || 0
3
8*byte
StoreMemory (uncached, DOUBLEWORD-byte, data, pAddr, vAddr, DATA)
User’s Manual U15509EJ2V0UM
339
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SDR
SDR
Store Doubleword Right
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of SDR is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
A BCDE FGH
7
0
0
I J K LMNOA
0
7
0
1
BCDE FGHP
6
1
0
I J K LMNA B
1
6
0
2
CDE FGHOP
5
2
0
I J K LMA BC
2
5
0
3
DE FGHNOP
4
3
0
I J K L ABCD
3
4
0
4
E FGHMNOP
3
4
0
I J KABCDE
4
3
0
5
F GH LMNOP
2
5
0
I J ABCDE F
5
2
0
6
GHK LMNOP
1
6
0
I ABCDE FG
6
1
0
7
H J K LMNOP
0
7
0
A BCDE FGH
7
0
0
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
Reserved instruction exception (VR4100 Series in 32-bit User mode, VR4100 Series in 32-bit Supervisor mode)
340
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SH
31
26 25
SH
101001
Store Halfword
SH
16 15
0
21 20
base
rt
offset
Format:
SH rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form an effective
address. The least-significant halfword of register rt is stored at the effective address. If the least-significant bit
of the effective address is non-zero, an address error exception occurs.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
data ← GPR [rt]63 – 8*byte…0 || 0
8*byte
StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0))
2
byte ← vAddr2…0 xor (BigEndianCPU || 0)
data ← GPR [rt]63 – 8*byte…0 || 0
8*byte
StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
341
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLL
SLL
Shift Left Logical
31
26 25
SPECIAL
000000
21 20
16 15
0
00000
rt
11 10
rd
6 5
sa
0
SLL
000000
Format:
SLL rd, rt, sa
Description:
The contents of general register rt are shifted left by sa bits, inserting zeros into the low-order bits. The result is
placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register. It is sign extended for
all shift amounts, including zero; SLL with zero shift amount truncates a 64-bit value to 32 bits and then sign
extends this 32-bit value. SLL, unlike nearly all other word operations, does not require an operand to be a
properly sign-extended word value to produce a valid sign-extended word result.
Operation:
32
T:
64
T:
GPR [rd] ← GPR [rt] 31 – sa…0 || 0
s ← 0 || sa
temp ← GPR [rt] 31 – s…0) || 0
GPR [rd] ← (temp31)
32
sa
s
|| temp
Exceptions:
None
Caution SLL with a shift amount of zero may be treated as a NOP by some assemblers, at some
optimization levels.
If using SLL with a zero shift to truncate 64-bit values, check the
assembler you are using.
342
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLLV
SLLV
Shift Left Logical Variable
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
SLLV
000100
Format:
SLLV rd, rt, rs
Description:
The contents of general register rt are shifted left the number of bits specified by the low-order five bits contained
in general register rs, inserting zeros into the low-order bits. The result is placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register. It is sign extended for
all shift amounts, including zero; SLLV with zero shift amount truncates a 64-bit value to 32 bits and then sign
extends this 32-bit value. SLLV, unlike nearly all other word operations, does not require an operand to be a
properly sign-extended word value to produce a valid sign-extended word result.
Operation:
32
T:
s ← GPR [rs] 4…0
GPR [rd] ← GPR [rt] 31 – s…0 || 0
64
T:
s ← 0 || GPR [rs] 4…0
temp ← GPR [rt] 31 – s…0) || 0
GPR [rd] ← (temp31)
32
s
s
|| temp
Exceptions:
None
Caution SLLV with a shift amount of zero may be treated as a NOP by some assemblers, at some
optimization levels.
If using SLLV with a zero shift to truncate 64-bit values, check the
assembler you are using.
User’s Manual U15509EJ2V0UM
343
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLT
SLT
Set on Less Than
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
SLT
101010
Format:
SLT rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs. Considering both
quantities as signed integers, if the contents of general register rs are less than the contents of general register
rt, the result is set to one; otherwise the result is set to zero. The result is placed into general register rd.
No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction
used during the comparison overflows.
Operation:
32
T:
if GPR [rs] < GPR [rt] then
GPR [rd] ← 0
31
GPR [rd] ← 0
32
|| 1
else
endif
64
T:
if GPR [rs] < GPR [rt] then
GPR [rd] ← 0
63
GPR [rd] ← 0
64
|| 1
else
endif
Exceptions:
None
344
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLTI
Set on Less Than Immediate
31
26 25
SLTI
001010
21 20
16 15
rs
rt
SLTI
0
immediate
Format:
SLTI rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and subtracted from the contents of general register rs. Considering both
quantities as signed integers, if the contents of general register rs are less than the sign-extended immediate, the
result is set to 1; otherwise the result is set to 0. The result is placed into general register rt.
No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction
used during the comparison overflows.
Operation:
32
T:
if GPR [rs] < (immediate15)
GPR [rt] ← 0
31
GPR [rt] ← 0
32
16
|| immediate15…0 then
48
|| immediate15…0 then
|| 1
else
endif
64
T:
if GPR [rs] < (immediate15)
GPR [rt] ← 0
63
GPR [rt] ← 0
64
|| 1
else
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
345
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLTIU
Set on Less Than Immediate Unsigned
31
26 25
SLTIU
001011
21 20
rs
16 15
rt
SLTIU
0
immediate
Format:
SLTIU rt, rs, immediate
Description:
The 16-bit immediate is sign-extended and subtracted from the contents of general register rs. Considering both
quantities as unsigned integers, if the contents of general register rs are less than the sign-extended immediate,
the result is set to 1; otherwise the result is set to 0. The result is placed into general register rt.
No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction
used during the comparison overflows.
Operation:
32
T:
if (0 || GPR [rs]) < (0 || (immediate15)
GPR [rt] ← 0
31
GPR [rt] ← 0
32
16
|| immediate15…0) then
48
|| immediate15…0) then
|| 1
else
endif
64
T:
if (0 || GPR [rs]) < (0 || (immediate15)
GPR [rt] ← 0
63
GPR [rt] ← 0
64
|| 1
else
endif
Exceptions:
None
346
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SLTU
SLTU
Set on Less Than Unsigned
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
SLTU
101011
Format:
SLTU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs. Considering both
quantities as unsigned integers, if the contents of general register rs are less than the contents of general
register rt, the result is set to 1; otherwise the result is set to 0. The result is placed into general register rd.
No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction
used during the comparison overflows.
Operation:
32
T:
if (0 || GPR [rs]) < (0 || GPR [rt]) then
GPR [rd] ← 0
31
GPR [rd] ← 0
32
|| 1
else
endif
64
T:
if (0 || GPR [rs]) < (0 || GPR [rt]) then
GPR [rd] ← 0
63
GPR [rd] ← 0
64
|| 1
else
endif
Exceptions:
None
User’s Manual U15509EJ2V0UM
347
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SRA
SRA
Shift Right Arithmetic
31
26 25
SPECIAL
000000
21 20
16 15
0
00000
rt
11 10
rd
6 5
sa
0
SRA
000011
Format:
SRA rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, sign-extending the high-order bits. The result is
placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
Restrictions:
If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
64
T:
GPR [rd] ← (GPR [rt] 31)
s ← 0 || sa
sa
|| GPR [rt] 31…sa
s
temp ← (GPR [rt] 31) || GPR [rt] 31…s
GPR [rd] ← (temp31)
32
|| temp
Exceptions:
None
348
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SRAV
SRAV
Shift Right Arithmetic Variable
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
SRAV
000111
Format:
SRAV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the low-order five bits of
general register rs, sign-extending the high-order bits. The result is placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
Restrictions:
If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
s ← GPR [rs] 4…0
s
GPR [rd] ← (GPR [rt] 31) || GPR [rt] 31…s
64
T:
s ← GPR [rs] 4…0
s
temp ← (GPR [rt] 31) || GPR [rt] 31…s
GPR [rd] ← (temp31)
32
|| temp
Exceptions:
None
User’s Manual U15509EJ2V0UM
349
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SRL
SRL
Shift Right Logical
31
26 25
SPECIAL
000000
21 20
16 15
0
00000
rt
11 10
rd
6 5
sa
0
SRL
000010
Format:
SRL rd, rt, sa
Description:
The contents of general register rt are shifted right by sa bits, inserting zeros into the high-order bits. The result
is placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
Restrictions:
If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
64
T:
GPR [rd] ← 0
sa
|| GPR [rt] 31…sa
s ← 0 || sa
s
temp ← 0 || GPR [rt] 31…s
GPR [rd] ← (temp31)
32
|| temp
Exceptions:
None
350
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SRLV
SRLV
Shift Right Logical Variable
31
26 25
21 20
SPECIAL
000000
16 15
rs
rt
11 10
rd
6 5
0
00000
0
SRLV
000110
Format:
SRLV rd, rt, rs
Description:
The contents of general register rt are shifted right by the number of bits specified by the low-order five bits of
general register rs, inserting zeros into the high-order bits. The result is placed in register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
Restrictions:
If the value of general register rt is not a sign-extended 32-bit value (bits 63 to 31 have the same value), the
result of this operation will be undefined.
Operation:
32
T:
s ← GPR [rs] 4…0
s
GPR [rd] ← 0 || GPR [rt] 31…s
64
T:
s ← GPR [rs] 4…0
s
temp ← 0 || GPR [rt] 31…s
GPR [rd] ← (temp31)
32
|| temp
Exceptions:
None
User’s Manual U15509EJ2V0UM
351
CHAPTER 9 CPU INSTRUCTION SET DETAILS
STANDBY
31
Standby
26 25 24
COP0
010000
CO
1
STANDBY
6 5
0
0000000000000000000
0
STANDBY
100001
Format:
STANDBY
Description:
STANDBY instruction starts mode transition from Fullspeed mode to Standby mode.
When the STANDBY instruction finishes the WB stage, the VR4100 Series wait by the SysAD bus is idle state,
and then fix the internal clocks to high level, thus freezing the pipeline. In the VR4131 and VR4181A, IE bit of the
Status register in the CP0 is also set to 1.
The PLL, Timer/Interrupt clocks and the internal bus clocks (TClock and MasterOut) will continue to run.
Once the VR4100 Series is in Standby mode, any interrupt, including the internally generated timer interrupt, NMI,
Soft Reset, and Cold Reset will cause the VR4100 Series to exit Standby mode and to enter Fullspeed mode.
Operation:
32, 64 T:
T+1: Standby operation ( )
Exceptions:
Coprocessor unusable exception
Remark
Refer to Hardware User's Manual of each product for details about the operation of the peripheral
units at mode transition.
Program examples to enter Standby mode are shown below.
• For VR4121, VR4122, and VR4181
# Insert process to mask interrupts in the Interrupt Control Unit (ICU)
…
# Insert process for entering Standby mode
…
# Insert process to enable interrupts in the ICU
STANDBY
• For VR4131 and VR4181A
MFC0 t5, psr
ORI
t5, t5, 1
XORI
t5, t5, 1
MTC0
t5, psr
# Insert process for entering Standby mode
STANDBY
352
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SUB
SUB
Subtract
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
SUB
100010
Format:
SUB rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to form a result. The
result is placed into general register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
An integer overflow exception takes place if the carries out of bits 30 and 31 differ (2’s complement overflow).
The destination register rd is not modified when an integer overflow exception occurs.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T:
64
T:
GPR [rd] ← GPR [rs] – GPR [rt]
temp ← GPR [rs] – GPR [rt]
GPR [rd] ← (temp31)
32
|| temp31…0
Exceptions:
Integer overflow exception
User’s Manual U15509EJ2V0UM
353
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SUBU
SUBU
Subtract Unsigned
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
11 10
rd
6 5
0
00000
0
SUBU
100011
Format:
SUBU rd, rs, rt
Description:
The contents of general register rt are subtracted from the contents of general register rs to form a result. The
result is placed into general register rd.
In 64-bit mode, the 32-bit result is sign-extended when placed in the destination register.
The only difference between this instruction and the SUB instruction is that SUBU never traps on overflow. No
integer overflow exception occurs under any circumstances.
Restrictions:
If the value of either general register rt or general register rs is not a sign-extended 32-bit value (bits 63 to 31
have the same value), the result of this operation will be undefined.
Operation:
32
T:
64
T:
GPR [rd] ← GPR [rs] – GPR [rt]
temp ← GPR [rs] – GPR [rt]
GPR [rd] ← (temp31)
32
|| temp31…0
Exceptions:
None
354
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SUSPEND
31
Suspend
26 25 24
COP0
010000
CO
1
SUSPEND
6 5
0
0000000000000000000
0
SUSPEND
100010
Format:
SUSPEND
Description:
SUSPEND instruction starts mode transition from Fullspeed mode to Suspend mode.
When the SUSPEND instruction finishes the WB stage, the VR4100 Series wait by the SysAD bus is idle state,
and then fix the internal clocks including the TClock to high level, thus freezing the pipeline. In the VR4131 and
VR4181A, IE bit of the Status register in the CP0 is also set to 1.
The PLL, Timer/Interrupt clocks and MasterOut, will continue to run.
Once the VR4100 Series is in Suspend mode, any interrupt, including the internally generated timer interrupt,
NMI, Soft Reset and Cold Reset will cause the VR4100 Series to exit Suspend mode and to enter Fullspeed
mode.
Operation:
32, 64 T:
T+1: Suspend Operation ( )
Exceptions:
Coprocessor unusable exception
Remark
Refer to Hardware User's Manual of each product for details about the operation of the peripheral
units at mode transition.
Program examples to enter Suspend mode are shown below.
• For VR4121, VR4122, and VR4181
# Insert process to mask interrupts in the Interrupt Control Unit (ICU)
…
# Insert process for entering Suspend mode
…
# Insert process to enable interrupts in the ICU
SUSPEND
• For VR4131 and VR4181A
MFC0 t5, psr
ORI
t5, t5, 1
XORI
t5, t5, 1
MTC0
t5, psr
# Insert process for entering Suspend mode
SUSPEND
User’s Manual U15509EJ2V0UM
355
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SW
31
26 25
SW
101011
Store Word
SW
16 15
0
21 20
base
rt
offset
Format:
SW rt, offset (base)
Description:
The 16-bit offset is sign-extended and added to the contents of general register base to form a virtual address.
The contents of general register rt are stored at the memory location specified by the effective address.
If either of the two least-significant bits of the effective address are non-zero, an address error exception occurs.
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
data ← GPR [rt]63 – 8*byte…0 || 0
8*byte
StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
2
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor (ReverseEndian || 0 ))
2
byte ← vAddr2…0 xor (BigEndianCPU || 0 )
data ← GPR [rt]63 – 8*byte…0 || 0
8*byte
StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
356
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWL
31
SWL
Store Word Left
26 25
SWL
101010
21 20
16 15
base
0
rt
offset
Format:
SWL rt, offset (base)
Description:
This instruction can be used with the SWR instruction to store the contents of a register into four consecutive
bytes of memory, when the bytes cross a word boundary. SWL stores the left portion of the register into the
appropriate part of the high-order word in memory; SWR stores the right portion of the register into the
appropriate part of the low-order word.
The SWL instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that may specify an arbitrary byte. It alters only the word in memory that contains the specified starting
byte, with the high-order part of general register rt. From one to four bytes will be stored, depending on the
starting byte specified.
Conceptually, it starts at the most-significant (leftmost) byte of the register and copies it to the specified byte in
memory; then it copies bytes from register to memory until it reaches the low-order byte of the word in memory.
No address error exceptions due to alignment are possible.
Memory (little endian)
address 4
7
6
5
4
address 0
3
2
1
0
before
Register
A
B
C
D
$24
SWL $24, 4 ($0)
address 4
7
6
5
A
address 0
3
2
1
0
after
User’s Manual U15509EJ2V0UM
357
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWL
SWL
Store Word Left
(Continued)
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
if (vAddr2 xor BigEndianCPU) = 0 then
data ← 0
32
data ← 0
24 – 8*byte
|| 0
24 – 8*byte
|| GPR [rt]31…24 – 8*byte
else
|| GPR [rt]31…24 – 8*byte || 0
32
endif
StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 0 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
if (vAddr2 xor BigEndianCPU) = 0 then
data ← 0
32
data ← 0
24 – 8*byte
|| 0
24 – 8*byte
|| GPR [rt]31…24 – 8*byte
else
|| GPR [rt]31…24 – 8*byte || 0
32
endif
StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
358
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWL
SWL
Store Word Left
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of SWL is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
type
Note
offset
LEM
BEM
0
I J K LMNOE
0
0
7
E FGHMNOP
3
4
0
1
I J K LMNE F
1
0
6
I E F GMNOP
2
4
1
2
I J K LME FG
2
0
5
I J E FMNOP
1
4
2
3
I J K L E FGH
3
0
4
I J K EMNOP
0
4
3
4
I J K EMNOP
0
4
3
I J K L E FGH
3
0
4
5
I J E FMNOP
1
4
2
I J K LME FG
2
0
5
6
I E F GMNOP
2
4
1
I J K LMNE F
1
0
6
7
E FGHMNOP
3
4
0
I J K LMNOE
0
0
7
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
User’s Manual U15509EJ2V0UM
359
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWR
31
SWR
Store Word Right
26 25
SWR
101110
21 20
16 15
base
0
rt
offset
Format:
SWR rt, offset (base)
Description:
This instruction can be used with the SWL instruction to store the contents of a register into four consecutive
bytes of memory, when the bytes cross a word boundary. SWR stores the right portion of the register into the
appropriate part of the low-order word in memory; SWL stores the left portion of the register into the appropriate
part of the high-order word.
The SWR instruction adds its sign-extended 16-bit offset to the contents of general register base to form a virtual
address that may specify an arbitrary byte. It alters only the word in memory that contains the specified starting
byte, with low-order part of general register rt. From one to four bytes will be stored, depending on the starting
byte specified.
Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in
memory; then copies bytes from register to memory until it reaches the high-order byte of the word in memory.
No address error exceptions due to alignment are possible.
Memory (little endian)
address 4
7
6
5
4
address 0
3
2
1
0
before
Register
A
SWR $24, 1 ($0)
address 4
7
6
5
4
address 0
B
C
D
0
360
after
User’s Manual U15509EJ2V0UM
B
C
D
$24
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWR
SWR
Store Word Right
(Continued)
Operation:
32
T:
vAddr ← ((offset15)
16
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
if (vAddr2 xor BigEndianCPU) = 0 then
data ← 0
32
|| GPR [rt]31 – 8*byte…0 || 0
8*byte
else
data ← GPR [rt]31 – 8*byte || 0
8*byte
|| 0
32
endif
StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
64
T:
vAddr ← ((offset15)
48
|| offset15…0) + GPR [base]
(pAddr, uncached) ← AddressTranslation (vAddr, DATA)
3
pAddr ← pAddrPSIZE – 1…3 || (pAddr2…0 xor ReverseEndian )
if BigEndianMem = 1 then
pAddr ← pAddrPSIZE – 1…2 || 0
2
endif
byte ← vAddr1…0 xor BigEndianCPU
2
if (vAddr2 xor BigEndianCPU) = 0 then
data ← 0
32
|| GPR [rt]31 – 8*byte…0 || 0
8*byte
else
data ← GPR [rt]31 – 8*byte || 0
8*byte
|| 0
32
endif
StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
User’s Manual U15509EJ2V0UM
361
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SWR
SWR
Store Word Right
(Continued)
Given a doubleword in a register and a doubleword in memory, the operation of SWR is as follows:
Register
A
B
C
D
E
F
G
H
Memory
I
J
K
L
M
N
O
P
vAddr2..0
BigEndianCPU = 1 Note
BigEndianCPU = 0
destination
type
offset
LEM
destination
BEM
offset
LEM
BEM
0
I J K L E FGH
3
0
4
H J K LMNOP
0
7
0
1
I J K L FGHP
2
1
4
GHK LMNOP
1
6
0
2
I J K LGHOP
1
2
4
F GH LMNOP
2
5
0
3
I J K L HNOP
0
3
4
E FGHMNOP
3
4
0
4
E FGHMNOP
3
4
0
I J K L HNOP
0
3
4
5
F GH LMNOP
2
5
0
I J K LGHOP
1
2
4
6
GHK LMNOP
1
6
0
I J K L FGHP
2
1
4
7
H J K LMNOP
0
7
0
I J K L E FGH
3
0
4
Note For VR4131 only
Remark
type:
access type (see Figure 2-2) sent to memory
offset: pAddr2..0 sent to memory
LEM:
Little-endian memory (BigEndianMem = 0)
BEM:
Big-endian memory (BigEndianMem = 1)
Exceptions:
TLB refill exception
TLB invalid exception
TLB modified exception
Bus error exception
Address error exception
362
type
Note
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SYNC
SYNC
Synchronize
31
26 25
SPECIAL
000000
6 5
0
00000000000000000000
0
SYNC
001111
Format:
SYNC
Description:
The SYNC instruction is executed as a NOP on the VR4100 Series. This operation is compatible with code
compiled for the VR4000.
This instruction is defined for the purpose of maintaining software compatibility with the VR4000 and VR4400.
Operation:
32, 64 T:
SyncOperation ( )
Exceptions:
None
User’s Manual U15509EJ2V0UM
363
CHAPTER 9 CPU INSTRUCTION SET DETAILS
SYSCALL
31
System Call
26 25
SPECIAL
000000
SYSCALL
6 5
code
0
SYSCALL
001100
Format:
SYSCALL
Description:
A system call exception occurs by executing this instruction, immediately and unconditionally transferring control
to the exception handler.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
SystemCallException
Exceptions:
System call exception
364
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TEQ
TEQ
Trap if Equal
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
code
0
TEQ
110100
Format:
TEQ rs, rt
Description:
The contents of general register rt are compared to general register rs. If the contents of general register rs are
equal to the contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if GPR [rs] = GPR [rt] then
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
365
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TEQI
TEQI
Trap if Equal Immediate
31
26 25
REGIMM
000001
21 20
rs
16 15
TEQI
01100
0
immediate
Format:
TEQI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. If the contents of
general register rs are equal to the sign-extended immediate, a trap exception occurs.
Operation:
32
T:
if GPR [rs] = (immediate15)
16
|| immediate15…0 then
48
|| immediate15…0 then
TrapException
endif
64
T:
if GPR [rs] = (immediate15)
TrapException
endif
Exceptions:
Trap exception
366
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TGE
TGE
Trap if Greater Than or Equal
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
6 5
code
0
TGE
110000
Format:
TGE rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both
quantities as signed integers, if the contents of general register rs are greater than or equal to the contents of
general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if GPR [rs] ≥ GPR [rt] then
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
367
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TGEI
Trap if Greater Than or Equal Immediate
31
26 25
REGIMM
000001
21 20
rs
16 15
TGEI
01000
TGEI
0
immediate
Format:
TGEI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both
quantities as signed integers, if the contents of general register rs are greater than or equal to the sign-extended
immediate, a trap exception occurs.
Operation:
32
T:
if GPR [rs] ≥ (immediate15)
16
|| immediate15…0 then
48
|| immediate15…0 then
TrapException
endif
64
T:
if GPR [rs] ≥ (immediate15)
TrapException
endif
Exceptions:
Trap exception
368
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TGEIU
Trap if Greater Than or Equal Immediate Unsigned
31
26 25
REGIMM
000001
21 20
rs
16 15
TGEIU
01001
TGEIU
0
immediate
Format:
TGEIU rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both
quantities as unsigned integers, if the contents of general register rs are greater than or equal to the signextended immediate, a trap exception occurs.
Operation:
32
T:
if (0 || GPR [rs]) ≥ (0 || (immediate15)
16
|| immediate15…0) then
48
|| immediate15…0) then
TrapException
endif
64
T:
if (0 || GPR [rs]) ≥ (0 || (immediate15)
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
369
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TGEU
TGEU
Trap if Greater Than or Equal Unsigned
31
26 25
SPECIAL
000000
21 20
16 15
rs
rt
6 5
code
0
TGEU
110001
Format:
TGEU rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both
quantities as unsigned integers, if the contents of general register rs are greater than or equal to the contents of
general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if (0 || GPR [rs]) ≥ (0 || GPR [rt]) then
TrapException
endif
Exceptions:
Trap exception
370
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLBP
TLBP
Probe TLB for Matching Entry
31
26 25 24
COP0
010000
6 5
0
0000000000000000000
CO
1
0
TLBP
001000
Format:
TLBP
Description:
The Index register is loaded with the address of the TLB entry whose contents match the contents of the EntryHi
register. If no TLB entry matches, the high-order bit of the Index register is set.
The architecture does not specify the operation of memory references associated with the instruction
immediately after a TLBP instruction, nor is the operation specified if more than one TLB entry matches.
Operation:
32
T:
Index ← 1 || 0
25
|| Undefined
6
for i in 0…TLBEntries − 1
if (TLB [i]95…77 = EntryHi31…13) and (TLB [i]76 or (TLB [i]71…64 = EntryHi7…0)) then
Index ← 0
26
|| i5…0
endif
endfor
64
T:
Index ← 1 || 0
25
|| Undefined
6
for i in 0…TLBEntries − 1
if (TLB [i]167…141 and not (0
(EntryHi39…13 and not (0
15
15
|| TLB [i]216…205)) =
|| TLB [i]216…205)) and
(TLB [i]140 or (TLB [i]135…126 = EntryHi7…0)) then
Index ← 0
26
|| i5…0
endif
endfor
Exceptions:
Coprocessor unusable exception
User’s Manual U15509EJ2V0UM
371
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLBR
TLBR
Read Indexed TLB Entry
31
26 25 24
COP0
010000
CO
1
6 5
0
0000000000000000000
0
TLBR
000001
Format:
TLBR
Description:
The EntryHi and EntryLo registers are loaded with the contents of the TLB entry pointed at by the contents of the
Index register. The G bit (which controls ASID matching) read from the TLB is written into both of the EntryLo0
and EntryLo1 registers.
The operation is invalid (and the results are unspecified) if the contents of the Index register are greater than the
number of TLB entries in the processor.
Operation:
32
T:
PageMask ← TLB [Index5…0]127…96
EntryHi ← TLB [Index5…0]95…64 and not TLB [Index5…0]127…96
EntryLo1 ← TLB [Index5…0]63…33 || TLB [Index5…0]76
EntryLo0 ← TLB [Index5…0]31…1 || TLB [Index5…0]76
64
T:
PageMask ← TLB [Index5…0]255…192
EntryHi ← TLB [Index5…0]191…128 and not TLB [Index5…0]255…192
EntryLo1 ← TLB [Index5…0]127…65 || TLB [Index5…0]140
EntryLo0 ← TLB [Index5…0]63…1 || TLB [Index5…0]140
Exceptions:
Coprocessor unusable exception
372
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLBWI
TLBWI
Write Indexed TLB Entry
31
26 25 24
COP0
010000
CO
1
6 5
0
0000000000000000000
0
TLBWI
000010
Format:
TLBWI
Description:
The TLB entry pointed at by the contents of the Index register is loaded with the contents of the EntryHi and
EntryLo registers. The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1
registers.
The operation is invalid (and the results are unspecified) if the contents of the Index register are greater than the
number of TLB entries in the processor.
Operation:
32, 64 T:
TLB [Index5…0] ← PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
Exceptions:
Coprocessor unusable exception
User’s Manual U15509EJ2V0UM
373
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLBWR
TLBWR
Write Random TLB Entry
31
26 25 24
COP0
010000
CO
1
6 5
0
0000000000000000000
0
TLBWR
000110
Format:
TLBWR
Description:
The TLB entry pointed at by the contents of the Random register is loaded with the contents of the EntryHi and
EntryLo registers.
The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers.
Operation:
32, 64 T:
TLB [Random5…0] ← PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
Exceptions:
Coprocessor unusable exception
374
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLT
TLT
Trap if Less Than
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
code
0
TLT
110010
Format:
TLT rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both
quantities as signed integers, if the contents of general register rs are less than the contents of general register
rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if GPR [rs] < GPR [rt] then
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
375
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLTI
Trap if Less Than Immediate
31
26 25
REGIMM
000001
21 20
rs
16 15
TLTI
01010
TLTI
0
immediate
Format:
TLTI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both
quantities as signed integers, if the contents of general register rs are less than the sign-extended immediate, a
trap exception occurs.
Operation:
32
T:
if GPR [rs] < (immediate15)
16
|| immediate15…0 then
48
|| immediate15…0 then
TrapException
endif
64
T:
if GPR [rs] < (immediate15)
TrapException
endif
Exceptions:
Trap exception
376
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLTIU
Trap if Less Than Immediate Unsigned
31
26 25
REGIMM
000001
21 20
rs
16 15
TLTIU
01011
TLTIU
0
immediate
Format:
TLTIU rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. Considering both
quantities as unsigned integers, if the contents of general register rs are less than the sign-extended immediate,
a trap exception occurs.
Operation:
32
T:
if (0 || GPR [rs]) < (0 || (immediate15)
16
|| immediate15…0) then
48
|| immediate15…0) then
TrapException
endif
64
T:
if (0 || GPR [rs]) < (0 || (immediate15)
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
377
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TLTU
TLTU
Trap if Less Than Unsigned
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
code
0
TLTU
110011
Format:
TLTU rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs.
Considering both
quantities as unsigned integers, if the contents of general register rs are less than the contents of general
register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if (0 || GPR [rs]) < (0 || GPR [rt]) then
TrapException
endif
Exceptions:
Trap exception
378
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TNE
TNE
Trap if Not Equal
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
6 5
code
0
TNE
110110
Format:
TNE rs, rt
Description:
The contents of general register rt are compared to the contents of general register rs. If the contents of general
register rs are not equal to the contents of general register rt, a trap exception occurs.
The code field is available for use as software parameters, but is retrieved by the exception handler only by
loading the contents of the memory word containing the instruction.
Operation:
32, 64 T:
if GPR [rs] ≠ GPR [rt] then
TrapException
endif
Exceptions:
Trap exception
User’s Manual U15509EJ2V0UM
379
CHAPTER 9 CPU INSTRUCTION SET DETAILS
TNEI
Trap if Not Equal Immediate
31
26 25
REGIMM
000001
21 20
rs
16 15
TNEI
01110
TNEI
0
immediate
Format:
TNEI rs, immediate
Description:
The 16-bit immediate is sign-extended and compared to the contents of general register rs. If the contents of
general register rs are not equal to the sign-extended immediate, a trap exception occurs.
Operation:
32
T:
if GPR [rs] ≠ (immediate15)
16
|| immediate15…0 then
48
|| immediate15…0 then
TrapException
endif
64
T:
if GPR [rs] ≠ (immediate15)
TrapException
endif
Exceptions:
Trap exception
380
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
XOR
XOR
Exclusive OR
31
26 25
SPECIAL
000000
21 20
rs
16 15
rt
11 10
rd
6 5
0
00000
0
XOR
100110
Format:
XOR rd, rs, rt
Description:
The contents of general register rs are combined with the contents of general register rt in a bit-wise logical
exclusive OR operation. The result is placed into general register rd.
Operation:
32, 64 T:
GPR [rd] ← GPR [rs] xor GPR [rt]
Exceptions:
None
User’s Manual U15509EJ2V0UM
381
CHAPTER 9 CPU INSTRUCTION SET DETAILS
XORI
XORI
Exclusive OR Immediate
31
26 25
XORI
001110
21 20
16 15
rs
rt
0
immediate
Format:
XORI rt, rs, immediate
Description:
The 16-bit immediate is zero-extended and combined with the contents of general register rs in a bit-wise logical
exclusive OR operation. The result is placed into general register rt.
Operation:
32
T:
GPR [rt] ← GPR [rs] xor (0
16
|| immediate)
64
T:
GPR [rt] ← GPR [rs] xor (0
48
|| immediate)
Exceptions:
None
382
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
9.4 CPU Instruction Opcode Bit Encoding
The remainder of this chapter presents the opcode bit encoding for the CPU instruction set (ISA and extensions),
as implemented by the VR4100 Series. Figure 9-1 lists the VR4100 Series Opcode Bit Encoding.
Figure 9-1. CPU Instruction Opcode Bit Encoding (1/3)
Opcode
28...26
31...29
0
1
2
3
4
5
6
7
0
SPECIAL
REGIMM
J
JAL
BEQ
BNE
BLEZ
BGTZ
1
ADDI
ADDIU
SLTI
SLTIU
ANDI
ORI
XORI
LUI
2
COP0
π
π
*
BEQL
BNEL
BLEZL
BGTZL
3
DADDIε
DADDIUε
LDLε
LDRε
*
JALXθ
*
*
4
LB
LH
LWL
LW
LBU
LHU
LWR
LWUε
5
SB
SH
SWL
SW
SDLε
SDRε
SWR
CACHEδ
6
*
π
π
*
*
π
π
LDε
7
*
π
π
*
*
π
π
SDε
SPECIAL function
2...0
5...3
0
1
2
3
4
5
6
7
0
SLL
*
SRL
SRA
SLLV
*
SRLV
SRAV
1
JR
JALR
*
*
SYSCALL
BREAK
*
SYNC
2
MFHI
MTHI
MFLO
MTLO
DSLLVε
*
DSRLVε
DSRAVε
3
MULT
MULTU
DIV
DIVU
DMULTε
DMULTUε
DDIVε
DDIVUε
4
ADD
ADDU
SUB
SUBU
AND
OR
XOR
NOR
5
Note 1
Note 2
SLT
SLTU
DADDε
DADDUε
DSUBε
DSUBUε
6
TGE
TGEU
TLT
TLTU
TEQ
*
TNE
*
7
DSLLε
*
DSRLε
DSRAε
DSLL32ε
*
DSRL32ε
DSRA32ε
REGIMM rt
18...16
20...19
0
1
2
3
4
5
6
7
0
BLTZ
BGEZ
BLTZL
BGEZL
*
*
*
*
1
TGEI
TGEIU
TLTI
TLTIU
TEQI
*
TNEI
*
2
BLTZAL
BGEZAL
BLTZALL
BGEZALL
*
*
*
*
3
*
*
*
*
*
*
*
*
Notes 1. VR4121, VR4122, VR4131, VR4181A … MACC
VR4181 … MADD16
2. VR4121, VR4122, VR4131, VR4181A … DMACC
VR4181 … DMADD16
User’s Manual U15509EJ2V0UM
383
CHAPTER 9 CPU INSTRUCTION SET DETAILS
Figure 9-1. CPU Instruction Opcode Bit Encoding (2/3)
COP0 rs
23...21
25…24
0
1
2
3
4
5
6
7
0
MF
DMFε
γ
γ
MT
DMTε
γ
γ
1
BC
γ
γ
γ
γ
γ
γ
γ
2
CO
3
COP0 rt
18...16
20...19
0
1
2
3
4
5
6
7
0
BCF
BCT
BCFL
BCTL
γ
γ
γ
γ
1
γ
γ
γ
γ
γ
γ
γ
γ
2
γ
γ
γ
γ
γ
γ
γ
γ
3
γ
γ
γ
γ
γ
γ
γ
γ
CP0 Function
2...0
384
5...3
0
1
2
3
4
5
6
7
0
φ
TLBR
TLBWI
φ
φ
φ
TLBWR
φ
1
TLBP
φ
φ
φ
φ
φ
φ
φ
2
ξ
φ
φ
φ
φ
φ
φ
φ
3
ERET χ
φ
φ
φ
φ
φ
φ
φ
4
φ
STANDBY
SUSPEND
HIBERNATE
φ
φ
φ
φ
5
φ
φ
φ
φ
φ
φ
φ
φ
6
φ
φ
φ
φ
φ
φ
φ
φ
7
φ
φ
φ
φ
φ
φ
φ
φ
User’s Manual U15509EJ2V0UM
CHAPTER 9 CPU INSTRUCTION SET DETAILS
Figure 9-1. CPU Instruction Opcode Bit Encoding (3/3)
Key:
* Operation codes marked with an asterisk cause reserved instruction exceptions in all current
implementations and are reserved for future versions of the architecture.
γ Operation codes marked with a gamma cause a reserved instruction exception. They are reserved
for future versions of the architecture.
δ Operation codes marked with a delta are valid only for processors conforming to MIPS III instruction
set or later with CP0 enabled, and cause a reserved instruction exception on other processors.
φ Operation codes marked with a phi are invalid but do not cause reserved instruction exceptions in
VR4100 Series implementations.
ξ Operation codes marked with a xi cause a reserved instruction exception on VR4100 Series
processors.
χ Operation codes marked with a chi are valid on processors conforming to MIPS III instruction set or
later only.
ε Operation codes marked with an epsilon are valid when the processor operating in 64-bit mode or
in 32-bit Kernel mode. These instructions will cause a reserved instruction exception if the
processor operates in 32-bit User or Supervisor mode.
π Operation codes marked with a pi are invalid and cause coprocessor unusable exception on
VR4100 Series processors.
θ Operation codes marked with a theta are valid when MIPS16 instruction execution is enabled, and
cause a reserved instruction exception when MIPS16 instruction execution is disabled.
User’s Manual U15509EJ2V0UM
385
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
This chapter describes the format of each MIPS16 instruction, and the format of the MIPS instructions that are
made by converting MIPS16 instructions in alphabetical order. For details of MIPS16 instruction conversion and
opcode, refer to CHAPTER 3 MIPS16 INSTRUCTION SET.
Caution
For some instructions, their format or syntax may become ineffective after they are converted to
a 32-bit instruction. For details of formats and syntax of 32-bit instructions, refer to CHAPTER 2
CPU INSTRUCTION SET SUMMARY and CHAPTER 9 CPU INSTRUCTION SET DETAILS.
386
User’s Manual U15509EJ2V0UM
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
ADDIU
Add Immediate Unsigned
(1/2)
15
ADDIU ry, rx, immediate
11 10
RRI-A
01000
31
26 25
ADDIU
001001
21 20
trx
rx
try
21 20
trx
11 10
trx
ADDIU
001001
sp
11101
0
immediate
0
immediate
11 10
16 15
sp
11101
8 7
sign
I8
01100
21 20
0
8 7
15
26 25
immediate
immediate
rx
16 15
ADDIU sp, immediate
31
ry
sign
ADDIU8
01001
ADDIU
001001
A
D
D
I
U
0
0
4 3
15
26 25
5 4 3
16 15
ADDIU rx, immediate
31
8 7
8 7
ADJSP
011
0
immediate
11 10
sign
User’s Manual U15509EJ2V0UM
3 2
immediate
0
0
000
387
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
ADDIU
Add Immediate Unsigned
(2/2)
15
ADDIU rx, pc, immediate
11 10
ADDIUSP
00001
31
26 25
21 20
ADDIU
001001
0
00000
rx
0
000000
trx
0
immediate
10 9
16 15
Note
8 7
2 1
0
0
00
immediate
Note Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the
MIPS16 PC-relative instructions.
15
ADDIU rx, sp, immediate
11 10
ADDIUSP
00000
31
26 25
ADDIU
001001
388
21 20
sp
11101
rx
0
immediate
10 9
16 15
trx
8 7
0
000000
User’s Manual U15509EJ2V0UM
2 1
immediate
0
00
0
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
ADDU
Add Unsigned
15
ADDU rz, rx, ry
11 10
RRR
11100
31
26 25
SPECIAL
000000
21 20
trx
16 15
rx
5 4
ry
11 10
2 1
rz
0
ADDU
01
6 5
0
00000
trz
try
8 7
0
ADDU
100001
AND
AND
15
AND, rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
16 15
try
8 7
rx
11 10
trx
User’s Manual U15509EJ2V0UM
5 4
ry
0
AND
01100
6 5
0
00000
0
AND
100100
389
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
B
Branch Unconditional
15
B immediate
11 10
0
B
00010
31
26 25
BEQ
000100
21 20
16 15
zero
00000
zero
00000
immediate
11 10
0
immediate Note
sign
Note In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit
MIPS mode which interprets the offset value as word aligned. The 32-bit branch instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the
semantics of the branch instructions.
BEQZ
Branch on Equal to Zero
15
BEQZ rx, immediate
11 10
BEQZ
00100
31
26 25
BEQ
000100
21 20
trx
16 15
zero
00000
8 7
rx
0
immediate
8 7
sign
0
immediate
Note
Note In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit
MIPS mode which interprets the offset value as word aligned. The 32-bit branch instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the
semantics of the branch instructions.
390
User’s Manual U15509EJ2V0UM
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
BNEZ
Branch on Not Equal to Zero
15
BNEZ rx, immediate
11 10
BNEZ
00101
31
26 25
BNE
000101
21 20
trx
8 7
rx
16 15
0
immediate
8 7
zero
00000
0
immediate Note
sign
Note In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit
MIPS mode which interprets the offset value as word aligned. The 32-bit branch instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the
semantics of the branch instructions.
BREAK
Breakpoint
BREAK immediate
15
11 10
RR
11101
31
26 25
SPECIAL
000000
8 7
rx Note 1
5 4
rxNote 1
0
BREAK
00101
6 5
code Note 2
0
BREAK
001101
Notes 1. The two register fields in the MIPS16 break instruction may be used as a 6-bit code
(immediate) field for software parameters. The 6-bit code can be retrieved by the
exception handler.
2. The 32-bit break instruction format shown above is provided here only to make the
description complete; it is not a valid 32-bit MIPS instruction. The code field is entirely
ignored by the pipeline, and it is not visible in any way to the software executing on the
processor.
User’s Manual U15509EJ2V0UM
391
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
BTEQZ
Branch on T Equal to Zero
15
BTEQZ immediate
11 10
I8
01100
31
26 25
BEQ
000100
21 20
t8
11000
BTEQZ
000
16 15
zero
00000
8 7
0
immediate
8 7
0
immediate Note
sign
Note In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit
MIPS mode which interprets the offset value as word aligned. The 32-bit branch instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the
semantics of the branch instructions.
BTNEZ
Branch on T Not Equal to Zero
15
BTNEZ immediate
11 10
I8
01100
31
26 25
BNE
000101
21 20
t8
11000
16 15
zero
00000
8 7
BTNEZ
001
0
immediate
8 7
sign
0
immediate Note
Note In MIPS16 mode, the branch offset is interpreted as halfword aligned. This is unlike 32-bit
MIPS mode which interprets the offset value as word aligned. The 32-bit branch instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 2 and Chapter 9 for a complete definition of the
semantics of the branch instructions.
392
User’s Manual U15509EJ2V0UM
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
CMP
Compare
15
CMP rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
trx
16 15
rx
5 4
0
CMP
01010
ry
11 10
6 5
0
00000
t8
11000
try
8 7
CMPI
0
XOR
100110
Compare Immediate
15
CMPI rx, immediate
11 10
CMPI
01110
31
26 25
XORI
001110
21 20
trx
16 15
t8
11000
8 7
rx
0
immediate
8 7
0
00000000
User’s Manual U15509EJ2V0UM
0
immediate
393
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DADDIU
Doubleword Add Immediate Unsigned
(1/2)
15
DADDIU ry, rx, immediate
11 10
RRI-A
01000
31
26 25
21 20
DADDIU
011001
trx
rx
4 3
try
sign
15
11 10
DADDIU
011001
21 20
try
26 25
DADDIU
011001
try
ry
21 20
11 10
0
immediate
8 7
DADDIU
PC
111
16 15
try
0
immediate
sign
15
0Note
00000
5 4
5 4
I64
11111
31
8 7
16 15
DADDIU ry, pc, immediate
0
immediate
DADD
IU5
101
I64
11111
26 25
5 4 3
0
D
A
D
ry
D immediate
I
U
1
16 15
DADDIU ry, immediate
31
8 7
5 4
ry
0
immediate
7 6
0
000000000
2 1
immediate
0
0
00
Note Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the
MIPS16 PC-relative instructions.
394
User’s Manual U15509EJ2V0UM
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DADDIU
Doubleword Add Immediate Unsigned
(2/2)
15
DADDIU ry, sp, immediate
11 10
26 25
DADDIU
011001
21 20
sp
11101
16 15
15
26 25
DADDIU
011001
21 20
sp
11101
11 10
16 15
sp
11101
2 1
0
00
8 7
0
DADJ
SP
011
immediate
11 10
3 2
sign
0
0
0
000
immediate
Doubleword Add Unsigned
15
DADDU rz, rx, ry
11 10
RRR
11100
26 25
SPECIAL
000000
immediate
immediate
DADDU
31
0
7 6
I64
11111
31
ry
0
000000000
try
DADDIU sp, immediate
5 4
DADDIU
SP
111
I64
11111
31
8 7
21 20
trx
16 15
try
8 7
rx
11 10
trz
User’s Manual U15509EJ2V0UM
5 4
ry
2 1
rz
DADDU
00
6 5
0
00000
0
0
DADDU
101101
395
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DDIV
Doubleword Divide
15
DDIV rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
8 7
rx
6 5
0
0000000000
try
0
DDIV
11110
ry
16 15
DDIVU
0
DDIV
011110
Doubleword Divide Unsigned
15
DDIVU rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
396
5 4
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
DDIVU
11111
6 5
0
0000000000
User’s Manual U15509EJ2V0UM
0
DDIVU
011111
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DIV
Divide
15
DIV rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
8 7
rx
5 4
DIV
11010
ry
16 15
6 5
0
0000000000
try
0
0
DIV
011010
DIVU
Divide Unsigned
15
DIVU rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
DIVU
11011
6 5
0
0000000000
User’s Manual U15509EJ2V0UM
0
DIVU
011011
397
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DMULT
Doubleword Multiply
15
DMULT rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
8 7
rx
6 5
0
0000000000
try
15
11 10
RR
11101
398
0
DMULT
011100
Doubleword Multiply Unsigned
DMULTU rx, ry
26 25
SPECIAL
000000
0
DMULT
11100
ry
16 15
DMULTU
31
5 4
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
DMULTU
11101
6 5
0
0000000000
User’s Manual U15509EJ2V0UM
0
DMULTU
011101
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DSLL
Doubleword Shift Left Logical
15
DSLL rx, ry, immediate
11 10
SHIFT
00110
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
rx
shamt
0
DSLL
01
0
DSLL
111000
sa
Doubleword Shift Left Logical Variable
15
DSLLV ry, rx
11 10
RR
11101
26 25
SPECIAL
000000
2 1
6 5
trx
try
5 4
ry
11 10
DSLLV
31
8 7
21 20
trx
16 15
try
8 7
rx
11 10
try
User’s Manual U15509EJ2V0UM
5 4
ry
0
DSLLV
10100
6 5
0
00000
0
DSLLV
010100
399
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DSRA
Doubleword Shift Right Arithmetic
15
DSRA ry, immediate
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
try
shamt
11 10
DSRA
10011
ry
0
DSRA
111011
sa
15
11 10
RR
11101
400
0
Doubleword Shift Right Arithmetic Variable
DSRAV ry, rx
26 25
SPECIAL
000000
5 4
6 5
try
DSRAV
31
8 7
21 20
trx
16 15
try
8 7
rx
11 10
try
User’s Manual U15509EJ2V0UM
5 4
ry
0
DSRAV
10111
6 5
0
00000
0
DSRAV
010111
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
DSRL
Doubleword Shift Right Logical
15
DSRL ry, immediate
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
try
shamt
11 10
DSRL
01000
ry
0
DSRL
111010
sa
15
11 10
RR
11101
26 25
SPECIAL
000000
21 20
trx
16 15
try
8 7
rx
5 4
6 5
0
00000
try
0
DSRLV
10110
ry
11 10
DSUBU
0
DSRLV
010110
Doubleword Subtract Unsigned
15
DSUBU rz, rx, ry
11 10
RRR
11100
26 25
SPECIAL
000000
0
Doubleword Shift Right Logical Variable
DSRLV ry, rx
31
5 4
6 5
try
DSRLV
31
8 7
21 20
trx
16 15
try
8 7
rx
11 10
trz
User’s Manual U15509EJ2V0UM
5 4
ry
2 1
rz
DSUBU
10
6 5
0
00000
0
0
DSUBU
101111
401
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
JAL
Jump and Link
15
JAL target
11 10 9
JAL
00011
0
0
5 4
immediate
20:16
0
immediate
25:21
15
0
immediate
15:0
31
26 25
0
JAL
000011
target address
JALR
Jump and Link Register
JALR ra, rx
15
11 10
RR
11101
26 25
31
SPECIAL
000000
402
21 20
trx
16 15
0
00000
8 7
rx
5 4
2
010
11 10
ra
11111
User’s Manual U15509EJ2V0UM
0
JALR
00000
6 5
0
00000
0
JALR
001001
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
JALX
Jump and Link Exchange
15
JALX target
11 10 9
JALX
00011
1
1
5 4
immediate
20:16
0
immediate
25:21
15
0
immediate
15:0
31
26 25
0
JALX
011101
target address
JR
Jump Register
15
JR rx
11 10
RR
11101
31
26 25
SPECIAL
000000
rx
0
000
15
26 25
11 10
0
000
21 20
ra
11111
JR
00000
0
JR
001000
0
000000000000000
trx
0
5 4
RR
11101
SPECIAL
000000
5 4
21 20
JR ra
31
8 7
8 7
5 4
1
001
0
JR
00000
5 4
0
000000000000000
User’s Manual U15509EJ2V0UM
0
JR
001000
403
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
LB
Load Byte
15
LB ry, offset (rx)
11 10
LB
10000
31
26 25
LB
100000
21 20
trx
8 7
rx
5 4
ry
16 15
5 4
0
immediate
LBU
Load Byte Unsigned
15
LBU ry, offset (rx)
11 10
LBU
10100
31
26 25
LBU
100100
404
immediate
0
00000000000
try
0
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
immediate
5 4
0
00000000000
User’s Manual U15509EJ2V0UM
0
immediate
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
LD
Load Doubleword
15
LD ry, offset (rx)
11 10
LD
00111
31
26 25
LD
110111
21 20
trx
rx
16 15
15
26 25
21 20
0Note
00000
LD
110111
11 10
immediate
3 2
8 7
5 4
ry
0
immediate
8 7
0
00000000
0
0
000
immediate
LDPC
100
16 15
try
0
8 7
I64
11111
31
5 4
ry
0
00000000
try
LD ry, offset (pc)
8 7
3 2
0
0
000
immediate
Note Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the
MIPS16 PC-relative instructions.
15
LD ry, offset (sp)
11 10
I64
11111
31
26 25
LD
110111
21 20
sp
11101
LDSP
000
16 15
try
8 7
5 4
ry
0
immediate
8 7
0
00000000
User’s Manual U15509EJ2V0UM
3 2
immediate
0
0
000
405
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
LH
Load Halfword
15
LH ry, offset (rx)
11 10
LH
10001
31
26 25
LH
100001
21 20
trx
8 7
rx
ry
16 15
immediate
0
0000000000
try
0
6 5
1
0
0
0
immediate
LHU
Load Halfword Unsigned
15
LHU ry, offset (rx)
11 10
LHU
10101
31
26 25
LHU
100101
21 20
trx
8 7
rx
16 15
0
immediate
6 5
0
0000000000
try
5 4
ry
LI
1
immediate
0
0
0
Load Immediate
15
LI rx, immediate
11 10
LI
01101
31
26 25
ORI
001101
406
5 4
21 20
zero
00000
16 15
trx
8 7
rx
0
immediate
8 7
0
00000000
User’s Manual U15509EJ2V0UM
0
immediate
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
LW
Load Word
15
LW ry, offset (rx)
11 10
LW
10011
31
26 25
LW
100011
21 20
trx
rx
16 15
11 10
LWPC
10110
21 20
0 Note
00000
LW
100011
immediate
2 1
8 7
0
immediate
10 9
0
000000
0
0
00
immediate
rx
16 15
trx
0
7 6
15
26 25
5 4
ry
0
000000000
try
LW rx, offset (pc)
31
8 7
2 1
0
0
00
immediate
Note Zeros are shown in the field of bits 21 to 25 as placeholders. The 32-bit PC-relative instruction
format shown above is provided here only to make the description complete; it is not a valid
32-bit MIPS instruction. See Chapter 3 for a complete definition of the semantics of the
MIPS16 PC-relative instructions.
15
LW rx, offset (sp)
11 10
LWSP
10010
31
26 25
LW
100011
21 20
sp
11101
rx
0
immediate
10 9
16 15
trx
8 7
0
000000
User’s Manual U15509EJ2V0UM
2 1
immediate
0
0
00
407
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
LWU
Load Word Unsigned
15
LWU ry, offset (rx)
11 10
LWU
10111
31
26 25
LWU
100111
21 20
trx
8 7
5 4
rx
16 15
ry
immediate
7 6
0
000000000
try
2 1
immediate
MFHI
0
0
00
Move from HI Register
15
MFHI rx
11 10
RR
11101
26 25
31
SPECIAL
000000
16 15
0
0000000000
8 7
5 4
0
000
rx
11 10
MFHI
10000
0
MFHI
010000
0
00000
trx
0
6 5
MFLO
Move from LO Register
15
MFLO rx
11 10
RR
11101
26 25
31
SPECIAL
000000
408
0
16 15
0
0000000000
8 7
rx
5 4
0
000
11 10
trx
User’s Manual U15509EJ2V0UM
0
MFLO
10010
6 5
0
00000
0
MFLO
010010
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
MOVE
Move
15
MOVE ry, r32
11 10
I8
01100
26 25
31
SPECIAL
000000
21 20
16 15
zero
00000
r32
11 10
26 25
SPECIAL
000000
21 20
trz
16 15
zero
00000
6 5
11 10
User’s Manual U15509EJ2V0UM
0
OR
100101
8 7
MOV
32R
101
r32
0
r32
0
00000
try
I8
01100
31
5 4
ry
11 10
15
MOVE r32 rz
8 7
MOV
R32
111
5 4
r32
2:0
3 2
r32
4:3
0
rz
6 5
0
00000
0
OR
100101
409
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
MULT
Multiply
15
MULT rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
21 20
trx
8 7
rx
5 4
MULT
11000
ry
16 15
6 5
0
0000000000
try
0
MULT
011000
MULTU
Multiply Unsigned
15
MULTU rx, ry
11 10
RR
11101
31
26 25
SPECIAL
000000
410
0
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
MULTU
11001
6 5
0
0000000000
User’s Manual U15509EJ2V0UM
0
MULTU
011001
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
NEG
Negate
15
NEG rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
zero
00000
16 15
try
8 7
rx
5 4
NEG
01011
ry
11 10
6 5
0
00000
trx
0
0
SUBU
100011
NOT
NOT
15
NOT rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
zero
00000
16 15
try
8 7
rx
11 10
trx
User’s Manual U15509EJ2V0UM
5 4
ry
0
NOT
01111
6 5
0
00000
0
NOR
100111
411
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
OR
OR
15
OR rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
trx
16 15
try
8 7
rx
5 4
6 5
0
00000
trx
OR
01101
ry
11 10
0
0
OR
100101
SB
Store Byte
15
SB ry, offset (rx)
11 10
SB
11000
31
26 25
SB
101000
412
21 20
trx
8 7
rx
16 15
try
5 4
ry
0
immediate
5 4
0
00000000000
User’s Manual U15509EJ2V0UM
0
immediate
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SD
Store Doubleword
15
SD ry, offset (rx)
11 10
SD
01111
31
26 25
SD
111111
21 20
rx
16 15
11 10
I64
11111
31
26 25
SD
111111
21 20
sp
11101
26 25
SD
111111
21 20
sp
11101
11 10
16 15
ra
11111
8 7
0
immediate
3 2
0
0
000
8 7
0
immediate
11 10
User’s Manual U15509EJ2V0UM
0
000
immediate
SD
RASP
010
0
00000
0
5 4
ry
0
00000000
I64
11111
31
3 2
8 7
15
SD ra, offset (sp)
immediate
immediate
SDSP
001
16 15
try
0
8 7
15
SD ry, offset (sp)
5 4
ry
0
00000000
try
trx
8 7
3 2
immediate
0
0
000
413
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SH
Store Halfword
15
SH ry, offset (rx)
11 10
SH
11001
31
26 25
21 20
SH
101001
rx
5 4
ry
16 15
6 5
1 0
0
0
immediate
SLL
Shift Left Logical
15
SLL rx, ry, immediate
11 10
SHIFT
00110
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
8 7
ry
11 10
2 1
shamt
0
SLL
00
6 5
trx
try
5 4
rx
0
SLL
000000
sa
SLLV
Shift Left Logical Variable
15
SLLV ry, rx
11 10
RR
11101
26 25
31
SPECIAL
000000
414
0
immediate
0
0000000000
try
trx
8 7
21 20
trx
16 15
try
8 7
rx
11 10
try
User’s Manual U15509EJ2V0UM
5 4
ry
0
SLLV
00100
6 5
0
00000
0
SLLV
000100
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SLT
Set on Less Than
15
SLT rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
trx
16 15
rx
5 4
0
SLT
00010
ry
11 10
6 5
0
00000
t8
11000
try
8 7
SLTI
0
SLT
101010
Set on Less Than Immediate
15
SLTI rx, immediate
11 10
SLTI
01010
31
26 25
SLTI
001010
21 20
trx
16 15
t8
11000
8 7
rx
0
immediate
8 7
0
00000000
User’s Manual U15509EJ2V0UM
0
immediate
415
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SLTIU
Set on Less Than Immediate Unsigned
15
SLTIU rx, immediate
11 10
SLTIU
01011
31
26 25
SLTIU
001011
21 20
immediate
8 7
0
0
00000000
immediate
SLTU
Set on Less Than Unsigned
15
SLTU rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
416
0
rx
16 15
t8
11000
trx
8 7
21 20
trx
16 15
try
8 7
rx
11 10
t8
11000
User’s Manual U15509EJ2V0UM
5 4
ry
0
SLTU
00011
6 5
0
00000
0
SLTU
101011
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SRA
Shift Right Arithmetic
15
SRA rx, ry, immediate
11 10
SHIFT
00110
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
8 7
ry
rx
11 10
2 1
shamt
0
SRA
000011
sa
SRAV
0
SRA
11
6 5
trx
try
5 4
Shift Right Arithmetic Variable
15
SRAV ry, rx
11 10
RR
11101
26 25
31
SPECIAL
000000
21 20
trx
16 15
try
8 7
rx
11 10
try
User’s Manual U15509EJ2V0UM
5 4
ry
0
SRAV
00111
6 5
0
00000
0
SRAV
000111
417
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SRL
Shift Right Logical
15
SRL rx, ry, immediate
11 10
SHIFT
00110
31
26 25
SPECIAL
000000
21 20
0
00000
16 15
try
8 7
ry
2 1
shamt
rx
11 10
0
SRL
000010
sa
trx
0
SRL
10
6 5
SRLV
Shift Right Logical Variable
15
SRLV ry, rx
11 10
RR
11101
26 25
31
SPECIAL
000000
418
5 4
21 20
trx
16 15
try
8 7
rx
11 10
try
User’s Manual U15509EJ2V0UM
5 4
ry
0
SRLV
00110
6 5
0
00000
0
SRLV
000110
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SW
Store Word
15
SW ry, offset (rx)
11 10
SW
11011
31
26 25
SW
101011
21 20
trx
rx
15
11 10
SWSP
11010
26 25
SW
101011
21 20
sp
11101
26 25
SW
101011
21 20
sp
11101
immediate
2 1
SW
RASP
010
0
000000
User’s Manual U15509EJ2V0UM
0
0
00
8 7
0
immediate
10 9
16 15
ra
11111
0
immediate
11 10
0
0
00
8 7
0
000000
I8
01100
31
2 1
10 9
15
SW ra, offset (sp)
immediate
immediate
rx
16 15
trx
0
7 6
0
000000000
try
5 4
ry
16 15
SW rx, offset (sp)
31
8 7
2 1
immediate
0
0
00
419
CHAPTER 10 MIPS16 INSTRUCTION SET FORMAT
SUBU
Subtract Unsigned
15
SUBU rz, rx, ry
11 10
RRR
11100
31
26 25
SPECIAL
000000
21 20
trx
16 15
5 4
rx
ry
11 10
2 1
rz
SUBU
11
0
SUBU
100011
SYSCALL
System Call
15
SYSCALL
11 10
8 7
0
000
RR
11101
5 4
0
000
26 25
31
0
SYSCALL
01001
6 5
0
00000000000000000000
SPECIAL
000000
0
SYSCALL
001100
XOR
Exclusive OR
15
XOR rx, ry
11 10
RR
11101
26 25
31
SPECIAL
000000
420
0
6 5
0
00000
trz
try
8 7
21 20
trx
16 15
try
8 7
rx
11 10
trx
User’s Manual U15509EJ2V0UM
5 4
ry
0
XOR
01110
6 5
0
00000
0
XOR
100110
CHAPTER 11 COPROCESSOR 0 HAZARDS
The CPU core of the VR4100 Series avoids contention of its internal resources by causing a pipeline interlock in
such cases as when the contents of the destination register of an instruction are used as a source in the succeeding
instruction. Therefore, instructions such as NOP must not be inserted between instructions.
However, interlocks do not occur on the operations related to the CP0 registers and the TLB.
Therefore,
contention of internal resources should be considered when composing a program that manipulates the CP0
registers or the TLB. The CP0 hazards define the number of NOP instructions that is required to avoid contention of
internal resources, or the number of instructions unrelated to contention. This chapter describes the CP0 hazards.
The CP0 hazards of the CPU core of the VR4100 Series are as or less stringent than those of the VR4000. Table
11-1 lists the Coprocessor 0 hazards of the CPU core of the VR4100 Series. Code that complies with these hazards
will run without modification on the VR4000 Series.
The contents of the CP0 registers or the bits in the “Source” column of this table can be used as a source after
they are fixed.
The contents of the CP0 registers or the bits in the “Destination” column of this table can be available as a
destination after they are stored.
Based on this table, the number of NOP instructions required between instructions related to the TLB is computed
by the following formula, and so is the number of instructions unrelated to contention:
(Destination Hazard number of A) – [(Source Hazard number of B) + 1]
As an example, to compute the number of instructions required between an MTC0 and a subsequent MFC0
instruction, this is:
(5) – (3 + 1) = 1 instruction
The CP0 hazards do not generate interlocks of pipeline. Therefore, the required number of instruction must be
controlled by program.
User’s Manual U15509EJ2V0UM
421
CHAPTER 11 COPROCESSOR 0 HAZARDS
Table 11-1. Coprocessor 0 Hazards (1/2)
(a) VR4121, VR4122, VR4181, and VR4181A
Operation
Source
Source Name
Destination
No. of cycles
−
MTC0
Destination Name
CPR
No. of cycles
5
−
MFC0
CPR
3
TLBR
Index, TLB
2
PageMask, EntryHi, EntryLo0,
EntryLo1
5
TLBWI
TLBWR
Index or Random, PageMask,
EntryHi, EntryLo0, EntryLo1
2
TLB
5
TLBP
PageMask, EntryHi
2
Index
6
ERET
EPC or ErrorEPC, TLB
2
Status[EXL], [ERL]
4
Status
2
TagLo, TagHi, PErr
5
−
CACHE Index_Load_Tag
−
CACHE Index_Store_Tag
TagLo, TagHi, PErr
3
CACHE Hit ops.
cache line
3
Coprocessor usable test
Status[CU], [KSU], [EXL], [ERL]
2
−
Instruction fetch
EntryHi[ASID], Status[KSU],
[EXL], [ERL], [RE], Config[K0]
2
−
TLB
2
−
Instruction fetch
exception
cache line
5
EPC, Status
4
Cause, BadVAddr, Context,
XContext
5
Interrupt signals
Cause[IP], Status[IM], [IE], [EXL], 2
[ERL]
−
Loads/Stores
EntryHi[ASID], Status[KSU],
[EXL], [ERL], [RE], Config[K0],
TLB
3
−
Config[AD], [EP]
3
WatchHi, WatchLo
3
Load/Store exception
−
EPC, Status, Cause, BadVAddr,
Context, XContext
5
TLB shutdown
(VR4181 only)
−
Status[TS]
2 (Inst.),
4 (Data)
Remark
422
Brackets indicate a bit name or a field name of registers.
User’s Manual U15509EJ2V0UM
CHAPTER 11 COPROCESSOR 0 HAZARDS
Table 11-1. Coprocessor 0 Hazards (2/2)
(b) VR4131
Operation
Source
Source Name
Destination
No. of cycles
−
MTC0
Destination Name
CPR
No. of cycles
6
−
MFC0
CPR
4
TLBR
Index, TLB
3
PageMask, EntryHi, EntryLo0,
EntryLo1
6
TLBWI
TLBWR
Index or Random, PageMask,
EntryHi, EntryLo0, EntryLo1
3
TLB
6
TLBP
PageMask, EntryHi
3
Index
6
ERET
EPC or ErrorEPC, TLB
5
Status[EXL], [ERL]
6
Status
5
TagLo, TagHi, PErr
6
−
CACHE Index_Load_Tag
−
CACHE Index_Store_Tag
TagLo, TagHi, PErr
4
CACHE Hit ops.
cache line
4
Coprocessor usable test
Status[CU], [KSU], [EXL], [ERL]
2
−
Instruction fetch
EntryHi[ASID], Status[KSU],
[EXL], [ERL], [RE], Config[K0]
2
−
TLB
2
−
Instruction fetch
exception
cache line
6
EPC, Status
6
Cause, BadVAddr, Context,
XContext
6
Interrupt signals
Cause[IP], Status[IM], [IE], [EXL], 2
[ERL]
−
Loads/Stores
EntryHi[ASID], Status[KSU],
[EXL], [ERL], [RE], Config[K0],
TLB
4
−
Config[AD], [EP]
4
WatchHi, WatchLo
4
Load/Store exception
Remark
−
EPC, Status, Cause, BadVAddr,
Context, XContext
6
Brackets indicate a bit name or a field name of registers.
User’s Manual U15509EJ2V0UM
423
CHAPTER 11 COPROCESSOR 0 HAZARDS
Cautions 1. If the setting of the K0 bit in the Config register is changed by MTC0 for the kseg0 or ckseg0
area, the change is reflected at first to third instruction after MTC0.
2. The instruction following MTC0 must not be MFC0.
3. The five instructions following MTC0 to Status register that changes KSU bit and sets EXL and
ERL bits may be executed in the new mode, and not Kernel mode. This can be avoided by
setting EXL bit first, leaving KSU bit set to Kernel, and later changing KSU bit.
4. If interrupts are disabled by setting EXL bit in the Status register with MTC0, an interrupt may
occur immediately after MTC0 without change of the contents of the EPC register. This can be
avoided by clearing IE bit first, and later setting EXL bit.
5. There must be two non-load, non-CACHE instructions between a store and a CACHE
instruction directed to the same primary cache line as the store.
The status during execution of the following instruction for which CP0 hazards must be considered is described
below.
(1) MTC0
Destination: The completion of writing to a destination register (CP0) of MTC0.
(2) MFC0
Source:
The confirmation of a source register (CP0) of MFC0.
(3) TLBR
Source:
The confirmation of the status of TLB and the Index register before the execution of TLBR.
Destination: The completion of writing to a destination register (CP0) of TLBR.
(4) TLBWI, TLBWR
Source:
The confirmation of a source register of these instructions and registers used to specify a TLB
entry.
Destination: The completion of writing to TLB by these instructions.
(5) TLBP
Source:
The confirmation of the PageMask register and the EntryHi register before the execution of TLBP.
Destination: The completion of writing the result of execution of TLBP to the Index register.
(6) ERET
Source:
The confirmation of registers containing information necessary for executing ERET.
Destination: The completion of the processor state transition by the execution of ERET.
(7) CACHE Index_Load_Tag
Destination: The completion of writing the results of execution of this instruction to the related registers.
(8) CACHE Index_Store_Tag
Source:
424
The confirmation of registers containing information necessary for executing this instruction.
User’s Manual U15509EJ2V0UM
CHAPTER 11 COPROCESSOR 0 HAZARDS
(9) Coprocessor usable test
Source:
The confirmation of modes set by the bits of the CP0 registers in the “Source” column.
Examples 1. When accessing the CP0 registers in User mode after the CU0 bit of the Status register is
modified, or when executing an instruction such as TLB instructions, CACHE instructions, or
Branch instructions that use the resource of the CP0.
2. When accessing the CP0 registers in the operating mode set in the Status register after the KSU,
EXL, and ERL bits of the Status register are modified.
(10) Instruction fetch
Source:
The confirmation of the operating mode and TLB necessary for instruction fetch.
Examples 1. When changing the operating mode from User to Kernel and fetching instructions after the KSU,
EXL, and ERL bits of the Status register are modified.
2. When fetching instructions using the modified TLB entry after TLB modification.
(11) Instruction fetch exception
Destination: The completion of writing to registers containing information related to the exception when an
exception occurs on instruction fetch.
(12) Interrupts
Source:
The confirmation of registers judging the condition of occurrence of interrupt when an interrupt
factor is detected.
(13) Loads/Sores
Source:
The confirmation of the operating mode related to the address generation of Load/Store
instructions, TLB entries, the cache mode set in the K0 bit of the Config register, and the registers
setting the condition of occurrence of a Watch exception.
Example
When Loads/Stores are executed in the kernel field after changing the mode from User to Kernel.
(14) Load/Store exception
Destination: The completion of writing to registers containing information related to the exception when an
exception occurs on load or store operation.
(15) TLB shutdown (VR4181 only)
Destination: The completion of writing to the TS bit of the Status register when a TLB shutdown occurs.
User’s Manual U15509EJ2V0UM
425
CHAPTER 11 COPROCESSOR 0 HAZARDS
Table 11-2 indicates examples of calculation.
Table 11-2. Calculation Example of CP0 Hazard and Number of Instructions Inserted
Destination
Source
Contending
internal
resource
Number of instructions
inserted
VR4121,
VR4122,
VR4181,
VR4181A
VR4131
Formula
VR4121,
VR4122,
VR4181,
VR4181A
VR4131
TLBWR/TLBWI
TLBP
TLB Entry
2
2
5 – (2 + 1)
6 – (3 + 1)
TLBWR/TLBWI
Load or Store using newly
modified TLB
TLB Entry
1
1
5 – (3 + 1)
6 – (4 + 1)
TLBWR/TLBWI
Instruction fetch using newly
modified TLB
TLB Entry
2
3
5 – (2 + 1)
6 – (2 + 1)
MTC0
Status [CU]
Coprocessor instruction that
requires the setting of CU
Status [CU]
2
3
5 – (2 + 1)
6 – (2 + 1)
TLBR
MFC0 EntryHi
EntryHi
1
1
5 – (3 + 1)
6 – (4 + 1)
MTC0 EntryLo0
TLBWR/TLBWI
EntryLo0
2
2
5 – (2 + 1)
6 – (3 + 1)
TLBP
MFC0 Index
Index
2
1
6 – (3 + 1)
6 – (4 + 1)
MTC0 EntryHi
TLBP
EntryHi
2
2
5 – (2 + 1)
6 – (3 + 1)
MTC0 EPC
ERET
EPC
2
0
5 – (2 + 1)
6 – (5 + 1)
MTC0 Status
ERET
Status
2
0
5 – (2 + 1)
6 – (5 + 1)
MTC0
Status [IE] Note
Instruction that causes an
interrupt
Status [IE]
2
0
5 – (2 + 1)
6 – (5 + 1)
Note The number of hazards is undefined if the instruction execution sequence is changed by exceptions. In such
a case, the minimum number of hazards until the IE bit value is confirmed may be the same as the maximum
number of hazards until an interrupt request occurs that is pending and enabled.
Remark
426
Brackets indicate a bit name or a field name of registers.
User’s Manual U15509EJ2V0UM
APPENDIX INDEX
A
coprocessors.............................................................. 21
access types .......................................................36, 227
Count register........................................................... 160
Address Error exception ...........................................179
CP0 ............................................................................ 21
address spaces.........................................................133
CP0 registers.............................................................. 22
address translation ...................................128, 131, 132
CPU core.................................................................... 19
addressing ..................................................................26
CPU instruction set............................................. 33, 224
addressing modes ......................................30, 124, 164
CPU registers ............................................................. 20
B
D
BadVAddr register.....................................................160
data cache.......................................................... 19, 201
big endian .............................................................26, 27
data formats ............................................................... 26
branch delay ...............................................................90
delay slot .................................................. 36, 47, 70, 90
Branch instructions .......................................47, 82, 227
direct mapping.......................................................... 202
branch prediction ..........................................31, 94, 155
doubleword................................................................. 26
Breakpoint exception ................................................185
Bus Error exception ..................................................183
E
bypassing..................................................................123
endian ........................................................................ 26
EntryHi register................................................. 127, 151
C
EntryLo register ................................................ 127, 148
cache
EPC register ............................................................. 167
accessing ..............................................................204
ErrorEPC register ..................................................... 171
index......................................................................204
exception .................................................................. 116
line size .................................................................204
priority................................................................... 175
operations .............................................................202
types ..................................................................... 173
organization...........................................................200
vector address ...................................................... 173
size................................................................155, 204
exception code ......................................................... 166
states.....................................................................205
exception conditions................................................. 119
cache algorithm ........................................................149
exception processing................................................ 157
cache data ................................................................200
exception processing registers................................. 158
coherency..............................................................203
Extend instruction....................................................... 68
placement..............................................................202
Cache Error register..................................................170
G
cache line..........................................................200, 201
general-purpose register ...................................... 20, 55
replacement ..........................................................203
cache memory ..........................................................198
H
cache tag ..................................................................200
halfword...................................................................... 26
Cause register...........................................................165
hardware interrupts .................................................. 222
Cold Reset exception................................................176
HI register............................................................. 20, 56
Compare register ......................................................161
Computational instructions ...................................40, 74
I
Config register...........................................................153
Index register.................................................... 127, 147
Context register.........................................................159
instruction cache ................................................ 19, 200
Coprocessor 0.............................................................19
instruction formats.................................... 23, 25, 34, 59
coprocessor 0 hazards..............................................421
instruction notation conventions............................... 224
Coprocessor Unusable exception .............................186
instruction set architecture ......................................... 33
User’s Manual U15509EJ2V0UM
427
APPENDIX INDEX
instruction streaming ........................................ 101, 154
P
Integer Overflow exception....................................... 188
page sizes ................................................................ 149
interlock.................................................................... 116
PageMask register............................................ 127, 149
interrupt enable ........................................................ 164
Parity Error register .................................................. 170
Interrupt exception ................................................... 190
PC......................................................................... 20, 56
interrupt signals................................................ 222, 223
PC-relative instructions............................................... 67
interrupts .................................................................. 221
physical address............................................... 128, 133
ISA mode ................................................................... 56
pipeline ................................................................. 31, 84
ISA mode bit......................................................... 56, 57
pipeline activities ...................................................... 102
pipeline stages ............................................... 85, 87, 89
J
power mode instructions............................................. 35
joint TLB ..................................................................... 30
PRId register............................................................. 152
JTLB........................................................................... 30
product-sum operation instructions ............................ 35
Jump instruction ........................................... 47, 82, 227
R
K
Random register ............................................... 127, 147
Kernel mode..................................................... 124, 138
Reserved Instruction exception ................................ 187
Kernel mode address space .................................... 139
S
L
set associative .......................................................... 202
line lock function ...................................................... 203
slip conditions........................................................... 121
little endian ........................................................... 26, 28
Soft Reset exception ................................................ 177
LLAddr register......................................................... 155
software interrupts .................................................... 222
LO register ........................................................... 20, 56
Special instructions .............................................. 51, 83
load delay................................................................. 101
special registers ................................................... 20, 56
Load instructions .......................................... 36, 71, 226
stall conditions .......................................................... 120
stall cycles .................................................................. 46
M
Status register .......................................................... 161
MACC instructions ..................................................... 35
Store instructions.......................................... 36, 71, 226
memory hierarchy .................................................... 198
superscalar ................................................................. 87
memory management ........................................ 30, 124
Supervisor mode .............................................. 124, 135
memory management registers ............................... 146
Supervisor mode address space .............................. 136
MIPS III instructions ................................................... 23
System Call exception .............................................. 184
MIPS16 instruction set ....................................... 54, 386
System control coprocessor ....................................... 21
MIPS16 instructions ................................................... 24
system control coprocessor (CP0) instructions .. 52, 228
N
T
NMI........................................................................... 221
TagHi register ........................................................... 156
NMI exception .......................................................... 178
TagLo register .......................................................... 156
non-maskable interrupt ............................................ 221
timer interrupt ........................................................... 222
TLB ..................................................................... 30, 125
O
entry...................................................................... 125
on-chip caches ......................................................... 199
exceptions............................................................. 127
opcode ............................................................... 64, 383
instructions............................................................ 127
operating modes ........................................ 30, 124, 164
manipulation ......................................................... 126
ordinary interrupts .................................................... 221
TLB exceptions......................................................... 180
TLB Invalid exception ............................................... 181
TLB Modified exception ............................................ 182
428
User’s Manual U15509EJ2V0UM
APPENDIX INDEX
TLB Refill exception ..................................................180
WatchHi register....................................................... 168
translation lookaside buffer.................................30, 125
WatchLo register ...................................................... 168
Trap exception ..........................................................188
way(s)................................................................. 91, 202
Wired register........................................................... 150
U
word............................................................................ 26
User mode ........................................................124, 133
writeback .................................................................. 203
User mode address space ........................................134
X
V
XContext register...................................................... 169
virtual address...................................................128, 133
W
Watch exception .......................................................189
User’s Manual U15509EJ2V0UM
429
[MEMO]
430
User’s Manual U15509EJ2V0UM
Facsimile Message
From:
Name
Company
Tel.
Although NEC has taken all possible steps
to ensure that the documentation supplied
to our customers is complete, bug free
and up-to-date, we readily accept that
errors may occur. Despite all the care and
precautions we've taken, you may
encounter problems in the documentation.
Please complete this form whenever
you'd like to report errors or suggest
improvements to us.
FAX
Address
Thank you for your kind support.
North America
Hong Kong, Philippines, Oceania
NEC Electronics Inc.
NEC Electronics Hong Kong Ltd.
Corporate Communications Dept. Fax: +852-2886-9022/9044
Fax: +1-800-729-9288
+1-408-588-6130
Korea
Europe
NEC Electronics Hong Kong Ltd.
NEC Electronics (Europe) GmbH
Seoul Branch
Market Communication Dept.
Fax: +82-2-528-4411
Fax: +49-211-6503-274
Taiwan
NEC Electronics Taiwan Ltd.
Fax: +886-2-2719-5951
South America
NEC do Brasil S.A.
Fax: +55-11-6462-6829
Japan
NEC Semiconductor Technical Hotline
Fax: +81- 44-435-9608
P.R. China
NEC Electronics Shanghai, Ltd.
Fax: +86-21-6841-1137
Asian Nations except Philippines
NEC Electronics Singapore Pte. Ltd.
Fax: +65-250-3583
I would like to report the following error/make the following suggestion:
Document title:
Document number:
Page number:
If possible, please fax the referenced page or drawing.
Document Rating
Excellent
Good
Acceptable
Poor
Clarity
Technical Accuracy
Organization
CS 02.3