Contents

1 Overview of 3D-RAM and Its Functional Blocks
  Introduction ... 1
  Frame Buffer Design Example ... 1
  Simplified 3D-RAM Block Diagram ... 3
  3D-RAM Functional Blocks ... 4
  Block, Page, and Page Group ... 4
  DRAM Banks and Basic DRAM Operations ... 6
  Pixel Buffer ... 7
  Video Buffers ... 7
  Global Bus ... 8
  Pixel ALU Basics ... 9
    ROP/Blend Units ... 10
    Dual Compare Unit ... 10
    Pipelining ... 11
    The Picking Logic ... 12

2 Pin Descriptions and Pinouts
  Common Pins ... 13
  Pixel ALU Interface ... 15
  DRAM Control ... 16
  Video Interface ... 17
  Test Access Port ... 17
  Power & Ground ... 18
  3D-RAM Pinouts ... 18
    Tracking Label ... 18
    Normal Pinout Diagram ... 19
    Reverse Pinout Diagram ... 20

3 Pixel ALU Operations
  Elements of the Pixel Buffer ... 21
    Block and Word ... 21
    Dirty Tag ... 22
      Using Dirty Tag for Color Expansion ... 23
    Plane Mask ... 23
  Elements of the Pixel ALU ... 25
    ROP/Blend Units ... 26
    Stencil Modes ... 40
    16-Bit Color Mode ... 44
    Dual Compare Unit ... 51
    Pipelining ... 52
    The Picking Logic ... 55
  Operations of the Pixel ALU ... 56
    Register Operations ... 57
      Identification Register (ID[31:0]) ... 58
      Plane Mask Register (PM[31:0]) ... 58
      Constant Source Register (CSR[35:0]) ... 58
      Match Mask Register (MTM[31:0]) ... 58
      Magnitude Mask Register (MGM[31:0]) ... 58
      ROP/Blend Control Register (RBC[31:0]) ... 59
      Compare Control Register (CCR[31:0]) ... 60
      Write Address Control Register (WAC[31:0]) ... 62
      An Application of the Write Address Control Register ... 62
      Blend_2 Control Register (BLD2[31:0]) ... 64
      Preblend Control Register (PBC[31:0]) ... 65
      Stencil Planes Register (StPl[31:0]) ... 66
      Stencil Control Register (StC[31:0]) ... 67
      PASS_INs Select Register (PINS[31:0]) ... 70
      Color Depth Select Register (CDS[31:0]) ... 70
      Prohibited Register Access ... 71
    Pixel Data Operations ... 72
      Stateless Initial Data Write ... 72
      Stateless Normal Data Write ... 72
      Stateful Initial Data Write ... 72
      Stateful Normal Data Write ... 72
      Replace Dirty Tag ... 73
      OR Dirty Tag ... 73

4 DRAM Operations
  An Overview of DRAM Operations ... 75
  Description of DRAM Operations ... 76
    Unmasked Write Block (UWB) ... 77
    Masked Write Block (MWB) ... 77
    Precharge Bank (PRE) ... 78
    Video Transfer (VDX) ... 78
      Video Buffer Load ... 78
      Video Output Operation ... 80
      Initialize and Abort Video Output ... 81
      Prohibited Video Operation Sequence ... 81
    Duplicate Page (DUP) ... 82
    Read Block (RDB) ... 82
    Access Page (ACP) ... 82
    No Operation (NOP) ... 84

5 Pixel ALU Pipelines and DRAM Activities
  DRAM and Pixel ALU Interactions ... 85
  DRAM Activities ... 91

6 Frame Buffer Organizations
  Introduction ... 95
  1280 x 1024 x 8 Organization ... 95
  1280 x 1024 x 32 Single Buffered Organization ... 95
  1280 x 1024 x 32 Double Buffered Organization with Z ... 99
  640 x 512 x 8 Double Buffered Organization with Z ... 101

7 Electrical Specifications
  Absolute Maximum Ratings ... 103
  Testing Conditions ... 103
  DC Specifications ... 107
  AC Specifications ... 109
    Pixel ALU Timing Parameters ... 109
    DRAM Timing Parameters ... 111
    Video Buffer Timing Parameters ... 114
    Boundary-Scan Timing Parameters ... 115

8 Timing Diagrams
  Timing Diagrams ... 117

9 Packaging
  3D-RAM Pinouts ... 137
    Normal Pinout Diagram ... 137
    Reverse Pinout Diagram ... 138
    Tracking Label ... 138
  Mechanical Drawing for 128-pin FP and RF Packages ... 139
  Thermal Characteristics ... 142
    Thermal Resistance for Single Package ... 142
    Thermal Resistance for Twelve Packages Mounted on PCB ... 144

10 JTAG Boundary Scan
  Boundary-Scan Architecture ... 147
  The TAP Controller ... 149
    Test-Logic-Reset State ... 150
    Run-Test/Idle State ... 150
    Select-DR-Scan State ... 150
    Capture-DR State ... 150
    Shift-DR State ... 150
    Exit1-DR State ... 150
    Pause-DR State ... 150
    Exit2-DR State ... 150
    Update-DR State ... 150
    Select-IR-Scan State ... 150
    Capture-IR State ... 150
    Shift-IR State ... 151
    Exit1-IR State ... 151
    Pause-IR State ... 151
    Exit2-IR State ... 151
    Update-IR State ... 151
  Test Data Register ... 151
    Bypass Register ... 151
    Boundary-Scan Register ... 151
  Instruction Register ... 153
    Bypass Instruction ... 154
    Sample/Preload Instruction ... 154
    Extest Instruction ... 154
  VID_OE Boundary-Scan Cell ... 155

11 Formal Specification of Operations
  Elements ... 157
  Bit Ordering of Elements ... 158
  Access Page ... 159
  Duplicate Page ... 159
  Precharge Bank ... 160
  Read Block ... 160
  Masked Write Block ... 160
  Unmasked Write Block ... 161
  Video Transfer ... 161
  Video Cycle ... 162
  Data Read ... 162
  Stateless Initial Data Write ... 163
  Stateless Normal Data Write ... 164
  Replace Dirty Tag ... 164
  OR Dirty Tag ... 165
  Write Plane Mask Register ... 165

12 Appendix A
  Glossary ... 167

Revision History

MITSUBISHI ELECTRONIC DEVICE GROUP   3D-RAM (M5M410092B)   Rev. 1.03

Rev. 0.95
• First version of distributed databook

Rev. 0.96
• Chapter 2
  • The “Total” entry in Table 2.2 on pp. 19 was corrected.
  • Entries in Table 2.6 on pp. 22 were corrected.
  • Tracking label mnemonic on p. 22 was corrected to show the speed grade “-12”.
• Chapter 3
  • Entries in Table 3.5 on pp. 39-41 were corrected.
  • Note 1 was added to Table 3.13 on p. 57.
  • Description of RBC[8n+4] on p. 60 was corrected.
  • Description of BLD2[29:28] on p. 65 was corrected.
  • Description of PBC[29:28] on p. 66 was corrected.
  • The mnemonic PBC on p. 66 was corrected.
  • Note for the Color Depth Select register on p. 71 was corrected.
• Chapter 5
  • Tables 5.2 on p. 88, 5.4 on p. 90, 5.6 on p. 92, 5.8 on p. 94, 5.10 on p. 96, and 5.12 on p. 98 deleted.
• Chapter 7
  • The speed grade “-13” was replaced by the speed grade “-12” in Tables 7.4 through 7.9. Note that both the “-10” and “-12” speed grades now have the same values for Video Buffer timing parameters.
• Chapter 8
  • All figure numbers and table numbers were corrected.
  • The speed grade “-13” was replaced by the speed grade “-12” in Tables 8.2 through 8.13. Note that both the “-10” and “-12” speed grades now have the same values for Video Buffer timing parameters.
• Chapter 9
  • The mnemonic for the tracking label was corrected to show the “-12” speed grade.
  • Table 9.43 on p. 157 now shows the correct values for the parameters L and I2.
• Chapter 10
  • The paragraph Boundary-Scan Register on p. 169 now shows both bits 1 and 0 of the PASS_IN pins.
  • Figure 10.4 on p. 170 now more correctly reflects the scan chain described on p. 171.

Rev. 1.00
• Chapter 2
  • Tracking label mnemonic on p. 22 was corrected to show the speed grade “10A”.
• Chapter 3
  • Wording of note on bit fields on p. 60 corrected to clarify meaning.
• Chapter 7
  • The speed grade “-10A” was added to all tables.
  • Entries in Table 7.4 on pp. 117 and 118, Table 7.5 on p. 119, Table 7.6 on p. 120, and Table 77 on p. 121 were corrected.
• Chapter 8
  • The speed grade “-10A” was added to all tables.
  • Entries in Table 8.3 on p. 131, Table 8.5 on p. 135, Table 8.6 on p. 135, Table 8.7 on p. 137, and Table 8.0 on p. 140 were corrected.
• Chapter 9
  • Tracking label mnemonic on p. 22 was corrected to show the speed grade “10A”.

Rev. 1.02
• Chapter 2
  • Tracking label and pinout diagrams are updated to reflect the new 5-character manufacturing code.
• Chapter 3
  • Figure 3.28 and the corresponding paragraph are corrected.
• Chapter 4
  • Figure 4.6 and the corresponding paragraph are corrected.
  • Description of prohibited Video Transfer operation sequence is added.
• Chapter 7
  • The value of Icc<VID> in Table 7.3 is corrected to reflect the improvement of VID_CLK cycle time from 14.0 ns down to 12.0 ns.
  • Minor editorial corrections in Table 7.4 are done; no parameter values are changed.
• Chapter 8
  • Minor editorial corrections in Table 8.3 are done; no parameter values are changed.
• Chapter 9
  • Tracking label and pinout diagrams are updated to reflect the new 5-character manufacturing code.

Rev. 1.03
• Table of contents is added.
• Chapter 3
  • The stateless mode of the Color Depth Select register in Table 3.14 is corrected.
  • Description of prohibited Write Control Register operation sequence is added.
• Chapter 9 • The thermal resistance values in Tables 9.2 and 9.3 are updated. viii 1 Overview of 3D-RAM and Its Functional Blocks Overview of 3D-RAM and Its Functional Blocks Introduction • Flexible dual Video Buffer supporting 85-Hz CRT refresh One of the traditional bottlenecks of 3D graphics hardware has been the rate at which pixels can be rendered into a frame buffer using conventional DRAM or VRAM. The 3D-RAM emerged from a complete rethinking of frame buffer technology and produces an order of magnitude increase in rendering performance. The essence of the 3D-RAM architecture is: (1) an optimized array architecture that minimizes the average memory cycle time when rendering and (2) a selective on-chip logic that converts the interface with the rendering controller from a read-modified-write mode to a write-mostly mode. In addition to the performance boost, the new architecture also significantly reduces the system chip count. In 1994 Mitsubishi pioneered the introduction of the first member of the 3D-RAM family of products. This databook specifies all the features and operations of the third generation product of the 3D-RAM family to further elevate the performance of the 3D-RAM based 3D graphics systems. All references to 3D-RAM means the product M5M410092B, unless otherwise specifically designated. 
• • On-chip ALU • Four ROP units supporting 16 raster operations on byte data • Four Blend units blending the old pixel value with new information • On-chip hardware acceleration for all OpenGL blending modes(NEW) • On-chip hardware acceleration for all OpenGL stencil modes (NEW) • One 32-bit Match Comparator and one 32-bit Magnitude Comparator • Concurrent operations of DRAM, Pixel Buffer, ALU and Video Buffer • 32-bit synchronous high-bandwidth data bus interface with rendering controller • Blending operations in both (8, 8, 8, 8) and (4, 4, 4, 4) color modes (NEW) • One additional PASS_IN pin for flexible bit plane organization The factors responsible for the dramatic overall performance improvement include: • Write Mostly Interface Frame Buffer Design Example Figure 1.1 is a simple frame buffer design example showing a 1280 x 1024 x 32 single buffered configuration. The rendering controller writes pixel data across the 128-bit bus to the four 3D-RAMs. The controller commands most of the 3D-RAM operations, including ALU functions, Pixel Buffer addressing, and DRAM operations. The controller can also command video display by setting up the RAMDAC and requesting video transfers from 3D-RAMs. New Memory Architecture • 10-Mbits DRAM array supporting 1280 x 1024 x 8 frame buffer • Four independent, interleaved DRAM banks • 2048-bit SRAM Pixel Buffer as the cache between DRAM and ALU • Built-in tile-oriented memory addressing for rendering and scan line-oriented memory addressing for video refresh With the 128-bit pixel data bus shown in Figure 1.1, four pixels can be moved across the bus in one cycle. There are two ways to organize the 3D-RAMs: (1) Each 3D-RAM holds one of the • 256-bit global bus connecting DRAM banks and Pixel Buffer 1 Overview of 3D-RAM and Its Functional Blocks Rev. 1.03 3D-RAM (M5M410092B) 1 MITSUBISHI ELECTRONIC DEVICE GROUP controller to 3D-RAM is reduced to 64 bits, then two pixels are transferred in one cycle. 
Similarly, a 32-bit data bus can transfer only one pixel at a time. 8-bit color components—R, G, B, or a—for all 1280 x 1024 pixels; (2) Each 3D-RAM holds all 32 bits of a pixel value for 320 x 1024 pixels, allowing fast scrolling in the vertical direction and interleaving four 3D-RAMs in the horizontal direction. Chapter 6 provides more examples of frame buffer organizations using 3D-RAMs, such as 1280 x 1024 x 8, 320 x 1024 x 32, etc. If the width of the data bus from the rendering System Interface Address & Control Rendering Controller Pixel Data 1 Overview of 3D-RAM and Its Functional Blocks Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP 32 32 32 32 Monitor 3D-RAM 3D-RAM 3D-RAM 3D-RAM Video Control Video Data 16 Video Data 16 Video Data 16 Video Data 16 RAMDAC Figure 1.1 1280 x 1024 x 32 frame buffer consisting of four 3D-RAMs, shown together with a rendering controller and a RAMDAC 2 DRAM banks is transferred over the 256-bit Global Bus to the triple-ported Pixel Buffer. The Pixel Buffer consists of eight blocks, each of which is 256 bits and is updated in a single transfer on the Global Bus. Hence, the memory size of the Pixel Buffer is 2 Kbits. The ALU uses two of the Pixel Buffer ports to read and write data in the same clock cycle. Each Video Buffer is 80 x 8 bits and is loaded in a single DRAM operation. One Video Buffer can be loaded while the other is sending out video data. The 3D-RAM block diagram is shown in Figure 1.2. The DRAM array is partitioned into four independent banks of 2.5 Mbits each. Together, these four banks can support a screen resolution of 1280 x 1024 x 8. The independent banks can be interleaved to facilitate almost uninterrupted frame buffer update and, at the same time, can transfer pixel data to the dual Video Buffer for screen refresh. 
Data from the DRAM Bank A DRAM Bank B (2.5 Mbits) (2.5 Mbits) 640 640 Video Buffer I DRAM Bank C Video Data Video Buffer II 256 Global Bus 640 16 640 DRAM Bank D (2.5 Mbits) (2.5 Mbits) 32 Pixel Buffer (2 Kbits) ALU 32 32 Figure 1.2 Simplified 3D-RAM block diagram 3 Pixel Data Overview of 3D-RAM and Its Functional Blocks Simplified 3D-RAM Block Diagram 1 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP Overview of 3D-RAM and Its Functional Blocks 1 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP 3D-RAM Functional Blocks Block (UWB), Masked Write Block (MWB), and Read Block (RDB). These operations are described in detail on page 44, “Description of DRAM Operations.” The 3D-RAM has five major functional blocks in: DRAM banks, Video Buffers, Pixel Buffer, Global Bus, and Pixel ALU. The following sections provide a quick overview of each of these functional blocks. Chapter 3 describes details of the Pixel ALU operations, Chapter 4 presents specifics of the DRAM operations, and Chapter 5 provides examples of parallelism between the Pixel ALU operations and the DRAM operations. Now, to give readers a better grasp of these functional blocks, we first describe the memory units on which these functional blocks operate. A page in a DRAM bank is organized into 10 x 4 blocks. Since a block has 256 bits, a page has 10,240 bits. There are four DRAM banks in a 3D-RAM chip, the pages of the same page address from all four DRAM banks compose a page group. Therefore, a page group has 20 x 8 blocks. Note in Figure 1.3, the block and page are purposely drawn as rectangular shapes. The user may relate these to a tiled frame buffer memory organization. For example, if the display resolution is 1280 x 1024 x 8, then a (32-bit) word contains four pixels. Since a block may be viewed as having 2 x 4 words, it contains 8 x 4 pixels. A page is organized into 10 x 4 blocks, so it contains 80 x 16 pixels, and a page group holds 160 x 32 pixels. 
Finally, a screen is composed of 8 x 32 page groups. The advantage of such a frame buffer memory organization is the minimization of page miss penalty. 3D objects frequently occupy portions of multiple scan lines. Since in this case a page contains 80 x 16 pixels instead of 10,240 x 1 pixels, page miss is reduced. When an object extends beyond a page boundary, bank interleaving allows hidden precharge and uninterrupted memory access. Details of the various frame buffer memory organizations using 3D-RAMs are discussed in Chapter 6. Block, Page, and Page Group A word has 32 bits and is the unit of data operations within the Pixel ALU and between the Pixel ALU and Pixel Buffer. When the Pixel ALU accesses the Pixel Buffer, not only a block address needs to be specified but also a word has to be identified. Since there are eight blocks in the Pixel Buffer and eight words in a block, the upper three bits of the input pins PALU_A designate the block, and the lower three bits select the word. The data in a word is directly mapped to PALU_DQ[31:0] in corresponding order. That is, bit 0 of the word is mapped to PALU_DQ0, bit 1 to PALU_DQ1, and so on. Although an ALU write operation operates on one word at a time, each of the four bytes in a word may be individually masked. The mapping is also direct and linear: byte 0 is PALU_DQ[7:0], byte 1 PALU_DQ[15:8], byte 2 PALU_DQ[23:16], and byte 3 PALU_DQ[31:24]. On the other hand, to support screen refresh, the Video Buffer must output pixel data one scan line at a time. The internal organization of a page also allows data to be transferred from a page to the Video Buffer, one of the sixteen scan lines of 80 bytes long each at a time. See the section “Video Buffers” on page 7 for a summary and the section “Video Transfer (VDX)” on page 46 for full details. A block has 256 bits and is the unit of memory operations between a DRAM bank and the Pixel Buffer over the Global Bus. 
The input pins DRAM_A select a block from the Pixel Buffer and a block from a page of a DRAM bank. The DRAM operations on block data are Unmasked Write

Figure 1.3 Relations and addressing scheme of blocks and words in the Pixel Buffer and in the DRAM page
(The figure shows the 256-bit Global Bus connecting the eight blocks of the Pixel Buffer, numbered 0 through 7, to a page in a DRAM bank that is 10 blocks wide and 4 blocks high, with blocks numbered 00 through 27 hexadecimal. Fields of DRAM_A[8:0] select a block in the height direction of the DRAM page, a block in the width direction of the DRAM page, and one of eight blocks in the Pixel Buffer. Fields of PALU_A[5..0] select one of eight blocks from the Pixel Buffer and one of eight words from the selected block. Word 0 in Block 0 is shown with bytes 7:0, 15:8, 23:16, and 31:24.)

Overview of 3D-RAM and Its Functional Blocks
Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP

DRAM Banks and Basic DRAM Operations

The 3D-RAM contains four independent DRAM banks, which can be interleaved to facilitate hidden precharge or access in one bank while screen refresh is being performed in another bank. Each DRAM bank has 256 pages with 10,240 bits per page, for a total storage of 2,621,440 bits. An additional 257th page can be accessed for special functions or used to hold off-screen data.

A row decoder takes 9-bit page address signals to generate 257 word lines, one for each page. The word lines select which page is connected to the sense amplifiers. The sense amplifiers read and write the page selected by the row decoder. Because the sense amplifiers retain data after the read/write operations, they function like a direct-mapped level-two pixel cache. (The Pixel Buffer, which is discussed on page 7, functions as a level-one pixel cache in a frame buffer with 3D-RAMs.)

When the sense amplifiers in a DRAM bank complete the read/write operations with the Global Bus or Video Buffer, a Precharge Bank (PRE) operation usually follows. A Precharge Bank cycle simply deactivates the selected word line corresponding to the current page and equalizes the sense amplifiers. The PRE operation may be viewed as the close of a page access or as the preparation for the subsequent page access. The DRAM bank must be precharged prior to accessing a new page.

During an Access Page (ACP) operation, the row decoder selects a page by activating its word line. Activating the word line of a particular page transfers the bit charges of that page to the sense amplifiers. The sense amplifiers amplify the charges. After the sensing and amplification are completed, the sense amplifiers are ready to interface with the Global Bus or Video Buffer. In a way, ACP may be viewed as a “write cache” operation on the sense amplifiers as a level-two pixel cache.

Because the activated word line remains connected to the sense amplifiers after the ACP operation until the subsequent Precharge Bank operation, when a block of the sense amplifiers is updated by a block write operation (UWB or MWB), the corresponding block in the DRAM array is also updated. Therefore, the sense amplifiers function as a “write-through” cache, and no write back to the DRAM array is necessary.

Alternatively, the data in the sense amplifiers can be written to any page in the same bank at this time, simply by selecting a word line without first equalizing the sense amplifiers. This function is called Duplicate Page (DUP). A typical application of this function is copying from the 257th page to one of the 256 normal pages, all 10,240 bits at a time, for fast area fill.
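The ACP, PRE, write-through, and DUP behavior described above can be summarized with a small behavioral model. This is a sketch under stated assumptions rather than a description of the silicon: the class and method names are hypothetical, a page is held as a Python integer bit vector, block n is assumed to occupy bits 256n to 256n+255 of the page, and the model assumes that after DUP the target page becomes the open page (the text does not spell this out).

```python
PAGES_PER_BANK = 257            # 256 normal pages plus the extra 257th page
BITS_PER_PAGE = 10_240
BITS_PER_BLOCK = 256
BLOCKS_PER_PAGE = BITS_PER_PAGE // BITS_PER_BLOCK


class DramBank:
    """Behavioral sketch of one DRAM bank with its sense amplifiers."""

    def __init__(self) -> None:
        self.pages = [0] * PAGES_PER_BANK   # each page is an integer bit vector
        self.open_page = None               # page whose word line is active
        self.sense_amps = 0

    def access_page(self, page: int) -> None:
        """ACP: activate a word line and load the page into the sense amplifiers."""
        if self.open_page is not None:
            raise RuntimeError("bank must be precharged before accessing a new page")
        self.open_page = page
        self.sense_amps = self.pages[page]

    def write_block(self, block: int, data: int) -> None:
        """Block write: update the sense amplifiers and, write-through, the array."""
        if self.open_page is None:
            raise RuntimeError("no page is open")
        shift = block * BITS_PER_BLOCK      # assumed linear block placement
        mask = ((1 << BITS_PER_BLOCK) - 1) << shift
        self.sense_amps = (self.sense_amps & ~mask) | ((data << shift) & mask)
        self.pages[self.open_page] = self.sense_amps

    def duplicate_page(self, target: int) -> None:
        """DUP: write the sense amplifier contents to another page of this bank."""
        if self.open_page is None:
            raise RuntimeError("no page is open")
        self.pages[target] = self.sense_amps
        self.open_page = target             # assumption made for this model only

    def precharge(self) -> None:
        """PRE: deactivate the word line and equalize the sense amplifiers."""
        self.open_page = None
        self.sense_amps = 0
```

A fast area fill in this model would access the 257th page (index 256), then issue one `duplicate_page` per target page instead of rewriting every block.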
Figure 1.4 DRAM bank consisting of row decoder, address latch, DRAM array, and sense amplifiers

Pixel Buffer

The Pixel Buffer is a 2048-bit SRAM organized into eight 256-bit blocks, as seen in Figure 1.3, and functions as a level-one write-back pixel cache. It has a 256-bit read/write port, a 32-bit read port, and a 32-bit write port. Referring to Figure 1.6, the 256-bit read/write port is connected to the Global Bus via a Write Buffer, and the two 32-bit ports are connected to the Pixel ALU and the pixel data pins. All three ports can be used simultaneously as long as the same memory cell is not accessed. If the two 32-bit ports access the same cell, the write operation will be successful but the read data will be undefined. The cache set associativity is determined external to the 3D-RAM, thereby permitting optimal cache design tailored to the particular graphics system.

A 1-bit Dirty Tag is assigned to each data byte in the Pixel Buffer. Therefore, each block in the Pixel Buffer is associated with a 32-bit Dirty Tag in the dual-port Dirty Tag RAM. When a block is transferred from the sense amplifiers to the Pixel Buffer through the 256-bit port, the corresponding 32-bit Dirty Tag is cleared. When a block is transferred from the Pixel Buffer to a DRAM bank, the Dirty Tag determines which bytes are actually written. This feature can save as much as 50% of the power consumed by a 256-bit block write operation without the Dirty Tag.

Video Buffers

Each video buffer receives 640 bits of data at a time from one of the two DRAM banks connected to it. (The reader is reminded of the 3D-RAM block diagram in Figure 1.2.) Sixteen bits of data are shifted out onto the video data pins every video clock cycle at a 14-ns rate. It takes 40 video clocks to shift all data out of a video buffer. The video counter counts modulo 40 and toggles the buffer select line when the count wraps around to 0. The two video buffers can be alternated to provide a seamless stream of video data.

Figure 1.5 Video transfer from a DRAM page to the Video Buffer
(The figure shows DRAM_A[8..0] selecting one of the sixteen 80-byte scan lines, numbered 0 through 15, from a DRAM page. The remaining address bits are marked as ignored or used for other functions. The selected 640 bits are loaded into a Video Buffer of 40 x 16 bits, which drives the 16-bit Video Data Out.)

Global Bus

The Global Bus connects the Pixel Buffer to the sense amplifiers of all four DRAM banks. The Global Bus consists of 256 data lines. Referring to Figure 1.6, during a transfer from the Pixel Buffer to DRAM, the 256 bits are conditionally written depending on the 32-bit Dirty Tag and the 32-bit Plane Mask. When a data block is transferred from the Pixel Buffer to the sense amplifiers, the Dirty Tag and Plane Mask control, via the Write Buffer, which bits of the sense amplifiers are changed.

Note that all read/write operations are viewed from the perspective of the rendering controller. In other words, a read operation across the Global Bus always means a read by the Pixel ALU; that is, data is transferred from a DRAM bank into the Pixel Buffer. Similarly, a write operation across the Global Bus means data is updated from the Pixel Buffer to a DRAM bank. This is also specifically noted in Figure 1.6 by the signals Global Bus Write Block Enable and Global Bus Read Block Enable.
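The conditional write across the Global Bus can be illustrated with a short Python sketch. It is a simplified model, not the device logic: a block is represented as a list of eight 32-bit words, and the Dirty Tag is represented as a list of 32 flags indexed by `4 * word + byte`. That indexing is a modeling choice for this example only and says nothing about how the device numbers its Dirty Tag bits. The function name is hypothetical.

```python
ALL_PLANES = 0xFFFFFFFF


def write_block(sense_block, pixel_block, dirty, plane_mask=ALL_PLANES):
    """Merge a Pixel Buffer block into a sense amplifier block.

    sense_block, pixel_block: lists of eight 32-bit words.
    dirty: list of 32 booleans, one per byte (index 4 * word + byte here).
    plane_mask: 32-bit per-bit write enable applied to every word.
    Returns the new contents of the sense amplifier block.
    """
    result = []
    for w in range(8):
        enable = 0
        for b in range(4):
            if dirty[4 * w + b]:
                enable |= 0xFF << (8 * b)   # a dirty byte enables its 8 bits
        enable &= plane_mask                # Plane Mask further gates each bit
        merged = (sense_block[w] & ~enable) | (pixel_block[w] & enable)
        result.append(merged & 0xFFFFFFFF)
    return result
```

With the default `plane_mask` every bit of a dirty byte is written, which corresponds to a transfer gated by the Dirty Tag alone. Clean bytes are never driven, which is where the power saving mentioned above comes from.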
Figure 1.6 Tri-port Pixel Buffer, Global Bus and dual-port Dirty Tag RAM
(The figure shows the Pixel Buffer, 8 blocks x 256 bits, and the Dirty Tag RAM, 8 blocks x 32 bits. The 256-bit read/write port connects to the DRAM sense amplifiers over the Global Bus through the Write Buffer and the Write Enable Logic, under control of Global Bus Write Block Enable (Pixel Buffer to DRAM) and Global Bus Read Block Enable (DRAM to Pixel Buffer). The block address for the 256-bit port comes from DRAM_A[8:6], the block address for the 32-bit ports from PALU_A[5:3], and the word address from PALU_A[2:0]. The 32-bit Plane Mask comes from the Pixel ALU.)

Pixel ALU Basics

The Pixel ALU consists of four 8-bit ROP/Blend units, which may be independently programmed to perform either a raster operation or a blending function, one 32-bit Match Compare unit, and one 32-bit Magnitude Compare unit. The two Compare units are also commonly referred to as the Dual Compare units. The motivation for including the Pixel ALU on chip is to convert the interface from a read-modify-write interface to a write-mostly interface. This integration of logic with the memory arrays greatly improves rendering throughput by avoiding time-consuming reads and direction changes on the data bus.

The ROP/Blend units and the Dual Compare units are highly pipelined. Page 11 contains a brief discussion of the ALU pipeline. The output of a ROP/Blend unit is conditionally written to the Pixel Buffer, depending on the comparison results from the on-chip Dual Compare units and from the Dual Compare units of the preceding 3D-RAM chips. For example, for a 1280 x 1024 x 32 double-buffered graphics system with a 32-bit Z buffer, there are effectively 96 bits per pixel. In this case, eight 3D-RAMs are used as color chips and four as Z chips. The Pixel ALUs of the Z chips perform magnitude comparisons and feed the comparison results via their PASS_OUT pins to the corresponding color chips. It is important to note that, due to the pipelining, the color chips do not wait for the magnitude comparison results from the Z chips; rather, the results of the ROP/blending operations and comparison operations on the color chips and the results of the magnitude comparison on the Z chips are all presented to the Pixel Buffer of the color chips in the same clock cycle. In this sense, the rendering controller can accomplish a pixel blending operation with Z compare and window ID compare all in a single clock cycle. Furthermore, because of the pipelining and the tri-ported architecture of the Pixel Buffer, the read and write operations may be performed on the Pixel Buffer of the 3D-RAM during the same clock cycle.

Figure 1.7 Pixel ALU (Pipeline stages are not shown.)
(The figure shows the ALU Read Port and ALU Write Port of the Pixel Buffer, the PALU_DQ[31:0] and PALU_DX[3:0] pins, the Constant Register, PASS_IN, and PASS_OUT. ROP/Blend Unit n, for n = 0 to 3, receives Old Data byte n, Input Data byte 3 and byte n plus extension bits, and Constant Data byte n plus bit n of the extension bits, and produces an 8-bit result. The Dual Compare Unit receives the 32-bit Old Data, Input Data, and Constant Data.)

ROP/Blend Units

The ROP/Blend units can be configured as either a ROP unit or a Blend unit by setting a register bit. Each ROP unit can perform all 16 standard ROP functions. These functions are listed in Chapter 3. One of the operands of the ROP functions is the old data from the Pixel Buffer, and the other operand may be either the data from the primary I/O pins or the data from an internal register (called the Constant register). For the blending operation, the general equation is as follows:

Write data to Pixel Buffer = New Term + (Old Data x Old Fraction)
                           = (New Data x New Fraction) + (Old Data x Old Fraction)

Dual Compare Unit

Physically, the Dual Compare units consist of one 32-bit Match Compare unit and one 32-bit Magnitude Compare unit. Match Compare and Magnitude Compare are done in parallel. One of the sources is always the old data from the Pixel Buffer. The other source is independently selectable between the data from the PALU_DQ pins and the data from the Constant source register.

There are also two mask registers, namely Match Mask and Magnitude Mask, that define which bits of the 32-bit words will be compared and which will be “don’t care.” One application of the Match Compare unit is Window ID comparison, and the Magnitude Compare unit is typically used in the depth comparison of a Z-buffer algorithm for hidden surface removal. When these Compare units are used together, the system can achieve hidden surface removal for only a specific window on the display in one cycle. Furthermore, since the data to be written into the Pixel Buffer always comes through the ROP/Blend units, a system with 3D-RAM can achieve a pixel update with a raster or blending operation specifically on only the new objects in the selected window that are closer to the viewer than the existing objects in the frame buffer.

The results of the Match Compare and Magnitude Compare operations are logically ANDed together to drive the PASS_OUT pin. The PASS_IN signal (fed from another 3D-RAM chip) and the internally generated PASS_OUT signal are then logically ANDed together to produce a Write Enable signal to the Pixel Buffer.
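The compare-and-write path and the destination blending term described above can be sketched as follows. Several details here are assumptions made only for illustration: the actual comparison conditions are programmable and are not specified in this section, so the sketch takes them as parameters (defaulting to "equal" for match and "less than" for magnitude); the Old Fraction is modeled as an integer out of 256; and the blend result is clamped to 255. All function names are hypothetical.

```python
import operator


def masked_compare(old, src, mask, test):
    """Compare only the bits selected by `mask`; the other bits are don't-care."""
    return test(src & mask, old & mask)


def pass_out(old, src, match_mask, magnitude_mask,
             match_test=operator.eq, magnitude_test=operator.lt):
    """PASS_OUT is the AND of the Match Compare and Magnitude Compare results."""
    return (masked_compare(old, src, match_mask, match_test)
            and masked_compare(old, src, magnitude_mask, magnitude_test))


def write_enable(pass_in, pass_out_value):
    """The Pixel Buffer write enable is PASS_IN ANDed with the internal PASS_OUT."""
    return pass_in and pass_out_value


def destination_blend(new_term, old, old_fraction_256):
    """New Term + (Old Data x Old Fraction) for one 8-bit ROP/Blend unit.

    The fixed-point format (fraction / 256) and the clamp are assumptions.
    """
    return min(255, new_term + ((old * old_fraction_256) >> 8))
```

As a usage example, suppose (purely for illustration) that the upper 8 bits of a word hold a window ID and the lower 24 bits hold depth. Then `pass_out(old, src, 0xFF000000, 0x00FFFFFF)` passes only when the window IDs match and the new depth is smaller than the stored one.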
The PASS_IN and PASS_OUT pins thus offer hardware support for display resolutions where multiple 3D-RAM chips are required, such as 1280 x 1024 x 32 (single color buffer plus Z buffer) and 1280 x 1024 x 96 (double color buffer plus Z buffer).

The 3D-RAM Blend units accomplish what is called destination blending in a single MCLK cycle, that is, the addition and the second multiplication in the general blending equation given under “ROP/Blend Units.” In this case, the rendering controller must perform the multiplication of New Data with New Fraction (i.e., the source blending) and present the result as the New Term to 3D-RAM. In addition, 3D-RAM can also accomplish the full blending in two MCLK cycles by using a loop-back mechanism.

Pipelining

The 3D-RAM Pixel ALU pipeline is designed so that read and write operations can be performed with minimal delay. This is achieved by having all operations conform to a uniform 7-stage pipeline. Figure 1.8 is an example that illustrates the efficiency afforded by the pipeline flow of Pixel ALU read/write operations. A pipeline stage begins with a rising edge of MCLK and ends before the next rising edge of MCLK. (In 3D-RAM, all references to MCLK are relative to the rising edge except for some boundary scan test operations.)

For clarity, separate stage counts are provided for the first read and first write operations and are labeled as R1 through R4 and W1 through W7, respectively. The Read A operation is asserted for two cycles; Read A is first presented in Stage R1 and latched into the 3D-RAM by Clock 1 in Stage R2. Data A is piped out by Clock 2 in Stage R3 and becomes stable for sampling in Stage R4. Between Read B and WC (Write C), two single-cycle NOPs are inserted to guarantee an idle cycle for the data bus to turn around. On the other hand, a read operation can immediately follow a write operation, as shown by Read G following WF.

To allow maximum bandwidth for the rendering controller, a write operation may be started every cycle. In this example, we start with the WC operation. The address and write instruction are presented in Stage W1 and latched into the 3D-RAM by Clock 7 in Stage W2; Data C and WD are presented in Stage W2 and latched into the 3D-RAM by Clock 8 in Stage W3. Then, after three cycles for internal processing, the valid PASS_OUT Pass C is piped out by Clock 11 in Stage W6. The actual updating of the Pixel Buffer takes place in Stage W7. Thus, n consecutive write operations take only 7 + n - 1 = n + 6 cycles to complete, including all internal activities. It is important to point out that the effective write cycle time from the perspective of the rendering controller interface is only n + 1 cycles for n consecutive write operations, as shown by WC through WF.

Figure 1.8 Example of Pixel Port read/write operations that satisfy the pipeline flow
(Timing diagram over MCLK clocks 0 through 15 showing the PALU_A, PALU_OP, PALU_BE, PALU_WE, and PALU_EN inputs, the PALU_DQ and PALU_DX data, PASS_OUT, and HIT for the sequence Read A, Read B, NOP, WC, WD, WE, WF, Read G, with read stages R1 through R4 and write stages W1 through W8 marked.)

The Picking Logic

From the user’s viewpoint, a common experience of the picking function in 2D computer graphics is using the mouse and the associated cursor to select an icon on the display screen, with the result that the selected icon is highlighted in a different color. This is a basic function in interactive computer graphics, and 3D-RAM provides the Picking Logic and the HIT pin to support this picking function for selection of objects in a 3D scene.

A picking function may involve redrawing the objects into the frame buffer and returning a list of objects that intersect with some predefined selection volume. When the user uses multiple 3D-RAMs in a frame buffer design to determine whether pixel data is successfully written by any Stateful Write operation (see “Pixel Data Operations” on page 40) during the redraw process, the comparison result on the PASS_OUT pin from each chip must be logically ANDed. If this logical operation is left to off-chip glue logic between the 3D-RAM frame buffer and the rendering controller, excessive delay is unavoidable in this critical timing path. If the rendering controller is to perform this logical operation, extra pins must be provided by both the 3D-RAM and the rendering controller, while delay is still significant. The Picking Logic brings the glue logic on chip and provides an open-drain HIT pin to interface with the rendering controller.

A block diagram of the Picking Logic is shown in Figure 1.9. Initially, the Picking Logic should be enabled and the HIT flag should be cleared, which is done by writing to byte 3 of the Compare Control Register. The HIT pin will be set to high (i.e., not driven low by 3D-RAM) after seven cycles (corresponding to Pipeline Stage 8). In Figure 1.9, this is indicated by the number 8 in the square box above the HIT pin label. This design of the pipeline flow for the HIT flag and the HIT pin prevents an incorrect HIT value from Stateful Data Write operations issued before the Picking Logic is enabled. A sequence of Stateful Data Write operations may be issued immediately after the register write. A low value on the HIT pin means that at least one of the Stateful Data Writes passed the on-chip and off-chip comparison tests and the pixel data was written to the Pixel Buffer. If the HIT pin is high, none of the Stateful Data Writes passed and no pixel is updated. See Figure 8.6, “Picking Logic Timing,” for an illustration of the operations described in this section.

Figure 1.9 Block diagram of the Picking Logic
(The figure shows bits D24 through D27 of the Compare Control Register, the Pick Enable bit, the HIT Flag, the Stateful_WE, PASS_IN, and PASS_OUT signals that set the HIT Flag at pipeline stage 7, and the open-drain HIT pin at pipeline stage 8.)

2 Pin Descriptions and Pinouts

This chapter describes the 3D-RAM pins. Unless otherwise specified, all signals comply with the Low Voltage TTL (LVTTL) standard. The functional block diagram in Figure 2.1 shows all I/O signals on the external pins. The master clock MCLK synchronizes all operations of the Pixel ALU Control and DRAM Control. The Video Control specifies the video interface. The Test Access Port is used for the JTAG (Joint Test Action Group) boundary scan. The following sections describe each signal in detail.

Common Pins

These signals are common to several sections of the 3D-RAM.

Table 3.1 Common control signals

Signal Name   Pin Count   I/O   Description
MCLK          1           I     Master clock
RESET         1           I     Reset
Total         2

MCLK
The master clock MCLK is used for timing synchronization of internal circuitry. All external timing parameters, except video output operation and boundary scan, are specified with respect to the MCLK rising edge.

RESET
The RESET pin is an active low asynchronous signal used for power-up and restart initialization. During power-up, the RESET signal should be held low for at least 500 µs after VDD is stable, so that the internal power supply can be stabilized. After the RESET signal goes high, nine idle cycles must elapse before the internal registers can be reset to default values. The power-up reset procedure is illustrated in Figure 8.1. When the RESET signal is asserted low during normal operations, a restart reset sequence begins. The restart reset includes resetting registers in nine idle cycles and initializing the DRAM array as in the power-up reset. The restart reset sequence is shown in Figure 8.3. In DRAM array initialization, the Access Page (ACP) operation should be performed on one page for every DRAM bank, followed by the Precharge Bank (PRE) operation for every bank. Figure 8.3 shows two approaches to initializing the DRAM array.

Figure 2.1 3D-RAM functional block diagram with external pins
(The figure shows DRAM Banks A through D, each feeding 640 bits to Video Buffer I or Video Buffer II and connected to the 256-bit Global Bus; the SRAM Pixel Buffer and the ALU with 32-bit ports; and the Video Control, DRAM Control, Pixel Control, and Test Access Port blocks with their external pins: VID_CLK, VID_CKE, VID_QSF, VID_OE, VID_Q; DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A; MCLK, RESET; PALU_EN, PALU_WE, PALU_OP, PALU_A, PALU_BE, PALU_DX, PALU_DQ, PASS_OUT, PASS_IN, HIT; SCAN_RST, SCAN_TCK, SCAN_TMS, SCAN_TDI, SCAN_TDO.)

Pixel ALU Interface

These signals control the Pixel ALU and Pixel Buffer.

Table 3.2 Pixel ALU control signals

Signal Name   Pin Count   I/O   Description
PALU_EN       2           I     Enable Pixel ALU operation starting next cycle
PALU_WE       1           I     Pixel ALU write enable
PALU_OP       3           I     Pixel ALU opcode
PALU_A        6           I     Read/Write address
PALU_BE       4           I     Byte write or output enable
PALU_DQ       32          I/O   Data pins
PALU_DX       4           I     Data extension pins for blending
PASS_OUT      1           O     Compare output (special signal level, see Table 7.2)
PASS_IN       2           I     Compare input (special signal level, see Table 7.2)
HIT           1           O     Picking Logic flag output (open-drain, see Table 7.2)
Total         56

PALU_EN[1:0]
The PALU_EN[1:0] pins must be “11” to start a Pixel ALU operation. If either PALU_EN pin is “0”, then all other Pixel ALU pins are ignored.

PALU_BE[3:0]
The PALU_BE[3:0] pins apply to all read and write operations, including register writes and Dirty Tag writes. If PALU_WE is low (“0”), indicating a read, the PALU_BE pins are per-byte output enables. If PALU_WE is high (“1”), indicating a write, the PALU_BE pins are per-byte write enables. PALU_BE0 controls PALU_DQ[7:0]; PALU_BE1 controls PALU_DQ[15:8]; PALU_BE2 controls PALU_DQ[23:16]; and PALU_BE3 controls PALU_DQ[31:24].

PALU_WE
The PALU_WE pin indicates a write operation when high (“1”) and a read operation when low (“0”).

PALU_OP[2:0]
The PALU_OP[2:0] pins, together with PALU_WE, specify the operation to be performed. See Table 3.4 for the Pixel ALU operation encoding.

PALU_A[5:0]
The PALU_A[5:0] pins provide an address for the specified operation.

PALU_DQ[31:0]
Data is read from or written to the PALU_DQ[31:0] pins. The write address of the Pixel Buffer may be input from PALU_DQ[29:24] in some modes of operation. See “An Application of the Write Address Control Register” on page 62.

PALU_DX[3:0]
Extra high-order bits of PALU_DQ data are provided by PALU_DX[3:0]. PALU_DX0 is associated with PALU_DQ[7:0]; PALU_DX1 is associated with PALU_DQ[15:8]; PALU_DX2 is associated with PALU_DQ[23:16]; and PALU_DX3 is associated with PALU_DQ[31:24].

PASS_OUT
The comparison result of the Dual Compare unit is output on the PASS_OUT pin. PASS_OUT is low (“0”) only when the Pixel ALU operation during the fifth stage of the Pixel ALU pipeline is a Stateful Initial/Normal Data Write operation (see “Pixel Data Operations” on page 40) and either the match comparison or the magnitude comparison fails. Otherwise, PASS_OUT is high (“1”), indicating that either the Pixel ALU operation is not a Stateful Initial/Normal Data Write or both comparison tests passed during the Stateful Initial/Normal Data Write.

PASS_IN[1:0]
When the PASS_IN[1:0] pins are high (“11”) and the internal comparison test also passes (PASS_OUT is high (“1”)), data is written to the Pixel Buffer if the Pixel ALU operation is a Stateful Normal/Initial Data Write. Each of the PASS_IN[1:0] pins may be individually masked by the PASS_INs Select register bits 0 and 8, PINS[0, 8], respectively.

HIT
The HIT pin is an open-drain, active low output. This pin reflects the internal status of the HIT flag. See “Compare Control Register (CCR [31:0])” on page 36 for a detailed description.

DRAM Control

These signals command operations on the four DRAM banks, Global Bus and Video Buffer.

Table 3.3 DRAM control signals

Signal Name   Pin Count   I/O   Description
DRAM_EN       1           I     Enable DRAM operation at next cycle
DRAM_OP       3           I     DRAM opcode
DRAM_BS       2           I     DRAM bank select
DRAM_A        9           I     Address for page, block, and video line
Total         15

DRAM_EN
When DRAM_EN is high (“1”) at the rising edge of MCLK, a DRAM operation is initiated at the next clock cycle. Only the selected DRAM bank is enabled.

DRAM_OP[2:0]
The DRAM opcode DRAM_OP[2:0] specifies the DRAM operation. See Table 4.1 for the DRAM operation encoding.

DRAM_BS[1:0]
DRAM_BS[1:0] is used to select one out of four banks. The selection codes are: “00” for Bank A, “01” for Bank B, “10” for Bank C, and “11” for Bank D.

DRAM_A[8:0]
The address pins DRAM_A[8:0] are used to select one of the following: (i) a page in a DRAM bank, (ii) a block of data to be transferred between the sense amplifiers of a DRAM bank and the Pixel Buffer over the Global Bus, or (iii) 80 bytes of video data from the sense amplifiers of a DRAM page to a Video Buffer. Details are described in Chapter 4, “DRAM Operations.”

Video Interface

These signals interface with a video RAMDAC chip.

Table 3.4 Video signals

Signal Name   Pin Count   I/O   Description
VID_CLK       1           I     Video clock
VID_CKE       1           I     Video clock enable
VID_OE        1           I     Video output enable
VID_Q         16          O     Video data bus
VID_QSF       1           O     Video buffer indicator
Total         20

VID_CLK
VID_CLK is a free running or gated video shift clock.

VID_CKE
VID_CKE is a synchronous VID_CLK enable signal. When VID_CKE is high (“1”), the next VID_CLK cycle will be enabled. The video counter will also be enabled in the next cycle.

VID_OE
VID_OE is an asynchronous video output enable for VID_Q. The video data bus is enabled when VID_OE is high (“1”).

VID_Q[15:0]
With the 16-bit video data bus VID_Q[15:0], two bytes of data can be clocked out on the same cycle. In the 8-bit Video Buffer, the output format is arranged as even bytes on VID_Q[7:0] and odd bytes on VID_Q[15:8]. A detailed description of the two output data formats, normal mode and reversed mode, is in “Video Output Operation” on page 48.

VID_QSF
The VID_QSF output indicates which video buffer is currently providing video data. VID_QSF is low (“0”) when Video Buffer I is shifting data out. VID_QSF is high (“1”) when Video Buffer II is shifting data out.

Test Access Port

These signals interface to the Test Access Port for partial compliance with IEEE Standard 1149.1, Test Access Port and Boundary-Scan Architecture. Each of the three input pins SCAN_RST, SCAN_TMS, and SCAN_TDI has an internal pull-up resistor of 10 kohm. See Chapter 10, “JTAG Boundary Scan,” for more details.

Table 3.5 Serial test signals

Signal Name   Pin Count   I/O   Description
SCAN_RST      1           I     Scan reset
SCAN_TCK      1           I     Scan clock
SCAN_TMS      1           I     Scan test mode select
SCAN_TDI      1           I     Scan test data input
SCAN_TDO      1           O     Scan test data output
Total         5

Power & Ground

There are 13 Power Supply pins and 16 Ground pins. The NC pin should not be connected.

Table 3.6 Power and Ground

Signal Name   Pin Count   Description
VSS           16          Ground
VDD           13          Power supply
NC            1           No connection
Total         30

3D-RAM Pinouts

There are two pinouts for 3D-RAM: normal pinout, with pin 1 located at the lower left-hand corner and specially marked by a small circle; and reverse pinout, with pin 1 located at the upper left-hand corner and marked by a large circle and a pointing triangle. The device in normal pinout is designated by the letters “FP” in the product number, and the device in reverse pinout by the letters “RF.” In both pinouts, the mapping of pin number to pin name is identical.

Tracking Label

On the top surface of the 3D-RAM package, a tracking label is printed below the Mitsubishi logo and the 3D-RAM product number. The tracking label consists of 7 numbers followed by a dash and a speed/power grade designation and is represented by the mnemonic “DDDMMMMM-nn”. This mnemonic is explained below.

DDD:    Data code
MMMMM:  Manufacturing code
nn:     “10A”: tCLK (min) = 10 ns
        “10”: tCLK (min) = 10 ns for all operations except tCLK (min) = 12 ns for alpha saturate logic
        “12”: tCLK (min) = 12 ns
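As an illustration of the mnemonic, the following Python sketch splits a tracking label into its fields. It is a hypothetical helper, not a vendor tool. It assumes the label follows the “DDDMMMMM-nn” pattern literally, that is, three digits, five digits, a dash, and one of the three grade strings listed above; the label values used with it are made up.

```python
import re

# Grade strings and their meaning, as listed in the text above.
SPEED_GRADES = {
    "10A": "tCLK (min) = 10 ns",
    "10": "tCLK (min) = 10 ns, except 12 ns for alpha saturate logic",
    "12": "tCLK (min) = 12 ns",
}

# Assumed layout: DDD (3 digits), MMMMM (5 digits), dash, grade.
_LABEL = re.compile(r"(\d{3})(\d{5})-(10A|10|12)")


def parse_tracking_label(label: str) -> dict:
    """Split a tracking label into data code, manufacturing code, and grade."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a DDDMMMMM-nn tracking label: {label!r}")
    ddd, mmmmm, grade = match.groups()
    return {
        "data_code": ddd,
        "manufacturing_code": mmmmm,
        "grade": grade,
        "timing": SPEED_GRADES[grade],
    }
```

A call such as `parse_tracking_label("12345678-10A")` (an invented label) returns the three fields together with the timing note for the grade.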
Normal Pinout Diagram
(Package marking: M5M410092BFP, with the tracking label DDDMMMMM-nn.)

Pin  Signal       Pin  Signal       Pin  Signal       Pin  Signal
1    SCAN_TMS     33   VSS          65   VDD          97   VSS
2    SCAN_TCK     34   VID_Q6       66   PALU_DQ4     98   PALU_DQ24
3    SCAN_RST     35   VID_Q7       67   PALU_DQ5     99   PALU_DQ25
4    VID_Q8       36   SCAN_TDO     68   PALU_DQ6     100  PALU_DQ26
5    VID_Q9       37   SCAN_TDI     69   PALU_DQ7     101  PALU_DQ27
6    VSS          38   VDD          70   VSS          102  VDD
7    VID_Q10      39   VDD          71   PALU_DQ8     103  VDD
8    VID_Q11      40   DRAM_A0      72   PALU_DQ9     104  PALU_DQ28
9    VID_Q12      41   DRAM_A1      73   PALU_DQ10    105  PALU_DQ29
10   VID_Q13      42   DRAM_A2      74   PALU_DQ11    106  PALU_DQ30
11   VDD          43   DRAM_A3      75   VDD          107  PALU_DQ31
12   VID_Q14      44   DRAM_A4      76   PALU_DQ12    108  PALU_DX2
13   VID_Q15      45   DRAM_EN      77   PALU_DQ13    109  PALU_DX3
14   VID_QSF      46   DRAM_A5      78   PALU_DQ14    110  PALU_BE2
15   VID_CKE      47   DRAM_OP0     79   PALU_DQ15    111  PALU_BE3
16   VSS          48   VSS          80   VSS          112  VSS
17   VSS          49   PALU_A0      81   VSS          113  PALU_OP2
18   PASS_IN0     50   PALU_A1      82   NC           114  PALU_WE
19   VDD          51   PALU_A2      83   MCLK         115  PALU_EN1
20   VID_CLK      52   PALU_EN0     84   VDD          116  PALU_A3
21   VSS          53   PALU_OP0     85   PASS_OUT     117  PALU_A4
22   PASS_IN1     54   VSS          86   VSS          118  PALU_A5
23   VSS          55   PALU_OP1     87   VSS          119  VSS
24   VID_OE       56   PALU_BE0     88   PALU_DQ16    120  DRAM_OP1
25   HIT          57   PALU_BE1     89   PALU_DQ17    121  DRAM_OP2
26   VID_Q0       58   PALU_DX0     90   PALU_DQ18    122  DRAM_A6
27   VID_Q1       59   PALU_DX1     91   PALU_DQ19    123  DRAM_A7
28   VDD          60   PALU_DQ0     92   VDD          124  DRAM_A8
29   VID_Q2       61   PALU_DQ1     93   PALU_DQ20    125  DRAM_BS0
30   VID_Q3       62   PALU_DQ2     94   PALU_DQ21    126  DRAM_BS1
31   VID_Q4       63   VDD          95   PALU_DQ22    127  RESET
32   VID_Q5       64   PALU_DQ3     96   PALU_DQ23    128  VDD
1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP 1 SCAN_TMS 2 SCAN_TCK 3 PALU_DQ25 99 4 SCAN_RST VID_Q8 PALU_DQ24 98 5 VID_Q9 VSS PALU_DQ23 97 6 96 7 VSS VID_Q10 PALU_DQ22 95 8 VID_Q11 PALU_DQ21 94 9 VID_Q12 PALU_DQ20 93 10 VID_Q13 VDD PALU_DQ19 92 11 91 12 VDD VID_Q14 PALU_DQ18 90 13 VID_Q15 PALU_DQ17 89 14 VID_QSF PALU_DQ16 88 15 VID_CKE VSS 87 16 VSS VSS 86 17 VSS PASS_OUT 85 18 PASS_IN0 VDD 84 19 VDD MCLK 83 20 VID_CLK NC 82 21 VSS VSS 81 22 PASS_IN1 VSS PALU_DQ15 80 23 VSS 24 VID_OE PALU_DQ14 78 25 PALU_DQ13 77 26 HIT VID_Q0 PALU_DQ12 76 27 VID_Q1 VDD PALU_DQ11 75 28 74 29 VDD VID_Q2 PALU_DQ10 73 30 VID_Q3 PALU_DQ9 72 31 VID_Q4 PALU_DQ8 71 32 VID_Q5 VSS PALU_DQ7 70 33 69 34 VSS VID_Q6 PALU_DQ6 68 35 VID_Q7 PALU_DQ5 67 36 SCAN_TDO PALU_DQ4 66 37 SCAN_TDI VDD 65 38 VDD 79 M5M410092BRF VDD 102 PALU_DQ27 101 PALU_DQ26 100 DDDMMMMM-nn 2 VDD 128 RESET 127 DRAM_BS1 126 DRAM_BS0 125 DRAM_A8 124 DRAM_A7 123 DRAM_A6 122 DRAM_OP2 121 DRAM_OP1 120 VSS 119 PALU_A5 118 PALU_A4 117 PALU_A3 116 PALU_EN1 115 VSS 112 PALU_WE 114 PALU_OP2 113 PALU_BE3 111 PALU_BE2 110 PALU_DX3 109 PALU_DX2 108 PALU_DQ31 107 PALU_DQ30 106 VDD 103 PALU_DQ29 105 PALU_DQ28 104 Pin Descriptions and Pinouts Reverse Pinout Diagram 39 DRAM_A2 DRAM_A1 DRAM_A3 VDD DRAM_A0 42 DRAM_A4 40 43 DRAM_A5 41 44 VSS PALU_A0 45 PALU_A1 DRAM_EN DRAM_OP0 49 PALU_A2 46 50 PALU_EN0 47 51 PALU_OP0 48 52 PALU_DX0 PALU_OP1 PALU_DX1 53 58 PALU_DQ0 54 59 PALU_DQ1 55 60 PALU_DQ2 PALU_BE1 61 PALU_DQ3 VSS PALU_BE0 62 VDD 56 63 57 64 20 M1003 3 Pixel ALU Operations MITSUBISHI Rev. 1.03 3D-RAM (M5M410092B) ELECTRONIC DEVICE GROUP Global Bus. During a Pixel ALU operation, the 32-bit Pixel ALU accesses the Pixel Buffer, requiring not only the block address be specified but also the 32-bit word be identified. This is done via the 6-bit PALU_A pins. The upper three bits select one of eight blocks in the Pixel Buffer, and the lower three bits specifies one of the eight words in the selected block. 
The availability of both the DRAM_A and PALU_A pins allows concurrent DRAM and Pixel ALU operations. Since a word is mapped directly to PALU_DQ[31:0], PALU_DQ[7:0] is byte 0, PALU_DQ[15:8] is byte 1, PALU_DQ[23:16] is byte 2, and PALU_DQ[31:24] is byte 3. Figure 3.1 is a simplified block diagram of these Pixel Buffer elements. This chapter discusses details on the elements and operations of the Pixel Buffer and Pixel ALU in 3D-RAM. An operation that involves only the Pixel ALU and the Pixel Buffer is called a Pixel ALU operation. An operation that involves a DRAM array is categorized as a DRAM operation and is described in Chapter 4, “DRAM Operations.” All registers of the 3D-RAM are defined and explained in this chapter. Elements of the Pixel Buffer Block and Word As stated in Chapter 2, the 2,048-bit Pixel Buffer is organized into eight 256-bit blocks. During a DRAM operation, these blocks can be addressed from the DRAM_A pins for block transfers on the Dirty Tag RAM 0 1 2 3 4 5 0 8 16 24 1 Pixel Buffer 6 7 0 1 2 3 4 5 6 7 9 17 25 2 10 18 26 3 11 19 27 4 12 20 28 5 13 21 29 6 14 22 30 7 15 23 31 0 2 Dirty Tag for Block 0 1 4 3 5 6 7 Block 0 of Pixel Buffer 7:0 15:8 23:16 31:24 7:0 15:8 23:16 31:24 Plane Mask Word 0 in Block 0 Figure 3.1 Pixel Buffer elements 21 3 Pixel ALU Operations Pixel ALU Operations Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP When data is transferred from a Pixel Buffer block to the sense amplifiers of a DRAM bank (i.e., a Write Block transfer, another DRAM operation), the Dirty Tag determines which data bytes can be written into the sense amplifiers. When a Dirty Tag bit is “1”, the corresponding data byte is written under the control of the Plane Mask register (see the following section). When a Dirty Tag bit is “0”, the corresponding byte of data in the DRAM bank is not written and retains its former value. 
Because the Dirty Tag prevents the unaltered bytes of a 256-bit block from being written into a DRAM bank, the power consumption of a Write Block transfer may be reduced by as much as 50%. This may be a significant power saving when a high-resolution display is constantly redrawn, such as in the case of high-quality full-screen animation.

Dirty Tag

Each data byte of a 256-bit block is associated with a Dirty Tag bit. This means that each 4-byte word is associated with four Dirty Tag bits and that a 32-bit Dirty Tag controls the corresponding 32-byte block. The Dirty Tag RAM in the Pixel Buffer contains eight such 32-bit Dirty Tags.

There are three aspects of Dirty Tag operations: tag clear, tag set, and tag initialization. In normal operation modes, the clearing and setting of the Dirty Tag by read and write operations are done by the on-chip logic in the 3D-RAM and are essentially transparent to the rendering controller. The Dirty Tag bits are used by the 3D-RAM internally and are not output to the external pins.

When data is transferred from the sense amplifiers of a DRAM bank to a Pixel Buffer block over the Global Bus (i.e., a Read Block transfer, which is a DRAM operation described in the next chapter), all 32 Dirty Tag bits associated with the selected Pixel Buffer block are cleared to “0”. When a data word is read from the 32-bit ALU port of the Pixel Buffer, none of the Dirty Tags is affected, and the Dirty Tags have no effect on the outgoing data. The setting and initialization of the Dirty Tags are described in the paragraphs below.

Table 3.1 Pixel ALU operations involving Dirty Tags

Pixel Operation | Pixel Data | New Dirty Tag Contents
(Stateful/Stateless) Normal Data Write | Write bytes 0 to 3 from PALU_DQ pins (per PALU_BE pins) | The four addressed Dirty Tag bits are ORed with PALU_BE[3:0]; the other 28 Dirty Tag bits are unchanged.
(Stateful/Stateless) Initial Data Write | Write bytes 0 to 3 from PALU_DQ pins (per PALU_BE pins) | PALU_BE[3:0] is written to the 4 addressed Dirty Tag bits; “0” is written to the 28 unaddressed Dirty Tag bits.
Replace Dirty Tag | Unchanged | PALU_DQ[31:0] replaces 32 Dirty Tag bits.
OR Dirty Tag | Unchanged | All 32 Dirty Tag bits are ORed with PALU_DQ[31:0].

The Dirty Tag bits play an important role in all four write operations of the Pixel ALU to the Pixel Buffer: Stateful/Stateless Initial Data Write and Stateful/Stateless Normal Data Write. (These operations are also explained in “Pixel ALU Operations” on page 56.) Since the Pixel ALU operations conform to the 7-stage pipeline, the byte enable data on PALU_BE[3:0] also enters the pipeline when the operation is issued. At the end of the pipeline, pixel data is written into a Pixel Buffer word and the PALU_BE[3:0] values can change the four corresponding Dirty Tag bits.

In the Initial Data Write operation, the four addressed Dirty Tag bits are replaced with PALU_BE[3:0], while the other 28 Dirty Tag bits for the same block are cleared to “0”. In the Normal Data Write operation, each of the four addressed Dirty Tag bits is set to “1” only when the corresponding PALU_BE pin is “1”. An addressed Dirty Tag bit is unchanged if the corresponding PALU_BE pin is “0”. The other 28 Dirty Tag bits for the same block are also unchanged.

The 32 Dirty Tag bits for a particular block can all be replaced with the PALU_DQ data through the Pixel ALU operation “Replace Dirty Tag.” Another Pixel ALU operation, “OR Dirty Tag,” replaces the Dirty Tag contents of the addressed block with the bitwise OR of the original Dirty Tag data and the PALU_DQ[31:0] data. The bit mapping between the Dirty Tag and the PALU_DQ pins is illustrated in Figure 3.1. For example, to change the Dirty Tag bits for word 0, the data should be placed on PALU_DQ0, PALU_DQ8, PALU_DQ16, and PALU_DQ24. To change the Dirty Tag bits for word 5, the data should be on PALU_DQ5, PALU_DQ13, PALU_DQ21, and PALU_DQ29. The following sub-section provides an application of the “Replace Dirty Tag” and “OR Dirty Tag” operations.

Using Dirty Tag for Color Expansion

Many 2D rendering operations, such as text drawing, involve writing the same color to many pixels. These operations can be greatly accelerated by specifying individual pixels with a single bit and having hardware automatically expand each bit to an entire pixel.

In 3D-RAM, color expansion is done with the Dirty Tags associated with the Pixel Buffer blocks. The pixel color is written eight times to a Pixel Buffer block so that all of the pixels in the block are the same color. Next, a 32-bit word is written to the Dirty Tag of the associated block. Finally, the block is written to a DRAM bank. Each pixel whose corresponding Dirty Tag bit is set is changed to the new color; the other pixels are unaffected. A new 32-bit word may be written to the Dirty Tag afterwards, and the same Pixel Buffer block may be written to a different part of the DRAM array. Thus, one Pixel Buffer block can hold the foreground color and be used repeatedly to write text to the frame buffer.

Plane Mask

The 32-bit Plane Mask register (PM[31:0]) is used to qualify two write functions: (1) as per-bit write enables on 32-bit data for a Stateful (Initial/Normal) Data Write operation from the Pixel ALU to the Pixel Buffer; (2) as per-bit write enables on 256-bit data for a Masked Write Block (MWB) operation from the Pixel Buffer to the sense amplifiers of a DRAM bank over the Global Bus.

For a Stateful Data Write, the Plane Mask serves as per-bit write enables over the data entering from the Pixel ALU write port: bit 0 of the Plane Mask enables or disables bit 0 of the incoming 32-bit pixel data, bit 1 of the Plane Mask enables or disables bit 1 of the incoming 32-bit pixel data, and so on.

For a Masked Write Block operation on the Global Bus side, when a Pixel Buffer block is transferred out to the DRAM, the 32-bit Plane Mask applies to every 32-bit word as per-bit write enables. In other words, bit 0 of the Plane Mask enables or disables bits 0, 32, 64, 96, 128, 160, 192, and 224 of the 256-bit block; bit 1 of the Plane Mask enables or disables bits 1, 33, 65, 97, 129, 161, 193, and 225; and so on.

A particular sense amplifier bit can be written only if both the Dirty Tag bit and the Plane Mask bit are logically “1”. This relationship among multiple enables and block data is illustrated in Figure 3.2 for the first 40 bits of the Global Bus (Word 0 and byte 0 of Word 1).

The Plane Mask register is loaded through a Pixel ALU “Write Control Register” operation. The mapping of the Plane Mask to the PALU_DQ pins is the same as the mapping of Word data to the pins (see also the section on “Block and Word” on page 21).

It is important to note the simultaneous effects of the Plane Mask. Although 3D-RAM allows concurrent operations of the Pixel ALU and DRAM, the user is cautioned that there is only one set of Plane Mask bits, and it affects both Pixel ALU write and DRAM write operations at the same time. When different plane maskings are required, concurrent Pixel ALU Stateful Data Write operations and DRAM Masked Write Block operations must be avoided. Once the Plane Mask is written, the new Plane Mask is effective only for the Stateful Data Write operations issued at later cycles, thereby conforming to the uniform 7-stage pipeline rule.
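The Plane Mask and Dirty Tag enables, and the color expansion sequence built on them, can be sketched in Python as follows. This is an illustrative model under stated assumptions rather than the device logic: the function names, the representation of a block as eight 32-bit integers, and the use of 8-bit pixels (so that one Dirty Tag bit corresponds to one pixel) are all hypothetical choices.

```python
MASK32 = 0xFFFFFFFF


def stateful_write(old_word: int, new_word: int, plane_mask: int) -> int:
    """Per-bit write enable: Plane Mask bit i gates bit i of the new data."""
    return ((old_word & ~plane_mask) | (new_word & plane_mask)) & MASK32


def masked_write_block(sense_amps, block, dirty_tag, plane_mask):
    """Masked Write Block: a bit is written only if both its Dirty Tag
    bit (per byte) and its Plane Mask bit (per bit, repeated for every
    32-bit word) are 1."""
    out = []
    for w in range(8):
        byte_enable = 0
        for b in range(4):
            # Dirty Tag bit for word w, byte b is bit w + 8*b.
            if (dirty_tag >> (w + 8 * b)) & 1:
                byte_enable |= 0xFF << (8 * b)
        enable = byte_enable & plane_mask
        out.append(((sense_amps[w] & ~enable) | (block[w] & enable)) & MASK32)
    return out


def color_expand(sense_amps, color8, glyph_bits, plane_mask=MASK32):
    """Color expansion, assuming 8-bit pixels: fill the block with one
    color (eight word writes), load the Dirty Tag with the bitmap, then
    write the block out."""
    word = (color8 & 0xFF) * 0x01010101   # same color in all four bytes
    block = [word] * 8                    # the word written eight times
    return masked_write_block(sense_amps, block, glyph_bits, plane_mask)


if __name__ == "__main__":
    background = [0] * 8
    result = color_expand(background, 0x55, 0x00000101)
    print([hex(w) for w in result])
```

Reusing the same block with a different `glyph_bits` value corresponds to loading a new Dirty Tag and writing the block to another part of the DRAM array.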
[Figure 3.2 The relationship between Dirty Tags and Plane Mask for the first 40 bits of the Global Bus. (Both the Dirty Tag bit and the Plane Mask bit must be 1 before a particular sense amplifier bit can be written.)]

Elements of the Pixel ALU

Chapter 2 presented an overview of the Pixel ALU, with an emphasis on the motivation and applications of the elements in the Pixel ALU. In this section, some of the same information is repeated, but the emphasis is on detailed technical specification.

The elements of the Pixel ALU are four 8-bit ROP/Blend units, one 32-bit Match Compare unit, one 32-bit Magnitude Compare unit, and the Picking Logic. Figure 3.3 shows the inputs to the ROP/Blend units and the Dual Compare unit. In the figure, bus “O” is the old data from the Pixel Buffer; “NX” and “N” are from the PALU_DX and PALU_DQ pins, respectively; and buses “KX” and “K” are from the internal 36-bit Constant Source register, with “KX” being the most significant four bits. The inputs to the Dual Compare unit are straightforward. The inputs to the ROP/Blend units are explained in the following sub-sections.

[Figure 3.3 Pixel ALU (pipeline stages are not shown). ROP/Blend unit n (n = 0 to 3) receives O[8n+7:8n], {NX3, N[31:24], NXn, N[8n+7:8n]}, and {KXn, K[8n+7:8n]}; the Dual Compare unit receives O[31:0], N[31:0], and K[31:0]. Other labels: PALU_DX[3:0], PALU_DQ[31:0], Constant Source, Pixel Buffer ALU Read Port and ALU Write Port, PASS_IN[1:0], PASS_OUT.]

ROP/Blend Units

Each ROP/Blend unit can be independently configured as either a ROP unit or a Blend unit through the programming of the ROP/Blend Control register. Each ROP unit can perform all 16 standard ROP functions, which are listed in Table 3.16. ROP functions are performed on a byte of the Old Data (“O”) from the Pixel Buffer and a byte of the New Term, which is either the data from the data pins (“N”) or the data from the Constant Source register (“K”).

For the blending operation, the general equation is as follows:

  Write data to Pixel Buffer = New Term + (Old Data x Old Fraction)
                             = (New Data x New Fraction) + (Old Data x Old Fraction)

To each Blend unit, an ADDEND (e.g. the “New Term” or 00h) is input from the PALU_DX and PALU_DQ pins (marked as {“NX”, “N”}), from the Pixel Buffer (marked as “O”), or from the Constant Source register (marked as {“KX”, “K”}). Multiplicand 1 (marked as “MULTP1”) is the fraction term and is from one of five sources; Multiplicand 2 (marked as “MULTP2”) is the data term and is from one of six sources. See Table 3.5 for a complete selection mapping of the Addend and Multiplicands.

In OpenGL terminology (see “The OpenGL Graphics System: A Specification (Version 1.1)”), New Data represents the color values of the source (SRC_Color), which enter the Pixel ALU path from the PALU_DQ pins; Old Data represents the color values of the destination (DST_Color), which are from the Pixel Buffer. New Fraction is known as the source blend factor (sfactor); Old Fraction is known as the destination blend factor (dfactor). The color values, SRC_Color and DST_Color, can be represented in RGBA quadruplet form as (Rs, Gs, Bs, As) and (Rd, Gd, Bd, Ad), respectively. Define sfactor and dfactor as (Sr, Sg, Sb, Sa) and (Dr, Dg, Db, Da), respectively. The blending equation can be rewritten as:

  Write data to Pixel Buffer = (SRC_Color x sfactor) + (DST_Color x dfactor)
    = (Rs, Gs, Bs, As) x (Sr, Sg, Sb, Sa) + (Rd, Gd, Bd, Ad) x (Dr, Dg, Db, Da)
    = (Rs x Sr + Rd x Dr, Gs x Sg + Gd x Dg, Bs x Sb + Bd x Db, As x Sa + Ad x Da)

All the possible values for OpenGL blending factors are listed in Table 3.2. The subtraction of quadruplets means subtracting them componentwise. The column “Relevant Factor” indicates whether the corresponding parameter can be used to specify the source or destination blend factor. The first 11 rows in the table are parameters in the OpenGL specification 1.1; the last 4 rows are parameters specified by the OpenGL imaging extension GL_EXT_blend_color.

Table 3.2 Source and Destination Blending Factors

OpenGL Parameter | Relevant Factor | Computed Blend Factor
GL_ZERO | source or destination | (0,0,0,0)
GL_ONE | source or destination | (1,1,1,1)
GL_DST_COLOR | source | (Rd,Gd,Bd,Ad)
GL_SRC_COLOR | destination | (Rs,Gs,Bs,As)
GL_ONE_MINUS_DST_COLOR | source | (1,1,1,1) – (Rd,Gd,Bd,Ad)
GL_ONE_MINUS_SRC_COLOR | destination | (1,1,1,1) – (Rs,Gs,Bs,As)
GL_SRC_ALPHA | source or destination | (As,As,As,As)
GL_ONE_MINUS_SRC_ALPHA | source or destination | (1,1,1,1) – (As,As,As,As)
GL_DST_ALPHA | source or destination | (Ad,Ad,Ad,Ad)
GL_ONE_MINUS_DST_ALPHA | source or destination | (1,1,1,1) – (Ad,Ad,Ad,Ad)
GL_SRC_ALPHA_SATURATE | source | (f,f,f,1); f=min(As, 1–Ad)
GL_CONSTANT_COLOR_EXT | source or destination | (Rk,Gk,Bk,Ak)
GL_ONE_MINUS_CONSTANT_COLOR_EXT | source or destination | (1,1,1,1) – (Rk,Gk,Bk,Ak)
GL_CONSTANT_ALPHA_EXT | source or destination | (Ak,Ak,Ak,Ak)
GL_ONE_MINUS_CONSTANT_ALPHA_EXT | source or destination | (1,1,1,1) – (Ak,Ak,Ak,Ak)

Pixel ALU Blend Factor Selections (NEW)

The full blending function requires two multiplications and one addition for each of the four components in the quadruplet. Enumerating the twelve possible values of the destination blend factor and the thirteen possible values of the source blend factor, we arrive at the 156 blending factor selection pairs illustrated in the matrix in Table 3.3. Of these 156 pairs, 72 are required by the OpenGL Specification 1.1 and the others are from the extension GL_EXT_blend_color.

The majority of applications use a small number of pair combinations. Most of the blending with (0,0,0,0) or (1,1,1,1) as the blending factor can be realized with a half blender, meaning that only one multiplication and the addition are required in 3D-RAM. Furthermore, if one of the multiplications does not require destination colors or destination alpha from the frame buffer, this multiplication can be performed inside the rendering controller without having to read the destination data out of the frame buffer. Thus, only a half blender is needed inside the 3D-RAM to complete the blending equation in these cases. For the rest of the blending factor selections, the blending function can be completed in two consecutive cycles using the 3D-RAM’s Two-Cycle Blend operation: one of the two product terms in the equation above is looped back during the first cycle, and the looped-back product term is combined with the other product term during the second cycle.

3D-RAM accelerates 44 blending factor selection pairs at single-clock-cycle throughput by half blending and the other 112 blending factor pairs in two cycles. In addition, there are five cases in which two-cycle blending factor pairs may be accelerated in just one cycle if the alpha blending can be ignored.

To help sort out the different sources for the various blending factors for both the single-cycle half blending and the two-cycle full blending, Table 3.3 is notated with example sources of all OpenGL blending factors. For example, all blending factors related to the alpha component should be selected through the MULTP2 datapath; these include DST_ALPHA, ONE_MINUS_DST_ALPHA, and SRC_ALPHA_SATURATE and are notated with “M2” in their respective rows and columns. Some source blending factors are not passed to 3D-RAM directly; rather, the product of the source blending factor and the source color is passed to 3D-RAM as the ADDEND term. These include ZERO, ONE, SRC_ALPHA, and ONE_MINUS_SRC_ALPHA.

Table 3.3 OpenGL blending factor selection matrix

[Matrix of the 13 source blend factors (rows) by the 12 destination blend factors (columns); each cell is marked “x”, “o”, or “c, o” as defined in the legend.]

Source Blend Factor (rows): ZERO (A), ONE (A), DST_COLOR (M2), ONE_MINUS_DST_COLOR (M2), SRC_ALPHA (A), ONE_MINUS_SRC_ALPHA (A), DST_ALPHA (M2), ONE_MINUS_DST_ALPHA (M2), SRC_ALPHA_SATURATE (M2), CONSTANT_COLOR (A), ONE_MINUS_CONSTANT_COLOR (A), CONSTANT_ALPHA (A), ONE_MINUS_CONSTANT_ALPHA (A).

Destination Blend Factor (columns): ZERO (KX, K), ONE (M1), SRC_COLOR (M1), ONE_MINUS_SRC_COLOR (M1), SRC_ALPHA (M1), ONE_MINUS_SRC_ALPHA (M1), DST_ALPHA (M2), ONE_MINUS_DST_ALPHA (M2), CONSTANT_COLOR (M1), ONE_MINUS_CONSTANT_COLOR (M1), CONSTANT_ALPHA (M1), ONE_MINUS_CONSTANT_ALPHA (M1).

Legend: “x” = half blending in a single clock cycle; “o” = full blending in two cycles using the Two-Cycle Blend operation; “c” = one-cycle blending with alpha blending ignored. K = Constant Source register; A = ADDEND; M1 = MULTP1; M2 = MULTP2.

Blending Operation (NEW)

The simplified block diagram of the Blending unit is illustrated in Figure 3.4. To execute a single-cycle blending operation, the multiplicands and addend (MULTP1, MULTP2, ADDEND) must be selected by programming the ROP/Blend and Blend_2 Control registers. Also, the ROP/Blend Control register must be set for blending. Once these registers are set, the blending operation is accomplished by performing a Stateful Write operation. Each Blend unit first performs the multiplication of MULTP1 and MULTP2 and then the addition of the resulting product with the ADDEND, thereby completing a half blend.

To execute a Two-Cycle Blend operation, it is necessary to program three registers: the Preblend Control register and the ROP/Blend and Blend_2 Control registers. The Preblend Control register selects MULTP2 for the Preblend Cycle (the first cycle of the Two-Cycle Blend operation) and the ADDEND for the Normal Cycle (the second cycle of the Two-Cycle Blend operation). During the Preblend Cycle, ADDEND is fixed to the {PALU_DX, PALU_DQ} or {KX, K} bus, and MULTP1 is fixed to the {PALU_DX, PALU_DQ} bus. The ROP/Blend and Blend_2 Control registers are programmed to select the MULTP1 and MULTP2 components for the Normal Cycle; the ADDEND selected by these two registers is ignored by the adder for the Preblend Cycle and may be “looped back” as one of the two choices for ADDEND during the Normal Cycle.

Once these three registers have been programmed, an “Initial Two-Cycle Blending” operation (PALU_OP=110, PALU_WE=1) should be performed and followed by a Stateful Initial Data Write or Stateful Normal Data Write operation on the same pixel location (i.e. PALU_A with the same Block and Word address and PALU_BE[3:0] with the same enable settings). During the Preblend Cycle, MULTP1 and MULTP2 are multiplied and the result is “looped back” one stage in the pipeline. The ADDEND is also “looped back” one stage. The ADDEND and the multiplier output are then available as a possible ADDEND for the next cycle. Next, the Stateful Write is issued with the multiplicands selected by the ROP/Blend and Blend_2 Control registers. The ADDEND selected by the ROP/Blend and Blend_2 registers will be ignored. The blending occurs just as it would for a single-cycle operation except that the ADDEND source is chosen to be either the “looped back” multiplier output or the “looped back” ADDEND, based on the settings of the Preblend Control register.

Figure 3.4 illustrates the above description with a simplified block diagram. The blocks labelled “N:NN”, “N:N0”, and “NX:N” on the blending path represent the manner in which the 8-bit Blend units duplicate 4-bit data for the special (4,4,4,4) 16-bit color mode. Specifically, “N:NN” means that the 4-bit data is duplicated nibble-wise to form 8-bit data; “N:N0” means that 8-bit data is formed by padding the lower nibble with 0000b; and “NX:N” means that 4-bit data is produced by truncating the lower nibble, regardless of its value. More explanation may be found in the section on “4-bit to 8-bit Expansion for Pixel ALU.”

ROP/Blend units 0, 1, and 2 are identical, but unit 3 is slightly different because this unit typically handles the Alpha data. The Alpha-Saturate block shown in Figure 3.4 and Figure 3.5 is only present in ROP/Blend unit 3. The result from the Alpha-Saturate block is routed to all four ROP/Blend units as a possible source of MULTP2. For the specifics of the data multiplexing and selections by the various register bits, refer to Figure 3.6. The timing diagram of an example Two-Cycle Blend operation is presented in Figure 3.7.

Note that the special OpenGL stencil mode, which will be described in the section on “Stencil Modes,” uses portions of ROP/Blend unit 3 to accomplish its functions. For simplicity, the stencil logic is not shown in Figure 3.4 and, for the most part, can be thought of as a separate unit. It is important to note, however, that the stencil logic uses portions of the blending path and therefore ROP/Blend unit 3 cannot be used for blending when the OpenGL stencil mode is being used.

[Figure 3.4 ROP/Blend unit n (pipeline stages are not shown). Labelled elements: MULTP1, MULTP2, ADDEND, Mult, Intermediate Result, Clamp, Clamped Result, ROP, the 4-to-8-bit expansion blocks for 16-bit mode (N:NN, N:N0, NX:N), two 1-stage delays, and the α-sat block, with the output going to the ALU Write Port of the Pixel Buffer. Inputs shown: {NX3, N[31:24]}, 100h, N[31:24], O[31:24], {NXn, N[8n+7:8n]}, {KXn, K[8n+7:8n]}, and O[8n+7:8n]. Note: the α-sat block is only present in ROP/Blend unit 3; its result is passed to all four ROP/Blend units.]

[Figure 3.5 Block diagram of the Alpha-Saturate unit. A comparator (A>B) on N[31:24] and ~O[31:24], split into upper and lower nibbles, produces min(N[31:24], ~O[31:24]); a multiplexer controlled by BLD2[29:28], whose labelled inputs include N[31:24] and O[31:24], drives the output to all ROP/Blend units. Note: the comparator can be either a 4-bit or an 8-bit comparator, based on the Color Depth Select register.]
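The alpha-saturate value that Figure 3.5 labels min(N[31:24], ~O[31:24]) can be written out as a short Python sketch. Only the 8-bit comparison is modelled; the function name is hypothetical, and the 4-bit comparator mode selected by the Color Depth Select register is deliberately left out because its exact behavior is not described in this section.

```python
def alpha_saturate(src_alpha: int, dst_alpha: int) -> int:
    """Return min(As, ~Ad) for 8-bit alpha values.

    src_alpha corresponds to N[31:24] and dst_alpha to O[31:24].
    The bitwise inverse ~Ad stands in for (1 - Ad).
    """
    inv_dst = ~dst_alpha & 0xFF     # 8-bit one's complement of Ad
    return min(src_alpha & 0xFF, inv_dst)


if __name__ == "__main__":
    # Destination alpha 0xC0 leaves 0x3F of "room", which limits f.
    print(hex(alpha_saturate(0x80, 0xC0)))
    # Source alpha 0x10 is already below the remaining room 0xDF.
    print(hex(alpha_saturate(0x10, 0x20)))
```

The result is the factor f used for the color components in GL_SRC_ALPHA_SATURATE blending, which is why the block's output is routed to all four ROP/Blend units.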
[Figure 3.6 Details of data selections in Blend unit n by the various register bits. Selector labels in the figure: RBC[3:0], RBC[8n+5], RBC[8n+6], RBC[8n+7], BLD2[8n], BLD2[8n+1], BLD2[8n+2], BLD2[8n+3], PBC[8n], PBC[8n+2], PBC[8n+3], CDS[0], PALU_BEn, and the Preblend Command Cycle (PALU_OP[2:0]=110). Datapath labels: MULTP1, MULTP2, Multiply, Add, Clamp, ROP, α-Sat, nibble mode, and nibble DUP direction, with the output going to the ALU Write Port of the Pixel Buffer. Nibble DUP direction: when PALU_BEn = “0”, the upper nibble is copied to the lower nibble; when PALU_BEn = “1”, the lower nibble is copied to the upper nibble. Notes: byte nibble control logic is not shown for the (4,4,4,4) blending mode; stencil logic is not shown for the selection of ROP or Stencil OP; pipeline stages are not shown.]

[Figure 3.7 An example of a Two-Cycle Blend operation (timing diagram over MCLK cycles 1 to 11). Signals shown: PALU_EN, PALU_BE, PALU_WE; PALU_OP; PALU_A; PALU_DQ, PALU_DX; Control Register; Pixel Buffer; PASS_IN; PASS_OUT. The diagram shows three control register writes (PALU_OP = 111 with PALU_A = 000100, 001000, and 001001, labelled ROP/Blend, Blend_2, and Preblend in that order), followed by the Preblend cycle (PALU_OP = 110, Preblend Data) and the Normal Data write (PALU_OP = 010 or 011) to the same Block:Word address.]
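The data flow of a Two-Cycle Blend can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not the hardware: values are idealized real numbers in the range 0.0 to 1.0 rather than the 8-bit and 9-bit quantities used by the device, pipeline timing is ignored, and the function and parameter names (including the "MPY" and "ADD" selector strings, borrowed from the Table 3.5 legend) are hypothetical.

```python
def two_cycle_blend(pre_multp1, pre_multp2, pre_addend,
                    multp1, multp2, loop_back):
    """Model one Blend unit over a Preblend Cycle and a Normal Cycle.

    Preblend Cycle: the product MULTP1 x MULTP2 and the ADDEND are
    each looped back one stage.
    Normal Cycle: a new product is formed, and the ADDEND is taken
    from the looped-back product ("MPY") or looped-back addend ("ADD").
    """
    looped_product = pre_multp1 * pre_multp2
    looped_addend = pre_addend

    if loop_back == "MPY":
        addend = looped_product
    elif loop_back == "ADD":
        addend = looped_addend
    else:
        raise ValueError("loop_back must be 'MPY' or 'ADD'")

    result = multp1 * multp2 + addend
    return min(1.0, max(0.0, result))   # clamp to the representable range


if __name__ == "__main__":
    cs, cd = 0.5, 0.25
    # sfactor = DST_COLOR, dfactor = SRC_COLOR: Cs*Cd + Cd*Cs.
    print(two_cycle_blend(cs, cd, 0.0, cs, cd, "MPY"))
```

In the "ADD" case the rendering controller is assumed to have supplied the source product term itself (for example Cs*As), so the Preblend Cycle only carries that addend forward.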
The mathematical operations performed in the Blend unit are summarized in Table 3.4. The Clamped Result is written to the Pixel Buffer, depending (1) on the PASS_OUT pin, which is the result of the internal Compare units, and (2) on the PASS_IN[1:0] pins, which receive the PASS_OUT signal from the preceding 3D-RAM.

Table 3.4 Mathematical operations in Blend unit n

Operand | Range | Sources | Comments
Multiplicand 1 | 0.00h ~ 0.FFh (8-bit unsigned) | {NXn, N[8n+7:8n]}* | Source is from PALU_DXn and PALU_DQ[8n+7:8n] pins
Multiplicand 1 | 0.00h ~ 0.FFh (8-bit unsigned) | {NX3, N[31:24]}* | Source is from PALU_DX3 and PALU_DQ[31:24] pins
Multiplicand 1 | 0.00h ~ 0.FFh (8-bit unsigned) | O[8n+7:8n] | Source is from the SRAM Pixel Buffer
Multiplicand 1 | 0.00h ~ 0.FFh (8-bit unsigned) | {KXn, K[8n+7:8n]}* | Source is from the internal Constant Register
Multiplicand 1 | 1.00h (9-bit constant 1.00h) | 1.00h | Multiplicand 1 greater than 1.00h is clamped to 1.00h
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | O[8n+7:8n] | Source is from the SRAM Pixel Buffer
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | ~O[8n+7:8n] | Source is inverted from O[8n+7:8n]
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | min{N[31:24], ~O[31:24]} (α-sat) | Source is from the α-saturate block in ROP/Blend Unit 3
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | N[31:24] |
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | O[31:24] |
Multiplicand 2 | 0 ~ 255 (8-bit unsigned) | ~O[31:24] |
Addend | –256 ~ 255 (9-bit signed) | {NXn, N[8n+7:8n]} | Source is from PALU_DXn and PALU_DQ[8n+7:8n] pins
Addend | –256 ~ 255 (9-bit signed) | {KXn, K[8n+7:8n]} | Source is from the internal Constant Register
Addend | –256 ~ 255 (9-bit signed) | Previous Addend: {NXn, N[8n+7:8n]} or {KXn, K[8n+7:8n]} | Source is from the previous stage Addend (Loop Back Blending)
Addend | 0 ~ 255 (8-bit unsigned) | O[8n+7:8n] | Source is from the SRAM Pixel Buffer
Addend | 0 ~ 255 (8-bit unsigned) | MULTP1 x MULTP2 | Source is from the previous stage multiplier output (Loop Back Blending)
Intermediate Result | –256 ~ 510 (10-bit signed) | (MULTP1 x MULTP2) + Addend |
Clamped Result | 0 ~ 255 (8-bit unsigned) | Intermediate Result | If source > 255, then result = 255; else if source < 0, then result = 0; else result = source. The Clamped Result is written to the Pixel Buffer if the pass condition is valid.

*The multiplier is an 8 x 8 unsigned integer multiplier, so only the lower 8 bits of multiplicand 1 are multiplied with multiplicand 2. However, if NXn, NX3, or KXn is “1” or if 1.00h is selected, then the calculated multiplier output is ignored and multiplicand 2 is passed through as the multiplier output.

Table 3.5 lists possible multiplicand/addend selections for each OpenGL blending mode. Note that the “Preblend Cycle” column only applies to two-cycle blending operations. Each table entry represents data for all four Blend units, and some entries contain two terms. The first term applies to the Blend units that are designated for blending color data (Blend units 0, 1, and 2). The second term is for the Blend unit operating on the alpha value (unit 3). The individual color terms have been grouped together in the table for simplicity. For example, the term “Cd, Ad” represents “Rd, Gd, Bd, Ad” and “1–Cd, 1–Ad” represents “1–Rd, 1–Gd, 1–Bd, 1–Ad.” Note that terms such as “1–Cd” and “1–Ad” that are generated inside the 3D-RAM are approximated by the 1’s complement. For example, the term “1–Ad” is actually ~Ad, the bitwise inverse of Ad. Note also that in alpha_saturate blending, the multiplicand selections for color and alpha are different.

There are certainly more ways to do the blending operations than those listed in Table 3.5. The list demonstrates that the 3D-RAM supports all OpenGL blending modes.

The “Alpha” values, denoted as As and Ad, should be placed at the most significant byte of the respective bus, i.e. As at N[31:24], which is from PALU_DQ[31:24], and Ad at O[31:24], which is from the Pixel Buffer.
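The fixed-point behavior summarized in Table 3.4 can be sketched in Python as follows. The pass-through rule for 1.00h, the signed 9-bit addend, the clamp, and the 1's complement approximation come from the text above; the function names and, in particular, the scaling of the 8 x 8 product back to 8 bits by truncation (a right shift by 8) are assumptions, since the rounding behavior is not stated in this section.

```python
def one_minus(value8: int) -> int:
    """Approximate (1 - x) by the 8-bit one's complement, as the device does."""
    return ~value8 & 0xFF


def blend_unit(multp1: int, multp2: int, addend: int) -> int:
    """Model one half blend: clamp((MULTP1 x MULTP2) + ADDEND).

    multp1: 9-bit value; bit 8 set means 1.00h (values above are clamped).
    multp2: 8-bit unsigned data term, 0..255.
    addend: signed value in the range -256..255.
    """
    if multp1 & 0x100:
        product = multp2                  # 1.00h: multiplicand 2 passes through
    else:
        # ASSUMPTION: 0.xxh fraction times 8-bit data, truncated to 8 bits.
        product = ((multp1 & 0xFF) * multp2) >> 8
    intermediate = product + addend       # -256..510 per Table 3.4
    return max(0, min(255, intermediate)) # Clamped Result


if __name__ == "__main__":
    print(blend_unit(0x100, 200, 100))          # pass-through, then clamp
    print(blend_unit(0x80, 0x40, 0))            # 0.80h x 40h
    print(blend_unit(one_minus(0xC0), 0xFF, 0)) # (1 - Ad) approximated by ~Ad
```

Note that with the 1's complement approximation, 1–Ad for Ad = 00h yields 0.FFh rather than 1.00h, so a fully transparent destination does not quite pass the data term through unchanged in this model.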
The individual color terms have been grouped Table 3.5 Multiplicand/Addend selection for each OpenGL blending factor pairs Blending Fractions Preblend Cycle Normal Cycle sfactor dfactor MULTP1 MULTP2 ADDEND MULTP1 MULTP2 ADDEND 0, 0 0, 0 na na na 0,0 (from K) Cd, Ad 0,0 (from DQ) 1, 1 0, 0 na na na 0,0 (from K) Cd, Ad Cs, As Cd, Ad 0, 0 na na na Cs, As Cd, Ad 0,0 (from K) 1–Cd, 1–Ad 0, 0 na na na Cs, As 1–Cd, 1–Ad 0,0 (from K) As, As 0, 0 na na na 0,0 (from K) Cd, Ad Cs*As, As*As na na na Cs, As As, As 0,0 (from K) 1–As, 1–As 0, 0 na na na 0,0 (from K) Cd, Ad Cs*(1–As), As*(1–As) Ad, Ad 0, 0 na na na Cs, As Ad, Ad 0,0 (from K) 1–Ad, 1–Ad 0, 0 na na na Cs, As 1–Ad, 1–Ad 0,0 (from K) f, 1 0, 0 na na na Cs, 0 (from K) f, f 0 (from K), As Ck, Ak 0,0 na na na 0,0 (from K) Cd, Ad (NEW rev.1.03) Cs*Ck, As*Ak (NEW rev.1.03) 1–Ck, 1–Ak 0,0 na na na 0,0 (from K) Cd, Ad Ak, Ak 0,0 na na na 0,0 (from K) Cd, Ad Cs*(1–Ck), As*(1–Ak) (NEW rev.1.03) Cs*Ak, As*Ak (NEW rev.1.03) 1–Ak, 1–Ak 0,0 na na na 0,0 (from K) Cd, Ad Cs*(1–Ak), As*(1–Ak) 0, 0 1, 1 na na na 1, 1 Cd, Ad 1, 1 1, 1 na na na 1, 1 Cd, Ad Cs, As Cd, Ad 1, 1 na na na Cs, As Cd, Ad Cd, Ad 1–Cd, 1–Ad 1, 1 na na na Cs, As 1–Cd, 1–Ad Cd, Ad As, As 1, 1 na na na 1, 1 Cd, Ad Cs*As, As*As na na na Cs, As As, As Cd, Ad 1–As, 1–As 1, 1 na na na 1, 1 Cd, Ad Cs*(1–As), As*(1–As) Ad, Ad 1, 1 na na na Cs, As Ad, Ad Cd, Ad (NEW rev.1.03) 0,0 (from K) Cs=Rs,Gs,Bs; Cd=Rd,Gd,Bd; x=don’t care; na=not applicable; f=min(As,1–Ad); *=arithmetic multiplication MPY=multiplier result; ADD=Addend term; K=Constant Source register; DQ=PALU_DQ pins or {PALU_DX, PALU_DQ} pins 34 Rev. 
1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP Table 3.5 Multiplicand/Addend selection for each OpenGL blending factor pairs Blending Fractions Preblend Cycle Normal Cycle sfactor dfactor MULTP1 MULTP2 ADDEND MULTP1 MULTP2 ADDEND 1–Ad, 1–Ad 1, 1 na na na Cs, As 1–Ad, 1–Ad Cd, Ad f, 1 1, 1 na na na Cs, 1 f, Ad Cd, As Ck, Ak 1, 1 na na na 1, 1 Cd, Ad Cs*Ck, As*Ak 1–Ck, 1–Ak 1, 1 na na na 1, 1 Cd, Ad Cs*(1–Ck), As*(1–Ak) Ak, Ak 1, 1 na na na 1, 1 Cd, Ad Cs*Ak, As*Ak 1–Ak, 1–Ak 1, 1 na na na 1, 1 Cd, Ad Cs*(1–Ak), As*(1–Ak) 0, 0 Cs, As na na na Cs, As Cd, Ad (NEW rev.1.03) (NEW rev.1.03) 0,0 (from K) (NEW rev.1.03) 1, 1 Cs, As na na na Cs, As Cd, Ad Cs, As Cd, Ad Cs, As Cs, As Cd, Ad x Cs, As Cd, Ad Loop Back(MPY) 1–Cd, 1–Ad Cs, As Cs, As 1–Cd, 1–Ad x Cs, As Cd, Ad Loop Back(MPY) As, As Cs, As x x Cs*As, As*As Cs, As Cd, Ad Loop Back(ADD) Cs, As As, As x Cs, As Cd, Ad Loop Back(MPY) x Cs*(1–As), As*(1–As) Cs, As Cd, Ad Loop Back(ADD) 1–As, 1–As Cs, As x Ad, Ad Cs, As Cs, As Ad, Ad x Cs, As Cd, Ad Loop Back(MPY) 1–Ad, 1–Ad Cs, As Cs, As 1–Ad, 1–Ad x Cs, As Cd, Ad Loop Back(MPY) f, 1 Cs, As Cs, x f, x x, As Cs, As Cd, Ad Loop Back(MPY), Loop Back(ADD) Ck, Ak Cs, As x x Cs*Ck, As*Ak Cs, As Cd, Ad Loop Back(ADD) 1–Ck, 1–Ak Cs, As x x Cs*(1–Ck), As*(1–Ak) Cs, As Cd, Ad Loop Back(ADD) Ak, Ak Cs, As x x Cs*Ak, As*Ak Cs, As Cd, Ad Loop Back(ADD) 1–Ak, 1–Ak Cs, As x x Cs*(1–Ak), As*(1–Ak) Cs, As Cd, Ad Loop Back(ADD) 0, 0 1–Cs, 1–As na na na 1–Cs, 1–As Cd, Ad 0,0 (from K) 1, 1 1–Cs, 1–As x x Cs, As 1–Cs, 1–As Cd, Ad Loop Back(ADD) (NEW rev.1.03) (NEW rev.1.03) (NEW rev.1.03) (NEW rev.1.03) Cd, Ad 1–Cs, 1–As Cs, As Cd, Ad x 1–Cs, 1–As Cd, Ad Loop Back(MPY) 1–Cd, 1–Ad 1–Cs, 1–As Cs, As 1–Cd, 1–Ad x 1–Cs, 1–As Cd, Ad Loop Back(MPY) As, As 1–Cs, 1–As x x Cs*As, As*As 1–Cs, 1–As Cd, Ad Loop Back(ADD) Cs, As As, As x 1–Cs, 1–As Cd, Ad Loop Back(MPY) x Cs*(1–As), As*(1–As) 1–Cs, 1–As Cd, Ad Loop Back(ADD) 1–As, 1–As 1–Cs, 1–As x Ad, Ad 1–Cs, 1–As Cs, As Ad, Ad x 1–Cs, 
1–As | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | 1–Cs, 1–As | Cs, As | 1–Ad, 1–Ad | x | 1–Cs, 1–As | Cd, Ad | Loop Back(MPY)
f, 1 | 1–Cs, 1–As | Cs, x | f, x | x, As | 1–Cs, 1–As | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | 1–Cs, 1–As | x | x | Cs*Ck, As*Ak | 1–Cs, 1–As | Cd, Ad | Loop Back(ADD)

Table 3.5 Multiplicand/Addend selection for each OpenGL blending factor pairs (continued)

sfactor | dfactor | Preblend MULTP1 | Preblend MULTP2 | Preblend ADDEND | Normal MULTP1 | Normal MULTP2 | Normal ADDEND
1–Ck, 1–Ak | 1–Cs, 1–As | x | x | Cs*(1–Ck), As*(1–Ak) | 1–Cs, 1–As | Cd, Ad | Loop Back(ADD)
Ak, Ak | 1–Cs, 1–As | x | x | Cs*Ak, As*Ak | 1–Cs, 1–As | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | 1–Cs, 1–As | x | x | Cs*(1–Ak), As*(1–Ak) | 1–Cs, 1–As | Cd, Ad | Loop Back(ADD)
0, 0 | As, As | na | na | na | As, As | Cd, Ad | 0, 0 (from K)
1, 1 | As, As | na | na | na | As, As | Cd, Ad | Cs, As
Cd, Ad | As, As | Cs, As | Cd, Ad | x | As, As | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | As, As | Cs, As | 1–Cd, 1–Ad | x | As, As | Cd, Ad | Loop Back(MPY)
As, As | As, As | na | na | na | As, x | Cd, x | Cs*As, x
As, As | As, As | Cs, As | As, As | x | As, As | Cd, Ad | Loop Back(MPY)
1–As, 1–As | As, As | na | na | na | As, x | Cd, x | Cs*(1–As), x
1–As, 1–As | As, As | x | x | Cs*(1–As), As*(1–As) | As, As | Cd, Ad | Loop Back(ADD)
Ad, Ad | As, As | Cs, As | Ad, Ad | x | As, As | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | As, As | Cs, As | 1–Ad, 1–Ad | x | As, As | Cd, Ad | Loop Back(MPY)
f, 1 | As, As | Cs, x | f, x | x, As | As, As | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | As, As | x | x | Cs*Ck, As*Ak | As, As | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | As, As | x | x | Cs*(1–Ck), As*(1–Ak) | As, As | Cd, Ad | Loop Back(ADD)
Ak, Ak | As, As | x | x | Cs*Ak, As*Ak | As, As | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | As, As | x | x | Cs*(1–Ak), As*(1–Ak) | As, As | Cd, Ad | Loop Back(ADD)
0, 0 | 1–As, 1–As | na | na | na | 1–As, 1–As | Cd, Ad | 0, 0 (from K)
1, 1 | 1–As, 1–As | na | na | na | 1–As, x | Cd, x | Cs, x
1, 1 | 1–As, 1–As | x | x | Cs, As | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
Cd, Ad | 1–As, 1–As | Cs, As | Cd, Ad | x | 1–As, 1–As | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | 1–As, 1–As | Cs, As | 1–Cd, 1–Ad | x | 1–As, 1–As | Cd, Ad | Loop Back(MPY)
As, As | 1–As, 1–As | na | na | na | 1–As, x | Cd, x | Cs*As, x
As, As | 1–As, 1–As | Cs, As | As, As | x | 1–As, x | Cd, Ad | Loop Back(MPY)
1–As, 1–As | 1–As, 1–As | na | na | na | 1–As, x | Cd, x | Cs*(1–As), x
1–As, 1–As | 1–As, 1–As | x | x | Cs*(1–As), As*(1–As) | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
Ad, Ad | 1–As, 1–As | Cs, As | Ad, Ad | x | 1–As, 1–As | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | 1–As, 1–As | Cs, As | 1–Ad, 1–Ad | x | 1–As, 1–As | Cd, Ad | Loop Back(MPY)
f, 1 | 1–As, 1–As | Cs, x | f, x | x, As | 1–As, 1–As | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | 1–As, 1–As | x | x | Cs*Ck, As*Ak | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | 1–As, 1–As | x | x | Cs*(1–Ck), As*(1–Ak) | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
Ak, Ak | 1–As, 1–As | x | x | Cs*Ak, As*Ak | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | 1–As, 1–As | x | x | Cs*(1–Ak), As*(1–Ak) | 1–As, 1–As | Cd, Ad | Loop Back(ADD)
0, 0 | Ad, Ad | na | na | na | Cd, Ad | Ad, Ad | 0, 0 (from K)
1, 1 | Ad, Ad | na | na | na | Cd, Ad | Ad, Ad | Cs, As
Cd, Ad | Ad, Ad | Cs, As | Cd, Ad | x | Cd, Ad | Ad, Ad | Loop Back(MPY)
1–Cd, 1–Ad | Ad, Ad | Cs, As | 1–Cd, 1–Ad | x | Cd, Ad | Ad, Ad | Loop Back(MPY)
As, As | Ad, Ad | na | na | na | Cd, Ad | Ad, Ad | Cs*As, As*As
1–As, 1–As | Ad, Ad | na | na | na | Cd, Ad | Ad, Ad | Cs*(1–As), As*(1–As)
Ad, Ad | Ad, Ad | Cs, As | Ad, Ad | x | Cd, Ad | Ad, Ad | Loop Back(MPY)
1–Ad, 1–Ad | Ad, Ad | Cs, As | 1–Ad, 1–Ad | x | Cd, Ad | Ad, Ad | Loop Back(MPY)
f, 1 | Ad, Ad | Cs, x | f, x | x, As | Cd, Ad | Ad, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | Ad, Ad | x | x | Cs*Ck, As*Ak | Cd, Ad | Ad, Ad | Loop Back(ADD)
1–Ck, 1–Ak | Ad, Ad | x | x | Cs*(1–Ck), As*(1–Ak) | Cd, Ad | Ad, Ad | Loop Back(ADD)
Ak, Ak | Ad, Ad | x | x | Cs*Ak, As*Ak | Cd, Ad | Ad, Ad | Loop Back(ADD)
1–Ak, 1–Ak | Ad, Ad | x | x | Cs*(1–Ak), As*(1–Ak) | Cd, Ad | Ad, Ad | Loop Back(ADD)
0, 0 | 1–Ad, 1–Ad | na | na | na | Cd, Ad | 1–Ad, 1–Ad | 0, 0 (from K)
1, 1 | 1–Ad, 1–Ad | na | na | na | Cd, Ad | 1–Ad, 1–Ad | Cs, As
Cd, Ad | 1–Ad, 1–Ad | Cs, As | Cd, Ad | x | Cd, Ad | 1–Ad, 1–Ad | Loop Back(MPY)
1–Cd, 1–Ad | 1–Ad, 1–Ad | Cs, As | 1–Cd, 1–Ad | x | Cd, Ad | 1–Ad, 1–Ad | Loop Back(MPY)
As, As | 1–Ad, 1–Ad | na | na | na | Cd, Ad | 1–Ad, 1–Ad | Cs*As, As*As
1–As, 1–As | 1–Ad, 1–Ad | na | na | na | Cd, Ad | 1–Ad, 1–Ad | Cs*(1–As), As*(1–As)
Ad, Ad | 1–Ad, 1–Ad | Cs, As | Ad, Ad | x | Cd, Ad | 1–Ad, 1–Ad | Loop Back(MPY)
1–Ad, 1–Ad | 1–Ad, 1–Ad | Cs, As | 1–Ad, 1–Ad | x | Cd, Ad | 1–Ad, 1–Ad | Loop Back(MPY)
f, 1 | 1–Ad, 1–Ad | Cs, x | f, x | x, As | Cd, Ad | 1–Ad, 1–Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | 1–Ad, 1–Ad | x | x | Cs*Ck, As*Ak | Cd, Ad | 1–Ad, 1–Ad | Loop Back(ADD)
1–Ck, 1–Ak | 1–Ad, 1–Ad | x | x | Cs*(1–Ck), As*(1–Ak) | Cd, Ad | 1–Ad, 1–Ad | Loop Back(ADD)
Ak, Ak | 1–Ad, 1–Ad | x | x | Cs*Ak, As*Ak | Cd, Ad | 1–Ad, 1–Ad | Loop Back(ADD)
1–Ak, 1–Ak | 1–Ad, 1–Ad | x | x | Cs*(1–Ak), As*(1–Ak) | Cd, Ad | 1–Ad, 1–Ad | Loop Back(ADD)
0, 0 | Ck, Ak | na | na | na | Ck, Ak | Cd, Ad | 0, 0 (from K)
1, 1 | Ck, Ak | x | x | Cs, As | Ck, Ak | Cd, Ad | Loop Back(ADD)
Cd, Ad | Ck, Ak | Cs, As | Cd, Ad | x | Ck, Ak | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | Ck, Ak | Cs, As | 1–Cd, 1–Ad | x | Ck, Ak | Cd, Ad | Loop Back(MPY)
As, As | Ck, Ak | x | x | Cs*As, As*As | Ck, Ak | Cd, Ad | Loop Back(ADD)
1–As, 1–As | Ck, Ak | x | x | Cs*(1–As), As*(1–As) | Ck, Ak | Cd, Ad | Loop Back(ADD)
Ad, Ad | Ck, Ak | Cs, As | Ad, Ad | x | Ck, Ak | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | Ck, Ak | Cs, As | 1–Ad, 1–Ad | x | Ck, Ak | Cd, Ad | Loop Back(MPY)
f, 1 | Ck, Ak | Cs, x | f, x | x, As | Ck, Ak | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | Ck, Ak | x | x | Cs*Ck, As*Ak | Ck, Ak | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | Ck, Ak | x | x | Cs*(1–Ck), As*(1–Ak) | Ck, Ak | Cd, Ad | Loop Back(ADD)
Ak, Ak | Ck, Ak | x | x | Cs*Ak, As*Ak | Ck, Ak | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | Ck, Ak | x | x | Cs*(1–Ak), As*(1–Ak) | Ck, Ak | Cd, Ad | Loop Back(ADD)
0, 0 | 1–Ck, 1–Ak | na | na | na | 1–Ck, 1–Ak | Cd, Ad | 0, 0 (from K)
1, 1 | 1–Ck, 1–Ak | x | x | Cs, As | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
Cd, Ad | 1–Ck, 1–Ak | Cs, As | Cd, Ad | x | 1–Ck, 1–Ak | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | 1–Ck, 1–Ak | Cs, As | 1–Cd, 1–Ad | x | 1–Ck, 1–Ak | Cd, Ad | Loop Back(MPY)
As, As | 1–Ck, 1–Ak | x | x | Cs*As, As*As | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
1–As, 1–As | 1–Ck, 1–Ak | x | x | Cs*(1–As), As*(1–As) | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
Ad, Ad | 1–Ck, 1–Ak | Cs, As | Ad, Ad | x | 1–Ck, 1–Ak | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | 1–Ck, 1–Ak | Cs, As | 1–Ad, 1–Ad | x | 1–Ck, 1–Ak | Cd, Ad | Loop Back(MPY)
f, 1 | 1–Ck, 1–Ak | Cs, x | f, x | x, As | 1–Ck, 1–Ak | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | 1–Ck, 1–Ak | x | x | Cs*Ck, As*Ak | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | 1–Ck, 1–Ak | x | x | Cs*(1–Ck), As*(1–Ak) | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
Ak, Ak | 1–Ck, 1–Ak | x | x | Cs*Ak, As*Ak | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | 1–Ck, 1–Ak | x | x | Cs*(1–Ak), As*(1–Ak) | 1–Ck, 1–Ak | Cd, Ad | Loop Back(ADD)
0, 0 | Ak, Ak | na | na | na | Ak, Ak | Cd, Ad | 0, 0 (from K)
1, 1 | Ak, Ak | x | x | Cs, As | Ak, Ak | Cd, Ad | Loop Back(ADD)
Cd, Ad | Ak, Ak | Cs, As | Cd, Ad | x | Ak, Ak | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | Ak, Ak | Cs, As | 1–Cd, 1–Ad | x | Ak, Ak | Cd, Ad | Loop Back(MPY)
As, As | Ak, Ak | x | x | Cs*As, As*As | Ak, Ak | Cd, Ad | Loop Back(ADD)
1–As, 1–As | Ak, Ak | x | x | Cs*(1–As), As*(1–As) | Ak, Ak | Cd, Ad | Loop Back(ADD)
Ad, Ad | Ak, Ak | Cs, As | Ad, Ad | x | Ak, Ak | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | Ak, Ak | Cs, As | 1–Ad, 1–Ad | x | Ak, Ak | Cd, Ad | Loop Back(MPY)
f, 1 | Ak, Ak | Cs, 1 | f, As | x | Ak, Ak | Cd, Ad | Loop Back(MPY)
Ck, Ak | Ak, Ak | x | x | Cs*Ck, As*Ak | Ak, Ak | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | Ak, Ak | x | x | Cs*(1–Ck), As*(1–Ak) | Ak, Ak | Cd, Ad | Loop Back(ADD)
Ak, Ak | Ak, Ak | x | x | Cs*Ak, As*Ak | Ak, Ak | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | Ak, Ak | x | x | Cs*(1–Ak), As*(1–Ak) | Ak, Ak | Cd, Ad | Loop Back(ADD)
0, 0 | 1–Ak, 1–Ak | na | na | na | 1–Ak, 1–Ak | Cd, Ad | 0, 0 (from K)
1, 1 | 1–Ak, 1–Ak | x | x | Cs, As | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
Cd, Ad | 1–Ak, 1–Ak | Cs, As | Cd, Ad | x | 1–Ak, 1–Ak | Cd, Ad | Loop Back(MPY)
1–Cd, 1–Ad | 1–Ak, 1–Ak | Cs, As | 1–Cd, 1–Ad | x | 1–Ak, 1–Ak | Cd, Ad | Loop Back(MPY)
As, As | 1–Ak, 1–Ak | x | x | Cs*As, As*As | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
1–As, 1–As | 1–Ak, 1–Ak | x | x | Cs*(1–As), As*(1–As) | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
Ad, Ad | 1–Ak, 1–Ak | Cs, As | Ad, Ad | x | 1–Ak, 1–Ak | Cd, Ad | Loop Back(MPY)
1–Ad, 1–Ad | 1–Ak, 1–Ak | Cs, As | 1–Ad, 1–Ad | x | 1–Ak, 1–Ak | Cd, Ad | Loop Back(MPY)
f, 1 | 1–Ak, 1–Ak | Cs, x | f, x | x, As | 1–Ak, 1–Ak | Cd, Ad | Loop Back(MPY), Loop Back(ADD)
Ck, Ak | 1–Ak, 1–Ak | x | x | Cs*Ck, As*Ak | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
1–Ck, 1–Ak | 1–Ak, 1–Ak | x | x | Cs*(1–Ck), As*(1–Ak) | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
Ak, Ak | 1–Ak, 1–Ak | x | x | Cs*Ak, As*Ak | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)
1–Ak, 1–Ak | 1–Ak, 1–Ak | x | x | Cs*(1–Ak), As*(1–Ak) | 1–Ak, 1–Ak | Cd, Ad | Loop Back(ADD)

Cs=Rs,Gs,Bs; Cd=Rd,Gd,Bd; x=don’t care; na=not applicable; f=min(As,1–Ad); *=arithmetic multiplication; MPY=multiplier result; ADD=Addend term; K=Constant Source register; DQ=PALU_DQ pins or {PALU_DX, PALU_DQ} pins

Stencil Modes

The stencil buffer in a 3D graphics system may be used to restrict drawing to a certain portion of the screen, just as a cardboard stencil may be used with a can of spray paint to make precise, painted images. 3D-RAM offers a broad range of support for on-chip stencil hardware acceleration.

There are two distinct stencil modes supported inside the 3D-RAM. The OpenGL stencil mode is fully compliant with the OpenGL specification, which allows for any number of stencil planes from 0 through 8. The 3D-RAM also offers the decal stencil mode, which is compatible with the previous generation of the 3D-RAM, M5M410092A.

These two stencil modes should not be used at the same time. If the OpenGL stencil mode is being used, the decal stencil mode should be disabled. Similarly, the OpenGL stencil mode should be disabled when using the decal stencil mode. However, the 3D-RAM chip itself does not check for or inhibit such a conflict, and it is the controller’s responsibility to ensure that it does not occur.

OpenGL Stencil Mode Operations

This stencil mode provides fully compliant OpenGL stencil operations. The 3D-RAM stencil features are implemented in ROP/Blend unit 3 so that bits [31:24] of the 32-bit ALU unit are available as stencil planes. These 8 bits are bitwise enabled, so that any number of stencil planes from 0 through 8 may be used.

There are two parts in the data flow of this stencil mode, as illustrated in Figure 3.8, and the paragraphs that follow refer to the blocks in this figure. The first part involves a stencil function which compares the OLD stencil data to the reference data StF.REF, which may be either the most significant byte of the data at the PALU_DQ pins or the most significant byte of the Constant Source register. The second part involves the execution of a certain stencil operation on the OLD stencil data or the StF.REF data, based on the results of the stencil and magnitude comparison functions.

Figure 3.8 Operations of OpenGL stencil mode
(Block diagram: StF.REF, selected by StC[19] from PALU_DQ[31:24] or K[31:24], and OLD[31:24] feed the Stencil Function block under control of StF.Mask[7:0] and StF.FUNC[2:0]; the Stencil Operation block is controlled by StOP.FAIL[2:0], StOP.ZFAIL[2:0], StOP.ZPASS[2:0], and St.Enable[7:0] and also receives the ROP Unit 3 result; the SRAM Write Enable Logic combines PASS_IN[1:0], Match Compare, Magnitude Compare, and the Stencil Function result to produce the Pixel Buffer write enables for byte 3 and for bytes 0, 1, 2, and PASS_OUT.)

The Stencil Function block compares the magnitude of the stencil reference data, StF.REF, to the OLD stencil data that is read from the Pixel Buffer. A mask for the stencil data is available which provides the capability to ignore certain bits in the stencil data comparison. The exact type of comparison executed in the Stencil Function block is defined by StF.FUNC[2:0]. The options for this comparison are: GL_ALWAYS, GL_GREATER, GL_EQUAL, GL_GEQUAL, GL_NEVER, GL_LEQUAL, GL_NOTEQUAL, and GL_LESS, as defined in Table 3.29.

Separately, the SRAM Write Enable Logic block considers the PASS_IN pins, the Match Compare result, the Magnitude Compare result, and the Stencil Function result. If either of the PASS_IN[1:0] pins is “0” or the Match Compare result is “0”, the Pixel Buffer will not be updated and the PASS_OUT pin will be “0”. If both PASS_IN[1:0] pins are “1” and the Match Compare passes, then depending on the results of the Stencil Function and the Magnitude Compare, the SRAM Write Enable Logic block determines whether to write to the Pixel Buffer and what the state of the PASS_OUT pin should be.

The Stencil Operation block considers the results from the Stencil Function and Magnitude Compare and alters the stencil data stored in the Pixel Buffer based on the settings of the Stencil Control register. The Stencil Operation block can be set to ZERO, KEEP, INVERT, REPLACE, INCREMENT, or DECREMENT the stencil planes. The actions taken by the Stencil Operation block for three different cases are defined in Table 3.26 through Table 3.28.

It is helpful to point out that for the stencil support to be useful in the overall graphics processing, the overriding conditions above and beyond the results of the stencil comparison functions and depth test are the states of the two PASS_IN pins and the Match Compare result. For example, it may be desired to restrict stencil functions and stencil operations to only a certain window on the display monitor based on the result of the comparison of window ID bits, which may or may not be stored on the same 3D-RAM chips as the Z buffer and the stencil planes.

The pseudo code below summarizes the above explanations in a concise format. Note that the expression “St.Test(StF)” refers to an OpenGL stencil test based on the stencil function selected in the glStencil_Func. The glStencil_Func should set and reset the bits in the 3D-RAM Stencil Planes register and Stencil Control register. Note also that the pseudo code assumes that the stencil operations StOP.FAIL, StOP.ZPASS, and StOP.ZFAIL take St.Enable[7:0] into account for proper execution of the GL_DECR and GL_INCR operations, with the correct bit alignment and underflow and overflow clamping. The details of the calculations are not explicitly shown. See also page 66 for the paragraph describing the bit field St.Enable.

    IF (StC[19]==0)
        StF.REF=PALU_DQ[31:24]
    ELSE
        StF.REF=K[31:24]

    IF( ((!PINS[0]||PASS_IN[0]) && (!PINS[8]||PASS_IN[1]) && MATCH_COMP)==0 )
    {
        Pixel_Buffer_Write_Enable_Byte[3:0]=0000b
        PASS_OUT=0
    }
    ELSE /* match test passes and both PASS_IN pins TRUE */
    {
        IF ( St.Test(StF)==0 )
        {
            Pixel_Buffer_Write_Enable_Byte[3:0]=1000b
            For bits disabled by St.Enable[7:0]
                Pixel_Buffer_Data_In[31:24]=OLD[31:24]
            For bits enabled by St.Enable[7:0]
                Pixel_Buffer_Data_In[31:24]=StOP.FAIL(StF.REF, OLD[31:24])
            PASS_OUT=0
        }
        ELSE /* stencil test passes */
        {
            IF ( MAGNITUDE_COMP==1 )
            {
                Pixel_Buffer_Write_Enable_Byte[3:0]=1111b
                For bits disabled by St.Enable[7:0]
                    Pixel_Buffer_Data_In[31:24]=ROP(OLD[31:24])
                For bits enabled by St.Enable[7:0]
                    Pixel_Buffer_Data_In[31:24]=StOP.ZPASS(StF.REF, OLD[31:24])
                PASS_OUT=1
            }
            ELSE /* depth test fails */
            {
                Pixel_Buffer_Write_Enable_Byte[3:0]=1000b
                For bits disabled by St.Enable[7:0]
                    Pixel_Buffer_Data_In[31:24]=OLD[31:24]
                For bits enabled by St.Enable[7:0]
                    Pixel_Buffer_Data_In[31:24]=StOP.ZFAIL(StF.REF, OLD[31:24])
                PASS_OUT=0
            } /* depth test */
        } /* stencil test */
    } /* PASS_IN/MATCH test */

List 3.1 Pseudo code for OpenGL stencil operations

Restrictions on OpenGL Stencil Mode

There are several restrictions that must be met for the OpenGL stencil mode to function correctly. The restrictions are listed below.

• The Z planes must be stored in the same 3D-RAM chip as the stencil planes.
• The Magnitude Compare unit must be used for the depth test.
• If any stencil bits are enabled, the ROP/Blend unit 3 cannot be used for blending, i.e. RBC[27] must be “0”.
• The source for the stencil reference value (StF.REF) is selected by the Stencil Control register bit 19 (StC[19]). When StC[19] is 0, PALU_DQ[31:24] is StF.REF, and when StC[19] is 1, bits 31 through 24 of the Constant Source register (K[31:24]) are StF.REF. StF.REF is used in both the stencil test and the stencil operations at the same time.
• In order for INCREMENT and DECREMENT operations to perform correctly, it is necessary that the enabled stencil bits be in one contiguous group.

Decal Stencil Mode Operations

This decal stencil mode is offered to provide compatibility with the stencil mode of the previous generation of the 3D-RAM, M5M410092A. By setting the Match Mask register and the Magnitude Mask register, the user has flexible plane depths for the stencil buffer and the Z buffer, respectively. The Match Compare unit, as in the normal, non-stencil mode, supports NEVER, ALWAYS, EQUAL, and NOTEQUAL stencil test functions, while the byte-wide ROP units support only the logical stencil update operations, namely KEEP, ZERO, REPLACE, and INVERT, plus the additional ONE operation; the arithmetic stencil update operations INCREMENT and DECREMENT are not implemented in the Decal/Invert Stencil Mode.

The decal stencil mode may be selected by setting the Compare Control register bit 10 to “1”. In this mode, the internal Pixel Buffer write enable for a stateful write is no longer solely controlled by PASS_IN[1:0] and PASS_OUT. The added condition in the write enable generation is the output of the Match Compare unit, which can now override PASS_OUT: the inverted Match Compare output is logically ORed with PASS_OUT, and the result is ANDed with PASS_IN[1:0] to form the Pixel Buffer write enable signal, as summarized in Table 3.6.

When the stencil buffer and the Z buffer are placed on the same 3D-RAM chip, the Match Compare unit performs the stencil match test and the Magnitude Compare unit performs the Z (depth) compare test. When both tests pass, the PASS_OUT can enable the update of the color buffer in other 3D-RAM chips through their PASS_IN pins. In the meantime, the stencil and Z buffers are also updated internally. If the stencil match test fails, then only the stencil and Z buffers may be updated internally and the color buffer on the other 3D-RAM chips is left unchanged.

Table 3.6 Stateful write enable in decal stencil mode

Stencil Mode? | Stateful Write Enable | PASS_OUT
No | PASS_IN[1:0] && PASS_OUT | MAGPASS && MATCHPASS
Yes | PASS_IN[1:0] && (PASS_OUT + ~MATCHPASS) | MAGPASS && MATCHPASS

Invert Stencil Mode Operations

Warning! The invert stencil mode is removed from this device, M5M410092B, and an incompatibility exists between this device and its previous generation, M5M410092A, with respect to this function.

16-Bit Color Mode

There are two popular 16-bit color modes: the (4, 4, 4, 4) mode with 4 bits each for Alpha, Red, Green, and Blue; and the (5, 6, 5, 0) mode with 5-bit Red, 6-bit Green, and 5-bit Blue. The 16-bit on-chip blending functions are only for (4, 4, 4, 4) mode. If not invoking any ALU operation, the 3D-RAM can be used to store (5, 6, 5, 0) 16-bit data.

In the normal (8, 8, 8, 8) color mode, Alpha must be placed in the most significant byte PALU_DQ[31:24] due to the special circuits that handle the case of Alpha-Saturate in Blend unit 3. Red, green, and blue color data may be placed on the other three bytes PALU_DQ[23:0] in any permutation. The four ROP/Blend units operate on the four color components and write the results back to the Pixel Buffer.

In the (4, 4, 4, 4) 16-bit mode, the 32-bit bus within the 3D-RAM is used for double buffering the 16-bit (4, 4, 4, 4) data. For example, Buffer A data is placed on the upper nibble of each byte: PALU_DQ[31:28] for Alpha, and PALU_DQ[23:20], PALU_DQ[15:12], and PALU_DQ[7:4] for red, green, and blue; and Buffer B data is placed on the lower nibble of each byte: PALU_DQ[27:24] for Alpha, and PALU_DQ[19:16], PALU_DQ[11:8], and PALU_DQ[3:0] for red, green, and blue. Note again that the Alpha data of both Buffers A and B can only be stored in the most significant byte.

Byte Enable and Nibble Control

Let NBLn represent the nibble data on the PALU_DQ[4n+3:4n] pins, for n = 0 through 7. In the (8, 8, 8, 8) mode, PALU_BE[3] enables NBL7 and NBL6; PALU_BE[2] enables NBL5 and NBL4; PALU_BE[1] enables NBL3 and NBL2; PALU_BE[0] enables NBL1 and NBL0. An example of this arrangement is illustrated in Figure 3.9 below.
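The nibble naming NBLn = PALU_DQ[4n+3:4n] and the (8, 8, 8, 8) byte-enable pairing described under “Byte Enable and Nibble Control” can be sketched in a few lines of Python. This is only an illustration: the helper names `nibble` and `write_mask_8888` are hypothetical and not part of the device documentation, and the sketch assumes that each PALU_BE bit simply gates its two nibbles.

```python
def nibble(dq: int, n: int) -> int:
    """Return NBLn, i.e. bits [4n+3:4n] of a 32-bit PALU_DQ word."""
    if not 0 <= n <= 7:
        raise ValueError("n must be in 0..7")
    return (dq >> (4 * n)) & 0xF


def write_mask_8888(palu_be: int) -> int:
    """Expand PALU_BE[3:0] into a 32-bit mask for (8, 8, 8, 8) mode.

    Assumption: PALU_BE[k] enables NBL(2k+1) and NBL(2k), i.e. byte k.
    """
    mask = 0
    for k in range(4):
        if (palu_be >> k) & 1:
            mask |= 0xFF << (8 * k)
    return mask


if __name__ == "__main__":
    dq = 0x89ABCDEF
    # NBL7 down to NBL0 of the example word
    print([hex(nibble(dq, n)) for n in range(7, -1, -1)])
    # PALU_BE = 1010b enables bytes 3 and 1
    print(hex(write_mask_8888(0b1010)))
```

With PALU_BE = 1010b the mask covers NBL7, NBL6, NBL3, and NBL2, which matches the pairing listed in the text.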
Figure 3.9 Data mapping in the (8, 8, 8, 8) color mode
(For both “Write” and “Read”, PALU_BE[3], PALU_BE[2], PALU_BE[1], and PALU_BE[0] control the α, R, G, and B bytes, respectively, on the path between the DQ pins, the Pixel ALU, and the Pixel Buffer.)

In the (4, 4, 4, 4) mode, PALU_BE[3] enables NBL7 and NBL5; PALU_BE[2] enables NBL3 and NBL1; PALU_BE[1] enables NBL6 and NBL4; PALU_BE[0] enables NBL2 and NBL0. From the rendering controller output, PALU_BE[3,2] controls (A, R, G, B) in Buffer A, while PALU_BE[1,0] controls (A, R, G, B) in Buffer B. This byte enable assignment allows for support of (4, 4, 4, 4) and (5, 6, 5, 0) data formats with identical PALU_BE controls from the controller. Figure 3.10 illustrates how data is mapped in the (4, 4, 4, 4) mode.

Figure 3.10 Data Mapping in the (4, 4, 4, 4) Color Mode
(For both “Write” and “Read”: αA and RA are controlled by PALU_BE[3], αB and RB by PALU_BE[1], GA and BA by PALU_BE[2], and GB and BB by PALU_BE[0].)

The solid black arrows correspond to the data flow of Buffer A, and the gray arrows correspond to the data flow of Buffer B. During write operations, the 4-bit data may be duplicated or padded with zeros in the lower nibble when processed by the byte-wide Pixel ALU; after the Pixel ALU processing, the lower nibble of the resulting 8-bit data is truncated to form 4-bit data before being written into the Pixel Buffer.

To store 16-bit color in (5, 6, 5, 0) data format, the 3D-RAM is programmed to operate in (8, 8, 8, 8) mode. The (R, G, B) data for Buffer A is stored in NBL7 to NBL4, which are controlled by PALU_BE[3:2]. The (R, G, B) data for Buffer B is stored in NBL3 to NBL0, which are controlled by PALU_BE[1:0]. The data mapping for the (5, 6, 5, 0) color mode is shown in Figure 3.11 below.

In summary, when the controller is in 16-bit color mode, either (4, 4, 4, 4) or (5, 6, 5, 0), asserting PALU_BE[3:2] controls Buffer A read and write operations; asserting PALU_BE[1:0] controls Buffer B read and write operations. Because the same blending circuits for (4, 4, 4, 4) mode are used for both Color Buffers, Buffer A and Buffer B Stateful Writes cannot occur on the same clock cycle.

Figure 3.11 Data mapping in the (5, 6, 5, 0) color mode
(The Pixel ALU passes the data through. In the order drawn from bit 31 to bit 0, the fields are BA (5 bits), GA (6 bits), RA (5 bits), BB (5 bits), GB (6 bits), and RB (5 bits), with PALU_BE[3:0] as the write and read controls.)

Table 3.7 Byte Enable controls and color data placement in (4, 4, 4, 4) mode

PALU_DQ | [31:28] | [27:24] | [23:20] | [19:16] | [15:12] | [11:8] | [7:4] | [3:0]
NBL | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
PALU_BE | 3 | 1 | 3 | 1 | 2 | 0 | 2 | 0
Color Data | αA | αB | RA | RB | GA | GB | BA | BB

Table 3.8 Byte Enable controls and color data placement in (5, 6, 5, 0) mode

PALU_DQ | [31:28] | [27:24] | [23:20] | [19:16] | [15:12] | [11:8] | [7:4] | [3:0]
NBL | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
PALU_BE | 3 | 3 | 2 | 2 | 1 | 1 | 0 | 0
Color Data | RA, GA, BA (NBL7 to NBL4) | RB, GB, BB (NBL3 to NBL0)

The following is how PALU_BE[3:0] can be used in (4, 4, 4, 4) mode:

■ In Read Data, Stateless Initial/Normal Write operations
  • PALU_BE[3:0] may be any combination (decoded as in Table 3.7)
■ In Stateful Initial/Normal Write, Initiate Two-Cycle Blending operations
  • PALU_BE[3:0] controls either Buffer A (as 1100b) data or Buffer B (as 0011b) data, but not both (i.e. not 1111b)
■ Write Control Register(s)
  • Same decoding as in (8, 8, 8, 8) mode
■ Read ID Register
  • Same decoding as in (8, 8, 8, 8) mode
■ PALU_BE mapping to the Dirty Tag in Stateful/Stateless Write operations
  • Either PALU_BE[0] or PALU_BE[2] sets the Dirty_Tag bits corresponding to both Byte 0 and Byte 1 of the addressed data word
  • Either PALU_BE[1] or PALU_BE[3] sets the Dirty_Tag bits corresponding to both Byte 2 and Byte 3 of the addressed data word
■ (4, 4, 4, 4) mode has no effect on the use of the Dirty Tag (Masked/Unmasked Write Block from the Pixel Buffer to a DRAM bank); it only affects the writing of the Dirty Tag during Stateful/Stateless Initial/Normal Write operations.

Warning! During a Stateful Write or Initiate Two-Cycle Blending operation, PALU_BE[3:0] should be set to enable either Buffer A or Buffer B, but not both. Enabling both buffers would result in Buffer A and Buffer B data using the same blending circuits at the same time, and therefore in undefined resulting data. In other words, in these operations PALU_BE[3] and PALU_BE[1] cannot be enabled at the same time. Similarly, PALU_BE[2] and PALU_BE[0] cannot be enabled at the same time.

Use of Dirty Tags in (4, 4, 4, 4) 16-bit Mode

The mapping for the “Replace Dirty Tag” and “OR Dirty Tag” operations in (4, 4, 4, 4) mode is the same as in (8, 8, 8, 8) mode, except that the PALU_BE control decoding is for the (4, 4, 4, 4) mode. The Dirty Tag functions in (4, 4, 4, 4) mode are summarized in Table 3.9. The mapping of the PALU_BE pins to the Dirty Tags is illustrated in Figure 3.12.
Table 3.9 Pixel ALU operations involving Dirty Tags in (4, 4, 4, 4) mode

Pixel Operation | Pixel Data | New Dirty Tag Contents
(Stateful/Stateless) Normal Data Write | Write to Buffer A from PALU_DQ pins (per PALU_BE[3:2] pins) | The Dirty Tag bits for bytes 3 and 2 are ORed with PALU_BE[3]; the Dirty Tag bits for bytes 1 and 0 are ORed with PALU_BE[2]; the other 28 Dirty Tag bits are unchanged.
(Stateful/Stateless) Normal Data Write | Write to Buffer B from PALU_DQ pins (per PALU_BE[1:0] pins) | The Dirty Tag bits for bytes 3 and 2 are ORed with PALU_BE[1]; the Dirty Tag bits for bytes 1 and 0 are ORed with PALU_BE[0]; the other 28 Dirty Tag bits are unchanged.
(Stateful/Stateless) Initial Data Write | Write to Buffer A from PALU_DQ pins (per PALU_BE[3:2] pins) | PALU_BE[3] is written to the Dirty Tag bits for bytes 3 and 2; PALU_BE[2] is written to the Dirty Tag bits for bytes 1 and 0; “0” is written to the 28 unaddressed Dirty Tag bits.
(Stateful/Stateless) Initial Data Write | Write to Buffer B from PALU_DQ pins (per PALU_BE[1:0] pins) | PALU_BE[1] is written to the Dirty Tag bits for bytes 3 and 2; PALU_BE[0] is written to the Dirty Tag bits for bytes 1 and 0; “0” is written to the 28 unaddressed Dirty Tag bits.
Replace Dirty Tag | Unchanged | PALU_DQ[31:0] replaces all 32 Dirty Tag bits.
OR Dirty Tag | Unchanged | All 32 Dirty Tag bits are ORed with PALU_DQ[31:0].

Figure 3.12 PALU_BE Mapping to Dirty Tags for (4,4,4,4) Mode
(Diagram relating NBL0 through NBL7 of a word in a 256-bit data block, under control of PALU_BE[3:0], to the bits of a 32-bit Dirty Tag.)
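The Dirty Tag update rules for (4, 4, 4, 4) mode can be sketched as follows. This is a minimal Python model, not the device's implementation: the function names are hypothetical, and because the exact positions of a word's four Dirty Tag bits inside the 32-bit tag are not stated in the text, the caller passes them in as the `positions` argument (bit index for byte 0, 1, 2, 3 of the addressed word).

```python
def dirty_bits_4444(palu_be: int) -> int:
    """Per-byte dirty mask (bit i = byte i of the addressed word)."""
    hi = ((palu_be >> 3) | (palu_be >> 1)) & 1  # PALU_BE[3] or PALU_BE[1]
    lo = ((palu_be >> 2) | palu_be) & 1         # PALU_BE[2] or PALU_BE[0]
    # hi covers bytes 3 and 2, lo covers bytes 1 and 0
    return (hi << 3) | (hi << 2) | (lo << 1) | lo


def spread(mask4: int, positions) -> int:
    """Place the four per-byte bits at the given Dirty Tag bit positions."""
    tag = 0
    for byte, pos in enumerate(positions):
        if (mask4 >> byte) & 1:
            tag |= 1 << pos
    return tag


def normal_write(tag: int, palu_be: int, positions) -> int:
    """Normal Data Write: OR in the new bits; the other 28 bits are unchanged."""
    return tag | spread(dirty_bits_4444(palu_be), positions)


def initial_write(palu_be: int, positions) -> int:
    """Initial Data Write: addressed bits are written, the other 28 become 0."""
    return spread(dirty_bits_4444(palu_be), positions)


if __name__ == "__main__":
    pos = (0, 8, 16, 24)  # hypothetical tag-bit positions for bytes 0..3
    print(hex(normal_write(0x80000000, 0b1000, pos)))
    print(hex(initial_write(0b0100, pos)))
```

Note that PALU_BE[3] and PALU_BE[1] mark the same pair of bytes dirty, which is why a Buffer A write and a Buffer B write to the same word are indistinguishable in the tag.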
4-bit to 8-bit Expansion for Pixel ALU Blending

The Pixel ALU blending function operates on 8-bit components. To implement the 8-bit blending operation on a 4-bit color component, it is necessary to expand the 4-bit data (either from the PALU_DQ pins or from the Pixel Buffer) to an 8-bit operand. For the ADDEND term, the 4-bit component is mapped to the upper nibble and zeros are padded into the lower nibble. For the multiplicand terms, simply padding zeros in the lower nibble would result in an incorrect pixel value due to computational error from the short bit length representation. By duplicating the 4-bit data in both the upper and lower nibbles, we can avoid corrupting the pixel value. This effect can be illustrated with the following two examples, where the blending factor Alpha is equal to “F”.

Table 3.10 Blending color value and Alpha, with Alpha = “F”, zeros padded at lower nibble

Color Value | Blending Arithmetic (Color x Alpha) | Multiplier Result | Output
E | E0 x F0 | D200 | D
2 | 20 x F0 | 1E00 | 1

Table 3.11 Blending color value and Alpha, with Alpha = “F”, duplication at lower nibble

Color Value | Blending Arithmetic (Color x Alpha) | Multiplier Result | Output
E | EE x FF | ED12 | E
2 | 22 x FF | 21DE | 2

As we can see from Table 3.11, the duplication of the upper nibble into the lower nibble allows the color data to blend with “1” without invoking an extra bit in the circuit. After the blending has occurred, the upper nibble is mapped back to the correct nibble in the Pixel Buffer. This mapping is illustrated in Figure 3.10.

On the other hand, in the (4, 4, 4, 4) mode, the 4-bit ADDEND, whether supplied from the PALU_DQ pins in both the Preblend Cycle and Normal Cycle or looped back from the Multiplier or the ADDEND in the Preblend Cycle, will always be padded with 0000b in the least significant bits to minimize the error due to incorrect round-up.
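The arithmetic behind Tables 3.10 and 3.11 can be checked with a short Python sketch. The helper names are hypothetical, and the sketch assumes that the 4-bit output is simply the upper nibble of the 16-bit product, which is consistent with the table entries but is not stated as the hardware's rounding rule.

```python
def expand_pad(n4: int) -> int:
    """4-bit value in the upper nibble, zeros in the lower nibble."""
    return (n4 & 0xF) << 4


def expand_dup(n4: int) -> int:
    """4-bit value duplicated into both nibbles."""
    n4 &= 0xF
    return (n4 << 4) | n4


def blend_output(color8: int, alpha8: int) -> int:
    """Multiply two 8-bit operands; keep the top nibble of the 16-bit product."""
    return ((color8 * alpha8) >> 12) & 0xF


if __name__ == "__main__":
    alpha = 0xF
    for color in (0xE, 0x2):
        padded = blend_output(expand_pad(color), expand_pad(alpha))
        duplicated = blend_output(expand_dup(color), expand_dup(alpha))
        print(f"color {color:X}: padded -> {padded:X}, duplicated -> {duplicated:X}")
```

With Alpha = “F” (intended to mean a factor of 1), zero padding turns E into D and 2 into 1, while nibble duplication returns the color unchanged.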
Dual Compare Unit

A functional block diagram of the Dual Compare units is shown in Figure 3.13. Both Match Compare and Magnitude Compare are performed in parallel. The Match Mask and Magnitude Mask define which bits of the 32-bit word will be compared and which will be “don’t care.” One of the sources is always the Old Data (“O”) from the Pixel Buffer. The other source is independently selectable between the New Data (“N”) from the PALU_DQ pins and the Register Data (“K”) from the Constant register.

In the normal mode (all stencil planes are disabled, Stencil Function=ALWAYS PASS), the results of the Match Compare and Magnitude Compare operations are logically ANDed to generate the PASS_OUT signal. The external PASS_IN[1:0] and internal PASS_OUT are then logically ANDed together to generate the Write Enable signal to the Pixel Buffer. If OpenGL stencil mode is enabled, the PASS_OUT and Write Enable generation are affected by the stencil function result. Note that the Decal stencil mode logic is not shown in Figure 3.13. See the section on “Stencil Modes” on page 40 for more details on how stencil affects PASS_OUT and Pixel Buffer Write Enable.

Figure 3.13 Block diagram of the Dual Compare unit (Pipeline stages are not shown)
(Block diagram: N[31:0] and K[31:0] are the selectable sources compared against O[31:0] in the Match Compare and Magnitude Compare blocks, qualified by the Match Mask, Magnitude Mask, and Compare Control registers; the compare results, the Stencil Function Result, the Stencil Enable bits, and the gated PASS_IN term (!PINS[0] || PASS_IN[1]) && (!PINS[8] || PASS_IN[0]) produce PASS_OUT and the Write Enables to Pixel Buffer Byte 3 and Bytes 0, 1, 2.)
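For the normal (non-stencil) mode described above, the PASS_OUT and Write Enable generation can be modelled in a few lines of Python. This is a sketch under several assumptions: the function name `dual_compare` is hypothetical, both PASS_IN pins are assumed to be enabled, and the compare functions and operand order (`source OP old`) are placeholders, since the actual selection is made through the Compare Control register, which this section does not describe.

```python
import operator


def dual_compare(old, src, match_mask, mag_mask, pass_in=(1, 1),
                 match_op=operator.eq, mag_op=operator.lt):
    """Return (PASS_OUT, write_enable) for normal mode.

    old        -- Old Data ("O") read from the Pixel Buffer
    src        -- New Data ("N") or Register Data ("K"), chosen by the caller
    match_mask -- bits taking part in the Match Compare
    mag_mask   -- bits taking part in the Magnitude Compare
    """
    match_pass = match_op(src & match_mask, old & match_mask)
    mag_pass = mag_op(src & mag_mask, old & mag_mask)
    pass_out = bool(match_pass and mag_pass)
    write_enable = bool(pass_in[0] and pass_in[1] and pass_out)
    return pass_out, write_enable


if __name__ == "__main__":
    # Hypothetical layout: window ID in bits [31:24], Z in bits [23:0]
    old = 0x05000100
    new = 0x05000080
    print(dual_compare(old, new, 0xFF000000, 0x00FFFFFF))
    print(dual_compare(old, new, 0xFF000000, 0x00FFFFFF, pass_in=(1, 0)))
```

In the second call the compare still passes, so PASS_OUT is true, but a deasserted PASS_IN pin blocks the Pixel Buffer write.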
Pipelining

The 3D-RAM Pixel ALU pipeline is designed so that the rendering controller can issue read and write operations with minimal cycle time. This is achieved by having all operations conform to a uniform 7-stage pipeline.

Table 3.12 Pixel ALU and Pixel Buffer operation pipeline

Stage  External Activities                            Internal Activities
1      Operation specified on PALU_EN, PALU_WE,       -
       PALU_OP, PALU_A, and PALU_BE pins
2      Write data on PALU_DQ and PALU_DX pins,        Read Pixel Buffer. Decode operation.
       if write operation
3      Read data on PALU_DQ pins, if read operation   First stage of ROP/Blend and Compare Units
4      -                                              Second stage of ROP/Blend and Compare Units
5      -                                              Third stage of ROP/Blend and Compare Units
6      Compare result transferred from PASS_OUT pin   Fourth stage of ROP/Blend Units; Write Enable
       to PASS_IN pin                                 generated, if write operation
7      -                                              Write result to Pixel Buffer and Dirty Tags,
                                                      if allowed
8      HIT pin changes, if the write operation is a   -
       Stateful Data Write and it is successful

Read operations may be issued consecutively, one read operation per two cycles. Specifically, all read operations require that the same address be stable for at least two rising edges of MCLK plus the set-up time. (See Figure 8.4 for details.) Due to the pin output delay in the worst case, the read data may be available for sampling by the rendering controller in the second cycle after the read instruction. In other words, to the rendering controller, n consecutive read operations take 2(n+1) cycles plus access time.

A sequence of Pixel ALU write operations may be issued consecutively, one write operation per cycle. The write opcode and address are sampled by the rising edge of MCLK in one cycle, and the write data is loaded by the rising edge of the next MCLK. To the rendering controller, n consecutive writes take only n+1 cycles. Register writes do not affect operations issued in previous cycles; register writes always affect operations issued in subsequent cycles.

A read operation may be issued immediately after a write operation without any delay cycles. An idle cycle will be automatically generated by the 3D-RAM chip on the PALU_DQ pins between a write operation and the subsequent read operation. At least two NOP cycles must be inserted between a read operation and a subsequent write operation. The two NOPs are needed to guarantee one idle cycle on the PALU_DQ pins between the read data and the subsequent write data.

Figure 3.14 illustrates the above statements on the pipeline flow of Pixel ALU read/write operations.

The pipeline flow of the ROP/Blend units and the Dual Compare units of the Pixel ALU and the Pixel Buffer is illustrated in Figure 3.15. In that figure, the dotted lines show the boundaries of the pipeline stages and the numbers in the square boxes indicate the pipeline stages. A pipeline stage begins with the rising edge of MCLK and ends just prior to the next rising edge. For example, the PALU_BE and PALU_A pins are presented to the 3D-RAM by the rendering controller in pipeline stage 1. On the rising edge of MCLK, they are latched in and pipeline stage 2 starts. Beginning in pipeline stage 3, data from the Pixel Buffer becomes available either as output to the PALU_DQ pins during an ALU read operation or as input to the Dual Compare unit and the ROP/Blend units during an ALU write operation. If the operation is a write, PALU_DQ must be presented with data from the rendering controller in pipeline stage 2, to be latched in and used by the Pixel ALU together with the data from the Pixel Buffer.
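The issue rules stated above (one write per cycle, one read per two cycles, and two NOPs between a read and a following write) can be captured in a small scheduling helper. This is a sketch under stated assumptions: the function names are hypothetical, operations are represented by plain strings, and the cycle counts are the controller-side figures quoted in the text, with pin access time excluded.

```python
def read_burst_cycles(n: int) -> int:
    """Cycles for n consecutive reads as seen by the controller: 2(n+1).

    Access time is not included.
    """
    return 2 * (n + 1)


def write_burst_cycles(n: int) -> int:
    """Cycles for n consecutive writes as seen by the controller: n+1."""
    return n + 1


def insert_nops(ops):
    """Return a copy of ops with NOPs added so that every write that follows
    a read is separated from it by at least two NOP cycles.

    ops is a list of "read", "write", or "nop" strings (a hypothetical
    representation chosen for this sketch). A write-to-read transition
    needs no NOPs because the idle cycle is generated by the chip.
    """
    scheduled = []
    for op in ops:
        if op == "write":
            # Count NOPs already present since the last real operation.
            gap = 0
            while gap < len(scheduled) and scheduled[-1 - gap] == "nop":
                gap += 1
            previous = scheduled[-1 - gap] if gap < len(scheduled) else None
            if previous == "read":
                scheduled.extend(["nop"] * max(0, 2 - gap))
        scheduled.append(op)
    return scheduled


if __name__ == "__main__":
    print(read_burst_cycles(3), write_burst_cycles(4))
    print(insert_nops(["write", "read", "write", "write"]))
```

A rendering controller model could run its operation stream through `insert_nops` before driving the PALU pins; batching reads and writes separately keeps the number of inserted NOPs low.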
Figure 3.14 Example of Pixel Port read/write operations to satisfy the pipeline flow
(Timing diagram. Signals shown: MCLK; PALU_A, PALU_OP, PALU_BE, PALU_WE, PALU_EN; PALU_DQ, PALU_DX; PASS_OUT; HIT.)

Figure 3.15 Pixel ALU and Pixel Buffer block diagram with pipeline flow
(Block diagram. Blocks and signals shown: PALU_A, PALU_BE, PALU_DQ, Pixel Buffer read address and read data, ROP/Blend Units, Compare Unit, Enables, PASS_OUT, PASS_IN, Pixel Buffer write address and write data, each annotated with its pipeline stage number.)

The Picking Logic

The block diagram of the Picking Logic is shown again in Figure 3.16. At the beginning, the Picking Logic should be enabled and the HIT flag should be cleared. This is done either by asserting the RESET pin low or by writing the data Eh into byte 3 of the Compare Control register. It is helpful to note that writing “1” to bits 27 and 25 of the Compare Control register effectively generates one-shots to load the Pick Enable flag and the HIT flag, respectively. The user will not need to perform a second register write operation to reset bits 27 and 25 to “0”. The HIT pin will be set to high after seven cycles (corresponding to pipeline stage 8, as in Table 3.12). In Figure 3.16, this is indicated by the number 8 in the square box above the HIT pin label.

A sequence of Stateful Data Write operations may be issued immediately after the register write, and their effects will take place after the HIT pin is set high by this initial register write. If any of the Stateful Data Writes in the sequence causes the on-chip and off-chip comparison tests to pass (PASS_IN[1:0] and PASS_OUT are all high at pipeline stage 6), the HIT pin is set low until the HIT flag is cleared by writing “10” into bits 25 and 24 of the Compare Control register. See also Figure 8.6, “Pick Logic timing”, for an illustration of the operations described in this section.

Figure 3.16 Block diagram of the Picking Logic
(Block diagram. Signals shown: Compare Control register bits D27, D26, D25, and D24; Pick Enable flag; HIT flag; Stateful_WE; PASS_IN; PASS_OUT; Set HIT Flag; HIT pin (open drain).)

Operations of the Pixel ALU

All operations that involve the Pixel ALU and the Pixel Buffer but not the DRAM array are collectively referred to as Pixel ALU operations. Table 3.13 summarizes the Pixel ALU operations. There are two categories of Pixel ALU operations: register operations and pixel data operations. Register operations include reading the Identification register and writing the control registers; in this case, the register is specified by the PALU_A pins. Pixel data operations include reading data from the Pixel Buffer, writing data to the Pixel Buffer in four different modes, replacing Dirty Tag data, and changing Dirty Tag data with an OR function; in this case, the block address is assigned by the PALU_A[5:3] pins, and the word address is assigned by the PALU_A[2:0] pins.

To support all the blending operations specified in the OpenGL specification version 1.1 (December 21, 1995), a new Pixel ALU command, called Initial Two-Cycle Blending, is added to this implementation of 3D-RAM.
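For pixel data operations, the 6-bit PALU_A value splits into a block address in PALU_A[5:3] and a word address in PALU_A[2:0]. The helper names below are hypothetical; the bit positions come from the text above, and the sample addresses are the ones used in the vertical-scroll example later in this chapter.

```python
def split_pixel_address(palu_a: int) -> tuple[int, int]:
    """Split a 6-bit PALU_A value into (block, word).

    block = PALU_A[5:3], word = PALU_A[2:0].
    """
    if not 0 <= palu_a <= 0b111111:
        raise ValueError("PALU_A is a 6-bit value")
    return (palu_a >> 3) & 0b111, palu_a & 0b111


def join_pixel_address(block: int, word: int) -> int:
    """Build a 6-bit PALU_A value from a block and a word index (0 to 7 each)."""
    if not (0 <= block <= 7 and 0 <= word <= 7):
        raise ValueError("block and word are 3-bit values")
    return (block << 3) | word


if __name__ == "__main__":
    print(split_pixel_address(0b011000))            # Block 3, Word 0
    print(format(join_pixel_address(0, 5), "06b"))  # Block 0, Word 5
```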
Table 3.13 Pixel ALU operation encoding

PALU_EN  PALU_WE  PALU_OP  PALU_A      Operation
00       -        -        -           NOP
10       -        -        -           NOP
01       -        -        -           NOP
11       0        000      Block:Word  Read Pixel Buffer (note 1)
11       0        001      -           Reserved
11       0        010      -           Reserved
11       0        011      -           Reserved
11       0        100      -           Reserved
11       0        101      -           Reserved
11       0        110      -           Reserved
11       0        111      000111      Read Identification Register
11       1        000      Block:Word  Stateless Initial Data Write
11       1        001      Block:Word  Stateless Normal Data Write
11       1        010      Block:Word  Stateful Initial Data Write (note 1)
11       1        011      Block:Word  Stateful Normal Data Write (note 1)
11       1        100      Block:xxx   Replace Dirty Tag
11       1        101      Block:xxx   OR Dirty Tag
11       1        110      Block:Word  Initiate Two-Cycle Blending (NEW) (note 1)
11       1        111      Register    Write Control Registers (note 1)

Note 1: One MCLK cycle must be inserted between the write to the Color Depth Select register and one of the following ALU operations: Read Pixel Buffer, Stateful Initial Data Write, Stateful Normal Data Write, and Initial Two-Cycle Blending. See also the description of the Color Depth Select register. (NEW rev.1.03)

Register Operations

There are fourteen registers in the 3D-RAM. Their encoding is shown in Table 3.14. Among these registers, the Identification register is read-only, and all other registers are write-only. All registers are 32 bits wide except the Constant Source register, which is 36 bits. The write-only registers are loaded from the PALU_DQ[31:0] pins. In the case of the 36-bit Constant Source register, the PALU_DX[3:0] pins specify the most significant four bits. The operations launched in the previous cycles are never affected by the current register load. The operations launched in the following cycles are always affected by the register load.

The PALU_BE[3:0] pins apply to all register read and write operations. PALU_BE0 enables writes to bits 7 through 0; PALU_BE1 enables writes to bits 15 through 8; PALU_BE2 enables writes to bits 23 through 16; PALU_BE3 enables writes to bits 31 through 24.

Finally, during the Stateful Data Write modes, all bits in these registers are fully effective; during the Stateless Data Write modes, all register bits are ignored except that the Color Depth Select register, the bits controlling the Picking Logic, and the Stencil Control register maintain some of their special functions. Please refer to the description of these bits for more details.

Table 3.14 Register address encoding

PALU_A    Register                  Mnemonic  Type        Reset Value    Stateless Mode
000 000   Plane Mask                PM        Write Only  FFFF FFFFh     not applicable
000 001   Constant Source           CSR       Write Only  0 0000 0000h   not applicable
000 010   Match Mask                MatMask   Write Only  0000 0000h     not applicable
000 011   Magnitude Mask            MagMask   Write Only  0000 0000h     not applicable
000 100   ROP/Blend Control         RBC       Write Only  0303 0303h     0303 0303h
000 101   Compare Control           CCR       Write Only  0A00 0000h     0000 0000h
000 110   Write Address Control     WAC       Write Only  0000 0000h     0000 0000h
000 111   Identification*           ID        Read Only   0130 A039h     not applicable
001 000   Blend_2 Control (NEW)     BLD2      Write Only  0000 0000h     0000 0000h
001 001   Preblend Control (NEW)    PBC       Write Only  0000 0000h     0000 0000h
001 010   Stencil Planes (NEW)      StP       Write Only  00FF 0000h     00FF 0000h
001 011   Stencil Control (NEW)     StC       Write Only  3330 0000h     3330 0000h
001 110   PASS_INs Select (NEW)     PINS      Write Only  0000 0100h     not applicable
001 111   Color Depth Select (NEW)  CDS       Write Only  0000 0000h     programmed value

*Note: The reset value for the Identification register is for Version 0.

Identification Register (ID[31:0])

The read-only Identification register contains the manufacturer identification code (ID), part number code, and version code in the format shown in Figure 3.17. The manufacturer ID is 01Ch for Mitsubishi Electronics. The part number is read as 130Ah for M5M410092B. Bit 0 is always “1”, so for Version 0, this identification register should be read as 0130 A039h.

Constant Source Register (CSR[35:0])

This register is used to store 36-bit data that is loaded from the PALU_DQ and PALU_DX pins. (The data extension pins PALU_DX[3:0] are loaded into the most significant four bits of the Constant Source register.) The bits of this register are commonly referred to as KX[3:0] for the most significant four bits and K[31:0] for the low-order 32 bits. The four ROP/Blend units and the Dual Compare units can individually select this register to provide data.

This register resets to 0 0000 0000h (see Table 3.14).

Plane Mask Register (PM[31:0])

This register affects both the Stateful Data Writes of the Pixel ALU operations and the Masked Write Block (MWB) of the DRAM operations. The effect is simultaneous on both types of operations. Therefore, the user must exercise caution to ensure the desired plane masking is achieved when such concurrency between the Pixel ALU and the DRAM array is exploited.

For the Stateful Data Writes, each bit of the Plane Mask register is a per-bit write enable for the 32-bit data entering the Pixel Buffer. For the MWB operation, each bit of the Plane Mask register serves as a per-bit write enable for the 32-bit word 0 entering the sense amplifiers of a DRAM bank, and the same write masking mechanism is applied to the upper seven words of the specified Pixel Buffer block. Figure 3.2 illustrates this masking relationship between the write data and the bits of the Plane Mask register. The value “1” means write enable; the value “0” means write disable.

This register resets to FFFF FFFFh (see Table 3.14).

Match Mask Register (MTM[31:0])

This register determines which data bits participate in the match test. Setting the bits of this register to “1” causes the corresponding data bits to be compared by the Match Comparison unit. Setting the bits of this register to “0” causes the corresponding data bits to be ignored in the match test.
This register resets to 0000 0000h.

Magnitude Mask Register (MGM[31:0])

This register determines which data bits participate in the magnitude test. Setting the bits of this register to “1” causes the corresponding data bits to be compared by the Magnitude Comparison unit. Setting the bits of this register to “0” causes the corresponding data bits to be ignored in the magnitude test.

This register resets to 0000 0000h.

Figure 3.17 Identification register data format
(Bit-field diagram of bits 31 through 0. Fields, from most significant to least significant: Version, Part Number, Manufacturer ID, and a final bit that is always “1”.)

ROP/Blend Control Register (RBC[31:0])

This register controls the operations of the four ROP/Blend units. Each ROP/Blend unit is independently controlled by an 8-bit field of this 32-bit register. Bits 7 through 0 are repeated three more times for Units 1, 2, and 3. That is, bits 15 through 8 for unit 1; bits 23 through 16 for unit 2; and bits 31 through 24 for unit 3.

The data format of the RBC register is illustrated in Figure 3.18 and explained in the bullets below.

Figure 3.18 ROP/Blend Control register data format
(Bit-field diagram of bits 31 through 0, one 8-bit field per ROP/Blend unit. Each field consists of MULTP1 Select, ROP/Adder Source Select, ROP/Blend Select, and Raster Op. Select.)

• Bits 8n+7 through 8n+6 select a source for MULTP1 (Table 3.15).

Table 3.15 MULTP1 source encoding for ROP/Blend unit n

RBC[8n+7:8n+6]   Fraction Source for ROP/Blend Unit n
00               100h (1.00)
01               {KXn, K[8n+7:8n]}
10               {PALU_DXn, PALU_DQ[8n+7:8n]}
11               {PALU_DX3, PALU_DQ[31:24]}

• Bit 8n+5 selects a source for ROP unit n and for the adder in Blend unit n. If this bit is “0”, the data from PALU_DQ[8n+7:8n] is selected; if this bit is “1”, the Constant Source register bits K[8n+7:8n] are selected.

• Bit 8n+4 configures ROP/Blend unit n. For bit 28, the value “0” sets ROP/Blend unit 3 in ROP and Stencil mode and forces the output of the Alpha-Saturate block to be always OLD[31:24] regardless of the programmed values in BLD2[29:28] and PBC[29:28]; the value “1” sets ROP/Blend unit 3 in Blend mode and enables the Alpha-Saturate logic. For bits 4, 12, and 20, the value “0” sets ROP/Blend units 0, 1, and 2 in ROP mode, respectively; the value “1” sets ROP/Blend units 0, 1, and 2 in Blend mode, respectively. Note that when in Blend mode, Blend units 0, 1, and 2 will calculate with the correct alpha-saturate value only when bit 28 of this register is also set to “1”. Note also that the bit field St.Enable does not specifically set the operation mode of ALU unit 3; St.Enable enables bit planes 31 through 24 to be recognized as stencil bits when ALU unit 3 is in ROP and Stencil mode, and is ignored when ALU unit 3 is in Blend mode. (NEW rev.1.03)

• Bits 8n+3 through 8n+0 select one of the sixteen possible raster operations for Unit n. In Table 3.16, “NEW” represents the data from the PALU_DQ[31:0] pins or from the Constant Source register bits K[31:0] (as selected by bit 8n+5), “OLD” represents the 32-bit data from the Pixel Buffer, and “~” means logical inversion. All of these operations are bit-wise logical operations.

Table 3.16 Raster operation encoding

RBC[8n+3:8n+0]   Raster Operation
0000             All bits zero
0001             NEW and OLD
0010             NEW and ~OLD
0011             NEW
0100             ~NEW and OLD
0101             OLD
0110             NEW xor OLD
0111             NEW or OLD
1000             ~NEW and ~OLD
1001             ~NEW xor OLD
1010             ~OLD
1011             NEW or ~OLD
1100             ~NEW
1101             ~NEW or OLD
1110             ~NEW or ~OLD
1111             All bits one

This register resets to 0303 0303h. This value passes data unchanged from the PALU_DQ pins through all four ROP/Blend units.

During a Stateless Data Write access, the ROP/Blend units behave as if this register were set to 0303 0303h, regardless of its actual value.

Compare Control Register (CCR[31:0])

This register controls the Picking Logic and the Dual Compare unit, and thereby indirectly influences the status of the PASS_OUT pin. Only 12 bits of this register are currently defined. The other 20 bits are reserved.

Figure 3.19 Compare Control register data format
(Bit-field diagram of bits 31 through 0. Fields: Picking Logic, Dual Compare Source, Decal Stencil, Match Test, and Magnitude Test; the remaining bits are reserved.)

• Bits 31 through 28, 23 through 18, 15 through 11, and 7 through 3 are reserved. They are written as “0” for future compatibility.

• Bits 27 through 24 control the Picking Logic. Bits 25 and 24 clear or set the HIT flag. Bits 27 and 26 enable or disable the Picking Logic. The encodings are given in Table 3.17 and Table 3.18. It is helpful to note that after bits 27 and 25 are loaded with “1”, these bits are automatically reset to “0” in the next MCLK cycle, thereby restoring the state machine to the “No Change” state and saving the follow-up register writes. In this sense, writing “1” into bits 27 and 25 generates one-shots at the outputs of CCR[27] and CCR[25].

Table 3.17 Pick Enable encoding

CCR[27:26]   Function
0X           No change to Pick Enable flag
10           Disable Picking Logic
11           Enable Picking Logic

Table 3.18 Pick Hit encoding

CCR[25:24]   Function
0X           No change to HIT flag
10           Clear HIT flag
11           Set HIT flag

• Bits 17 through 16 select the source for the Dual Compare unit. Bit 16 directly controls the Match Compare source, while the result of Bit 17 XOR Bit 16 controls the Magnitude Compare source. In this way, the first two codes are compatible with the previous generations, M5M410092 and M5M410092A, which had bit 17 reserved and always set to “0”. (NEW)

Table 3.19 Dual Compare source selection encoding

CCR[17:16]   Magnitude Compare Source   Match Compare Source       Comments
00           PALU_DQ pins               PALU_DQ pins               backward compatible
01           Constant Source register   Constant Source register   backward compatible
10           Constant Source register   PALU_DQ pins               new feature
11           PALU_DQ pins               Constant Source register   new feature

• Bit 11 previously enabled the invert stencil mode in M5M410092A but is now reserved and should always be written as “0”.

Warning! The invert stencil mode is removed from this device, M5M410092B, and an incompatibility exists between this device and its previous generation, M5M410092A, with respect to this function.

• Bit 10 enables the decal stencil mode (see the section on “Stencil Modes” on page 40). “0” selects the normal rendering operation, where the stateful write is enabled when PASS_IN[1:0] and PASS_OUT are all high. “1” selects the decal stencil mode, where the stateful write is enabled in one of two conditions: (1) PASS_IN[1:0] and PASS_OUT are all high; or (2) PASS_IN[1:0] are high and the match compare output is low (failing the match test).

• Bits 9 through 8 select one of four tests for the Match Compare unit (Table 3.20).

Table 3.20 Match test encoding

CCR[9:8]   Test condition
00         Always pass
01         Never pass
10         Pass if NEW == OLD
11         Pass if NEW != OLD

• Bits 2 through 0 select one of eight tests for the Magnitude Compare unit (Table 3.21).

Table 3.21 Magnitude test encoding

CCR[2:0]   Test condition
000        Always pass
001        Pass if NEW > OLD
010        Pass if NEW == OLD
011        Pass if NEW >= OLD
100        Never pass
101        Pass if NEW <= OLD
110        Pass if NEW != OLD
111        Pass if NEW < OLD

During a Stateful Data Write to the Pixel Buffer, the pixel data is actually written only when the Magnitude test, the Match test, and the external PASS_IN pin all pass. The PASS_OUT pin is set to pass only when the Magnitude test and the Match test both pass.

While the Picking Logic is enabled, all Stateful Data Writes which pass both compare tests (PASS_OUT high) while PASS_IN is high will set (i.e., OR a “1” into) the HIT flag. The HIT flag will then remain set until cleared by writing “10” into bits 25 and 24. The HIT flag is active high, and when it is “1”, it drives the open-drain HIT pin low.

This register resets to 0A00 0000h, which means the HIT flag is cleared and the Picking Logic is disabled.

During a Stateless Data Write access, the Dual Compare unit behaves as if this register were set to 0000 0000h, regardless of its actual value.

Write Address Control Register (WAC[31:0])

The 1-bit Write Address Control register selects the Pixel Buffer write address between the PALU_A[5:0] pins (the normal path) and the PALU_DQ[29:24] pins (the vertical scroll acceleration path). Only 1 bit of this register is currently used for the Pixel ALU function. The other 31 bits are reserved.

• Bits 31 through 1 are reserved. They are written as “0” for future compatibility.

• Bit 0 selects the source for the Pixel Buffer write address. “0” selects the Pixel Buffer write address from the PALU_A[5:0] pins. “1” selects the Pixel Buffer write address from the PALU_DQ[29:24] pins.

This register resets to 0000 0000h.

During a Stateless Data Write, the write address is selected as if this register were set to 0000 0000h, regardless of its actual value.

An Application of the Write Address Control Register

The Write Address Control register is used to speed up vertical scroll in screen display. Taking advantage of the pipeline structure, reading data from one Pixel Buffer location and writing into another location can be achieved in one Stateful Data Write.

For the vertical scroll illustrated in Figure 3.20, the data in Pixel A is to be moved to Pixel B. Assume that the Pixel A data is stored in Pixel Buffer [Block 3:Word 0] and that Pixel B is in [Block 0:Word 5]. Figure 3.21 shows the pipeline flow for the write address selection, and Figure 3.22 shows the data stream for the example in Figure 3.20.

Before the Stateful Data Write that moves the Pixel A data to the Pixel B location is started, four registers should be set as follows:

• Write into the Write Address Control Register with “1”.

• Write into the ROP/Blend Control Register with 0505 0505h to select old data.

• Write into the Compare Control Register with 0000 0000h to pass data into the Pixel Buffer.

• Write into the Plane Mask Register with FFFF FFFFh to pass every bit into the Pixel Buffer for the Stateful Data Write.

The Stateful Data Write issued later should have the read address asserted on the PALU_A[5:0] pins and the write address on the PALU_DQ[29:24] pins at the next cycle. Seven cycles later in the pipeline path, the Pixel A data will be written into the Pixel B location.

Figure 3.20 Pixel movement in vertical scroll
(Screen display diagram. Pixel A is at [Block 3 : Word 0] in the Pixel Buffer; Pixel B is at [Block 0 : Word 5] in the Pixel Buffer.)

Figure 3.21 Pipeline flow of the Write Address Control
(Block diagram. PALU_A[5:0] supplies the Pixel Buffer read address; PALU_DQ[29:24] supplies the Pixel Buffer write address through the pipeline stages; read data and write data paths are annotated with their stage numbers.)

Figure 3.22 Pipeline for performing a vertical scroll
(Timing diagram over MCLK cycles 0 through 9. Four register writes with PALU_OP = 111 to PALU_A = 000110, 000100, 000101, and 000000, carrying PALU_DQ = 0000 0001h, 0505 0505h, 0000 0000h, and FFFF FFFFh, are followed by Stateful Writes with PALU_OP = 011, with the read address 011000 on PALU_A and the write address 000101 on PALU_DQ.)
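The register set-up for the vertical scroll can be written out as a list of Pixel ALU operations. The sketch below only builds the sequence as data; it does not model MCLK timing. The opcode and register address values are taken from Table 3.13 and Table 3.14, while the tuple layout, the function names, and the choice of zero for the PALU_DQ bits outside [29:24] are assumptions made for illustration.

```python
REG_WRITE = 0b111        # PALU_OP: Write Control Registers (Table 3.13)
STATEFUL_NORMAL = 0b011  # PALU_OP: Stateful Normal Data Write (Table 3.13)

# PALU_A register addresses (Table 3.14).
PM, RBC, CCR, WAC = 0b000000, 0b000100, 0b000101, 0b000110


def raster_op_code(rbc: int, unit: int) -> int:
    """Extract the raster operation field RBC[8n+3:8n+0] for unit n."""
    return (rbc >> (8 * unit)) & 0xF


def vertical_scroll_sequence(src_block, src_word, dst_block, dst_word):
    """Return (PALU_OP, PALU_A, PALU_DQ) tuples that copy one pixel.

    The read address goes on PALU_A[5:0]; the write address goes on
    PALU_DQ[29:24]. The remaining PALU_DQ bits are set to zero here,
    which is an assumption: with the raster operation set to OLD, the
    new data is not used.
    """
    read_addr = (src_block << 3) | src_word
    write_addr = (dst_block << 3) | dst_word
    return [
        (REG_WRITE, WAC, 0x00000001),  # write address from PALU_DQ[29:24]
        (REG_WRITE, RBC, 0x05050505),  # raster op 0101 = OLD in all four units
        (REG_WRITE, CCR, 0x00000000),  # match and magnitude tests always pass
        (REG_WRITE, PM, 0xFFFFFFFF),   # every bit plane write-enabled
        (STATEFUL_NORMAL, read_addr, write_addr << 24),
    ]


if __name__ == "__main__":
    # Move Pixel A at [Block 3:Word 0] to Pixel B at [Block 0:Word 5].
    for op, addr, data in vertical_scroll_sequence(3, 0, 0, 5):
        print(f"PALU_OP={op:03b} PALU_A={addr:06b} PALU_DQ={data:08X}h")
```

After the scroll, the Write Address Control register would normally be written back to 0000 0000h so that later Stateful Data Writes take their write address from PALU_A again.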
Figure 3.23 Blend_2 Control register data format
(Bit-field diagram of bits 31 through 0. Fields: Alpha-Saturate Output Select, and for each unit MULTP2 Select, MULTP1 Select, and ADDEND Select; R = reserved.)

Blend_2 Control Register (BLD2[31:0]) (NEW)

This register provides additional control of the multiplicands and addends for the four Blend units. Each Blend unit is independently controlled by a 4-bit field of this 32-bit register. Bits 3 through 0 are repeated three more times on byte boundaries for Units 1, 2, and 3. That is, bits 11 through 8 for unit 1; bits 19 through 16 for unit 2; and bits 27 through 24 for unit 3. In addition, bits 29 and 28 are used to select the output of the Alpha-Saturate block. This output can then be selected by each of the ROP/Blend units as the source for the second multiplicand, MULTP2.

The data format of this register is illustrated in Figure 3.23.

• Bits 31 through 30, 23 through 20, 15 through 12, and 7 through 4 are reserved. They shall be written as “0” for future compatibility.

• Bits 29 through 28 determine the output of the Alpha-Saturate block (Table 3.22), when RBC[28]=1 (see also the description of the ROP/Blend Control register). This output can then be selected by any of the four units as the source of the second multiplicand, MULTP2, using BLD2[8n+3:8n+2]. (NEW rev.1.03)

Table 3.22 Encoding for Output of Alpha-Saturate Block

BLD2[29:28]   Output of Alpha-Saturate Block
00            min{NEW[31:24], ~OLD[31:24]} (α-sat)
01            NEW[31:24] (As)
10            OLD[31:24] (Ad)
11            ~OLD[31:24] (~Ad)

• Bits 8n+3 through 8n+2 select the source for the second multiplicand, MULTP2 (Table 3.23).

Table 3.23 Encoding for MULTP2 Source Selection

BLD2[8n+3:8n+2]   Multiplicand 2 (MULTP2)
00                OLD[8n+7:8n]
01                ~OLD[8n+7:8n]
1X                Data selected by bits [29:28] of this register

• Bit 8n+1 selects the source for the first multiplicand, MULTP1. If this bit is “0”, the data selected by bits [8n+7:8n+6] of the ROP/Blend Control Register is used; if this bit is “1”, the OLD[8n+7:8n] data is used.

• Bit 8n selects the source for the ADDEND. If this bit is “0”, the data selected by bit [8n+5] of the ROP/Blend Control Register is used; if this bit is “1”, the OLD[8n+7:8n] data is used.

This register resets to 0000 0000h. This value causes the ROP/Blend units to operate based on the settings in the ROP/Blend Control Register, which assures compatibility with previous generations of the 3D-RAM.

During a Stateless Data Write access, the ROP/Blend units behave as if this register were set to 0000 0000h regardless of its actual value.

Figure 3.24 Preblend Control register data format
(Bit-field diagram of bits 31 through 0. Fields: Alpha-Saturate Output Select, and for each unit MULTP2 Select and ADDEND Select; R = reserved.)

Preblend Control Register (PBC[31:0]) (NEW)

This register controls the first cycle, known as the Preblend Cycle, of the two-cycle Loop Back operation for the four ROP/Blend units by selecting MULTP2 for the Preblend Cycle and the ADDEND for the second, or Normal, cycle (see the section on “Pixel ALU Blend Factor Selections” on page 26). During the Preblend Cycle, MULTP1 is fixed to PALU_DQ[8n+7:8n] and ADDEND is fixed to {PALU_DXn, PALU_DQ[8n+7:8n]}. Each ROP/Blend unit is independently controlled by a 4-bit field of this 32-bit register. Bits 3 through 0 are repeated three more times on byte boundaries for Units 1, 2, and 3. That is, bits 11 through 8 for unit 1; bits 19 through 16 for unit 2; and bits 27 through 24 for unit 3. Since there is only one Alpha-Saturate block (located in ROP/Blend Unit 3), the selection made by bits 29 and 28 applies to all four Blend units.

The data format of this register is illustrated in Figure 3.24.

• Bits 31, 30, 25, 23 through 20, 17, 15 through 12, 9, 7 through 4, and 1 are reserved. They shall be written as “0” for future compatibility. (NEW rev.1.03)

• Bits 29 through 28 determine the output of the Alpha-Saturate block (Table 3.24), when RBC[28]=1 (see also the description of the ROP/Blend Control register). This output can then be selected by any of the four units as the source of the second multiplicand, MULTP2, using PBC[8n+3:8n+2]. (NEW rev.1.03)

Table 3.24 Encoding for Output of Alpha-Saturate Block

PBC[29:28]   Output of Alpha-Saturate Block
00           min{NEW[31:24], ~OLD[31:24]} (α-sat)
01           NEW[31:24] (As)
10           OLD[31:24] (Ad)
11           ~OLD[31:24] (~Ad)

• Bits 8n+3 through 8n+2 select the source for the second multiplicand, MULTP2, for the Preblend Cycle (Table 3.25).

Table 3.25 Encoding for MULTP2 Source Selection for Preblend Cycle

PBC[8n+3:8n+2]   MULTP2 Source for Preblend Cycle
00               OLD[8n+7:8n]
01               ~OLD[8n+7:8n]
1X               Data selected by bits [29:28] of this register

• Bit 8n selects the source for the ADDEND for the second, or Normal, cycle. If this bit is “0”, the multiplier output from the Preblend Cycle is “looped back” to become the ADDEND for the Normal Cycle; if this bit is “1”, the ADDEND from the Preblend Cycle is “looped back” to become the ADDEND for the Normal Cycle.

This register resets to 0000 0000h. This value causes the ROP/Blend units to operate based on the settings in the ROP/Blend Control Register, which assures compatibility with previous generations of the 3D-RAM.

During a Stateless Data Write access, the ROP/Blend units behave as if this register were set to 0000 0000h regardless of its actual value.

Figure 3.25 Stencil Planes register data format
(Bit-field diagram of bits 31 through 0. Fields: St.Enable[7:0], StF.MASK[7:0], reserved.)

Stencil Planes Register (StPl[31:0]) (NEW)

This register defines the bits allocated for stencil planes in OpenGL stencil mode and the value masking of these stencil planes when the selected stencil comparison function is performed against the stencil reference value. The on-chip stencil hardware acceleration features are implemented in ROP/Blend unit 3. Therefore, only bits 31 through 24 of the 32-bit ALU unit are available for use as stencil planes. Any number of bits from 0 through 8 may be allocated as stencil planes. In order for the GL_INCREMENT and GL_DECREMENT functions to work properly, it is necessary to keep the stencil planes in one contiguous group.

• Bits 31 through 24 = St.Enable[7:0] provide a bitwise selection of which bits are allocated as stencil planes in OpenGL mode. Bits that are enabled are subject to the controls of the Stencil Control register. Bits that are disabled are available for use by ROP Unit 3 or the Dual Compare unit. For each bit, a “0” disables that bit for OpenGL stencil mode. A “1” enables that bit for OpenGL stencil mode. Again, for the INCREMENT and DECREMENT stencil functions to work properly, it is necessary that the enabled stencil planes be in one contiguous group. For example, if St.Enable[7:0] is set to the value 0011 1100, then bits 29 through 26 of the ALU unit are used for stencil planes and bits 31, 30, 25, and 24 may be used for ROP or Compare functions. Note that the blending function in ROP/Blend Unit 3 cannot be used if any bits in this unit are used for OpenGL stencil.

• Bits 23 through 16 = StF.MASK[7:0] provide a bitwise stencil value mask for stencil comparison functions. This mask maps directly to bits 31 through 24 of the 32-bit bus. That is, bit 23 of this register will mask/unmask bit 31 of the ALU bus; bit 22 will mask/unmask bit 30 of the ALU bus, and so on. For each bit, a “0” will cause the corresponding bit to be ignored in stencil comparison functions. A “1” will cause the corresponding bit to be used in the stencil comparison functions. For convenience and clarity, the mnemonic used here corresponds to the OpenGL terminology. Specifically, “StF” refers to the OpenGL command “glStencilFunc” with “MASK” referring to the parameter “mask” of this command. Thus, this bit field also corresponds to the symbolic parameter “GL_STENCIL_VALUE_MASK”.

• Bits 15 through 0 are reserved. They shall be written as “0” for future compatibility.

This register resets to 00FF 0000h, which means that all stencil planes are disabled.

During a Stateless Data Write access, the Pixel ALU behaves as if this register were set to 00FF 0000h regardless of its actual value.

Figure 3.26 Stencil Control register data format
(Bit-field diagram of bits 31 through 0. Field labels: StOP.FAIL[2:0], StOP.ZFAIL[2:0], StOP.ZPASS[2:0], StF.REF_SELECT[2:0], StF.FUNC[2:0]; R = reserved.)

Stencil Control Register (StC[31:0]) (NEW)

This register defines the stencil comparison function and the stencil operations that will be performed for the OpenGL stencil mode. Only 12 bits of this register are currently used for stencil hardware acceleration, and the other 20 bits are reserved. For convenience and clarity, the mnemonics used here correspond to the OpenGL terminology. Specifically, “StOP” refers to the OpenGL command “glStencilOp” with “ZPASS”, “ZFAIL”, and “FAIL” referring to the parameters of this command; “StF” refers to “glStencilFunc” with “FUNC” referring to the parameter “func” of this command. Also, it is important to note that the bits programmed in this register are effective only in the case when both PASS_IN pins are “1” and the Match Compare passes, since otherwise no data will be written into the Pixel Buffer and the PASS_OUT pin will be “0”.

• Bits 31, 27, 23, 19, and 15 through 0 are reserved. They shall be written as “0” for future compatibility.

• Bits 30 through 28 = StOP.FAIL[2:0] define the stencil operation to be executed in the case of GL_STENCIL_FAIL. That is, these bits determine which one of the stencil operations listed in Table 3.26 will be performed when the Stencil Compare function fails. Disabled bits keep their OLD data. In the GL_INCR operation, the maximum value is defined as 2^n - 1, where n = the number of bits enabled by St.Enable[7:0].

Table 3.26 Stencil Operation for StOP.FAIL

StOP.FAIL[2:0]   Stencil Operation   Definition
000              GL_ZERO             Enabled bits cleared to zero
001              GL_KEEP             Enabled bits remain OLD data
010              GL_INVERT           Enabled bits are inverted OLD
011              GL_REPLACE          Enabled bits are replaced by StF.REF
100, 110         GL_INCR             Enabled bits are incremented by 1 (clamped to max. value)
101, 111         GL_DECR             Enabled bits are decremented by 1 (clamped to zero value)

function passes, but the MAGNITUDE Compare function fails. Disabled bits keep their OLD data. In the GL_INCR operation, the maximum value is defined as 2^n - 1, where n = the number of bits enabled by St.Enable[7:0].

• Bits 26 through 24 = StOP.ZFAIL[2:0] define the stencil operation to be executed in the case of GL_STENCIL_PASS_DEPTH_FAIL.
That is, these bits determine which one of the stencil operations listed in Table 3.27 will be performed when the Stencil Compare Table 3.27 Stencil Operation for StOP.ZFAIL StOP.ZFAIL[2:0] Stencil Operation Definition 000 GL_ZERO Enabled bits cleared to zero 001 GL_KEEP Enabled bits remain OLD data 010 GL_INVERT Enabled bits are inverted OLD 011 GL_REPLACE Enabled bits are replaced by StF.REF 100 110 GL_INCR Enabled bits are incremented by 1 (clamped to max. value) 101 111 GL_DECR Enabled bits are decremented by 1 (clamped to zero value) compare and Stencil compare functions both pass. Disabled bits use the ROP unit results. In the GL_INCR operation, the maximum value is defined as 2n-1, where n=the number of bits enabled by St.Enable[7:0]. • Bits 22 through 20 = StOP.ZPASS[2:0] define the stencil operation to be executed for the case of GL_STENCIL_PASS_DEPTH_PASS. That is, these bits determine which one of the stencil operations listed in Table 3.28 will be performed when the MAGNITUDE 68 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP Table 3.28 Stencil Operation for StOP.ZPASS Stencil Operation Definition 000 GL_ZERO Enabled bits cleared to zero 001 GL_KEEP Enabled bits remain OLD data 010 GL_INVERT Enabled bits are inverted OLD 011 GL_REPLACE Enabled bits are replaced by StF.REF 100 110 GL_INCR Enabled bits are incremented by 1 (clamped to max. value) 101 111 GL_DECR Enabled bits are decremented by 1 (clamped to zero value) • Bits 19 = StF.REF_SELECT defines the source of the StF.REF stencil data. Setting this bit to 0 selects the StF.REF from the pins PALU_DQ[31:24]; setting this bit to 1, the bits 31 through 24 in the Constant Source register will be used as the StF.REF. register bits. The mnemonic “StF” refers to the OpenGL command “glStencilFunc” with “REF” referring to the parameter “ref” of this command. Table 3.29 defines which test will be used for the stencil comparison. 
This register resets to 3330 0000h, which means that the stencil test always passes. • Bits 18 through 16 = StF.FUNC[2:0] define the stencil comparison function. The StF.REF stencil data (from the pins PALU_DQ[31:24]) is compared with the OLD stencil data based on the settings of these During a Stateless Data Write access, the Pixel ALU behaves as if this register were set to 3330 0000h regardless of its actual value. Table 3.29 Stencil Comparison Functions StF.FUNC[2:0] Stencil Test Definition 000 GL_ALWAYS Always pass stencil test 001 GL_GREATER Pass stencil test if ( StF.REF && StF.Mask ) > ( OLD && StF.Mask ) 010 GL_EQUAL Pass stencil test if (StF.REF && StF.Mask ) == ( OLD && StF.Mask) 011 GL_GEQUAL Pass stencil test if ( StF.REF && StF.Mask ) >= ( OLD && StF.Mask) 100 GL_NEVER Always fail stencil test 101 GL_LEQUAL Pass stencil test if ( StF.REF && StF.Mask ) <= ( OLD && StF.Mask) 110 GL_NOTEQUAL Pass stencil test if ( StF.REF && StF.Mask ) != ( OLD && StF.Mask) 111 GL_LESS Pass stencil test if ( StF.REF && StF.Mask ) < ( OLD && StF.Mask) 69 3 Pixel ALU Operations StOP.ZPASS[2:0] Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 reserved reserved PASS_IN[0] Select PASS_IN[1] Select 3 Pixel ALU Operations Figure 3.27 PASS_IN Select register data format PASS_INs Select Register (PINS[31:0]) (NEW) Color Depth Select Register(CDS[31:0]) (NEW) Only 2 bits of this register are currently used for the Pixel ALU function, and the other 30 bits are reserved. Only 1 bit of this register is currently used for the Pixel ALU function, and the other 31 bits are reserved. • Bit 0 enables the PASS_IN[1] pin to participate in the internal write enable logic of the Pixel Buffer. Setting this bit to “0” disables the PASS_IN[1] which is then internally set to “1” and has no effect on the operation of the Pixel ALU. 
Setting this bit to “1” enables and passes through the PASS_IN[1] signal to be ANDed with the PASS_IN[0] signal. See also Figure 3.13 for illustration. • Bit 0 selects the color depth for Pixel ALU operations. “0” selects the normal (8,8,8,8) 32-bit blending mode. Also, with this setting, color data can be stored in the (5,6,5,0) 16-bit mode. “1” selects the (4,4,4,4) 16-bit blending mode. For details on the 16-bit color modes, refer to the section titled “16-bit Color Modes” in this chapter. • Bits 31 through 1 are reserved. They shall be written as “0” for future compatibility. • Bit 8 enables the PASS_IN[0] pin to participate in the internal write enable logic of the Pixel Buffer. Setting this bit to “0” disables the PASS_IN[0] which is then internally set to “1” and has no effect on the operation of the Pixel ALU. Setting this bit to “1” enables and passes through the PASS_IN[0] signal to be ANDed with the PASS_IN[1] signal. See also Figure 3.13 for illustration. This register resets to 0000 0000h. This value assures compatibility with previous generations of the 3D-RAM. Note One MCLK cycle must be inserted between (a) the write to the Color Depth Select register and (b) the following ALU operations: Read Pixel Buffer, Stateful Initial Data Write, Stateful Normal Data Write, and Initiate Two-Cycle Blending. Valid ALU operations that provide such one-MCLKcycle insertion include ALU NOP, Write Control Register, or a stateless operation. (NEW rev.1.03) • Bits 31 through 1 are reserved. They shall be written as “0” for future compatibility. This register resets to 0000 0100h. This value assures compatibility with previous generations of the 3D-RAM. Note During a Stateless Data Write, the Color Depth Select register still behaves based on its current state. In other words, if bit 0 of the register is set to “1”, the Stateless Write will be done in the (4,4,4,4) mode of operation. 70 Rev. 
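The stencil encodings in Tables 3.26 through 3.29 can be sketched in software as follows. This is a minimal behavioral model, not the device's implementation: the function names `stencil_test` and `stencil_op` are hypothetical, all values are treated as the 8-bit slice that maps to ALU bits 31 through 24, the enabled planes are assumed to form one contiguous group, and GL_REPLACE is assumed to copy the StF.REF bits that sit in the same positions as the enabled planes.

```python
def stencil_test(func, ref, old, mask):
    """Return True if the stencil test passes (encodings of Table 3.29).

    ref, old, mask are 8-bit values standing for StF.REF, the OLD stencil
    data, and StF.MASK[7:0]. Both operands are masked before comparing.
    """
    a, b = ref & mask, old & mask
    results = {
        0b000: True,    # GL_ALWAYS
        0b001: a > b,   # GL_GREATER
        0b010: a == b,  # GL_EQUAL
        0b011: a >= b,  # GL_GEQUAL
        0b100: False,   # GL_NEVER
        0b101: a <= b,  # GL_LEQUAL
        0b110: a != b,  # GL_NOTEQUAL
        0b111: a < b,   # GL_LESS
    }
    return results[func]


def stencil_op(op, old, ref, enable):
    """Apply a stencil operation (Tables 3.26-3.28) to the enabled planes.

    enable stands for St.Enable[7:0] and is assumed to be one contiguous
    run of 1s. Disabled bits are returned unchanged (the FAIL and ZFAIL
    behavior); the ZPASS case, where disabled bits take the ROP result,
    is not modeled here.
    """
    if enable == 0:
        return old                                 # no stencil planes enabled
    shift = (enable & -enable).bit_length() - 1    # position of lowest enabled bit
    width = bin(enable).count("1")                 # n = number of enabled bits
    top = (1 << width) - 1                         # 2^n - 1, the GL_INCR clamp
    field = (old & enable) >> shift
    if op == 0b000:                                # GL_ZERO
        new = 0
    elif op == 0b001:                              # GL_KEEP
        new = field
    elif op == 0b010:                              # GL_INVERT
        new = ~field & top
    elif op == 0b011:                              # GL_REPLACE (assumed bit-aligned)
        new = (ref & enable) >> shift
    elif op in (0b100, 0b110):                     # GL_INCR, clamped to 2^n - 1
        new = min(field + 1, top)
    else:                                          # 0b101, 0b111: GL_DECR, clamped to 0
        new = max(field - 1, 0)
    return (old & ~enable & 0xFF) | (new << shift)
```

With St.Enable[7:0] = 0011 1100, as in the example in the Stencil Planes register description, the stencil field is 4 bits wide, so GL_INCR saturates at 15 while the four disabled bits pass through untouched.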
Prohibited Register Access

Writing to the control register address with PALU_A[5:0] = “011000” and PALU_OP[2:0] = “111” for three consecutive rising edges of MCLK will cause the device to enter a special test mode and is strictly prohibited in order to avoid unexpected device behavior in the system.

Figure 3.28 Prohibited register access (timing diagram: PALU_OP[2:0] = “111” (WCR) and PALU_A[5:0] = “011000” over MCLK rising edges 1 through 3, with the RESET and Test_Mode signals showing test mode entry and exit)

Pixel Data Operations

There are six pixel data operations: Stateless Initial Data Write, Stateless Normal Data Write, Stateful Initial Data Write, Stateful Normal Data Write, Replace Dirty Tag, and OR Dirty Tag. Simply put, in Stateless Data Writes the states of the Pixel ALU units are entirely ignored and the write data is passed to the Pixel Buffer unaffected, whereas in Stateful Data Writes, the settings of the various registers in the Pixel ALU, the results of the compare tests, and the state of the PASS_IN pin all affect whether the bits of the pixel data will be written into the Pixel Buffer. Initial and Normal Data Writes refer to the manner in which the Dirty Tag is updated. In an Initial Data Write, the bits of the Dirty Tag are selectively set and cleared. In a Normal Data Write, the bits of the Dirty Tag associated with the addressed block and word are inclusive ORed with the PALU_BE pins, and the other bits of the Dirty Tag are unchanged. The following sections describe these operations in detail.

Stateless Initial Data Write

The Stateless Initial Data Write operation writes 32-bit data to the addressed block and word in the Pixel Buffer. No register values affect this operation.

The four Dirty Tag bits corresponding to the addressed block and word are set to the PALU_BE[3:0] value. The other 28 Dirty Tag bits corresponding to the addressed block are cleared to “0”.

The ROP/Blend units simply pass the write data through without affecting it. The Dual Compare unit is ignored and does not inhibit the writing of data to the Pixel Buffer. The PASS_OUT pin is forced to “1” for this operation. The PASS_IN pin has no effect.

Stateless Normal Data Write

The Stateless Normal Data Write operation writes 32-bit data to the addressed block and word in the Pixel Buffer. No register values affect this operation.

The ROP/Blend units simply pass the write data through without affecting it. The Dual Compare unit is ignored and does not inhibit the writing of data to the Pixel Buffer. The PASS_OUT pin is forced to “1” for this operation. The PASS_IN pin has no effect.

The four Dirty Tag bits corresponding to the addressed block and word are inclusive ORed with the PALU_BE[3:0] pins. The other 28 Dirty Tag bits corresponding to the addressed block are unchanged.

Stateful Initial Data Write

The Stateful Initial Data Write operation writes 32-bit data to the addressed block and word. The new data may be combined with the existing destination data. The conditional write enable applies. All register values can affect this operation.

Both the writing to the Pixel Buffer and the updating of the Dirty Tag can be inhibited by a compare test failure (which means that either PASS_IN or PASS_OUT is low).

The corresponding four Dirty Tag bits for the addressed word are set to the respective PALU_BE[3:0] value of the 32-bit data. The other 28 Dirty Tag bits corresponding to the addressed block are cleared to “0”.

Stateful Normal Data Write

The Stateful Normal Data Write operation writes 32-bit data to the addressed block and word. The new data may be combined with the existing destination data. The conditional write enable applies. All register values can affect this operation.

Both the writing to the Pixel Buffer and the updating of the Dirty Tag can be inhibited by a compare test failure (which means that either PASS_IN or PASS_OUT is low).

The four Dirty Tag bits corresponding to the addressed block and word are inclusive ORed with the PALU_BE[3:0] value. The other 28 Dirty Tag bits corresponding to the addressed block are unchanged.

Initiate Two-Cycle Blending

This operation initiates a Preblend Cycle, which is the first cycle of the Two-Cycle Blend operation. The Preblend Cycle is similar to a Stateful Write, except that the Preblend Cycle does not actually write the data back to the Pixel Buffer or affect the Dirty Tags in any way. When this operation is issued, the ROP/Blend units begin blending the data based on the settings of three registers: ROP/Blend Control, Blend_2 Control, and Preblend Control. After the multiplier stage of the Preblend Cycle, the multiplier output and the Addend are looped back as possible addends for the next cycle, which is called the Normal Cycle. The Preblend Cycle must always be followed by a Stateful Initial/Normal Write with the same Pixel Buffer Address on the PALU_A pins. This operation is only for blending. The ROP/Blend Control register must be set to perform blending for all ROP/Blend units or this operation will not function correctly. See the paragraphs on “Pixel ALU Blend Modes” starting on page 26 for further details.

Replace Dirty Tag

The 32-bit data on the PALU_DQ[31:0] pins replaces the Dirty Tag of the addressed block. The bit mapping between the Dirty Tag and PALU_DQ pins is explained on pages 22 and 42. The PALU_BE[3:0] pins determine which byte of the PALU_DQ[31:0] data gets written into the Dirty Tag RAM. The Dirty Tag data passes through the ROP portion of the ROP/Blend units. All of the registers behave the same way they would during a Stateless Data Write.

OR Dirty Tag

The 32-bit data on the PALU_DQ[31:0] pins is inclusive ORed with the Dirty Tag of the addressed block. The bit mapping between the Dirty Tag and PALU_DQ pins is explained on pages 22 and 42. The PALU_BE[3:0] pins determine which byte of the PALU_DQ[31:0] data gets written into the Dirty Tag RAM. The Dirty Tag data passes through the ROP portion of the ROP/Blend units. All of the registers behave the same way they would during a Stateless Data Write.

4 DRAM Operations

This chapter discusses the 3D-RAM operations involving the DRAM arrays. These include the data transfers between a DRAM bank and the Pixel Buffer and between a DRAM bank and a Video Buffer.

An Overview of DRAM Operations

Depending on the DRAM_OP code, the DRAM_A address pins may be interpreted in three different ways: (1) page access (for Access Page and Duplicate Page operations), (2) block access (for Read Block, Unmasked Write Block, and Masked Write Block operations), and (3) scan line access (for the Video Transfer operation).

A page access selects one page out of 257 pages (256 normal pages plus one extra page). DRAM_A8 is used to select the extra page: “1” selects the extra page, and “0” selects one of the 256 normal pages from a given bank. When DRAM_A8 is equal to “1”, the lower eight address pins DRAM_A[7:0] should still be driven to stable states although they are not decoded internally.

During a data transfer between a DRAM page and the Pixel Buffer, both the block location in the Pixel Buffer and the block location in the DRAM page must be specified. In the Pixel Buffer, one of eight blocks is selected through the DRAM_A[8:6] pins. In the height direction of a DRAM page, the DRAM_A[1:0] pins select one of four block rows. In the width direction of the page, a block is selected from the ten block columns through the DRAM_A[5:2] pins. Figure 4.1 illustrates the addressing scheme for block transfer with a block configured as 8W x 4H x 8. The hexadecimal number written on every block of the DRAM page corresponds to the six address pins DRAM_A[5:0]. Similarly, the number on the Pixel Buffer block is from the other address bits DRAM_A[8:6].

The position and orientation of all pages displayed on the screen are controlled by the user. However, the mapping of data within a given page to a Pixel Buffer block is fixed and is shown in Figure 4.1. (More precisely, from the perspective of DRAM operations, we should speak of the mapping between the sense amplifiers of a selected DRAM bank and a Pixel Buffer block. However, since the page-wide sense amplifiers act as a direct-mapped write-through pixel cache for a DRAM bank, the mapping between the sense amplifiers of a DRAM bank and the Pixel Buffer is the same as the mapping between a DRAM page and the Pixel Buffer. For convenience, the latter reference is used liberally in this document.)

A DRAM page is always organized as 10 blocks wide and 4 blocks high; this is fixed. A block always contains eight 32-bit words, for a total of 256 bits. In the case of 8 bits per pixel, the eight words in a given block may be viewed as 8 pixels wide by 4 pixels high. Thus, a DRAM page would be mapped to the screen as 80 pixels wide by 16 pixels high with 8 bits per pixel. In the case of 32 bits per pixel, the eight words in a given block may be viewed as 2 pixels wide by 4 pixels high. Thus, a DRAM page would be mapped to the screen as 20 pixels wide by 16 pixels high with 32 bits per pixel. For simplicity, we represent these frame buffer organizations with the shorthand notations 80W x 16H x 8 and 20W x 16H x 32, respectively. Several frame buffer organization examples are shown in Chapter 6.

Description of DRAM Operations

Table 4.1 lists all of the DRAM operations. One operation can be launched in every cycle. However, the sequence of these DRAM operations is bounded by the resource interlocks. The Access Page can only be issued after Precharge Bank, and the only operation after Precharge Bank is Access Page. Tables 7.6 and 7.7 contain the specific timing interlocks for DRAM operations in the same bank and between different banks.

Table 4.1 DRAM operation encoding

Operation                     DRAM_OP   DRAM_BS   DRAM_A
Unmasked Write Block (UWB)    000       Bank      Pixel Buffer Block (3 pins), DRAM Block (6 pins)
Masked Write Block (MWB)      001       Bank      Pixel Buffer Block (3 pins), DRAM Block (6 pins)
Precharge Bank (PRE)          010       Bank      -
Video Transfer (VDX)          011       Bank      Control (2 pins), Line (4 pins)
Duplicate Page (DUP)          100       Bank      Page (9 pins)
Read Block (RDB)              101       Bank      Pixel Buffer Block (3 pins), DRAM Block (6 pins)
Access Page (ACP)             110       Bank      Page (9 pins)
No Operation (NOP)            111       -         -

Figure 4.1 Addressing scheme for block transfer on the Global Bus, for a block size of 8W x 4H x 8 (or 2W x 4H x 32). The blocks in the DRAM page are numbered with hexadecimal values and selected by DRAM_A[5:0].

Block numbering of a page in a DRAM bank (80W x 16H), as shown in the figure:

00 04 08 0C 10 14 18 1C 20 24
01 05 09 0D 11 15 19 1D 21 25
02 06 0A 0E 12 16 1A 1E 22 26
03 07 0B 0F 13 17 1B 1F 23 27

DRAM_A[1:0] selects a block in the height direction of the DRAM page, DRAM_A[5:2] selects a block in the width direction, and DRAM_A[8:6] selects one of the eight blocks (0 through 7) in the Pixel Buffer, which is connected to the page through the 256-bit Global Bus.
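The block-access interpretation of DRAM_A can be illustrated with a short decoder. This is a sketch under stated assumptions rather than a description of the device logic: the function names are hypothetical, column codes 10 through 15 on DRAM_A[5:2] are assumed to be invalid because the text names only ten block columns, and the pixel origin assumes that block column 0, row 0 sits at the top-left corner of the page.

```python
def decode_block_address(dram_a):
    """Split a 9-bit DRAM_A value into (pixel_buffer_block, column, row).

    DRAM_A[8:6] -> one of eight Pixel Buffer blocks
    DRAM_A[5:2] -> one of ten block columns (width direction)
    DRAM_A[1:0] -> one of four block rows (height direction)
    """
    if not 0 <= dram_a < 512:
        raise ValueError("DRAM_A is a 9-bit value")
    pb_block = (dram_a >> 6) & 0x7
    column = (dram_a >> 2) & 0xF
    row = dram_a & 0x3
    if column > 9:
        # Assumption: only the ten documented block columns are valid.
        raise ValueError("DRAM_A[5:2] must select one of ten block columns")
    return pb_block, column, row


def block_origin(column, row, bits_per_pixel):
    """Return the (x, y) pixel position of a block's top-left corner in its page.

    A block is 8W x 4H at 8 bits per pixel and 2W x 4H at 32 bits per pixel.
    """
    width = {8: 8, 32: 2}[bits_per_pixel]
    return column * width, row * 4


def decode_page_address(dram_a):
    """Return the page selected by a page access: 0..255, or "extra"."""
    if (dram_a >> 8) & 1:
        return "extra"          # DRAM_A[7:0] are not decoded for the extra page
    return dram_a & 0xFF
```

For example, the block labeled 27 in Figure 4.1 is column 9, row 3, which places its top-left pixel at (72, 12) in an 80W x 16H x 8 page and at (18, 12) in a 20W x 16H x 32 page.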
Unmasked Write Block (UWB)

The UWB operation copies 32 bytes from the specified Pixel Buffer block over the Global Bus to the specified block in the sense amplifiers and the DRAM page of a selected DRAM bank. The DRAM_A[5:0] pins select one of the 40 blocks in a DRAM page. The DRAM_A[8:6] pins select one of the eight Pixel Buffer blocks. The 32-bit Plane Mask register has no effect on the Unmasked Write Block operation. The 32-bit Dirty Tag still controls which bytes of the block are updated.

Masked Write Block (MWB)

The MWB operation copies 32 bytes from the specified Pixel Buffer block over the Global Bus to the specified block in the sense amplifiers and the DRAM page of a selected DRAM bank. The DRAM_A[5:0] pins select one of the 40 blocks in a DRAM page. The DRAM_A[8:6] pins select one of the eight Pixel Buffer blocks. Both the 32-bit Dirty Tag and the 32-bit Plane Mask register control which bytes of the block are updated.

Figure 4.2 Unmasked Write Block, Masked Write Block, and Read Block on the Global Bus (RDB, UWB, or MWB moves one block out of 40, in one page out of 257 of a bank, across the 256-bit Global Bus, that is, 32 bytes or 8W x 4H x 8, to or from the Pixel Buffer)

Precharge Bank (PRE)

The PRE operation first deactivates the word line corresponding to the most recently accessed DRAM page of a selected DRAM bank and then equalizes the bit lines of the sense amplifiers for a subsequent Access Page operation. After a Precharge Bank operation has been performed on a certain DRAM bank, the operations that can be performed on that DRAM bank are Access Page, Precharge Bank, and NOP. Other operations after a Precharge Bank operation are illegal, and the resulting data is undefined.

Video Transfer (VDX)

There are two parts to the VDX operation: video buffer load and video output.
Video buffer load relates to the transfer from the sense amplifiers of a selected DRAM bank to a corresponding Video Buffer. Video output relates to the transfer from a Video Buffer to the VID_Q pins.

Video Buffer Load

There are two video buffers available for interleave transfer. Video Buffer I is for Bank A and Bank C. Video Buffer II is for Bank B and Bank D. Figure 4.3 illustrates a Video Transfer example from a page in Bank A to Video Buffer I.

Figure 4.3 Video Transfer from a Bank A page to Video Buffer I (each of the four banks holds 257 pages; the Video Buffers drive the 16 VID_Q pins)

This paragraph describes the addressing scheme for the Video Transfer operation in detail. A DRAM page has a fixed organization of 10 blocks wide by 4 blocks high. For the VDX operation, a 32-byte block is always considered as being 4 rows high (either 8W x 4H x 8 or 2W x 4H x 32). That is, for the VDX operation, a DRAM page is always viewed as containing 16 rows of 80 bytes each. In the case of 8 bits per pixel, the Video Transfer operation transfers an 80W x 1H x 8 line of pixel data from the sense amplifiers of a DRAM page to the corresponding Video Buffer. The DRAM_A[3:0] pins are used to select one of the 16 rows in a DRAM page. Since there are 16 VID_Q pins, one may think of the Video Buffer as 40 double-bytes. The DRAM_A[6:4] pins are ignored in this operation but should still be driven to stable states.

Figure 4.4 Addressing scheme for video transfer. DRAM_A[3:0] selects one line (0 through 15) of the 80W x 16H DRAM page to load into the 80-byte Video Buffer. DRAM_A[6:4] are ignored. DRAM_A7 selects the byte pair mode when DRAM_A8 is “1” (“0”: normal mode; “1”: reversed mode). DRAM_A8 = “1” loads DRAM_A7 into the latch and initializes the video counter; DRAM_A8 = “0” leaves the video output operation unaffected.
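The scan line interpretation of DRAM_A for the Video Transfer operation can be sketched as a small decoder. The names `decode_vdx_address` and `VIDEO_BUFFER_FOR_BANK` are hypothetical, and the banks are identified by letter because the DRAM_BS encoding is not given in this section.

```python
# Video Buffer I serves Banks A and C; Video Buffer II serves Banks B and D.
VIDEO_BUFFER_FOR_BANK = {"A": "I", "C": "I", "B": "II", "D": "II"}


def decode_vdx_address(dram_a):
    """Decode a 9-bit DRAM_A value for a Video Transfer (VDX) operation.

    Returns (line, initialize, mode):
      line        DRAM_A[3:0], one of the 16 rows of the DRAM page
      initialize  True when DRAM_A8 is 1 (latch loaded, video counter initialized)
      mode        "normal" or "reversed" from DRAM_A7 when DRAM_A8 is 1,
                  otherwise None because the byte pair mode latch is not loaded
    DRAM_A[6:4] are ignored.
    """
    if not 0 <= dram_a < 512:
        raise ValueError("DRAM_A is a 9-bit value")
    line = dram_a & 0xF
    initialize = bool((dram_a >> 8) & 1)
    mode = None
    if initialize:
        mode = "reversed" if (dram_a >> 7) & 1 else "normal"
    return line, initialize, mode
```

A transfer of line 5 from a Bank A page with DRAM_A8 = 1 and DRAM_A7 = 1 would therefore load Video Buffer I and latch the reversed byte pair mode.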
Video Output Operation

There are two byte order formats for the VID_Q video output pins: normal mode and reversed mode. This byte ordering is selected by an internal byte pair mode latch, which is loaded from the DRAM_A7 pin when the DRAM_A8 pin is equal to “1”. If the latched data is “0”, the normal video output mode is applied to the VID_Q bus. If the latched data is “1”, the reversed video output mode is selected.

Since a Video Buffer holds 640 bits, we may number the bytes in the Video Buffer from byte 0 through byte 79. In both normal and reversed modes, even bytes always appear on the VID_Q[7:0] pins, and odd bytes on the VID_Q[15:8] pins. In normal mode, the byte data is shifted out to the VID_Q pins in normal sequence, as in [byte 0, byte 1] at video clock 0, [byte 2, byte 3] at video clock 1 … [byte 78, byte 79] at video clock 39. However, in many systems, byte 3 contains control bits, while bytes 0, 1, and 2 are the RGB data. Therefore, it may be desirable to make byte 3 available on the first video clock to allow the RAMDAC chip the maximum time to make use of the control bits. The reversed mode is designed for this purpose: the byte data is shifted out in reversed sequence, as in [byte 2, byte 3] at video clock 0, [byte 0, byte 1] at video clock 1 … [byte 78, byte 79] at video clock 38, and finally [byte 76, byte 77] at video clock 39. In summary, the 16-bit VID_Q bus output scheme is illustrated in Figure 4.5 for an 80W x 1H x 8 Video Buffer.

Figure 4.5 16-bit VID_Q bus output scheme for an 80W x 1H x 8 Video Buffer (normal mode: VID_Q[7:0] carries bytes 0, 2, 4, 6 … 76, 78 and VID_Q[15:8] carries bytes 1, 3, 5, 7 … 77, 79 on VID_CLK 0 through 39; reversed mode: VID_Q[7:0] carries bytes 2, 0, 6, 4 … 78, 76 and VID_Q[15:8] carries bytes 3, 1, 7, 5 … 79, 77)
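The two byte orderings described above can be reproduced with a few lines of code. This is an illustrative sketch only; the function name `vid_q_sequence` is hypothetical. It relies on the observation that reversed mode swaps each adjacent pair of double-bytes, which is the same as flipping the lowest bit of the video clock index.

```python
def vid_q_sequence(reversed_mode=False, clocks=40):
    """List the (VID_Q[7:0] byte, VID_Q[15:8] byte) pair for each video clock.

    Bytes in the Video Buffer are numbered 0 through 79. Even bytes always
    appear on VID_Q[7:0] and odd bytes on VID_Q[15:8].
    """
    sequence = []
    for clock in range(clocks):
        # Reversed mode swaps adjacent double-bytes: 1, 0, 3, 2, ..., 39, 38.
        pair = (clock ^ 1) if reversed_mode else clock
        sequence.append((2 * pair, 2 * pair + 1))
    return sequence


normal = vid_q_sequence()
swapped = vid_q_sequence(reversed_mode=True)
```

In this model `swapped[0]` is the pair (byte 2, byte 3), so byte 3, which carries the control bits in the example system described in the text, is available on the first video clock.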
1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP ready, the clock enable VID_CKE is asserted to allow video output. For a normal mode video output, Figure 8.16 illustrates an example of continuous video output from both Video Buffers by issuing consecutive Video Transfer operations on the four DRAM banks. Initialize and Abort Video Output When the DRAM_A8 pin is “1”, the byte pair mode latch is loaded, and the current Video Buffer output operation is aborted. The VID_Q bus is driven starting from the Video Buffer indicated by the DRAM_BS0 pin. Also, the modulo-40 Video Counter is initialized. If DRAM_A8 is “0”, the Video Counter is not affected. The video output from the current Video Buffer continues until this buffer is exhausted. Then, the Video Buffer is automatically switched and the Video Counter is initialized. guaranteed for every occurrence of Video Buffer interleave. To avoid data corruption in the Video Buffer, the user should not start a Video Transfer operation to the Video Buffer that is outputting data to the VID_Q bus. Prohibited Video Operation Sequence Performing VDX operation with DRAM_A[8:0] = “01xxxxxxx” and RESET = “H” for eight consecutive rising edges of MCLK will cause the device to enter into a special manufacturing test mode and is strictly prohibited to avoid unexpected device behavior in the system. Figure 8.15 shows an example of initiating a video output process. It begins with commanding a Video Transfer from the DRAM control port, while holding VID_CKE signal low to disable internal VID_CLK until VID_QSF changes. When VID_QSF indicates the specific Video Buffer is MCLK 1 2 DRAM_OP[2:0] "011" (VDX) DRAM_A[8:0] "0 1xxx xxxx" "H" 8 "001" (MWB) "H" RESET Test_Mode Test Mode In Figure 4.6 Prohibited video operation sequence 81 Test Mode Out 4 DRAM Operations Note that VID_QSF settles from an unknown state to a known state after the initial Video Transfer with DRAM_A8 = 1. 
Except for this initial Video Transfer, the clean edge transition on VID_QSF is Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP function as a level-two write-through pixel cache. DUP is a special performance function that offers ultra-fast data movement in a frame buffer. Consider the task of clearing the entire frame buffer of 1280 x 1024 x 32. Using only the MWB operations for this task, the 256-bit Global Bus and four bank interleaving plus parallel operations to the four 3D-RAM chips offer very good bandwidth. The data rate is 5.8 GB/s for the -10 grade of 3D-RAM, and the entire screen is cleared in 860 µs, without considering the interruptions of video refresh. However, with the DUP performance function, the data rate increases tenfold to 58.6 GB/s, and the entire screen is cleared in only 85 µs, with the same -10 grade of 3D RAM. 4 DRAM Operations Duplicate Page (DUP) All 10,240 bits of the data in the sense amplifiers of a selected DRAM bank can be transferred to any specified page in the same bank within one Duplicate Page operation. The data in the sense amplifiers is not affected by this operation. If the DRAM_A8 pin is 0, then the DRAM_A[7:0] pins select one of the 256 normal pages. If DRAM_A8 is 1, then the DRAM_A[7:0] pins are ignored and the extra page is written. The Plane Mask register does not apply to this operation. It may be helpful to point out that it is not necessary to use the DUP operation to write back the data in the sense amplifiers, because they page DUP Bank-B Bank-C Bank-D 257 pages 257 pages 257 pages page To Pixel Buffer Figure 4.7 Duplicate Page in DRAM Bank A Read Block (RDB) selected DRAM bank and transfers the data in the DRAM array to the sense amplifiers. If the DRAM_A8 pin is “0”, then the DRAM_A[7:0] pins select one of the 256 normal pages. If the DRAM_A8 pin is “1”, then the DRAM_A[7:0] pins are ignored and the extra page is transferred. 
The RDB operation copies 32 bytes from the sense amplifiers of a selected DRAM bank over the Global Bus to the specified block in the Pixel Buffer. The corresponding 32-bit Dirty Tag is cleared. The DRAM_A[5:0] pins select one of the 40 blocks in a DRAM page. The DRAM_A[8:6] pins select one of the eight Pixel Buffer blocks. The Read Block operation is also illustrated in Figure 4.2. Before an Access Page operation can be performed on a certain DRAM bank, a Precharge Bank operation must have been performed for that DRAM bank. After an Access Page operation, several DRAM read and write operations, such as UWB, MWB, RDB, DUP, and VDX, may be performed. Access Page (ACP) The ACP operation activates the word line corresponding to the specified DRAM page of a 82 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP BANK Sense Amp Figure 4.8 Access Page means transferring a specified page to the sense amplifiers. 83 4 DRAM Operations Page Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP load is necessary. More importantly, NOPs are required to satisfy the timing interlocks of the various DRAM operations, as listed in Tables 7.6 and 7.7; for this application, each NOP operation simply takes one clock period. No Operation (NOP) 4 DRAM Operations The NOP operation may be freely inserted between the ACP operation and the PRE operation on the same bank. NOPs are issued when the DRAM arrays are idle, no read or write is required by the Pixel Buffer, and no Video Buffer 84 5 Pixel ALU Pipelines and DRAM Activities MITSUBISHI Rev. 1.03 3D-RAM (M5M410092B) ELECTRONIC DEVICE GROUP DRAM and Pixel ALU Interactions This chapter inculdes some pipeline examples of the interaction between the Global Bus and the Pixel ALU, as well as some typical sequences of DRAM operations on the same bank. For DRAM operations, we assume that the clock cycle time equals to the minimum requirements of the specification, depending on the speed grade of the parts. 
If the 3D-RAM is not running at the minimum MCLK cycle time, then the DRAM operations shown in the tables of this chapter do not govern the cycles of operations. The interlocks listed in Tables 7.6 and 7.7 are always the governing parameters that determine the cycles of DRAM operations. These interlocks specify time durations only and are independent of the clock period and the number of clock cycles, unless specifically noted otherwise.

The Global Bus and the Pixel ALU interact as shown in the following tables. A word on the notation may be in order here. Braces enclose two or more specifications, any one of which can qualify the operation outside the braces. For example, Stateful {Initial, Normal} Data Write means either Stateful Initial Data Write or Stateful Normal Data Write may be applied. In fact, the table entries shorten this notation to simply Stateful Data Write.

Table 5.1 shows a Read Block operation immediately followed by a Read Data operation that uses data from the Read Block.

Table 5.1 Read Block on Global Bus to Read Data on Pixel ALU
Cycle | DRAM Activities | Pixel ALU Activities
n | Read Block op specified on DRAM_EN, DRAM_OP … |
n+1 | Read Block on Global Bus |
n+2 | Read Block on Global Bus | Read Data op specified on PALU_EN, PALU_WE, PALU_OP …
n+3 | | Read Data op specified on PALU_EN, PALU_WE, PALU_OP …; Data read from Pixel Buffer
n+4 | | Data read from Pixel Buffer; Data on PALU_DQ
n+5 | | Data on PALU_DQ

Table 5.2 shows a Read Block operation immediately followed by a Stateful {Initial, Normal} Data Write operation that performs a Read Modify Write on the data from the Read Block.

Table 5.2 Read Block on Global Bus to Stateful {Initial, Normal} Data Write on Pixel ALU
Cycle | DRAM Activities | Pixel ALU Activities
n | Read Block op specified on DRAM_EN, DRAM_OP … |
n+1 | Read Block on Global Bus |
n+2 | Read Block on Global Bus | Stateful Data Write op specified on PALU_EN, PALU_WE, PALU_OP …
n+3 | | Old data read from Pixel Buffer; New data read from PALU_DQ, PALU_DX
n+4 | | ROP/Blend 1
n+5 | | ROP/Blend 2
n+6 | | ROP/Blend 3
n+7 | | ROP/Blend 4
n+8 | | Result data written to Pixel Buffer

Table 5.3 shows a {Stateless, Stateful} {Initial, Normal} Data Write operation immediately followed by a {Masked, Unmasked} Write Block operation.

Table 5.3 {Stateless, Stateful} {Initial, Normal} Data Write on Pixel ALU to {Masked, Unmasked} Write Block on Global Bus
Cycle | DRAM Activities | Pixel ALU Activities
n | | Data Write op specified on PALU_EN, PALU_WE, PALU_OP …
n+1 | | Old data read from Pixel Buffer; New data on PALU_DQ, PALU_DX
n+2 | | ROP/Blend 1
n+3 | | ROP/Blend 2
n+4 | | ROP/Blend 3
n+5 | | ROP/Blend 4
n+6 | {Masked, Unmasked} Write Block op specified on DRAM_EN, DRAM_OP … | Result data written to Pixel Buffer
n+7 | {Masked, Unmasked} Write Block on Global Bus |
n+8 | {Masked, Unmasked} Write Block on Global Bus |

Table 5.4 shows a {Replace, OR} Dirty Tag operation immediately followed by a {Masked, Unmasked} Write Block operation.

Table 5.4 {Replace, OR} Dirty Tag on Pixel ALU to {Masked, Unmasked} Write Block on Global Bus
Cycle | DRAM Activities | Pixel ALU Activities
n | | Dirty Tag op specified on PALU_EN, PALU_WE, PALU_OP …
n+1 | | Dirty Tag data on PALU_DQ
n+2 | | Pass through ROP/Blend 1
n+3 | | Pass through ROP/Blend 2
n+4 | | Pass through ROP/Blend 3
n+5 | | Pass through ROP/Blend 4
n+6 | {Masked, Unmasked} Write Block op specified on DRAM_EN, DRAM_OP … | Data written to Dirty Tag
n+7 | {Masked, Unmasked} Write Block on Global Bus |
n+8 | {Masked, Unmasked} Write Block on Global Bus |

Table 5.5 shows a Write Register operation to the Plane Mask register followed by the latest Masked Write Block operation that can use the previous contents of the Plane Mask register.

Table 5.5 Write Register (Plane Mask) on Pixel ALU to Masked Write Block on Global Bus
Cycle | DRAM Activities | Pixel ALU Activities
n | | Write Register op specified on PALU_EN, PALU_WE, PALU_OP …
n+1 | | Plane Mask data on PALU_DQ
n+2 | |
n+3 | Masked Write Block op specified on DRAM_EN, DRAM_OP … |
n+4 | Masked Write Block on Global Bus (uses old Plane Mask value) |
n+5 | Masked Write Block on Global Bus (uses old Plane Mask value) |
n+6 | | Plane Mask register loaded
n+7 | |
n+8 | |
n+9 | |

Table 5.6 shows a Write Register operation to the Plane Mask register followed by the earliest Masked Write Block operation that can use the new Plane Mask.

Table 5.6 Write Register (Plane Mask) on Pixel ALU to Masked Write Block on Global Bus
Cycle | DRAM Activities | Pixel ALU Activities
n | | Write Register op specified on PALU_EN, PALU_WE, PALU_OP …
n+1 | | Plane Mask data on PALU_DQ
n+2 | |
n+3 | |
n+4 | |
n+5 | |
n+6 | Masked Write Block op specified on DRAM_EN, DRAM_OP … |
n+7 | Masked Write Block on Global Bus (uses new Plane Mask value) | Plane Mask register loaded
n+8 | Masked Write Block on Global Bus (uses new Plane Mask value) |
n+9 | |

DRAM Activities

The interlock timing restrictions are listed in Table 7.6 for the DRAM operations on the same bank and in Table 7.7 for the DRAM operations on different banks. This section discusses consecutive DRAM operations on the same bank. To illustrate interlock timing for DRAM activities, we assume that the clock cycle time equals the minimum specification requirement (10 ns or 13 ns, depending on the speed grade of the parts).

Table 5.7 shows a minimal length video refresh sequence.

Table 5.7 Video refresh sequence
Cycle | External Activities | Internal Activities
n | Access Page specified |
n+1 | | Access Page
n+2 | | Access Page
n+3 | | Access Page
n+4 | Video Transfer specified | Access Page
n+5 | | Video Transfer
n+6 | | Video Transfer
n+7 | | Video Transfer
n+8 | Precharge Bank specified | Video Transfer
n+9 | | Precharge Bank
n+10 | | Precharge Bank
n+11 | | Precharge Bank
n+12 | | Precharge Bank

Table 5.8 shows a minimal length DRAM refresh sequence.

Table 5.8 DRAM refresh sequence
Cycle | External Activities | Internal Activities
n | Access Page specified |
n+1 | | Access Page
n+2 | | Access Page
n+3 | | Access Page
n+4 | | Access Page
n+5 | |
n+6 | |
n+7 | |
n+8 | Precharge Bank specified |
n+9 | | Precharge Bank
n+10 | | Precharge Bank
n+11 | | Precharge Bank
n+12 | | Precharge Bank

Table 5.9 shows a sequence of Read Block and {Masked, Unmasked} Write Block operations.
Table 5.9 Sequence of Read Block and {Masked, Unmasked} Write Block operations
Cycle | External Activities | Internal Activities
n | Access Page specified |
n+1 | | Access Page
n+2 | | Access Page
n+3 | | Access Page
n+4 | 1st Read Block specified | Access Page
n+5 | | 1st Read Block
n+6 | 2nd Read Block specified | 1st Read Block
n+7 | | 2nd Read Block
n+8 | 1st Write Block specified | 2nd Read Block
n+9 | | 1st Write Block
n+10 | 2nd Write Block specified | 1st Write Block
n+11 | | 2nd Write Block
n+12 | Precharge Bank specified | 2nd Write Block
n+13 | | Precharge Bank
n+14 | | Precharge Bank
n+15 | | Precharge Bank
n+16 | | Precharge Bank

Table 5.10 shows a Duplicate Page sequence.

Table 5.10 Duplicate Page sequence
Cycle | External Activities | Internal Activities
n | Access Page specified |
n+1 | | Access Page
n+2 | | Access Page
n+3 | | Access Page
n+4 | Duplicate Page specified | Access Page
n+5 | | Duplicate Page
n+6 | | Duplicate Page
n+7 | | Duplicate Page
n+8 | | Duplicate Page
n+9 | | Duplicate Page
n+10 | | Duplicate Page
n+11 | | Duplicate Page
n+12 | Precharge Bank specified | Duplicate Page
n+13 | | Precharge Bank
n+14 | | Precharge Bank
n+15 | | Precharge Bank
n+16 | | Precharge Bank

6 Frame Buffer Organizations

Introduction

There are many ways to use the 3D-RAM to implement frame buffers of various resolutions and depths. This section describes the following frame buffer organizations:
• 1280 x 1024 x 8 organization in single chip
• 1280 x 1024 x 32, organized as four 1280 x 1024 x 8 or 320 x 1024 x 32
• 1280 x 1024 x 32 double buffered organization with 32-bit Z
• 640 x 512 x 8 double buffered organization with 16-bit Z in single chip

The mapping of page groups to the display screen is completely user definable. The following mappings are hardwired inside the 3D-RAM: blocks to pages, scan lines to pages, words to Pixel Buffer blocks, and Dirty Tags to Pixel Buffer blocks.

1280 x 1024 x 8 Organization

In this organization, the screen display is made up of an 8W x 32H array of page groups (that is, 8 page groups wide by 32 page groups high). A page group is 160 pixels wide by 32 pixels high and consists of the same page from all four DRAM banks (A, B, C, D). The four independent DRAM banks can be interleaved to allow pages to be prefetched as images are drawn.

Each page within a page group is 80 pixels wide by 16 pixels high. Pages are either sliced into sixteen 80-pixel wide scan lines when sending data to the Video Buffer, or diced into a 10W x 4H array of 256-bit blocks when dealing with the Global Bus. Two pixels are shifted out of the Video Buffer every video clock.

Blocks are 8 pixels wide by 4 pixels high and can be transferred to and from one of the Pixel Buffer blocks via the Global Bus. The Pixel ALU and data pins access four pixels of a Pixel Buffer block at a time. The Dirty Tag for an entire Pixel Buffer block can be written in a single cycle from the data pins.

The following formulas determine which bank, page, etc. a given pixel is in, given the x and y coordinates of the pixel. The formulas use C syntax where the percent sign (“%”) indicates integer modulus operation and the slash sign (“/”) indicates integer division. These formulas are valid only when 0 ≤ x < 1280 and 0 ≤ y < 1024.
• bank = 2*((y%32)/16) + (x%160)/80, [0 = Bank A, 1 = Bank B, 2 = Bank C, 3 = Bank D]
• page = 8*(y/32) + x/160
• scan line within page = y%16
• block within page = (y%16)/4 + 4*((x%80)/8)
• word within block = 2*(y%4) + (x%8)/4
• pixel (byte) within word = x%4

1280 x 1024 x 32 Single Buffered Organization

A frame buffer of this size requires four 3D-RAMs; however, there are two recommended ways of organizing the 3D-RAMs, which trade off 2D color expansion rendering performance against pixel oriented rendering performance.
• Each of the four components of a pixel (R, G, B, α) is in a separate 3D-RAM. Thus, each 3D-RAM supports 1280 x 1024 x 8. This section describes this implementation.
• All four components of a pixel reside in the same 3D-RAM. The four 3D-RAMs are interleaved on a pixel by pixel basis in a scan line. Thus, each 3D-RAM supports 320 x 1024 x 32. Page 97 describes this implementation.

The 320 x 1024 x 32 mode is nearly the same as 1280 x 1024 x 8 except that the pixels are four times as deep and the screen, page groups, pages, and blocks are one fourth as wide. One pixel is shifted out of the Video Buffer every two video clocks. The Pixel ALU and PALU_DQ pins access one pixel of a Pixel Buffer block. The Dirty Tag for an entire Pixel Buffer block can be written in a single cycle from the PALU_DQ pins. The Dirty Tag controls the four bytes of a 32-bit pixel independently.

The following formulas determine which bank, page, etc. a given pixel is in, given the x and y coordinates of the pixel.
• bank = 2*((y%32)/16) + (x%40)/20, [0 = Bank A, 1 = Bank B, 2 = Bank C, 3 = Bank D]
• page = 8*(y/32) + x/40
• scan line within page = y%16
• block within page = (y%16)/4 + 4*((x%20)/2)
• pixel (word) within block = 2*(y%4) + (x%2)
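The two sets of mapping formulas can be sketched as follows. This is a minimal illustration in Python rather than C, so `//` stands in for C integer division. The function and type names (`locate_1280x1024x8`, `locate_320x1024x32`, `Location8`, `Location32`) are hypothetical, and the range check 0 ≤ x < 320 for the 320 x 1024 x 32 case is an assumption inferred from the per-chip width rather than a limit stated in the text.

```python
from typing import NamedTuple


class Location8(NamedTuple):
    bank: int       # 0 = Bank A, 1 = Bank B, 2 = Bank C, 3 = Bank D
    page: int
    scan_line: int  # scan line within page
    block: int      # block within page
    word: int       # word within block
    byte: int       # pixel (byte) within word


class Location32(NamedTuple):
    bank: int
    page: int
    scan_line: int
    block: int
    word: int       # pixel (word) within block


def locate_1280x1024x8(x: int, y: int) -> Location8:
    """Apply the 1280 x 1024 x 8 formulas to pixel (x, y)."""
    if not (0 <= x < 1280 and 0 <= y < 1024):
        raise ValueError("formulas are valid only for 0 <= x < 1280, 0 <= y < 1024")
    return Location8(
        bank=2 * ((y % 32) // 16) + (x % 160) // 80,
        page=8 * (y // 32) + x // 160,
        scan_line=y % 16,
        block=(y % 16) // 4 + 4 * ((x % 80) // 8),
        word=2 * (y % 4) + (x % 8) // 4,
        byte=x % 4,
    )


def locate_320x1024x32(x: int, y: int) -> Location32:
    """Apply the 320 x 1024 x 32 formulas to pixel (x, y) of one chip."""
    # Assumed range: each chip covers 320 pixels of a scan line.
    if not (0 <= x < 320 and 0 <= y < 1024):
        raise ValueError("assumed valid range is 0 <= x < 320, 0 <= y < 1024")
    return Location32(
        bank=2 * ((y % 32) // 16) + (x % 40) // 20,
        page=8 * (y // 32) + x // 40,
        scan_line=y % 16,
        block=(y % 16) // 4 + 4 * ((x % 20) // 2),
        word=2 * (y % 4) + (x % 2),
    )
```

For example, `locate_1280x1024x8(1279, 1023)` lands in Bank D, page 255, block 39, word 7, byte 3, which is the last byte of the last normal page, as expected for the bottom-right pixel.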
Figure 6.1 This diagram shows how 3D-RAM maps to pixels in a single-chip 1280x1024x8 frame buffer. The numbers outside each rectangle show its dimensions in pixels.
Figure annotations:
• A page group is 160 x 32 pixels and holds one 80 x 16 page from each bank: 0(A), 1(B), 2(C), 3(D).
• Each page can be divided into either 40 blocks (numbered 0 to 39), which can be accessed from the Global Bus, or 16 scan lines, which can be accessed by the Video Buffer.
• Blocks (8 x 4 pixels) can be accessed in 4-pixel words by the Pixel ALU.
• Video Buffer data is shifted out two bytes at a time on each VID_CLK.
• Pin labels shown: DRAM_A[5:2], DRAM_A[3:0], DRAM_A[1:0], PALU_A[2:1], PALU_A0; PALU_BE0 with PALU_DQ[7:0], PALU_BE1 with PALU_DQ[15:8], PALU_BE2 with PALU_DQ[23:16], PALU_BE3 with PALU_DQ[31:24].

Figure 6.2 1280 x 1024 x 32 Single Buffer 3D-RAM System
(Diagram labels: System Interface, Rendering Controller, Address & Control, four 3D-RAMs each with a 32-bit Pixel Data bus and a 16-bit Video Data bus, Video Control, RAMDAC, Monitor.)

Figure 6.3 This diagram shows how 3D-RAM maps to pixels in a single chip 320x1024x32 frame buffer. The numbers outside each rectangle show its dimensions in pixels.
Figure annotations:
• The screen is 8 page groups wide by 32 page groups high. A page group (40 x 32 pixels) consists of the same page from all four DRAM banks: 0(A), 1(B), 2(C), 3(D). Each page is 20 x 16 pixels.
• Each page can be divided into either 40 blocks (numbered 0 to 39), which can be accessed from the Global Bus, or 16 scan lines, which can be accessed by the Video Buffer.
• Blocks (2 x 4 pixels) can be accessed in 1-pixel words by the Pixel ALU.
• Video Buffer data is shifted out two bytes at a time on each VID_CLK. A pixel is shifted out every two cycles.
• Pin labels shown: DRAM_A[5:2], DRAM_A[3:0], DRAM_A[1:0], PALU_A[2:1], PALU_A0, PALU_DQ[31:0].

1280 x 1024 x 32 Double Buffered Organization with Z

The basic configuration for a 1280 x 1024 x 32 double buffered organization with Z Buffer is shown in Figure 6.4. This configuration uses only twelve 3D-RAMs. In this example each 3D-RAM (for Buffers A, B, and Z) covers a 320 x 1024 portion of the 1280 x 1024 displayed image. The interleave is in the x direction. This implies that vertical scrolling can take place at a very high speed because all data movement occurs within the 3D-RAM chips rather than across chips. Horizontal scrolling would require 3D-RAM to 3D-RAM data transfers.

Each of Buffers A, B, and Z is 32 bits in pixel depth. This allows 8 bits each for R, G, B, and 8 bits for alpha or overlays, and we refer to the eight 3D-RAMs containing these data as the Color Buffer 3D-RAMs. In the case of the Z Buffer, 24 bits can be used for depth and 8 bits for a combination of stencil pattern ID and window ID, and we refer to these four 3D-RAMs as the Z Buffer 3D-RAMs.

The Z Buffer 3D-RAMs use their Compare units to check depth, stencil, and window ID, and supply the result to the PASS_OUT pins. The PASS_OUT pins of the Z Buffer are connected to the PASS_IN pins of the corresponding Color Buffer 3D-RAMs. The results of all ALU operations are conditionally written to the Pixel Buffer, depending on the states of the PASS_IN pins (and on the states of the PASS_OUT pins of the Color Buffer 3D-RAMs themselves if they also perform compare tests).

Both Buffers A and B are connected to the RAMDAC chip using a 128-bit bus. Buffers A and B can be selected on a pixel-by-pixel basis, alternating between the two buffers. The rendering controller is shown with a 256-bit interface for maximum performance.
The three 3D-RAMs (one from each of Buffers A, B, and Z) that hold the data for the same pixels share a 64-bit bus. More specifically, the two 3D-RAM chips in Buffers A and B share the same 32-bit data bus because only one of them is active for rendering while the other outputs display data through the video port, and the 3D-RAM chip in the Z Buffer requires its own 32-bit data bus. A 64-bit or 128-bit bus between the rendering controller and the 3D-RAMs could be used, but with some loss of performance due to more restricted bandwidth and higher bus loading (implying lower maximum clock frequency).

Figure 6.4 1280 x 1024 x 32 double buffered organization with 32-bit Z Buffer
(Diagram labels: System Interface, Rendering Controller, Address & Control, four 64-bit buses each split into two 32-bit buses, four Z Buffer 3D-RAMs, four Buffer A and four Buffer B 3D-RAMs, PASS_OUT to PASS_IN, Video Control, eight 16-bit Video Data buses, RAMDAC, Monitor.)

640 x 512 x 8 Double Buffered Organization With Z

A single 3D-RAM chip can be configured to support a 640 x 512 x 8 double buffered organization with 16-bit Z. This configuration might be suitable for a very high performance, low cost consumer home or arcade game application.

Figure 6.5 Using single 3D-RAM to configure double buffered 640 x 512 x 8 with 16-bit Z
(Diagram labels: Rendering Controller, 32-bit bus, 3D-RAM, 16-bit bus, RAMDAC; Z Buffer 640 x 512 x 16, Buffer B 640 x 512 x 8, Buffer A 640 x 512 x 8.)

The basic allocation of memory can be seen in Figure 6.6. One fourth of the 3D-RAM serves as Buffer A, one fourth as Buffer B, and the rest as the 16-bit Z Buffer. All Z compares and ROP/Blend functions are done on the same 3D-RAM.
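The one-fourth, one-fourth, one-half split can be checked with a little arithmetic. The sketch below is only an illustration: it takes the four banks, 256 normal pages per bank, and 10,240 bits per page quoted in the DRAM operation descriptions, and it assumes the extra page in each bank is not counted. The names `chip_capacity_bits` and `buffer_bits` are hypothetical.

```python
from fractions import Fraction

BANKS = 4
NORMAL_PAGES_PER_BANK = 256   # the extra page per bank is ignored (assumption)
BITS_PER_PAGE = 10_240


def chip_capacity_bits() -> int:
    """Bits held in the normal pages of one 3D-RAM."""
    return BANKS * NORMAL_PAGES_PER_BANK * BITS_PER_PAGE


def buffer_bits(width: int, height: int, depth_bits: int) -> int:
    """Bits needed for a width x height buffer of the given pixel depth."""
    return width * height * depth_bits


capacity = chip_capacity_bits()
buffers = {
    "Buffer A": buffer_bits(640, 512, 8),
    "Buffer B": buffer_bits(640, 512, 8),
    "Z Buffer": buffer_bits(640, 512, 16),
}
# Share of the chip taken by each buffer, as an exact fraction.
shares = {name: Fraction(bits, capacity) for name, bits in buffers.items()}
```

Under these assumptions the three buffers fill the normal pages exactly, and the same capacity equals one 1280 x 1024 x 8 frame.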
A 32-bit Pixel and Z data bus is provided to the rendering controller. A 16-bit bus interfaces to the RAMDAC.

Figure 6.6 This diagram shows how 3D-RAM maps to pixels in a single chip 640x512x8 frame buffer. The numbers outside each rectangle show its dimensions in pixels.
Figure annotations:
• The screen (640 x 512) is 16 page groups wide by 16 page groups high. A page group (40 x 32 pixels) consists of the same page from all four DRAM banks: 0(A), 1(B), 2(C), 3(D). Each page is 20 x 16 pixels.
• Each page can be divided into either 40 blocks (numbered 0 to 39), which can be accessed from the Global Bus, or 16 scan lines, which can be accessed by the Video Buffer.
• Blocks (2 x 4 pixels) can be accessed in 1-pixel words by the Pixel ALU.
• PALU_DQ[31:24] carries Buffer A, PALU_DQ[23:16] carries Buffer B, and PALU_DQ[15:0] carries the Z Buffer.
• Video Buffer data is shifted out two bytes at a time on each VID_CLK. A pixel is shifted out every two cycles. Z buffer data is ignored. Buffer A or B is selected by the RAMDAC chip or external logic.
• Pin labels shown: DRAM_A[5:2], DRAM_A[3:0], DRAM_A[1:0], PALU_A[2:1], PALU_A0.

7 Electrical Specifications

Absolute Maximum Ratings

Table 7.1 Absolute Maximum Ratings
Symbol | Parameter | Conditions | Ratings | Unit
VDD | Supply Voltage | with respect to Vss | -0.5 to 4.6 | V
VI | Input Voltage | with respect to Vss | -0.5 to 4.6 | V
VO | Output Voltage | with respect to Vss | -0.5 to 4.6 | V
IO | Output Current | - | 50 | mA
Tj | Maximum Junction Temperature | - | 125 | °C
Topr | Operation Temperature | - | 0 to 70 | °C
Tstg | Storage Temperature | - | -65 to 150 | °C

Testing Conditions

The supply voltage VDD and ambient temperature Ta for testing are as follows: VDD = 3.3 V ± 5%, Ta = 0 °C to 70 °C.

Figure 7.1 shows the output test load for the PALU_DQ, PASS_OUT, VID_Q, and VID_QSF pins. The capacitive loading CL is 60 pF for the PALU_DQ pins, 30 pF for the PASS_OUT pin, and 20 pF for both the VID_Q pins and the VID_QSF pin. Figure 7.2 is the output test load for the open-drain HIT pin, with Rpu = 330 Ω for pullup and CL = 75 pF.

Figure 7.1 Output test load for the PALU_DQ, PASS_OUT, VID_Q, and VID_QSF pins
(Diagram labels: CL, IL; CL = 60 pF for PALU_DQ, 30 pF for PASS_OUT, 20 pF for VID_Q and VID_QSF.)

Figure 7.2 Output test load for the HIT pin
(Diagram labels: +3.3 V, Rpu, HIT, CL, IL; CL = 50 pF, Rpu = 330 Ω.)

The AC timing measurements are summarized in Figure 7.3 through Figure 7.6. The clock waveform measurements are shown in Figure 7.3. The input and output timing measurements are shown in Figure 7.4 and Figure 7.5, respectively. Figure 7.6 shows the asynchronous output enable timing measurements.

Figure 7.3 Clock waveform measurement
(Clock levels marked: 2.0 V, 1.5 V, 0.8 V.)
t1 : Clock cycle time (minimum)
t2 : Clock high pulse width (minimum)
t3 : Clock low pulse width (minimum)

Figure 7.4 Input timing measurement
(Clock measured at 1.5 V; Reset and Input measured at 2.0 V and 0.8 V.)
t6 : Reset setup time (minimum)
t7 : Reset pulse width (minimum)
t8 : Input setup time (minimum)
t9 : Input hold time (minimum)

Figure 7.5 Output timing measurement
(Clock measured at 1.5 V; Output measured at 2.4 V and 0.4 V.)
t10 : Clock to output low impedance, IO > 2 × IOZ (minimum)
t11 : Output access time from clock (maximum)
t12 : Output valid time after clock (minimum)
t13 : Clock to output high impedance, IO < IOZ (maximum)

Figure 7.6 Asynchronous output enable timing measurement
(OE is active high, measured at 2.0 V and 0.8 V; Output measured at 2.4 V and 0.4 V.)
t14 : Valid output after OE low (minimum)
t15 : Output high impedance, IO < IOZ, after OE low (maximum)
t16 : Output low impedance, IO > 2 × IOZ, after OE high (minimum)
t17 : Valid output after OE high (maximum)

Figure 7.7 SCAN_TDO timing measurement
(Clock measured at 1.5 V; Output measured at 2.4 V and 0.4 V.)
t18 : Clock to output low impedance, IO > 2 × IOZ (minimum)
t19 : Output access time from clock (maximum)
t20 : Output valid time after clock (minimum)
t21 : Clock to output high impedance, IO < IOZ (maximum)

DC Specifications

Table 7.2 lists the DC characteristics and the operation conditions. Table 7.3 lists the average supply current according to the operations, which include the Pixel ALU, DRAM and video operations.

Table 7.2 DC characteristics
VDD = 3.3 V ± 5%, Ta = 0 °C to 70 °C
Symbol | Parameter | Min | Max | Unit
VIH (a) | Input High Voltage | 2.0 | VDD + 0.3 | V
VIL (b) | Input Low Voltage | -0.3 | 0.8 | V
VIH (PASS_IN[1:0]) | PASS_IN[1:0] High Voltage | 1.5 | VDD + 0.3 | V
VIL (PASS_IN[1:0]) | PASS_IN[1:0] Low Voltage | -0.3 | 0.9 | V
VOH (c) | Output High Voltage, IL = -0.2 mA | 2.4 | - | V
VOL (d) | Output Low Voltage, IL = 0.2 mA | 0 | 0.4 | V
VOH (PASS_OUT) | PASS_OUT High Voltage, IL = -0.1 mA | 1.9 | - | V
VOL (PASS_OUT) | PASS_OUT Low Voltage, IL = 0.1 mA | - | 0.5 | V
VOH (HIT) | HIT High Voltage | - | - | V
VOL (HIT) | HIT Low Voltage | - | 0.8 | V
IOZ | Output Leakage Current in Tri-state | -10 | 10 | µA
IIL | Input Leakage Current | -10 | 10 | µA
CIN | Input Capacitance | - | 5 | pF
CCLK | CLK Input Capacitance | - | 7 | pF
CI/O | I/O Capacitance | - | 7 | pF
a. This parameter applies to every input pin except PASS_IN[1:0].
b. This parameter applies to every input pin except PASS_IN[1:0].
c. This parameter applies to all output pins except PASS_OUT and HIT.
d. This parameter applies to all output pins except PASS_OUT and HIT.

Table 7.3 Average supply current by function (M5M410092B)
Symbol | Parameter | -10A, -10 | -12 | Unit
ICC<ALU> | Average supply current for ALU operation, tCLK = min (1 MCLK cycle) | 260 | 215 | mA
ICC<NOP> | Average standby current, tCLK = ∞ | 20 | 20 | mA
ICC<ACP> | Average supply current for DRAM operation ACP, tCLK = min (4 MCLK cycles) | 105 | 85 | mA
ICC<PRE> | Average supply current for DRAM operation PRE, tCLK = min (4 MCLK cycles) | 55 | 40 | mA
ICC<DUP> | Average supply current for DRAM operation DUP, tCLK = min (8 MCLK cycles) | 55 | 40 | mA
ICC<RDB> | Average supply current for DRAM operation RDB, tCLK = min (2 or 3 MCLK cycles) | 160 | 130 | mA
ICC<UWB> | Average supply current for DRAM operation UWB, tCLK = min (2 or 3 MCLK cycles) | 160 | 130 | mA
ICC<VDX> | Average supply current for DRAM operation VDX, tCLK = min (4 MCLK cycles) | 80 | 60 | mA
ICC<VID> | Average supply current for video output, tVCLK = min | 70 | 70 | mA

AC Specifications

Every AC timing parameter is illustrated in at least one of the timing figures in Chapter 8. The “Refer” column in each timing table refers to the exact measurement levels illustrated in Figures 7.3 through 7.6.

Pixel ALU Timing Parameters

The timing parameters of M5M410092B-10 and -12 are presented in Table 7.4.

Table 7.4 Pixel ALU timing parameters (M5M410092B)
Symbol | Parameter | -10A Min | -10A Max | -10 Min | -10 Max | -12 Min | -12 Max | Unit | Refer | Ch. 8 Figure
tCLK | Master clock MCLK cycle time | 10 | 16000 | 10 * | 16000 | 12 | 16000 | ns | t1 | 4
tCLKH | MCLK high pulse width | 4 | - | 4 | - | 5 | - | ns | t2 | 4
tCLKL | MCLK low pulse width | 4 | - | 4 | - | 5 | - | ns | t3 | 4
tRSS | RESET setup time | 0 | - | 0 | - | 0 | - | ns | t6 | 1
tRSP | RESET pulse width | 40 | - | 40 | - | 48 | - | ns | t7 | 2
tENS | PALU_EN setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 4
tENH | PALU_EN hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 4
tOPS | PALU_OP setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 4
tOPH | PALU_OP hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 4
tADS | PALU_A setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 4
tADH | PALU_A hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 4
tDQS | PALU_DQ, PALU_DX setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 5
tDQH | PALU_DQ, PALU_DX hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 5
tWES | PALU_WE setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 4
tWEH | PALU_WE hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 4
tBES | PALU_BE setup time | 3 | - | 3 | - | 4 | - | ns | t8 | 4
tBEH | PALU_BE hold time | 1.5 | - | 1.5 | - | 1.5 | - | ns | t9 | 4
tCLZ | MCLK to PALU_DQ low impedance | 4 | - | 4 | - | 5 | - | ns | t10 | 4
tCQ | PALU_DQ access time | - | 14 | - | 14 | - | 18 | ns | t11 | 4
tCVD | PALU_DQ data valid time | 4 | - | 4 | - | 4 | - | ns | t12 | 4
tCHZ | MCLK to PALU_DQ high impedance | - | 4 | - | 4 | - | 4 | ns | t13 | 4
tPSS | PASS_IN setup time | 2 | - | 2 | - | 3 | - | ns | t8 | 5
tPSH | PASS_IN hold time | 0 | - | 0 | - | 0 | - | ns | t9 | 5
tCPS | MCLK to valid PASS_OUT | - | 6 | - | 6 | - | 8 | ns | t11 | 5
tCPSV | PASS_OUT data valid time | 3 | - | 3 | - | 3 | - | ns | t12 | 5
tCHT | MCLK to valid HIT | - | 35 | - | 35 | - | 35 | ns | t11 | 6
* tCLK = 10.0 ns except that for the alpha saturate logic tCLK = 12.0 ns.

DRAM Timing Parameters

The measurements of the DRAM interlock timings in Tables 7.6 and 7.7 are from the MCLK rising edge of the first operation to the MCLK rising edge of the second operation. Both MCLK edges are measured at 1.5 V.

Table 7.5 Minimum requirements of the DRAM timing parameters (M5M410092B)
Symbol | Parameter | -10A, -10 | -12 | Unit | Refer | Ch. 8 Figure
tREF | Refresh interval for array | 17 | 17 | ms | - | -
tDENS | DRAM_EN setup time | 3 | 4 | ns | t8 | 7
tDENH | DRAM_EN hold time | 1.5 | 1.5 | ns | t9 | 7
tDOPS | DRAM_OP setup time | 3 | 4 | ns | t8 | 7
tDOPH | DRAM_OP hold time | 1.5 | 1.5 | ns | t9 | 7
tDBKS | DRAM_BS setup time | 3 | 4 | ns | t8 | 7
tDBKH | DRAM_BS hold time | 1.5 | 1.5 | ns | t9 | 7
tDADS | DRAM_A setup time | 3 | 4 | ns | t8 | 7
tDADH | DRAM_A hold time | 1.5 | 1.5 | ns | t9 | 7

Table 7.6 Minimum requirements of the DRAM interlock timings for operations on same bank (M5M410092B)
Symbol | Parameter | -10A, -10 | -12 | Unit | Ch. 8 Timing Figure
tdABS | Access Page to Block Transfer | 36 | 36 | ns | 7
tdAPS (a) | Access Page to Precharge Bank | 60 | 72 | ns | 7
tdADS | Access Page to Duplicate Page | 48 | 48 | ns | 8
tdAVS | Access Page to Video Transfer | 40 | 48 | ns | 9
tdBBS | Block Transfer to Block Transfer | 20 | 24 | ns | 7
tdBPS | Block Transfer to Precharge Bank | 20 | 24 | ns | 7
tdBDS | Block Transfer to Duplicate Page | 20 | 24 | ns | 8
tdBVS | Block Transfer to Video Transfer | 20 | 24 | ns | 10
tdPAS (b) | Precharge Bank to Access Page | 40 | 48 | ns | 7
tdPPS | Precharge Bank to Precharge Bank | 10 | 12 | ns | 10
tdDBS | Duplicate Page to Block Transfer | 80 | 96 | ns | 8
tdDPS | Duplicate Page to Precharge Bank | 80 | 96 | ns | 9
tdDDS | Duplicate Page to Duplicate Page | 80 | 96 | ns | 8
tdDVS | Duplicate Page to Video Transfer | 80 | 96 | ns | 9
tdVBS | Video Transfer to Block Transfer | 40 | 48 | ns | 10
tdVPS | Video Transfer to Precharge Bank | 20 | 24 | ns | 9
tdVDS | Video Transfer to Duplicate Page | 40 | 48 | ns | 9
tdVVS | Video Transfer to Video Transfer | 80 | 96 | ns | 9
a. The maximum timing limit from ACP to PRE is 100,000 ns.
b. The operation from PRE to ACP requires at least two clock cycles. At the first clock rising edge, PRE starts. At the second clock rising edge, the preparation for ACP starts.
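Because the same-bank interlocks in Table 7.6 are time durations, the number of MCLK cycles between two operations depends on the clock period in use. The sketch below is a hedged illustration, not part of the specification. It rounds each interlock up to whole clock cycles and issues a same-bank sequence as early as the listed interlocks allow. The names `INTERLOCK_NS`, `cycles_needed`, and `earliest_cycles` are hypothetical. Only the Access Page (A), Block Transfer (B), and Precharge Bank (P) entries are copied from the table, and integer nanosecond clock periods are assumed. Checking each interlock against every earlier operation in the sequence, not only the immediately preceding one, is a conservative assumption.

```python
# Same-bank interlocks in ns, keyed by (earlier op, later op).
# A = Access Page, B = Block Transfer, P = Precharge Bank.
INTERLOCK_NS = {
    "-10": {("A", "B"): 36, ("A", "P"): 60, ("B", "B"): 20,
            ("B", "P"): 20, ("P", "A"): 40},
    "-12": {("A", "B"): 36, ("A", "P"): 72, ("B", "B"): 24,
            ("B", "P"): 24, ("P", "A"): 48},
}


def cycles_needed(interlock_ns: int, tclk_ns: int) -> int:
    """Smallest whole number of clock periods covering the interlock."""
    return -(-interlock_ns // tclk_ns)  # ceiling division on integers


def earliest_cycles(ops: str, grade: str, tclk_ns: int) -> list[int]:
    """Earliest issue cycle (relative to the first op) for each op in `ops`."""
    table = INTERLOCK_NS[grade]
    issued: list[tuple[str, int]] = []
    for op in ops:
        # At most one operation is issued per clock cycle.
        cycle = issued[-1][1] + 1 if issued else 0
        for prev_op, prev_cycle in issued:
            gap_ns = table.get((prev_op, op))
            if gap_ns is not None:
                cycle = max(cycle, prev_cycle + cycles_needed(gap_ns, tclk_ns))
        issued.append((op, cycle))
    return [cycle for _, cycle in issued]
```

With the -10 values and a 10 ns clock, `earliest_cycles("ABBBBP", "-10", 10)` issues an Access Page, four Block Transfers, and a Precharge Bank at cycles n, n+4, n+6, n+8, n+10, and n+12, which agrees with the externally specified operations in Table 5.9. At a slower clock the same interlocks round to fewer cycles, for example `cycles_needed(36, 15)` is 3.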
Minimum requirements of the DRAM interlock timings for operations on different banks (M5M410092B)
Symbol | Parameter | -10A, -10 | -12 | Unit | Ch. 8 Timing Figure
tdAAD | Access Page to Access Page | 40 | 48 | ns | 11
tdABD | Access Page to Block Transfer | 10 | 12 | ns | 11
tdAPD | Access Page to Precharge Bank | 40 | 48 | ns | 11
tdADD | Access Page to Duplicate Page | 40 | 48 | ns | 12
tdAVD | Access Page to Video Transfer | 40 | 48 | ns | 13
tdBAD | Block Transfer to Access Page | 10 | 12 | ns | 11
tdBBD | Block Transfer to Block Transfer | 20 | 24 | ns | 11
tdBPD | Block Transfer to Precharge Bank | 10 | 12 | ns | 11
tdBDD | Block Transfer to Duplicate Page | 10 | 12 | ns | 11
tdBVD | Block Transfer to Video Transfer | 10 | 12 | ns | 13
tdPAD | Precharge Bank to Access Page | 10 | 12 | ns | 11
tdPBD | Precharge Bank to Block Transfer | 10 | 12 | ns | 11
tdPPD | Precharge Bank to Precharge Bank | 10 | 12 | ns | 11
tdPDD | Precharge Bank to Duplicate Page | 10 | 12 | ns | 11
tdPVD | Precharge Bank to Video Transfer | 10 | 12 | ns | 13
tdDAD | Duplicate Page to Access Page | 80 | 96 | ns | 12
tdDBD | Duplicate Page to Block Transfer | 10 | 12 | ns | 12
tdDPD | Duplicate Page to Precharge Bank | 40 | 48 | ns | 12
tdDDD | Duplicate Page to Duplicate Page | 80 | 96 | ns | 12
tdDVD | Duplicate Page to Video Transfer | 80 | 96 | ns | 13
tdVAD | Video Transfer to Access Page | 40 | 48 | ns | 13
tdVBD | Video Transfer to Block Transfer | 10 | 12 | ns | 13
tdVPD | Video Transfer to Precharge Bank | 20 | 24 | ns | 13
tdVDD | Video Transfer to Duplicate Page | 40 | 48 | ns | 13
tdVVD | Video Transfer to Video Transfer | 80 | 96 | ns | 13

Video Buffer Timing Parameters

Table 7.7 Video Buffer timing parameters (M5M410092B)
Symbol | Parameter | -10A, -10 Min | -10A, -10 Max | -12 Min | -12 Max | Unit | Refer | Ch. 8 Figure
tVCLK | VID_CLK cycle time | 12 | - | 12 | - | ns | t1 | 14
tVCLKH | VID_CLK high pulse width | 5 | - | 5 | - | ns | t2 | 14
tVCLKL | VID_CLK low pulse width | 5 | - | 5 | - | ns | t3 | 14
tVCES | VID_CKE setup time | 4 | - | 4 | - | ns | t8 | 14
tVCEH | VID_CKE hold time | 0 | - | 0 | - | ns | t9 | 14
tVQ | VID_Q access time from VID_CLK | - | 8 | - | 8 | ns | t11 | 14
tVQVC | VID_Q valid after VID_CLK | 3 | - | 3 | - | ns | t12 | 14
tVLZ | VID_Q output low impedance | 3 | - | 3 | - | ns | t16 | 14
tVHZ | VID_Q output high impedance | - | 3 | - | 3 | ns | t15 | 14
tVQE | VID_Q access time from VID_OE high | - | 9 | - | 9 | ns | t17 | 14
tVQVE | VID_Q valid after VID_OE low | - | - | - | - | ns | t14 | 14
tVXCI1 | Initial VDX after last internal VID_CLK | 14 | - | 14 | - | ns | - | 15
tVXCI2 | Initial VDX before next internal VID_CLK | 80 | - | 80 | - | ns | - | 15
tVXQFI | VID_QSF delay time after initial VDX | - | 80 | - | 80 | ns | t11 | 15
tQSF | VID_QSF delay time from internal VID_CLK 38 | - | 25 | - | 25 | ns | t11 | 16
tVXC1 | Normal VDX after internal VID_CLK 38 | 20 | - | 20 | - | ns | - | 16
tVXC2 | Normal VDX before next internal VID_CLK 38 | 60 | - | 60 | - | ns | - | 16

Boundary-Scan Timing Parameters

Table 7.8 Boundary-Scan timing parameters (M5M410092B -10A, -10, -12)
Symbol | Parameter | Min | Max | Unit | Refer | Ch. 8 Figure
tSCLK | SCAN_TCK cycle time | 100 | - | ns | t1 | 17
tSCLKH | SCAN_TCK high pulse width | 40 | - | ns | t2 | 17
tSCLKL | SCAN_TCK low pulse width | 40 | - | ns | t3 | 17
tSCNTS | SCAN_TMS setup time | 8 | - | ns | t8 | 17
tSCNTH | SCAN_TMS hold time | 26 | - | ns | t9 | 17
tSCNIS | SCAN_TDI setup time | 8 | - | ns | t8 | 17
tSCNIH | SCAN_TDI hold time | 26 | - | ns | t9 | 17
tSLZ | SCAN_TCK to SCAN_TDO low impedance | - | 20 | ns | t18 | 17
tSQ | SCAN_TDO access time | - | 26 | ns | t19 | 17
tSVD | SCAN_TDO data valid time | 8 | - | ns | t20 | 17
tSHZ | SCAN_TCK to SCAN_TDO high impedance | - | 20 | ns | t21 | 17
tSCNRS | SCAN_RST setup time | 8 | - | ns | t6 | 18
tSCNRP | SCAN_RST pulse width | 30 | - | ns | t7 | 18

8 Timing Diagrams
8 Timing Diagrams

This chapter shows 18 timing diagrams, which are summarized in Table 8.1. These diagrams show the gross timing specifications. Refer to Chapter 7 for exact timing measurements. Also, the entries of parameter values from Tables 7.4 through 7.9 are repeated here for convenient reference.

Table 8.1 Timing diagram figures

Figure  Description
8.1     Power on reset
8.2     Restart reset
8.3     DRAM array initialization
8.4     Pixel port read
8.5     Pixel port write
8.6     Pick logic timing
8.7     DRAM operations on the same bank (1)
8.8     DRAM operations on the same bank (2)
8.9     DRAM operations on the same bank (3)
8.10    DRAM operations on the same bank (4)
8.11    DRAM operations between two different banks (1)
8.12    DRAM operations between two different banks (2)
8.13    DRAM operations between two different banks (3)
8.14    Internal VID_CLK and video output timing
8.15    Video output sequence from initial VDX for normal and reversed modes
8.16    Continuous video output sequence in normal mode during display
8.17    Boundary scan
8.18    Boundary scan reset

[Figure 8.1 Power on reset (drawing M1020). Signals: VDD, MCLK, RESET. Annotations: > 500 µs, > 9 cycles, tRSS. Internal function sequence: stabilize internal power supply, reset registers, initialize DRAM array, start normal operation.]

[Figure 8.2 Restart reset (drawing M1021). Signals: MCLK, RESET. Annotations: > 9 cycles, tRSP, tRSS. Internal function sequence: reset registers, initialize DRAM array, start normal operation.]

Table 8.2 Reset timing parameters

                                   M5M410092B
                                   -10A, -10   -12
Symbol  Parameter                  Min  Max    Min  Max  Unit  Refer
tRSS    RESET setup time           0    —      0    —    ns    t6
tRSP    RESET pulse width          40   —      48   —    ns    t7

[Figure 8.3 DRAM array initialization (drawing M1019). Signals: MCLK, DRAM_OP, DRAM_BS. Two alternative sequences are shown, each followed by the start of normal operation: DRAM_OP = ACP, ACP, ACP, ACP, PRE, PRE, PRE, PRE with DRAM_BS = A, B, C, D, A, B, C, D; or DRAM_OP = ACP, PRE, ACP, PRE, ACP, PRE, ACP, PRE with DRAM_BS = A, A, B, B, C, C, D, D.]

[Figure 8.4 Pixel port read (drawing M1018). Signals: MCLK, PALU_EN, PALU_OP, PALU_WE, PALU_A, PALU_BEn, PALU_DQ[8n+7..8n]. Parameters shown: tCLK, tCLKH, tCLKL, tENS, tENH, tOPS, tOPH, tWES, tWEH, tADS, tADH, tBES, tBEH, tCLZ, tCQ, tCVD, tCHZ.]

Note: Refer to Figure 2.6 for an example of combined operations of Pixel ALU read and write.

[Figure 8.5 Pixel port write (drawing M1017). Signals over MCLK cycles 1 to 10: PALU_EN, PALU_OP, PALU_WE, PALU_A, PALU_BEn, PALU_DQ[8n+7..8n], PALU_DXn, PASS_IN, PASS_OUT. Parameters shown: tDQS, tDQH, tPSS, tPSH, tCPS, tCPSV.]

Table 8.3 Pixel ALU timing parameters

                                             M5M410092B
                                             -10A         -10          -12
Symbol  Parameter                            Min  Max     Min  Max     Min  Max    Unit  Refer
tCLK    Master clock MCLK cycle time         10   16000   10✻  16000   12   16000  ns    t1
tCLKH   MCLK high pulse width                4    —       4    —       5    —      ns    t2
tCLKL   MCLK low pulse width                 4    —       4    —       5    —      ns    t3
tENS    PALU_EN setup time                   3    —       3    —       4    —      ns    t8
tENH    PALU_EN hold time                    1.5  —       1.5  —       1.5  —      ns    t9
tOPS    PALU_OP setup time                   3    —       3    —       4    —      ns    t8
tOPH    PALU_OP hold time                    1.5  —       1.5  —       1.5  —      ns    t9
tADS    PALU_A setup time                    3    —       3    —       4    —      ns    t8
tADH    PALU_A hold time                     1.5  —       1.5  —       1.5  —      ns    t9
tWES    PALU_WE setup time                   3    —       3    —       4    —      ns    t8
tWEH    PALU_WE hold time                    1.5  —       1.5  —       1.5  —      ns    t9
tBES    PALU_BE setup time                   3    —       3    —       4    —      ns    t8
tBEH    PALU_BE hold time                    1.5  —       1.5  —       1.5  —      ns    t9
tCLZ    MCLK to PALU_DQ low impedance        4    —       4    —       5    —      ns    t10
tCQ     PALU_DQ access time                  —    14      —    14      —    18     ns    t11
tCVD    PALU_DQ data valid time              4    —       4    —       4    —      ns    t12
tCHZ    MCLK to PALU_DQ high impedance       —    4       —    4       —    4      ns    t13
tDQS    PALU_DQ, PALU_DX setup time          3    —       3    —       4    —      ns    t8
tDQH    PALU_DQ, PALU_DX hold time           1.5  —       1.5  —       1.5  —      ns    t9
tPSS    PASS_IN setup time                   2    —       2    —       3    —      ns    t8
tPSH    PASS_IN hold time                    1.5  —       1.5  —       1.5  —      ns    t9
tCPS    MCLK to valid PASS_OUT               —    6       —    6       —    8      ns    t11
tCPSV   PASS_OUT data valid time             3    —       3    —       3    —      ns    t12

✻ tCLK = 10.0 ns except that for the alpha saturate logic tCLK = 12.0 ns.

[Figure 8.6 Picking Logic timing (drawing M1016). Signals over MCLK cycles 1 to 10: PALU_EN, PALU_OP, PALU_WE, PALU_A, PALU_BE[2..0], PALU_BE3, PALU_DQ[31..0], PASS_IN, PASS_OUT, HIT. Parameter shown: tCHT.]

Note:
1. The HIT signal is cleared by writing to the Compare Control register.
2. The HIT signal can be set by the comparison result from the PASS_IN and PASS_OUT pins, which are generated two cycles before the HIT signal is.

Table 8.4 Picking Logic timing parameter

                                M5M410092B
                                -10A, -10   -12
Symbol  Parameter               Min  Max    Min  Max  Unit  Refer
tCHT    MCLK to valid HIT       —    35     —    35   ns    t11

[Figure 8.7 DRAM operations on the same bank (1) (drawings M1015, M1043). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Operations PRE, ACP, BKX, BKX, PRE on bank A, separated by NOPs. Parameters shown: tDENS, tDENH, tDOPS, tDOPH, tDBKS, tDBKH, tDADS, tDADH, tdPAS, tdABS, tdBBS, tdBPS, tdAPS.]

Note: BKX means any block transfer operation, such as UWB, MWB, or RDB.

[Figure 8.8 DRAM operations on the same bank (2) (drawing M1014). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Operations ACP, DUP, DUP, BKX, DUP on bank A, separated by NOPs. Parameters shown: tdADS, tdDDS, tdDBS, tdBDS.]

Note: BKX means any block transfer operation, such as UWB, MWB, or RDB.

Table 8.5 Minimum requirements of the DRAM port interface timing parameters

                                 M5M410092B
Symbol  Parameter                -10A, -10  -12   Unit  Refer  Timing Figure
tDENS   DRAM_EN setup time       3          4     ns    t8     8.7
tDENH   DRAM_EN hold time        1.5        1.5   ns    t9     8.7
tDOPS   DRAM_OP setup time       3          4     ns    t8     8.7
tDOPH   DRAM_OP hold time        1.5        1.5   ns    t9     8.7
tDBKS   DRAM_BS setup time       3          4     ns    t8     8.7
tDBKH   DRAM_BS hold time        1.5        1.5   ns    t9     8.7
tDADS   DRAM_A setup time        3          4     ns    t8     8.7
tDADH   DRAM_A hold time         1.5        1.5   ns    t9     8.7

Table 8.6 Minimum requirements of the DRAM interlock timings for operations on same bank

                                             M5M410092B
Symbol    Parameter                          -10A, -10  -12   Unit  Timing Figure
tdABS     Access Page to Block Transfer      36         36    ns    8.7
tdAPS a   Access Page to Precharge Bank      60         72    ns    8.7
tdADS     Access Page to Duplicate Page      48         48    ns    8.8
tdBBS     Block Transfer to Block Transfer   20         24    ns    8.7
tdBPS     Block Transfer to Precharge Bank   20         24    ns    8.7
tdBDS     Block Transfer to Duplicate Page   20         24    ns    8.8
tdPAS b   Precharge Bank to Access Page      40         48    ns    8.7
tdDBS     Duplicate Page to Block Transfer   80         96    ns    8.8
tdDDS     Duplicate Page to Duplicate Page   80         96    ns    8.8

a. The maximum timing limit from ACP to PRE is 100,000 ns.
b. The operation from PRE to ACP requires at least two clock cycles. At the first clock rising edge, PRE starts. At the second clock rising edge, the preparation for ACP starts.

[Figure 8.9 DRAM operations on the same bank (3) (drawings M1013, M1044). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Operations ACP, VDX, DUP, VDX, PRE on bank A, separated by NOPs. Parameters shown: tdAVS, tdVDS, tdDVS, tdVPS, tdDPS, tdVVS.]

[Figure 8.10 DRAM operations on the same bank (4) (drawing M1012). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Operations VDX, BKX, VDX, PRE, PRE on bank A, separated by NOPs. Parameters shown: tdVBS, tdBVS, tdPPS.]

Note: BKX means any block transfer operation, such as UWB, MWB, or RDB.

Table 8.7 Minimum requirements of the DRAM interlock timings for operations on same bank

                                            M5M410092B
Symbol  Parameter                           -10A, -10  -12   Unit  Timing Figure
tdAVS   Access Page to Video Transfer       40         48    ns    8.9
tdBVS   Block Transfer to Video Transfer    20         24    ns    8.10
tdPPS   Precharge Bank to Precharge Bank    10         12    ns    8.10
tdDPS   Duplicate Page to Precharge Bank    80         96    ns    8.9
tdDVS   Duplicate Page to Video Transfer    80         96    ns    8.9
tdVBS   Video Transfer to Block Transfer    40         48    ns    8.10
tdVPS   Video Transfer to Precharge Bank    20         24    ns    8.9
tdVDS   Video Transfer to Duplicate Page    40         48    ns    8.9
tdVVS   Video Transfer to Video Transfer    80         96    ns    8.9

[Figure 8.11 DRAM operations between two different banks (1) (drawings M1011, M1045). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Parameters shown: tdBBD, tdPPD, tdPAD, tdABD, tdBAD, tdBPD, tdPDD, tdAAD, tdAPD, tdPBD, tdBDD.]

Note: BKX means any block transfer operation, such as UWB, MWB, or RDB.

[Figure 8.12 DRAM operations between two different banks (2) (drawing M1010). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Parameters shown: tdDDD, tdADD, tdDAD, tdDBD, tdDPD.]

Table 8.8 Minimum requirement of the DRAM interlock timings for operations between two different banks

                                            M5M410092B
Symbol  Parameter                           -10A, -10  -12   Unit  Timing Figure
tdAAD   Access Page to Access Page          40         48    ns    8.11
tdABD   Access Page to Block Transfer       10         12    ns    8.11
tdAPD   Access Page to Precharge Bank       40         48    ns    8.11
tdADD   Access Page to Duplicate Page       40         48    ns    8.12
tdBAD   Block Transfer to Access Page       10         12    ns    8.11
tdBBD   Block Transfer to Block Transfer    20         24    ns    8.11
tdBPD   Block Transfer to Precharge Bank    10         12    ns    8.11
tdBDD   Block Transfer to Duplicate Page    10         12    ns    8.11
tdPAD   Precharge Bank to Access Page       10         12    ns    8.11
tdPBD   Precharge Bank to Block Transfer    10         12    ns    8.11
tdPPD   Precharge Bank to Precharge Bank    10         12    ns    8.11
tdPDD   Precharge Bank to Duplicate Page    10         12    ns    8.11
tdDAD   Duplicate Page to Access Page       80         96    ns    8.12
tdDBD   Duplicate Page to Block Transfer    10         12    ns    8.12
tdDPD   Duplicate Page to Precharge Bank    40         48    ns    8.12
tdDDD   Duplicate Page to Duplicate Page    80         96    ns    8.12

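A controller designer usually needs these interlock times as a count of MCLK cycles rather than in nanoseconds. The following is a minimal sketch of that conversion, assuming that commands are issued on MCLK rising edges and that rounding the interlock time up to a whole number of clock periods is sufficient. The dictionary copies the different-bank values from the table above; the function name `min_cycles` and its parameters are hypothetical and not part of the device documentation.

```python
import math

# Minimum interlock times in ns between DRAM operations on two different
# banks, keyed by (first operation, second operation).
# Each value is (-10A/-10 grade, -12 grade), copied from the table above.
INTERLOCK_NS = {
    ("ACP", "ACP"): (40, 48), ("ACP", "BKX"): (10, 12),
    ("ACP", "PRE"): (40, 48), ("ACP", "DUP"): (40, 48),
    ("BKX", "ACP"): (10, 12), ("BKX", "BKX"): (20, 24),
    ("BKX", "PRE"): (10, 12), ("BKX", "DUP"): (10, 12),
    ("PRE", "ACP"): (10, 12), ("PRE", "BKX"): (10, 12),
    ("PRE", "PRE"): (10, 12), ("PRE", "DUP"): (10, 12),
    ("DUP", "ACP"): (80, 96), ("DUP", "BKX"): (10, 12),
    ("DUP", "PRE"): (40, 48), ("DUP", "DUP"): (80, 96),
}


def min_cycles(first_op, second_op, tclk_ns, grade_12=False):
    """Smallest whole number of MCLK periods covering the interlock time.

    Assumption (not stated in the datasheet): rounding up to whole
    clock periods is the correct way to apply the ns limits.
    """
    ns = INTERLOCK_NS[(first_op, second_op)][1 if grade_12 else 0]
    return math.ceil(ns / tclk_ns)


if __name__ == "__main__":
    # -10 grade part clocked at its minimum tCLK of 10 ns.
    print(min_cycles("ACP", "ACP", 10))
    # -12 grade part clocked at its minimum tCLK of 12 ns.
    print(min_cycles("DUP", "ACP", 12, grade_12=True))
```

At the minimum cycle time of each grade the interlock values divide evenly, which suggests they were specified as multiples of tCLK; at slower clocks the rounding step matters.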
[Figure 8.13 DRAM operations between two different banks (3) (drawing M1009). Signals: MCLK, DRAM_EN, DRAM_OP, DRAM_BS, DRAM_A. Parameters shown: tdVVD, tdDVD, tdVPD, tdBVD, tdVDD, tdVBD, tdPVD, tdVAD, tdAVD.]

Table 8.9 Minimum requirement of the DRAM interlock timings for operations between two different banks

                                            M5M410092B
Symbol  Parameter                           -10A, -10  -12   Unit
tdAVD   Access Page to Video Transfer       40         48    ns
tdBVD   Block Transfer to Video Transfer    10         12    ns
tdPVD   Precharge Bank to Video Transfer    10         12    ns
tdDVD   Duplicate Page to Video Transfer    80         96    ns
tdVAD   Video Transfer to Access Page       40         48    ns
tdVBD   Video Transfer to Block Transfer    10         12    ns
tdVPD   Video Transfer to Precharge Bank    20         24    ns
tdVDD   Video Transfer to Duplicate Page    40         48    ns
tdVVD   Video Transfer to Video Transfer    80         96    ns

[Figure 8.14 Internal VID_CLK and video output timing (drawing M1006). Signals: VID_CLK, VID_CKE, internal VID_CLK, VID_OE, VID_Q. Parameters shown: tVCLK, tVCLKH, tVCLKL, tVCES, tVCEH, tVLZ, tVQ, tVQVC, tVQVE, tVHZ, tVQE.]

Note:
1. The deassertion of VID_CKE at the current VID_CLK rising edge will mask out the next internal VID_CLK cycle.
2. Timings are measured from the VID_CLK pin or from the VID_OE pin.

[Figure 8.15 Video output sequence from initial VDX for normal and reversed modes (drawing M1024). Signals: MCLK, DRAM_OP (VDX), DRAM_A[8..7] (10), VID_CLK, VID_CKE, internal VID_CLK, VID_Q, VID_QSF. VID_Q outputs Data 0, Data 1, Data 2, Data 3 in normal mode and Data 1, Data 0, Data 3, Data 2 in reversed mode. Parameters shown: tVXCI1, tVXCI2, tVQ, tVXQFI.]

Note:
1. VID_OE is assumed to be held at “1” (video output enabled).
2. Timings are measured from the MCLK pin or from the VID_CLK pin.

Table 8.10 Video Buffer timing parameters

                                                      M5M410092B
                                                      -10A, -10   -12
Symbol  Parameter                                     Min  Max    Min  Max  Unit  Refer  Timing Figure
tVCLK   VID_CLK cycle time                            12   —      12   —    ns    t1     8.14
tVCLKH  VID_CLK high pulse width                      5    —      5    —    ns    t2     8.14
tVCLKL  VID_CLK low pulse width                       5    —      5    —    ns    t3     8.14
tVCES   VID_CKE setup time                            4    —      4    —    ns    t8     8.14
tVCEH   VID_CKE hold time                             0    —      0    —    ns    t9     8.14
tVQ     VID_Q access time from VID_CLK                —    8      —    8    ns    t11    8.14
tVQVC   VID_Q valid after VID_CLK                     3    —      3    —    ns    t12    8.14
tVLZ    VID_Q output low impedance                    3    —      3    —    ns    t16    8.14
tVHZ    VID_Q output high impedance                   —    3      —    3    ns    t15    8.14
tVQE    VID_Q access time from VID_OE high            —    9      —    9    ns    t17    8.14
tVQVE   VID_Q valid after VID_OE low                  —    —      —    —    ns    t14    8.14
tVXCI1  Initial VDX after last internal VID_CLK       14   —      14   —    ns    —      8.15
tVXCI2  Initial VDX before next internal VID_CLK      80   —      80   —    ns    —      8.15
tVXQFI  VID_QSF delay time after initial VDX          —    80     —    80   ns    t11    8.15

[Figure 8.16 Continuous video output sequence in normal mode during display (drawing M1022). Signals: VID_QSF, VID_Q[15..0], VID_CLK, VID_CKE, MCLK, DRAM_OP, DRAM_BS, DRAM_A[8..7]. Three VDX operations are shown on banks 0, 1 and 2 with DRAM_A[8..7] = 10, 0x, 0x: initial VDX during retrace, normal VDX during retrace, and normal VDX during display. Each bank outputs words 0 through 39. Parameters shown: tQSF, tVQ, tVXC1, tVXC2.]

Note:
1. tVXC1 specifies the earliest allowed normal VDX.
2. tVXC2 specifies the latest allowed normal VDX.

Table 8.11 Video Buffer timing parameters

                                                      M5M410092B
                                                      -10A, -10   -12
Symbol  Parameter                                     Min  Max    Min  Max  Unit  Refer
tQSF    VID_QSF delay time from internal VID_CLK 38   —    25     —    25   ns    t11
tVXC1   Normal VDX after internal VID_CLK 38          20   —      20   —    ns    —
tVXC2   Normal VDX before next internal VID_CLK 38    60   —      60   —    ns    —

[Figure 8.17 Boundary scan (drawing M1004). Signals: SCAN_TCK, SCAN_TMS, SCAN_TDI, SCAN_TDO. Controller states shown: Test-Logic Reset, Run-Test-Idle, Shift-IR, Shift-IR, Exit-IR, Update-IR. Parameters shown: tSCLK, tSCLKH, tSCLKL, tSCNTS, tSCNTH, tSCNIS, tSCNIH, tSQ, tSHZ, tSLZ, tSVD.]

Table 8.12 Boundary-Scan timing parameters

                                                    M5M410092B
                                                    -10A, -10, -12
Symbol  Parameter                                   Min  Max  Unit  Refer
tSCLK   SCAN_TCK cycle time                         100  —    ns    t1
tSCLKH  SCAN_TCK high pulse width                   40   —    ns    t2
tSCLKL  SCAN_TCK low pulse width                    40   —    ns    t3
tSCNTS  SCAN_TMS setup time                         8    —    ns    t8
tSCNTH  SCAN_TMS hold time                          26   —    ns    t9
tSCNIS  SCAN_TDI setup time                         8    —    ns    t8
tSCNIH  SCAN_TDI hold time                          26   —    ns    t9
tSLZ    SCAN_TCK to SCAN_TDO low impedance          —    20   ns    t18
tSQ     SCAN_TDO access time                        —    26   ns    t19
tSVD    SCAN_TDO data valid time                    8    —    ns    t20
tSHZ    SCAN_TCK to SCAN_TDO high impedance         —    20   ns    t21

[Figure 8.18 Boundary scan reset (drawing M1005). Signals: SCAN_TCK, SCAN_RST. Parameters shown: tSCNRP, tSCNRS.]

Table 8.13 Boundary-Scan reset timing parameters

                                    M5M410092B
                                    -10A, -10, -12
Symbol  Parameter                   Min  Max  Unit  Refer
tSCNRS  SCAN_RST setup time         8    —    ns    t6
tSCNRP  SCAN_RST pulse width        30   —    ns    t7

9 Packaging

3D-RAM Pinouts

The 3D-RAM is housed in 128-pin QFP (FP) and reverse QFP (RF) packages. There are two pinouts for the 3D-RAM: the normal pinout, with pin 1 located at the lower left-hand corner and marked by a small circle; and the reverse pinout, with pin 1 located at the upper left-hand corner and marked by a large circle and a pointing triangle. The device in normal pinout is designated by the letters “FP” in the product number, and the device in reverse pinout by the letters “RF.” In both pinouts, the mapping of pin numbers to pin names is identical. For convenient reference, the pinout diagrams for the 3D-RAM are repeated on pages 137 and 138.
Page 103 contains the mechanical specification for the FP and RF packages. The thermal characteristics data for both packages is on page 142.

Normal Pinout Diagram

[Normal pinout diagram of the 128-pin M5M410092BFP package, top view with tracking label DDDMMMMM-nn (drawing M1048). Pins 1 to 38 carry the SCAN, VID_Q[15..0], VID_QSF, VID_CKE, VID_CLK, VID_OE, PASS_IN and HIT signals; pins 39 to 64 carry DRAM_A[5..0], DRAM_EN, DRAM_OP0, PALU_A[2..0], PALU_EN0, PALU_OP[1..0], PALU_BE[1..0], PALU_DX[1..0] and PALU_DQ[3..0]; pins 65 to 102 carry PALU_DQ[27..4], PASS_OUT and MCLK; pins 103 to 128 carry PALU_DQ[31..28], PALU_DX[3..2], PALU_BE[3..2], PALU_OP2, PALU_WE, PALU_EN1, PALU_A[5..3], DRAM_OP[2..1], DRAM_A[8..6], DRAM_BS[1..0] and RESET. VDD, VSS and NC pins are distributed among them.]

Reverse Pinout Diagram

[Reverse pinout diagram of the 128-pin M5M410092BRF package, top view with tracking label DDDMMMMM-nn (drawing M1003). The mapping of pin numbers to pin names is identical to the normal pinout; only the physical position of the pins on the package is mirrored.]

Tracking Label

On the top surface of the 3D-RAM package, a tracking label is printed below the Mitsubishi logo and the 3D-RAM product number. The tracking label consists of 7 numbers followed by a dash and a speed/power grade designation, and is represented by the mnemonic “DDDMMMMM-nn”.
This mnemonic is explained as below:

DDD: Date code
MMMMM: Manufacturing code
nn: One of the following speed designations:
    “10A”: tCLK (min) = 10 ns
    “10”: tCLK (min) = 10 ns except tCLK (min) = 12 ns for the alpha saturate logic
    “12”: tCLK (min) = 12 ns

Mechanical Drawing for 128-pin FP and RF Packages

[Figure 9.1 128-pin FP package drawing (drawing M1041). Top view, side view with seating plane, Detail F of the terminal, and recommended mount pad. Dimension symbols: A, A1, A2, b, b2, c, D, E, e, HD, HE, I2, L, L1, MD, ME, y, θ.]

[Figure 9.2 128-pin RF package drawing (drawing M1042). Same views and dimension symbols as Figure 9.1, with the reverse pin numbering.]

Table 9.1 Package drawing parameters in millimeters

                                                   Dimension in Millimeters
Description                               Symbol   Min     Nominal  Max
Mounting height                           A        —       —        1.7
Stand-off height                          A1       0.05    0.15     0.25
Package height                            A2       —       1.4      —
Terminal width                            b        0.13    0.18     0.28
Terminal thickness                        c        0.105   0.125    0.175
Package length                            D        13.9    14.0     14.1
Package width                             E        19.9    20.0     20.1
Linear spacing between terminals          e        —       0.5      —
Over length                               HD       15.8    16.0     16.2
Over width                                HE       21.8    22.0     22.2
Length of the flat portion of terminal    L        0.3     0.5      0.7
Terminal length                           L1       —       1.0      —
Flatness of terminal                      y        —       —        0.1
Terminal angle                            θ        0°      —        10°
Mount pad dimensions                      b2       —       0.225    —
Mount pad dimensions                      I2       1.0     —        —
Mount pad dimensions                      MD       —       14.4     —
Mount pad dimensions                      ME       —       20.4     —

Thermal Characteristics

The maximum junction temperature (Tjc) is set to 125 °C. The junction temperature can be calculated with the following equation:

Tjc (°C) = θja (°C/W) × P (W) + Ta (°C)

where θja is the junction-to-ambient thermal resistance, P is the whole chip power dissipation, and Ta is the ambient temperature.

Thermal Resistance for Single Package

Four cases of the thermal resistance for single package are listed in Table 9.2. These four cases are package only, package on PCB, package on PCB with thermal compound, and package on PCB with fin. Figures 9.3 through 9.7 show the detailed conditions of these four cases.

Table 9.2 Thermal resistance for single package

                             Case 1         Case 2          Case 3                        Case 4
                             Package only   Package on PCB  Package on PCB with compound  Package on PCB with fin
Airflow                      θja (°C/W)     θja (°C/W)      θja (°C/W)                    θja (°C/W)
0; or natural convection     160.0          86.0            80.0                          44.4
100 ft/min or 0.5 m/s        97.0           66.2            60.3                          31.2
200 ft/min or 1.0 m/s        78.0           58.7            53.9                          26.3
400 ft/min or 2.0 m/s        59.5           49.4            43.6                          20.9
1000 ft/min or 5.0 m/s       43.1           39.5            35.1                          16.2

[Figure 9.3 Case 1 condition: package only. Package on an adiabatic plane.]
[Figure 9.4 Case 2 condition: package on PCB. Package and PCB on an adiabatic plane.]
[Figure 9.5 Case 3 condition: package on PCB with thermal compound. Package, thermal compound and PCB on an adiabatic plane.]
[Figure 9.6 Case 4 condition: package on PCB with fin. Fin (9.2 mm), package and PCB on an adiabatic plane.]
[Figure 9.7 Mechanical drawing of the fin. Dimension labels: 1.0, 2.0, 7.5, 9.2, 1.5, φ13, φ6.]

Thermal Resistance for Twelve Packages Mounted on PCB

Table 9.3 lists the thermal resistance for 12 packages mounted on two sides of a PCB. Two cases are listed: without heat sink and with aluminum plate for heat sink.
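As a rough illustration of the junction-temperature relation given under Thermal Characteristics (Tjc = θja × P + Ta), the sketch below evaluates it for the single-package, natural-convection values of Table 9.2 and also rearranges it to give the largest power dissipation that keeps Tjc at or below 125 °C. The function names, the 1.0 W power figure and the ambient temperatures are hypothetical values chosen for illustration; they are not device specifications.

```python
TJ_MAX_C = 125.0  # maximum junction temperature stated in this chapter

# Single-package thermal resistance in °C/W at natural convection (Table 9.2).
THETA_JA_NATURAL = {
    "package only": 160.0,
    "package on PCB": 86.0,
    "package on PCB with compound": 80.0,
    "package on PCB with fin": 44.4,
}


def junction_temp_c(theta_ja, power_w, ambient_c):
    """Tjc = theta_ja * P + Ta."""
    return theta_ja * power_w + ambient_c


def max_power_w(theta_ja, ambient_c, tj_max_c=TJ_MAX_C):
    """Largest P for which Tjc does not exceed tj_max_c (same relation, solved for P)."""
    return (tj_max_c - ambient_c) / theta_ja


if __name__ == "__main__":
    power_w = 1.0     # hypothetical chip power dissipation
    ambient_c = 25.0  # hypothetical ambient temperature
    for case, theta in THETA_JA_NATURAL.items():
        tj = junction_temp_c(theta, power_w, ambient_c)
        status = "within limit" if tj <= TJ_MAX_C else "exceeds limit"
        print(f"{case}: Tjc = {tj:.1f} °C ({status})")
```

Under these assumed conditions the package-only case exceeds the 125 °C limit while the three PCB-mounted cases stay below it, which shows how strongly the mounting condition affects the allowable power.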
Solder Fe Table 9.3 Thermal resistance for 12 packages mounting double-sided on PCB Airflow Without Heat Sink With Heat Sink θja (°C/W) θja (°C/W) 84.0 41.0 73.0 26.1 PCB Figure 9.8 Multiple packages double-side mounted on PCB, without heat sink 0 m/s or natural convection 100 ft/min or Al Plate (40 x 20 x 0.6 mm) 0.5 m/s Resin 200 ft/min or 59.2 21.5 43.8 17.0 Solder Fe PCB 1.0 m/s 400 ft/min or 2.0 m/s Figure 9.9 Multiple packages double-side mounted on PCB, with heat sink 1000 ft/min 9 Packaging or 30.7 13.3 5.0 m/s 144 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP PCB (Glass Epoxy) Air Flow 10.16 76.2 1.6 9 Packaging Figure 9.10 Mechanical drawing of the mounting, without heat sink 145 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP Al Resin 0.6 MAX 2.5 Air Flow 10.16 40 20 76.2 1.6 9 Packaging Figure 9.11 Mechanical drawing of the mounting, with heat sink 146 10 JTAG Boundary Scan MITSUBISHI Rev. 1.03 3D-RAM (M5M410092B) ELECTRONIC DEVICE GROUP JTAG Boundary Scan Boundary-Scan Architecture pins in the Serial Test Port. The 3D-RAM provides test features that are partially in compliance with the IEEE Standard 1149.1 Test Access Port and Boundary-Scan Architecture. The on-chip test logic provides a standardized approach for checking the interconnections between different components on the same printed circuit board. Inside the 3D-RAM, the boundary-scan cells for the signal pins are interconnected to form a shiftregister chain around the pads. This path has serial input and output connections with scan clock and control signals. On a printed circuit board, the boundary-scan registers for the individual components can be connected in series to form a single path through the whole board, as illustrated in Figure 10.1. Alternatively, a board design could contain several independent boundary-scan paths. The boundary-scan test logic consists of the boundary-scan register and support logic. 
The test function is accessed through the Test Access Port (TAP). The TAP provides a simple serial interface that allows testing of all signal traces with only five Boundary-scan cell Serial data out Serial test interconnect System interconnect Figure 10.1 A boundary-scannable board design 147 10 JTAG Boundary Scan Serial data in Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP The boundary-scan test logic contains the following elements: instruction codes shifted through the test data input (SCAN_TDI) pin • Two test Data Registers: Bypass Register (BPR) and Boundary-Scan Register (BSR) • Test Access Port (TAP), consisting of input pins SCAN_TMS, SCAN_TCK, SCAN_RST and SCAN_TDI, and an output pin SCAN_TDO The instruction and test data registers are separate shift-register paths connected in parallel and have a common serial data input (SCAN_TDI) and a common serial data output (SCAN_TDO). The data flow is controlled by the TAP controller signals. A block diagram of the boundary-scan architecture is shown in Figure 10.2. • TAP Controller, which interprets the inputs on the test mode select line (SCAN_TMS) and performs the corresponding operations, such as controlling the scan instruction and data registers within the 3D-RAM • Instruction Register (IR), which accepts Boundary Scan Register SCAN_TDI Bypass Register SCAN_TDO Decode Instruction Register 10 JTAG Boundary Scan Clocks and/or Controls TAP Controller SCAN_TMS SCAN_TCK SCAN_RST M1002 Figure 10.2 Block diagram of the boundary scan architecture 148 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP The TAP Controller controller is shown in Figure 10.3. The TAP controller is a synchronous finite state machine controlling the sequence of test logic operations. The TAP controller changes state at the rising edge of the SCAN_TCK pin. The SCAN_TMS pin controls the sequence of the state changes. 
A state diagram for the TAP The TAP controller is initialized either after power up or when SCAN_RST is asserted low. In addition, it can be initialized by applying a high signal level on the SCAN_TMS input for five SCAN_TCK cycles. 1 Test-LogicReset 0 Run-Test/ Idle 1 1 SelectDR-Scan 0 1 0 1 Capture-DR Capture-IR 0 0 0 Shift-DR 1 1 Exit1-DR 0 0 Pause-DR 0 Pause-IR 1 1 0 Exit2-DR Exit2-IR 1 1 Update-DR 0 Figure 10.3 TAP controller state diagram 149 1 Exit1-IR 0 1 0 Shift-IR 1 0 1 SelectIR-Scan Update-IR 1 0 10 JTAG Boundary Scan 0 Rev. 1.03 3D-RAM (M5M410092B) MITSUBISHI ELECTRONIC DEVICE GROUP Test-Logic-Reset State Pause-DR State The Instruction Register is set to the default Bypass instruction, so that normal 3D-RAM operations can proceed without interference. The TAP controller enters this state when it is initialized after power-up or by the reset signal SCAN_RST. Regardless of the original state, the controller enters this state when the SCAN_TMS input is held high for at least five rising SCAN_TCK cycles. The Pause-DR state allows the data shifting through the test data register to be temporarily halted. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state. Exit2-DR State This is a temporary controller state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state. Run-Test/Idle State This is an idle state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state. Update-DR State The boundary-scan cell for output pad has a latch to prevent changes at the parallel output while data is shifting along the boundary-scan chain. 
When the TAP controller is in this state and the Boundary-Scan Register is selected, data is latched from the shift-register path on the falling edge of SCAN_TCK. The data held at the latch does not change other than in this state. All shift-register stages in the selected test data register retain their previous values during this state. The current instruction does not change in this state.

Select-DR-Scan State

This is a temporary controller state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Capture-DR State

The Boundary-Scan Register captures the data present at the 3D-RAM input pins if the current instruction is Extest or Sample/Preload. The Bypass Register does not change. The current instruction does not change in this state.

Shift-DR State

The test data register selected by the current instruction shifts data one stage toward SCAN_TDO on each rising edge of SCAN_TCK. The current instruction does not change in this state.

Exit1-DR State

This is a temporary controller state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Select-IR-Scan State

This is a temporary controller state. The test data register selected by the current instruction retains its previous state. The current instruction does not change in this state.

Capture-IR State

The shift-register contained in the Instruction Register is loaded with the fixed value “1001” on the rising edge of SCAN_TCK. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Shift-IR State

The shift-register contained in the Instruction Register is connected between the SCAN_TDI and SCAN_TDO pins. The shift-register shifts data one stage towards its serial output on each rising edge of SCAN_TCK. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Exit1-IR State

This is a temporary state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Pause-IR State

The Pause-IR state allows the data shifting through the Instruction Register to be temporarily halted. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Exit2-IR State

This is a temporary state. The test data register selected by the current instruction retains its previous value during this state. The current instruction does not change in this state.

Update-IR State

The instruction shifted into the Instruction Register is latched from the shift-register path on the falling edge of SCAN_TCK. The test data register selected by the current instruction retains its previous value during this state. Once the new instruction has been latched, it becomes the current instruction.

Test Data Register

The 3D-RAM contains the two required test data registers: Bypass Register and Boundary-Scan Register. Both registers are connected to the SCAN_TDI and SCAN_TDO pins. When a register is selected by the current instruction, the data in its shift-register is shifted one stage towards the SCAN_TDO output pin on each rising edge of the SCAN_TCK pin.

Bypass Register

The Bypass Register is a one-bit shift-register that provides the minimal length path between the SCAN_TDI and SCAN_TDO pins. When the 3D-RAM is not required to perform scan test operation, this path can be selected to allow rapid movement of test data to and from other components on the board.

Boundary-Scan Register

The Boundary-Scan Register is a shift-register path containing the boundary-scan cells that are connected to all input/output signal pins of the 3D-RAM except the following pins: MCLK, PASS_IN[1:0], PASS_OUT, and VID_CLK. The boundary-scan cells of the PALU_DQ[31:0] pins are implemented as input pins only. Figure 10.4 shows the logical structure of the Boundary-Scan Register. While output cells determine the value of the signal driven on the corresponding pin, input cells only capture data. The output cell has a latch connected to the shift-register for latching the data in Update-DR state. These operations do not affect the normal operations of the device. Data is shifted from the SCAN_TDI pin to the SCAN_TDO pin through the Boundary-Scan Register during scanning. The Boundary-Scan Register can be operated by the Extest and Sample/Preload instructions.

[Figure 10.4: boundary-scan (B/S) cells placed between the pins and the 3D-RAM System Logic. Pin labels in the figure: RESET, PALU_DQ, PALU_DX, PALU_A, PALU_OP, PALU_BE, PALU_EN, DRAM_A, DRAM_BS, DRAM_OP, DRAM_EN, VID_CKE; VID_OE; VID_QSF; VID_Q; HIT.]

Figure 10.4 Logical structure of the Boundary-Scan Register
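The capture, shift, and update behavior described for the Boundary-Scan Register can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class name `BoundaryScanRegister`, its method names, and the four-cell chain are hypothetical and not part of the 3D-RAM specification. Clock edges are not modeled, and every cell is given an update latch even though only output cells have one.

```python
class BoundaryScanRegister:
    """Toy model of a boundary-scan shift path (hypothetical helper, not from the datasheet)."""

    def __init__(self, n_cells):
        # shift[0] is the stage nearest SCAN_TDI, shift[-1] the stage nearest SCAN_TDO.
        self.shift = [0] * n_cells
        # Update latches; in the real device only output cells drive a pin from these.
        self.latch = [0] * n_cells

    def capture_dr(self, pin_values):
        """Capture-DR: load the values seen at the pins into the shift stages."""
        if len(pin_values) != len(self.shift):
            raise ValueError("one pin value per cell is required")
        self.shift = list(pin_values)

    def shift_dr(self, tdi_bit):
        """Shift-DR: one SCAN_TCK cycle moves the data one stage toward SCAN_TDO."""
        tdo_bit = self.shift[-1]
        self.shift = [tdi_bit] + self.shift[:-1]
        return tdo_bit

    def update_dr(self):
        """Update-DR: copy the shift stages into the update latches."""
        self.latch = list(self.shift)


bsr = BoundaryScanRegister(4)
bsr.capture_dr([1, 0, 1, 1])                      # sample the pins
scanned_out = [bsr.shift_dr(bit) for bit in [0, 1, 0, 0]]
latch_before_update = list(bsr.latch)             # shifting alone leaves the latches untouched
bsr.update_dr()                                   # preload the latches
```

In this model the captured values leave through SCAN_TDO starting with the stage nearest SCAN_TDO, and the latches change only in `update_dr`, which mirrors the statement that scanning does not disturb the data held at the latch.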
The boundary-scan cells inside the Boundary-Scan Register are organized in the following order:

SCAN_TDI → DRAM_A0 → DRAM_A1 → DRAM_A2 → DRAM_A3 →
DRAM_A4 → DRAM_A5 → DRAM_EN → DRAM_OP0 →
PALU_A0 → PALU_A1 → PALU_A2 → PALU_EN0 →
PALU_OP0 → PALU_OP1 → PALU_BE0 → PALU_BE1 →
PALU_DX0 → PALU_DX1 → PALU_DQ0 → PALU_DQ1 →
PALU_DQ2 → PALU_DQ3 → PALU_DQ4 → PALU_DQ5 →
PALU_DQ6 → PALU_DQ7 → PALU_DQ8 → PALU_DQ9 →
PALU_DQ10 → PALU_DQ11 → PALU_DQ12 → PALU_DQ13 →
PALU_DQ14 → PALU_DQ15 → PALU_DQ16 → PALU_DQ17 →
PALU_DQ18 → PALU_DQ19 → PALU_DQ20 → PALU_DQ21 →
PALU_DQ22 → PALU_DQ23 → PALU_DQ24 → PALU_DQ25 →
PALU_DQ26 → PALU_DQ27 → PALU_DQ28 → PALU_DQ29 →
PALU_DQ30 → PALU_DQ31 → PALU_DX2 → PALU_DX3 →
PALU_BE2 → PALU_BE3 → PALU_OP2 → PALU_WE →
PALU_EN1 → PALU_A3 → PALU_A4 → PALU_A5 →
DRAM_OP1 → DRAM_OP2 → DRAM_A6 → DRAM_A7 →
DRAM_A8 → DRAM_BS0 → DRAM_BS1 → RESET →
VID_Q8 → VID_Q9 → VID_Q10 → VID_Q11 →
VID_Q12 → VID_Q13 → VID_Q14 → VID_Q15 →
VID_QSF → VID_CKE → VID_OE → HIT →
VID_Q0 → VID_Q1 → VID_Q2 → VID_Q3 →
VID_Q4 → VID_Q5 → VID_Q6 → VID_Q7 → SCAN_TDO

Instruction Register

The Instruction Register (IR) allows instructions to be serially shifted into the 3D-RAM through the SCAN_TDI pin. The instruction selects the particular test to be performed, the test data register to be accessed, or both. The instruction register is a four-bit wide shift-register with a parallel latch. The most significant bit is connected to the SCAN_TDI pin and the least significant bit is connected to the SCAN_TDO pin. On entering the Capture-IR controller state, the Instruction Register is loaded with the default instruction “1001”, which is the Bypass instruction. Instructions are shifted into the Instruction Register on the rising edge of the SCAN_TCK pin while the TAP controller is in the Shift-IR state.

The 3D-RAM supports all three mandatory boundary-scan instructions, namely Bypass, Sample/Preload, and Extest. Table 10.1 lists the 3D-RAM boundary-scan instruction codes.

Table 10.1 Boundary-Scan instruction codes

Instruction Code    Instruction Name
0000                Extest
0001                Bypass
0010                Bypass
0011                Bypass
0100                Sample/Preload
0101                Bypass
0110                Bypass
0111                Bypass
1000                Bypass
1001                Bypass
1010                Bypass
1011                Bypass
1100                Bypass
1101                Bypass
1110                Bypass
1111                Bypass

Extest Instruction

The instruction code is “0000”. The Extest instruction allows testing of board interconnections. The Extest instruction selects the Boundary-Scan Register to be connected between the SCAN_TDI and SCAN_TDO pins. Two functions are performed when the Extest instruction is selected:

• In the Capture-DR controller state, all signals received at the 3D-RAM input pins are loaded into the Boundary-Scan Register on the rising edge of the SCAN_TCK pin. This is equivalent to the Sample operation in the Sample/Preload instruction.

• In the Update-DR controller state, the data held in the shift-register stage of the output cell is latched and driven to the 3D-RAM output pins on the falling edge of the SCAN_TCK pin.

Bypass Instruction

The instruction codes for the Bypass instruction are any codes except “0000” (for Extest) and “0100” (for Sample/Preload). The Bypass instruction selects the Bypass Register to be connected between the SCAN_TDI and SCAN_TDO pins. The Bypass Register contains a single shift-register stage and is used to provide a minimum length serial path between the SCAN_TDI and the SCAN_TDO pins when no scan test operation of the 3D-RAM is required. This allows more rapid movement of test data to and from other components on the board.

Due to the pull-up resistor on the SCAN_TDI input, an open circuit fault in the board level test data path will cause the Bypass Register to be selected following an instruction scan cycle. This was done to prevent any unwanted interference with the normal operation of the 3D-RAM.

Sample/Preload Instruction

The instruction code is “0100”. The Sample/Preload instruction allows the scanning of the Boundary-Scan Register without interference to the normal operation of the 3D-RAM. As suggested by the instruction name, the Sample/Preload instruction can be used to perform two functions:

• SAMPLE is performed in the Capture-DR controller state. All signals received at the 3D-RAM input pins are loaded into the Boundary-Scan Register on the rising edge of the SCAN_TCK pin.

• PRELOAD is performed in the Update-DR controller state. The data held in the shift-register stage of the output cell is latched on the falling edge of the SCAN_TCK pin.

VID_OE Boundary-Scan Cell

The VID_OE pin controls the tri-state buffer of the VID_Q bus. Therefore, its boundary-scan cell configuration is different from a normal input pin. The functions performed on this cell for the Sample/Preload and Extest instructions are summarized below.

• In the Capture-DR controller state, the signal received at the VID_OE pin is loaded into the shift-register on the rising edge of the SCAN_TCK pin.

• In the Update-DR controller state, the data held in the shift-register is latched on the falling edge of the SCAN_TCK pin. If the instruction is Extest, the latched VID_OE data will control the tri-state buffer of the VID_Q bus.

11 Formal Specification of Operations

This chapter specifies exactly which bits are moved for many types of operations.
It uses a syntax derived from C and Verilog to specify exactly which bits are copied for each operation.

Elements

bit DA[4][257][10240]    DRAM Array
bit SA[4][10240]         Sense Amplifiers
bit RAL[4][9]            Row Address Latch
bit VB[2][640]           Video Buffer
bit VD[16]               Video Data pins
bit VC[7]                Video Counter
bit VM[1]                Video Mode
bit SRAM[8][256]         Static RAM
bit DT[8][32]            Dirty Tag
bit PM[32]               Plane Mask register
bit DQ[32]               Pixel ALU data pins
bit BE[4]                Byte Enable pins
bit daddr[9]             DRAM address
bit paddr[6]             Pixel ALU address

Bit Ordering of Elements

[Figure 11.1: bit-numbering diagrams for a Block, a Page, the PALU_DQ pins, the PALU_BE pins, the Plane Mask, the Dirty Tags, the Video Buffer, and the VID_Q pins.]

Figure 11.1 Bit orderings of several elements

Blocks within a Page

 0  4  8 12 16 20 24 28 32 36
 1  5  9 13 17 21 25 29 33 37
 2  6 10 14 18 22 26 30 34 38
 3  7 11 15 19 23 27 31 35 39

Words within a Block

 0  1  2  3  4  5  6  7

Figure 11.2 Orderings of words and blocks in a page

Access Page

ACCESS PAGE(bit bank[2], bit daddr[9])
{
    bit i[4];
    for(i = 0; i < 9; i++)
        RAL[bank][i] <- daddr[i];
    bit j[14];
    for(j = 0; j < 10240; j++)
        SA[bank][j] <- DA[bank][daddr][j];
}

Duplicate Page

DUPLICATE PAGE(bit bank[2], bit daddr[9])
{
    bit i[4];
    for(i = 0; i < 9; i++)
        RAL[bank][i] <- daddr[i];
    bit j[14];
    for(j = 0; j < 10240; j++)
        DA[bank][daddr][j] <- SA[bank][j];
}
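As an informal illustration of the two page operations above, the following Python sketch models the DA, SA, and RAL elements and shows that Duplicate Page writes the current sense-amplifier contents into whichever row is addressed. The class name `DramBanks`, its attribute names, and the lazy page allocation are assumptions made for this example only. Rows are stored one bit per byte for simplicity, and timing, precharge, and address-range checking are not modeled.

```python
PAGE_BITS = 10240   # bits per page, per the SA[4][10240] declaration
NUM_BANKS = 4


class DramBanks:
    """Toy model of DA, SA and RAL (hypothetical helper, not from the datasheet)."""

    def __init__(self):
        self.da = {}                                              # (bank, row) -> page bits
        self.sa = [bytearray(PAGE_BITS) for _ in range(NUM_BANKS)]
        self.ral = [0] * NUM_BANKS

    def _page(self, bank, daddr):
        # Rows are allocated on first use so the model stays small.
        return self.da.setdefault((bank, daddr), bytearray(PAGE_BITS))

    def access_page(self, bank, daddr):
        """ACCESS PAGE: latch the row address, copy the row into the sense amplifiers."""
        self.ral[bank] = daddr
        self.sa[bank][:] = self._page(bank, daddr)

    def duplicate_page(self, bank, daddr):
        """DUPLICATE PAGE: latch the row address, copy the sense amplifiers into the row."""
        self.ral[bank] = daddr
        self._page(bank, daddr)[:] = self.sa[bank]


dram = DramBanks()
dram._page(1, 5)[7] = 1        # pretend bit 7 of row 5 in bank 1 holds a 1
dram.access_page(1, 5)         # row 5 -> sense amplifiers
dram.duplicate_page(1, 6)      # sense amplifiers -> row 6
```

After these calls, row 6 of bank 1 holds a copy of row 5 and the Row Address Latch of bank 1 points at row 6.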
Precharge Bank

PRECHARGE(bit bank[2])
{}

Read Block

READ BLOCK(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        SRAM[daddr[8..6]][i] <- SA[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]];
    bit j[5];
    for(j = 0; j < 32; j++)
        DT[daddr[8..6]][j] <- 0;
}

Masked Write Block

MASKED WRITE BLOCK(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        if(PM[i[4..0]] && DT[daddr[8..6]][{i[4..3],i[7..5]}])
        {
            SA[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <- SRAM[daddr[8..6]][i];
            DA[bank][RAL[bank]][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <- SRAM[daddr[8..6]][i];
        }
}

Unmasked Write Block

UNMASKED WRITE BLOCK(bit bank[2], bit daddr[9])
{
    bit i[8];
    for(i = 0; i < 256; i++)
        if(DT[daddr[8..6]][{i[4..3],i[7..5]}])
        {
            SA[bank][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <- SRAM[daddr[8..6]][i];
            DA[bank][RAL[bank]][daddr[1..0]*2560 + i[7..6]*640 + daddr[5..2]*64 + i[5..0]] <- SRAM[daddr[8..6]][i];
        }
}

Video Transfer

VIDEO TRANSFER(bit bank[2], bit daddr[9])
{
    bit i[10];
    for(i = 0; i < 640; i++)
        VB[bank[0]][i] <- SA[bank][640*daddr[3..0] + i];
    if(daddr[8])
    {
        VC[5..0] <- 0;
        VC[6] <- bank[0];
        VM[0] <- daddr[7];
    }
}

Video Cycle

VIDEO CYCLE(bit enable[1], bit voe[1])
{
    if(voe)
    {
        bit i[4];
        for(i = 0; i < 16; i++)
            VD[i] <- VB[VC[6]][{VC[5..1],(VC[0]^VM[0])}*16 + i];
    }
    if(enable)
    {
        if(VC[5..0] == 39)
        {
            VC[5..0] <- 0;
            VC[6] <- ~VC[6];
        }
        else
        {
            VC[5..0] <- VC[5..0] + 1;
        }
    }
}

Data Read

DATA READ(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]])
            DQ[i] <- SRAM[paddr[5..3]][paddr[2..0]*32 + i];
}
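The sense-amplifier index expression shared by Read Block and the two write-block operations can be checked with a short Python sketch. The helper names `bits` and `sa_index` are hypothetical. The sketch also assumes that `daddr[5..2]` only takes the values 0 through 9, which is inferred here from the forty blocks per page shown in Figure 11.2 rather than stated in the specification above.

```python
def bits(value, hi, lo):
    """Return value[hi..lo] as an integer, mirroring the bit-slice notation of the spec."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def sa_index(daddr, i):
    """Sense-amplifier bit selected by block address daddr and block bit i (0-255)."""
    return (bits(daddr, 1, 0) * 2560
            + bits(i, 7, 6) * 640
            + bits(daddr, 5, 2) * 64
            + bits(i, 5, 0))


# Enumerate the 40 block addresses of a page: daddr[5..2] in 0..9, daddr[1..0] in 0..3.
block_addresses = [(col << 2) | row for col in range(10) for row in range(4)]
covered = sorted(sa_index(d, i) for d in block_addresses for i in range(256))
```

Under that assumption, the forty 256-bit blocks cover each of the 10,240 sense amplifiers of a bank exactly once, so every page bit belongs to exactly one block.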
Stateless Initial Data Write

STATELESS INITIAL DATA WRITE(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]])
            SRAM[paddr[5..3]][paddr[2..0]*32 + i] <- DQ[i];
    bit j[5];
    for(j = 0; j < 32; j++)
    {
        if (CDS[0] == 0) /* (8,8,8,8) normal mode */
        {
            if((paddr[2..0] == j[2..0]) && BE[j[4..3]])
                DT[paddr[5..3]][j] <- 1;
            else
                DT[paddr[5..3]][j] <- 0;
        }
        else /* (4,4,4,4) 16-bit color mode */
        {
            if(((BE[3] || BE[2]) == 1) && ((BE[1] || BE[0]) == 1))
                ERROR("illegal byte enable combination");
            else if((paddr[2..0] == j[2..0]) && j[4])
                DT[paddr[5..3]][j] <- (BE[3] || BE[1]);
            else if((paddr[2..0] == j[2..0]) && !j[4])
                DT[paddr[5..3]][j] <- (BE[2] || BE[0]);
            else
                DT[paddr[5..3]][j] <- 0;
        }
    }
}

Stateless Normal Data Write

STATELESS NORMAL DATA WRITE(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]])
            SRAM[paddr[5..3]][paddr[2..0]*32 + i] <- DQ[i];
    bit j[5];
    for(j = 0; j < 32; j++)
    {
        if (CDS[0] == 0) /* (8,8,8,8) normal mode */
        {
            if((paddr[2..0] == j[2..0]) && BE[j[4..3]])
                DT[paddr[5..3]][j] <- 1;
        }
        else /* (4,4,4,4) 16-bit color mode */
        {
            if(((BE[3] || BE[2]) == 1) && ((BE[1] || BE[0]) == 1))
                ERROR("illegal byte enable combination");
            else if((paddr[2..0] == j[2..0]) && j[4])
                DT[paddr[5..3]][j] <- (BE[3] || BE[1]);
            else if((paddr[2..0] == j[2..0]) && !j[4])
                DT[paddr[5..3]][j] <- (BE[2] || BE[0]);
        }
    }
}

Replace Dirty Tag

REPLACE DIRTY TAG(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]])
            if (i == paddr[2..0] + 8*i[4..3])
                DT[paddr[5..3]][i] <- DQ[i];
}
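The Dirty Tag indexing used by the stateless writes and by Replace Dirty Tag can be illustrated with a small Python sketch. It covers only the (8,8,8,8) mode, where the tag bit for byte b of word w is w + 8*b. The function names are hypothetical, and `be` is assumed to be a four-element list standing in for the BE[3..0] pins with `be[0]` as BE[0].

```python
def dirty_tag_bit(word, byte):
    """Dirty Tag bit for byte `byte` (0-3) of word `word` (0-7) within a block."""
    return word + 8 * byte


def initial_write_tags(word, be):
    """Stateless Initial Data Write: every tag bit of the block is rewritten."""
    return [1 if (j & 7) == word and be[j >> 3] else 0 for j in range(32)]


def normal_write_tags(dt, word, be):
    """Stateless Normal Data Write: enabled bytes set their bit, all other bits are kept."""
    return [1 if (j & 7) == word and be[j >> 3] else dt[j] for j in range(32)]


dt = initial_write_tags(word=3, be=[1, 0, 0, 1])      # bytes 0 and 3 of word 3
dt = normal_write_tags(dt, word=5, be=[0, 1, 0, 0])   # byte 1 of word 5
dirty = [j for j in range(32) if dt[j]]
```

The initial write clears every tag bit that it does not set, whereas the normal write only adds to the existing tags, which is the difference between the two specifications above.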
OR Dirty Tag

OR DIRTY TAG(bit paddr[6])
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]] && DQ[i])
            if (i == paddr[2..0] + 8*i[4..3])
                DT[paddr[5..3]][i] <- DQ[i];
}

Write Plane Mask Register

WRITE PLANE MASK REGISTER()
{
    bit i[5];
    for(i = 0; i < 32; i++)
        if(BE[i[4..3]])
            PM[i] <- DQ[i];
}

12 Appendix A

Glossary

Some of the terms used in this document may be unfamiliar to the reader, and some may have a specific meaning in the context of this document. For convenience, these terms are collected here and provided with a brief definition and a reference to the paragraphs where the terms are explained in greater detail. This glossary list is not intended to be exhaustive.

3D-RAM

An innovative 10-Mbit cached dual-port CMOS memory device that dramatically improves the performance of a three-dimensional computer graphics system with on-chip support for the Z-buffer hidden surface removal algorithm and for full blending and logical raster operations, achieving a peak bandwidth of 14.6 Gbytes/s and a sustained bandwidth of 400 Mbytes/s (for the -10 speed grade).

Blending

A computer graphics operation for simulating the visual effect of overlapping objects with the foreground objects being partially transparent. An example of a blending equation is overall color = (a) x (color of foreground object) + (1 - a) x (color of background object), where a is the percentage of light transmitted through the medium of the foreground object. Each of the four 8-bit Blend units in the Pixel ALU can perform one of the two multiplications and then the addition, provided that the product of the other multiplication is supplied by the rendering controller. (Pages 10 and 26)

Block

A unit of memory organized into eight 32-bit words. This is the unit of data movement between a DRAM bank and the Pixel Buffer. (Pages 4 and 21)

Byte

A unit of memory containing 8 bits of data. This is the unit of data operation for the four Blend units in the Pixel ALU. The rendering controller can enable or disable the writing (to 3D-RAM) or reading (from 3D-RAM) of the individual bytes in a word. (Pages 10, 15, and 26)

Color Buffer

The collection of memory that contains all color bits of all pixels to be displayed on the screen. Because the alpha information is needed for blending operations in 3D-RAM, the alpha data should also be stored together with the color data. It is also common to store overlay information together with the color information to allow fast display of 2D objects by data multiplexing in a RAMDAC chip. (Chapter 6)

Dirty Tag

A 32-bit memory in the Pixel Buffer, indicating which of the 32 bytes in the corresponding 256-bit block in the SRAM cache have been updated by the Pixel ALU since the data was transferred from the DRAM array. A “1” in a bit of a Dirty Tag indicates that the corresponding byte in the SRAM cache is newer than the data in the DRAM array. There are eight such Dirty Tags in the Pixel Buffer. (Page 22)

In 2D rendering, this feature may also be used for color expansion from a bit to a byte to accelerate drawing of many pixels of the same color. Two Pixel ALU operations, namely “Replace Dirty Tag” and “OR Dirty Tag,” may be used to facilitate this application of the Dirty Tags. (Page 23)

In the (4, 4, 4, 4) 16-bit color mode, the setting of the dirty tag bits is the same as in the (8, 8, 8, 8) 32-bit color mode in the sense that if a byte of data is updated, then the corresponding dirty tag bit is set. However, since the PALU_BE[3:0] pins have different meanings in the (4, 4, 4, 4) 16-bit color mode from the (8, 8, 8, 8) 32-bit color mode, the specific dirty tag bits that are set are different in the two color modes. The mappings for “Replace Dirty Tag” and “OR Dirty Tag” are the same in both modes. (Page 47)

Double Buffer

Two color buffers of identical size, with one of them shifting out video data toward the display screen (usually through a RAMDAC chip), while the other is being updated with new pixel data by the rendering controller. (Pages 99 and 101)

DRAM Bank

One of the four 2.5-Mbit DRAM banks in a 3D-RAM chip. Each bank has 10,240 sense amplifiers, which function as a level-two pixel cache, and 257 pages of 10,240 bits each. All four DRAM banks are connected with a common 256-bit Global Bus to interface with the Pixel Buffer. (Page 6)

Frame Buffer

A collection of memory that contains all bits of data for all pixels in the display screen and, in some systems where the frame buffer is large enough to hold more than the pixel data for the specific display resolutions, extra pixel data temporarily stored outside the memory area for display pixels. The data for a pixel may include bits for the color value of each of the R, G, and B color components, bits for the Z (or depth) value, bits for the window ID, and bits for some other auxiliary functions, such as overlay; only the color bits are actually displayed on the screen, while the other bits are stored with the color bits for faster graphics processing. (Chapter 6)

Global Bus

A 50-MHz (for the -10 speed grade) 256-bit data bus connecting the four DRAM banks and the Pixel Buffer. (Page 8)

Magnitude Compare

A Pixel ALU operation that compares the incoming 32-bit data with the 32-bit data stored in the frame buffer. Each bit of the 32-bit magnitude comparison may be masked by setting a “1” in the corresponding bit of the Magnitude Mask register. The result of one of eight possible magnitude comparison tests can govern whether the new data is written into the frame buffer. Most commonly, as part of the Z-buffer hidden surface removal algorithm, the magnitude comparison tests are performed on the Z value of the existing pixel against that of a new pixel intended for the same screen location. (Pages 10, 51 and 61)

Match Compare

A Pixel ALU operation that compares the incoming 32-bit data with the 32-bit data stored in the frame buffer. Each bit of the 32-bit match comparison may be masked by setting a “1” in the corresponding bit of the Match Mask register. The result of one of four possible match comparison tests can govern whether the new data is written into the frame buffer. (Pages 10, 51 and 61)

Page

A unit of memory containing forty 256-bit blocks. Three DRAM operations (PRE, ACP, and DUP) operate on the entire 10,240-bit page in one command. (Pages 4 and 75)

Picking

A computer graphics operation to select a drawn object on the display screen. A familiar example in 2D graphics is selecting an icon by clicking a mouse button. In 3D graphics, the selection is complicated by the Z values of objects and the size and locations (X, Y, Z) of the 3D selection cursor, and is supported by the on-chip Picking Logic. (Pages 12 and 55)

Picking Logic

A Pixel ALU function that provides to the rendering controller a HIT flag, indicating if a pixel falls within the 3D selection cursor. (Pages 12 and 55)

Pixel ALU

An on-chip 100-MHz (for the -10 speed grade) processing unit with a 7-stage pipeline that performs destination blending/logical raster operation, magnitude comparison, and match comparison all in parallel, thus converting the read-modify-write interface with the rendering controller to a write-mostly one. (Pages 9, 25 and 56)

Pixel Buffer

The triple-port 2,048-bit SRAM that serves as a level-one pixel cache in 3D-RAM, with two 100-MHz (for the -10 speed grade) 32-bit buses interfacing with the Pixel ALU and a 50-MHz (for the -10 speed grade) 256-bit bus interfacing with the DRAM arrays. Also included in the Pixel Buffer are eight 32-bit Dirty Tags and a 32-bit Plane Mask. (Pages 7 and 21)

Plane Mask

A 32-bit register that affects both the Stateful Data Writes to the Pixel Buffer and the Masked Block Writes (MBW) to the DRAM arrays. The effect is simultaneous on both types of operations when they are performed concurrently by the Pixel ALU port and the DRAM port, respectively. With respect to the Pixel Buffer, the Plane Mask facilitates the “write-per-bit” function on the 32-bit result of the ROP/Blend units. For MBW, the effect of the Plane Mask on the (32-bit) word 0 is simply extended to words 1 through 7. (Pages 23 and 58)

ROP

An abbreviation for raster operation. There is a total of sixteen standard logical raster operations, including AND, OR, and XOR. (Pages 10, 26 and 60)

Stateful Data Write

A pixel data operation of the Pixel ALU. The word “stateful” refers to the controls of the Dual Compare units and the ROP/Blend units of the Pixel ALU on whether and what data is to be written into the Pixel Buffer. There are two types of Stateful Data Write: namely, Stateful Initial Data Write and Stateful Normal Data Write. (Page 66)

Stateless Data Write

A pixel data operation of the Pixel ALU. The word “stateless” refers to the straight pass-through of pixel data from the 3D-RAM inputs to the Pixel Buffer, with the data write unaffected by the ROP/Blend units and the Dual Compare units of the Pixel ALU. There are two types of Stateless Data Write: namely, Stateless Initial Data Write and Stateless Normal Data Write. (Page 72)

Stencil or Stenciling

Stenciling applies a test that compares a reference value with the value stored at a pixel in the stencil buffer and then performs two tasks based on the results of this stencil test and the depth test: (1) operates on the update of the stencil data and (2) controls the write enable of the color buffer and the depth buffer. The most common use of the stencil function is to generate an irregularly shaped region of some desirable color pattern, such as a decal. Stencil may also be applied to produce stipples or screen-door patterns, which are sometimes employed to achieve a transparency effect without resorting to the expensive multipliers and adders in the blending function. Stencil is also useful in hidden surface removal. (Pages 40 and 66)

Video Buffer

One of the two serial access video buffers of 40x16 bits, which alternate every forty VCLK cycles to shift video data out. One video buffer is connected with two DRAM banks, and all 640 bits of a video buffer are loaded by a single DRAM command (VDX). The video data output rate is 16 bits per 14-ns cycle. (Pages 7 and 78)

Word

A unit of memory representing four bytes. This is the unit of data movement between the Pixel ALU and the Pixel Buffer. (Pages 4 and 21)

Z Buffer

The collection of memory that contains all Z values of all pixels to be displayed on the screen. (Pages 99 and 101)