2016
DFPMU-DP IP Core
Floating Point Coprocessor - Double Precision
v. 3.07

COMPANY OVERVIEW

Digital Core Design is a leading IP Core provider and a System-on-Chip design house. The company was founded in 1999 and has focused on IP Core architecture improvements from the very beginning. Our innovative, silicon-proven solutions have been employed by over 300 customers, with more than 500 licenses sold to companies such as Intel, Siemens, Philips, General Electric, Sony and Toyota. Based on more than 70 different architectures, ranging from serial interfaces to advanced microcontrollers and SoCs, we design solutions tailored to your needs.

IP CORE OVERVIEW

The DFPMU-DP is a Floating Point Coprocessor designed to assist the CPU in performing floating point mathematical computations. It directly replaces C software functions with equivalent, very fast hardware operations, which significantly accelerates system performance. It requires no programming, and therefore no modifications to the main software. Everything is done automatically during software compilation by the DFPMU-DP C driver. The DFPMU-DP was designed to operate with DCD's DP8051, but can also work with any other 8-, 16- or 32-bit processor. Drivers for all popular 8051 C compilers are delivered together with the DFPMU-DP package.

The DFPMU-DP uses specialized CORDIC and standard algorithms to compute math functions. It supports addition, subtraction, multiplication, division, square root, comparison, and the trigonometric functions sine, cosine, tangent and arctangent. It has built-in instructions for conversion from integer types to floating point types and vice versa. The input number format is compliant with the IEEE-754 standard. The DFPMU-DP supports double and single precision real numbers and 8-bit, 16-bit, 32-bit and 52-bit integers.

Each floating point function can be turned on or off at configuration level, providing flexible scalability of the DFPMU-DP module. This saves silicon area and provides the exact configuration required by a given application. The DFPMU-DP is a technology independent design, which can be implemented in a variety of process technologies.

APPLICATIONS

● Math coprocessors
● DSP algorithms
● Embedded arithmetic coprocessor
● Fast data processing & control

KEY FEATURES

● Direct replacement for C double and float software functions, such as: +, -, *, /, ==, !=, >=, <=, <, >
● Configurability of all available functions
● C interface supplied for all popular compilers: GNU C/C++, 8051 compilers
● No programming required
● IEEE-754 double precision real format support – double type
● IEEE-754 single precision real format support – float type
● 8-bit, 16-bit, 32-bit and 52-bit integer formats supported – integer types
● Flexible arguments and result registers location
● Performs the following functions:
  ○ FADD, FSUB – addition, subtraction
  ○ FMUL, FDIV – multiplication, division
  ○ FSQRT – square root
  ○ FXAM – examine input data
  ○ FUCOM – comparison
  ○ FSIN, FCOS – sine, cosine
  ○ FTAN – tangent
  ○ FATAN – arctangent
  ○ FCLD, FILD – 8-bit, 16-bit integer to double
  ○ FLLD, FELD – 32-bit, 52-bit integer to double
  ○ FCST, FIST – double to 8-bit, 16-bit integer
  ○ FLST, FEST – double to 32-bit, 52-bit integer
  ○ FFLD – float to double
  ○ FFST – double to float
● Built-in exception routines
● Masks each exception indicator:
  ○ Precision loss (PE)
  ○ Underflow result (UE)
  ○ Overflow result (OE)
  ○ Invalid operand (IE)
  ○ Division by zero (ZE)
  ○ Denormal operand (DE)
● Fully configurable
● Fully synthesizable, static synchronous design with no internal tri-states

DELIVERABLES

♦ Source code:
  ● VHDL Source Code or/and
  ● VERILOG Source Code or/and
  ● Encrypted, or plain text EDIF
♦ VHDL & VERILOG test bench environment
  ● Active-HDL automatic simulation macros
  ● ModelSim automatic simulation macros
  ● Tests with reference responses
♦ Technical documentation
  ● Installation notes
  ● HDL core specification
  ● Datasheet
♦ Synthesis scripts
♦ Example application
♦ Technical support
  ● IP Core implementation support
  ● 3 months maintenance
    ○ Delivery of the IP Core and documentation updates, minor and major version changes
    ○ Phone & email support

Copyright © 1999-2016 DCD – Digital Core Design. All Rights Reserved. All trademarks mentioned in this document are the property of their respective owners.

SYMBOL

[Figure: DFPMU-DP pin symbol, including datai(31:0), addr(4:2) and we]

LICENSING

Comprehensible and clearly defined licensing methods without royalty-per-chip fees make the use of our IP Cores easy and simple.

Single-Site license option – dedicated to small and medium-sized companies that run their business in one place.

Multi-Site license option – dedicated to corporate customers who operate at several locations. The licensed product can be used in selected company branches.

In all cases the number of IP Core instantiations within a project and the number of manufactured chips are unlimited. The license is royalty-per-chip free. There are no restrictions regarding the time of use.

There are two formats of the delivered IP Core:
● VHDL or Verilog RTL synthesizable source code, called HDL Source code
● FPGA EDIF/NGO/NGD/QXP/VQM, called Netlist

UNITS SUMMARY

Mantissa – performs operations on the mantissa part of a number. The addition, subtraction, multiplication, division, square root, comparison and conversion operations are executed in this module. It contains the mantissa and work registers.

CORDIC – performs trigonometric operations on input data. The sine, cosine, tangent and arctangent operations are executed in this module. It contains three work registers.

Exponent – performs operations on the exponent part of a number. The addition, subtraction, shifting, comparison and conversion operations are executed in this module. It contains the exponent and work registers.

Align – analyzes the input numbers for compliance with the IEEE-754 standard. Information about the data classes is passed as the result to the appropriate internal module.

Shifter – performs mantissa shifting during normalization and denormalization operations. Information about shifted-out bits is stored for the rounding process.

Control Unit – manages execution of all instructions and the internal operations required to execute a particular function.

Interface – provides the interface between the external device and the DFPMU-DP internal 32-bit modules. It contains data, control and status registers. It can be configured to work with 8-, 16- and 32-bit processors.

PINS DESCRIPTION

PIN              TYPE    DESCRIPTION
clk              Input   Global system clock
rst              Input   Global system reset
cs               Input   Chip select for read/write
datai[31:0] (1)  Input   Data bus input
addr[4:2] (2)    Input   Register address to read/write
we               Input   Data write enable
datao[31:0] (1)  Output  Data bus output
irq              Output  Interrupt request indicator

(1) – the data bus can be configured as 8-, 16- or 32-bit, depending on the processor's bus size
(2) – the address bus is aligned to work with 8-bit (3:0), 16-bit (3:1) or 32-bit (4:2) processors

BLOCK DIAGRAM

[Figure: DFPMU-DP block diagram showing the Interface, Align, Mantissa, Exponent, Shifter, CORDIC and Control Unit blocks with the datai(31:0), datao(31:0), addr(4:2), we, cs, irq, clk and rst signals]

IMPROVEMENTS

The performance of the DFPMU-DP floating point instructions has been compared with the standard C library functions delivered with every commercial C compiler. Each program was executed in the same system environment. The number of clock periods was measured from loading the input data into the work registers to storing the output result after the operation. Results are shown in the table below. Improvement has been computed as (NIOS-II CLK) divided by (NIOS-II+DFPMU-DP CLK), where each value is the number of clock periods required to execute a particular instruction.
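
The improvement figure defined above is a plain ratio of two clock counts. The following is a minimal sketch in C, assuming both counts have already been measured for the same instruction. The function name `improvement_factor` and its parameter names are illustrative only and are not part of the DFPMU-DP package.

```c
/*
 * Improvement factor as defined in the text:
 *   (clock periods on the CPU alone) / (clock periods with the coprocessor).
 * Names are hypothetical; they do not come from the DFPMU-DP driver.
 */
double improvement_factor(unsigned long clk_cpu_only,
                          unsigned long clk_with_coprocessor)
{
    /* A zero count means no valid measurement; avoid dividing by zero. */
    if (clk_with_coprocessor == 0UL)
        return 0.0;

    return (double)clk_cpu_only / (double)clk_with_coprocessor;
}
```

For example, with made-up measurements of 1200 clock periods on the CPU alone and 100 clock periods with the coprocessor, `improvement_factor(1200UL, 100UL)` evaluates to 12.0.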

IEEE-754 FP Instruction       Improvement
Addition                      12.0
Subtraction                   11.7
Multiplication                10.6
Division                      15.0
Square Root                   21.5
Sine                          52.0
Cosine                        60.8
Tangent                       97.9
Arctangent                    78.7
Average speed improvement:    38.3

More details are available in the core documentation.

[Figure: bar chart of speed improvement for 32-bit NIOS-II, NIOS-II+DFPMU-DP (arithmetic), NIOS-II+DFPMU-DP (trigonometric) and NIOS-II+DFPMU-DP (overall)]

The following table summarizes the performance of the 32-bit NIOS-II+DFPMU-DP compared with the 32-bit NIOS-II.

Device                           Improvement
NIOS-II                          1.0
NIOS-II+DFPMU (arithmetic)       14.1
NIOS-II+DFPMU (trigonometric)    72.4
NIOS-II+DFPMU (overall)          38.8

PERFORMANCE

The following table summarizes the core area and performance in XILINX® devices after Place & Route (all key features included):

Device        Speed grade   Slices/LUTs              Fmax
SPARTAN-3E    -5            4590/8400, 16xMULT18     71 MHz
SPARTAN-6     -3            1885/5900, 16xDSP48      72 MHz
VIRTEX-4      -12           4390/8100, 16xDSP48      100 MHz
VIRTEX-5      -3            2250/6550, 14xDSP48      125 MHz
VIRTEX-6      -2            2170/6000, 14xDSP48      100 MHz

Core performance in XILINX® devices

CONTACT

Digital Core Design
Headquarters: Wroclawska 94, 41-902 Bytom, POLAND
e-mail: [email protected]
tel.: 0048 32 282 82 66
fax: 0048 32 282 74 37
Distributors: Please check: http://dcd.pl/sales