2016
DFPAU-DP IP Core
Floating Point Arithmetic Coprocessor - Double Precision
v. 3.07

COMPANY OVERVIEW

Digital Core Design is a leading IP Core provider and a System-on-Chip design house. The company was founded in 1999 and has focused on IP Core architecture improvements from the very beginning. Our innovative, silicon-proven solutions have been employed by over 300 customers, with more than 500 licenses sold to companies such as Intel, Siemens, Philips, General Electric, Sony and Toyota. Drawing on more than 70 different architectures, from serial interfaces to advanced microcontrollers and SoCs, we design solutions tailored to your needs.

IP CORE OVERVIEW

The DFPAU-DP is a Floating Point Arithmetic Coprocessor designed to assist the CPU in performing floating point arithmetic computations. It directly replaces C software functions with equivalent, very fast hardware operations, which significantly accelerates system performance. It requires no programming and therefore no modifications to the main software: the DFPAU-DP C driver does everything automatically during software compilation.

The DFPAU-DP was designed to operate with DCD’s DP8051, but it can also operate with any other 8-, 16- or 32-bit processor. Drivers for all popular 8051 C compilers are delivered together with the DFPAU-DP package.

The DFPAU-DP uses specialized algorithms to compute math functions. It supports addition, subtraction, multiplication, division, square root and comparison. It has built-in instructions for conversion from integer types to floating point types and vice versa. The input number format conforms to the IEEE-754 standard. The DFPAU-DP supports double and single precision real numbers and 8-bit, 16-bit and 32-bit integers. Each floating point function can be turned on or off at configuration level, providing flexible scalability of the DFPAU-DP module.
This saves silicon area and provides exactly the configuration that a given application requires.

SYMBOL

[Figure: DFPAU-DP symbol with the pins listed under PINS DESCRIPTION]

KEY FEATURES

● Direct replacement for C double and float software functions, such as: +, -, *, /, ==, !=, >=, <=, <, >
● Configurability of all available functions
● C interface supplied for all popular compilers: GNU C/C++, 8051 compilers
● No programming required
● IEEE-754 double precision real format support – double type
● IEEE-754 single precision real format support – float type
● 8-bit, 16-bit, 32-bit and 52-bit integer formats supported – integer types
● Flexible arguments and result registers location
● Performs the following functions:
  ○ FADD, FSUB – addition, subtraction
  ○ FMUL, FDIV – multiplication, division
  ○ FSQRT – square root
  ○ FXAM – examine input data
  ○ FUCOM – comparison
  ○ FCLD, FILD – 8-bit, 16-bit integer to double
  ○ FLLD, FELD – 32-bit, 52-bit integer to double
  ○ FCST, FIST – double to 8-bit, 16-bit integer
  ○ FLST, FEST – double to 32-bit, 52-bit integer
  ○ FFLD – float to double
  ○ FFST – double to float
● Exceptions built-in routines
● Masks each exception indicator:
  ○ Precision lack PE
  ○ Underflow result UE
  ○ Overflow result OE
  ○ Invalid operand IE
  ○ Division by zero ZE
  ○ Denormal operand DE
● Fully configurable
● Fully synthesizable, static synchronous design with no internal tri-states

LICENSING

Comprehensible and clearly defined licensing methods, without royalty-per-chip fees, make using our IP Cores easy and simple.

Single-Site license option – dedicated to small and medium-sized companies that run their business in one place.

Multi-Site license option – dedicated to corporate customers who operate at several locations. The licensed product can be used in selected company branches.

In all cases the number of IP Core instantiations within a project and the number of manufactured chips are unlimited. The license is royalty-per-chip free. There are no restrictions regarding the time of use.
There are two formats of the delivered IP Core:
● VHDL or Verilog RTL synthesizable source code, called HDL Source code
● FPGA EDIF/NGO/NGD/QXP/VQM, called Netlist

Copyright © 1999-2016 DCD – Digital Core Design. All Rights Reserved. All trademarks mentioned in this document are the property of their respective owners.

DELIVERABLES

♦ Source code:
  ● VHDL Source Code or/and
  ● VERILOG Source Code or/and
  ● Encrypted, or plain text EDIF
♦ VHDL & VERILOG test bench environment
  ● Active-HDL automatic simulation macros
  ● ModelSim automatic simulation macros
  ● Tests with reference responses
♦ Technical documentation
  ● Installation notes
  ● HDL core specification
  ● Datasheet
♦ Synthesis scripts
♦ Example application
♦ Technical support
  ● IP Core implementation support
  ● 3 months maintenance
    ● Delivery of the IP Core and documentation updates, minor and major version changes
    ● Phone & email support

PINS DESCRIPTION

PIN               TYPE     DESCRIPTION
clk               Input    Global system clock
rst               Input    Global system reset
cs                Input    Chip select for read/write
datai[31:0] (1)   Input    Data bus input
addr[4:2] (2)     Input    Register address to read/write
we                Input    Data write enable
datao[31:0] (1)   Output   Data bus output
irq               Output   Interrupt request indicator

(1) – the data bus can be configured as 8-, 16- or 32-bit, depending on the processor’s bus size
(2) – the address bus is aligned to work with 8-bit (3:0), 16-bit (3:1) or 32-bit (4:2) processors

BLOCK DIAGRAM

[Figure: block diagram of the Interface, Mantissa, Exponent, Align, Shifter and Control Unit blocks with the datai(31:0), datao(31:0), addr(4:2), we, cs, irq, clk and rst signals]

UNITS SUMMARY

Mantissa – performs operations on the mantissa part of a number. The addition, subtraction, multiplication, division, square root, comparison and conversion operations are executed in this module. It contains the mantissa and work registers.

Exponent – performs operations on the exponent part of a number. The addition, subtraction, shifting, comparison and conversion operations are executed in this module. It contains the exponent and work registers.
Align – analyzes the numbers for compliance with the IEEE-754 standard. Information about the data classes is passed as a result to the appropriate internal module.

Shifter – performs mantissa shifting during normalization and denormalization operations. Information about shifted-out bits is stored for the rounding process.

Control Coprocessor – manages the execution of all instructions and the internal operations required to execute a particular function.

Interface – provides the interface between an external device and the DFPAU-DP internal 32-bit modules. It contains data, control and status registers. It can be configured to work with 8-, 16- and 32-bit processors.

APPLICATIONS

● Math coprocessors
● DSP algorithms
● Embedded arithmetic coprocessor
● Fast data processing & control

IMPROVEMENTS

The performance of the DFPAU-DP floating point instructions has been compared to the standard C library functions delivered with every commercial C compiler. Each program was executed in the same system environment. The number of clock periods was measured from loading the input data into the work registers to storing the output result after the operation. The results are given in the table below. Improvement has been computed as the number of clock periods (NIOS-II CLK) divided by the number of clock periods (NIOS-II+DFPAU-DP CLK) required to execute a particular instruction.

IEEE-754 FP Instruction       Improvement
Addition                      12.0
Subtraction                   11.7
Multiplication                10.6
Division                      15.0
Square Root                   21.5
Sine                          11.8
Cosine                        10.3
Tangent                       10.1
Arc Tangent                   14.7
Average speed improvement:    13.1

The following table summarizes the performance of the 32-bit NIOS-II+DFPAU-DP compared to the 32-bit NIOS-II.
Device                            Improvement
NIOS-II                           1.0
NIOS-II+DFPAU (arithmetic)        14.1
NIOS-II+DFPAU (trigonometric)     11.7
NIOS-II+DFPAU (overall)           13.1

[Figure: bar chart comparing the 32-bit NIOS-II with the coprocessor-assisted configurations (arithmetic, trigonometric, overall)]

PERFORMANCE

The following table summarizes the Core area and performance in XILINX® devices after Place & Route (all key features included):

Device        Speed grade   Slices/LUTs   Multiplier/DSP blocks   Fmax
SPARTAN-3E    -5            2470/4580     16xMULT18               71 MHz
SPARTAN-6     -3            920/2720      16xDSP48                75 MHz
VIRTEX-4      -12           2600/4800     16xDSP48                100 MHz
VIRTEX-5      -3            1010/3200     14xDSP48                125 MHz
VIRTEX-6      -2            800/2800      14xDSP48                110 MHz

Core performance in XILINX® devices

CONTACT

Digital Core Design
Headquarters: Wroclawska 94, 41-902 Bytom, POLAND
e-mail: [email protected]
tel.: 0048 32 282 82 66
fax: 0048 32 282 74 37

Distributors: Please check: http://dcd.pl/sales
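As a worked illustration of the IMPROVEMENTS figures, the Python sketch below applies the ratio defined there (NIOS-II clock periods divided by NIOS-II+DFPAU-DP clock periods) and averages the per-instruction factors by group. The improvement factors are copied from the table. The function names, the split into arithmetic and trigonometric groups, and the clock counts in the usage lines are assumptions made for illustration and do not come from DCD.

```python
from statistics import mean

# Improvement factors copied from the IMPROVEMENTS table of this datasheet.
# The grouping into two dictionaries is an assumption based on the
# "arithmetic" and "trigonometric" rows of the summary table.
ARITHMETIC = {
    "Addition": 12.0,
    "Subtraction": 11.7,
    "Multiplication": 10.6,
    "Division": 15.0,
    "Square Root": 21.5,
}
TRIGONOMETRIC = {
    "Sine": 11.8,
    "Cosine": 10.3,
    "Tangent": 10.1,
    "Arc Tangent": 14.7,
}


def improvement(clk_software: int, clk_coprocessor: int) -> float:
    """Return the ratio of software clock periods to coprocessor clock periods."""
    if clk_software < 0 or clk_coprocessor <= 0:
        raise ValueError("clock counts must be positive")
    return clk_software / clk_coprocessor


def group_averages() -> dict:
    """Average the tabulated factors per group and over all instructions."""
    arithmetic = list(ARITHMETIC.values())
    trigonometric = list(TRIGONOMETRIC.values())
    return {
        "arithmetic": mean(arithmetic),
        "trigonometric": mean(trigonometric),
        "overall": mean(arithmetic + trigonometric),
    }


if __name__ == "__main__":
    # Hypothetical clock counts, chosen only to show how the ratio is formed.
    print(improvement(1200, 100))
    for name, value in group_averages().items():
        print(f"{name}: {value:.3f}")
```

The group means work out to about 14.16 (arithmetic), 11.725 (trigonometric) and 13.08 (overall), close to the published 14.1, 11.7 and 13.1. The small difference in the arithmetic figure suggests the published values may be truncated rather than rounded, which is an inference and not something the datasheet states.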