XILINX Datasheet

2016
DFPAU-DP IP Core
Floating Point Arithmetic Coprocessor - Double Precision v. 3.07
COMPANY OVERVIEW
Digital Core Design is a leading IP Core provider
and a System-on-Chip design house. The company
was founded in 1999 and has focused on IP Core
architecture improvements from the very beginning.
Our innovative, silicon-proven solutions have been
employed by over 300 customers, with more than
500 licenses sold to companies like Intel, Siemens,
Philips, General Electric, Sony and Toyota. Based on
more than 70 different architectures, ranging from
serial interfaces to advanced microcontrollers and
SoCs, we design solutions tailored to your needs.
IP CORE OVERVIEW
The DFPAU-DP is a Floating Point Arithmetic Coprocessor designed to assist the CPU in performing floating point arithmetic computations. It directly replaces C software floating point functions with equivalent, very fast hardware operations, which significantly accelerates system performance. It does not require any programming, so no modifications to the main software are needed. Everything is done automatically during software compilation by the DFPAU-DP C driver.
The DFPAU-DP was designed to operate with DCD's DP8051, but it can also operate with any other 8-, 16- or 32-bit processor. Drivers for all popular 8051 C compilers are delivered together with the DFPAU-DP package.
The DFPAU-DP uses specialized algorithms to compute math functions. It supports addition, subtraction, multiplication, division, square root and comparison. It has built-in instructions for conversion from integer types to floating point types and vice versa. The input number format conforms to the IEEE-754 standard. The DFPAU-DP supports double and single precision real numbers and 8-bit, 16-bit and 32-bit integers.
Each floating point function can be turned on or off at configuration level, which makes the DFPAU-DP module flexibly scalable. This saves silicon area and provides exactly the configuration required by a given application.
SYMBOL
[Figure: DFPAU-DP symbol. Inputs: clk, rst, cs, we, addr(4:2), datai(31:0). Outputs: datao(31:0), irq.]
KEY FEATURES
● Direct replacement for C double, float software functions, such as: +, -, *, /, ==, !=, >=, <=, <, >
● Configurability of all available functions
● C interface supplied for all popular compilers: GNU C/C++, 8051 compilers
● No programming required
● IEEE-754 Double precision real format support – double type
● IEEE-754 Single precision real format support – float type
● 8-bit, 16-bit, 32-bit and 52-bit integers format supported – integer types
● Flexible arguments and result registers location
● Performs the following functions:
  ○ FADD, FSUB – addition, subtraction
  ○ FMUL, FDIV – multiplication, division
  ○ FSQRT – square root
  ○ FXAM – examine input data
  ○ FUCOM – comparison
  ○ FCLD, FILD – 8-bit, 16-bit integer to double
  ○ FLLD, FELD – 32-bit, 52-bit integer to double
  ○ FCST, FIST – double to 8-bit, 16-bit integer
  ○ FLST, FEST – double to 32-bit, 52-bit integer
  ○ FFLD – float to double
  ○ FFST – double to float
● Exceptions built-in routines
● Masks each exception indicator:
  ○ Precision lack PE
  ○ Underflow result UE
  ○ Overflow result OE
  ○ Invalid operand IE
  ○ Division by zero ZE
  ○ Denormal operand DE
● Fully configurable
● Fully synthesizable, static synchronous design with no internal tri-states
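The 52-bit integer conversions in the list above (FELD, FEST) fit the IEEE-754 double format well: a double stores a 52-bit fraction (53 significant bits with the implicit leading one), so every integer whose magnitude fits in 52 bits is represented exactly. That this is the reason for the 52-bit type is an assumption; the datasheet does not say so, and it does not specify how the core handles the sign. The sketch below is plain host-side C, not driver code, and the function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the integer is unchanged after a round trip through double.
   Assumes double is IEEE-754 binary64, as on common platforms. */
int survives_double_round_trip(int64_t value)
{
    double as_double = (double)value;   /* integer to double */
    int64_t back = (int64_t)as_double;  /* double to integer */
    return back == value;
}
```

Any value up to 2^52 - 1 in magnitude survives the round trip, while 2^53 + 1 does not, because it needs 54 significant bits and is rounded on conversion.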
LICENSING
Comprehensible, clearly defined licensing methods
without royalty-per-chip fees make using our IP
Cores easy and simple.
Single-Site license option – intended for small and
medium-sized companies that run their business
in one place.
Multi-Site license option – intended for corporate
customers who operate at several locations. The
licensed product can be used in selected company
branches.
In all cases the number of IP Core instantiations
within a project and the number of manufactured
chips are unlimited. The license is free of per-chip
royalties. There are no restrictions on the time
of use.
There are two formats of the delivered IP Core:
● VHDL or Verilog RTL synthesizable source code, called HDL Source code
● FPGA EDIF/NGO/NGD/QXP/VQM, called Netlist
Copyright © 1999-2016 DCD – Digital Core Design. All Rights Reserved.
All trademarks mentioned in this document are the property
of their respective owners.
DELIVERABLES
♦ Source code:
  ● VHDL Source Code or/and
  ● VERILOG Source Code or/and
  ● Encrypted, or plain text EDIF
♦ VHDL & VERILOG test bench environment
  ● Active-HDL automatic simulation macros
  ● ModelSim automatic simulation macros
  ● Tests with reference responses
♦ Technical documentation
  ● Installation notes
  ● HDL core specification
  ● Datasheet
♦ Synthesis scripts
♦ Example application
♦ Technical support
  ● IP Core implementation support
  ● 3 months maintenance
    ● Delivery of the IP Core and documentation updates, minor and major versions changes
    ● Phone & email support
PINS DESCRIPTION
PIN               TYPE     DESCRIPTION
clk               Input    Global system clock
rst               Input    Global system reset
cs                Input    Chip select for read/write
datai[31:0] (1)   Input    Data bus input
addr[4:2] (2)     Input    Register address to read/write
we                Input    Data write enable
datao[31:0] (1)   Output   Data bus output
irq               Output   Interrupt request indicator
(1) – data bus can be configured as 8-, 16- or 32-bit, depending on the processor's bus size
(2) – address bus is aligned to work with 8- (3:0), 16- (3:1) or 32- (4:2) bit processors
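The address alignment note in the pin description can be read as a small bit-field rule: for each bus width, only the listed address bits select a register. The sketch below shows that rule in C, assuming the processor drives ordinary byte addresses and the core sees only the listed bits. The names bus_width and reg_select are hypothetical, and the meaning of each register is not given in this datasheet.

```c
#include <stdint.h>

/* Bus widths named in the pin description notes. */
enum bus_width { BUS_8 = 8, BUS_16 = 16, BUS_32 = 32 };

/* Extract the register-select field from a byte address.
   Returns -1 for an unsupported bus width. */
int reg_select(uint32_t byte_addr, enum bus_width width)
{
    switch (width) {
    case BUS_8:  return (int)(byte_addr & 0xFu);         /* addr(3:0) */
    case BUS_16: return (int)((byte_addr >> 1) & 0x7u);  /* addr(3:1) */
    case BUS_32: return (int)((byte_addr >> 2) & 0x7u);  /* addr(4:2) */
    default:     return -1;
    }
}
```

With a 32-bit bus, byte offsets 0x00, 0x04, ... 0x1C select fields 0 to 7, so consecutive registers sit on word boundaries.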
BLOCK DIAGRAM
[Figure: block diagram of the DFPAU-DP showing the Interface, Align, Exponent, Mantissa, Shifter and Control Unit blocks, with the external signals datai(31:0), datao(31:0), addr(4:2), we, cs, irq, clk and rst.]
UNITS SUMMARY
Mantissa – performs operations on the mantissa part of a number. Addition, subtraction, multiplication, division, square root, comparison and conversion operations are executed in this module. It contains the mantissa and work registers.
Exponent – performs operations on the exponent part of a number. Addition, subtraction, shifting, comparison and conversion operations are executed in this module. It contains the exponent and work registers.
Align – analyzes the numbers for compliance with the IEEE-754 standard. Information about the data classes is passed as a result to the appropriate internal module.
Shifter – performs mantissa shifting during normalization and denormalization operations. Information about shifted-out bits is stored for the rounding process.
Control Unit – manages the execution of all instructions and the internal operations required to execute a particular function.
Interface – provides the interface between an external device and the DFPAU-DP internal 32-bit modules. It contains data, control and status registers. It can be configured to work with 8-, 16- and 32-bit processors.
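The data classes that the Align unit reports are not listed in this datasheet. As a software analogue only, the sketch below splits an IEEE-754 double into its exponent and fraction fields and sorts it into the usual classes (zero, denormal, normal, infinity, NaN). The enum and function names are hypothetical, and the set of classes is an assumption based on the IEEE-754 format rather than on the core's documentation.

```c
#include <stdint.h>
#include <string.h>

enum fp_class {
    FP_CLASS_ZERO,
    FP_CLASS_DENORMAL,
    FP_CLASS_NORMAL,
    FP_CLASS_INFINITY,
    FP_CLASS_NAN
};

/* Classify a double by its bit fields.
   Assumes double is IEEE-754 binary64 (1 sign, 11 exponent, 52 fraction bits). */
enum fp_class classify_double(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);  /* copy the raw bit pattern */

    uint64_t exponent = (bits >> 52) & 0x7FFu;          /* biased exponent */
    uint64_t fraction = bits & 0x000FFFFFFFFFFFFFull;   /* stored fraction */

    if (exponent == 0)
        return fraction == 0 ? FP_CLASS_ZERO : FP_CLASS_DENORMAL;
    if (exponent == 0x7FFu)
        return fraction == 0 ? FP_CLASS_INFINITY : FP_CLASS_NAN;
    return FP_CLASS_NORMAL;
}
```

The sign bit is ignored here, so +0.0 and -0.0 both fall into the zero class.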
APPLICATIONS
● Math coprocessors
● DSP algorithms
● Embedded arithmetic coprocessor
● Fast data processing & control
IMPROVEMENTS
The performance of the DFPAU-DP floating point instructions has been compared to the standard C library functions delivered with every commercial C compiler. Each program was executed in the same system environment. The number of clock periods was measured from loading the input data into the work registers to storing the output result after the operation. The results are given in the table below. Improvement is computed as the number of clock periods required by the NIOS-II alone, divided by the number required by the NIOS-II with the DFPAU-DP, to execute a particular instruction.
IEEE-754 FP Instruction       Improvement
Addition                      12.0
Subtraction                   11.7
Multiplication                10.6
Division                      15.0
Square Root                   21.5
Sine                          11.8
Cosine                        10.3
Tangent                       10.1
Arc Tangent                   14.7
Average speed improvement:    13.1
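The improvement figures can be reproduced with a few lines of C. The sketch below computes a per-instruction ratio and the arithmetic mean of the listed factors. The raw clock counts are not published in this datasheet, so the function improvement is shown only to make the definition concrete; all identifiers are hypothetical.

```c
#include <stddef.h>

/* Speed-up for one instruction: clock periods without the coprocessor
   divided by clock periods with it. */
double improvement(double cpu_only_clocks, double with_coprocessor_clocks)
{
    return cpu_only_clocks / with_coprocessor_clocks;
}

/* Arithmetic mean of n improvement factors (0.0 for an empty list). */
double mean_improvement(const double *factors, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += factors[i];
    return n ? sum / (double)n : 0.0;
}

/* Improvement factors as listed in the table, addition through arc tangent. */
static const double table_factors[9] = {
    12.0, 11.7, 10.6, 15.0, 21.5, 11.8, 10.3, 10.1, 14.7
};
```

Calling mean_improvement(table_factors, 9) gives roughly 13.08, consistent with the 13.1 average in the table; the mean of the last four entries (the trigonometric functions) is roughly 11.73.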
The following table summarizes the performance of the 32-bit NIOS-II with the DFPAU-DP, compared to the 32-bit NIOS-II alone.
Device                           Improvement
NIOS-II                          1.0
NIOS-II+DFPAU (arithmetic)       14.1
NIOS-II+DFPAU (trigonometric)    11.7
NIOS-II+DFPAU (overall)          13.1
Device                               Improvement
32-bit RISC                          1.0
32-bit RISC+DFPAU (arithmetic)       14.1
32-bit RISC+DFPAU (trigonometric)    11.7
32-bit RISC+DFPAU (overall)          13.1
[Figure: bar chart of speed improvement on a 0 to 15 scale. Series: 32-bit NIOS-II, NIOS-II+DFPMU-DP (arithmetic), NIOS-II+DFPMU-DP (trigonometric), NIOS-II+DFPMU-DP (overall). Bar labels: 1; 14.1; 11.7; 13.2.]
CONTACT
Digital Core Design Headquarters:
Wroclawska 94, 41-902 Bytom, POLAND
e-mail: [email protected]
tel.: 0048 32 282 82 66
fax: 0048 32 282 74 37
Distributors:
Please check: http://dcd.pl/sales
PERFORMANCE
The following table summarizes the Core area and
performance in XILINX® devices after Place & Route
(all key features included):
Device        Speed grade   Slices/LUTs             Fmax
SPARTAN-3E    -5            2470/4580, 16xMULT18    71 MHz
SPARTAN-6     -3            920/2720, 16xDSP48      75 MHz
VIRTEX-4      -12           2600/4800, 16xDSP48     100 MHz
VIRTEX-5      -3            1010/3200, 14xDSP48     125 MHz
VIRTEX-6      -2            800/2800, 14xDSP48      110 MHz
Core performance in XILINX® devices