DCD DFPMU

DFPMU
Floating Point Coprocessor
ver 2.05
OVERVIEW
DFPMU is a floating point coprocessor designed to assist the CPU in performing floating point computations. It directly replaces C software functions with equivalent, very fast hardware operations, which significantly accelerates system performance. It requires no programming and therefore no modifications to the main software: everything is done automatically during software compilation by the DFPMU C driver.
DFPMU was designed to operate with DCD’s DP8051, but it can also operate with any other 8-, 16- or 32-bit processor. Drivers for all popular 8051 C compilers are delivered together with the DFPMU package.
DFPMU uses the specialized CORDIC algorithm and standard algorithms to compute math functions. It supports addition, subtraction, multiplication, division, square root, comparison, absolute value, change of sign, and the trigonometric functions sine, cosine, tangent and arctangent. It has built-in instructions for conversion from integer to floating point types and vice versa. The input number format complies with the IEEE-754 standard. DFPMU supports single precision real numbers and 16-bit and 32-bit integers. Each floating point function can be turned on or off at the configuration level, which makes the DFPMU module flexibly scalable: it saves silicon area and provides exactly the configuration required by a given application.
DFPMU is a technology independent design that can be implemented in a variety of process technologies.
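To make the IEEE-754 single precision input format more concrete, the following minimal C sketch splits a float into its sign, exponent and fraction fields and assigns it to one of the data classes defined by the encoding. It assumes that the C float type on the target is a 32-bit IEEE-754 number. The names f32_fields, f32_class, f32_decode and f32_classify are hypothetical helpers for illustration only; they are not part of the DFPMU C driver.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE-754 single precision (binary32) number. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits, without the implicit leading 1 */
} f32_fields;

/* Data classes that follow from the IEEE-754 encoding. */
typedef enum {
    F32_ZERO, F32_DENORMAL, F32_NORMAL, F32_INFINITY, F32_NAN
} f32_class;

/* Split a float into its raw bit fields.
 * Assumption: float is a 32-bit IEEE-754 value on this platform. */
static f32_fields f32_decode(float x)
{
    uint32_t bits;
    f32_fields f;

    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bit pattern */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Classify a decoded number by its exponent and fraction fields. */
static f32_class f32_classify(f32_fields f)
{
    if (f.exponent == 0)
        return f.fraction ? F32_DENORMAL : F32_ZERO;
    if (f.exponent == 0xFFu)
        return f.fraction ? F32_NAN : F32_INFINITY;
    return F32_NORMAL;
}
```

For example, f32_decode(-2.5f) yields sign 1, exponent 128 and fraction 0x200000, because -2.5 equals -1.25 × 2^1 and the exponent is stored with a bias of 127.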
APPLICATIONS
● Math coprocessors
● DSP algorithms
● Embedded arithmetic coprocessor
● Fast data processing & control
KEY FEATURES
● Direct replacement for C float software functions such as: +, -, *, /, ==, !=, >=, <=, <, >
● Configurability of all available functions
● C interface supplied for all popular compilers: GNU C/C++, 8051 compilers
● No programming required
● IEEE-754 single precision real format support – float type
● 16-bit word and 32-bit short integer formats supported – integer types
● Flexible location of argument and result registers
● Performs the following functions:
○ FADD, FSUB – addition, subtraction
○ FMUL, FDIV – multiplication, division
○ FSQRT – square root
○ FCHS, FABS – change of sign, absolute value
All trademarks mentioned in this document
are trademarks of their respective owners.
http://www.DigitalCoreDesign.com
http://www.dcd.pl
Copyright 1999-2007 DCD – Digital Core Design. All Rights Reserved.
○ FXAM – examine input data
○ FUCOM – comparison
○ FSIN, FCOS – sine, cosine
○ FTAN – tangent
○ FATAN – arctangent
○ FILDW, FILD – 16-bit, 32-bit integer to float
○ FISTW, FIST – float to 16-bit, 32-bit integer
● Built-in exception routines
● Masks each exception indicator:
○ Precision loss PE
○ Underflow result UE
○ Overflow result OE
○ Invalid operand IE
○ Division by zero ZE
○ Denormal operand DE
● Fully configurable
● Fully synthesizable, static synchronous design with no internal tri-states
LICENSING
Comprehensible and clearly defined licensing methods without royalty fees make using the IP Core easy and simple.
The Single Design license allows the IP Core to be used in a single FPGA bitstream and ASIC implementation. It also permits FPGA prototyping before ASIC production.
The Unlimited Designs license allows the IP Core to be used in an unlimited number of FPGA bitstreams and ASIC implementations.
In all cases the number of IP Core instantiations within a design and the number of manufactured chips are unlimited. There are no time-of-use limitations.
● Single Design license for
○ VHDL, Verilog source code called HDL Source
○ Encrypted, or plain text EDIF called Netlist
● Unlimited Designs license for
○ HDL Source
○ Netlist
● Upgrade from
○ Netlist to HDL Source
○ Single Design to Unlimited Designs
DELIVERABLES
♦ Source code:
◊ VHDL Source Code or/and
◊ VERILOG Source Code or/and
◊ Encrypted Netlist or/and
◊ plain text EDIF netlist
♦ VHDL & VERILOG test bench environment
◊ Active-HDL automatic simulation macros
◊ NCSim automatic simulation macros
◊ ModelSim automatic simulation macros
◊ Tests with reference responses
♦ Technical documentation
◊ Installation notes
◊ HDL core specification
◊ Datasheet
♦ Synthesis scripts
♦ Example application
♦ Technical support
◊ IP Core implementation support
◊ 3 months maintenance
● Delivery of IP Core updates, minor and major version changes
● Delivery of documentation updates
● Phone & email support
SYMBOL
[Figure: DFPMU symbol. Inputs: datai(31:0)¹, addr(4:2)², we, cs, rst, clk. Outputs: datao(31:0)¹, irq.]
PINS DESCRIPTION
PIN            TYPE     DESCRIPTION
clk            Input    Global system clock
rst            Input    Global system reset
cs             Input    Chip select for read/write
datai[31:0]¹   Input    Data bus input
addr[4:2]²     Input    Register address to read/write
we             Input    Data write enable
datao[31:0]¹   Output   Data bus output
irq            Output   Interrupt request indicator
¹ – data bus can be configured as 8-, 16- or 32-bit, depending on the processor’s bus size
² – address bus is aligned to work with 8- (3:0), 16- (3:1) or 32- (4:2) bit processors
BLOCK DIAGRAM
[Figure: DFPMU block diagram. Blocks: Interface, Control Unit, Align, Exponent, Shifter, Mantissa, CORDIC. Signals: datai(31:0)¹, datao(31:0)¹, addr(4:2)², we, cs, irq, clk, rst.]
Interface – provides the interface between an external device and the DFPMU internal 32-bit modules. It contains data, control and status registers. It can be configured to work with 8-, 16- and 32-bit processors.
Control Unit – manages the execution of all instructions and the internal operations required to execute a particular function.
Align – analyzes the input numbers for compliance with the IEEE-754 standard. Information about the data classes is passed as a result to the appropriate internal module.
Exponent – performs operations on the exponent part of a number. The addition, subtraction, shifting, comparison and conversion operations are executed in this module. It contains exponent and work registers.
Shifter – performs mantissa shifting during normalization and denormalization operations. Information about shifted-out bits is stored for the rounding process.
Mantissa – performs operations on the mantissa part of a number. The addition, subtraction, multiplication, division, square root, comparison and conversion operations are executed in this module. It contains mantissa and work registers.
CORDIC – performs trigonometric operations on input data. The sine, cosine, tangent and arctangent operations are executed in this module. It contains three work registers.
PERFORMANCE
The following table summarizes the core area and performance in ALTERA® devices after Place & Route (all key features included):
Device       Speed grade   Logic Cells   Fmax
APEX20KE     -1            5150          50 MHz
APEX20KC     -7            5150          58 MHz
APEX-II      -7            5150          73 MHz
CYCLONE      -6            4650          90 MHz
CYCLONE-II   -6            4520          96 MHz
STRATIX      -5            4460          108 MHz
STRATIX-II   -3            3300          168 MHz
Core performance in ALTERA® devices
The performance of DFPMU floating point instructions has been compared with the standard C library functions delivered with every commercial C compiler. Each program was executed in the same system environment. The number of clock periods was measured between loading the input data into the work registers and storing the output result after the operation. The results are listed in the per-operation improvement table below. Improvement has been computed as the number of (CPU clk) divided by the number of (CPU+DFPMU clk) required to execute the same operation.
More details are available in core documentation.
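The improvement ratio defined above can be sketched in C as follows. This is only an illustration of the formula: the function name improvement and the handling of a zero divisor are assumptions of this sketch, and the datasheet does not publish the underlying clock counts.

```c
/* Improvement factor as defined in the text: clock periods needed by the
 * CPU alone divided by clock periods needed by the CPU with DFPMU for the
 * same operation.
 * Assumption of this sketch: a zero divisor returns 0.0 instead of dividing. */
static double improvement(unsigned long cpu_clk, unsigned long cpu_dfpmu_clk)
{
    if (cpu_dfpmu_clk == 0)
        return 0.0;
    return (double)cpu_clk / (double)cpu_dfpmu_clk;
}
```

With hypothetical counts of 1820 clock periods in software and 10 clock periods with the coprocessor, improvement(1820, 10) gives 182. These counts are invented for the example and are not measured values.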
The following table compares the performance of DP8051+DFPMU with a standard 8051 microcontroller.
Device          Improvement
80C51           1.0
DP8051          7.3
DP8051+DFPMU    162.0
General performance improvements
The table below shows the performance improvements of a system based on NIOS-II and DFPMU, compared with the same system without the DFPMU coprocessor.
Device                          Improvement
NIOS-II/s                       1.0
NIOS-II+DFPMU (arithmetic)      7.5
NIOS-II+DFPMU (trigonometric)   49.2
NIOS-II+DFPMU (overall)         28.3
General performance improvements
[Figures: bar charts of the general performance improvements for 80C51, DP8051 and DP8051+DFPMU, and for 32-bit NIOS-II/s and NIOS-II+DFPMU (arithmetic, trigonometric, overall).]
IEEE-754 FP Instruction      Improvement
Addition                     73
Subtraction                  60
Multiplication               65
Division                     182
Square Root                  392
Sine                         139
Cosine                       144
Tangent                      222
Arctangent                   182
Average speed improvement:   162
Improvements of particular operations (DP8051+DFPMU)
IEEE-754 FP Instruction      Improvement
Addition                     6.4
Subtraction                  6.5
Multiplication               5.1
Division                     6.5
Square Root                  12.9
Sine                         40.8
Cosine                       41.3
Tangent                      65.0
Arctangent                   49.6
Average speed improvement:   28.3
Improvements of particular operations (NIOS-II+DFPMU)
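As a cross-check, the quoted averages can be recomputed from the per-operation figures. The C sketch below copies the values from the two per-operation tables. The grouping of square root with the arithmetic operations is an assumption, and so is the reading that the NIOS-II overall figure is the mean of the arithmetic and trigonometric group means; the datasheet does not state either, although both reproduce the published 7.5, 49.2 and 28.3. The function names are hypothetical.

```c
#include <stddef.h>

/* Arithmetic mean of n values; returns 0.0 for an empty array. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += v[i];
    return n ? sum / (double)n : 0.0;
}

/* NIOS-II table: addition, subtraction, multiplication, division,
 * square root (grouping assumed, see lead-in). */
static const double nios_arith[] = { 6.4, 6.5, 5.1, 6.5, 12.9 };
/* NIOS-II table: sine, cosine, tangent, arctangent. */
static const double nios_trig[] = { 40.8, 41.3, 65.0, 49.6 };
/* DP8051 table: all nine operations in table order. */
static const double dp8051_all[] = { 73, 60, 65, 182, 392, 139, 144, 222, 182 };

static double nios_arith_mean(void)
{
    return mean(nios_arith, sizeof nios_arith / sizeof nios_arith[0]);
}

static double nios_trig_mean(void)
{
    return mean(nios_trig, sizeof nios_trig / sizeof nios_trig[0]);
}

/* Assumed definition of "overall": mean of the two group means. */
static double nios_overall(void)
{
    return (nios_arith_mean() + nios_trig_mean()) / 2.0;
}

/* Plain mean over all nine DP8051 figures. */
static double dp8051_mean(void)
{
    return mean(dp8051_all, sizeof dp8051_all / sizeof dp8051_all[0]);
}
```

Under these assumptions the group means come out at about 7.48 and 49.18, the overall NIOS-II value at about 28.33, and the plain mean of the nine DP8051 figures at about 162.1, which agree with the rounded values in the tables.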
More details are available in core documentation.
CONTACTS
For any modification or special request, please contact Digital Core Design or local distributors.
Headquarters:
Wroclawska 94
41-902 Bytom, POLAND
e-mail: info@dcd.pl
tel.: +48 32 282 82 66
fax: +48 32 282 74 37
Distributors:
Please check http://www.dcd.pl/apartn.php