ACTEL CORECORDIC-XX

CoreCORDIC CORDIC RTL Generator
v2.0, March 2006
© 2006 Actel Corporation

Product Summary

Intended Use
• COordinate Rotation DIgital Computer (CORDIC) Rotator Function for Actel FPGAs

Key Features
• Vector Rotation – Conversion of Polar Coordinates to Rectangular Coordinates
• Vector Translation – Conversion of Rectangular Coordinates to Polar Coordinates
• Sine and Cosine Calculation
• Vector (X, Y) Magnitude $\sqrt{X^2 + Y^2}$ and Phase (arctan[Y/X]) Calculation
• 8-Bit to 48-Bit Configurable Word Size
• 8 to 48 Configurable Number of Iterations
• Parallel Pipelined Architecture for the Fastest Calculation
• Bit-Serial Architecture for the Smallest Area
• Word-Serial Architecture for Moderate Speed and Area
• Word Parallel Data I/Os

Supported Families
• Fusion
• ProASIC®3/E
• ProASICPLUS®
• Axcelerator®
• RTAX-S
• SX-A
• RTSX-S

Core Deliverables
• Evaluation Version
  – Supports CORDIC Engine and Test Harness Generation with Limited Parameters. Fully Supported in Libero IDE.
• Full Version
  – CoreCORDIC RTL Generator. Generates User-Defined CORDIC Model and Test Harness. Fully Supported in the Actel Libero® Integrated Design Environment (IDE)

Synthesis and Simulation Support
• Synthesis: Synplicity®, Synopsys® (Design Compiler/FPGA Compiler), Exemplar™
• Simulation: OVI-Compliant Verilog Simulators and Vital-Compliant VHDL Simulators.

Table of Contents
General Description
CoreCORDIC Device Requirements
Architectures
I/O Formats
CoreCORDIC Configuration Parameters
I/O Signal Description
I/O Interface and Timing
References
A Sample Configuration File
Ordering Information
Datasheet Categories
Appendix I
Appendix II

General Description
CoreCORDIC is an RTL generator that produces an Actel FPGA–optimized CORDIC engine. The CORDIC algorithm by J. Volder provides an iterative method of performing vector rotations using shifts and adds only. The articles listed in "References" present a detailed description of the algorithm.
Depending on the configuration defined by the user, the resulting module implements a pipelined parallel, word-serial, or bit-serial architecture in one of two major modes: rotation or vectoring. In rotation mode, the CORDIC rotates a vector by a specified angle. This mode is used to convert polar coordinates to Cartesian coordinates, for general vector rotation, and also to calculate sine and cosine functions (see Figure 1). "Appendix I" presents mathematical coordinate conversion formulae, and "Appendix II" describes examples of a few of the most used CORDIC modes.
[Figure 1: block diagram; magnitude r and phase θ enter the CORDIC engine, which puts out x and y.]
Figure 1 • CORDIC Engine in Rotation Mode
In vectoring mode, the CORDIC rotates the input vector towards the x axis while accumulating a rotation angle. Vectoring mode is used to convert Cartesian vector coordinates to polar coordinates; i.e., to calculate the magnitude and phase of the input vector (Figure 2).
[Figure 2: block diagram; x and y enter the CORDIC engine, which puts out magnitude r and phase θ.]
Figure 2 • CORDIC Engine in Vectoring Mode
The CORDIC results, such as x, y, and r, are scaled by the inherent processing gain, K, which depends on the number of iterations and converges to about 1.647 after a few iterations. The gain is constant for a given number of iterations. When performing Cartesian/polar coordinate conversion, the CORDIC computes the results shown in EQ 1 and EQ 2 in rotation mode.
$X = K \cdot r \cdot \cos\theta$
EQ 1
$Y = K \cdot r \cdot \sin\theta$
EQ 2
EQ 3 and EQ 4 show the CORDIC results in vectoring mode.
$r = K \cdot \sqrt{X^2 + Y^2}$
EQ 3
$\theta = \arctan(Y / X)$
EQ 4
The gain can be compensated for elsewhere in many applications when the system includes the CORDIC engine. To assist a user in doing so, the CoreCORDIC software computes the precise value of the gain and displays it on a screen. In the cases when only relative magnitude is of importance (for example, spectrum analysis and AM demodulation), the constant gain can be neglected. When calculating sine/cosine, the CORDIC gets initialized with a constant reciprocal value of the processing gain, r = 1/K. EQ 1 and EQ 2 become
$X = \cos\theta$
$Y = \sin\theta$
Thus, the gain does not impact the sine/cosine results or the phase output.
To perform the conversions, the CORDIC processor implements the iterative CORDIC equations EQ 5 through EQ 7.
$x_{i+1} = x_i - y_i \cdot d_i \cdot 2^{-i}$
EQ 5
$y_{i+1} = y_i + x_i \cdot d_i \cdot 2^{-i}$
EQ 6
$a_{i+1} = a_i - d_i \cdot \arctan(2^{-i})$
EQ 7
The sign-controlling function $d_i$ takes the values shown in EQ 8 and EQ 9:
In rotation mode
$d_i = -1$ if $a_i < 0$, otherwise $d_i = 1$
EQ 8
In vectoring mode
$d_i = 1$ if $y_i < 0$, otherwise $d_i = -1$
EQ 9
The input and output data are represented as n-bit words, where n is a user-defined number in the range from 8 to 48. The number of iterations is also defined by the user in the same range. The CORDIC result accuracy improves when the number of iterations is increased, as long as the number of iterations does not exceed the data bit width. In other words, the bit width limits the number of meaningful iterations.
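The iterative equations EQ 5 through EQ 9 can be sketched in a few lines of Python. This is a floating-point illustration only, not the generated RTL: the function names `cordic_gain` and `cordic` are hypothetical, and the gain is computed with the standard CORDIC product $K = \prod_i \sqrt{1 + 2^{-2i}}$, which this datasheet quotes only by its limit of about 1.647.

```python
import math

def cordic_gain(iterations):
    """Processing gain K as the standard product of sqrt(1 + 2^(-2i))."""
    k = 1.0
    for i in range(iterations):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

def cordic(x, y, a, iterations, vectoring=False):
    """Apply EQ 5 through EQ 7 for the given number of iterations.

    Rotation mode drives the angle a towards 0 (EQ 8);
    vectoring mode drives y towards 0 (EQ 9).
    """
    for i in range(iterations):
        if vectoring:
            d = 1 if y < 0 else -1      # EQ 9
        else:
            d = -1 if a < 0 else 1      # EQ 8
        # Tuple assignment so that EQ 6 uses the old x, as the equations require.
        x, y, a = (x - y * d * 2.0 ** -i,          # EQ 5
                   y + x * d * 2.0 ** -i,          # EQ 6
                   a - d * math.atan(2.0 ** -i))   # EQ 7
    return x, y, a

K = cordic_gain(24)
# Sine/cosine: initialize with r = 1/K so the outputs are cos and sin directly.
cos_t, sin_t, _ = cordic(1.0 / K, 0.0, math.pi / 6, 24)
# Vectoring: magnitude comes out scaled by K, phase comes out unscaled.
mag_k, _, phase = cordic(3.0, 4.0, 0.0, 24, vectoring=True)
```

Dividing `mag_k` by `K` recovers the magnitude 5 of the (3, 4) vector, which is the gain compensation the text describes.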
A system that utilizes the CORDIC engine (Figure 3) consists of the following:
• A data source generating the vector data to be converted by the CORDIC
• The CORDIC module configured to work in either rotation or vectoring mode
• A data receiver accepting the newly converted vector data
[Figure 3: block diagram; a data source drives x0, y0, a0, and ldData into the CORDIC engine, which drives xn, yn, an, and rdyOut to a data receiver. The master clock, clkEn, rst, and the global reset nGrst are also shown.]
Figure 3 • CORDIC-Based System
The active-low nGrst signal resets the CORDIC engine and, optionally, the entire system. After the reset is released (input nGrst taken high), the CORDIC module is ready to receive data samples to be processed. The module's synchronous reset input rst can be used to bring the CORDIC unit to the ready state at any time after the initial global reset.
Note: The CORDIC module loses any half-processed data when the system takes rst high.
The data source supplies the CORDIC engine with the data to be converted. Depending on the mode (rotation or
vectoring), the system uses different CORDIC inputs and outputs to enter and obtain the data. Table 1 shows the input/
output signals used in each mode.
Table 1 • CORDIC Connection to the System
| Input Data | CORDIC Input | Output Data | CORDIC Output |
|---|---|---|---|
| Common Rotation Modes | | | |
| Input vector magnitude | x0 | Output vector coordinate X | xn |
| Constant 0 | y0 | Output vector coordinate Y | yn |
| Input vector phase | a0 | N/A | an |
| Rotation Mode: Sine/Cosine Table Generator | | | |
| Constant reciprocal value of the processing gain r = 1/K | x0 | cos(θ) | xn |
| Constant 0 | y0 | sin(θ) | yn |
| Sine/cosine argument θ | a0 | N/A | an |
| Vectoring Mode | | | |
| Input vector coordinate X | x0 | Output vector magnitude r | xn |
| Input vector coordinate Y | y0 | N/A | yn |
| Constant 0 | a0 | Output vector phase θ | an |
The system accompanies every new pair of the input data samples with the one-bit ldData signal. Upon receiving the
ldData bit, the module assumes the vector coordinates are present on input data busses. Once the CORDIC results are
ready, the engine puts these out, accompanied by the one-bit rdyOut signal. Upon receiving the rdyOut bit, the system
can supply a new pair of input data and generate another ldData signal.
CoreCORDIC can generate three different CORDIC core implementation architectures and an appropriate testbench:
• Parallel pipelined
• Word-serial
• Bit-serial
The parallel pipelined architecture provides the fastest speed, whereas the bit-serial architecture provides the smallest
area. The word-serial architecture provides the trade-off of moderate speed and area.
CoreCORDIC Device Requirements
Table 2 provides typical utilization and performance data for CoreCORDIC, implemented in various Actel devices with
the CORDIC engine bit resolution set to 24 bits and the number of iterations set to 24. Device utilization and
performance will vary depending upon the architecture chosen and the configuration parameters used. Time-driven
settings were used when synthesizing parallel architectures; area optimization settings were used in other cases.
The CORDIC core does not utilize on-chip RAM blocks.
Table 2 • CoreCORDIC Device Utilization and Performance
| Device | Engine Architecture | Mode | Comb (Cells or Tiles) | Seq (Cells or Tiles) | Total (Cells or Tiles) | Utilization, % | Clock Rate, MHz | Transform Time, ns |
|---|---|---|---|---|---|---|---|---|
| Fusion, Speed Grade –2 | | | | | | | | |
| AFS600 | Bit-serial | Rotate | 297 | 110 | 407 | 3% | 88 | 6,568 |
| AFS600 | Bit-serial | Vector | 293 | 108 | 401 | 3% | 87 | 6,644 |
| AFS600 | Word-serial | Rotate | 668 | 103 | 771 | 6% | 30 | 833 |
| AFS600 | Word-serial | Vector | 660 | 101 | 761 | 6% | 27 | 926 |
| AFS600 | Parallel | Rotate | 11,810 | 1,884 | 13,694 | 99% | 46 | 21.7 |
| ProASIC3/E, Speed Grade –2 | | | | | | | | |
| A3P250 | Bit-serial | Rotate | 297 | 110 | 407 | 7% | 83 | 6,964 |
| A3P250 | Bit-serial | Vector | 296 | 108 | 404 | 7% | 93 | 6,215 |
| A3P250 | Word-serial | Rotate | 664 | 103 | 767 | 12% | 30 | 833 |
| A3P250 | Word-serial | Vector | 658 | 101 | 759 | 12% | 26 | 962 |
| A3P1000 | Parallel | Rotate | 12,541 | 1,906 | 14,447 | 59% | 46 | 21.7 |
| A3P1000 | Parallel | Vector | 14,832 | 1,981 | 16,813 | 68% | 62 | 16.1 |
| ProASICPLUS, Speed Grade STD | | | | | | | | |
| APA150 | Bit-serial | Rotate | 393 | 108 | 501 | 8% | 61 | 9,475 |
| APA150 | Bit-serial | Vector | 394 | 107 | 501 | 8% | 63 | 9,175 |
| APA150 | Word-serial | Rotate | 824 | 114 | 938 | 15% | 20 | 1,250 |
| APA150 | Word-serial | Vector | 822 | 114 | 936 | 15% | 19 | 1,316 |
| APA1000 | Parallel | Rotate | 14,301 | 1,889 | 16,190 | 29% | 32 | 31.3 |
| APA1000 | Parallel | Vector | 16,594 | 1,936 | 18,530 | 33% | 37 | 27.0 |
| Axcelerator, Speed Grade –2 | | | | | | | | |
| AX125 | Bit-serial | Rotate | 196 | 106 | 302 | 15% | 113 | 5,115 |
| AX125 | Bit-serial | Vector | 185 | 105 | 290 | 14% | 115 | 5,026 |
| AX125 | Word-serial | Rotate | 413 | 124 | 537 | 27% | 103 | 243 |
| AX125 | Word-serial | Vector | 405 | 133 | 538 | 27% | 109 | 229 |
| AX500 | Parallel | Rotate | 4,633 | 1,832 | 6,465 | 80% | 130 | 7.7 |
| AX500 | Parallel | Vector | 4,617 | 1,835 | 6,452 | 80% | 124 | 8.1 |
| RTAX-S, Speed Grade –1 | | | | | | | | |
| RTAX250S | Bit-serial | Rotate | 196 | 106 | 302 | 8% | 92 | 6,283 |
| RTAX250S | Bit-serial | Vector | 185 | 105 | 290 | 7% | 100 | 5,780 |
| RTAX250S | Word-serial | Rotate | 413 | 124 | 537 | 14% | 74 | 338 |
| RTAX250S | Word-serial | Vector | 405 | 133 | 538 | 14% | 75 | 333 |
| RTAX1000S | Parallel | Rotate | 4,633 | 1,832 | 6,465 | 36% | 89 | 11.2 |
| RTAX1000S | Parallel | Vector | 4,617 | 1,835 | 6,452 | 36% | 81 | 12.3 |
| 54SX-A, Speed Grade –2 | | | | | | | | |
| 54SX72A | Bit-serial | Rotate | 190 | 105 | 295 | 5% | 67 | 8,627 |
| 54SX72A | Bit-serial | Vector | 195 | 105 | 300 | 5% | 71 | 8,141 |
| 54SX72A | Word-serial | Rotate | 656 | 132 | 788 | 13% | 55 | 455 |
| 54SX72A | Word-serial | Vector | 643 | 124 | 767 | 13% | 50 | 500 |
| RT54SX-S, Speed Grade –1 | | | | | | | | |
| RT54SX72S | Bit-serial | Rotate | 189 | 104 | 293 | 5% | 55 | 10,509 |
| RT54SX72S | Bit-serial | Vector | 190 | 104 | 294 | 5% | 55 | 10,509 |
| RT54SX72S | Word-serial | Rotate | 677 | 132 | 809 | 13% | 33 | 758 |
| RT54SX72S | Word-serial | Vector | 664 | 125 | 789 | 13% | 34 | 735 |
Note: The above data were obtained by typical synthesis and place-and-route methods. Other core parameter settings can result in different utilization and performance values.
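The transform times in Table 2 appear to follow directly from the clock rate and the cycle counts this datasheet gives for each architecture: bit_width × iterations + 2 clocks for bit-serial and iterations + 1 clocks for word-serial. The sketch below assumes that relationship, and further assumes that the parallel rows report one clock period (one new result per clock) rather than the pipeline latency. The function name `transform_time_ns` is hypothetical.

```python
def transform_time_ns(architecture, clock_mhz, bit_width=24, iterations=24):
    """Estimate the transform time in nanoseconds from the clock rate.

    Cycle counts are the ones stated in this datasheet; treating the
    parallel figure as a single clock period is an assumption.
    """
    cycles = {
        "bit-serial": bit_width * iterations + 2,
        "word-serial": iterations + 1,
        "parallel": 1,
    }[architecture]
    return cycles * 1000.0 / clock_mhz

# Fusion AFS600 rows of Table 2 (24-bit words, 24 iterations).
bit_serial_ns = transform_time_ns("bit-serial", 88)
word_serial_ns = transform_time_ns("word-serial", 30)
parallel_ns = transform_time_ns("parallel", 46)
```

Rounded, these three values match the 6,568 ns, 833 ns, and 21.7 ns entries listed for the AFS600 rotate configurations.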
Architectures
Word-Serial Architecture
Direct implementation of the CORDIC iterative equations (see "References" on page 12) yields the block diagram
shown in Figure 4. The vector coordinates to be converted, or initial values, are loaded via multiplexers into registers
RegX, RegY, and RegA. RegA, along with an adjacent adder/subtractor, multiplexer, and a small arctan LUT, is often
called an angle accumulator. Then on each of the following clock cycles, the registered values are passed through
adders/subtractors and shifters. The results described by EQ 5 through EQ 7 on page 2 are loaded back to the same
registers. Every iteration takes one clock cycle, so that in n clock cycles, n iterations are performed and the converted
coordinates are stored in the registers.
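[Figure 4: block diagram; inputs x0, y0, and a0 load registers RegX, RegY, and RegA through multiplexers. Shifters (>> i) and adders/subtractors controlled by di feed the registers back, the arctan LUT feeds the angle accumulator, and the sign-controlling logic derives di from the sign of ai or yi according to the mode (rotation/vectoring). Outputs are xn, yn, and an.]
Figure 4 • Word-Serial CORDIC Block Diagram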
Depending on the CORDIC mode (rotation or vectoring),
the sign-controlling logic watches either the RegY or the
RegA sign bit. Based on EQ 8 and EQ 9 on page 2, it
decides what type of operation (addition or subtraction)
needs to be performed at every iteration. The arctan LUT
keeps a pre-computed table of the $\arctan(2^{-i})$ values. The
number of entries in the arctan LUT equals the desired
number of iterations, n.
The word-serial CORDIC engine takes n + 1 clock cycles to
complete a single vector coordinate conversion.
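The word-serial datapath described above (three registers, two shifters, an arctan LUT, one iteration per clock) can be modelled with integer arithmetic. The sketch below is a behavioural illustration for rotation mode only, under several assumptions that are not taken from the generated RTL: the LUT stores angles scaled so that π radians maps to 2^(bit_width − 1), the extra clock in the n + 1 cycle count is the load cycle, and word-length overflow is not modelled. The names `make_arctan_lut` and `word_serial_rotate` are hypothetical.

```python
import math

def make_arctan_lut(iterations, bit_width):
    """Pre-computed arctan(2^-i) table, one entry per iteration.

    Assumed scaling: pi radians corresponds to 2^(bit_width - 1).
    """
    scale = 2 ** (bit_width - 1) / math.pi
    return [round(math.atan(2.0 ** -i) * scale) for i in range(iterations)]

def word_serial_rotate(x0, y0, a0, iterations, bit_width):
    """Rotation-mode model: one shift-and-add iteration per clock cycle."""
    lut = make_arctan_lut(iterations, bit_width)
    reg_x, reg_y, reg_a = x0, y0, a0   # load cycle (assumed to be the "+ 1")
    cycles = 1
    for i in range(iterations):
        d = -1 if reg_a < 0 else 1     # sign-controlling logic watches RegA
        # Python's >> on negative ints is an arithmetic (sign-extending) shift.
        reg_x, reg_y, reg_a = (reg_x - d * (reg_y >> i),
                               reg_y + d * (reg_x >> i),
                               reg_a - d * lut[i])
        cycles += 1
    return reg_x, reg_y, reg_a, cycles

# 16-bit example: 1.0 is taken as 2^14, and x0 is preloaded with 1/K.
ONE = 2 ** 14
gain = 1.0
for i in range(16):
    gain *= math.sqrt(1.0 + 4.0 ** -i)
angle_word = round((math.pi / 6) * 2 ** 15 / math.pi)
xn, yn, an, cycles = word_serial_rotate(round(ONE / gain), 0, angle_word, 16, 16)
```

With these inputs, `xn / ONE` and `yn / ONE` approximate cos(π/6) and sin(π/6) to within a few least significant bits, and `cycles` equals n + 1 = 17.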
Parallel Pipelined Architecture
This architecture presents an unrolled version of the
sequential CORDIC algorithm above. Instead of reusing
the same hardware for all iteration stages, the parallel
architecture has a separate hardware processor for every
CORDIC iteration. An example of the parallel CORDIC
architecture configured for rotation mode is shown in
Figure 5.
Each of the n processors performs a specific iteration,
and a particular processor always performs the same
iteration. This leads to a simplification of the hardware.
All the shifters perform the fixed shift, which means
these can be implemented in the FPGA wiring. Every
processor utilizes a particular arctan value that can also
be hardwired to the input of every angle accumulator.
Yet another simplification is an absence of a state
machine.
The parallel architecture is obviously faster than the
sequential architecture described in the "Word-Serial
Architecture" section on page 5. It accepts new input
data and puts out the results at every clock cycle. The
architecture introduces a latency of n clock cycles.
[Figure 5: block diagram; a cascade of n stages. Stage i takes xi, yi, and ai, applies fixed shifts (>> i), adders/subtractors controlled by di, and a hardwired constant arctan(2^–i), and registers the results as xi+1, yi+1, and ai+1. Inputs are x0, y0, and a0; outputs are xn, yn, and an.]
Figure 5 • Parallel CORDIC Architecture
Bit-Serial Architecture
Whenever the CORDIC conversion speed is not an issue, this architecture provides the smallest FPGA implementation. For example, in order to initialize a Sine/Cosine LUT, the bit-serial CORDIC is the solution. Figure 6 depicts the simplified block diagram of the bit-serial architecture. The shift registers get loaded with initial data presented in bit-parallel form, i.e., all bits at once. The data then shifts to the right before arriving at the serial adders/subtractors. Every iteration takes m clock cycles, where m is the CORDIC bit resolution. Serial shifters are implemented by properly tapping the bits of the shift registers. The control circuitry (not shown in Figure 6) provides sign-padding of the shifted serial data to realize its correct sign extension. The results from the serial adders return to the shift registers, so that in m clock cycles the results of another iteration are stored in the shift registers.
A single full CORDIC conversion takes n×m+2 clock cycles.
[Figure 6: block diagram; shift registers xn, yn, and an loaded from x0, y0, and a0, serial adders/subtractors (+/–), sign signals signX and signY, and an arctan serial ROM.]
Figure 6 • Bit-Serial CORDIC Architecture
I/O Formats
Q Format Fixed-Point Numbers
CoreCORDIC, as virtually any FPGA DSP core does, utilizes fixed-point arithmetic. In particular, the numbers the core operates with are presented as two's complement signed fractional numbers. To identify the position of a binary point separating the integer and fractional portions of the number, the Q format is commonly used. An mQn format number is an (n+1)-bit signed two's complement fixed-point number: a sign bit followed by n significant bits with the binary point placed immediately to the right of the m most significant bits. The m MSBs represent the integer part, and (n–m) LSBs represent the fractional part of the number, called the mantissa. Table 3 depicts an example of a 1Qn format number.
Table 3 • 1Qn Format Number
| Bit $2^n$ (msb) | Bit $2^{n-1}$ | Bits [$2^{n-2}$ : $2^0$] (n–2, ..., 3, 2, 1, 0; bit 0 is the lsb) |
|---|---|---|
| Sign | Integer bit | Mantissa |
The binary point is positioned between the integer bit and the mantissa.
Table 4 • Qn Format Number
| Bit $2^n$ (msb) | Bits [$2^{n-1}$ : $2^0$] (..., 3, 2, 1, 0; bit 0 is the lsb) |
|---|---|
| Sign | Mantissa |
The binary point is positioned between the sign bit and the mantissa.
The following sections explain in detail the formats of the input and output signals. The linear and angular values are explained separately. The linear signals include Cartesian coordinates and a vector magnitude. These come to the CORDIC engine inputs x0 and y0, or appear on its outputs xn and yn. Since the sine and cosine functions the CORDIC calculates are essentially the Cartesian coordinates of a vector, they are treated as linear signals as well. The angular signals include the vector phase that comes to the CORDIC engine input a0, or appears on its output an. Both linear and angular signals utilize mQn formats and appropriate conversion rules from floating-point to the mQn formats.
I/O Linear Format
The CoreCORDIC engine utilizes the 1Qn format shown in Table 3. Though the 1Qn format numbers are capable of expressing fixed-point numbers in the range from $-2^{n}$ to $2^{n} - 2^{m-n}$, the input linear data must be limited to fit the smaller range from $-2^{n-1}$ to $2^{n-1}$. In terms of floating-point numbers, the input must fit the range from –1.0 to +1.0. For example, the 1Q9 format input data range is limited by the following 10-bit numbers:
Max input negative number of –1.0:
1100000000 ⇔ 11.00000000
Max input positive number of +1.0:
0100000000 ⇔ 01.00000000
This precaution is taken to prevent the data overflow that otherwise could occur as a result of the CORDIC inherent processing gain. The output data do not have to fit the limited range.
To convert floating-point linear input data to the 1Qn format, follow the simple rule in EQ 10:
1Qn Fixed-Point Data = $2^{n-1}$ × Floating-Point Data
EQ 10
Here it is assumed the floating-point data are presented in the range from –1.0 to 1.0. The product on the right-hand side of EQ 10 contains integer and fractional parts. The fractional part has to be truncated or rounded. Table 5 shows a few examples of converting the floating-point numbers to the 1Q9 format.
To convert the 1Qn format back to the floating-point format, use EQ 11.
Floating-Point Data = 1Qn Fixed-Point Data / $2^{n-1}$
EQ 11
Table 5 • Floating-Point to 1Q9 Format Conversion
| Floating-Point Number X | P = X × $2^{n-1}$ | P Rounded | Common Binary Format | 1Q9 Format |
|---|---|---|---|---|
| 1.00 | 256 | 256 | 0100000000 | 01.00000000 |
| 0.678915 | 173.80224 | 174 | 0010101110 | 00.10101110 |
| 0.047216 | 12.087296 | 12 | 0000001100 | 00.00001100 |
| –1.00 | –256 | –256 | 1100000000 | 11.00000000 |
| –0.678915 | –173.80224 | –174 | 1101010010 | 11.01010010 |
| –0.047216 | –12.087296 | –12 | 1111110100 | 11.11110100 |
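The conversion rules EQ 10 and EQ 11 can be checked with a short Python sketch. It assumes rounding to the nearest integer (the text allows truncation or rounding) and takes n = 9 for the 10-bit 1Q9 examples; the helper names are hypothetical.

```python
def float_to_1qn(value, n=9):
    """EQ 10: scale a float in [-1.0, 1.0] by 2^(n-1) and round."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("linear input must lie in the range -1.0 to +1.0")
    return round(value * 2 ** (n - 1))

def fixed_1qn_to_float(word, n=9):
    """EQ 11: divide the signed fixed-point word by 2^(n-1)."""
    return word / 2 ** (n - 1)

def twos_complement_bits(word, width):
    """Render a signed integer as a two's complement bit string."""
    return format(word & ((1 << width) - 1), "0{}b".format(width))

# The "P Rounded" column of Table 5.
samples = [1.00, 0.678915, 0.047216, -1.00, -0.678915, -0.047216]
words = [float_to_1qn(x) for x in samples]
bits = [twos_complement_bits(w, 10) for w in words]
```

Here `words` holds the rounded products and `bits` the corresponding 10-bit patterns; converting back with `fixed_1qn_to_float` recovers each input to within half an LSB, that is, 1/512 for 1Q9.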
I/O Angular Format
The angle (phase) signals are a0 and an. They are presented in Qn format, as shown in Table 4. The relation between the floating-point angular value expressed in radians and the Qn format is shown in EQ 12 (see note 1).
Qn Fixed-Point Angle = $2^{n-1}$ × Floating-Point Angle / π
EQ 12
In EQ 12, the floating-point angle is measured in radians. The product on the right-hand side of EQ 12 contains integer and fractional parts. The fractional part must be truncated or rounded.
EQ 13 presents a rule for the conversion from the Qn format back to the floating-point radian measure.
Floating-Point Angle = Qn Fixed-Point Angle × π / $2^{n-1}$
EQ 13
The conversion formulae (EQ 12 and EQ 13) support an important feature that greatly simplifies sine and cosine table calculations. Such tables usually have a power-of-two number of entries (lines). At the same time, they often span angular values from –π/2 to π/2 radians. Therefore, it is beneficial to represent the angle of π/2 radians with a power-of-two fixed-point number. In particular, when having the CORDIC engine calculate the sin(θ) and cos(θ) table, it is sufficient to increment the fixed-point angular argument θ at each cycle.
The angular value range is from –π/2 to π/2, or in Q9 format:
Max input negative number of –π/2:
1100000000 ⇔ 1.100000000
Max input positive number of +π/2:
0100000000 ⇔ 0.100000000
Table 6 shows a few examples of converting floating-point numbers to Q9 format.
Table 6 • Examples of Angular Value to Fixed-Point Conversion
| Floating-Point Angle A | A (rad) | P (fixed-point value) | Common Binary Format | Q9 Format (sign.mantissa) |
|---|---|---|---|---|
| π/2 | 1.5707963268 | 256 | 0100000000 | 0.100000000 |
| π/4 | 0.7853981634 | 128 | 0010000000 | 0.010000000 |
| π/256 | 0.0122718463 | 2 | 0000000010 | 0.000000010 |
| –π/2 | –1.5707963268 | –256 | 1100000000 | 1.100000000 |
| –π/4 | –0.7853981634 | –128 | 1110000000 | 1.110000000 |
| –π/256 | –0.0122718463 | –2 | 1111111110 | 1.111111110 |
Note 1: This format means, literally, the angle of π radians is expressed as the floating-point value of 1.0.
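The angular conversion can be sketched the same way. The code below takes the exponent in EQ 12 and EQ 13 to be set by the word width in bits, so that π radians maps to 2^(bit_width − 1). That reading is an assumption, chosen because it is the one that reproduces the 10-bit values of Table 6 (π/2 → 256). The function names are hypothetical.

```python
import math

def angle_to_fixed(angle_rad, bit_width=10):
    """EQ 12 with pi mapped to 2^(bit_width - 1) (assumed reading)."""
    if not -math.pi / 2 <= angle_rad <= math.pi / 2:
        raise ValueError("angle must lie in the range -pi/2 to +pi/2")
    return round(angle_rad * 2 ** (bit_width - 1) / math.pi)

def fixed_to_angle(word, bit_width=10):
    """EQ 13: convert the signed fixed-point word back to radians."""
    return word * math.pi / 2 ** (bit_width - 1)

# The P column of Table 6.
angles = [math.pi / 2, math.pi / 4, math.pi / 256,
          -math.pi / 2, -math.pi / 4, -math.pi / 256]
fixed = [angle_to_fixed(a) for a in angles]

# Sine/cosine table arguments: stepping the fixed-point word by 1
# walks the range -pi/2 .. +pi/2 in equal increments.
table_args = [fixed_to_angle(k) for k in range(-256, 257)]
```

The last line illustrates the point made in the text: because π/2 is a power of two in fixed-point form, a table generator only needs to increment the angle word by one at each step.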
CoreCORDIC Configuration Parameters
CoreCORDIC generates the CORDIC engine RTL code based on parameters set by the user when generating the
module. The core generator supports the variations specified in Table 7.
Table 7 • Core Generator Parameters
| Parameter Name | Description | Values |
|---|---|---|
| module_name | Name of the generated RTL code module | – |
| architecture | Bit-serial, word-serial, or word parallel architecture | 0 (bit-serial), 1 (word-serial), 2 (parallel). Default value = 0. |
| mode | Vector rotation (polar to rectangular coordinate conversion and sine/cosine calculation) or vector translation (rectangular to polar conversion) | 0 (vector rotation), 1 (vector translation). Default value = 0. |
| bit_width | I/O data bit width | 8–48. Default value = 16. |
| iterations | Number of iterations | 8–48. Default value = bit_width.* |
| fpga_family | Family of the Actel FPGA device | ax (Axcelerator), apa (ProASICPLUS), pa3 (ProASIC3), sx (SX-A), af (Fusion) |
| lang | RTL code language | vhdl, verilog |
Note: *A warning is issued if the number of iterations is set greater than the bit width.
I/O Signal Description
Figure 7 shows the CoreCORDIC module pinout.
[Figure 7: pinout of the CORDIC module; inputs x0, y0, a0, ldData, rst, clkEn, clk, and nGrst; outputs xn, yn, an, and rdyOut.]
Figure 7 • CoreCORDIC I/O Signals
The CoreCORDIC module I/O signal functionality is listed in Table 8.
Table 8 • I/O Signal Descriptions
| Signal Name | Direction | Description |
|---|---|---|
| x0 [bit_width – 1 : 0] | Input | Input data bus x0. The abscissa of the input vector in the vectoring mode or the magnitude of the input vector in rotation mode should be placed on this bus. Bit [bit_width – 1] is the MSB. Data are assumed to be presented in two's complement format. The other vector coordinates are to be supplied simultaneously. |
| y0 [bit_width – 1 : 0] | Input | Input data bus y0. The ordinate of the input vector in the vectoring mode should be placed on this bus. In rotation mode, the bus should be grounded or left idle. Bit [bit_width – 1] is the MSB. Data are assumed to be presented in two's complement format. The other vector coordinates are to be supplied simultaneously. |
| a0 [bit_width – 1 : 0] | Input | Input angle data bus a0. The phase of the input vector in the rotation mode should be placed on this bus. In vectoring mode, the bus should be grounded or left idle. Bit [bit_width – 1] is the MSB. Data are assumed to be presented in two's complement format. The other vector coordinates are to be supplied simultaneously. |
| clk | Input | System clock. Active rising edge. |
| nGrst | Input | System asynchronous reset. Active low. |
| rst | Input | System/module synchronous reset. Active high. Valid in parallel architecture only. Resets all registers of the core. |
| clkEn | Input | Clock enable signal. Active high. Valid in word-serial and bit-serial architectures. |
| ldData | Input | Load input data. Indicates that input vector coordinates are ready for the CORDIC engine to be processed. Active high. Valid in word-serial and bit-serial architectures. |
| rdyOut | Output | Output data (vector coordinates or sine/cosine values) are ready for the data receiver to read. Active high. Valid in word-serial and bit-serial architectures. |
| xn [bit_width – 1 : 0] | Output | Output data bus xn. The abscissa of the output vector in rotation mode or the magnitude of the output vector in the vectoring mode appears on this bus. Bit [bit_width – 1] is the MSB. Data are presented in two's complement format. The other vector coordinates emerge on their respective output busses simultaneously. |
| yn [bit_width – 1 : 0] | Output | Output data bus yn. The ordinate of the output vector in rotation mode. Bit [bit_width – 1] is the MSB. Data are presented in two's complement format. The other vector coordinates emerge on their respective output busses simultaneously. |
| an [bit_width – 1 : 0] | Output | Output data bus an. The phase of the output vector in vectoring mode. Bit [bit_width – 1] is the MSB. Data are presented in two's complement format. The other vector coordinates emerge on their respective output busses simultaneously. |
I/O Interface and Timing
Upon reset, the CORDIC core returns to its initial state. Signal nGrst asynchronously resets any architecture. Other I/O interfaces and timing depend on core architecture.
Bit-Serial Architecture Interface and Timing
Figure 8 depicts a typical timing diagram for the bit-serial architecture. Signal ldData resets the bit-serial CORDIC module and loads a set of data present on the a0, x0, and y0 input busses. The set of input data is shown in Figure 8 as In0. Normally, the next ldData signal has to come after the end of the current CORDIC cycle; i.e., after the rdyOut signal appears on the module output. If the next ldData signal is issued prior to the end of the current cycle, the CORDIC engine starts a new computation cycle and discards the incomplete results of the interrupted cycle.
Once the CORDIC engine completes calculating the result, it generates a rdyOut signal one clock period in width. The result on the output busses (an, xn, and yn) is valid while the rdyOut signal is active. The next ldData signal can coincide with the rdyOut signal. A valid, fresh set of input data, shown as In1 in Figure 8, must be ready by then.
One cycle of CORDIC computation = (bit_width × iterations + 2) clock cycles.
Signal clkEn can be manipulated as desired. While this signal is low, the CORDIC engine retains all the data it has collected or processed so far. Normally, the bit-serial CORDIC engine is used to fill up the LUT on a power-on event. Once the CORDIC fulfills this function, a high-level state machine may disable the clkEn signal.
[Figure 8: timing diagram over one CORDIC cycle; traces clk, ldData, x0/y0/a0 (In0, In1), rdyOut, and xn/yn/an (Out0).]
Figure 8 • Bit-Serial Architecture Timing Diagram
Word-Serial Architecture Interface and Timing
Figure 9 depicts a timing diagram for the word-serial architecture. It is very similar to the bit-serial timing diagram. Signal ldData resets the word-serial CORDIC module and loads the set of data present on the a0, x0, and y0 input busses. The set of input data is shown in Figure 9 as In0. Normally the next ldData signal must come after the end of the current CORDIC cycle; i.e., after the rdyOut signal appears on the module output. If the next ldData signal is issued prior to the end of the current cycle, the CORDIC engine starts a new computation cycle and discards the incomplete results of the interrupted cycle.
Once the CORDIC engine completes calculating the result, it generates a rdyOut signal one clock period in width. The result on the output busses (an, xn, and yn) is valid while the rdyOut signal is active. The next ldData signal can immediately follow the rdyOut signal. A valid, fresh set of input data, shown as In1, must be ready by then.
One cycle of CORDIC computation = (iterations + 1) clock cycles.
Signal clkEn can be manipulated as desired. While this signal is low, the CORDIC engine retains all the data it has
collected or processed so far. As an example, the word-serial CORDIC engine is used to fill up the LUT on a power-on
event. Once the CORDIC completes the task, a high-level state machine may disable the clkEn signal.
[Figure 9: timing diagram over one CORDIC cycle; traces clk, ldData, x0/y0/a0 (In0, In1), rdyOut, and xn/yn/an (Out0).]
Figure 9 • Word-Serial Architecture Timing Diagram
Parallel Architecture Interface and Timing
Figure 10 depicts a timing diagram for the parallel architecture. At the beginning of every clock cycle, a fresh set of
input arguments a0, x0, and y0 enters the CORDIC engine. No control signals accompany the input data. The CORDIC
engine puts out the results at the beginning of every clock cycle with the latency of iterations clock cycles.
Signal rst synchronously resets the parallel architecture; i.e., resets all the registers of the parallel engine.
[Figure 10: timing diagram showing the CORDIC latency; traces clk, x0/y0/a0 (In0, In1, In2, In3), and xn/yn/an (Out0, Out1, Out2, Out3).]
Figure 10 • Parallel Architecture Timing Diagram
References
J.E. Volder, "The CORDIC Trigonometric Computing Technique," IRE Transactions on Electronic Computers, EC-8:330-334, 1959. http://lap.epfl.ch/courses/comparith/Papers/3-Volder_CORDIC.pdf
Ray Andraka, "A Survey of CORDIC Algorithms for FPGA Based Computers," 1998. http://www.fpga-guru.com/files/crdcsrvy.pdf
Norbert Lindlbauer, "The CORDIC-Algorithm for Computing a Sine," 2000. http://www.cnmat.berkeley.edu/~norbert/cordic/node4.html
Grant R. Griffin, "CORDIC FAQ." http://www.dspguru.com/info/faqs/cordic.htm
A Sample Configuration File
The following is an example of the configuration file:
module_name   Cordic_test
architecture  0
mode          0
bit_width     16
iterations    16
fpga_family   pa3
lang          verilog
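A configuration of this kind can be checked against the limits in Table 7 before the generator is run. The sketch below is not part of CoreCORDIC: the function name `parse_config` is hypothetical, and the file syntax is assumed to be whitespace-separated name/value pairs, as the sample above suggests. Only the allowed values and defaults are taken from Table 7.

```python
# Allowed values transcribed from Table 7; the parsing rules are assumed.
FPGA_FAMILIES = {"ax", "apa", "pa3", "sx", "af"}
LANGUAGES = {"vhdl", "verilog"}

def parse_config(text):
    """Parse name/value pairs and validate them against Table 7.

    Returns (config dict, list of warning strings).
    """
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("expected name/value pairs")
    raw = dict(zip(tokens[0::2], tokens[1::2]))

    cfg = {
        "module_name": raw.get("module_name"),
        "architecture": int(raw.get("architecture", 0)),   # default 0
        "mode": int(raw.get("mode", 0)),                    # default 0
        "bit_width": int(raw.get("bit_width", 16)),         # default 16
        "fpga_family": raw.get("fpga_family"),
        "lang": raw.get("lang"),
    }
    # iterations defaults to bit_width
    cfg["iterations"] = int(raw.get("iterations", cfg["bit_width"]))

    if cfg["architecture"] not in (0, 1, 2):
        raise ValueError("architecture must be 0, 1, or 2")
    if cfg["mode"] not in (0, 1):
        raise ValueError("mode must be 0 or 1")
    for name in ("bit_width", "iterations"):
        if not 8 <= cfg[name] <= 48:
            raise ValueError(name + " must be in the range 8-48")
    if cfg["fpga_family"] not in FPGA_FAMILIES:
        raise ValueError("unknown fpga_family")
    if cfg["lang"] not in LANGUAGES:
        raise ValueError("lang must be vhdl or verilog")

    warnings = []
    if cfg["iterations"] > cfg["bit_width"]:
        warnings.append("iterations exceeds bit_width")
    return cfg, warnings

SAMPLE = """module_name Cordic_test
architecture 0
mode 0
bit_width 16
iterations 16
fpga_family pa3
lang verilog"""
config, config_warnings = parse_config(SAMPLE)
```

For the sample file this yields a bit-serial, rotation-mode, 16-bit configuration with no warnings; raising `iterations` above `bit_width` produces the warning that the note under Table 7 describes.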
Ordering Information
Order CoreCORDIC through your local Actel sales representative. Use the following numbering convention when
ordering: CoreCORDIC-XX, where XX is listed in Table 9.
Table 9 • Ordering Codes
| XX | Description |
|---|---|
| EV | Evaluation version |
| AR | RTL for unlimited use on Actel devices |
| UR | RTL for unlimited use and not restricted to Actel devices |
Datasheet Categories
In order to provide the latest information to designers, some datasheets are published before data has been fully
characterized. Datasheets are designated as "Product Brief," "Advanced," and "Production." The definitions of these
categories are as follows:
Product Brief
The product brief is a summarized version of an advanced or production datasheet containing general product
information. This brief summarizes specific device and family information for unreleased products.
Advanced
This datasheet version contains initial estimated information based on simulation, other products, devices, or speed
grades. This information can be used as estimates, but not for production.
Unmarked (production)
This datasheet version contains information that is considered to be final.
Appendix I
Polar and Rectangular Coordinate Relations
[Figure 11: vector of magnitude r at angle θ from the X axis, with axes X and Y.]
Figure 11 • Cartesian Coordinate Definition
The Cartesian coordinates (X, Y) are defined in terms of the polar coordinates r (vector magnitude, or radial
coordinate) and θ (vector phase, or polar angle), as given in EQ 14 and EQ 15.
X = r cos θ
EQ 14
Y = r sin θ
EQ 15
In terms of Cartesian coordinates, the polar coordinates are expressed as given in EQ 16 and EQ 17.
$r = \sqrt{X^2 + Y^2}$
EQ 16
θ = arctan ( Y ⁄ X )
EQ 17
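EQ 14 through EQ 17 give the reference values against which CORDIC outputs can be compared once the gain K has been divided out. A minimal floating-point sketch, with hypothetical function names, follows. It uses `math.atan2` rather than a plain arctangent of Y/X; the two agree for X > 0, and `atan2` additionally resolves the quadrant.

```python
import math

def polar_to_cartesian(r, theta):
    """EQ 14 and EQ 15."""
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_polar(x, y):
    """EQ 16 and EQ 17 (atan2 equals arctan(Y/X) when X > 0)."""
    return math.hypot(x, y), math.atan2(y, x)

x, y = polar_to_cartesian(2.0, math.pi / 3)
r, theta = cartesian_to_polar(x, y)
```

The round trip returns the original magnitude and phase to within floating-point precision.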
Appendix II
Examples of CORDIC Modes
Polar to Cartesian Coordinate Conversion
The CORDIC engine is in rotation mode. Input data represent magnitude r and phase θ of the vector whose polar
coordinates are to be converted to Cartesian coordinates. The CORDIC engine puts out a pair of Cartesian coordinates
(X*K, Y*K) scaled by processing gain K (Figure 12).
[Figure 12: vector diagram; input vector of magnitude r at angle θ and output vector of magnitude r*K with components X*K and Y*K.]
Figure 12 • Polar to Cartesian Vector Conversion
General Rotation
The CORDIC engine is in rotation mode. Input data (X0 , Y0 , Angle) represent initial vector Cartesian coordinates, as
well as an angle to rotate the vector. The CORDIC engine puts out a pair of Cartesian coordinates (X*K, Y*K) of the
resulting rotated vector scaled by processing gain K (Figure 13).
[Figure 13: vector diagram; initial vector (X0, Y0), the angle to rotate, and the resulting vector of magnitude r*K with components X*K and Y*K.]
Figure 13 • CORDIC General Vector Rotation
Sine and Cosine CORDIC Calculator
The CORDIC engine is in rotation mode. Input data r = 1/K and phase θ represent initial vector polar coordinates. The
CORDIC engine puts out a pair of Cartesian coordinates equal to (cosθ, sinθ ), as shown in Figure 14.
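[Figure 14: vector diagram; input vector of magnitude 1/K at angle θ and output vector of magnitude 1 with components cosθ and sinθ.]
Figure 14 • Sine and Cosine CORDIC Computation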
Cartesian to Polar Coordinate Conversion
The CORDIC engine is in vectoring mode. Input data represent Cartesian coordinates (X0, Y0) of the input vector. The
CORDIC engine puts out a pair of polar coordinates: magnitude r*K and phase θ of the input vector (Figure 15).
[Figure 15: vector diagram; input vector (X0, Y0), its phase θ, and the output magnitude r*K.]
Figure 15 • Cartesian to Polar Coordinate Conversion
CORDIC Square Root Calculator
The CORDIC engine is in vectoring mode. Input data represent Cartesian coordinates (X0, Y0) of the input vector. The CORDIC engine puts out a pair of polar coordinates: magnitude $r = K \sqrt{x_0^2 + y_0^2}$ and phase θ of the input vector (Figure 16).
[Figure 16: vector diagram; input vector (X0, Y0) and the output magnitude r*K.]
Figure 16 • CORDIC Square Root Calculator
CORDIC Arctan Calculator
The CORDIC engine is in vectoring mode. Input data represent Cartesian coordinates (X0, Y0) of the input vector. The
CORDIC engine puts out a pair of polar coordinates: magnitude r and phase θ = arctan(Y0 / X0) of the input vector.
[Figure 17: vector diagram; input vector (X0, Y0) and the output phase arctan(Y0/X0).]
Figure 17 • CORDIC Arctan Phase Calculator
Actel and the Actel logo are registered trademarks of Actel Corporation.
All other trademarks are the property of their owners.
www.actel.com

Actel Corporation
2061 Stierlin Court
Mountain View, CA 94043-4655 USA
Phone 650.318.4200
Fax 650.318.4600

Actel Europe Ltd.
Dunlop House, Riverside Way
Camberley, Surrey GU15 3YL
United Kingdom
Phone +44 (0) 1276 401 450
Fax +44 (0) 1276 401 490

Actel Japan
www.jp.actel.com
EXOS Ebisu Bldg. 4F
1-24-14 Ebisu Shibuya-ku
Tokyo 150 Japan
Phone +81.03.3445.7671
Fax +81.03.3445.7668

Actel Hong Kong
www.actel.com.cn
Suite 2114, Two Pacific Place
88 Queensway, Admiralty
Hong Kong
Phone +852 2185 6460
Fax +852 2185 6488

51700064-0/3.06