ZARLINK PDSP16116MC

PDSP16116/A/MC
16 by 16 Bit Complex Multiplier
DS3858
The PDSP16116A will multiply two complex (16 + 16) bit
words every 50ns and can be configured to output the
complete complex (32 + 32) bit result within a single cycle. The
data format is fractional two's complement.
The PDSP16116/A contains four 16 x 16 Array Multipliers,
two 32 bit Adder/Subtractors and all the control logic required
to support Block Floating Point Arithmetic as used in FFT
applications. In combination with a PDSP16318, the
PDSP16116A forms a two chip 10MHz Complex Multiplier
Accumulator with 20 bit accumulator registers and output
shifters. The PDSP16116 in combination with two
PDSP16318s and two PDSP1601s forms a complete 10MHz
Radix 2 DIT FFT Butterfly solution which fully supports Block
Floating Point Arithmetic. The PDSP16116/A has an
extremely high throughput that is suited to recursive
algorithms as all calculations are performed with a single
pipeline delay (two cycle fall-through).
ISSUE 3.0
June 2000
Ordering Information
PDSP16116 MC GC1R     10MHz   MIL-883 screened ceramic QFP
PDSP16116 MC AC1R     10MHz   MIL-883 screened PGA package
PDSP16116A MC GC1R    20MHz   MIL-883 screened ceramic QFP
PDSP16116A MC AC1R    20MHz   MIL-883 screened PGA package
FEATURES
■ Complex Number (16 + 16) X (16 + 16) Multiplication
■ Full 32 bit Result
■ 20MHz Clock Rate
■ Block Floating Point FFT Butterfly Support
■ -1 times -1 Trap
■ Two's Complement Fractional Arithmetic
■ TTL Compatible I/O
■ Complex Conjugation
■ 2 Cycle Fall Through
■ 144 pin PGA or QFP packages
APPLICATIONS
■ Fast Fourier Transforms
■ Digital Filtering
■ Radar and Sonar Processing
■ Instrumentation
■ Image Processing
ASSOCIATED PRODUCTS
PDSP16318/A    Complex Accumulator
PDSP16112/A    (16 + 16) X (12 + 12) Complex Multiplier
PDSP16330/A    Pythagoras Processor
PDSP1601/A     ALU and Barrel Shifter
PDSP16350      Precision Digital Modulator
PDSP16256      Programmable FIR Filter
PDSP16510      Single Chip FFT Processor
Fig.1 Simplified Block Diagram
CHANGE NOTIFICATION
The change notification requirements of MIL-M-38510 will be
implemented on this device type. Known customers will be
notified of any changes since last buy when ordering further
parts if significant changes have been made.
Rev    Date
A      JULY 1993
B      OCT 1998
C      JUN 2000
D
The PDSP16116 has a number of features tailored for
system applications.
-1 x -1 Trap
In multiply operations utilising Two's Complement
Fractional notation, the -1 x -1 operation forms an invalid result
as +1 is not representable in the fractional number range. The
PDSP16116/A eliminates this problem by trapping the
-1 x -1 operation and forcing the Multiplier result to become the
most positive representable number.
Complex Conjugation
Many algorithms utilising complex arithmetic require
conjugation of a complex data stream. This operation has
traditionally required an additional ALU to multiply the
imaginary component by -1. The PDSP16116 eliminates the
requirement for the extra ALU by offering on chip complex
conjugation of either of the two incoming complex data words
with no loss in throughput.
Easy Interfacing
As with all PDSP family members the PDSP16116 has
registered I/O for data and control. Data inputs have
independent clock enables and data outputs have
independent three state output enables.
Signal      Type     Description                                   Normal mode configuration
XR15:0      INPUT    16 bit input for real x data
XI15:0      INPUT    16 bit input for imag x data
YR15:0      INPUT    16 bit input for real y data
YI15:0      INPUT    16 bit input for imag y data
PR15:0      OUTPUT   16 bit output for real p data
PI15:0      OUTPUT   16 bit output for imag p data
CLK         INPUT    Clock, new data is loaded on rising edge of CLK
CEX         INPUT    Clock enable, X-port input register
CEY         INPUT    Clock enable, Y-port input register
CONX        INPUT    Conjugate X data
CONY        INPUT    Conjugate Y data
ROUND       INPUT    Rounds the real & imag results
MBFP        INPUT    Mode select (BFP/Normal)                      Tie Low
SOBFP       INPUT    Start of BFP operations **                    Tie Low
EOPSS       INPUT    End of pass **                                Tie Low
AR15:13     INPUT    3 MSBs from real part of A-word **            Tie Low
AI15:13     INPUT    3 MSBs from imag part of A-word **            Tie Low
WTA1:0      INPUT    Word tag from A-word                          Tie Low
WTB1:0      INPUT    Word tag from B-word / shift control *
WTOUT1:0    OUTPUT   Word tag output **
SFTA1:0     OUTPUT   Shift control for A-word / overflow flag *
SFTR2:0     OUTPUT   Shift control for accumulator result **
GWR4:0      OUTPUT   Global weighting register contents **
OSEL1:0     INPUT    Selects the desired output configuration
OER, OEI    INPUT    Output enables
VDD         POWER    +5V Supply (all supply pins must be connected)
GND         POWER    0V Supply (all supply pins must be connected)
* Indicates pin performs different functions in BFP / Normal modes.
** Indicates pin is used only in BFP mode
Table 1 - Signal Descriptions
Figure 2 - Block Diagram
AC144 (POWER): Pin connections for 144 I/O power pin grid array package (bottom view)
GC144: Pin connections for 144 I/O ceramic quad flatpack (top view)
Figure 3 - Pin connection diagrams (not to scale)
GC   AC   Signal   GC   AC   Signal   GC   AC   Signal   GC   AC   Signal
1    D3   PI14     37   N4   XI1      73   P2   GND      109  N14  VDD
2    C2   PI15     38   P3   XI2      74   R1   VDD      110  M13  GND
3    B1   WTOUT1   39   R2   XI3      75   P15  YR12     111  A14  PR13
4    D2   WTOUT0   40   P4   XI4      76   M14  YR11     112  B12  PR12
5    E3   SFTR0    41   N5   XI5      77   L13  YR10     113  C11  PR11
6    C1   SFTR1    42   R3   XI6      78   N15  YR9      114  A13  PR10
7    E2   SFTR2    43   P5   XI7      79   L14  YR8      115  B11  PR9
8    D1   OEI      44   R4   XI8      80   M15  YR7      116  A12  PR8
9    F2   CONX     45   N6   XI9      81   K13  YR6      117  C10  PR7
10   F3   CONY     46   P6   XI10     82   K14  YR5      118  B10  PR6
11   E1   ROUND    47   R5   XI11     83   L15  YR4      119  A11  PR5
12   G2   AI13     48   P7   XI12     84   J14  YR3      120  B13  GND
13   G3   AI14     49   N7   XI13     85   J13  YR2      121  C12  VDD
14   F1   AI15     50   R6   XI14     86   K15  YR1      122  A10  PR4
15   G1   AR13     51   R7   XI15     87   J15  YR0      123  A9   PR3
16   H2   AR14     52   P8   CEY      88   H14  EOPSS    124  B8   PR2
17   H1   AR15     53   R8   CEX      89   H15  VDD      125  A8   PR1
18   H3   YI15     54   N8   XR15     90   H13  SOBFP    126  C8   PR0
19   J3   YI14     55   N9   XR14     91   G13  WTB1     127  C7   PI0
20   J1   YI13     56   R9   XR13     92   G15  WTB0     128  A7   PI1
21   K1   YI12     57   R10  XR12     93   F15  WTA1     129  A6   PI2
22   J2   YI11     58   P9   XR11     94   G14  WTA0     130  B7   PI3
23   K2   YI10     59   P10  XR10     95   F14  MBFP     131  B6   PI4
24   K3   YI9      60   N10  XR9      96   F13  CLK      132  C6   VDD
25   L1   YI8      61   R11  XR8      97   E15  OSEL1    133  A5   PI5
26   L2   YI7      62   P11  XR7      98   E14  OSEL0    134  B5   GND
27   M1   YI6      63   R12  XR6      99   D15  OER      135  A4   PI6
28   N1   YI5      64   R13  XR5      100  C15  SFTA0    136  A3   PI7
29   M2   YI4      65   P12  XR4      101  D14  SFTA1    137  B4   PI8
30   L3   YI3      66   N11  XR3      102  E13  GWR0     138  C5   PI9
31   N2   YI2      67   P13  XR2      103  C14  GWR1     139  B3   PI10
32   P1   YI1      68   R14  XR1      104  B15  GWR2     140  A2   PI11
33   M3   YI0      69   N12  XR0      105  D13  GWR3     141  C4   PI12
34   N3   XI0      70   N13  YR15     106  C13  GWR4     142  C3   PI13
35   B2   GND      71   P14  YR14     107  B14  PR15     143  B9   GND
36   A1   VDD      72   R15  YR13     108  A15  PR14     144  C9   VDD
NOTE. All GND and VDD pins must be used
Figure 3A - Pin connections for AC144 (Power) and GC144 packages
NORMAL MODE OPERATION
When the MBFP mode select input is held low the 'Normal'
mode of operation is selected. This mode supports all
Complex Multiply operations that do not require Block Floating
Point arithmetic.
Complex two's complement fractional data is loaded into
the X and Y input registers via the X and Y Ports on the rising
edge of CLK. The Real and Imaginary components of the
fractional data are each assumed to have the following format:
BIT NUMBER  15  14    13    12    11    10    9     8     7     6     5      4      3      2      1      0
WEIGHTING   S   2^-1  2^-2  2^-3  2^-4  2^-5  2^-6  2^-7  2^-8  2^-9  2^-10  2^-11  2^-12  2^-13  2^-14  2^-15
Where S = sign bit, which has an effective weighting of -2^0.
The value of the 16 bit two's complement word is
Value = (-1 x S) + (bit14 x 2^-1) + (bit13 x 2^-2) + (bit12 x 2^-3) ...
Multiplier Stage
The X & Y port registers are individually enabled by the
CEX & CEY signals respectively. If the registers are required
to be permanently enabled, then these signals may be tied to
ground. On each clock cycle the contents of the input registers
are passed to the four multipliers to start a new Complex
Multiply operation. Each Complex Multiply operation requires
four partial products (Xr x Yr), (Xr x Yi), (Xi x Yr), (Xi x Yi), all
of which are calculated in parallel by the four 16 x 16
Multipliers. Only one clock cycle is required to complete the
multiply stage before the Multiplier results are loaded into the
Multiplier output registers for passing on to the Adder/
Subtractors in the next cycle. Each multiplier produces a 31
bit result with the duplicate sign bit eliminated. The format of
the output data from the Multipliers is
BIT NUMBER  30  29    28    27    26    25    24    ...  7      6      5      4      3      2      1      0
WEIGHTING   S   2^-1  2^-2  2^-3  2^-4  2^-5  2^-6  ...  2^-23  2^-24  2^-25  2^-26  2^-27  2^-28  2^-29  2^-30
The effective weighting of the sign bit is -2^0.
Result Correction
Due to the nature of the fractional two's complement
representation it is possible to represent -1 exactly but not +1.
With conventional multipliers this causes a problem when -1
is multiplied by -1, as the multiplier produces an incorrect
result. The PDSP16116 includes a trap to ensure that the
most positive number (value = 1 - 2^-30, hex = 7FFFFFFF) is
substituted for the incorrect result. The multiplier result is
therefore always a (correct) fractional value.
Complex Conjugation
Either the X or Y input data may be complex conjugated by
asserting the CONX or CONY signals respectively. Asserting
either of these signals has the effect of inverting (multiplying
by -1) the imaginary component of the respective input. Table
3 shows the effect of CONX and CONY on the X and Y inputs.
FUNCTION      OPERATION                   CONX   CONY
X x Y         (XR + jXI) x (YR + jYI)     low    low
X x Conj Y    (XR + jXI) x (YR - jYI)     low    high
Conj X x Y    (XR - jXI) x (YR + jYI)     high   low
Invalid       Invalid                     high   high
Table 3 - Conjugate Functions
Adder / Subtractor Stage
The 31 bit results from the Multipliers are passed to two
32 bit Adder/Subtractors. The Adder calculates the imaginary
result ((Xr x Yi) + (Xi x Yr)) and the Subtractor calculates the
Real result ((Xr x Yr) - (Xi x Yi)). Each Adder/Subtractor
produces a 32 bit result with the following format.
BIT NUMBER  31  30   29    28    27    26    ...  8      7      6      5      4      3      2      1      0
WEIGHTING   S   2^0  2^-1  2^-2  2^-3  2^-4  ...  2^-22  2^-23  2^-24  2^-25  2^-26  2^-27  2^-28  2^-29  2^-30
The effective weighting of the sign bit is -2^1.
Rounding
The ROUND control, when asserted, rounds the most
significant 16 bits of the full 32 bit result from the Adder/
Subtractor. If the ROUND signal is active (High), then bit 16
is set to a one, rounding the most significant 16 bits of the
Adder/Subtractor result. (The least significant 16 bits are
unaffected). Inserting a one ensures that the rounding error
is never greater than 1 LSB, and that no DC bias is introduced
as a result of the rounding process.
The format of the Rounded result is:
BIT NUMBER  31  30   29    28    27    ...  18     17     16     15     14     13     ...  2      1      0
WEIGHTING   S   2^0  2^-1  2^-2  2^-3  ...  2^-12  2^-13  2^-14  2^-15  2^-16  2^-17  ...  2^-28  2^-29  2^-30
Bits 31 to 16 hold the ROUNDED VALUE; bits 15 to 0 are the LSBs.
The effective weighting of the sign bit is -2^1.
Shifter
Each of the two Adder/Subtractors is followed by a Shifter
controlled via the WTB control input. These shifters can each
apply four different shifts; however, the same shift is applied to
both real and imaginary components. The four shift options
are:
i) WTB1:0 = 11 Shift complex product one place to the left
giving a shifter output format:
BIT NUMBER  31  30    29    28    27    26    25    ...  7      6      5      4      3      2      1      0
WEIGHTING   S   2^-1  2^-2  2^-3  2^-4  2^-5  2^-6  ...  2^-24  2^-25  2^-26  2^-27  2^-28  2^-29  2^-30  2^-31
The effective weighting of the sign bit is -2^0.
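The multiplier, result correction, conjugation, adder/subtractor and rounding stages described for normal mode can be sketched in a few lines of Python. This is an illustrative arithmetic model only, not a description of the device's internal logic: the function and constant names are invented for the example, inputs are taken as signed 16 bit integers in the fractional format given above, and the behaviour when a conjugated imaginary input is exactly -1 is not stated in the text, so the sketch simply lets Python's unbounded integers carry that case.

```python
Q15_MIN = -0x8000            # -1.0 in 16 bit fractional two's complement
Q30_MAX = (1 << 30) - 1      # most positive 31 bit product, 1 - 2**-30


def mult_q15(a, b):
    """One 16 x 16 multiplier: 31 bit product with the -1 x -1 trap."""
    if a == Q15_MIN and b == Q15_MIN:
        return Q30_MAX       # +1 is not representable, so substitute
    return a * b


def complex_multiply(xr, xi, yr, yi, conx=False, cony=False, round_result=False):
    """Return (real, imag) as 32 bit results in which bit 30 has weight 2**0."""
    if conx and cony:
        raise ValueError("CONX and CONY both high is listed as invalid")
    # Conjugation multiplies the imaginary part of the selected input by -1.
    # (Assumption: the xi == -1 or yi == -1 corner case is not modelled.)
    if conx:
        xi = -xi
    if cony:
        yi = -yi
    # Four partial products feed the subtractor (real) and adder (imaginary).
    real = mult_q15(xr, yr) - mult_q15(xi, yi)
    imag = mult_q15(xr, yi) + mult_q15(xi, yr)
    if round_result:
        # ROUND forces bit 16 to one; the low 16 bits are left untouched.
        real |= 1 << 16
        imag |= 1 << 16
    return real, imag


def to_float(value, frac_bits=30):
    """Interpret an adder/subtractor result as a real number."""
    return value / (1 << frac_bits)


half = 0x4000   # 0.5 in the 16 bit fractional format
print(to_float(complex_multiply(half, 0, half, 0)[0]))        # 0.5 x 0.5
print(to_float(complex_multiply(Q15_MIN, 0, Q15_MIN, 0)[0]))  # trapped -1 x -1
```

The trap is applied per multiplier, before the add/subtract, which matches the statement that the multiplier result is always a valid fractional value.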
Part No:
PDSP16116/A/MC 16 By 16 Bit Complex Multiplier
VDD max = +5.5V = V1
Package Type:
AC144
N/C = not connected
Pin No.
Con.
Pin No.
Con.
Pin No.
Con.
Pin No.
Con.
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
A12
A13
A14
A15
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
B11
B12
B13
B14
B15
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
D1
D2
D3
D4
D5
D6
D7
D8
D9
D10
D11
D12
D13
D14
D15
V1
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V
N/C
N/C
0V
N/C
N/C
N/C
0V
N/C
N/C
N/C
0V
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1
N/C
N/C
V1
N/C
N/C
V1
N/C
N/C
N/C
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1 100Ω
E1
E2
E3
E4
E5
E6
E7
E8
E9
E10
E11
E12
E13
E14
E15
F1
F2
F3
F4
F5
F6
F7
F8
F9
F10
F11
F12
F13
F14
F15
G1
G2
G3
G4
G5
G6
G7
G8
G9
G10
G11
G12
G13
G14
G15
H1
H2
H3
H4
H5
H6
H7
H8
H9
H10
H11
H12
H13
H14
H15
0V 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1 100Ω
V1 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1 100Ω
0V 100Ω
V1 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V 100Ω
0V 100Ω
V1
J1
J2
J3
J4
J5
J6
J7
J8
J9
J10
J11
J12
J13
J14
J15
K1
K2
K3
K4
K5
K6
K7
K8
K9
K10
K11
K12
K13
K14
K15
L1
L2
L3
L4
L5
L6
L7
L8
L9
L10
L11
L12
L13
L14
L15
M1
M2
M3
M4
M5
M6
M7
M8
M9
M10
M11
M12
M13
M14
M15
V1 100Ω
V1 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V
0V 100Ω
0V 100Ω
N1
N2
N3
N4
N5
N6
N7
N8
N9
N10
N11
N12
N13
N14
N15
P1
P2
P3
P4
P5
P6
P7
P8
P9
P10
P11
P12
P13
P14
P15
R1
R2
R3
R4
R5
R6
R7
R8
R9
R10
R11
R12
R13
R14
R15
V1 100Ω
V1 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
0V 100Ω
V1
0V 100Ω
V1 100Ω
0V
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
0V 100Ω
0V 100Ω
V1
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
0V 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
V1 100Ω
0V 100Ω
Figure 4(a) - Life Test/Burn-in connections
NOTE: PDA is 5% and based on groups 1 and 7
Part No:
PDSP16116/A/MC 16 By 16 Bit Complex Multiplier
Package Type:
GC144
Pin No.
Con.
Pin No.
Con.
Pin No.
Con.
Pin No.
Con.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1
0V
0V
0V
0V
0V
0V
0V
0V
0V
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
0V
0V
V1
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
V1
0V
0V
0V
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
0V
V1
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
0V
V1
0V
V1
V1
0V
0V
0V
V1
0V
0V
V1
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V
V1
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
V1
N/C
0V
N/C
N/C
N/C
N/C
N/C
N/C
N/C
N/C
0V
V1
VDD max = +5.0V = V1
N/C = not connected
Figure 4(b) Life Test/Burn-in connections
NOTE: PDA is 5% and based on groups 1 and 7
ii) WTB1:0 = 00 No shift applied giving a shifter output format:
BIT NUMBER  31  30   29    28    27    26    ...  8      7      6      5      4      3      2      1      0
WEIGHTING   S   2^0  2^-1  2^-2  2^-3  2^-4  ...  2^-22  2^-23  2^-24  2^-25  2^-26  2^-27  2^-28  2^-29  2^-30
The effective weighting of the sign bit is -2^1.
iii) WTB1:0 = 01 Shift complex product one place to the right
giving a shifter output format:
BIT NUMBER  31  30   29   28    27    26    25    24    ...  6      5      4      3      2      1      0
WEIGHTING   S   2^1  2^0  2^-1  2^-2  2^-3  2^-4  2^-5  ...  2^-23  2^-24  2^-25  2^-26  2^-27  2^-28  2^-29
The effective weighting of the sign bit is -2^2.
iv) WTB1:0 = 10 Shift complex product two places to the right
giving a shifter output format:
BIT NUMBER  31  30   29   28   27    26    25    24    ...  6      5      4      3      2      1      0
WEIGHTING   S   2^2  2^1  2^0  2^-1  2^-2  2^-3  2^-4  ...  2^-22  2^-23  2^-24  2^-25  2^-26  2^-27  2^-28
The effective weighting of the sign bit is -2^3.
Overflow
If the left shift option is selected and the Adder/Subtractor
contains a full 32 bit word, then an invalid result will be passed to
the output. An invalid output arising from this combination of
events will be flagged by the SFTA0 flag output. The SFTA0
flag will go high if either the real or imaginary result is invalid.
Output Select
The output from the Shifters is passed to the Output Select
Mux, which is controlled via the OSEL inputs. These inputs
are not registered and hence allow the output combination to
be changed within each cycle. The full complex 64 bit result
from the multiplier may therefore be output within a single
cycle. The OSEL control selects four different output
combinations as summarised in Table 4.
OSEL1   OSEL0   PR    PI
0       0       MSR   MSI
0       1       LSR   LSI
1       0       MSR   LSR
1       1       MSI   LSI
Table 4 - Output Selection
(Where MSR and LSR are the most and least significant 16 bit
words of the Real Shifter output, MSI and LSI are the most and
least significant 16 bit words of the Imaginary Shifter output).
The output select options allow two different modes for
extracting the full 32 bit result from the PDSP16116. The first
mode treats the two 16 bit outputs as real and imaginary ports,
allowing the real and imaginary results to be output in two
halves on the real and imaginary output ports. The second
mode treats the two 16 bit outputs as one 32 bit output and
allows the real and imaginary results to be output as 32 bit
words.
PIN DESCRIPTIONS
XR, XI, YR, YI
Data inputs 16 bits: Data is loaded into the input registers
from these ports on the rising edge of CLK. The data format
is Two's Complement Fractional, where the MSB (sign bit) is
bit 15. In normal mode the weighting of the MSB is -2^0, i.e. -1.
PR, PI
Data outputs 16 bits: Data is clocked into the output
registers and passed to the PR and PI outputs on the rising
edge of CLK. The data format is Two's Complement
Fractional. The field of the internal result selected for output
via PR and PI is controlled by signals OSEL1:0 (see Table 4).
CLK
Common clock to all internal registers.
CEX, CEY
Clock enables for X and Y input ports: When low these
inputs enable the CLK signal to the X or Y input registers,
allowing new data to be clocked into the Multiplier.
CONX, CONY
If either of these inputs is high on the rising edge of CLK,
then the data in the associated input has its imaginary
component inverted (multiplied by -1), see Table 3. CONX
and CONY affect data input on the same clock rising edge.
ROUND
The ROUND control is used to round the most significant
16 bits of the Adder/Subtractor result prior to being passed to
the output register. The rounding operation takes place one
cycle after the ROUND input is taken high. The ROUND input
is not latched and is intended to be tied high or low depending
upon the application.
MBFP
Mode select: When high, Block Floating Point (BFP) mode
is selected. This allows the device to maintain the dynamic
range of the data using a series of word tags. This is especially
useful in FFT applications. When low, the chip operates in
normal mode for more general applications. This pin is
intended to be tied high or low, depending on application.
SOBFP (BFP MODE ONLY)
Start of BFP: This input should be held low for the first cycle
of the first pass of the BFP calculations (see Fig.8). It serves
to reset the internal registers associated with BFP control.
When operating in normal mode this input should be tied low.
EOPSS (BFP MODE ONLY)
End of pass: This input should be held low for the last cycle
of each pass and for the lay time between passes. It instructs
the control logic to update the value of the global weighting
register and prepare the BFP circuitry for the next pass. When
operating in normal mode this input should be tied low.
AR15:13 (BFP MODE ONLY)
Three MSBs of the real part of the A-word: These are used
in the FFT butterfly application to determine the magnitude of
the real part of the A-word and, hence, to determine if there will
be any chance of word growth in the PDSP16318 Complex
Accumulator. When operating in normal mode, these inputs
are not used and may be tied low.
AI15:13 (BFP MODE ONLY)
Three MSBs of the imaginary part of the A-word: used in
the same fashion as AR.
SFTR2:0 (BFP MODE ONLY)
Accumulator result shift control. These pins should be
linked directly to the S2:0 pins on the PDSP16318 Complex
Accumulator. They control the accumulator's barrel shifter
(see Table 5). The purpose of this shift is to minimise sign
extension in the multiplier or accumulator ALUs. When
operating in normal mode, these outputs are superfluous.
SFTR2:0   FUNCTION
000       Reserved
001       Reserved
010       Reserved
011       Shift right by one
100       No shift
101       Shift left by one
110       Shift left by two
111       Reserved
Table 5 - Accumulator Shifts (BFP mode)
GWR4:0 (BFP MODE ONLY)
Contents of the global weighting register: This stores the
weighting of the largest word present with respect to the
weighting of the original input words. Hence, if the contents of
the GWR are 00010, this indicates that the largest word
currently being processed has its binary point two bits to the
right of the original data at the start of the BFP calculations.
The contents of this register are updated at the end of each
pass, according to the largest value of WTOUT occurring
during that pass (i.e. if WTOUT = 11, then GWR will be
increased by 2). The GWR is presented in two's complement
format. These outputs are superfluous in normal mode.
WTOUT1:0 (BFP MODE ONLY)
Word tag output. This tag records the weighting of the
output words from the current cycle relative to the current
global weighting register (see Table 6). It should be stored
along with the A' and B' words as it will form the input word
tags, WTA and WTB, for each complex word during the next
pass. These outputs are superfluous in normal mode.
WTOUT1:0   Weighting of the output relative to
           the current global weighting register
00         One less
01         The same
10         One more
11         Two more
Table 6 - Word Tag Weightings
WTA1:0 (BFP MODE ONLY)
Word tag from the A-word. This word records the
weighting of the A-word relative to the global weighting
register on the previous pass. Although the A-word itself is
not processed in the PDSP16116, this information is required
by the control logic for the radix-2 butterfly FFT application.
These inputs should be tied low in normal mode.
WTB1:0 (BFP & NORMAL MODES)
In BFP mode, this is the word tag from the B-word. This is
operated in the same manner as WTA but for the B-word. The
values of the word tags are used to ensure that the binary
weighting of the A-word and the product of the complex
multiplier are the same at the inputs to the complex
accumulator. Depending on which word is the larger, the
weighting adjustment is performed using either the internal
shifter or an external shifter controlled by SFTA. The word
tags are also used to maintain the weighting of the final result
to within plus two and minus one binary points relative to the
new GWR. (On the first pass all word tags will be ignored).
In normal mode, these inputs perform a different function.
They directly control the internal shifter at the output port as
shown in Table 7.
WTB1:0   FUNCTION
11       Shift complex product one place to the left
00       No shift applied
01       Shift complex product one place to the right
10       Shift complex product two places to the right
Table 7 - Normal Mode Shift Control
SFTA1:0 (BFP & NORMAL MODES)
In BFP mode, these signals act as the A-word shift
control. They allow shifting from one to four places to the right,
see Table 8. Depending on the relative weightings of the
A-words and the complex product, the A-word may have to be
shifted to the right to ensure compatible weightings at the
inputs to the PDSP16318 complex accumulator. (The two
words must have the same weighting if they are to be added).
In normal mode, SFTA0 performs a different
function. If WTB1:0 is set to implement a left shift, then
overflow will occur if the data is fully 32 bits wide. This pin is
used to flag such an overflow. SFTA1 is not used in normal
mode.
SFTA1:0   FUNCTION
00        Shift A-word 1 place to the right
01        Shift A-word 2 places to the right
10        Shift A-word 3 places to the right
11        Shift A-word 4 places to the right
Table 8 - External A-word shift control
OSEL1:0
The outputs from the device are selected by the OSEL0 &
OSEL1 instruction bits. These controls allow selection of the
output combination during the current cycle. (They are not
registered). There are four possible output configurations
that allow either complex outputs of the most or least
significant 16 bit words, or real or imaginary outputs of the full
32 bit word (see Table 4). OSEL0 and OSEL1 should both be tied
low when in BFP mode.
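As an illustration of how the normal mode shift control (Table 7), the SFTA0 overflow flag and the OSEL output selection (Table 4) fit together, the following Python sketch models them on signed 32 bit values. The function names are hypothetical. Two points are assumptions rather than statements from the text: right shifts are taken to be sign extending, which is what the shifted bit weightings imply, and an overflowing left shift is modelled by discarding the top bit, since the text only says that the output is invalid.

```python
# Shift in places selected by WTB1:0 (positive = left, negative = right).
SHIFT_BY_WTB = {0b11: 1, 0b00: 0, 0b01: -1, 0b10: -2}


def shift_result(value, wtb):
    """Apply the normal mode shift to one signed 32 bit adder/subtractor result.

    Returns (shifted_value, overflow). The overflow flag corresponds to the
    condition reported on SFTA0 for this component.
    """
    places = SHIFT_BY_WTB[wtb]
    if places > 0:
        shifted = value << places
        overflow = not -(1 << 31) <= shifted < (1 << 31)
        # Assumption: wrap to 32 bits, as if the top bit were simply lost.
        shifted = ((shifted + (1 << 31)) & 0xFFFFFFFF) - (1 << 31)
        return shifted, overflow
    return value >> -places, False   # Python's >> sign extends


def select_outputs(real32, imag32, osel):
    """Return the 16 bit words presented on (PR, PI) for a given OSEL1:0."""
    msr, lsr = (real32 >> 16) & 0xFFFF, real32 & 0xFFFF
    msi, lsi = (imag32 >> 16) & 0xFFFF, imag32 & 0xFFFF
    return {
        0b00: (msr, msi),   # most significant halves of real and imaginary
        0b01: (lsr, lsi),   # least significant halves
        0b10: (msr, lsr),   # full 32 bit real result
        0b11: (msi, lsi),   # full 32 bit imaginary result
    }[osel]


real, real_ovr = shift_result(0x12345678, 0b00)
imag, imag_ovr = shift_result(-4, 0b10)
sfta0 = real_ovr or imag_ovr        # high if either component is invalid
print(select_outputs(real, imag, 0b10), sfta0)
```

Because OSEL is not registered, a system could read OSEL = 10 and then OSEL = 11 within one cycle to collect both 32 bit words, which is the second output mode the text describes.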
BFP MODE FFT APPLICATION
The PDSP16116 may be used as the main arithmetic unit
of the butterfly processor which will allow the following FFT
benchmarks:
1024 point complex radix-2 transform in 517us
512 point complex radix-2 transform in 235us
256 point complex radix-2 transform in 106us
In addition, with pin MBFP tied high, the BFP circuitry
within the PDSP16116 can be used to adaptively rescale data
throughout the course of the FFT so as to give high-resolution
results.
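The benchmark figures above can be cross-checked with a rough cycle count. The sketch below is an estimate under stated assumptions, not a timing specification: it assumes one butterfly per 100ns cycle (the 10MHz rate quoted for the butterfly solution), N/2 butterflies in each of log2(N) passes, and five additional cycles per pass to clear the pipeline. The function name and parameters are invented for the example.

```python
def fft_time_us(n_points, cycle_ns=100.0, lay_cycles=5):
    """Estimate the radix-2 transform time in microseconds.

    Assumes n_points is a power of two, one butterfly per clock cycle and
    lay_cycles extra cycles at the end of every pass (both assumptions).
    """
    passes = n_points.bit_length() - 1          # log2(N) for a power of two
    cycles = passes * (n_points // 2 + lay_cycles)
    return cycles * cycle_ns / 1000.0


for n in (1024, 512, 256):
    print(n, round(fft_time_us(n)))
```

Under these assumptions the estimate rounds to the quoted 517, 235 and 106 microsecond figures, which suggests the benchmarks include the per-pass pipeline flush.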
The BFP system on the PDSP16116 can be used with any
variation of the Radix-2 Decimation-In-Time FFT - e.g. the
Constant Geometry algorithm, the In-Place algorithm etc. An
N-point Radix-2 DIT FFT is split into log2(N) passes. Each pass
consists of N/2 ‘butterflies’, each performing the operation:
A’ = A + B.W
B’ = A - B.W
Where W is the complex coefficient and A & B are the complex
data.
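For reference, the butterfly equations can be written directly with Python's built-in complex type. This is a floating point illustration of the arithmetic only; it does not model the fixed point word lengths, the block floating point scaling, or the pipeline of the chip set, and the function name is hypothetical.

```python
def butterfly(a, b, w):
    """Radix-2 DIT butterfly: returns (A', B') = (A + B.W, A - B.W)."""
    bw = b * w          # the complex multiply performed by the PDSP16116
    return a + bw, a - bw


a_out, b_out = butterfly(0.5 + 0j, 0.25 + 0j, -1j)
print(a_out, b_out)
```

In the butterfly processor described here, the product B.W comes from the complex multiplier, while the addition and subtraction are carried out by the two complex accumulators.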
Fig.5 illustrates how a single PDSP16116 may be
combined with two PDSP1601’s and two PDSP16318’s to
form a complete BFP butterfly processor. The PDSP16318’s
are used to perform the complex addition and subtraction of
the butterfly operation, while the PDSP1601’s are used to
match the data path of the A-word to the pipelining and shifting
operations within the PDSP16116.
For more information on the theory and construction of this
butterfly processor, refer to application note AN59.
BFP MODE OPERATION
The BFP mode on the PDSP16116 is intended for use in
the FFT application described above, i.e. it is intended to
prevent data degradation during the course of an FFT
calculation. The operation of the PDSP16116 based BFP
butterfly processor (see Fig.5) is described below.
The Block Floating Point System
A block floating point system is essentially an ordinary
integer arithmetic system with some clever logic bolted on.
The object of the extra logic is to lend the system some of the
enormous dynamic range afforded by a true floating point
system without suffering the corresponding loss in
performance.
The initial data used by the FFT should all have the same
binary arithmetic weighting, i.e. the binary point should
occupy the same position in every data word, as is normal in
integer arithmetic. However, during the course of the FFT, a
variety of weightings are used in the data words to increase the
dynamic range available. This situation is similar to that within
a true floating point system, though the range of numbers
representable is more limited. In the BFP system used in the
PDSP16116, there are, within any one pass of the FFT, four
possible positions of the binary point within the integer words.
To record the position of its binary point, each word has a
2-bit word tag associated with it. By way of example, in a
particular pass we may have the following four positions of
binary point available, each denoted by a certain value of word
tag:
XX.XXXXXXXXXXXX     word tag = 00
XXX.XXXXXXXXXXX     word tag = 01
XXXX.XXXXXXXXXX     word tag = 10
XXXXX.XXXXXXXXX     word tag = 11
Figure 5 - FFT Butterfly Processor
At the end of each constituent pass of the FFT, the
positions of the binary point supported may change to reflect
the trend of the data to increase or decrease in magnitude. Hence,
in the pass following that of the above example, the four
positions of binary point supported may change to:
XX.XXXXXXXXXXXX     word tag = 00
XXX.XXXXXXXXXXX     word tag = 01
XXXX.XXXXXXXXXX     word tag = 10
XXXXX.XXXXXXXXX     word tag = 11
This variation in the range of binary points supported from
pass to pass (i.e. the movement of the binary point relative to
its position in the original data) is recorded in the GWR.
Thus we can determine the position of the binary point
relative to its initial position by modifying the value of GWR by
WTOUT for a given word as shown in Table 6.
As an example, if GWR=01001 and WTOUT=10 then the
binary point has moved 10 places to the right of its original
position.
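The calculation in this example can be expressed as a small Python helper. It is a sketch that takes the offsets from Table 6 and treats GWR as a 5 bit two's complement number, as stated in the GWR pin description; the names used are invented for the illustration.

```python
# Offset of a word relative to the GWR, indexed by WTOUT1:0 (Table 6).
WTOUT_OFFSET = {0b00: -1, 0b01: 0, 0b10: +1, 0b11: +2}


def gwr_to_int(gwr):
    """Interpret the 5 bit GWR4:0 field as a two's complement integer."""
    return gwr - 32 if gwr & 0x10 else gwr


def binary_point_shift(gwr, wtout):
    """Places the binary point has moved right of its original position."""
    return gwr_to_int(gwr) + WTOUT_OFFSET[wtout]


print(binary_point_shift(0b01001, 0b10))   # the example in the text
```

For GWR = 01001 and WTOUT = 10 this gives 9 + 1 = 10 places, in agreement with the example.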
The butterfly operation
The butterfly operation is the arithmetic operation which is
repeated many times to produce an FFT. The PDSP16116A
based butterfly processor performs this operation in a low
power, high accuracy chip set.
A' = A + B.W
B' = A - B.W
Figure 6 - Butterfly Operation
A new butterfly operation is commenced each cycle,
requiring a new set of data for A, B, W, WTA and WTB. Five
cycles later, the corresponding results A' and B' are produced
along with their associated WTOUT. In between, the signals
SFTA and SFTR are produced and acted upon by the shifters
in the PDSP1601/A and PDSP16318/A. The timing of the data
and control signals is shown in Fig.7.
The results (A' and B') of each butterfly calculation in a
pass must be stored away to be used later as the input data
(A and B) in the next pass. Each result must be stored together
with its associated word tag, WTOUT. Although WTOUT is
common to both A' and B', it must be stored separately with
each word as the words are used on different cycles during the
next pass. At the inputs, the word tag associated with the A
word is known as WTA and the word tag associated with the
B word is known as WTB. Hence, the WTOUTs from one pass
will become the WTAs and WTBs for the following pass. It
should be noted that the first pass is unique in that word tags
need not be input into the butterfly as all data initially has the
same weighting. Hence, during the first pass alone, the inputs
WTA and WTB are ignored.
Figure 7 - Butterfly Data and Control Signals (timing diagram showing CLK; the inputs Br, Bi, Wr, Wi, WTA, WTB, Ar, Ai; and the outputs SFTA, SFTR, Pr, Pi, DAr, DAi, WTOUT, A'r, A'i, B'r, B'i)
Control of the FFT
To enable the block floating point hardware to keep track
of the data, the following signals are provided:
SOBFP - start of the FFT
EOPSS - end of current pass
These inform the PDSP16116/A when an FFT is starting
and when each pass is complete. Fig.8 shows how these
signals should be used and a commentary is provided below.
To commence the FFT, the signal EOPSS should be set
high (where it will remain for the duration of the pass). SOBFP
should be pulled low during the initial cycle when the first data
words A and B are presented to the inputs of the butterfly
processor. The following cycle SOBFP must be pulled high,
where it should remain for the duration of the FFT. New data
is presented to the processor each successive cycle until the
end of the first pass of the FFT. On the last cycle of the pass,
the signal EOPSS should be pulled low and remain low for a
minimum of five cycles *, the time required to clear the pipeline
of the butterfly processor so that all the results from one pass
are obtained before commencing the following pass. On the
initial cycle of each new pass, the signal EOPSS should be
pulled high and it should remain high until the final cycle of that
pass, when it is pulled low again.
* Should a longer pause be required between passes (to
arrange the data for the next pass, for example) then EOPSS
may be kept low as long as necessary; the next pass cannot
commence until it is brought high again.
[Timing diagram: traces labelled CLK, SOBFP, EOBFP, EOPSS, the inputs A, B, W, WTA, WTB, the outputs A', B', WTOUT, and GWR.
1 = first cycle of data in pass; n = last cycle of data in pass.
Annotations: start of first pass; end of first pass / start of next pass (minimum number of lay cycles shown); the period between other intermediate passes is similar.]
Figure 8 - Use of the BFP Control Signals
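The SOBFP and EOPSS sequencing described in this section can be sketched as a per-cycle schedule. This is a minimal model, not a description of the device: the function name and its parameters are hypothetical, and it assumes that the last data cycle of the pass counts as the first of the low EOPSS cycles.

```python
def bfp_control_schedule(pass_len, first_pass, low_cycles=5):
    """Return a list of (SOBFP, EOPSS) levels, one tuple per clock cycle.

    pass_len   -- number of data cycles in the pass (hypothetical parameter)
    first_pass -- True for the first pass of the FFT
    low_cycles -- cycles for which EOPSS stays low (assumed to include the
                  last data cycle; the text requires a minimum of five)
    """
    if pass_len < 2:
        raise ValueError("this sketch expects at least two data cycles")
    if low_cycles < 5:
        raise ValueError("EOPSS must stay low for at least five cycles")

    schedule = []
    for cycle in range(1, pass_len + 1):
        # SOBFP is low only on the very first data cycle of the FFT.
        sobfp = 0 if (first_pass and cycle == 1) else 1
        # EOPSS is pulled low on the last data cycle of the pass.
        eopss = 0 if cycle == pass_len else 1
        schedule.append((sobfp, eopss))

    # Hold EOPSS low while the pipeline clears; no new data is presented.
    schedule.extend([(1, 0)] * (low_cycles - 1))
    return schedule


for n, (sobfp, eopss) in enumerate(bfp_control_schedule(8, True), start=1):
    print(n, sobfp, eopss)
```

Passing a larger low_cycles value models the longer pause permitted by the footnote, for example while data is rearranged for the next pass.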
FFT Output Normalisation
When an FFT system outputs a series of FFT results for
display, storage or transmission, it is essential that all results
are compatible, i.e. with the binary point in the same position.
However, in order to preserve the dynamic range of the data
in the FFT calculation, the PDSP16116/A employs a range of
different weightings. Therefore, data must be re-formatted at
the end of the FFT to a pre-determined common weighting.
This can be done by comparing the exponent of a given data
word with the pre-determined Universal Exponent and then
shifting the data word by the difference. The PDSP1601/A,
with its multifunction 16 bit barrel shifter, is ideally suited to this
task.
What value should the Universal Exponent take? In
theory, the largest possible data result from an
FFT is N times the largest input data. This means that the
binary point can move a maximum of log2(N) places to the
right. Hence, if we choose the Universal Exponent to be
log2(N) this should give us sufficient range to represent all
data points faithfully.
In practice, data output may never approach the theoretical maximum. Hence, it may be worthwhile to try various
Universal Exponents and choose the one best suited to the
particular application.
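The factor-of-N bound and the resulting log2(N) exponent can be illustrated with a small floating-point experiment. The sketch below uses a direct DFT in place of the butterfly hardware, and the helper names dft and universal_exponent are illustrative only.

```python
import cmath
import math


def dft(x):
    """Direct (slow) DFT, used here only to show worst-case growth."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def universal_exponent(n):
    """log2(N) for a power-of-two transform size N."""
    if n < 1 or n & (n - 1):
        raise ValueError("N must be a power of two")
    return n.bit_length() - 1


N = 8
peak_in = 0.5
spectrum = dft([peak_in] * N)        # constant input: everything lands in bin 0
growth = abs(spectrum[0]) / peak_in  # worst-case growth factor
print(growth, universal_exponent(N))
```

A constant input is the worst case because every sample adds in phase in bin 0; most practical signals grow by less, which is the reason a smaller Universal Exponent may be worth trying.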
Data is output from the butterfly processor with a two-part
exponent: the 5-bit GWR applicable to all data words from a
given FFT and a 2-bit WTOUT associated with each individual
data word. To find the complete exponent for a given word, the
GWR for that FFT must be modified by its WTOUT as shown
in Table 6. The result is the number of places the binary point
has shifted to the right during the course of the FFT.
This value must be compared with the Universal Exponent
to determine the shift required. This is done by subtracting it
from the Universal Exponent. The number of places to be
shifted is equal to the difference between the two exponents.
The shift can be implemented in a PDSP1601/A. The shift
value is fed into the SV port.
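The subtraction and shift can be sketched in a few lines. The sketch assumes that the complete exponent for the word has already been formed from GWR and WTOUT, that the data word is held as a signed Python integer, and that the shift is an arithmetic right shift; the function name normalise is hypothetical.

```python
def normalise(word, exponent, universal_exponent):
    """Shift a signed data word so that it carries the Universal Exponent.

    word               -- signed integer value of the 16-bit data word
    exponent           -- complete exponent of this word (places the binary
                          point has moved to the right during the FFT)
    universal_exponent -- the common exponent chosen for all outputs
    """
    shift = universal_exponent - exponent
    if shift < 0:
        raise ValueError("Universal Exponent is smaller than the word exponent")
    # Python's >> on integers is an arithmetic shift: the sign is preserved.
    return word >> shift


print(normalise(0x4000, 1, 3))   # positive word, shifted right by two places
print(normalise(-0x4000, 1, 3))  # negative word keeps its sign
```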
As FFT data consists of real and imaginary parts, either
two PDSP1601/As must be used (controlled by the same logic)
or a single PDSP1601/A could be used, handling real and
imaginary data on alternate cycles (using the same
instructions for both cycles).
N.B.
It is easier to simply add the word tag to the exponent for
the purpose of determining the shift required, instead of
modifying it according to Table 6. To compensate for this, the
Universal Exponent may be increased by one.
An example of an output normalisation circuit is shown in
Fig.9. Only 4 bit data paths are used in calculating the shift.
This means that we must be able to trap very small
(negative) values of GWR and force a 15-bit right shift in such cases.
[Block diagram labels: WTOUT, GWR, sign bit, 4-BIT ADDER, UNIVERSAL EXPONENT, 4-BIT SUBTRACTOR, constant 1111, 4-BIT MUX, PDSP1601 with SV-PORT, B-PORT (16-BIT DATA), IS (ASRSV) and C-PORT (NORMALISED OUTPUT DATA)]
Fig.9 Output Normalisation Circuitry
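The shift-value calculation can be modelled roughly as follows, using the simplification from the N.B. above (word tag added directly to GWR, Universal Exponent increased by one). This is a behavioural sketch, not a gate-level model: it assumes GWR is supplied as a signed integer, that a negative GWR is the condition that selects the constant 1111, and the function name shift_value is hypothetical.

```python
def shift_value(gwr, wtout, universal_exponent):
    """Return the 4-bit shift value to present to the PDSP1601 SV port.

    gwr                -- global weighting register value (assumed signed)
    wtout              -- 2-bit word tag of this data word (0..3)
    universal_exponent -- chosen exponent, already increased by one to
                          compensate for adding the word tag directly
    """
    if not 0 <= wtout <= 3:
        raise ValueError("WTOUT is a 2-bit value")
    if gwr < 0:
        # Trap negative GWR: force the maximum 15-bit right shift (1111).
        return 0b1111
    shift = universal_exponent - (gwr + wtout)
    if not 0 <= shift <= 15:
        raise ValueError("shift does not fit the 4-bit data path")
    return shift


print(shift_value(3, 1, 11))
print(shift_value(-2, 0, 11))
```

Raising an error for an out-of-range shift is a choice made for this sketch; the source only describes the trap for negative GWR.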
ABSOLUTE MAXIMUM RATINGS (Note 1)
Supply voltage VCC                               -0.5V to 7.0V
Input voltage VIN                                -0.5V to VCC +0.5V
Output voltage VOUT                              -0.5V to VCC +0.5V
Clamp diode current per pin IK (see Note 2)      18mA
Static discharge voltage (HBM)                   500V
Storage temperature range TS                     -65°C to +150°C
Ambient temperature with power applied TAMB
    Military                                     -55°C to +125°C
    Industrial                                   -40°C to +85°C
Junction temperature                             150°C
Package power dissipation                        1000mW
Thermal resistances
    Junction to case θJC                         12°C/W
    Junction to ambient θJA                      29°C/W
NOTES
1. Exceeding these ratings may cause permanent damage.
Functional operation under these conditions is not implied.
2. Maximum dissipation or 1 second should not be exceeded, only
one output to be tested at any one time.
3. Exposure to absolute maximum ratings for extended periods may
affect device reliability.
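As a rough guide to using the thermal figures above, the junction temperature can be estimated with the usual first-order model Tj = TAMB + θJA x P. The sketch below assumes that model applies, takes θJA as 29°C/W and the 150°C junction limit from the ratings, and uses a purely hypothetical dissipation of 0.5W.

```python
THETA_JA = 29.0      # °C/W, θJA figure from the ratings above
TJ_MAX = 150.0       # °C, maximum junction temperature from the ratings


def junction_temperature(t_amb, power_w, theta_ja=THETA_JA):
    """First-order estimate: Tj = Tamb + theta_JA * P."""
    return t_amb + theta_ja * power_w


# Hypothetical operating point: 0.5W dissipated at 85°C ambient.
tj = junction_temperature(85.0, 0.5)
print(tj, tj <= TJ_MAX)
```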
ELECTRICAL CHARACTERISTICS
Operating conditions (unless otherwise stated):
Industrial: TAMB = -40°C to +85°C, VCC = 5.0V ± 10%, GND = 0V
Military: TAMB = -55°C to +125°C, VCC = 5.0V ± 10%, GND = 0V
Static Characteristics
Characteristic           Symbol   Min.   Typ.   Max.   Units   Conditions
Output high voltage      VOH      2.4    -      -      V       IOH = 8mA
Output low voltage       VOL      -      -      0.4    V       IOL = -8mA
Input high voltage       VIH      3.0    -      -      V       CLK input only
Input high voltage       VIH      2.2    -      -      V       All other inputs
Input low voltage        VIL      -      -      0.8    V
Input leakage current    IIN      -10    -      +10    µA      GND < VIN < VCC
Input capacitance        CIN      -      10     -      pF
Output leakage current   IOZ      -50    -      +50    µA      GND < VIN < VCC
Output S/C current       IOS      10     -      300    mA      VCC = Max
Switching Characteristics
PDSP16116 PDSP16116A
Characteristic
CLK rising edge to P-PORTS
CLK rising edge to WTOUT1:0
CLK rising edge to GWR4:0
CLK rising edge to SFTA1:0
CLK rising edge to SFTR2:0
Setup CEX or CEY to CLK rising edge
Hold CEX or CEY to CLK rising edge
Setup X or Y port inputs to CLK rising edge
Hold X or Y port inputs to CLK rising edge
Setup WTA1:0, WTB1:0, SOBFP or EOPSS inputs
to CLK rising edge
Hold WTA1:0, WTB1:0, SOBFP or EOPSS inputs to
CLK rising edge
Setup CONX or CONY inputs to CLK rising edge
Hold CONX or CONY inputs to CLK rising edge
Setup AR15:13 or AI15:13 to CLK rising edge
Hold AR15:13 or AI15:13 to CLK rising edge
OPSEL to valid P-PORTS
OER or OEI rising PR-PORT or PI-PORT high to Z
OER or OEI rising PR-PORT or PI-PORT low to Z
OER or OEI falling PR-PORT or PI-PORT Z to high
OER or OEI falling PR-PORT or PI-PORT Z to low
Clock period
Clock high time
Clock low time
Vcc Current (CMOS input levels)
Vcc Current (TTL input levels)
Units
Min.
Max.
Min.
Max.
5
5
5
5
5
11
11
14
45
30
30
60
50
0
2
-
5
5
5
5
5
8
8
8
23
20
20
30
28
0
0
-
ns
ns
ns
ns
ns
ns
ns
ns
ns
ns
-
0
-
0
ns
14
14
100
30
20
-
0
0
35
35
45
22
24
60
100
8
50
12
12
-
0
0
20
25
25
18
18
80
130
ns
ns
ns
ns
ns
ns
ns
ns
ns
ns
ns
ns
mA
mA
Conditions
2 x LSTTL + 20pF
2 x LSTTL + 20pF
2 x LSTTL + 20pF
2 x LSTTL + 20pF
2 x LSTTL + 20pF
2 x LSTTL + 20pF
see Fig.10
see Fig.10
see Fig.10
see Fig.10
see Note 4
see Note 4
NOTE 4 :- VCC = Max Outputs unloaded, clock freq = Max
[Diagram: the DUT output drives a 30pF load and a 1.5kΩ resistor returned to a test voltage VT (VT = 0V or VT = Vcc, depending on the test).
Tests shown: delay from output high to output high impedance; delay from output low to output high impedance; delay from output high impedance to output high; delay from output high impedance to output low.
Waveform measurement levels of 0.5V and 1.5V are marked.]
Fig.10 Three state delay measurement load
VH - Voltage reached when output driven high
VL - Voltage reached when output driven low
Zarlink, ZL and the Zarlink Semiconductor logo are trademarks of Zarlink Semiconductor Inc.
Copyright Zarlink Semiconductor Inc. All Rights Reserved.