AN9418: Complex Filtering with the HSP43168 Dual FIR Filter

Application Note
April 1998
AN9418.1

How to Use HSP43168 to Implement Complex Filtering
The architecture of the HSP43168 allows for filtering of
complex inputs. In the complex case, the filtering operation
produces a Real (R) and an Imaginary (I) output component.
The complex filter outputs are governed by the following
equations:

$$Y_R(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_R(j) - X_I(j)\,C_I(j) \right]$$

and:

$$Y_I(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_I(j) + X_I(j)\,C_R(j) \right]$$

Where: YR = Real Output Component
YI = Imaginary Output Component
XR, XI = Real and Imaginary Input Components
CR = Real Coefficients
CI = Imaginary Coefficients

Using a single HSP43168 dual FIR Filter one can implement
a 4-tap complex filter with the output rate running at the full
input rate. The HSP43168 architecture includes two
independent FIR filters that can be configured to operate in
various modes. For this example the two filters within the
HSP43168 are configured to operate as two separate filters,
FIR A and FIR B. FIR A calculates the Real Output YR(n),
while FIR B calculates the Imaginary Output YI(n).

Figure 1 illustrates a top level Block Diagram for the complex
filtering operations of the HSP43168. Each of the two filters,
FIR A and FIR B, must be programmed to decimate by 2.
This means that the real and imaginary outputs are
calculated every 2 clocks and then loaded into the holding
registers. The contents of these registers are then
multiplexed and clocked out at the full input rate.

[Figure 1 block diagram: the interleaved input samples
XI, XR, XI, XR feed FIR A (XRCR - XICI) and FIR B
(XRCI + XICR) inside the HSP43168; a MUX produces the
interleaved output YI, YR, YI, YR.]
FIGURE 1. HSP43168 SINGLE CHIP CONFIGURATION TO
PERFORM COMPLEX FILTERING

Figure 2 illustrates in more detail the internal operations of
the HSP43168 as it calculates YR and YI.

The computational flow for FIR A is:

Clock 1: XR(0)·CR(3)+XR(1)·CR(2)+XR(2)·CR(1)+XR(3)·CR(0)
Clock 2: XR(0)·CR(3)+XR(1)·CR(2)+XR(2)·CR(1)+XR(3)·CR(0)
         -XI(0)·CI(3)-XI(1)·CI(2)-XI(2)·CI(1)-XI(3)·CI(0)

Similarly, the computational flow for FIR B is:

Clock 1: XR(0)·CI(3)+XR(1)·CI(2)+XR(2)·CI(1)+XR(3)·CI(0)
Clock 2: XR(0)·CI(3)+XR(1)·CI(2)+XR(2)·CI(1)+XR(3)·CI(0)
         +XI(0)·CR(3)+XI(1)·CR(2)+XI(2)·CR(1)+XI(3)·CR(0)

After Clock 2, both YR and YI are valid and ready to be
multiplexed as outputs. Note on Figure 2 that in the decimate
by 2 mode, there are two decimation registers between each
multiplier. This ensures that either all R or all I input samples
are aligned at the multipliers on alternate clocks. Also note
that a different coefficient set is used on alternate clocks.
Real coefficients and imaginary coefficients are alternated
on every clock as appropriate for each of the two filters to
calculate the desired results.
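
As a numerical check of the two-clock flow described in this
section, the following Python sketch accumulates the products for
one coefficient set on the first clock and for the other set on
the second clock, then compares the result with a direct complex
multiply-accumulate. The function name, the sample values, and the
pairing of sample j with coefficient j (taken from the equations,
not from the register order) are illustrative assumptions; word
widths and pipeline delays of the device are not modeled.

```python
def complex_fir_two_clock(xr, xi, cr, ci):
    """Form (YR, YI) for one output using two accumulate passes per filter.

    Implements YR = sum(XR*CR - XI*CI) and YI = sum(XR*CI + XI*CR).
    Sample j is paired with coefficient j (an assumption of this sketch).
    """
    n = len(cr)
    # FIR A, clock 1: real samples against the real coefficients.
    acc_a = sum(xr[j] * cr[j] for j in range(n))
    # FIR A, clock 2: imaginary samples against negated imaginary coefficients.
    acc_a += sum(xi[j] * -ci[j] for j in range(n))
    # FIR B, clock 1: real samples against the imaginary coefficients.
    acc_b = sum(xr[j] * ci[j] for j in range(n))
    # FIR B, clock 2: imaginary samples against the real coefficients.
    acc_b += sum(xi[j] * cr[j] for j in range(n))
    return acc_a, acc_b


xr, xi = [1, 2, 3, 4], [5, 6, 7, 8]      # hypothetical input samples
cr, ci = [1, 0, -1, 2], [2, 1, 0, -1]    # hypothetical coefficients
yr, yi = complex_fir_two_clock(xr, xi, cr, ci)

# Reference: the same sum written as a complex multiply-accumulate.
reference = sum(complex(xr[j], xi[j]) * complex(cr[j], ci[j]) for j in range(4))
print(yr, yi, reference == complex(yr, yi))  # -2 14 True
```

Storing the negated imaginary coefficients as the second set for
FIR A, as Figure 2 indicates, lets the accumulator only ever add.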
1-888-INTERSIL or 321-724-7143 | Intersil and Design is a trademark of Intersil Corporation. | Copyright © Intersil Corporation 2000
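[Figure 2 data-flow diagram: the interleaved input X(R, I)
shifts through the sample registers R0-R3 and I0-I3 of both
filters. FIR A applies coefficients CR0-CR3 and -CI0 to -CI3
and accumulates (ACC) YR; FIR B applies coefficients CI0-CI3
and CR0-CR3 and accumulates YI. A MUX interleaves the two
results onto the output Y(R, I).]
FIGURE 2. DATA FLOW WITHIN HSP43168 CONFIGURED AS A COMPLEX FILTER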
Combining Multiple HSP43168 Filters For Extended Number of Taps and Complex Filtering
Many applications require more than 4 taps to achieve the
filtering requirements of the system. Multiple HSP43168s
can be combined to meet these requirements. One possible
architecture that implements complex filtering for an
extended number of taps is shown on Figure 3. This example
illustrates the implementation of a 16-tap complex filter
using the HSP43168 as the core filtering engine. This
example also assumes that the desired output rate of the
filter is equal to the input rate of the data. The example can
be expanded to accommodate more taps and/or various input
and output data rates. The maximum number of filters that
can be combined under this architecture is limited by the
maximum decimation factor of the HSP43168. The maximum
throughput is set by the maximum data rate at which a single
HSP43168 can operate.
As shown on Figure 3, eight HSP43168 devices are required
for this 16-tap implementation. The architecture is
partitioned into two processing groups, with one group of 4
devices calculating the real output component and the second
group of 4 devices calculating the imaginary output component
of the complex result.
The two independent FIR filters that are integrated in each of
the 8 HSP43168 devices are configured to operate as
separate filters. Each FIRA processes the real input
samples XR while each FIRB processes the imaginary input
samples XI. In addition, each of the individual filters is set
in a decimate by four mode. In essence, this decimation
factor increases the number of taps from four to sixteen for
each of the individual FIR operations.
Decimation causes each of the filters to have an output rate
that is one quarter of the input rate (decimation by 4).
For this example the input data rate is 45MHz and the
decimated output rate of each filter is 11.25MHz.
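
The arithmetic behind these figures can be written out as a short
Python sketch. The helper name is hypothetical; the value of four
multipliers per FIR is taken from the description of the register
structure in this note, and the relations used are only those
stated in the text (taps scale with the decimation factor, output
rate is the input rate divided by it).

```python
def decimated_config(input_rate_mhz, multipliers, decimation):
    """Return (effective taps, per-filter output rate in MHz)."""
    taps = multipliers * decimation                # 4 multipliers x decimate by 4
    output_rate_mhz = input_rate_mhz / decimation  # each filter outputs every Nth clock
    return taps, output_rate_mhz


taps, rate_mhz = decimated_config(45.0, multipliers=4, decimation=4)
print(taps, rate_mhz)  # 16 11.25

# Restoring the full input rate needs one filter per decimated phase.
filters_per_group = 4
print(filters_per_group * rate_mhz)  # 45.0
```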
To clarify the signal processing in this architecture, the
calculation of the real output component is described in
some detail. The hardware processing for the imaginary
output component is equivalent.
The combined output for the group of the four filters, that
calculates the real output component, runs at the aggregate
rate of its 4 filters, which is the 45MHz input rate (11.25 x 4).
This implies that the output MUX selects one of the four
individual filters at every 45MHz clock, rotating sequentially
through the output of each of the four filters. Every filter
calculates the sum of products that defines the real output
component which is defined by the following equation:
$$Y_R(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_R(j) - X_I(j)\,C_I(j) \right]$$

where: YR = Real Output Component
XR, XI = Real and Imaginary Input Components
CR = Real Coefficients
CI = Imaginary Coefficients
and N = 16, representing the 16 filter taps required for this
example.
Since each of the four filter outputs is selected sequentially
every fourth consecutive clock, all of the input data samples
are being filtered within the filter combination. The four filters
are programmed to use all 16 coefficients in the decimate by
four mode.
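
The round-robin selection described above can be illustrated with
a small Python sketch. It models four filters that each emit one
16-tap result every fourth clock, staggered by one clock, and an
output MUX that rotates through them; the muxed stream is then
compared with a single filter running at the full rate. The
function names, the real-only data, and the conventional
convolution indexing (newest sample times first coefficient) are
assumptions made for the illustration, not the device's register
order.

```python
def fir_output(x, c, n):
    """Sum of products for the output at clock n (samples before index 0 are zero)."""
    return sum(c[k] * x[n - k] for k in range(len(c)) if n - k >= 0)


def staggered_bank(x, c, num_filters=4):
    """Mux the outputs of num_filters decimating filters staggered by one clock."""
    per_filter = []
    for p in range(num_filters):
        # Filter p produces an output only on clocks p, p + 4, p + 8, ...
        per_filter.append(
            {n: fir_output(x, c, n) for n in range(p, len(x), num_filters)}
        )
    # Output MUX: on clock n, select filter n % num_filters.
    return [per_filter[n % num_filters][n] for n in range(len(x))]


x = list(range(1, 25))   # hypothetical input samples 1..24
c = [1] * 16             # hypothetical 16-tap coefficient set
muxed = staggered_bank(x, c)
full_rate = [fir_output(x, c, n) for n in range(len(x))]
print(muxed == full_rate)  # True
```

Because each filter skips three of every four output instants, no
single device sees a higher output rate than 11.25MHz, yet every
input sample contributes to some output.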
Figure 4 illustrates the data flow and register structure within
a single HSP43168 device. The snapshot shows 16 real and
16 imaginary input samples loaded in the registers. Note that
in the decimate by 4 mode there are 4 registers between
each of the 4 multipliers for FIRA and FIRB.
The sum of the 16 products required for each output sample
is calculated over four clocks by shifting four new samples
and their corresponding coefficients at the inputs of the four
multipliers as shown on the Diagram of Figure 4. The results
of FIRA and FIRB are accumulated individually and they are
finally combined every fourth clock to provide the desired
output sample. The computational flow for FIRA over the 4
clock periods is shown below as an example for the
calculation of the sum of products processing. The example
illustrates the results in the accumulator for each of the
clocks.
Clock 1: R(0) C(3) + R(4) C(2) + R(8) C(1) + R(12) C(0)
Clock 2: R(0) C(3) + R(4) C(2) + R(8) C(1) + R(12) C(0) +
         R(1) C(7) + R(5) C(6) + R(9) C(5) + R(13) C(4)
Clock 3: R(0) C(3) + R(4) C(2) + R(8) C(1) + R(12) C(0) +
         R(1) C(7) + R(5) C(6) + R(9) C(5) + R(13) C(4) +
         R(2) C(11) + R(6) C(10) + R(10) C(9) + R(14) C(8)
Clock 4: R(0) C(3) + R(4) C(2) + R(8) C(1) + R(12) C(0) +
         R(1) C(7) + R(5) C(6) + R(9) C(5) + R(13) C(4) +
         R(2) C(11) + R(6) C(10) + R(10) C(9) + R(14) C(8) +
         R(3) C(15) + R(7) C(14) + R(11) C(13) + R(15) C(12)
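
The index pattern in this listing can be generated and checked
with a short Python sketch. The helper names and test data are
hypothetical; the (sample, coefficient) pairing per clock is
copied from the listing, and the sketch simply confirms that four
clocks of four products use each of the 16 samples and each of the
16 coefficients exactly once.

```python
def clock_pairs(clock, multipliers=4, decimation=4):
    """(sample index, coefficient index) pairs for a 1-based clock number."""
    k = clock - 1
    return [
        (k + decimation * m, multipliers * k + (multipliers - 1 - m))
        for m in range(multipliers)
    ]


def accumulate(r, c, clocks=4):
    """Accumulator contents after each clock."""
    acc, history = 0, []
    for clock in range(1, clocks + 1):
        acc += sum(r[i] * c[j] for i, j in clock_pairs(clock))
        history.append(acc)
    return history


print(clock_pairs(1))  # [(0, 3), (4, 2), (8, 1), (12, 0)]
print(clock_pairs(2))  # [(1, 7), (5, 6), (9, 5), (13, 4)]

r = list(range(16))    # hypothetical samples R(0)..R(15)
c = [1] * 16           # hypothetical coefficients C(0)..C(15)
history = accumulate(r, c)
print(history)         # [24, 52, 84, 120]

all_pairs = [p for clock in range(1, 5) for p in clock_pairs(clock)]
used_samples = sorted(i for i, _ in all_pairs)
used_coeffs = sorted(j for _, j in all_pairs)
```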
Therefore, FIRA calculates one of the two partial sums of
products needed for the real component of the complex
output. This partial sum of products is:

$$\sum_{j=0}^{N-1} X_R(j)\,C_R(j)$$

In a similar fashion, FIRB calculates the second partial sum
of the overall real output sample, which is:

$$\sum_{j=0}^{N-1} X_I(j)\,C_I(j)$$

The results of the two filters are finally combined as shown
on Figure 4 in order to produce the desired output sample.
Note that there are 4 filters in the group running in parallel
but with their outputs staggered by one clock. This
architecture assures the processing of all input samples.
The implementation for the calculation of the imaginary
component of the complex output is identical to that of the
real and is performed by the lower group of the other 4
HSP43168s as shown on Figure 3. This second group of
filters calculates the imaginary component equation:
$$Y_I(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_I(j) + X_I(j)\,C_R(j) \right]$$

Where: YI = Imaginary Output Samples
XR, XI = Real and Imaginary Input Samples
CR = Real Coefficients
CI = Imaginary Coefficients
The Timing Diagram that describes the relationship between
data, control signals and clocks when a single HSP43168
operates in its decimating mode is shown on Figure 5. The
Timing Diagrams on Figure 5 are examples for the decimate
by 2 and decimate by 4 cases. Timing of the device for
higher decimation factors can be readily derived from these
two examples. For more details on signal descriptions, part
functionality, and operation, refer to the
Intersil DSP Data Book.
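
As a small illustration of how the two cases in Figure 5 relate,
the CSEL0 - 2 values shown there repeat with a period equal to the
decimation factor. The Python sketch below reproduces those two
sequences. The function name is hypothetical, the sketch assumes
CSEL selects the coefficient set used on each clock, and extending
the same pattern to higher decimation factors is an extrapolation
that should be verified against the data sheet. ACCEN and TXFR
timing is not modeled.

```python
def csel_sequence(decimation, clocks):
    """CSEL value on each clock: counts 0 .. decimation - 1, then repeats."""
    return [n % decimation for n in range(clocks)]


print(csel_sequence(2, 9))  # [0, 1, 0, 1, 0, 1, 0, 1, 0]
print(csel_sequence(4, 9))  # [0, 1, 2, 3, 0, 1, 2, 3, 0]
```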
The examples described in this Application Note provide the
core architectural and signal processing details that can be
followed to implement other complex filters with different
length and/or data rate requirements.
[Figure 3 block diagram: the inputs XR and XI, sampled at
FS = 45MHz, drive the AIN and BIN inputs of eight HSP43168
devices (see note †). Each device produces an output every 4
clocks (11.25MHz). One 45MHz MUX combines four device
outputs into YR(n) and a second 45MHz MUX combines the
other four into YI(n), so both outputs run at the full input
rate:

$$Y_R(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_R(j) - X_I(j)\,C_I(j) \right]$$

$$Y_I(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_I(j) + X_I(j)\,C_R(j) \right]$$

† Each HSP43168 is configured to use separate FIR A and
FIR B at a decimation of 4. This gives an equivalent of
16 taps per filter.]
FIGURE 3. 16-TAP COMPLEX FILTER AT 45MHz
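[Figure 4 data-flow diagram (YR example): real samples
R0-R15 from XR shift through the FIR A registers in four
groups of four, each group feeding a multiplier with real
coefficients CR; imaginary samples I0-I15 from XI do the
same in FIR B with imaginary coefficients CI. Each filter
outputs every 4 clocks (FSOUT = 11.25MHz). The FIR B sum
is subtracted from the FIR A sum to form YR. Four
HSP43168s are needed to implement one output component
and four to implement the other (labeled Yi and Yq in the
figure; see Figure 3).

$$Y_R = \sum_{j=0}^{15} \left[ x_R(j)\,C_R(j) - x_I(j)\,C_I(j) \right]$$

$$Y_I = \sum_{j=0}^{15} \left[ x_R(j)\,C_I(j) + x_I(j)\,C_R(j) \right]$$
]
FIGURE 4. INTERNAL OPERATIONS PER FILTER (YR EXAMPLE)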
Timing Diagram
[Figure 5 timing diagrams, two panels. Both show CLK, the
input data INA0 - 9 carrying samples D(N+0) through D(N+8)
on successive clocks, CSEL0 - 2, ACCEN, TXFR, and the
output OUT 9 - 27 delivering Y(N+0). In the first panel
CSEL0 - 2 alternates 0, 1, 0, 1, ...; in the panel labeled
DECIMATE BY 4 it cycles 0, 1, 2, 3, 0, 1, 2, 3, ....
FWRD, RVRS, SHFTEN should be tied low.]
FIGURE 5. TIMING DIAGRAM OF DECIMATING MODES
All Intersil semiconductor products are manufactured, assembled and tested under ISO9000 quality systems certification.
Intersil semiconductor products are sold by description only. Intersil Corporation reserves the right to make changes in circuit design and/or specifications at any time without notice. Accordingly, the reader is cautioned to verify that data sheets are current before placing orders. Information furnished by Intersil is believed to be accurate and
reliable. However, no responsibility is assumed by Intersil or its subsidiaries for its use; nor for any infringements of patents or other rights of third parties which may result
from its use. No license is granted by implication or otherwise under any patent or patent rights of Intersil or its subsidiaries.
For information regarding Intersil Corporation and its products, see web site www.intersil.com