Complex Filtering with the HSP43168 Dual FIR Filter
Application Note    April 1998    AN9418.1

How to Use HSP43168 to Implement Complex Filtering

The architecture of the HSP43168 allows for filtering of complex inputs. In the complex case, the filtering operation produces a Real (R) and an Imaginary (I) output component. The complex filter outputs are governed by the following equations:

$$Y_R(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_R(j) - X_I(j)\,C_I(j) \right]$$

and:

$$Y_I(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_I(j) + X_I(j)\,C_R(j) \right]$$

Where:
YR = Real Output Component
YI = Imaginary Output Component
XR, XI = R and I Input Components
CR = Real Coefficients
CI = Imaginary Coefficients

Using a single HSP43168 dual FIR filter, one can implement a 4-tap complex filter with the output rate running at the full input rate. The HSP43168 architecture includes two independent FIR filters that can be configured to operate in various modes. For this example the two filters within the HSP43168 are configured to operate as two separate filters, FIR A and FIR B. FIR A calculates the Real Output YR(n), while FIR B calculates the Imaginary Output YI(n).

[Figure 1 block diagram: the interleaved input stream XI, XR, XI, XR drives FIR A (XRCR - XICI) and FIR B (XRCI + XICR) inside the HSP43168; a MUX interleaves the two results into the output stream YI, YR, YI, YR.]
FIGURE 1. HSP43168 SINGLE CHIP CONFIGURATION TO PERFORM COMPLEX FILTERING

Figure 2 illustrates in more detail the internal operations of the HSP43168 as it calculates YR and YI. The computational flow for FIR A is:

Clock 1: XR(0)·CR(3) + XR(1)·CR(2) + XR(2)·CR(1) + XR(3)·CR(0)
Clock 2: XR(0)·CR(3) + XR(1)·CR(2) + XR(2)·CR(1) + XR(3)·CR(0)
         - XI(0)·CI(3) - XI(1)·CI(2) - XI(2)·CI(1) - XI(3)·CI(0)

Similarly, the computational flow for FIR B is:

Clock 1: XR(0)·CI(3) + XR(1)·CI(2) + XR(2)·CI(1) + XR(3)·CI(0)
Clock 2: XR(0)·CI(3) + XR(1)·CI(2) + XR(2)·CI(1) + XR(3)·CI(0)
         + XI(0)·CR(3) + XI(1)·CR(2) + XI(2)·CR(1) + XI(3)·CR(0)

After Clock 2, both YR and YI are valid and ready to be multiplexed as outputs. Note in Figure 2 that in the decimate by 2 mode, there are two decimation registers between each multiplier.
This ensures that either all R or all I input samples are aligned at the multipliers on alternate clocks. Also note that a different coefficient set is used on alternate clocks: real and imaginary coefficients alternate on every clock, as appropriate for each of the two filters, to calculate the desired results.

Figure 1 illustrates a top level block diagram for the complex filtering operations of the HSP43168. Each of the two filters, FIR A and FIR B, must be programmed to decimate by 2. This implies that every 2 clocks the real and imaginary outputs are calculated and then loaded into the holding registers. The contents of these registers are then multiplexed and clocked out at the full input rate.

1-888-INTERSIL or 321-724-7143 | Intersil and Design is a trademark of Intersil Corporation. | Copyright © Intersil Corporation 2000

[Figure 2 data flow: the interleaved input X(R, I) shifts through the registers I3 R3 I2 R2 I1 R1 I0 R0 of each filter. FIR A alternates between the coefficient sets CR0 to CR3 and -CI0 to -CI3 and accumulates (ACC) to form YR. FIR B alternates between CI0 to CI3 and CR0 to CR3 and accumulates to form YI. A MUX combines the two results into the output Y(R, I).]
FIGURE 2. DATA FLOW WITHIN HSP43168 CONFIGURED AS A COMPLEX FILTER

Combining Multiple HSP43168 Filters For Extended Number of Taps and Complex Filtering

Many applications require more than 4 taps to meet the filtering requirements of the system. Multiple HSP43168s can be combined to meet these requirements. One possible architecture that implements complex filtering for an extended number of taps is shown in Figure 3. This example illustrates the implementation of a 16-tap complex filter using the HSP43168 as the core filtering engine. It also assumes that the desired output rate of the filter is equal to the input rate of the data. The example can be expanded to accommodate more taps and/or various input and output data rates. The maximum number of filters that can be combined under this architecture is limited by the maximum decimation factor of the HSP43168.
The maximum throughput is set by the maximum data rate at which a single HSP43168 can operate.

As shown in Figure 3, eight HSP43168 devices are required for this 16-tap implementation. The architecture is partitioned into two processing groups, with one group of 4 devices calculating the real output component and the second group of 4 devices calculating the imaginary output component of the complex result. The two independent FIR filters integrated in each of the 8 HSP43168 devices are configured to operate as separate filters. Each FIR A processes the real input samples XR, while each FIR B processes the imaginary input samples XI. In addition, each of the individual filters is set in a decimate by four mode. In essence, this decimation factor increases the number of taps from four to sixteen for each of the individual FIR operations. Decimation by 4 causes each of the filters to have an output rate that is one quarter of the input rate. For this example the input data rate is 45MHz and the decimated output rate of each filter is 11.25MHz.

To better understand the signal processing throughout this architecture, the calculation of the real output component is described in some detail. The hardware processing for the calculation of the imaginary output component is equivalent. The combined output of the group of four devices that calculates the real output component runs at the aggregate rate of its 4 filters, which is the 45MHz input rate (11.25MHz × 4). This implies that the output MUX selects one of the four individual filters at every 45MHz clock, rotating sequentially through the output of each of the four filters.
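The rotation described above can be checked with a small numerical model. The following Python sketch is illustrative only: it is not HSP43168 configuration code, and the function names (`fir_output`, `decimating_filter`, `muxed_bank`) are hypothetical. It assumes a real-valued signal for brevity, the standard convolution pairing of coefficient k with sample n-k, and zero-valued samples before the start of the data. Each modeled device produces a 16-tap output only on every fourth clock, the four devices are offset from one another by one clock, and the MUX picks the device whose output is due on each clock.

```python
DECIMATION = 4          # each device produces one output every 4 input clocks
TAPS = 16               # equivalent taps per device in decimate-by-4 mode
INPUT_RATE_MHZ = 45.0   # input sample rate from the example in the text


def fir_output(x, c, n):
    """Sum of products for the output due at clock n (zero history assumed)."""
    return sum(c[k] * x[n - k] for k in range(len(c)) if n - k >= 0)


def decimating_filter(x, c, factor, phase):
    """Model one device: an output only on clocks phase, phase + factor, ..."""
    return {n: fir_output(x, c, n) for n in range(phase, len(x), factor)}


def muxed_bank(x, c, factor):
    """Rotate through `factor` staggered devices, one output per input clock."""
    bank = [decimating_filter(x, c, factor, phase) for phase in range(factor)]
    return [bank[n % factor][n] for n in range(len(x))]


if __name__ == "__main__":
    x = [((7 * n) % 11) - 5 for n in range(40)]   # arbitrary test samples
    c = [k - 8 for k in range(TAPS)]              # arbitrary test coefficients
    full_rate = [fir_output(x, c, n) for n in range(len(x))]
    print("per-device output rate (MHz):", INPUT_RATE_MHZ / DECIMATION)
    print("muxed bank matches full-rate filter:",
          muxed_bank(x, c, DECIMATION) == full_rate)
```

Because each modeled device only has to finish one sum of products every four clocks, it has four clocks in which to work through its 16 taps; the one-clock stagger between devices is what restores one output per input clock at the MUX.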
Every filter calculates the sum of products that defines the real output component, which is given by the following equation:

$$Y_R(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_R(j) - X_I(j)\,C_I(j) \right]$$

where:
YR = Real Output Component
XR, XI = R and I Input Components
CR = Real Coefficients
CI = Imaginary Coefficients
and N = 16, representing the 16 filter taps required for this example.

Since each of the four filter outputs is selected sequentially every fourth consecutive clock, all of the input data samples are filtered within the filter combination. The four filters are programmed to use all 16 coefficients in the decimate by four mode.

Figure 4 illustrates the data flow and register structure within a single HSP43168 device. The snapshot shows 16 real and 16 imaginary input samples loaded in the registers. Note that in the decimate by 4 mode there are 4 registers between each of the 4 multipliers for FIR A and FIR B. The sum of the 16 products required for each output sample is calculated over four clocks by shifting four new samples and their corresponding coefficients to the inputs of the four multipliers, as shown in Figure 4. The results of FIR A and FIR B are accumulated individually and are finally combined every fourth clock to provide the desired output sample.

The computational flow for FIR A over the 4 clock periods is shown below as an example of the sum of products processing. The example shows the accumulator contents after each clock.
Clock 1: R(0)C(3) + R(4)C(2) + R(8)C(1) + R(12)C(0)

Clock 2: R(0)C(3) + R(4)C(2) + R(8)C(1) + R(12)C(0)
       + R(1)C(7) + R(5)C(6) + R(9)C(5) + R(13)C(4)

Clock 3: R(0)C(3) + R(4)C(2) + R(8)C(1) + R(12)C(0)
       + R(1)C(7) + R(5)C(6) + R(9)C(5) + R(13)C(4)
       + R(2)C(11) + R(6)C(10) + R(10)C(9) + R(14)C(8)

Clock 4: R(0)C(3) + R(4)C(2) + R(8)C(1) + R(12)C(0)
       + R(1)C(7) + R(5)C(6) + R(9)C(5) + R(13)C(4)
       + R(2)C(11) + R(6)C(10) + R(10)C(9) + R(14)C(8)
       + R(3)C(15) + R(7)C(14) + R(11)C(13) + R(15)C(12)

Therefore, FIR A calculates one of the two partial sums of products needed for the real output component. This partial sum of products is:

$$\sum_{j=0}^{N-1} X_R(j)\,C_R(j)$$

In a similar fashion, FIR B calculates the second partial sum of the overall real output sample, which is:

$$\sum_{j=0}^{N-1} X_I(j)\,C_I(j)$$

The results of the two filters are finally combined as shown in Figure 4 to produce the desired output sample; consistent with the YR equation, the FIR B partial sum is subtracted from the FIR A partial sum. Note that there are 4 filters in the group running in parallel but with their outputs staggered by one clock. This architecture ensures that all input samples are processed.

The calculation of the imaginary component of the complex output is implemented in the same way as that of the real component and is performed by the lower group of the other 4 HSP43168s, as shown in Figure 3. This second group of filters calculates the imaginary component equation:

$$Y_I(n) = \sum_{j=0}^{N-1} \left[ X_R(j)\,C_I(j) + X_I(j)\,C_R(j) \right]$$

Where:
YI = Imaginary Output Samples
XR, XI = R and I Input Samples
CR = Real Coefficients
CI = Imaginary Coefficients

The timing diagram that describes the relationship between data, control signals and clocks when operating a single HSP43168 in its decimating mode is shown in Figure 5. The timing diagrams in Figure 5 are examples for the decimate by 2 and decimate by 4 cases. Timing of the device for higher decimation factors can be readily derived from these two examples.
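As a numerical check on the partial sums and the four-clock accumulation described earlier in this section, the following Python sketch forms the four real sums of products and combines them into YR and YI. It is a simplified model under stated assumptions, not a description of the device internals: the function names are hypothetical, sample j is paired with coefficient j exactly as the equations are written, and the products are grouped four per clock in index order, which does not reproduce the register ordering shown in the clock listing.

```python
MULTIPLIERS = 4  # products formed per clock in each FIR (assumed from the text)


def accumulate_over_clocks(samples, coeffs, multipliers=MULTIPLIERS):
    """Return the accumulator value after each clock of a sum of products."""
    acc = 0
    history = []
    for start in range(0, len(samples), multipliers):
        stop = start + multipliers
        acc += sum(s * c for s, c in zip(samples[start:stop], coeffs[start:stop]))
        history.append(acc)
    return history


def complex_output(xr, xi, cr, ci):
    """Combine four real partial sums into the pair (YR, YI)."""
    fir_a_real = accumulate_over_clocks(xr, cr)[-1]  # sum of XR(j)*CR(j)
    fir_b_real = accumulate_over_clocks(xi, ci)[-1]  # sum of XI(j)*CI(j)
    fir_a_imag = accumulate_over_clocks(xr, ci)[-1]  # sum of XR(j)*CI(j)
    fir_b_imag = accumulate_over_clocks(xi, cr)[-1]  # sum of XI(j)*CR(j)
    return fir_a_real - fir_b_real, fir_a_imag + fir_b_imag


if __name__ == "__main__":
    # Arbitrary 16-sample test data (not taken from the application note).
    xr = [j - 8 for j in range(16)]
    xi = [3 - (j % 5) for j in range(16)]
    cr = [(j % 4) - 1 for j in range(16)]
    ci = [2 - (j % 3) for j in range(16)]

    yr, yi = complex_output(xr, xi, cr, ci)
    # Reference: the same result computed with complex multiplication.
    ref = sum(complex(a, b) * complex(c, d) for a, b, c, d in zip(xr, xi, cr, ci))
    print("accumulator after each clock (FIR A, real group):",
          accumulate_over_clocks(xr, cr))
    print("matches complex reference:", (yr, yi) == (ref.real, ref.imag))
```

Splitting the complex multiply-accumulate into four real sums of products is what lets each real-valued FIR handle one partial sum, with only a final subtraction (for YR) or addition (for YI) needed to combine them.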
For more details on signal descriptions and on part functionality and operation, refer to the Intersil DSP Data Book. The examples described in this Application Note provide the core architectural and signal processing details that can be followed to implement other complex filters with different length and/or data rate requirements.

[Figure 3 block diagram: XR and XI, sampled at FS = 45MHz, drive the AIN and BIN inputs of eight HSP43168s arranged in two groups of four. Each device produces an output every 4 clocks (11.25MHz). In each group a 45MHz MUX combines the four device outputs: the upper group delivers the R outputs, YR(n) = Σ [XR(j)CR(j) - XI(j)CI(j)], and the lower group delivers the I outputs, YI(n) = Σ [XR(j)CI(j) + XI(j)CR(j)], with j = 0 to N-1 and both at the full input rate.]
† Each HSP43168 is configured to use separate FIR A and FIR B at a decimation of 4. This gives an equivalent of 16 taps per filter.
FIGURE 3. 16-TAP COMPLEX FILTER AT 45MHz

[Figure 4 data flow: the XR samples R15 to R0 shift through FIR A in four groups of four registers, each group feeding a multiplier with CR coefficients; the XI samples I15 to I0 shift through FIR B in the same way with CI coefficients. FIR A accumulates Σ XR(j)CR(j) and FIR B accumulates Σ XI(j)CI(j) for j = 0 to 15. The two sums are combined to give YR = Σ [XR(j)CR(j) - XI(j)CI(j)], with one output every 4 clocks (FSOUT = 11.25MHz). The imaginary output is YI = Σ [XR(j)CI(j) + XI(j)CR(j)]. Four HSP43168s are needed to implement YR and four to implement YI (see Figure 3).]
FIGURE 4. INTERNAL OPERATIONS PER FILTER (YR EXAMPLE)

[Figure 5 timing diagrams for the decimate by 2 and decimate by 4 cases. Signals shown: CLK, INA0-9 (D(N+0) through D(N+8)), CSEL0-2 (cycling 0, 1 for decimate by 2 and 0, 1, 2, 3 for decimate by 4), ACCEN, TXFR and OUT9-27 (Y(N+0)). FWRD, RVRS and SHFTEN should be tied low.]
FIGURE 5. TIMING DIAGRAM OF DECIMATING MODES

All Intersil semiconductor products are manufactured, assembled and tested under ISO9000 quality systems certification.
Intersil semiconductor products are sold by description only. Intersil Corporation reserves the right to make changes in circuit design and/or specifications at any time without notice. Accordingly, the reader is cautioned to verify that data sheets are current before placing orders. Information furnished by Intersil is believed to be accurate and reliable. However, no responsibility is assumed by Intersil or its subsidiaries for its use; nor for any infringements of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of Intersil or its subsidiaries. For information regarding Intersil Corporation and its products, see web site www.intersil.com