FPGA signal processing using SigmaDelta Modulation  Specom 
Embed Size (px)
Transcript of FPGA signal processing using SigmaDelta Modulation  Specom 
���� �
FPGA SIGNAL PROCESSING USING SIGMADELTAMODULATION
1
fred harris
College of Engineering,San Diego State University ,San Diego, CA, [email protected]
Chris Dick
Xilinx Inc.,2100 Logic Drive, San Jose, CA 95124, [email protected]
���������
While Σ∆M techniques [l, 2] are appliedwidely in analog conversion subsystems, both analogtodigital (ADC) and
digitaltoanalog (DAC) converters. these methods have enjoyed much less exposure in thebroader application domain, where flexible andconfigurable solutions, traditionally supplied viaa software DSP (softDSP), are required. And this
limited level of exposure is easy to understand.Most, if not all, of the efficiencies and optimizations afforded by Σ∆M are hardware oriented andso cannot be capitalized on in the fixed precisionpredefined datapath found in a softDSP processor. This limitation, of course, does not exist in afield programmable gate array (FPGA) DSP solution.With FPGAs the designer has complete control ofthe silicon to implement any desired datapathand employ optimal word precisions in the system with the objective of producing a design thatsatisfies the specifications in the most economically sensitive manner .While implementation of a digital Σ∆ ASIC (application specific integrated circuit) is of course possible, economic constraints make the implementation of such a building block that would providethe flexibility, and be generic enough to cover abroad market crosssection, impractical. FPGAbased hardware provides a solution to this problem.FPGAs are an offtheshelf commodity item thatprovide a silicon feature set ideal for constructinghighperformance DSP systems. These devicesmaintain the flexibility of software based solutions. while providing levels of performance thatmatch, and often exceed ASIC solutions.
����������� ����������� �Σ∆��� �����
���������� ������ �������!� �������������
��� �� ������� ����������� �"����� !��� #���� ����������� ����
!�������� ���� ��������� ������$������ ������ ���� ��������
������%����������������������&�Σ∆�����������"'���������(���� )*+, #����� ������� ����������� ����(���'� ���� #�
���#����� ��� �������� ������ �'� ��������!�������� ���
������!!�����������������������"����������������������#�
����&����� ������ ��������� �� ����� ����� �%������&� Σ∆�#����� * ��� �����"��� ��� ��������� )*+, ����(���
���������������� �!� �����(�#���� !������'� �� �� ��������
���� �"#���� ��������������� �������� ������ !��� �� ��!�(���
��!�����������������������&
�"(����.�!��������������#������������"��)*+,�'�������
������ ����������'� )��� !�����'� �� ��������'� ��!�(���
��!����������
ABSTRACT
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
2
There is a rich and expanding body of literaturedevoted to the efficient and effective implementation of digital signal processors using FPGAbased hardware. More often than not, the mostsuccessful of these techniques involves a paradigm shift away from the methods that providegood solutions in software programmable DSPsystems.This paper reports on the rich set of design opportunities that are available to the signal processingsystem designer through innovative combinationsof Σ∆M techniques and FPGA signal processinghardware. The applications considered includenarrowband filters, both singlerate and multirate, DC canceler, and Σ∆M hybrid digitalanalogcontrol loops for simplifying carrier recovery, timing recovery and AGC (automatic gain control)loops in a digital communication receiver.The paper is organized as follows: Section 2 presents a brief overview of FPGA architecture. InSection 3 a simple singleloop baseband Σ∆ modulator is introduced. This structure is then extended to a novel architecture that permits center frequency tuning, as well a method for working withthe system degrees of freedom to tradeoff modulator bandwidth with dynamic range. The tunableΣ∆M is then utilized for implementing area efficient FPGA FIR filters. The process for computingthe modulator coefficients for lowpass, bandpassand highpass designs is described. In Section 4, anew Σ∆ modulator architecture is described thatprovides a very simple method for tuning usingonly a single coefficient. In any fixedpoint datapath, careful consideration must be given to theDC aspects of the design. For example, the introduction of a DC component due tc sample truncation between the stages of a multistage multiratefilter can be problematic, causing arithmetic saturation or increasing the bit error rates in a digitalreceiver. Section 5 describes a unique Σ∆ modulator approach to building a DC canceler. In Section6, Σ∆ methods are described for simplifying theimplementation of hybrid digitalanalog controlloops in a system such as a software definedradio. In Section 7 some comments on the indus
trial implications of the techniques considered inthe paper are presented. Finally, some conclusions are drawn in Section 8.
)*+, ,��2��/����/
There is a rich range of FPGAs provided by manysemiconductor vendors including Xilinx, Altera,Atmel, AT&T and several others. The architectural approaches are as diverse as there are manufacturers, but some generalizations can be made.Most of the devices are basically organized as anarray of logic elements and programmable routing resources used to provide the connectivitybetween the logic elements, FPGA I/O pins andother resources such as onchip memory. Thestructure and complexity of the logic elements, aswell as the organization and functionality supported by the interconnection hierarchy, distinguish the devices from each other. Other devicefeatures such as block memory and delay locked
Figure 1. Generic FPGA architecture
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
3
loop technology are also significant factors thatinfluence the complexity and performance of analgorithm that is implemented using FPGAs.A logic element usually consists of 1 or moreRAM (random access memory) ninput lookuptables, where n is between 3 and 6, and 1 to several flipflops. There may also be additional hardware support in each element to enable highspeedarithmetic operations. This generic FPGA architecture is shown in Figure 1. Alsoillustrated in the figure (as widelines) are several connectionsbetween logic elements and thedevice input/output (1/O) ports.Application specific circuitry issupported in the device by downloading a bit stream into SRAM(static rondom access memory) basedconfiguration memory. This personalization database defines thefunctionality of the logic elements,as well as the internal routing.Different applications are supported on the same FPGA hardwareplatform by configuring theFPGA(s) with appropriate bitstreams. As a specific example, consider the
Xilinx VirtexTM series of FPGAs [3]. The logic elements, called slices, essentially consist of two 4input lookup tables (LUTs), two flipflops, several multiplexors and some additional silicon support that allows the efficient implementation ofcarrychains for building highspeed adders, subtracters and shift registers. Two slices form a configurable logic block (CLB) as shown in Figure2.The CLB is the basic tile that is used to build the
logic matrix. Some FPGAs, like the Xilinx Virtexfamilies, supply onchip block RAM. Figure 3shows the CLB matrix that defines a Virtex FPGA.Current generation Virtex silicon provides a family of devices offering 768 to 12,288 logic slices,and from 8 to 32 variable form factor block memories.Xilinx XC4000 and Virtex [3] devices also allowthe designer to use the logic element LUTs asmemory  either ROM or RAM. Constructingmemory with this distributed memory approachcan yield access bandwidths in the many tens ofgigabytes per second range.Typical clock frequencies for current generation devices are in the multiple tens of megahertz (100 to 200) range. In contrast to the logic slice architecture employed inXilinx Virtex devices, the logic block architectureemployed in the Atmel AT40K [4] FPGA is shownin Figure 4. Like the Xilinx device, combinational
BANK 0 BANK 1
DLL
RAM
RAM
RAM
RAM
BA
NK
6B
AN
K 7
DLL
RAM
RAM
RAM
RAM
BA
NK
3B
AN
K 2
DLL DLL
BANK 5 BANK 4
DLL = DELAY LOCKED LOOPRAM=BLOCK RAM  VARIOUS FORM FACTORS FROM 4096x1 TO 256x16
= 2 SLICE CLB = I/O
BANK N = MULTISTANDARD I/O OUTPUT
Figure 1. Xilinx Virtex logic cell array architecture
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
LUT
G4
G3
G2
G1
Carry &Control
SPD Q
EC
RC
COUT
BY
YB
Y
YQ
LUT
F4
F3
F2
F1
Carry &Control
SPD Q
EC
RCBX
XB
X
XQ
CIN
SLICE 1
LUT
G4
G3
G2
G1
Carry &Control
SPD Q
EC
RC
COUT
BY
YB
Y
YQ
LUT
F4
F3
F2
F1
Carry &Control
SPD Q
EC
RCBX
XB
X
XQ
CIN
SLICE 0
EC = CLOCK ENABLE CONTROLSP = SET/PRESET CONTROLBY = RESET CONTROL
Figure 2. Simplified Virtex CLB architecture
4
logic is realized using lookup tables. In this case,two 3input LUTs and a single flipflop are available in each logic cell. The pass gates in a cell formpart of the signal routing network and are usedfor connecting signals to the multiple horizontaland vertical bus planes. In addition to the orthogonal routing resources, indicated as N, S, E and Win Figure 4, a diagonal group of interconnects(NW, NE, SE, and SW), associated with each cell xoutput, are available to provide efficient connections to a neighboring cell's x bus inputs.The objective of the FPGA/DSP architect is to formulate algorithmic solutions for applications thatbest utilize FPGA resources to achieve therequired functionality.This is a threedimensional
optimization problem in power , complexity and bandwidth. The remainder of thispaper describes some novel FPGA solutions to several signal processing problems. The results are important in anindustrial context because they enableeither smaller, and hence more economic,solutions to important problems, or allowmore arithmetic compute power to be realized with a given area of silicon.
3&�Σ∆���4,�� '�)���)�4�/� �,�)*+,
This section describes a method employingsigmadelta modulation (Σ∆ M) techniques forimplementing area efficient finite impulseresponse (FIR) filters using FPGAhardware.Before treating the FPGA filter design,a brief review of Σ∆ modulation encoding is presented.
Σ∆���4,���
SigmaDelta modulation is a source coding technique most prominently employed in analogtodigital and digitalto analog converters. In thiscontext, hybrid analog and digital circuits areused in the realization. Figure 5 shows a singleloop Σ∆ modulator. Provided the input signal isbusy enough, the linearized discrete time model ofFigure 6 can be used to illustrate the principle. Inthis figure the 1bit quantizer is modeled by anadditive white noise source with varianceδ2e=∆2/12, where ∆ represents the quantizationinterval. The ztransform of the system is
y
a2, a1, a08 x 1 LUT
out
a2, a1, a08 x 1 LUT
out
AND
Q
D
x
x y
1 Nw Se Se Sw 1 1 N S E W
1
10
Nw Se Se Sw N S E W
clock
set/reset
Output Enable Control = Pass Gate
v1
h1
v2
h2
v3
h3
v4
h4
v5
h5
Figure 4. Atmel AT40K logic cell architecture
Figure 5. Single loop Σ∆ modulator
1
1
−z
+1
1x(n) y(n)
+

Figure 6. Linearized model of Σ∆ modulator
1−z y(n)x(n)
+

q(n)
)()()()(
)()(1
1)(
)(1
)()(
zQzHzXzH
zQzH
zXzH
zHzY
ns +=+
++
= (1)
(2)
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
5
which is the transfer function of delay and anideal integrator, and Hs(z) and Hn(z) are the signal and noise transfer functions (NTF) respectively. In a good Σ∆ modulator, Hs(ω) will have a flatfrequency response in the intervalf =< B. Incontrast, Hn(ω) will have a high attenuation in thefrequency bandf=< B and a don't care region inthe interval B<f < fs/2. For the single loopΣ∆ inFigure 6 Hs(z) = z1 and Hn(z) = 1 z1. Thus theinput signal is not distorted in any way by the network and simply experiences a pure delay frominput to output. The performance of the system isdetermined by the noise transfer function Hn(z)which is given by
and is shown in Figure 7. The inband quantization noise variance is
where Sq(f) = δ2q/fs is the power spectral densityof the quantization noise. Observe that for a nonshaped noise (or white) spectrum, increasing the
sampling rate by a factor of 2, while keeping thebandwidth B fixed, reduces the quantizationnoise by 3 dB. For a first order Σ∆M it can beshown that
for fs >> 2B. Under these conditions doubling thesampling frequency reduces the noise power by 9dB, of which 3 dB is due to the reduction in Sq(f)and a further 6 dB is due to the filter characteristic Hn(f). The noise power is reduced by increasing the sampling rate to spread the quantizationnoise over a large bandwidth and then by shapingthe power spectrum using an appropriate filter.
�/��/���*4/5��6 )�4�/� �� ��+�Σ∆���4,����/�2��7�/
Σ∆Μ techniques can be employed for realizingarea efficient narrowband filters in FPGAs. Thesefilters are utilized in many applications. Forexample, narrowband communication receivers,multichannel RF surveillance systems and forsolving some spectrum management problems.A uniform quantizer operating at the Nyquist rateis the standard solution to the problem of representing data within a specified dynamic range.Each additional bit of resolution in the quantizerprovides an increase in dynamic range of approximately 6 dB. A signal with 60 dB of dynamicrange requires 10 bits, while 16 bits can representdata with a dynamic range of 96 dB.While the required dynamic range of a systemfixes the number of bits required to represent thedata, it also affects the expense of subsequentarithmetic operations, in particular multiplications. In any hardware implementation, and ofcourse this includes FPGA based DSP processors,there are strong economic imperatives to minimize the number and complexity, of the arithmetic components employed in the datapath. Theproposal investigated in this section is to employ
1
1)(
−=
zzH
Figure 7. Single loop sigmadelta modulator noise transferfunction.
(3)
where
sn f
ffH
πsin4)( =
(4)
∫+
−=
B
B qnn dffSfH )()(22δ
(5)
2
222 2
3
1
≈
sqn f
Bδπδ
(6)
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
6
noiseshaping techniques to reduce the precisionof the input data samples so that the complexityof the multiplyaccumulate (MAC) units in the filter can be minimized. Of course, the preprocessing must not compromise the integrity of the signal in the band of interest. The net result is areduction in the amount of FPGA logic resourcesrequired to realize the specified filter.Consider the structure shown in Figure 8. Insteadof applying the quantized data x(n) from the analogtodigital converter directly to the filter, it willbe preprocessed by a Σ∆ modulator.
The requantized input samples x(n) are now represented using fewer bits per sample, so permitting the subsequent filter H(z) to employ reducedprecision multipliers in the mechanization. Thefilter coefficients are still kept to a high precision.The Σ∆ data requantizer is based on a single looperror feedback sigmadelta modulator [l] shownin Figure 9. In this configuration, the differencebetween the quantizer input and output sample isa measure of the quantization error which is fedback and combined with the next input sample.The errorfeedback sigmadelta modulator operates on a highly oversampled input and uses theunit delay z1 as a predictor. With this basic errorfeedback modulator only a small fraction of thebandwidth can be occupied by the required signal. In addition, the circuit only operates at baseband. A larger fraction of the Nyquist bandwidthcan be made available and the modulator can betuned if a more sophisticated error predictor isemployed. This requires replacing the unit delaywith a prediction filter P(z). This generalizedmodulator is shown in Figure 10.The operation of the requantizer can be understood by considering the transform domaindescription of the circuit.
This is expressed in Eq. (7) as
where Q(z) is the ztransform of the equivalentnoise source added by the quantizer q(·), P( z ) isthe transfer function of the error predictor filter,and X(z) and X(z) are the transforms of the systeminput and output respectively. P(z) is designed tohave unity gain and leading phase shift in thebandwidth of interest. Within the design bandwidth, the term Q(z)(1  p(z)zl) = 0 and so X(z) =X(z). By designing P(z) to be commensurate withthe system passband specifications, the inbandspectrum of the requantizer output will ideallybe the same as the corresponding spectral regionof the input signal.To illustrate the operation of the system considerthe task of recovering a signal that occupies 10%of the available bandwidth and is centered at anormalized frequency of 0.3 Hz. The stopband
Σ∆ H(z)A/D y(n)x(t)x(n) x(n)
Figure 8. Reducedcomplexity FIR filtering employing a Σ∆preprocessor
1−z
Q( )x(n) y(n)
x(n) q(.)
1−z 1−z 1−z 1−z 1−z
+
+
aM1 a4 a3 a2 a1
x(n)
+

Figure 10. Tunable sigmadelta modulator using a linearmodulator in the feedback path.
Predictor filter P(z)
))(1)(()()(ˆ 1−−+= zzPzQzXzX
(7)
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
7
requirement is to provide 60 dB of attenuation.Figure 11(a) shows the input test signal.It comprises an inband component and two outofband tones that are to be rejected. Figure 11(b) is afrequency domain plot of the signal after it hasbeen requantized to 4 bits of precision by a Σ∆modulator employing an 8th order predictor inthe feedback path. Notice that the 60 dB dynamicrange requirement is supported in the bandwidthof interest, but that the outofband SNR has beencompromised. This is of course acceptable, sincethe subsequent filtering operation will providethe necessary rejection. A 160tap filter H(z) satisfies the problem specifications. The frequencyresponse of H(z) using 12bit filter coefficients isshown in Figure 11(c). Finally, H(z) is applied tothe reduced sample precision data stream X(z) toproduce the spectrum shown in Figure 11(d).Observe that the desired tone has been recovered,the two outofband components have been rejected, and that the inband dynamic range meets the60 dB requirement.
*�/������)�4�/��/ �+�
The design of the error predictor filter is a signalestimation problem [7, 8]. The optimum predictor
is designed from a statistical viewpoint. The optimization criterion is based on the minimization ofthe meansquared error. As a consequence, onlythe secondorder statistics (autocorrelation function) of a stationary process are required in thedetermination of the filter. The error predictor filter is designed to predict samples of a bandlimited white noise process Nxx(ω) shown in Figure 12.
Nxx(ω) is defined as
and related to the autocorrelation sequence rxx
(m) by discretetime Fourier transform (DTFT)
The autocorrelation function rxx(n) is found bytaking the inverse DTFT of Eq. ( 9 )
Nxx(ω) is nonzero only in the interval θ=<ω=<θgiving rxx(n) as
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
Figure 11. (a) Input signal  12b samples. (b) Shaped input 4b samples. (c) Filter response  12b coefficients. (d) Filteredresult.
)(ωxxN
Frequencyπθθ−π−
Figure 12. The Σ∆ modulator error predictor filter isdesigned to predict samples of a narrowband white noiseprocess.
≤≤−
=otherwise
Nxx ....0
...1)(
θωθω
(8)
∑∞
−∞=
−=n
njxxxx ekrN ωω )()(
(9)
∫−
−=π
π
ω ωωπ
deNnr njxxxx )(
2
1)(
(10)
)(sin)( ncnrxx θπθ=
(11)
8
So the autocorrelation function corresponding toa bandlimited white noise power spectrum is asinc function. Samples of this function are used toconstruct an autocorrelation matrix which is usedin the solution of the normal equations to find therequired coefficients. Leaving out the scaling factor in Eq. (11), the required autocorrelation function rxx(n), truncated to p samples, is defined as
The normal equations are defined as
equations [7]. This system of equations can becompactly written in matrix form by first definingseveral matrices.To design a ptap error predictor filter first compute a sinc function consisting of p + 1 samplesand construct the autocorrelation matrix Rxx as
Next define a filter coefficient rowvector A as
A=[a(0),a(1),...,a(p1)]
where ai i = 0, . . . , p 1 are the predictor filter coefficients. Let the rowvector R’xx be defined as
The matriz equivalent of Eq. (13) is
The filter coefficients are therefore given as
For the case inhand, the solution of Eq. (18) is anillconditioned problem. To arrive at a solution forA, a small constant e is added to the elementsalong the diagonal of the autocorrelation matrixRxx in order to raise its condition number. Theactual autocorrelation matrix used to solve for thepredictor filter coefficients is
8,�*, �*�/����
The previous section described the design of alowpass predictor. In this section bandpassprocesses are considered.A bandpass predictorfilter is designed by modulating a lowpass prototype sinc function to the required center frequency θ0 [10]. The bandpass predictor coefficientshBP(n) are obtained by solving the normal equations with a heterodyned sinc function
2�+2*, �*�/����
The highpass predictor coefficients hHP(n) areobtained by solving the normal equations with asinc function heterodyned to the half sample rate
sincHP(n) = sincLP(n)(l)nk n = 0, ...,2p
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
1,...,0,)sin(
)( −== pnn
nnrxx θ
θ
(12)
pm
kmrkamr xx
p
kxx
,....,2,1
)()()(1
=
−= ∑=
(13)
−−
−−
=
)0(....)2()1(
................
)2(....)0()1(
)1(....)1()0(
xxxxxx
xxxxxx
xxxxxx
xx
rprpr
prrr
prrr
R
(14)
)](),....,2(),1([' prrrR xxxxxxxx =(16)
(15)
Txx
Txx RAR )'(= (17)
Txxxx
T RRA )'(1−=(18)
∈+−−
−∈+−∈+
=
)0(....)2()1(
................
)2(....)0()1(
)1(....)1()0(
xxxxxx
xxxxxx
xxxxxx
xx
rprpr
Mrrr
prrr
R
(19)
pn
knncnc LPBP
2,...,0
))(cos()(sin)(sin 0
=−= θ
(20)
(21)
9
Σ∆���4,���)*+, ��*4/�/��,���
The most challenging aspect of implementing thedata modulator is producing an efficient implementation for the prediction filter P(z). The desireto support highsample rates, and the requirement of zero latency for P(z), will preclude bitserial methods from this problem. In addition, forthe sake of area efficiency, parallel multipliersthat exploit one timeinvariant input operand (thefilter coefficients) will be used, rather than general variablevariable multipliers. The constant coefficient multiplier (KCM) is based on a multibitinspection version of Booth's algorithm [9].Partitioning the input variable into 4bit nibbles isa convenient selection for the Xilinx Virtex function generators (FG) [3]. Each FG has 4 inputs andcan be used for combinatorial logic or as application RAM/ROM. Each logic slice [3] in the Virtexlogic fabric comprises 2 FGs, and so can accommodate a 16 x 2 memory slice. Using the rule ofthumb that each bit of filter coefficient precisioncontributes 5 dB to the sidelobe behavior, 12bitprecision is used for P(z). 12bit precision will alsobe employed for the input samples . There are 34bit nibbles in each input sample. Concurrently,each nibble addresses independent 16 x 16 lookuptables (LUTs). The bit growth incorporated hereallows for worst case filter coefficient scaling inP(z). No pipeline stages are permitted in the multipliers because of P(z)'s location in the feedbackpath of the modulator. It is convenient to use thetransposed FIR filter for constructing the predictor. This allows the adders and delay elements inthe structure to occupy a single slice. 64 slices arerequired to build the accumulatedelay path. TheFPGA logic requirements for P(z), using a9tappredictor, is Γ(p(z)) =9x 40+64=424 CLBs. A smallamount of additional logic is required to completethe entire Σ∆ modulator. The final slice count is450. The entire modulator comfortably operateswith a 113 MHz clock. This clock frequencydefines the system sample rate, so the architecturecan support a throughput of 113 MSamples persecond. The critical path through this part of the
design is related to the exclusion of pipelining inthe multipliers.
�/��/���*4/5��6 )����/�2,��9,���
Now that the input signal is available as a reducedprecision sample stream, filtering can be performed using area optimized hardware. For thereasons discussed above, 4bit data samples are aconvenient match for Virtex devices. Figure 13shows the structure of the reduced complexityFIR filter. The coded samples x(n) are presented tothe address
inputs of N coefficient LUTs. In accordance withthe modulated data stream precision, each LUTstores the 16 possible scaled coefficient values forone tap as shown in Figure 14.An Ntap filterrequires N such elements. The outputs of the minimized multipliers are combined with an adddelay datapath to produce the final result. Thelogic requirement for the filter is Γ(H(z)) =NΓ(MUL)+(N 1)Γ(ADD_z1) where Γ(MUL) andΓ(ADDz1) are the FPGA area cost functions fora KCM multiplier and an adddelay datapathcomponent respectively.Using fullprecision input samples without anyΣ∆M encoding, each KCM would occupy 40 slices.The total cost of a direct implementation of H(z) is7672 slices. The reduced precision KCMs used toprocess the encoded data each consume only 8slices. Including the sigmadelta modulator theslice count is 3002 for the Σ∆ approach. So thedata requantization approach consumes only
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
1−z 1−z 1−z
x(n)
y(n)
0a3−Na2−Na1−Na
.....
.....
.....
implemented as a table lookup constantcoefficient multiplier (KCM)
Figure 13. Area optimized FPGA FIR structure
10
39% of the logic resources of a direct implementation.
Σ∆�/���,��
The procedure for requantizing the source datacan also be used effectively in an m . 1 decimationfilter. An interesting problem is presented whenhigh input sample rates (>= 150MHz) must besupported in FPGA technology. Highperformance multipliers are typically realized by incorporating pipelining in the design. This naturallyintroduces some latency in to the system. Thelocation of the predictor filter P(z) requires a zerolatency design.1 Instead of requantizing, filteringand decimating, which would of course require aΣ∆ modulator running at the input sample rate,this sequence of operations must reordered to
permit several slower modulators to be used inparallel. The process is performed by first decimating the signal, requantizing and then filtering. Now the Σ∆ modulators operate at thereduced output sample rate. This is depicted inFigure 15. To support arbitrary center frequencies,and any arbitrary, but integer, downsamplingfactor m, the bandpass decimation filter mustemploy complex weights. The filter weights are ofcourse just the bandpass modulated coefficientsof a lowpass prototype filter designed to supportthe bandwidth of the target signal. Samples arecollected from the A/D and alternated betweenthe two modulators. Both modulators are identical and use the same predictor filter coefficients.The requantized samples are processed by anm:1 complex polyphase filter to produce the decimated signal. Several design options are presented once the signal has been filtered and the sample rate lowered. Figure 15 illustrates one possibility. Now that the data rate has been reduced, thelow rate signal is easily shifted to baseband with asimple, and area efficient, complex heterodyne.One multiplier and a single digital frequency synthesizer could be time shared to extract one ormultiple channels.It is interesting to investigate some of the changesthat are required to support the Σ∆ decimator.What may not be immediately obvious is that thecenter frequency of the prediction filter must bedesigned to predict samples in the required spectral region in accordance with the output samplerate. For example, consider m=2, and the requiredchannel center frequency located at 0.1 Hz, nor
1 It is possible that the predictor could be modified topredict samples further ahead in the time series, butthis potential modification will not be dealt with in thelimited space available.
0 x an
2 x an
1 x an
3 x an
5 x an
4 x an
8 x an
7 x an
7 x an
5 x an
6 x an
6 x an
3 x an
2 x an
1 x an
QuantizedData
Samples
4
Address bus
Implemented using FPGAfunction generators
Figure 14. Constant coefficient multiplier.
A/D
0Σ∆ )(0 zH)(ˆ0 nx
1Σ∆ )(1 zH)(ˆ1 nx
)(tx
)cos( 0nθ
)sin( 0nθ
I
Q
Figure 15. Dual halfrate Σ∆ modulators in m:1 complexdecimator configuration.
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
11
malized with respect to the input sample rate. Theprediction filter must be designed with a centerfrequency located at 0.2 Hz. In addition, the quality of the prediction must be improved. Withrespect to the output sample rate, the predictorsare required to operate over a wider fractionalbandwidth. This implies more filter coefficients inP(z).The increase in complexity of this componentmust of course be balanced against the savingsthat result in the reduced complexity filter stageto confirm that a net savings in logic requirementsis produced. To more clearly demonstrate theapproach, consider a 2:1 decimator, a channel center frequency at 0.2 Hz and a 60 dB dynamic rangerequirement.Figure 16(a) shows the double sided spectrum ofthe input test signal. The input signal is commutated between
Σ∆0 and Σ∆1 to produce the two low precisionsequences xo(n) and x1(n). The respective spec
trums of these two signals are shown in Figures16(b) and 16(c). The complex decimation filterresponse is defined in Figure 16(d). After filtering,a complex sample stream supported at the lowoutput sample rate is produced. This spectrum isshown in Figure 16(e). Observe that the outofband components in the test signal have beenrejected by the specified amount and that the inband data meets the 60 dB dynamic rangerequirement. For comparison, the signal spectrumresulting from applying the processing stages inthe order, requantize, filter and decimate is shownin Figure 16(f). The interesting point to note is thatwhile the dual Σ∆ modulator approach satisfiesthe system performance requirements, its outofband performance is not quite as good as theresponse depicted in Figure 16(f). The stopbandperformance of the dual modulator architecturehas degraded by approximately 6 dB. This can beexplained by noting that the shaping noise produced by each modulator is essentially statistically independent. Since there is no couplingbetween these two components prior to filtering,complete phase cancelation of the modulatornoise cannot occur in the polyphase filter.
� �� ��
To provide a frame of reference for the Σ∆ decimator, consider an implementation that does notpreprocess the input data, but just applies itdirectly to a polyphase decimation filter. A complex filter processing realvalued data consumesdouble the FPGA resources of a filter with realweights. For N = 160, 15344 CLBs are required.This figure is based on a cost of 40 CLBs for eachKCM and 8 CLBs for an adddelay component.Now consider the logic accounting for the dualmodulator approach. The area cost Γ(FIR) for thisfilter is
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
Figure 16. (a) Input signal. (b) Shaped data xo(n). (c) Shapeddata xl(n). (d) Complex filter. (e) Recovered result. (f) Filteredsignal  single modulator.
)_()ˆ()(2)ˆ( 1−Γ+Γ+Σ∆Γ=Γ zACCLUMRIF
(22)
12
where Γ(Σ∆) represents the logic requirements forone Σ∆ modulator, and Γ(MUL) is the logic needed for a reduced precision multiplier. Using thefilter specifications defined earlier, and 18taperror prediction filters, Γ(FIR) = 2 x 738 + 2 x ((160+ 159) x 8) = 6596. Comparing the area requirements of the two options produces the ratio
So for this example, the requantization approachhas produced a realization that is significantlymore area efficient than a standard tappeddelayline implementation.
�/��/��)�/7�/��6 �����+
For both the singlerate and multirate Σ∆ basedarchitectures, the center frequency is defined bythe coefficients in the predictor filter and the coefficients in the primary filter.The constant coefficient multipliers can be constructed using theFPGA function generators configured as RAMelements. When the system center frequency is tobe changed, the system control hardware wouldupdate all of the tables to reflect the new channelrequirements. If only several channel locations areanticipated, separate configuration bit streams [3]could be stored, and the FPGA(s) reconfigured asneeded.
8,�*, �Σ∆���� ��+�,44*, ��/�:�;
In an earlier section we discussed how to design apredicting filter for the feedback loop of a standard sigma delta modulator. The predicting filterincreases the order of the modulator so that themodified structure has additional degrees of freedom relative to a singledelay noise feedbackloop.These extra degrees of freedom have beenused in two ways, first to broaden the bandwidth
of the loop's noise transfer function, and second totune its center frequency. The tuning processentailed an off line solution of the Normal equations which while not difficult, does present asmall delay and the need for a backgroundprocessor. We can define a sigmadelta loop witha completely different architecture that offers thesame flexibility, namely wider bandwidth and atunable center frequency that does not requirethis background task. In this alternate architecture, a fixed set of feedback weights from a set ofdigital integrators defines a baseband prototypefilter with a desirable NTF. The filter is tuned toarbitrary frequencies by attaching to each delayelement zl, a simple subprocessing element thatperforms a baseband to bandpass transformation of the prototype filter. This processing element tunes the center frequency of its host prototype with a single real and selectable scalar.Thestructure of a fourth order prototype sigmadeltaloop is shown in Figure 17. The time and spectrumobtained by using the loop with a 4bit quantizeris shown in Figure 18. In this structure the digitalintegrator poles are located on the unit circle atDC. The local feedback (a1 and a2) separates thepoles by sliding them along the unit circle, andthe global feedback (b1, b2, b3 and b4) placesthese poles in the feedback path of the quantizerso they become noise transfer function zeros.
%4315344/6596)(
)ˆ( ≈=ΓΓ=
FIR
RIFλ
(23)
Figure 18. Input and output time series of baseband prototype 4th order, 4bit sigmadelta loop (top) and output spectrum(bottom).
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
13
These zeros are positioned to form an equalripple stop band for the NTF. The coefficients selected to match the NTF polezero locations to anelliptic high pass filter. The single sided bandwidth of this fourth order loop is approximately4% of the input sample rate.The lowpass to bandpass transformation for asampled data filter is achieved by substituting an
allpass transfer function G(z) for the allpasstransfer function zl. This transformation isshown in Eq.(24).
A block diagram of a digital filter with the transfer function for G(z) is shown in Figure 19.Examining the left hand block diagram, we findthe transfer function from x(n) to y(n) is the allpass network (1  cz)/(z  c) while the transferfunction from x(n) to v(n) is (l/z)(lcz)/(zc).weabsorb the external negative sign change in the
internal addersof the filter weobtain the simple righthandside version ofthe desiredtransfer functionG(z).After the blockdiagram substitution has been
made, we obtainFigure 20, the tunable version of the lowpass prototype. The basic structure of the prototyperemains the same when we replace the delay withthe tunable allpass network. The order of the filter is doubled by the substitution since each delayis replaced by a second order subfilter. Tuning istrivially accomplished by changing the c multiplier of the allpass network. The tuned version of
the system reverts back to the prototyperesponse if we set c to 1.Figure 21 presents the time and spectrumobtained by using the tunable loop with a4bit quantizer shown in Figure 20. Thesingle sided bandwidth of the prototypefilter is distributed to the positive and negative spectral bands of the tuned filter.
Thus the twosided bandwidth of each
1−z 1−z
q(.)
)(nx
)(ny
1a
1b
1−z 1−z
2a
2b 3b 4b


Figure 17. Fourth order sigmadelta loop.
1−z
1−z

)(ny
)(nx
)(nv
1
c
1−z 1−z)(nx )(nv
c

Figure 19. Block diagram of allpass transfer function G(z) form Eq.(1)
−−−→ −−
cz
czzz
111
(24)
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
Figure 21. Input and output time series of tuned basebandprototype 4th order, 4bit sigmadelta loop (top), filter andmodulator output spectrum (middle), and filtered output spectrum (bottom).
14
spectral band is approximately 4% of the inputsample rate.We now estimate the computational workloadrequired to operate the prototype and tunable filter. The prototype filter has six coefficients toform the 4poles and the 4zeros of the transferfunction. The two ak k = 0, 1 coefficients determine the four zero locations. These are small coefficients and can be set to simple binary scalers.The values computed for this filter for al and a2
were 0.0594 and 0.0110.These can be approximated by 1116 and 11128 which lead to no significantshift of the spectral zeros in the NTF. These simple multiplications are of course virtually free inthe FPGA hardware since they are implementedwith suitable wiring. The four coefficients bk k =0, ... , 3 are 1.000, 0.6311, 0.1916, and 0.0283 respectively were replaced with coefficients containingone or two binary symbols to obtain values 1.000,1;2+1/8 (.625), 118+1/16 (0.1875) and 1;32(0.03125). When the sigmadelta loop ran withthese coefficients there was no discernable changein bandwidth or attenuation level of the loop. Theloop operates equally as well in the tuning modeand the non tuning mode with the approximatecoefficients listed above. Thus the only real multiplies in the tunable sigmadelta loop are the ccoefficients of the allpass networks. These networks are unconditionally stable and alwaysexhibit allpass behavior even in the presence offinite arithmetic and finite coefficients.This isbecause the same coefficient forms the numeratorand the denominator. Errors in approximating the
coefficients for csimply result in afrequency shift ofthe filter's tunedcenter. The ccoefficient isdetermined fromthe cosine of thecenter frequency(in radians/sample). The curvefor this relation
ship is shown in Figure 22. Also shown is an errordue to approximating c by c +δc. The question is,what is the change in center frequency θ, from θto θ + δθ due to the approximation of c? We cansee that the slope at the operating point on thecosine curve is  sin(θ), so that δc/δθ ≈ sin(θ) sothat δc ≈ δθsin(θ) is the required precision tomaintain a specified error. We note that tuningsensitivity is most severe for small frequencieswhere sin(θ) is near zero. The tolerance term, δθsin(θ) , is quadratic for small frequencies, but thelowest frequency that can be tuned by the loop ishalf the NTF passband bandwidth.For the fourthorder system described here, this bandwidth is4% of sample rate, so the half bandwidth angle is2% or 0.126 radians. To assure that the frequencyto which the loop is tuned has an error smallerthan 1% of center frequency δc < δθsin(θ) => δc <(0.1261100)(0.126) = 0.0002 which corresponds to a14 bit coefficient. An error of less than 10% centerfrequency can be achieved with 10 bit coefficients.
G(z) G(z)
q(.)
)(nx
)(ny
1a
1b
G(z) G(z)
2a
2b 3b 4b


Figure 20. Tunable sigmadelta loop. Prototype is fourth Order. Tunable version is eight order.
Figure 22. Relationship between errors in constant c in allpass networks to errors in tuning frequency.
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
15
The tuning multipliers could be implemented asfull multipliers in the FPGA hardware or asdynamically reconfigured KCMs, or KDCM, asshown in Figure 23. The later approach conservesFPGA resources at the expense of introducing astartup penalty each time the center frequency ischanged. The startup period is the initializationtime of the KCM LUT. When a new center frequency is desired, the tuning constant is presented to the k input of the KDCM and the load signalLD is asserted. This starts the initialization enginewhich requires 16 clock cycles to initialize 16 locations in the multiplier LUT. The initializationengine relies on the automatic shift mode [31 ofthe Virtex LUTs. In this mode of operation a LUT'sregister contents are passed from one cell to thenext cell on each clock tick.This avoids therequirement for a separate address generator andmultiplexor in the initialization hardware.Observe from Figure 23 that the initializationengine only introduces a small amount of additional hardware over that of a static KCM.There is approximately a factor of 4 difference inthe area of a KDCM and full multiplier.
Σ∆����,��/4/�
Unwanted DC components can be introduced intoa DSP datapath at several places. It may be presented to the system via an untrimmed offset in
the analogtodigital conversion preprocessingcircuit, or may be attributed to bias in the AIDconverter itself. Even if the sampled input signalhas a zero mean, DC content can be introducedthough arithmetic truncation processes in thefixedpoint datapath. For example, in a multistage multirate filter , the intermediate filter output samples may be quantized between stages inorder to compensate for the filter processing gainand thereby keep the wordlength requirementsmanageable. The introduced DC bias can impactthe dynamic range performance of a system andpotentially increase the error rate in a digitalreceiver application.In a fixedpoint datapath, the bias can causeunnecessary saturation events that would notoccur if the DC was not present in the system.In a digital communication receiver employing Mary QAM modulation, the DC bias can interferewith the symbol decision process, so causingincorrect decoding and therefore increasing thebit error rate.In some cases the introduced bias can be ignoredand is of no concern. However, for other applications it is desirable to remove the DC component.One solution to removing the unwanted DClevel is to employ a DC canceler.A simple canceler is shown in Figure 24. It is easyto show that the transfer function of the networkis
The cancelation is due to the transfer functionzero at 0 Hz. The pole at 1 µ controls the system
WE
DOUT
DIN
ADDR
CONTROL

LDLD 16
k
A
k x A
INITIALIZATION ENGINE
= REGISTERDISTRIBUTED
RAM LUT
Figure 23. Loadable constant coefficient multiplier (KDCM).
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
1−z
Q( )

+)(nx
)(ny)(nyq
µ
Figure 24. Simple DC canceler.
)1(
1)(
µ−−−=
z
zzH
(25)
16
bandwidth and hence the system transientresponse. The location of the zero at z = 1 ofcourse, completely removes the DC component inthe signal, but there are some problems with apractical implementation of this circuit.Figure 25a. is a spectral domain representation ofa biased signal presented to the DC canceler.Figure 25b is the processed signal spectrum atyq(n) in Figure 24. We observe that the DC contentin the input signal has been completely removed.However, in the process of running the cancelingloop the network processing gain has caused adynamic range expansion. So although the sample stream yq(n) is a zero mean process, it requiresa larger number of bits to represent each samplethan is desirable. The only option with the circuitis to requantize yq(n) to produce y(n) using thequantizer Q(·). The effect of this operation isshown in Figure 25c, which demonstrates, notsurprisingly, that after an 8bit quantizer, the signal now has a DC component and we are almostback to where we started. How
can the canceler be reorganized to avoid thisimplementation pitfall? One option is to embedthe requantizer in the feedback loop in the formof a Σ∆ modulator as shown in Figure 26. Themodulator can be a very simple 1st order loop
such as the error feedback Σ∆ modulator shownin Figure 9. Figure 25d demonstrates the operation of the circuit for 8bit output data. Observefrom the figure that the DC has been removedfrom the signal while employing the same 8bitoutput sample precision that was used in Figure24. The simple Σ∆M employed in the canceler iseasily implemented in an FPGA.
��*4�)6 �+��,4 �/�/�</������4 4*
� ��+�Σ∆���4,��
In earlier sections we recognized that when asampled data input signal has a bandwidth that isa small fraction of its sample rate the sample components from this restricted bandwidth are highly correlated. We took advantage of that correlation to use a digital sigmadelta modul8.tor torequantize the signal to a reduced number of bits.The sigmadelta modulator encodes the input signal with 8. reduced number of bits while preserving full input precision over the signal bandwidthby placing the increased noise due to requantization in outofband spectral positions that are
Figure 25. (a) Input signal. (b) DC canceler. (c) QuantizedDC canceler. (d) Σ∆ DC canceler.
1−z

+)(nx )(ny
µ
Σ∆
Figure 26. Sigmadelta based DC canceler.
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
VCO VCO
DAC DAC DAC
REG REG REG
Digital Receiver
Control Loops
RF INAGC
Control Data
DemodulatedData
Figure 27. Block diagram of digital receiver showing feedback paths containing digital signals converted to analog control signals for analog components.
17
already scheduled to be rejected by subsequentDSP processing. The purpose of this requantization is to allow the subsequent DSP processing tobe performed with reduced arithmetic resourcerequirements since the desired data is now represented by a smaller number of bits.A similar remodulation of data samples can by beemployed for signals generated within a DSPprocess when the bandwidth of the signals aresmall compared to the sample rate of the process.A common example of this circumstance is thegeneration of control signals used in feedbackpaths of a digital receiver. These control signalsinclude a gain control signal for a voltage controlled amplifier in an automatic gain control(AGC) loop and VCO (voltage controlled oscillator) control signals in carrier recovery and timingrecovery loops [10, 11, 12]. A block diagram of areceiver with these specific controls signals isshown in Figure 27. The control signals are generated from processes operating at a sample rateappropriate to the input signal bandwidth. Thebandwidth of control loops in a receiver are usually a very small fraction of the signal bandwidth,which means that the control signal are veryheavily oversampled. As a typical example, in acable TV modem, the input bandwidth is 6 MHz,the processing sample rate is 20 MHz, and theloop bandwidth may be 50 kHz. For this example,the ratio of sample rate to bandwidth is 4000to1.As seen in Figure 27, the process of deliveringthese oversampled control signals to their respective control points entails the transfer of 16 bitwords to external control registers, requiringappropriate busses, addressing, and enable linesas well as the operation of 16bit digitaltoanalogconverters (DACs).We can use a sigmadelta modulator to requantizethe 16bit oversampled control signals in the digital receiver prior to passing them out of the processing chip. The sigmadelta can preserve therequired dynamic range over the signal's restricted bandwidth with a onebit output. As suggested in Figure 28, the transfer of a single bit to control the analog components is a significantly less
difficult task than the original. We no longerrequire registers to accept the transfer, the bussesto deliver the bits, or the DAC to convert the digital data to the analog levels the data represents.All that is needed a simple filter ( and likely ananalog amplifier to satisfy drive level and offsetrequirements). Experience shows that a 1bit, oneloop sigmadelta modulator could achieve 80 dB
dynamic range and requires a single RC filter toreconstruct the analog signal. A twoloop sigmadelta modulator is required to achieve 16bit precision for which a double RC filter is required toreconstruct the analog output signal. Figure 29shows the time response of the onebit two loopsigmadelta converter to a slowly varying controlsignal and the reconstructed signal obtained fromthe dualRC filter. Figure 30 shows the spectrumobtained from a 1bit twoloop modulator and thespectrum obtained from an unbuffered RCRC filter.This example has shown how with minimal additional hardware, an FPGA can generate analogcontrol signals to control lowbandwidth analogfunctions in a system.An observation worthy of note, is that the audioengineering community has recognized theadvantage offered by this option of requantizing a16bit oversampled data stream to 1bit datastream. In that community, the output signal isintentionally upsampled by a factor of 64 andthen requantized to 1bit in a process called aMASH converter. Nearly all CD players use the
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
VCO VCO
Digital Receiver
Control Loops
Σ∆ Modulators
RF INAGC
DemodulatedData
Figure 28. Block diagram of digital receiver showing feedback paths delivering and processing onebit analog control signals generated by sigmadelta modulators.
18
MASH converter to deliver analog audio signals.
:2,� 2,</�:/�+,��/=
What has been achieved by expressing our signalprocessing problems in terms of Σ∆M techniques?The paper has demonstrated some Σ∆M techniques for the compact implementation of certaintypes of filter and control applications usingFPGAs. This optimization can be used in several
ways to bring economic benefits to a commercialdesign. By exploiting Σ∆M filter processes, agiven processing load may be realizable in alowerdensity, and hence less expensive, FPGAthan is possible without access to these techniques. An alternative would be to perform moreprocessing using the same hardware. For example, processing multiple channels in a communication system.In addition to FPGA area tradeoffs, the Σ∆Mmethods can result in reduced power consumption in a design. Power p may be expressed as
where c is capacitance, v is voltage and fclk is thesystem clock frequency. By reducing the siliconarea requirements of a filter, we can simultaneously reduce the power consumption of thedesign. For the examples considered earlier, logicresource savings of greater than 50% weredemonstrated. The savings is proportional toincreased efficiency in the system power budget,and this of course is very important for mobileapplications.The Σ∆M AGC, timing and carrier recovery control loop designs are also important examples in aindustrial context. The examples illustrated howthe component count in a mixed analog/digitalsystem can be reduced. In fact, not only is thecomponent count reduced, but printed circuitboard area is minimized. This results in more reliable and physically smaller implementations. Thereduced component count also results in reducedpower consumption.In addition, since the controlloops no longer require wide output buses fromthe FPGA to multibit DACs that generate analogcontrol voltages, power consumption is decreasedbecause fewer FPGA I/O pads are being driven.
���4� ��
FPGAs opens a range of opportunities in the solu
Figure 29. Control loop signal and Σ∆ encoded control signal (top), filtered Σ∆ sequence (bottom).
Figure 30. Σ∆ encoded control signal spectrum (top), filteredΣ∆ sequence spectrum (middle), exploded view at baseband offiltered Σ∆ control signal (bottom).
clkfcvP 2=(26)
�#���������������///� �����������������$������������������ ������*���������'�,������0111&
19
tion space that can result in highperformanceand economic solutions to a DSP problem. Oftenthis is best achieved via an appropriate paradigmshift in the algorithmic domain to find an optimalapproach that best exploits the cellular LUT /flipflop FPGA architecture. This paper has illustratedhow Σ∆M techniques can be combined withFPGA technology to address a range of signalprocessing problems. These included single andmultirate filters, DC canceler and the efficientand compact generation of analog control signalsfor AGC, carrier recovery and timing recoveryfunctions in a communication receiver. The sourcedata requantization approach is suitable for bothsinglerate and multirate filter processes. Theproposed method arms the DSP /FPGA engineerwith another tool that is useful for certain filteringrequirements. For the examples considered here,logic savings in excess of 50% were demonstrated.As the frequency band of interest occupies asmaller fractional bandwidth, the order of therequired filter increases. This growth tends tomake the data requantization approach moreattractive, as the cost of modulator consumes adecreasing proportion of the entire design.While the paper has exclusively focused on Σ∆Mmethods in the context of FPGA hardware, we feelthat there is broader lesson delivered in the study.The signal processing literature is full of creativesolutions to real world problems. Often thesesolutions are excluded to a designer because theydo not map well to software programmable DSParchitectures. The algorithm will of course havean ASSP solution, but this may not be an optionfor reasons of schedule, economies of scale, andflexibility. FPGAs do, however, give immediateaccess to the diverse range of potential solutions.And they do so while simultaneously providingflexibility and highperformance  frequently theperformance equals or exceeds that of an ASSP.In this era of systemsonachip, increased fiscalpressure, tighter engineering deadlines, timetomarket constraints, and increasing performancedemands, new system level and hardware architectures must be employed. FPGAs provide a
mechanism for working with and performingtradeoffs between all of these important variables. As we move into the next millennium,reconfigurable FPGA technology will increasingly provide solutions to signal processing problems.
�/)/�/��/
[l] J. C. Candy and G. C. Temes, "OversamplingMethods for A/D and DIA Conversion" inOversampling DeltaSigma Data Converters Theory Design and Simulation, IEEE Press, NewYork, 1992.[2] S. R. Norsworthy, R. Schreier and G. C.Temes, "DeltaSigma Data Converters", IEEE Press,Piscataway, NJ, 1997.[3] Xilinx Inc., The Programmable Logic Data Book,1999.[4] Atmel, AT40K/05/10/20/40 Data Sheet, 1999.[5] A. Peled and B. Liu, "A New HardwareRealization of Digital Filters", IEEE Trans. onAcoust., Speech, Signal Processing, vol. 22, pp.456462, Dec. 1974.[6] S. A. White, "Applications of DistributedArithmetic to Digital Signal Processing” , IEEE ASSPMagazine, Vol. 6(3), pp. 419, July 1989.[7] J. G. Proakis, and D. G. Manolakis, “DigitalSignal Processing Principles, Algorithms andApplications” Second Edition, Maxwell MacmillanInternational, New York, 1992.[8] S. Haykin, "Modern Filters" , MacmillanPublishing Co., New York, 1990.[9] D. A. Patterson and J. L. Hennessy, “ComputerArchitecture: A Quantitative Approach”, MorganKaufmann Publishers Inc., California, 1990.[10] M. E. Frerking, “Digital Signal Processing inCommuncation Systems”, Van Nostrand Reinhold,New York, 1994.[11] H. Meyr, M. Moeneclaey and S. A. Fechtal,“Digital Communication Receivers”, John Wiley &Sons Inc., New York, 1998.[12] J. A. C. Bingham, “The Theory and Practice ofModem Design”, J. Wiley&Sons, New York, 1998.
�#���������������///� �����������������$������������������ ������*���������'�,������0111&