FPGA signal processing using Sigma-Delta Modulation - Specom |


FPGA SIGNAL PROCESSING USING SIGMA-DELTA MODULATION

fred harris

College of Engineering, San Diego State University, San Diego, CA, [email protected]

Chris Dick

Xilinx Inc., 2100 Logic Drive, San Jose, CA 95124, [email protected]

While Σ∆M techniques [1, 2] are applied widely in analog conversion sub-systems, both analog-to-digital (ADC) and digital-to-analog (DAC) converters, these methods have enjoyed much less exposure in the broader application domain, where flexible and configurable solutions, traditionally supplied via a software DSP (soft-DSP), are required. This limited level of exposure is easy to understand. Most, if not all, of the efficiencies and optimizations afforded by Σ∆M are hardware oriented and so cannot be capitalized on in the fixed-precision, pre-defined datapath found in a soft-DSP processor. This limitation, of course, does not exist in a field programmable gate array (FPGA) DSP solution. With FPGAs the designer has complete control of the silicon to implement any desired datapath and to employ optimal word precisions in the system, with the objective of producing a design that satisfies the specifications in the most economical manner.

While implementation of a digital Σ∆ ASIC (application specific integrated circuit) is of course possible, economic constraints make it impractical to implement such a building block with enough flexibility, and enough generality, to cover a broad market cross-section. FPGA based hardware provides a solution to this problem.

FPGAs are an off-the-shelf commodity item that provide a silicon feature set ideal for constructing high-performance DSP systems. These devices maintain the flexibility of software based solutions, while providing levels of performance that match, and often exceed, ASIC solutions.


ABSTRACT

�#��///� ��$�� *��'�,��0111&


There is a rich and expanding body of literature devoted to the efficient and effective implementation of digital signal processors using FPGA based hardware. More often than not, the most successful of these techniques involve a paradigm shift away from the methods that provide good solutions in software programmable DSP systems.

This paper reports on the rich set of design opportunities that are available to the signal processing system designer through innovative combinations of Σ∆M techniques and FPGA signal processing hardware. The applications considered include narrow-band filters, both single-rate and multi-rate, DC cancelers, and Σ∆M hybrid digital-analog control loops for simplifying carrier recovery, timing recovery and AGC (automatic gain control) loops in a digital communication receiver.

The paper is organized as follows. Section 2 presents a brief overview of FPGA architecture. In Section 3 a simple single-loop base-band Σ∆ modulator is introduced. This structure is then extended to a novel architecture that permits center frequency tuning, as well as a method for working with the system degrees of freedom to trade off modulator bandwidth against dynamic range. The tunable Σ∆M is then utilized for implementing area efficient FPGA FIR filters. The process for computing the modulator coefficients for lowpass, bandpass and highpass designs is described. In Section 4, a new Σ∆ modulator architecture is described that provides a very simple method for tuning using only a single coefficient. In any fixed-point datapath, careful consideration must be given to the DC aspects of the design. For example, the introduction of a DC component due to sample truncation between the stages of a multistage multi-rate filter can be problematic, causing arithmetic saturation or increasing the bit error rates in a digital receiver. Section 5 describes a unique Σ∆ modulator approach to building a DC canceler. In Section 6, Σ∆ methods are described for simplifying the implementation of hybrid digital-analog control loops in a system such as a software defined radio. In Section 7 some comments on the industrial implications of the techniques considered in the paper are presented. Finally, some conclusions are drawn in Section 8.

FPGA ARCHITECTURE

There is a rich range of FPGAs provided by many semiconductor vendors including Xilinx, Altera, Atmel, AT&T and several others. The architectural approaches are as diverse as there are manufacturers, but some generalizations can be made. Most of the devices are basically organized as an array of logic elements and programmable routing resources used to provide the connectivity between the logic elements, FPGA I/O pins and other resources such as on-chip memory. The structure and complexity of the logic elements, as well as the organization and functionality supported by the interconnection hierarchy, distinguish the devices from each other. Other device features such as block memory and delay locked

Figure 1. Generic FPGA architecture

loop technology are also significant factors that influence the complexity and performance of an algorithm that is implemented using FPGAs.

A logic element usually consists of one or more RAM (random access memory) n-input look-up tables, where n is between 3 and 6, and one to several flip-flops. There may also be additional hardware support in each element to enable high-speed arithmetic operations. This generic FPGA architecture is shown in Figure 1. Also illustrated in the figure (as wide lines) are several connections between logic elements and the device input/output (I/O) ports. Application specific circuitry is supported in the device by downloading a bit stream into SRAM (static random access memory) based configuration memory. This personalization database defines the functionality of the logic elements, as well as the internal routing. Different applications are supported on the same FPGA hardware platform by configuring the FPGA(s) with appropriate bit streams.

As a specific example, consider the Xilinx Virtex™ series of FPGAs [3]. The logic elements, called slices, essentially consist of two 4-input look-up tables (LUTs), two flip-flops, several multiplexors and some additional silicon support that allows the efficient implementation of carry-chains for building high-speed adders, subtracters and shift registers. Two slices form a configurable logic block (CLB) as shown in Figure 2. The CLB is the basic tile that is used to build the logic matrix. Some FPGAs, like the Xilinx Virtex families, supply on-chip block RAM. Figure 3 shows the CLB matrix that defines a Virtex FPGA. Current generation Virtex silicon provides a family of devices offering 768 to 12,288 logic slices, and from 8 to 32 variable form factor block memories.

Xilinx XC4000 and Virtex [3] devices also allow the designer to use the logic element LUTs as memory, either ROM or RAM. Constructing memory with this distributed memory approach can yield access bandwidths in the many tens of gigabytes per second range. Typical clock frequencies for current generation devices are in the 100 to 200 MHz range. In contrast to the logic slice architecture employed in Xilinx Virtex devices, the logic block architecture employed in the Atmel AT40K [4] FPGA is shown in Figure 4. Like the Xilinx device, combinational

Figure 3. Xilinx Virtex logic cell array architecture. (Legend: DLL = delay locked loop; RAM = block RAM, various form factors from 4096x1 to 256x16; BANK N = multi-standard I/O.)

Figure 2. Simplified Virtex CLB architecture. (Legend: EC = clock enable control; SP = set/preset control; BY = reset control.)

logic is realized using lookup tables. In this case, two 3-input LUTs and a single flip-flop are available in each logic cell. The pass gates in a cell form part of the signal routing network and are used for connecting signals to the multiple horizontal and vertical bus planes. In addition to the orthogonal routing resources, indicated as N, S, E and W in Figure 4, a diagonal group of interconnects (NW, NE, SE, and SW), associated with each cell x output, are available to provide efficient connections to a neighboring cell's x bus inputs.

The objective of the FPGA/DSP architect is to formulate algorithmic solutions for applications that best utilize FPGA resources to achieve the required functionality. This is a three-dimensional optimization problem in power, complexity and bandwidth. The remainder of this paper describes some novel FPGA solutions to several signal processing problems. The results are important in an industrial context because they enable either smaller, and hence more economic, solutions to important problems, or allow more arithmetic compute power to be realized with a given area of silicon.

3. Σ∆ MODULATORS FOR FIR FILTERS IN FPGAs

This section describes a method employing sigma-delta modulation (Σ∆M) techniques for implementing area efficient finite impulse response (FIR) filters using FPGA hardware. Before treating the FPGA filter design, a brief review of Σ∆ modulation encoding is presented.

Σ∆ MODULATION

Sigma-Delta modulation is a source coding technique most prominently employed in analog-to-digital and digital-to-analog converters. In this context, hybrid analog and digital circuits are used in the realization. Figure 5 shows a single loop Σ∆ modulator. Provided the input signal is busy enough, the linearized discrete time model of Figure 6 can be used to illustrate the principle. In this figure the 1-bit quantizer is modeled by an additive white noise source with variance $\delta_e^2 = \Delta^2/12$, where $\Delta$ represents the quantization interval. The z-transform of the system is

Figure 4. Atmel AT40K logic cell architecture

Figure 5. Single loop Σ∆ modulator

Figure 6. Linearized model of Σ∆ modulator

$$Y(z) = \frac{H(z)}{1 + H(z)}\,X(z) + \frac{1}{1 + H(z)}\,Q(z) \tag{1}$$

$$Y(z) = H_s(z)\,X(z) + H_n(z)\,Q(z) \tag{2}$$

In Eq. (3), $H(z)$ is the transfer function of a delay and an ideal integrator, and $H_s(z)$ and $H_n(z)$ in Eq. (2) are the signal and noise transfer functions (NTF), respectively. In a good Σ∆ modulator, $H_s(\omega)$ will have a flat frequency response in the interval $|f| \le B$. In contrast, $H_n(\omega)$ will have high attenuation in the frequency band $|f| \le B$ and a don't-care region in the interval $B < |f| < f_s/2$. For the single loop Σ∆ in Figure 6, $H_s(z) = z^{-1}$ and $H_n(z) = 1 - z^{-1}$. Thus the input signal is not distorted in any way by the network and simply experiences a pure delay from input to output. The performance of the system is determined by the noise transfer function $H_n(z)$, whose squared magnitude response is given by Eq. (4) and is shown in Figure 7.

The in-band quantization noise variance is given by Eq. (5), where $S_q(f) = \delta_q^2/f_s$ is the power spectral density of the quantization noise. Observe that for a non-shaped (white) noise spectrum, increasing the sampling rate by a factor of 2, while keeping the bandwidth $B$ fixed, reduces the quantization noise by 3 dB. For a first order Σ∆M it can be shown that the approximation of Eq. (6) holds for $f_s \gg 2B$. Under these conditions doubling the sampling frequency reduces the noise power by 9 dB, of which 3 dB is due to the reduction in $S_q(f)$ and a further 6 dB is due to the filter characteristic $H_n(f)$. The noise power is reduced by increasing the sampling rate to spread the quantization noise over a large bandwidth and then by shaping the power spectrum using an appropriate filter.
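The 9 dB per octave figure can be checked numerically. The following is a minimal sketch, not part of the original design flow: it integrates the first-order shaped noise $|1 - e^{-j2\pi f/f_s}|^2 S_q(f)$ over $|f| \le B$ and compares the result with the cubic approximation of Eq. (6). The function names, the unit noise variance and the two sample rates are illustrative assumptions.

```python
import numpy as np

def inband_noise_power(fs, B, var_q=1.0, n=200001):
    """Numerically integrate |Hn(f)|^2 * Sq(f) over [-B, B] for Hn(z) = 1 - z^-1."""
    f = np.linspace(-B, B, n)
    hn2 = 4.0 * np.sin(np.pi * f / fs) ** 2   # |1 - exp(-j*2*pi*f/fs)|^2
    y = hn2 * (var_q / fs)                    # white quantization noise PSD var_q/fs
    # trapezoidal rule on the uniform grid
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * (f[1] - f[0]))

def approx_noise_power(fs, B, var_q=1.0):
    """Closed-form approximation: var_q * (pi^2 / 3) * (2B/fs)^3, valid for fs >> 2B."""
    return var_q * (np.pi ** 2 / 3.0) * (2.0 * B / fs) ** 3

B = 1.0                                   # assumed signal bandwidth (arbitrary units)
p_64 = inband_noise_power(64.0, B)        # oversampled by 32
p_128 = inband_noise_power(128.0, B)      # sample rate doubled
drop_db = 10.0 * np.log10(p_64 / p_128)   # improvement from doubling fs
print(f"noise reduction per octave: {drop_db:.2f} dB")
```

Under these assumptions the integral and the closed form agree to well under one percent, and the improvement from doubling the sample rate is the sum of the 3 dB spreading gain and the 6 dB shaping gain described above.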

REDUCED COMPLEXITY FILTERS USING Σ∆ MODULATION TECHNIQUES

Σ∆M techniques can be employed for realizing area efficient narrowband filters in FPGAs. These filters are utilized in many applications, for example in narrow-band communication receivers, in multi-channel RF surveillance systems and for solving some spectrum management problems.

A uniform quantizer operating at the Nyquist rate is the standard solution to the problem of representing data within a specified dynamic range. Each additional bit of resolution in the quantizer provides an increase in dynamic range of approximately 6 dB. A signal with 60 dB of dynamic range requires 10 bits, while 16 bits can represent data with a dynamic range of 96 dB.

While the required dynamic range of a system fixes the number of bits required to represent the data, it also affects the expense of subsequent arithmetic operations, in particular multiplications. In any hardware implementation, and of course this includes FPGA based DSP processors, there are strong economic imperatives to minimize the number, and complexity, of the arithmetic components employed in the datapath. The proposal investigated in this section is to employ

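$$H(z) = \frac{1}{z - 1} \tag{3}$$

$$|H_n(f)|^2 = 4 \sin^2\!\left(\frac{\pi f}{f_s}\right) \tag{4}$$

Figure 7. Single loop sigma-delta modulator noise transfer function.

$$\delta_n^2 = \int_{-B}^{+B} |H_n(f)|^2\, S_q(f)\, df \tag{5}$$

$$\delta_n^2 \approx \delta_q^2\, \frac{\pi^2}{3} \left(\frac{2B}{f_s}\right)^3 \tag{6}$$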

noise-shaping techniques to reduce the precision of the input data samples so that the complexity of the multiply-accumulate (MAC) units in the filter can be minimized. Of course, the pre-processing must not compromise the integrity of the signal in the band of interest. The net result is a reduction in the amount of FPGA logic resources required to realize the specified filter.

Consider the structure shown in Figure 8. Instead of applying the quantized data x(n) from the analog-to-digital converter directly to the filter, it will be pre-processed by a Σ∆ modulator.

The re-quantized input samples x̂(n) are now represented using fewer bits per sample, so permitting the subsequent filter H(z) to employ reduced precision multipliers in the mechanization. The filter coefficients are still kept to a high precision.

The Σ∆ data re-quantizer is based on a single loop error feedback sigma-delta modulator [1] shown in Figure 9. In this configuration, the difference between the quantizer input and output sample is a measure of the quantization error, which is fed back and combined with the next input sample. The error-feedback sigma-delta modulator operates on a highly oversampled input and uses the unit delay $z^{-1}$ as a predictor. With this basic error feedback modulator only a small fraction of the bandwidth can be occupied by the required signal. In addition, the circuit only operates at baseband. A larger fraction of the Nyquist bandwidth can be made available, and the modulator can be tuned, if a more sophisticated error predictor is employed. This requires replacing the unit delay with a prediction filter P(z). This generalized modulator is shown in Figure 10.

The operation of the re-quantizer can be understood by considering the transform domain description of the circuit.
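The basic error-feedback loop with a unit-delay predictor can be sketched in a few lines. This is an illustrative floating-point model under stated assumptions, not the FPGA implementation: the function name, the [-1, 1) input range, the rounding quantizer and the test tone are all choices made here for the example.

```python
import numpy as np

def requantize_error_feedback(x, bits=4):
    """First-order error-feedback re-quantizer; the predictor is a unit delay."""
    step = 2.0 / (2 ** bits)            # quantizer step for an assumed [-1, 1) range
    y = np.empty_like(x)
    err = 0.0                           # previous quantization error e(n-1)
    for n, xn in enumerate(x):
        v = xn - err                    # quantizer input: sample minus fed-back error
        q = np.clip(np.round(v / step) * step, -1.0, 1.0 - step)
        err = q - v                     # e(n) = quantizer output minus quantizer input
        y[n] = q
    return y

# Heavily oversampled test tone (assumed: amplitude 0.5, 0.005 cycles/sample)
n = np.arange(4096)
x = 0.5 * np.sin(2 * np.pi * 0.005 * n)
y = requantize_error_feedback(x, bits=4)
```

Because y(n) = x(n) + e(n) - e(n-1), the running sum of the output error y - x telescopes to the current quantization error, so it never exceeds half a quantizer step. That boundedness is the time-domain signature of the noise being pushed away from DC.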

This is expressed in Eq. (7), where $Q(z)$ is the z-transform of the equivalent noise source added by the quantizer $q(\cdot)$, $P(z)$ is the transfer function of the error predictor filter, and $X(z)$ and $\hat{X}(z)$ are the transforms of the system input and output respectively. $P(z)$ is designed to have unity gain and leading phase shift in the bandwidth of interest. Within the design bandwidth, the term $Q(z)(1 - P(z)z^{-1}) \approx 0$ and so $\hat{X}(z) \approx X(z)$. By designing $P(z)$ to be commensurate with the system passband specifications, the in-band spectrum of the re-quantizer output will ideally be the same as the corresponding spectral region of the input signal.

To illustrate the operation of the system, consider the task of recovering a signal that occupies 10% of the available bandwidth and is centered at a normalized frequency of 0.3 Hz. The stopband

Figure 8. Reduced-complexity FIR filtering employing a Σ∆ preprocessor

Figure 10. Tunable sigma-delta modulator using a linear modulator in the feedback path. (The feedback block is the predictor filter P(z).)

$$\hat{X}(z) = X(z) + Q(z)\left(1 - P(z)\,z^{-1}\right) \tag{7}$$

requirement is to provide 60 dB of attenuation. Figure 11(a) shows the input test signal. It comprises an in-band component and two out-of-band tones that are to be rejected. Figure 11(b) is a frequency domain plot of the signal after it has been re-quantized to 4 bits of precision by a Σ∆ modulator employing an 8th order predictor in the feedback path. Notice that the 60 dB dynamic range requirement is supported in the bandwidth of interest, but that the out-of-band SNR has been compromised. This is of course acceptable, since the subsequent filtering operation will provide the necessary rejection. A 160-tap filter H(z) satisfies the problem specifications. The frequency response of H(z) using 12-bit filter coefficients is shown in Figure 11(c). Finally, H(z) is applied to the reduced sample precision data stream x̂(n) to produce the spectrum shown in Figure 11(d). Observe that the desired tone has been recovered, the two out-of-band components have been rejected, and that the in-band dynamic range meets the 60 dB requirement.

PREDICTOR FILTER DESIGN

The design of the error predictor filter is a signal estimation problem [7, 8]. The optimum predictor is designed from a statistical viewpoint. The optimization criterion is based on the minimization of the mean-squared error. As a consequence, only the second-order statistics (autocorrelation function) of a stationary process are required in the determination of the filter. The error predictor filter is designed to predict samples of a band-limited white noise process $N_{xx}(\omega)$ shown in Figure 12.

$N_{xx}(\omega)$ is defined in Eq. (8) and is related to the autocorrelation sequence $r_{xx}(n)$ by the discrete-time Fourier transform (DTFT) of Eq. (9). The autocorrelation function $r_{xx}(n)$ is found by taking the inverse DTFT of Eq. (9), which gives Eq. (10). $N_{xx}(\omega)$ is non-zero only in the interval $-\theta \le \omega \le \theta$, giving $r_{xx}(n)$ as in Eq. (11).

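Figure 11. (a) Input signal - 12b samples. (b) Shaped input - 4b samples. (c) Filter response - 12b coefficients. (d) Filtered result.

Figure 12. The Σ∆ modulator error predictor filter is designed to predict samples of a narrow-band white noise process. (The plot shows $N_{xx}(\omega)$ against frequency, non-zero between $-\theta$ and $\theta$ on the interval $-\pi$ to $\pi$.)

$$N_{xx}(\omega) = \begin{cases} 1, & -\theta \le \omega \le \theta \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

$$N_{xx}(\omega) = \sum_{n=-\infty}^{\infty} r_{xx}(n)\, e^{-j\omega n} \tag{9}$$

$$r_{xx}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} N_{xx}(\omega)\, e^{j\omega n}\, d\omega \tag{10}$$

$$r_{xx}(n) = \frac{\theta}{\pi}\, \mathrm{sinc}(\theta n) \tag{11}$$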

So the autocorrelation function corresponding to a band-limited white noise power spectrum is a sinc function. Samples of this function are used to construct an autocorrelation matrix, which is used in the solution of the normal equations to find the required coefficients. Leaving out the scaling factor in Eq. (11), the required autocorrelation function $r_{xx}(n)$, truncated to $p$ samples, is defined in Eq. (12).

The normal equations [7] are defined in Eq. (13). This system of equations can be compactly written in matrix form by first defining several matrices.

To design a $p$-tap error predictor filter, first compute a sinc function consisting of $p + 1$ samples and construct the autocorrelation matrix $R_{xx}$ as in Eq. (14). Next define a filter coefficient row-vector $A$ as

$$A = [a(0), a(1), \ldots, a(p-1)] \tag{15}$$

where $a(i)$, $i = 0, \ldots, p-1$, are the predictor filter coefficients. Let the row-vector $R'_{xx}$ be defined as in Eq. (16). The matrix equivalent of Eq. (13) is Eq. (17). The filter coefficients are therefore given by Eq. (18).

For the case in hand, the solution of Eq. (18) is an ill-conditioned problem. To arrive at a solution for $A$, a small constant $\epsilon$ is added to the elements along the diagonal of the autocorrelation matrix $R_{xx}$ in order to improve its conditioning (that is, to lower its condition number). The actual autocorrelation matrix used to solve for the predictor filter coefficients is given in Eq. (19).
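The design procedure can be sketched with NumPy as follows. This is a hedged illustration, not the authors' code: the function names, the 8-tap order, the band edge $\theta = 0.1\pi$ and the loading constant $\epsilon = 10^{-6}$ are assumptions chosen for the example. The second helper evaluates the resulting noise transfer function $1 - P(z)z^{-1}$ on the unit circle.

```python
import numpy as np

def design_lowpass_predictor(p, theta, eps=1e-6):
    """Solve the diagonally loaded normal equations for a p-tap error predictor."""
    m = np.arange(p + 1)
    # Sinc autocorrelation r(m) = sin(theta*m)/(theta*m);
    # np.sinc(t) computes sin(pi*t)/(pi*t), hence the division by pi.
    r = np.sinc(theta * m / np.pi)
    # Toeplitz autocorrelation matrix R[i, j] = r(|i - j|) plus eps on the diagonal
    lags = np.abs(m[:p, None] - m[None, :p])
    R = r[lags] + eps * np.eye(p)
    # Right-hand side is R' = [r(1), ..., r(p)]
    return np.linalg.solve(R, r[1:])

def ntf_power(a, omega):
    """|1 - P(z) z^-1|^2 on the unit circle, with P(z) = sum_i a[i] z^-i."""
    k = np.arange(1, len(a) + 1)        # tap i sits at total delay i + 1
    ntf = 1.0 - np.exp(-1j * np.outer(omega, k)) @ a
    return np.abs(ntf) ** 2

theta = 0.1 * np.pi                      # assumed one-sided design bandwidth
a = design_lowpass_predictor(8, theta)
inband = ntf_power(a, np.linspace(-theta, theta, 2001))
print("mean in-band NTF power (dB):", 10 * np.log10(inband.mean()))
```

The loading constant trades in-band noise suppression against coefficient growth: a smaller value deepens the in-band notch but makes the linear solve less well conditioned and the coefficients larger.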

BANDPASS PREDICTOR

The previous section described the design of a lowpass predictor. In this section bandpass processes are considered. A bandpass predictor filter is designed by modulating a lowpass prototype sinc function to the required center frequency $\theta_0$ [10]. The bandpass predictor coefficients $h_{BP}(n)$ are obtained by solving the normal equations with the heterodyned sinc function of Eq. (20).

HIGHPASS PREDICTOR

The highpass predictor coefficients $h_{HP}(n)$ are obtained by solving the normal equations with a sinc function heterodyned to the half sample rate:

$$\mathrm{sinc}_{HP}(n) = \mathrm{sinc}_{LP}(n)\,(-1)^{n-k}, \qquad n = 0, \ldots, 2p \tag{21}$$
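A bandpass design differs from the lowpass one only in the autocorrelation sequence that is fed to the normal equations. The sketch below assumes the sinc is indexed by lag $m = n - k$, so the heterodyne term becomes $\cos(\theta_0 m)$; the function name, the 16-tap order, the center frequency of $0.6\pi$ rad/sample (0.3 cycles/sample) and the half-bandwidth of $0.05\pi$ are illustrative assumptions.

```python
import numpy as np

def design_bandpass_predictor(p, theta, theta0, eps=1e-6):
    """Normal-equation predictor design using a sinc heterodyned to theta0."""
    m = np.arange(p + 1)
    # Lowpass sinc autocorrelation modulated by cos(theta0 * m);
    # theta0 = pi gives the highpass case, since cos(pi*m) = (-1)**m.
    r = np.sinc(theta * m / np.pi) * np.cos(theta0 * m)
    lags = np.abs(m[:p, None] - m[None, :p])
    R = r[lags] + eps * np.eye(p)       # diagonal loading for conditioning
    return np.linalg.solve(R, r[1:])

theta, theta0 = 0.05 * np.pi, 0.6 * np.pi
a = design_bandpass_predictor(16, theta, theta0)

# Evaluate |1 - P(z) z^-1|^2 across the design band centered on theta0
k = np.arange(1, len(a) + 1)
omega = np.linspace(theta0 - theta, theta0 + theta, 1001)
inband = np.abs(1.0 - np.exp(-1j * np.outer(omega, k)) @ a) ** 2
print("mean in-band NTF power (dB):", 10 * np.log10(inband.mean()))
```

Because the modulated sequence is real, the predictor coefficients are real and the noise transfer function notch appears at both $+\theta_0$ and $-\theta_0$, which is why a bandpass design generally needs more taps than a lowpass design of the same bandwidth.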


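$$r_{xx}(n) = \frac{\sin(\theta n)}{\theta n}, \qquad n = 0, \ldots, p-1 \tag{12}$$

$$r_{xx}(m) = \sum_{k=1}^{p} a(k)\, r_{xx}(m-k), \qquad m = 1, 2, \ldots, p \tag{13}$$

$$R_{xx} = \begin{bmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(p-1) \\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{xx}(p-1) & r_{xx}(p-2) & \cdots & r_{xx}(0) \end{bmatrix} \tag{14}$$

$$R'_{xx} = [r_{xx}(1), r_{xx}(2), \ldots, r_{xx}(p)] \tag{16}$$

$$R_{xx} A^{T} = (R'_{xx})^{T} \tag{17}$$

$$A^{T} = R_{xx}^{-1} (R'_{xx})^{T} \tag{18}$$

$$R_{xx} = \begin{bmatrix} r_{xx}(0)+\epsilon & r_{xx}(1) & \cdots & r_{xx}(p-1) \\ r_{xx}(1) & r_{xx}(0)+\epsilon & \cdots & r_{xx}(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{xx}(p-1) & r_{xx}(p-2) & \cdots & r_{xx}(0)+\epsilon \end{bmatrix} \tag{19}$$

$$\mathrm{sinc}_{BP}(n) = \mathrm{sinc}_{LP}(n)\, \cos(\theta_0 (n-k)), \qquad n = 0, \ldots, 2p \tag{20}$$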

Σ∆ MODULATOR FPGA IMPLEMENTATION

The most challenging aspect of implementing the data modulator is producing an efficient implementation for the prediction filter P(z). The desire to support high sample rates, and the requirement of zero latency for P(z), preclude the use of bit-serial methods for this problem. In addition, for the sake of area efficiency, parallel multipliers that exploit one time-invariant input operand (the filter coefficients) will be used, rather than general variable-variable multipliers. The constant coefficient multiplier (KCM) is based on a multi-bit inspection version of Booth's algorithm [9]. Partitioning the input variable into 4-bit nibbles is a convenient selection for the Xilinx Virtex function generators (FG) [3]. Each FG has 4 inputs and can be used for combinatorial logic or as application RAM/ROM. Each logic slice [3] in the Virtex logic fabric comprises 2 FGs, and so can accommodate a 16 x 2 memory slice. Using the rule of thumb that each bit of filter coefficient precision contributes 5 dB to the sidelobe behavior, 12-bit precision is used for P(z). 12-bit precision will also be employed for the input samples. There are three 4-bit nibbles in each input sample. Concurrently, each nibble addresses an independent 16 x 16 lookup table (LUT). The bit growth incorporated here allows for worst case filter coefficient scaling in P(z). No pipeline stages are permitted in the multipliers because of P(z)'s location in the feedback path of the modulator. It is convenient to use the transposed FIR filter for constructing the predictor. This allows the adders and delay elements in the structure to occupy a single slice. 64 slices are required to build the accumulate-delay path. The FPGA logic requirement for P(z), using a 9-tap predictor, is Γ(P(z)) = 9 x 40 + 64 = 424 CLBs. A small amount of additional logic is required to complete the entire Σ∆ modulator. The final slice count is 450. The entire modulator comfortably operates with a 113 MHz clock. This clock frequency defines the system sample rate, so the architecture can support a throughput of 113 MSamples per second. The critical path through this part of the design is related to the exclusion of pipelining in the multipliers.
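The nibble-partitioned table lookup behind a constant coefficient multiplier can be modeled in software. The following is a simplified sketch under stated assumptions, not the Xilinx KCM core: it uses one 16-entry table per 4-bit nibble of a 12-bit two's-complement sample, treats the most significant nibble as signed, and sums the shifted partial products. The function names and the coefficient values in the check are hypothetical.

```python
def build_kcm_tables(coeff, nibbles=3):
    """One 16-entry partial-product table per 4-bit nibble of the input word."""
    tables = []
    for i in range(nibbles):
        signed = (i == nibbles - 1)     # only the top nibble carries the sign
        table = []
        for addr in range(16):
            digit = addr - 16 if (signed and addr >= 8) else addr
            table.append(digit * coeff)  # precomputed product stored in the LUT
        tables.append(table)
    return tables

def kcm_multiply(x, tables):
    """Multiply a 12-bit two's-complement sample by the tabulated constant."""
    raw = x & 0xFFF                     # 12-bit two's-complement bit pattern
    acc = 0
    for i, table in enumerate(tables):
        nibble = (raw >> (4 * i)) & 0xF  # address presented to table i
        acc += table[nibble] << (4 * i)  # weight the partial product by 16**i
    return acc

tables = build_kcm_tables(777)
print(kcm_multiply(-2048, tables), kcm_multiply(1234, tables))
```

All three lookups happen concurrently in hardware, so the only arithmetic left is the addition of the shifted partial products. With 4-bit re-quantized data a single lookup suffices, which is the source of the area saving discussed in the text.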

REDUCED COMPLEXITY FIR MECHANIZATION

Now that the input signal is available as a reduced precision sample stream, filtering can be performed using area optimized hardware. For the reasons discussed above, 4-bit data samples are a convenient match for Virtex devices. Figure 13 shows the structure of the reduced complexity FIR filter. The coded samples x̂(n) are presented to the address inputs of N coefficient LUTs. In accordance with the modulated data stream precision, each LUT stores the 16 possible scaled coefficient values for one tap, as shown in Figure 14. An N-tap filter requires N such elements. The outputs of the minimized multipliers are combined with an add-delay datapath to produce the final result. The logic requirement for the filter is Γ(H(z)) = N Γ(MUL) + (N - 1) Γ(ADD_z^-1), where Γ(MUL) and Γ(ADD_z^-1) are the FPGA area cost functions for a KCM multiplier and an add-delay datapath component respectively.

Figure 13. Area optimized FPGA FIR structure. (Each tap multiplier is implemented as a table lookup constant coefficient multiplier, KCM.)

Using full-precision input samples without any Σ∆M encoding, each KCM would occupy 40 slices. The total cost of a direct implementation of H(z) is 7672 slices. The reduced precision KCMs used to process the encoded data each consume only 8 slices. Including the sigma-delta modulator, the slice count is 3002 for the Σ∆ approach. So the data re-quantization approach consumes only 39% of the logic resources of a direct implementation.
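The slice counts quoted above follow directly from the cost function. A short worked check is given below, assuming N = 160 taps, 8 slices per add-delay element (the value implied by the quoted totals), 40 or 8 slices per KCM, and 450 slices for the modulator; the function and variable names are illustrative.

```python
def fir_slices(n_taps, mul_slices, add_delay_slices=8):
    """Area cost N * cost(MUL) + (N - 1) * cost(ADD_z^-1), in slices."""
    return n_taps * mul_slices + (n_taps - 1) * add_delay_slices

direct = fir_slices(160, 40)           # full-precision KCMs, no re-quantizer
reduced = fir_slices(160, 8) + 450     # 4-bit KCMs plus the modulator
ratio = reduced / direct
print(direct, reduced, f"{100 * ratio:.0f}%")
```

The add-delay path costs the same in both designs, so the saving comes entirely from the multipliers, and it is partly offset by the fixed cost of the modulator. The advantage therefore grows with the number of filter taps.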

Σ∆ DECIMATORS

The procedure for re-quantizing the source data can also be used effectively in an m:1 decimation filter. An interesting problem is presented when high input sample rates (≥ 150 MHz) must be supported in FPGA technology. High-performance multipliers are typically realized by incorporating pipelining in the design. This naturally introduces some latency into the system. The location of the predictor filter P(z) requires a zero-latency design.¹ Instead of requantizing, filtering and decimating, which would of course require a Σ∆ modulator running at the input sample rate, this sequence of operations must be re-ordered to permit several slower modulators to be used in parallel. The process is performed by first decimating the signal, re-quantizing and then filtering. Now the Σ∆ modulators operate at the reduced output sample rate. This is depicted in Figure 15. To support arbitrary center frequencies, and any arbitrary, but integer, down-sampling factor m, the bandpass decimation filter must employ complex weights. The filter weights are of course just the bandpass modulated coefficients of a lowpass prototype filter designed to support the bandwidth of the target signal. Samples are collected from the A/D and alternated between the two modulators. Both modulators are identical and use the same predictor filter coefficients. The re-quantized samples are processed by an m:1 complex polyphase filter to produce the decimated signal. Several design options are presented once the signal has been filtered and the sample rate lowered. Figure 15 illustrates one possibility. Now that the data rate has been reduced, the low rate signal is easily shifted to baseband with a simple, and area efficient, complex heterodyne. One multiplier and a single digital frequency synthesizer could be time shared to extract one or multiple channels.

¹ It is possible that the predictor could be modified to predict samples further ahead in the time series, but this potential modification will not be dealt with in the limited space available.

Figure 14. Constant coefficient multiplier. (The 4-bit quantized data samples drive the address bus of a 16-entry table, implemented using FPGA function generators, that holds the products of the tap coefficient $a_n$ with each possible sample value.)

Figure 15. Dual half-rate Σ∆ modulators in m:1 complex decimator configuration.

It is interesting to investigate some of the changes that are required to support the Σ∆ decimator. What may not be immediately obvious is that the center frequency of the prediction filter must be designed to predict samples in the required spectral region in accordance with the output sample rate. For example, consider m = 2, and the required channel center frequency located at 0.1 Hz, normalized with respect to the input sample rate. The prediction filter must be designed with a center frequency located at 0.2 Hz. In addition, the quality of the prediction must be improved. With respect to the output sample rate, the predictors are required to operate over a wider fractional bandwidth. This implies more filter coefficients in P(z). The increase in complexity of this component must of course be balanced against the savings that result in the reduced complexity filter stage, to confirm that a net savings in logic requirements is produced. To more clearly demonstrate the approach, consider a 2:1 decimator, a channel center frequency at 0.2 Hz and a 60 dB dynamic range requirement.

Figure 16(a) shows the double sided spectrum of the input test signal. The input signal is commutated between Σ∆0 and Σ∆1 to produce the two low precision sequences x̂0(n) and x̂1(n). The respective spectra of these two signals are shown in Figures 16(b) and 16(c). The complex decimation filter response is defined in Figure 16(d). After filtering, a complex sample stream supported at the low output sample rate is produced. This spectrum is shown in Figure 16(e). Observe that the out-of-band components in the test signal have been rejected by the specified amount and that the in-band data meets the 60 dB dynamic range requirement. For comparison, the signal spectrum resulting from applying the processing stages in the order requantize, filter and decimate is shown in Figure 16(f). The interesting point to note is that while the dual Σ∆ modulator approach satisfies the system performance requirements, its out-of-band performance is not quite as good as the response depicted in Figure 16(f). The stopband performance of the dual modulator architecture has degraded by approximately 6 dB. This can be explained by noting that the shaping noise produced by each modulator is essentially statistically independent. Since there is no coupling between these two components prior to filtering, complete phase cancelation of the modulator noise cannot occur in the polyphase filter.

� ��

To provide a frame of reference for the Σ∆ decimator, consider an implementation that does not pre-process the input data, but just applies it directly to a polyphase decimation filter. A complex filter processing real-valued data consumes double the FPGA resources of a filter with real weights. For N = 160, 15344 CLBs are required. This figure is based on a cost of 40 CLBs for each KCM and 8 CLBs for an add-delay component. Now consider the logic accounting for the dual modulator approach. The area cost of this filter is given by Eq. (22),

Figure 16. (a) Input signal. (b) Shaped data x̂0(n). (c) Shaped data x̂1(n). (d) Complex filter. (e) Recovered result. (f) Filtered signal - single modulator.

$$\Gamma(\widehat{\mathrm{FIR}}) = 2\,\Gamma(\Sigma\Delta) + \Gamma(\widehat{\mathrm{MUL}}) + \Gamma(\mathrm{ACC\_}z^{-1}) \tag{22}$$

where Γ(Σ∆) represents the logic requirements for one Σ∆ modulator, and Γ(MUL) is the logic needed for a reduced precision multiplier. Using the filter specifications defined earlier, and 18-tap error prediction filters, Γ(FIR) = 2 x 738 + 2 x ((160 + 159) x 8) = 6596. Comparing the area requirements of the two options produces the ratio given in Eq. (23).

So for this example, the re-quantization approach has produced a realization that is significantly more area efficient than a standard tapped-delay line implementation.

CENTER FREQUENCY TUNING

For both the single-rate and multi-rate Σ∆ based architectures, the center frequency is defined by the coefficients in the predictor filter and the coefficients in the primary filter. The constant coefficient multipliers can be constructed using the FPGA function generators configured as RAM elements. When the system center frequency is to be changed, the system control hardware would update all of the tables to reflect the new channel requirements. If only several channel locations are anticipated, separate configuration bit streams [3] could be stored, and the FPGA(s) re-configured as needed.

BANDPASS Σ∆ USING ALLPASS NETWORKS

In an earlier section we discussed how to design a predicting filter for the feedback loop of a standard sigma-delta modulator. The predicting filter increases the order of the modulator so that the modified structure has additional degrees of freedom relative to a single-delay noise feedback loop. These extra degrees of freedom have been used in two ways: first to broaden the bandwidth of the loop's noise transfer function, and second to tune its center frequency. The tuning process entailed an off-line solution of the normal equations which, while not difficult, does present a small delay and the need for a background processor. We can define a sigma-delta loop with a completely different architecture that offers the same flexibility, namely wider bandwidth and a tunable center frequency, but does not require this background task. In this alternate architecture, a fixed set of feedback weights from a set of digital integrators defines a base-band prototype filter with a desirable NTF. The filter is tuned to arbitrary frequencies by attaching to each delay element $z^{-1}$ a simple sub-processing element that performs a base-band to band-pass transformation of the prototype filter. This processing element tunes the center frequency of its host prototype with a single real and selectable scalar.

The structure of a fourth order prototype sigma-delta loop is shown in Figure 17. The time series and spectrum obtained by using the loop with a 4-bit quantizer are shown in Figure 18. In this structure the digital integrator poles are located on the unit circle at DC. The local feedback (a1 and a2) separates the poles by sliding them along the unit circle, and the global feedback (b1, b2, b3 and b4) places these poles in the feedback path of the quantizer so they become noise transfer function zeros.

$$\lambda = \frac{\Gamma(\widehat{FIR})}{\Gamma(FIR)} = \frac{6596}{15344} \approx 43\% \qquad (23)$$

Figure 18. Input and output time series of base-band prototype 4-th order, 4-bit sigma-delta loop (top) and output spectrum (bottom).


These zeros are positioned to form an equal-ripple stop band for the NTF. The coefficients are selected to match the NTF pole-zero locations to an elliptic high pass filter. The single sided bandwidth of this fourth order loop is approximately 4% of the input sample rate.

The low-pass to band-pass transformation for a sampled data filter is achieved by substituting an all-pass transfer function G(z) for the all-pass transfer function $z^{-1}$. This transformation is shown in Eq. (24).

A block diagram of a digital filter with the transfer function G(z) is shown in Figure 19. Examining the left hand block diagram, we find the transfer function from x(n) to y(n) is the all-pass network -(1 - cz)/(z - c) while the transfer function from x(n) to v(n) is -(1/z)(1 - cz)/(z - c). If we absorb the external negative sign change in the internal adders of the filter we obtain the simple right-hand side version of the desired transfer function G(z).

After the block diagram substitution has been made, we obtain Figure 20, the tunable version of the low-pass prototype. The basic structure of the prototype remains the same when we replace the delay with the tunable all-pass network. The order of the filter is doubled by the substitution since each delay is replaced by a second order sub-filter. Tuning is trivially accomplished by changing the c multiplier of the all-pass network. The tuned version of the system reverts back to the prototype response if we set c to 1.

Figure 21 presents the time series and spectrum obtained by using the tunable loop with a 4-bit quantizer shown in Figure 20. The single sided bandwidth of the prototype filter is distributed to the positive and negative spectral bands of the tuned filter.
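The properties claimed for the substitution can be checked numerically. The Python sketch below evaluates G(z) = -(1/z)(1 - cz)/(z - c) on the unit circle and confirms that it has unit magnitude, that it equals 1 (the value $z^{-1}$ takes at DC) at the tuned frequency when c is the cosine of that frequency, and that it collapses to $z^{-1}$ when c = 1. The function name and the example center frequency are assumptions made for illustration.

```python
import cmath
import math


def allpass_g(z, c):
    """G(z) = -(1/z) * (1 - c*z) / (z - c), the replacement for z^-1."""
    return -(1.0 / z) * (1.0 - c * z) / (z - c)


theta0 = 2 * math.pi * 0.1      # example center frequency in radians/sample (assumed)
c = math.cos(theta0)            # tuning constant

# All-pass behavior: unit magnitude at every tested point on the unit circle.
magnitudes = [abs(allpass_g(cmath.exp(1j * math.pi * k / 50), c))
              for k in range(1, 50)]

# DC of the prototype (where z^-1 = 1) moves to theta0.
g_center = allpass_g(cmath.exp(1j * theta0), c)

# With c = 1 the network reduces to a plain delay.
z_test = cmath.exp(1j * 0.3)
g_untuned = allpass_g(z_test, 1.0)

print(max(abs(m - 1.0) for m in magnitudes) < 1e-9)
print(abs(g_center - 1.0) < 1e-9, abs(g_untuned - 1.0 / z_test) < 1e-9)
```

Because c is the only parameter, retuning the loop amounts to changing one real multiplier per all-pass section.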

Thus the two-sided bandwidth of each

Figure 17. Fourth order sigma-delta loop.

Figure 19. Block diagram of all-pass transfer function G(z) from Eq. (24).

$$z^{-1} \rightarrow -z^{-1}\,\frac{1 - c z}{z - c} \qquad (24)$$


Figure 21. Input and output time series of tuned base-band prototype 4-th order, 4-bit sigma-delta loop (top), filter and modulator output spectrum (middle), and filtered output spectrum (bottom).


spectral band is approximately 4% of the input sample rate.

We now estimate the computational workload required to operate the prototype and tunable filter. The prototype filter has six coefficients to form the 4 poles and the 4 zeros of the transfer function. The two coefficients a_k, k = 1, 2, determine the four zero locations. These are small coefficients and can be set to simple binary scalers. The values computed for this filter for a1 and a2 were 0.0594 and 0.0110. These can be approximated by 1/16 and 1/128, which leads to no significant shift of the spectral zeros in the NTF. These simple multiplications are of course virtually free in the FPGA hardware since they are implemented with suitable wiring. The four coefficients b_k, k = 1, ..., 4, are 1.000, 0.6311, 0.1916, and 0.0283 respectively; these were replaced with coefficients containing one or two binary symbols to obtain the values 1.000, 1/2 + 1/8 (0.625), 1/8 + 1/16 (0.1875) and 1/32 (0.03125). When the sigma-delta loop ran with these coefficients there was no discernible change in bandwidth or attenuation level of the loop. The loop operates equally well in the tuning mode and the non-tuning mode with the approximate coefficients listed above. Thus the only real multiplies in the tunable sigma-delta loop are the c coefficients of the all-pass networks. These networks are unconditionally stable and always exhibit all-pass behavior even in the presence of finite arithmetic and finite coefficients. This is because the same coefficient forms the numerator and the denominator. Errors in approximating the coefficients for c simply result in a frequency shift of the filter's tuned center.

The c coefficient is determined from the cosine of the center frequency (in radians/sample). The curve for this relationship is shown in Figure 22. Also shown is an error due to approximating c by c + δc. The question is, what is the change in center frequency, from θ to θ + δθ, due to the approximation of c? We can see that the slope at the operating point on the cosine curve is -sin(θ), so that δc/δθ ≈ -sin(θ), and |δc| ≈ δθ sin(θ) is the required precision to maintain a specified error. We note that tuning sensitivity is most severe for small frequencies where sin(θ) is near zero. The tolerance term, δθ sin(θ), is quadratic for small frequencies, but the lowest frequency that can be tuned by the loop is half the NTF pass-band bandwidth. For the fourth order system described here, this bandwidth is 4% of the sample rate, so the half bandwidth angle is 2% or 0.126 radians. To assure that the frequency to which the loop is tuned has an error smaller than 1% of center frequency, we require δc < δθ sin(θ), so δc < (0.126/100)(0.126) ≈ 0.0002, which corresponds to a 14 bit coefficient. An error of less than 10% of center frequency can be achieved with 10 bit coefficients.
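The sensitivity estimate above is easy to reproduce. The Python sketch below computes the allowed coefficient error from |δc| ≈ δθ sin(θ), and then measures the actual tuning error after rounding c = cos(θ) to a given number of fractional bits. The function names and the `frac_bits` parameter are hypothetical; in particular, `frac_bits` counts fractional bits only, so it need not coincide with the total coefficient word lengths quoted in the text.

```python
import math


def coefficient_tolerance(theta, rel_freq_error):
    """Allowed |delta c| from the first-order relation delta_c ~ delta_theta * sin(theta)."""
    return rel_freq_error * theta * math.sin(theta)


def tuned_frequency_error(theta, frac_bits):
    """Relative center-frequency error after rounding c to frac_bits fractional bits."""
    scale = 1 << frac_bits
    c_q = min(round(math.cos(theta) * scale) / scale, 1.0)   # keep acos in range
    return abs(math.acos(c_q) - theta) / theta


theta_min = 2 * math.pi * 0.02     # half of the 4% bandwidth, about 0.126 rad
tol_1pct = coefficient_tolerance(theta_min, 0.01)
err_13 = tuned_frequency_error(theta_min, 13)
err_9 = tuned_frequency_error(theta_min, 9)
print(tol_1pct, err_13, err_9)
```

The tolerance for a 1% frequency error comes out near 1.6e-4, which the text rounds to 0.0002.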

Figure 20. Tunable sigma-delta loop. The prototype is fourth order; the tunable version is eighth order.

Figure 22. Relationship between errors in constant c in all-pass networks to errors in tuning frequency.


The tuning multipliers could be implemented as full multipliers in the FPGA hardware or as dynamically reconfigured KCMs (KDCMs), as shown in Figure 23. The latter approach conserves FPGA resources at the expense of introducing a start-up penalty each time the center frequency is changed. The start-up period is the initialization time of the KCM LUT. When a new center frequency is desired, the tuning constant is presented to the k input of the KDCM and the load signal LD is asserted. This starts the initialization engine, which requires 16 clock cycles to initialize the 16 locations in the multiplier LUT. The initialization engine relies on the automatic shift mode [3] of the Virtex LUTs. In this mode of operation a LUT's register contents are passed from one cell to the next cell on each clock tick. This avoids the requirement for a separate address generator and multiplexer in the initialization hardware. Observe from Figure 23 that the initialization engine only introduces a small amount of additional hardware over that of a static KCM. There is approximately a factor of 4 difference in the area of a KDCM and a full multiplier.
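The behavior of a loadable constant coefficient multiplier can be modeled in software. The Python sketch below is a behavioral model only, under stated assumptions: the table is filled by an accumulator that adds k once per clock for 16 clocks, the operand is an unsigned 8-bit value processed as two 4-bit nibbles, and the serial shift-register loading order of the hardware LUT is abstracted away. The function names are illustrative and do not come from the original design.

```python
def build_kcm_table(k, entries=16):
    """Fill the LUT with 0, k, 2k, ..., one entry per clock via an accumulator."""
    table, acc = [], 0
    for _ in range(entries):      # 16 cycles for a 16-entry LUT
        table.append(acc)
        acc += k
    return table


def kcm_multiply(table, a):
    """Multiply an unsigned 8-bit operand by the stored constant, one nibble at a time."""
    low = table[a & 0xF]              # partial product for the low nibble
    high = table[(a >> 4) & 0xF]      # partial product for the high nibble
    return low + (high << 4)          # weight the high nibble by 16 and add


table = build_kcm_table(37)           # "load" a new tuning constant
print(kcm_multiply(table, 200) == 37 * 200)
```

Changing the constant only requires rebuilding the 16-entry table, which corresponds to the 16-cycle start-up penalty described above.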

Σ∆ DC CANCELER

Unwanted DC components can be introduced into a DSP datapath at several places. They may be presented to the system via an un-trimmed offset in the analog-to-digital conversion pre-processing circuit, or may be attributed to bias in the A/D converter itself. Even if the sampled input signal has a zero mean, DC content can be introduced through arithmetic truncation processes in the fixed-point datapath. For example, in a multi-stage multi-rate filter, the intermediate filter output samples may be quantized between stages in order to compensate for the filter processing gain and thereby keep the word-length requirements manageable. The introduced DC bias can impact the dynamic range performance of a system and potentially increase the error rate in a digital receiver application.

In a fixed-point datapath, the bias can cause unnecessary saturation events that would not occur if the DC were not present in the system. In a digital communication receiver employing M-ary QAM modulation, the DC bias can interfere with the symbol decision process, causing incorrect decoding and therefore increasing the bit error rate.

In some cases the introduced bias can be ignored and is of no concern. However, for other applications it is desirable to remove the DC component. One solution to removing the unwanted DC level is to employ a DC canceler.

A simple canceler is shown in Figure 24. It is easy to show that the transfer function of the network is given by Eq. (25).

The cancelation is due to the transfer function zero at 0 Hz. The pole at 1 - µ controls the system

Figure 23. Loadable constant coefficient multiplier (KDCM).


Figure 24. Simple DC canceler.

$$H(z) = \frac{z - 1}{z - (1 - \mu)} \qquad (25)$$


bandwidth and hence the system transient response. The location of the zero at z = 1, of course, completely removes the DC component in the signal, but there are some problems with a practical implementation of this circuit.

Figure 25a is a spectral domain representation of a biased signal presented to the DC canceler. Figure 25b is the processed signal spectrum at yq(n) in Figure 24. We observe that the DC content in the input signal has been completely removed. However, in the process of running the canceling loop the network processing gain has caused a dynamic range expansion. So although the sample stream yq(n) is a zero mean process, it requires a larger number of bits to represent each sample than is desirable. The only option with this circuit is to re-quantize yq(n) to produce y(n) using the quantizer Q(·). The effect of this operation is shown in Figure 25c, which demonstrates, not surprisingly, that after an 8-bit quantizer the signal again has a DC component and we are almost back to where we started.

How can the canceler be re-organized to avoid this implementation pitfall? One option is to embed the re-quantizer in the feedback loop in the form of a Σ∆ modulator as shown in Figure 26. The modulator can be a very simple 1st order loop such as the error feedback Σ∆ modulator shown in Figure 9. Figure 25d demonstrates the operation of the circuit for 8-bit output data. Observe from the figure that the DC has been removed from the signal while employing the same 8-bit output sample precision that was used in Figure 24. The simple Σ∆M employed in the canceler is easily implemented in an FPGA.
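The difference between quantizing after the loop and quantizing inside the loop can be illustrated with a short simulation. The Python sketch below is one interpretation of Figures 24 and 26, not the authors' implementation: the quantizer is modeled as truncation to integer LSBs with `math.floor`, the Σ∆ modulator is a first-order error-feedback loop, and the test signal, the value of µ, and all function names are assumptions chosen for illustration.

```python
import math


def simple_canceler(x, mu):
    """Fig. 24 style: full-precision loop, output truncated afterwards."""
    dc_est, out = 0.0, []
    for s in x:
        yq = s - dc_est              # zero-mean, full-precision sample
        dc_est += mu * yq            # integrator tracks the DC level
        out.append(math.floor(yq))   # Q(.): truncation re-introduces a bias
    return out


def sigma_delta_canceler(x, mu):
    """Fig. 26 style: re-quantizer inside the loop (1st-order error feedback)."""
    dc_est, err, out = 0.0, 0.0, []
    for s in x:
        u = (s - dc_est) + err       # add back the previous quantization error
        y = math.floor(u)            # truncated output sample
        err = u - y                  # error saved for the next sample
        dc_est += mu * y             # integrator sees the quantized output
        out.append(y)
    return out


def mean(v):
    return sum(v) / len(v)


# Test signal (assumed): DC offset of 20 LSBs plus a sinusoid.
x = [20.0 + 50.3 * math.sin(2 * math.pi * 0.05 * n) for n in range(40000)]
print(mean(simple_canceler(x, 1 / 64)[8000:]))        # close to -0.5 LSB
print(mean(sigma_delta_canceler(x, 1 / 64)[8000:]))   # close to 0
```

In this model the post-loop truncation leaves a residual bias of about half an LSB, while the in-loop version drives the mean of the truncated output toward zero because the integrator accumulates the quantized samples themselves.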

SIMPLIFY DIGITAL RECEIVER CONTROL LOOPS USING Σ∆ MODULATORS

In earlier sections we recognized that when a sampled data input signal has a bandwidth that is a small fraction of its sample rate, the sample components from this restricted bandwidth are highly correlated. We took advantage of that correlation to use a digital sigma-delta modulator to requantize the signal to a reduced number of bits. The sigma-delta modulator encodes the input signal with a reduced number of bits while preserving full input precision over the signal bandwidth by placing the increased noise due to requantization in out-of-band spectral positions that are

Figure 25. (a) Input signal. (b) DC canceler. (c) Quantized DC canceler. (d) Σ∆ DC canceler.

Figure 26. Sigma-delta based DC canceler.


Figure 27. Block diagram of digital receiver showing feedback paths containing digital signals converted to analog control signals for analog components.


already scheduled to be rejected by subsequent DSP processing. The purpose of this requantization is to allow the subsequent DSP processing to be performed with reduced arithmetic resource requirements since the desired data is now represented by a smaller number of bits.

A similar remodulation of data samples can be employed for signals generated within a DSP process when the bandwidth of the signals is small compared to the sample rate of the process. A common example of this circumstance is the generation of control signals used in feedback paths of a digital receiver. These control signals include a gain control signal for a voltage controlled amplifier in an automatic gain control (AGC) loop and VCO (voltage controlled oscillator) control signals in carrier recovery and timing recovery loops [10, 11, 12]. A block diagram of a receiver with these specific control signals is shown in Figure 27. The control signals are generated from processes operating at a sample rate appropriate to the input signal bandwidth. The bandwidth of control loops in a receiver is usually a very small fraction of the signal bandwidth, which means that the control signals are very heavily oversampled. As a typical example, in a cable TV modem, the input bandwidth is 6 MHz, the processing sample rate is 20 MHz, and the loop bandwidth may be 50 kHz. For this example, the ratio of sample rate to bandwidth is 400-to-1. As seen in Figure 27, the process of delivering these oversampled control signals to their respective control points entails the transfer of 16-bit words to external control registers, requiring appropriate busses, addressing, and enable lines as well as the operation of 16-bit digital-to-analog converters (DACs).

We can use a sigma-delta modulator to requantize the 16-bit oversampled control signals in the digital receiver prior to passing them out of the processing chip. The sigma-delta modulator can preserve the required dynamic range over the signal's restricted bandwidth with a one-bit output. As suggested in Figure 28, the transfer of a single bit to control the analog components is a significantly less difficult task than the original. We no longer require registers to accept the transfer, the busses to deliver the bits, or the DAC to convert the digital data to the analog levels the data represents. All that is needed is a simple filter (and likely an analog amplifier to satisfy drive level and offset requirements). Experience shows that a 1-bit, one-loop sigma-delta modulator could achieve 80 dB dynamic range and requires a single RC filter to reconstruct the analog signal. A two-loop sigma-delta modulator is required to achieve 16-bit precision, for which a double RC filter is required to reconstruct the analog output signal. Figure 29 shows the time response of the one-bit two-loop sigma-delta converter to a slowly varying control signal and the reconstructed signal obtained from the dual-RC filter. Figure 30 shows the spectrum obtained from a 1-bit two-loop modulator and the spectrum obtained from an unbuffered RC-RC filter.

This example has shown how, with minimal additional hardware, an FPGA can generate analog control signals to control low-bandwidth analog functions in a system.

An observation worthy of note is that the audio engineering community has recognized the advantage offered by this option of requantizing a 16-bit oversampled data stream to a 1-bit data stream. In that community, the output signal is intentionally upsampled by a factor of 64 and then requantized to 1-bit in a process called a MASH converter. Nearly all CD players use the


Figure 28. Block diagram of digital receiver showing feedback paths delivering and processing one-bit analog control signals generated by sigma-delta modulators.


MASH converter to deliver analog audio signals.
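The one-bit control path described in this section can be sketched in a few lines. The Python model below is a simplified illustration, not the two-loop design used for the figures: it uses a first-order (one-loop) modulator, represents the RC filter as a single-pole discrete-time low-pass, and borrows the 20 MHz sample rate and 50 kHz bandwidth from the cable modem example as the filter corner. The function names and the constant test level are assumptions.

```python
import math


def one_bit_modulator(samples):
    """First-order, 1-bit sigma-delta: output is +1 or -1 per sample."""
    integ, bits = 0.0, []
    for s in samples:
        bit = 1.0 if integ >= 0.0 else -1.0
        integ += s - bit          # integrate the difference between input and output
        bits.append(bit)
    return bits


def rc_filter(bits, fs_hz, corner_hz):
    """Discrete-time model of a single RC low-pass driven by the bit stream."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * corner_hz / fs_hz)
    v, out = 0.0, []
    for b in bits:
        v += alpha * (b - v)      # first-order smoothing of the bit stream
        out.append(v)
    return out


control = [0.3] * 4000            # constant control level in [-1, 1] (assumed)
bits = one_bit_modulator(control)
analog = rc_filter(bits, fs_hz=20e6, corner_hz=50e3)
print(sum(bits) / len(bits), analog[-1])
```

The average of the bit stream tracks the control level, and the filtered output settles near the same value with a small ripple that depends on the filter corner.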

WHAT HAVE WE GAINED?

What has been achieved by expressing our signal processing problems in terms of Σ∆M techniques? The paper has demonstrated some Σ∆M techniques for the compact implementation of certain types of filter and control applications using FPGAs. This optimization can be used in several ways to bring economic benefits to a commercial design. By exploiting Σ∆M filter processes, a given processing load may be realizable in a lower-density, and hence less expensive, FPGA than is possible without access to these techniques. An alternative would be to perform more processing using the same hardware, for example, processing multiple channels in a communication system.

In addition to FPGA area trade-offs, the Σ∆M methods can result in reduced power consumption in a design. Power P may be expressed as

$$P = c\,v^{2} f_{clk} \qquad (26)$$

where c is capacitance, v is voltage and f_clk is the system clock frequency. By reducing the silicon area requirements of a filter, we can simultaneously reduce the power consumption of the design. For the examples considered earlier, logic resource savings of greater than 50% were demonstrated. The savings are proportional to increased efficiency in the system power budget, and this of course is very important for mobile applications.

The Σ∆M AGC, timing and carrier recovery control loop designs are also important examples in an industrial context. The examples illustrated how the component count in a mixed analog/digital system can be reduced. In fact, not only is the component count reduced, but printed circuit board area is minimized. This results in more reliable and physically smaller implementations. The reduced component count also results in reduced power consumption. In addition, since the control loops no longer require wide output buses from the FPGA to multi-bit DACs that generate analog control voltages, power consumption is decreased because fewer FPGA I/O pads are being driven.

Figure 29. Control loop signal and Σ∆ encoded control signal (top), filtered Σ∆ sequence (bottom).

Figure 30. Σ∆ encoded control signal spectrum (top), filtered Σ∆ sequence spectrum (middle), exploded view at baseband of filtered Σ∆ control signal (bottom).

CONCLUSION

FPGAs open a range of opportunities in the solution space that can result in high-performance and economic solutions to a DSP problem. Often this is best achieved via an appropriate paradigm shift in the algorithmic domain to find an optimal approach that best exploits the cellular LUT/flip-flop FPGA architecture. This paper has illustrated how Σ∆M techniques can be combined with FPGA technology to address a range of signal processing problems. These included single and multi-rate filters, a DC canceler and the efficient and compact generation of analog control signals for AGC, carrier recovery and timing recovery functions in a communication receiver. The source data re-quantization approach is suitable for both single-rate and multi-rate filter processes. The proposed method arms the DSP/FPGA engineer with another tool that is useful for certain filtering requirements. For the examples considered here, logic savings in excess of 50% were demonstrated. As the frequency band of interest occupies a smaller fractional bandwidth, the order of the required filter increases. This growth tends to make the data re-quantization approach more attractive, as the cost of the modulator consumes a decreasing proportion of the entire design.

While the paper has exclusively focused on Σ∆M methods in the context of FPGA hardware, we feel that there is a broader lesson delivered in the study. The signal processing literature is full of creative solutions to real world problems. Often these solutions are unavailable to a designer because they do not map well to software programmable DSP architectures. The algorithm will of course have an ASSP solution, but this may not be an option for reasons of schedule, economies of scale, and flexibility.
FPGAs do, however, give immediate access to the diverse range of potential solutions, and they do so while simultaneously providing flexibility and high performance; frequently the performance equals or exceeds that of an ASSP.

In this era of systems-on-a-chip, increased fiscal pressure, tighter engineering deadlines, time-to-market constraints, and increasing performance demands, new system level and hardware architectures must be employed. FPGAs provide a mechanism for working with and performing trade-offs between all of these important variables. As we move into the next millennium, reconfigurable FPGA technology will increasingly provide solutions to signal processing problems.

REFERENCES

[1] J. C. Candy and G. C. Temes, "Oversampling Methods for A/D and D/A Conversion", in Oversampling Delta-Sigma Data Converters - Theory, Design and Simulation, IEEE Press, New York, 1992.
[2] S. R. Norsworthy, R. Schreier and G. C. Temes, "Delta-Sigma Data Converters", IEEE Press, Piscataway, NJ, 1997.
[3] Xilinx Inc., The Programmable Logic Data Book, 1999.
[4] Atmel, AT40K/05/10/20/40 Data Sheet, 1999.
[5] A. Peled and B. Liu, "A New Hardware Realization of Digital Filters", IEEE Trans. on Acoust., Speech, Signal Processing, vol. 22, pp. 456-462, Dec. 1974.
[6] S. A. White, "Applications of Distributed Arithmetic to Digital Signal Processing", IEEE ASSP Magazine, vol. 6(3), pp. 4-19, July 1989.
[7] J. G. Proakis and D. G. Manolakis, "Digital Signal Processing Principles, Algorithms and Applications", Second Edition, Maxwell Macmillan International, New York, 1992.
[8] S. Haykin, "Modern Filters", Macmillan Publishing Co., New York, 1990.
[9] D. A. Patterson and J. L. Hennessy, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers Inc., California, 1990.
[10] M. E. Frerking, "Digital Signal Processing in Communication Systems", Van Nostrand Reinhold, New York, 1994.
[11] H. Meyr, M. Moeneclaey and S. A. Fechtel, "Digital Communication Receivers", John Wiley & Sons Inc., New York, 1998.
[12] J. A. C. Bingham, "The Theory and Practice of Modem Design", J. Wiley & Sons, New York, 1998.
