ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ (ttheocharides@ucy.ac.cy) · 2003 IEEE NSREC Short Course 2003...


Page 1: ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ ( ttheocharides@ucy.ac.cy) · 2003 IEEE NSREC Short Course 2003 350nm - XQ4000XL − 60K Rads (Si) 220nm - XQVR (Virtex) − 100K Rads (Si) 150nm -

ΗΜΥ 664: Digital Design with FPGAs

Winter Semester 2010

Lecture 7: FPGAs: Research Issues

ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ (ttheocharides@ucy.ac.cy)

Ack:
• Rick Padovani, Peter Alfke
• Xilinx Corporation

Page 2:

Are Von Neumann processors running out of steam?

[Figure: Compute Density of Processors. Y-axis: MOPS/MHz/Million Transistors, 0.00 to 0.45. Processors plotted: Pentium MMX (P55C) 1997, Celeron (Mendocino) 1998, Pentium III EB 1999, Pentium III-S 2001, Pentium 4 (Willamette) 2001, Pentium 4 (Northwood) 2002. Source: UC Berkeley HERC and CPUscorecard.com]

[Figure: Clock Speed. Y-axis: GHz, ticks at 0.1, 1.0, 10.0. X-axis: 1997 to 2005]

• Lack of increased clock speed is being addressed by:
– Increased cache size
– Longer pipelines
– Trying to do more per cycle
• This approach is also nearing its limit

Page 3:

What’s next for Computing Platforms?

• Hyperthreading?
• CMP?
• Clusters?
• Configurable instruction sets?
• Configurable coprocessors?

In general, the need for parallel execution is now recognized as a requirement, as is the desire for customizable instruction sets

Page 4:

Reconfigurable FPGAs to the rescue

• For at least 15 years people have seen the Von Neumann limitations and have argued that FPGAs were the ultimate supercomputer
– Better programmability – not stuck with a fixed ALU
– Parallel processing – not just hyperthreading but limitless opportunities for parallelism
– No wasted cost on features that you don’t need
• Some traction over the years, but very limited
– Numerous chess-playing machines from Deep Thought to Hydra
– Craig Venter used Xilinx chips for the Human Genome project
– Other people are using Xilinx chips for Bioinformatics
– Cray, SGI and others have been using FPGAs as coprocessors to offload certain operations
– Berkeley Emulation Engine is a recent example
– Numerous companies represented in the consortium have been extolling the virtues of FPGA computing for a long time

Page 5:

Raw Processing Performance Characteristics and Comparisons

Three axes of performance
• Computational capability
• Memory Bandwidth
• IO Bandwidth

[Figure: Pentium vs. Virtex-4 compared on three axes: Computation (GOPS), Memory Bandwidth (GB/sec), IO Bandwidth (Gbps); scale 0 to 250]

Page 6:

Why has adoption taken so long?

• Traction has been limited by programming model
– Direct C translation to gates
  • Definite progress in development and productization
  • Limited customer acceptance in the supercomputer market, but picture may be changing
– Direct HDL design
  • Difficult to implement current applications of supercomputing in HDL
  • Need for high connectivity lowers performance

To date, the only model in widespread use for supercomputing-type applications is HDL

Page 7:

Process Technology Advances

• Advanced 90-nm process
• 11-layer metallization: 10 copper + 1 aluminum
• New Triple-Oxide structures: lower quiescent power consumption
• Benefits: best cost, highest performance, lowest power, highest density
• Over 1 million 90 nm FPGAs shipped

[Figure: transistor cross-section labeled Channel, Gate, Source, Drain, Source Metal Connection, Drain Metal Connection]

Page 8:

Moore’s Law Continues Fueling Reprogrammable FPGA Advances

[Figure: process node roadmap, 1999 to 2017 in two-year steps. Nodes shown: 180 nm, 150 nm, 130 nm, 90 nm, 65 nm, 45 nm, 32 nm, 22 nm. Legend: Mature FPGA Product Technology, Developing FPGA Product Technology, Future Process Technology]

• Plan continuation of 2-year technology node cycle

• “Traditional Scaling” is starting to be affected by the fundamental material limits of the planar CMOS process

• “Equivalent Scaling”, or the assimilation of new materials, structures and functional integration, will drive continued scaling

Page 9:

Architectural Evolution: Reconfigurable FPGAs

[Figure: Device Complexity and Performance over time; years marked: 1985, 1992, 2000, 2002, 2004, 2005]

Glue Logic
• FPGA Fabric

Block Logic
• FPGA Fabric
• Block RAM

Platform Logic
• FPGA Fabric
• Block RAM
• Embedded Registers and Multipliers
• Clock Management
• Multi-standard Programmable IO

System Logic
• FPGA Fabric
• Block RAM
• Embedded Registers and Multipliers
• Clock Management
• Multi-standard Programmable IO
• Embedded Microprocessor
• Multigigabit Transceivers

Domain-optimized System Logic
• FPGA Fabric
• Block RAM
• Embedded Registers and Multipliers
• Clock Management
• Multi-standard Programmable IO
• Embedded Microprocessor
• Multigigabit Transceivers
• Embedded DSP-optimized Multipliers
• Embedded Ethernet MACs

2005: Programmable “System in a Package”

Page 10:

Moore Meets Einstein

[Figure: Clock Frequency in MHz and Trace Length in cm per 1/4 clock period, plotted from ’65 to ’10; vertical scale 1 to 2048 in powers of two]

Speed Doubles Every 5 Years... but the speed of light never changes
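
The trade-off on this slide can be checked with a little arithmetic. The sketch below is only an illustration: it assumes signals propagate along a PC-board trace at about half the speed of light (a common rule of thumb, not a figure from the slide), and the function name is hypothetical.

```python
# Distance a signal covers in one quarter of a clock period.
SPEED_OF_LIGHT_CM_PER_S = 2.998e10  # vacuum speed of light, cm/s


def quarter_period_trace_cm(clock_mhz, velocity_factor=0.5):
    """Return the trace length (cm) traversed in 1/4 clock period.

    velocity_factor is an ASSUMED fraction of c for a PCB trace.
    """
    period_s = 1.0 / (clock_mhz * 1e6)
    return SPEED_OF_LIGHT_CM_PER_S * velocity_factor * period_s / 4.0


if __name__ == "__main__":
    # Same power-of-two frequency steps as the chart's vertical scale.
    for mhz in (1, 16, 128, 1024, 2048):
        print(f"{mhz:5d} MHz -> {quarter_period_trace_cm(mhz):10.2f} cm")
```

Each doubling of the clock frequency halves the usable trace length, which is the point the slide makes: the clock keeps speeding up while the propagation speed stays fixed.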

Page 11:

A Bird’s Eye View...

Lower Cost
• Moore’s Law is alive
– Smaller geometries and larger wafers and lower defect density (= higher yield) continue to achieve lower cost per function
– LUT + flip-flop: $1.00 in 1990, $0.002 in 2003
• State-of-the-art: 90 nm on 300 mm wafers
– Spartan-3 uses this technology for lowest cost

Rapid price reductions, intense competition
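
To put the LUT + flip-flop price points in perspective, the following sketch derives the overall reduction and the constant yearly multiplier that would connect them. It assumes a smooth exponential decline between 1990 and 2003, which the slide does not claim; the function name is illustrative.

```python
def annual_price_factor(start_price, end_price, years):
    """Constant yearly multiplier that takes start_price to end_price."""
    return (end_price / start_price) ** (1.0 / years)


# Price points quoted on the slide (USD per LUT + flip-flop).
start, end = 1.00, 0.002
years = 2003 - 1990

total_reduction = start / end
factor = annual_price_factor(start, end, years)

print(f"total reduction: {total_reduction:.0f}x over {years} years")
print(f"implied yearly multiplier: {factor:.3f}")
```

Under that assumption the price falls by a bit more than a third every year.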

Page 12:

A Bird’s Eye View…

More Logic and Better Features:
• >100,000 LUTs & flip-flops
• >200 BlockRAMs, and same number of 18 x 18 multipliers
• 1156 pins (balls) with >800 GP I/O
• 50 I/O standards, incl. LVDS with internal termination
• 16 low-skew global clock lines
• Multiple clock management circuits
• On-chip microprocessor(s) and Gbps transceivers

Gate count is really a meaningless metric

Page 13:

A Bird’s Eye View…

Higher Speed
• Smaller and faster transistors
– 90 nm technology, using 193 nm ultra-violet light
– Cu interconnect (instead of Al) was easily achieved
– Low-K dielectric progress is disappointing
• System speed: up to 500 MHz, mainly through smart interconnects, clock management, dedicated circuits, flexible I/O
• Integrated transceivers running at 10 Gigabits/sec

Speeding up general-purpose logic is getting difficult

Page 14:

A Bird’s Eye View…

Better tools
• Back-End Place&Route and XST synthesis
– VHDL and Verilog becoming entry point
– IP/Cores speed up design and verification
• Embedded Software Development Tools support architectures and merge HW and SW
• Domain-Specific Languages: System Generator bridges the gap between Matlab/Simulink and FPGA circuit description

ASIC-size FPGAs need ASIC-like tools

ASIC-like size requires ASIC-quality tools

Page 15:

ASICs Are Losing Ground

Mask set >$1M + design + verification + risk

ASICs are only for extreme designs: extreme volume, speed, size, low power

Source: IBM

Page 16:

FPGAs in 2003

1000 to 80,000 LUTs and flip-flops, millions of bits in dual-ported RAMs

Low-skew Global Clocks, Frequency synthesis, 50 ps phase control

18 Kbit BlockRAMs and 18 x 18 multipliers

FPGAs are not glue-logic anymore

Page 17:

Virtex-4 in September 2004 / Virtex-5 Even Better

• ASMBL™ Column-Based Architecture
• 500 MHz SmartRAM™ BRAM/FIFO
• 0.6 - 11.1 Gbps RocketIO™
• SelectIO with ChipSync™ Technology:
– 1 Gbps LVDS
– 600 Mbps SE
• 500 MHz Xtreme DSP™ Slice
• 500 MHz Xesium™ Clocking
• Integrated System Monitor
• Integrated Tri-Mode Ethernet MAC Cores
• Integrated 450 MHz PowerPC Cores
• 4th Generation Advanced Logic

Page 18:

1 Hz to 640 MHz Pulse Generator

• Direct Digital Synthesis in smallest Spartan-3 chip
• PicoBlaze for arithmetic and user interface
• Special DCM frequency synthesis for <350 ps jitter
• External PLL for jitter reduction to 100 picoseconds
• Max 640 MHz in 1 Hz steps, 1 ppm accuracy
• Three SMA outputs: LVDS plus single-ended
• 1000 frequency values can be stored in EEPROM
• Small size, low cost, easy single-knob control

Early 2005, next generation will reach 5 GHz
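
The slide does not describe the generator's internals, so the following is only a generic sketch of how Direct Digital Synthesis usually works: a phase accumulator adds a tuning word on every clock, and the accumulator's top bit forms the output. The 100 MHz reference clock and 32-bit accumulator are hypothetical values chosen for the example, not parameters of the design shown.

```python
def tuning_word(f_out_hz, f_clk_hz, acc_bits):
    """Phase increment per clock that yields approximately f_out_hz."""
    return round(f_out_hz * (1 << acc_bits) / f_clk_hz)


def actual_frequency(word, f_clk_hz, acc_bits):
    """Output frequency actually produced by a given tuning word."""
    return word * f_clk_hz / (1 << acc_bits)


def simulate_msb(word, acc_bits, n_clocks):
    """Step the accumulator and return its MSB on each clock (a square wave)."""
    acc, mask, out = 0, (1 << acc_bits) - 1, []
    for _ in range(n_clocks):
        acc = (acc + word) & mask          # wrap-around is the phase rollover
        out.append(acc >> (acc_bits - 1))  # MSB toggles at the output frequency
    return out


F_CLK = 100e6   # ASSUMED reference clock
ACC_BITS = 32   # ASSUMED accumulator width

word = tuning_word(25e6, F_CLK, ACC_BITS)
print(hex(word), actual_frequency(word, F_CLK, ACC_BITS))
print(simulate_msb(word, ACC_BITS, 8))
print("frequency step (Hz):", F_CLK / (1 << ACC_BITS))
```

The frequency step is the clock rate divided by 2 to the power of the accumulator width, so a wide accumulator gives fine tuning resolution at no cost in output range.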

Page 19:

640 MHz Pulse Generator

Page 20:

Challenges

• Technology moves rapidly: 130, 90, 65 nm
• Multiple Vcc, lower voltage - higher current
– Lower Vcc makes decoupling very critical
• Moore’s law becomes more difficult to sustain
– Leakage current has increased significantly
– Triple-oxide transistors and clever design provide relief
• Signal integrity on pc-boards is crucial
– “homebrew” prototyping would waste money and time

Use Standard Evaluation Boards Instead

Page 21:

AFX Basic Evaluation Boards

Page 22:

Low-Cost ML40X (~ $ 700)

Page 23:

ML46X- Memory Eval. Board

Page 24:

Page 25:

ChipScope Pro for Real-Time Debug

• Debugging usually dominates the design effort
– needs access to chip-internal nodes and busses
– practically impossible to dedicate extra pins and routing
– don’t waste time “debugging the debugger”
• ChipScope Pro has internal virtual test headers
– Small cores that act as internal logic state analyzers
• ChipScope Pro provides full visibility at speed
– Read-out via JTAG, no extra pins needed

ChipScope Pro is the best tool for logic debugging

Page 26:

FPGAs in 2004+

Page 27:

“a bold new course into the cosmos”

Reconfigurable Scalable Computing (RSC) for Space Applications - $14.8M

Page 28:

Spirit & Opportunity Rovers

6 Radiation-tolerant FPGAs: 1M gates @ 100 kRads

Next: 6M gates @ 200 kRads

Page 29:

Power and Reliability

• Supply voltage fluctuations can increase in power-aware design
– Need models that can be adapted by architects/software designers that abstract detailed circuit issues
– Cost of hardware solutions - supply grids, decoupling capacitances
– Cost of software solutions – balancing work load
• Substrate coupling between digital and analog
• Interconnect reliability and power
– Noise sources – single errors, multiple errors, amplitude of the error sources, modes of failure
– Challenge – detecting the unobserved
– How to offset encoding/decoding costs – Just Enough Power
• Soft errors

Page 30:

Higher Leakage Current…

• High leakage current = static power consumption
– Was <100 microamps, now >100 mA, even amps (!)
• Caused by:
– Gate leakage due to 16 Å gate oxide thickness
– Sub-threshold leakage current: incomplete turn-off because threshold does not scale
• Tyranny of numbers:
– 10 nA x 100 million transistors = 1 A
– evenly distributed, thus no reliability problem

Sub-100 nm is not ideal for portable designs
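
The "tyranny of numbers" figure is a simple multiplication, reproduced below together with the static power it implies. The 1.2 V core voltage is an assumed value used only to turn the current into watts; the slide gives no supply voltage, and the function names are illustrative.

```python
def total_leakage_amps(per_transistor_amps, transistor_count):
    """Aggregate leakage when every transistor leaks the same small current."""
    return per_transistor_amps * transistor_count


def static_power_watts(leakage_amps, vcc_volts):
    """Static power drawn from the core supply by leakage alone."""
    return leakage_amps * vcc_volts


leak = total_leakage_amps(10e-9, 100e6)        # 10 nA x 100 million transistors
power = static_power_watts(leak, vcc_volts=1.2)  # ASSUMED core voltage

print(f"total leakage: {leak:.2f} A")
print(f"static power at 1.2 V: {power:.2f} W")
```

A per-transistor current far too small to matter on its own becomes a battery-relevant load once it is multiplied by the transistor count, which is why the slide calls sub-100 nm a poor fit for portable designs.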

Page 31:

Dramatic Power Reduction in Virtex-4

[Figure: Power Consumption vs. Frequency, showing a 50% reduction]

Virtex-4 cuts power by 50%
• Measured 40% lower static power with Triple-Oxide technology
• 50% lower dynamic power with 90-nm
– Lower core voltage
– Less capacitance
• Up to 10x lower dynamic power with hard IP
– Integration means fewer transistors per function

Challenges
• Static power grows with process generations
– Transistor leakage current
• Dynamic power grows with frequency
– P = C·V²·f
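
Because dynamic power scales with capacitance, the square of the voltage, and frequency, modest reductions in core voltage and capacitance compound quickly. The sketch below illustrates this with hypothetical ratios (a core voltage drop from 1.5 V to 1.2 V and 20% less switched capacitance); these inputs are assumptions for the example, not measurements from the slide.

```python
def dynamic_power(c_farads, v_volts, f_hz):
    """Dynamic switching power, P = C * V^2 * f."""
    return c_farads * v_volts ** 2 * f_hz


def relative_dynamic_power(c_ratio, v_ratio, f_ratio=1.0):
    """New/old dynamic power given new/old ratios of C, V and f."""
    return c_ratio * v_ratio ** 2 * f_ratio


# ASSUMED illustrative ratios: 20% less capacitance, 1.5 V -> 1.2 V core.
ratio = relative_dynamic_power(c_ratio=0.8, v_ratio=1.2 / 1.5)
print(f"relative dynamic power at the same frequency: {ratio:.3f}")

# Absolute example with hypothetical values: 1 nF switched at 500 MHz, 1.2 V.
print(f"{dynamic_power(1e-9, 1.2, 500e6):.2f} W")
```

The voltage term dominates because it enters squared: a 20% voltage reduction alone cuts dynamic power by 36%.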

Page 32:

Single-Event Upsets in Virtex-II

• SEU = random soft error, directly or indirectly caused by solar radiation
• Known problem at high altitude and in space; traditionally not a problem at sea level
• Many tests and papers show ways to mitigate: readback, scrubbing, triple redundancy
• Aerospace apps tolerate the cost/size penalty

Creates FUD: Fear, Uncertainty & Doubt

Page 33:

Radiation Sources

[Figure: radiation environment. Labels: Trapped Particles (Protons, Electrons, Heavy Ions); Galactic Cosmic Rays (GCRs); Solar Protons & Heavier Ions. Credit: Nikkei Science, Inc. of Japan, by K. Endo, Prof. Yohsuke Kamide]

Page 34:

FPGA Radiation Tolerance: TID Trends vs Product/Technology

[Figure: TID in Krads (Si) (per 1019.6), 0 to 400, plotted against process node, 350 nm down to 50 nm]

TID tolerance of Military-grade FPGAs with full production test:
• 350 nm - XQ4000XL − 60K Rads (Si)
• 220 nm - XQVR (Virtex) − 100K Rads (Si)
• 150 nm - XQR2V (Virtex-II) − 200K Rads (Si)
• 130 nm - XQR2VP − 250K Rads (Si)
• 90 nm (Preliminary) − 300K Rads (Si)

Process trends*:
• Gate oxide continues to thin
• Oxide tunnel currents increase
• Gate stress voltage decreases

*See “CMOS SCALING, DESIGN PRINCIPLES and HARDENING-BY-DESIGN METHODOLOGIES” by Ron Lacoe, Aerospace Corp, 2003 IEEE NSREC Short Course 2003

Page 35:

Applied Mitigation (TMR + Scrubbing)

[Figure: Virtex-II with TMR, PROM, and Scrub Control]

Single FPGA with TMR and Configuration Scrubbing
• Continuous, uninterrupted operation (except SEFI)
• Can employ readback for error detection
• Scrub controller detects and handles SEFIs
• Critical data processing applications (Communications, Navigation)

FPGA can manage its own configuration scrubbing!
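
As a rough software analogy for the two mechanisms named on this slide, the sketch below shows a bitwise 2-of-3 majority voter (the core of TMR) and a toy scrubber that rewrites any configuration frame differing from a golden copy. It is a simplified model under stated assumptions: frames are plain integers in a list, and all function names are hypothetical. A real scrubber works through the device's configuration port and must also handle SEFIs.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)


def disagreeing_copies(a, b, c):
    """Indices (0, 1, 2) of copies that differ from the voted result."""
    voted = majority_vote(a, b, c)
    return [i for i, word in enumerate((a, b, c)) if word != voted]


def scrub(config_frames, golden_frames):
    """Rewrite frames that differ from the golden copy; return repaired indices."""
    repaired = []
    for i, golden in enumerate(golden_frames):
        if config_frames[i] != golden:
            config_frames[i] = golden  # overwrite the upset frame in place
            repaired.append(i)
    return repaired


# One copy suffers a single-bit upset; the voter masks it.
print(bin(majority_vote(0b1011, 0b1011, 0b0011)))
print(disagreeing_copies(0b1011, 0b1011, 0b0011))

# A flipped bit in frame 1 is found and corrected by scrubbing.
golden = [0xAAAA, 0x5555, 0xFFFF]
live = [0xAAAA, 0x5554, 0xFFFF]
print(scrub(live, golden), live == golden)
```

The two techniques complement each other: voting masks an upset immediately, while scrubbing removes it before a second upset in another copy could defeat the vote.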

Page 36:

Xilinx TMR (XTMR)

[Figure: XTMR vs. Single-String]

Page 37:

Spectrum of Reconfiguration

[Figure: spectrum of reconfiguration frequency (Occasionally, Periodic, Frequent, Run-time) across applications: Field Upgrades, Rapid Design, Data Processing, Networking, Signal Processing]

New use models enabled with Reconfigurable FPGAs
• More efficient use of hardware
• Adaptive hardware algorithms
• Design modules that time share device resources
– Reduced device count and lower power consumption

Page 38:

FPGA Partial Reconfiguration

• Think of an FPGA as Two Layers
– Configuration Memory Layer
– User Logic Layer
• Configuration memory controls functions on user logic layer
• Partial Reconfiguration allows a portion of device to be changed while the rest is still running
• Documented in XAPP 290

[Figure: Configuration Memory Layer and User Logic Layer]

What FPGA Configuration Memory Controls
• All interconnection (wiring)
• Logic Definition (Look-up Tables or “LUTs”)
• Multiply by, divide by, etc.
• Inversion
• Feature selection
• Interface to hardwired blocks, e.g. PPC
• Pipeline on/off
• ECC enable/disable
• BRAM width
• I/O Modes

> really EVERYTHING!
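
The two-layer idea can be illustrated with a toy model of a 4-input LUT: the 16 configuration bits are the "configuration memory layer", and the function they produce is the "user logic layer". The class, the bit ordering, and the dictionary standing in for the fabric are assumptions made for this sketch; real devices are configured in frames through the configuration port.

```python
class Lut4:
    """4-input look-up table whose behavior is set by 16 configuration bits."""

    def __init__(self, init_bits):
        self.init_bits = init_bits & 0xFFFF

    def evaluate(self, a, b, c, d):
        # ASSUMED bit ordering: input a is the least significant index bit.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.init_bits >> index) & 1


AND4 = 0x8000  # only index 15 (all inputs high) reads 1
OR4 = 0xFFFE   # every index except 0 reads 1
XOR4 = 0x6996  # indices with an odd number of high inputs read 1

# Two LUTs standing in for separate portions of the device.
fabric = {"lut_a": Lut4(AND4), "lut_b": Lut4(XOR4)}
print(fabric["lut_a"].evaluate(1, 0, 0, 0), fabric["lut_b"].evaluate(1, 0, 0, 0))

# "Partial reconfiguration": rewrite lut_a's bits, leave lut_b untouched.
fabric["lut_a"].init_bits = OR4
print(fabric["lut_a"].evaluate(1, 0, 0, 0), fabric["lut_b"].evaluate(1, 0, 0, 0))
```

The same hardware resource computes AND before the rewrite and OR after it, while the neighbouring LUT keeps its function throughout.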

Page 39:

Partial Reconfiguration Modules (PRMs)

[Figure: XC2VP30 with PR Region A (PRM_A0, PRM_A1, PRM_A2) and PR Region B (PRM_B0, PRM_B1, PRM_B2)]

• One or more PR regions can be defined
• Multiple PRMs can be defined for each region
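
The region/module relationship on this slide maps naturally onto a small data structure. The sketch below uses the region and module names from the figure, but the `PRRegion` class and its `load` method are hypothetical bookkeeping, not part of any Xilinx tool or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PRRegion:
    """A reconfigurable region and the modules that may be loaded into it."""

    name: str
    modules: List[str]
    active: Optional[str] = None

    def load(self, module: str) -> None:
        # Only modules defined for this region are allowed in it.
        if module not in self.modules:
            raise ValueError(f"{module} is not defined for {self.name}")
        self.active = module


region_a = PRRegion("PR Region A", ["PRM_A0", "PRM_A1", "PRM_A2"])
region_b = PRRegion("PR Region B", ["PRM_B0", "PRM_B1", "PRM_B2"])

region_b.load("PRM_B0")
region_a.load("PRM_A1")  # swapping region A does not disturb region B
print(region_a.active, region_b.active)
```

Each region holds exactly one active module at a time, and swapping one region leaves the other running its current module.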

Page 40:

Logic Capacity of Virtex Rad Tolerant Families
Current Virtex Families with TMR Mitigation vs. Future RHBD Families

[Figure: bar chart of Available Logic Cells (K), 0 to 125, for XQVR1000 (Virtex), XQR2V6000 (Virtex-II), XQR2VP70 (Virtex-II Pro), SIRF 4V100 (Virtex-4 RHBD). Legend: Current Virtex Families, Future Rad Hard by Design Families]

Page 41:

Issues

CMOS scaling will continue well into the next decade, fueling reconfigurable FPGA architectural advances and system-level integration

The computing industry is trying to increase performance with parallel execution and reconfigurability today, and this is clearly the way of the future

Performance of FPGAs as a compute platform exceeds that of conventional processors in all three performance vectors; implementing an effective programming model is the main issue the industry is working hard to solve

Partial Reconfiguration capability is here today, enabling new use models, and software support tools are imminent

Rad Tolerant Reconfigurable FPGAs available today achieve virtual SEE immunity by applying Partial Reconfiguration and soft TMR techniques

Rad Hard by Design Reconfigurable FPGAs are under development and will offer a dramatic increase in available logic cells and radiation performance while freeing up reconfiguration resources for more efficient use of hardware, reconfigurable processing or computing applications