ΗΜΥ 664: Digital Design with FPGAs

Winter Semester 2010

Lecture 10: Structured ASICs and Other Things

ΧΑΡΗΣ ΘΕΟΧΑΡΙΔΗΣ (ttheocharides@ucy.ac.cy)

Ack:
• Neena Imam, Oak Ridge National Laboratory
• Frank Vahid, UC Riverside
• Dan Lander
• Haru Yamamoto
• Shane Erickson

ΗΜ648 L9 FPGA Current Trends in Industry and Academia © Θεοχαρίδης, ΗΜΥ, 2010

Complex Systems Overview

Mission: Innovative Technology in Support of DOE & DOD
Theory – Computation – Experiments

Research topics:
• Missile defense: C2BMC (tracking and discrimination), NATO (ALTBD), flash hyperspectral imaging.
• Modeling and Simulation: sensitivity and uncertainty analysis of complex nonlinear models, global optimization.
• Laser arrays: directed energy, ultraweak signal detection, terahertz sources, underwater communications, SNS laser stripping.
• Terascale embedded computing: emerging multicore processors for real-time signal processing applications (CELL, Optical Processor, …).
• Anti-submarine warfare: ultra-sensitive detection, sensor networks, advanced computational architectures, Doppler-sensitive waveforms.
• Quantum optics: cryptography, quantum teleportation (remote sensing).
• Computer Science: UltraScience network.
• Intelligent Systems: neural networks, multisensor fusion, robotics.
• Materials Science: control of friction at micro and nanoscale.

[Figure: UltraScience Net]

Sponsors: DOD (DARPA, MDA, ONR, NAVSEA), DOE (SC), IC (CIA, IARPA, NSA), NASA, NSF


MDA's HALO-II/AIRS Project

ORNL tasks:
• Independent Verification and Validation (IV&V) of software.
• Improved tracking algorithm development.
• Sensitivity analysis of system modules using Automatic Differentiation (AD).


Motivation For HALO-II/AIRS

Meet MDA T&E Requirements; Sensor / Technology Testbed

[Figure: mission areas shown around HALO-II/AIRS]
• Orbital Signatures
• Kill Assessment or Miss Distance
• Vehicle Separation
• Chemical Releases
• Booster Tracks
• Interceptor Performance
• Flash Radiometry
• Plume Signatures
• Counter-measure Signatures
• Target Signatures
• Photo documentation
• Trajectory Reconstruction
• Failure Diagnostics
• Exo-Atmospheric Target Characterization


HALO-II System Overview

Five subsystems. Sensors installed in aerodynamic pod.
• In-Pod: Pointing, Acquisition, Tracking
• In-Cabin: Real-time processor, Surveillance processor

[Figure: highest-level view of closed-loop tracking. Blocks: Image Processing, Object Track Generation, RTPS pointing, Pointing hardware (Airborne Pointing System)]


Wideband Sonar Signal Processing

For wideband signals, the effect of target velocity is no longer approximated as a simple "shift" in frequency.

• Doppler effect: a compression/stretching of the transmitted pulse.
• Wideband Ambiguity Function (WAF): a function of time delay τ and Doppler compression factor η.
• Doppler Cross Power Spectrum (DCPS): forms a Fourier pair with the ambiguity function and can be used to calculate the ambiguity function and the Q function [1, 2].

$$\eta = \frac{1 + u/c}{1 - u/c}$$

$$\chi_s(\tau, \eta) = \sqrt{\eta} \int_{-\infty}^{\infty} s(t)\, s[\eta (t - \tau)]\, dt$$

$$\Gamma_s(f, \eta) = \frac{1}{\sqrt{\eta}}\, S(f)\, S^{*}\!\left(\frac{f}{\eta}\right)$$

$$Q_s(\eta) = \int_{-\infty}^{\infty} \left|\Gamma_s(f, \eta)\right|^2 df = \int_{-\infty}^{\infty} \left|\chi_s(\tau, \eta)\right|^2 d\tau$$

1. R. A. Altes, "Some invariance properties of the wideband ambiguity function," J. Acoust. Soc. Am. 53, pp. 1154-1160, 1973.
2. E. J. Kelly and R. P. Wishner, "Matched filter theory for high velocity accelerating targets," IEEE Trans. Mil. Electron. MIL-9, pp. 59-69, 1965.


Matched Filtering for Active Sonar Processing

A synthetic echo is generated for a particular target range and velocity. The echo signal is correlated with a bank of replicas. Spectral techniques are used. The correlation with the highest magnitude provides an estimate of the Doppler velocity bin. The location of the maximum within that correlation yields the time delay of the echo, and thus provides an estimate of the range.
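The procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the ORNL implementation: the function names, the 1500 m/s sound speed, the modulation index `beta`, the sign convention (positive velocity gives η > 1), the shortened 0.5 s pulse, and the five-filter bank are all choices made here to keep the example small.

```python
import numpy as np

C_SOUND = 1500.0  # m/s, assumed nominal sound speed in water


def doppler_factor(u, c=C_SOUND):
    """Wideband Doppler compression factor eta = (1 + u/c) / (1 - u/c)."""
    return (1.0 + u / c) / (1.0 - u / c)


def sfm_pulse(t, fc=1200.0, fm=5.0, beta=40.0):
    """Sinusoidal-FM pulse; beta is an assumed modulation index."""
    return np.cos(2.0 * np.pi * fc * t + beta * np.sin(2.0 * np.pi * fm * t))


def replica(eta, fs, duration):
    """Time-compressed (or stretched) copy s(eta * t) of the transmitted pulse."""
    t = np.arange(0.0, duration / eta, 1.0 / fs)
    return sfm_pulse(eta * t)


def matched_filter_bank(received, velocities, fs, duration):
    """Correlate the received signal with one replica per velocity bin (via FFT)
    and return (velocity, delay, range) for the strongest correlation peak."""
    n = len(received)
    nfft = 1 << int(np.ceil(np.log2(2 * n)))  # zero-pad to avoid wrap-around
    rx_spec = np.fft.rfft(received, nfft)
    best_peak, best_u, best_delay = -1.0, None, None
    for u in velocities:
        rep = replica(doppler_factor(u), fs, duration)
        rep = rep / np.linalg.norm(rep)  # equal-energy replicas
        corr = np.fft.irfft(rx_spec * np.conj(np.fft.rfft(rep, nfft)), nfft)
        mag = np.abs(corr[:n])
        k = int(np.argmax(mag))
        if mag[k] > best_peak:
            best_peak, best_u, best_delay = float(mag[k]), float(u), k / fs
    return best_u, best_delay, C_SOUND * best_delay / 2.0  # two-way travel


# Synthetic echo: target at -5 m/s, echo arriving 0.8 s after transmission.
fs, duration, true_u, true_delay = 5000.0, 0.5, -5.0, 0.8
rng = np.random.default_rng(0)
received = 0.05 * rng.standard_normal(int(2.0 * fs))
echo = replica(doppler_factor(true_u), fs, duration)
start = int(round(true_delay * fs))
received[start:start + len(echo)] += echo

velocity, delay, target_range = matched_filter_bank(
    received, np.arange(-10.0, 11.0, 5.0), fs, duration)
print(velocity, delay, target_range)
```

The filter whose replica uses the same compression factor as the echo produces the largest peak, and the position of that peak gives the delay and hence the range.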


Matched Filtering for Active Sonar Processing

Signal:
• SFM pulse of fc = 1200 Hz
• Bandwidth B = 400 Hz
• Pulse duration = 1 s
• Modulation frequency = 5 Hz
• Sonar sampling rate fs = 5000 Hz
• FFT length = 80K

Target:
• assumed range: 3 km
• assumed velocity: -5 m/s (bin #1)
• bank of 32 matched filters

Result:
• output of the first filter has the closest match to the received signal.
• Time delay = 4 seconds; thus, estimated target range = 3 km.
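The step from a 4 s delay to a 3 km range is a two-way travel-time calculation. The worked line below assumes a nominal sound speed in water of c ≈ 1500 m/s, a value the slide does not state:

$$R = \frac{c\,\tau}{2} = \frac{1500\ \text{m/s} \times 4\ \text{s}}{2} = 3000\ \text{m} = 3\ \text{km}$$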


The EnLight™ Prototype Optical Core Processor

• Full matrix (256 × 256)-vector multiplication per single clock cycle
• Fixed-point architecture, 8-bit native accuracy per clock cycle
• Enhanced by on-node FPGA-based processing and control
• Demonstrated accuracy and performance in complex signal processing tasks
• Developed by Israeli startup

Power dissipation (at 8000 GOPS throughput):
• EnLight: 40 W (single board)
• DSP solution: 2.79 kW [62 boards, 16 DSPs (TMS320C64x) per board]

[Figure: EnLight 64α demonstrator]
[Figure: software stack. Application programs (FORTRAN, C, MATLAB, SIMULINK, VHDL), libraries, FPGAs, optical core]

Information provided by Lenslet, Inc.
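The power figures above can be unpacked with simple arithmetic. The snippet is a back-of-the-envelope check using only the numbers on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide (8000 GOPS throughput for both solutions).
THROUGHPUT_GOPS = 8000.0
ENLIGHT_WATTS = 40.0
DSP_WATTS = 2790.0
BOARDS, DSPS_PER_BOARD = 62, 16

dsp_count = BOARDS * DSPS_PER_BOARD          # number of DSP chips needed
power_ratio = DSP_WATTS / ENLIGHT_WATTS      # how much more power the DSP solution draws
watts_per_dsp = DSP_WATTS / dsp_count        # implied power per DSP
gops_per_dsp = THROUGHPUT_GOPS / dsp_count   # implied throughput per DSP

print(dsp_count, power_ratio, watts_per_dsp, round(gops_per_dsp, 2))
```

On these figures the DSP solution needs 992 chips and draws roughly 70 times the power of the single EnLight board.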


Matched filter calculation on EnLight-64α hardware

Performance Comparison: Hardware Implementation Results

• Speed-up factor per processor
  – E_64α: 6,826 × 2 > 13,000 (actual hardware)
  – E_256: 56,624 × 2 > 113,000 (emulator)

Time performance:

                Intel Dual Xeon     EnLight 64α    EnLight 256
Specs           2 GHz, 1 GB RAM     60 MHz         125 MHz
FFT radix       2                   32             128
Timing          9,626 ms            1.41 ms        0.17 ms

• Computation parameters
  – FFTs: 80K complex samples × number of filter banks
  – 33 filter banks: 32 Doppler cells, 1 target echo

[Figure: output of filter #1 in dB (-30 to -55) versus range in meters (2000 to 4000), MATLAB versus Alpha]
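The quoted speed-up factors follow directly from the timing row. The check below uses only the slide's numbers; reading the "× 2" as a per-processor correction for the dual-processor Xeon baseline is an interpretation, not something the slide states.

```python
# Timings from the performance comparison, in milliseconds.
xeon_ms, enlight64_ms, enlight256_ms = 9626.0, 1.41, 0.17

speedup_64 = xeon_ms / enlight64_ms     # about 6,827 (slide quotes 6,826)
speedup_256 = xeon_ms / enlight256_ms   # about 56,624

# Assumed interpretation: the baseline has two processors, so the
# per-processor speed-up is twice the whole-machine figure.
per_processor_64 = 2 * speedup_64
per_processor_256 = 2 * speedup_256

print(int(speedup_64), round(speedup_256))
print(per_processor_64 > 13_000, per_processor_256 > 113_000)
```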


Hyperspectral Sensor: Computed Tomography Imaging Spectrometer (CTIS)

CTIS: Simultaneously acquires spectral information from every position element within a 2-D FOV with high spatial and spectral resolution.

CTIS is being developed at the Optical Detection Lab of U. Arizona by Eustace Dereniak et al.

The objective is to collect a set of registered, spectrally contiguous images of a scene's spatial-radiation distribution within the shortest possible data collection time.


CTIS Instrumentation at U. Arizona


IBM Cell Multicore Device

Courtesy IBM 2006

• CELL Broadband Engine Architecture (CBEA) jointly developed by Sony, Toshiba and IBM
• Took 5 years, over 400 million dollars, and hundreds of engineers
• New design relies on a heterogeneous multicore architecture; it abandons mechanisms such as cache hierarchies, speculative execution, etc., and is based on fast local memories and powerful DMA engines

Research centers contributing:
• IBM USA: Austin, TX (lead, STIDC); Almaden, CA; Raleigh, NC; Rochester, MN; Yorktown Heights, NY
• IBM Germany: Boeblingen
• IBM Israel: Haifa
• IBM Japan: Yasu
• IBM India: Bangalore


Reconfigurable Computing via FPGAs

The emergence of high capacity reconfigurable devices has ignited a revolution in general-purpose processing.

It is now possible to tailor and dedicate functional units and interconnects to take advantage of application dependent dataflow.

Early research in this area of reconfigurable computing has shown encouraging results in a number of areas including signal processing, achieving 10-100x computational density and reduced latency over more conventional processor solutions.

FPGA, short for Field-Programmable Gate Array, is a type of logic chip that can be programmed.

An FPGA is similar to a PLD, but whereas PLDs are generally limited to hundreds of gates, FPGAs support thousands of gates.

SPECT Laboratory is involved in the development and demonstration of latest generation FPGA computing applications.


Xilinx XtremeDSP™ FPGA Hardware

• 500 MHz clocking.
• Multi-gigabit serial I/O.
• 256 GMACS digital signal processing.
• 450 MHz PowerPC™ processors with H/W acceleration.
• Highest logic integration. 200,000 logic cells.
• Reduced power consumption. Achieve performance goals while staying within your power budget.

The Xilinx XtremeDSP™ initiative helps develop tailored high performance DSP solutions for aerospace and naval defense, digital communications, and imaging applications.

[Figure: VIRTEX-4 XtremeDSP™ Development Board]


FPGA Signal Processing Station at SPECT Laboratory

1. Pegasus Demo Board with SPARTAN-2

2. Digilent VIRTEX-2 Development Board

3. VIRTEX-4 XtremeDSP™ Development Board


How Much is Enough?


How Much is Enough?

Perhaps a bit small


How Much is Enough?

Reasonably sized


How Much is Enough?

Probably plenty big


How Much is Enough?

More than typically necessary


How Much is Enough?

Very few people could use this


How Much is Enough for an IC?

1993: ~ 1 million logic transistors

IC package IC

Perhaps a bit small


How Much is Enough for an IC?

1996: ~ 5-8 million logic transistors

Reasonably sized


How Much is Enough for an IC?

1999: ~ 10-50 million logic transistors

Probably plenty big


How Much is Enough for an IC?

2002: ~ 100-200 million logic transistors

More than typically necessary


How Much is Enough for an IC?

2008: >1 BILLION logic transistors (1993: 1 M)

Perhaps very few people could design this

Point of diminishing returns
• 8-bit uC: ~15K
• 32-bit ARM: ~30K
• MPEG dcd: ~1M
• 100M good enough for audio/video/etc.?

Other examples
• Fast cars (> 100 mph)
• High res digital cameras (> 4M)
• Disk space
• Even IC performance


Very Few Companies Can Design High-End ICs

• Designer productivity growing at slower rate
• 1981: 100 designer months, ~$1M
• 2002: 30,000 designer months, ~$300M

[Figure: design productivity gap. IC capacity (logic transistors per chip, in millions; axis 0.001 to 10,000) versus productivity (K transistors per staff-month; axis 0.01 to 100,000), with the gap between the two curves marked.]

Source: ITRS'99


Meanwhile, ICs Themselves are Costlier

• And take longer to fabricate
• While market windows are shrinking
• Less than 1,000 out of 10,000 ASIC designs have volumes to justify fabrication in 0.13 micron

Tech (micron)   0.8       0.35      0.18      0.13
NRE             $40k      $100k     $350k     $1,000k
Turnaround      42 days   49 days   56 days   76 days
Market          $3.5B     $6B       $12B      $18B

Source: DAC'01 panel on embedded programmable logic


Summarizing So Far...

• Transistors are less scarce
  – ICs are big enough, fast enough
• ICs take more time and money to design and fabricate
  – While market windows are shrinking

Designers: buy pre-fabricated system-level ICs (platforms)


Trend Towards Pre-Fabricated Platforms: ASSPs

• ASSP: application specific standard product
  – Domain-specific pre-fabricated IC
  – e.g., digital camera IC
• ASIC: application specific IC
• ASSP revenue > ASIC
• ASSP design starts > ASIC
  – Unique IC design
    • Ignores quantity of same IC
  – ASIC design starts decreasing
    • Due to strong benefits of using pre-fabricated devices

Source: Gartner/Dataquest, September '01


A Sample Pre-Fabricated Platform

[Figure: pre-fabricated platform IC containing uP, L1 cache, L2 cache, DSP, JPEG dcd, peripherals, and FPGA]

• Must be programmable for use in variety of products
  – Ideally also configurable
  – Means high volume
• Platform designer's investment pays off
• Cost per IC is reasonable
  – Use additional (readily available) transistors for high configurability
• Our research focus
  – Design and use of highly configurable platforms


Commercial Highly-Configurable Platform Type: Single-Chip Microprocessor/FPGA Platforms

[Figure: Triscend E5 chip, showing configurable logic, the 8051 processor plus other peripherals, and memory]

• Triscend E5: based on 8-bit 8051 CISC core
  – 10 Dhrystone MIPS at 40 MHz
  – 60 kbytes on-chip RAM
  – Up to 40K logic gates
  – Cost only about $4 (in volume)


Single-Chip Microprocessor/FPGA Platforms

• Atmel FPSLIC
  – Field-Programmable System-Level IC
  – Based on AVR 8-bit RISC core
  – 20 Dhrystone MIPS
  – 5k-40k configurable logic gates
  – On-chip RAM (20-36Kb) and EEPROM
  – $5-$10

Courtesy of Atmel


Single-Chip Microprocessor/FPGA Platforms

• Triscend A7 chip
• Based on ARM7 32-bit RISC processor
  – 54 Dhrystone MIPS at 60 MHz
  – Up to 40k logic gates
  – On-chip cache and RAM
  – $10-$20 in volume

Courtesy of Triscend


Single-Chip Microprocessor/FPGA Platforms

• Altera's Excalibur EPXA 10
• ARM (922T) hard core
• ~200 Dhrystone MIPS at ~200 MHz
• Devices range from ~200k to ~2 million programmable logic gates

Source: www.altera.com


Single-Chip Microprocessor/FPGA Platforms

• Xilinx Virtex II Pro
• PowerPC based
  – 420 Dhrystone MIPS at 300 MHz
  – 1 to 4 PowerPCs
  – 4 to 16 gigabit transceivers
  – 12 to 216 multipliers
  – 3,000 to 50,000 logic cells
  – 200k to 4M bits RAM
  – 204 to 852 I/O
  – $100-$500 (>25,000 units)

[Figure: Virtex II Pro die, showing configurable logic, PowerPCs, and up to 16 serial transceivers (622 Mbps to 3.125 Gbps)]

Courtesy of Xilinx


Single-Chip Microprocessor/FPGA Platforms

• Why wouldn't future microprocessor chips include some amount of on-chip FPGA?


Single-Chip Microprocessor/FPGA Platforms

• Lots of silicon area taken up by configurable logic
  – As discussed earlier, less of an issue every year
  – Smaller area doesn't necessarily mean higher yield (lower costs) any more
    • Previously could pack more die onto a wafer
    • But die are becoming pad (pin) limited in nanoscale technologies
• Configurable logic typically used for peripherals, glue logic, etc.
  – We have investigated another use...


Structured ASICs

• A Structured ASIC falls between an FPGA and a Standard Cell-based ASIC

• Structured ASICs are used mainly for mid-volume designs

• The design task for structured ASICs is to map the circuit into a fixed arrangement of known cells


Properties

• Low NRE cost
  – Implementation engineering effort
  – Mask tooling charges
• High performance
• Low power consumption
• Less complex
  – Fewer layers to fabricate
• Short time to market
  – Pre-made cell blocks available for placing


Architecture

• Two main levels
  – Structured elements
    • Combinational and sequential function blocks
    • Can be a logical or storage element
  – Array of structured elements
    • Uniform or non-uniform array styles
    • A fixed arrangement of structured elements


Main Implementation Steps

1. RTL design
   – Register transfer level design
2. Logical synthesis
   – Maps RTL into structured elements
3. Design for Test insertion
   – Improves testability and fault coverage
4. Placement
   – Maps each structured element onto array elements
   – Places each element into a fixed arrangement


Main Implementation Steps

5. Physical synthesis
   – Improves the timing of the layout
   – Optimizes the placement of each element
6. Clock synthesis
   – Distributes the clock network
   – Minimizes the clock skew and delay
7. Routing
   – Inserts the wiring between the elements


Implementation Issues

• Logical synthesis, placement, and routing all depend on the target structured element architecture and hence add complexity to the design process.

• The completeness of the target structured ASIC library also affects what specifically can be implemented from the design.


FPGA vs. Standard Cell ASIC

FPGA                          Standard Cell ASIC
Easy to design                Difficult to design
Short development time        Long development time
Low NRE costs                 High NRE costs
Design size limited           Supports large designs
Design complexity limited     Supports complex designs
Performance limited           High performance
High power consumption        Low power consumption
High per-unit cost            Low per-unit cost (at high volume)

Structured ASICs combine the best of both worlds


Structured ASIC Architectures: Fine-Grained

• Structured elements contain unconnected discrete components

• Could include transistors, resistors, and others


Structured ASIC Architectures: Medium-Grained

• Structured elements contain generic logic
• Could include gates, MUXes, LUTs, or flip-flops


Structured ASIC Architectures: Hierarchical

• Uses mini structured elements that contain only gates, MUXes, and LUTs

• A mini element does not contain storage elements such as flip-flops

• The mini element is then combined with registers or flip-flops


Architecture Comparison

• Fine-grained requires many connections in and out of a structured element

• Higher granularities reduce connections to the structured element but decrease the functionality it can support

• Clearly, each individual design will benefit differently at varying granularities


Structured ASIC Advantages

• Largely prefabricated
  – Components are "almost" connected in a variety of predefined configurations
  – Only a few metal layers are needed for fabrication
  – Drastically reduces turnaround time

[Figure: layer stack with three pre-routed layers and two routing layers]


Structured ASIC Advantages

• Easier and faster to design than standard cell ASICs
  – Multiple global and local clocks are prefabricated
  – No skew problems that need to be addressed
  – Signal integrity and timing issues are inherently addressed


Structured ASIC Advantages

• Capacity, performance, and power consumption closer to that of a standard cell ASIC

• Faster design time, reduced NRE costs, and quicker turnaround

• Therefore, the per-unit cost is reasonable for several hundreds to 100k unit production runs
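The volume argument above can be made concrete with a simple total-cost model, total = NRE + volume × unit cost. Every number in the sketch below is hypothetical and chosen only to illustrate the shape of the trade-off; none comes from the slides or from a vendor, and the function name is made up.

```python
# HYPOTHETICAL cost figures (USD), for illustration only.
TECHNOLOGIES = {
    "FPGA":               {"nre": 0.0,         "unit": 50.0},
    "Structured ASIC":    {"nre": 150_000.0,   "unit": 10.0},
    "Standard cell ASIC": {"nre": 1_000_000.0, "unit": 4.0},
}


def total_cost(tech, volume):
    """Total cost of a production run: one-time NRE plus per-unit cost."""
    params = TECHNOLOGIES[tech]
    return params["nre"] + volume * params["unit"]


def cheapest(volume):
    """Name of the technology with the lowest total cost at this volume."""
    return min(TECHNOLOGIES, key=lambda tech: total_cost(tech, volume))


for volume in (1_000, 50_000, 1_000_000):
    print(volume, cheapest(volume))
```

With these made-up figures the FPGA wins at low volume, the standard cell ASIC at very high volume, and the structured ASIC in the band between them. The exact crossover points depend entirely on the assumed NRE and unit costs.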


Structured ASIC Disadvantages

• Lack of adequate design tools
  – Expensive
  – Altered from traditional ASIC tools
• These new architectures have not yet been subject to formal evaluation and comparative analysis
  – Tradeoffs between 3-, 4-, and 5-input LUTs
  – Tradeoffs between sizes of distributed RAM


Technology Comparison

• Generally speaking
  – 100:33:1 ratio between the number of gates in a given area for standard cell ASICs, structured ASICs, and FPGAs, respectively
  – 100:75:15 ratio for performance (based on clock frequency)
  – 1:3:12 ratio for power
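These ratios are easier to compare once normalized to the FPGA. The snippet below does that arithmetic with the slide's numbers; the performance-per-power figure is a derived metric added here for illustration and does not appear on the slide.

```python
# Ratios from the slide, in the order standard cell : structured : FPGA.
TECHS = ("Standard cell ASIC", "Structured ASIC", "FPGA")
density = dict(zip(TECHS, (100, 33, 1)))
performance = dict(zip(TECHS, (100, 75, 15)))
power = dict(zip(TECHS, (1, 3, 12)))


def relative_to_fpga(ratios):
    """Rescale a ratio table so that the FPGA entry equals 1."""
    return {tech: value / ratios["FPGA"] for tech, value in ratios.items()}


rel_density = relative_to_fpga(density)
rel_performance = relative_to_fpga(performance)
rel_power = relative_to_fpga(power)

# Derived metric (not on the slide): performance per unit of power.
perf_per_power = {tech: performance[tech] / power[tech] for tech in TECHS}

for tech in TECHS:
    print(tech, rel_density[tech], rel_performance[tech],
          rel_power[tech], perf_per_power[tech])
```

Read this way, a structured ASIC offers about 33 times the gate density and 5 times the clock rate of an FPGA at a quarter of the power.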


Design Tools

• Many companies are using existing standard cell-based CAD tools
  – They add product-specific placement tools
  – To maximize benefits, we need CAD tools designed specifically for structured ASICs
  – Need updated algorithms to exploit the modularity of structured ASICs
  – Clock-aware design
• Need architectural evaluation and analysis tools


Case Study: NEC ISSP
Problem Formulation

• Prefabricated
  – Standard cells, flip-flops, DSP, memory, and other IP (intellectual property)
  – Interconnects for modules, DFT circuit, and clocks
• Physical design (placement) problems
  – Modules are already embedded
  – Mapping problem


After Logical Synthesis

• Different clock signals for different groups of modules (FFs)

• Multiple clock signals in one chip

• Must perform clock-aware placement


Conclusions

• Enhancements should be made to existing EDA tools to achieve a better performance result on structured ASIC architectures

• The structured ASIC is a revolution to businesses, but another evolution of ASIC implementation

• The structured ASIC was developed to bridge the gap between the FPGA and the Standard Cell-based ASIC


References

• T. Okamoto, T. Kimoto, N. Maeda, "Design Methodology and Tools for NEC Electronics - Structured ASIC ISSP," Proceedings of the 2004 International Symposium on Physical Design, p. 90.

• B. Zahiri, "Structured ASICs: Opportunities and Challenges," Proceedings of the 21st International Conference on Computer Design (ICCD'03).

• K. Wu, Y. Tsai, "Structured ASIC, Evolution or Revolution?," Faraday Technology Corporation, Proceedings of the 2004 International Symposium on Physical Design.