TBD

description

TBD. FERMI ELETTRA. The Matrix Card and its Applications. 23.09.2009. Dr. John Jones, Princeton University, [email protected]. CMS, CERN. Progression in Physics Hardware Over ~20 Years: Speed, Connectivity, Density, Size. 1995: VME. 2003: VME / ATCA. 2008+: ATCA / μTCA.

Transcript of TBD

Page 1: TBD

23.09.2009

Dr. John Jones, Princeton University

[email protected]

TBD

CMS, CERN

FERMI, ELETTRA

The Matrix Card and its Applications

Page 2: TBD

Progression in Physics Hardware Over ~20 Years

1995: VME

2003: VME / ATCA

2008+: ATCA / μTCA

Size, Density, Connectivity, Speed

Page 3: TBD

The Matrix Processor - Schematic

Xilinx Virtex 5 FPGA (LX110T)

Mindspeed 72x72 Cross-point Switch

SNAP12, POP4, SNAP12

2Gb DDR2 (x2)

OPTICAL I/O (16/16)

3U μTCA I/O (20/20)

NXP 2366 μC

Page 4: TBD

The Matrix Processor – Top Photo

MTP Optics

FPGA

Mindspeed Switch

Ethernet

Page 5: TBD

The Matrix Processor – Bottom Photo

DDR2 SDRAM

NXP Microcontroller

μTCA Connector

Page 6: TBD

Switched Topology Processing

The switch technology makes it topologically agnostic:

Linear (e.g. batch)

2D / projective 2D (e.g. CMS)

3D (e.g. lattice sim.)

4D…

This is a critical difference compared to previous systems

Allows the system to be used to solve calculations in various geometries

1-to-N data duplication is easy to implement with latency ~100ps

Also allows real-time redundancy, dynamic switching, etc…
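
As a rough illustration of why a cross-point switch makes the card topology-agnostic, the sketch below models a switch as a table that maps each output port to one input port. The class name `CrossPointSwitch`, its methods and the port numbers are hypothetical and are not taken from the Mindspeed device or the card firmware; only the 72x72 size comes from the schematic slide.

```python
class CrossPointSwitch:
    """Toy model of an N x N cross-point switch (hypothetical API)."""

    def __init__(self, size=72):
        self.size = size
        # route[out_port] = in_port, or None when the output is unconnected
        self.route = [None] * size

    def connect(self, in_port, out_port):
        """Drive one output from one input; an input may feed many outputs."""
        for port in (in_port, out_port):
            if not 0 <= port < self.size:
                raise ValueError(f"port {port} out of range")
        self.route[out_port] = in_port

    def fan_out(self, in_port, out_ports):
        """1-to-N duplication: copy a single input to several outputs."""
        for out_port in out_ports:
            self.connect(in_port, out_port)

    def forward(self, inputs):
        """Return the value seen on every output for the given input values."""
        return [None if src is None else inputs[src] for src in self.route]


def ring_topology(switch, ports):
    """Linear/ring arrangement: each listed port feeds the next one."""
    for i, src in enumerate(ports):
        switch.connect(src, ports[(i + 1) % len(ports)])


switch = CrossPointSwitch()
switch.fan_out(0, [10, 11, 12])      # duplicate input 0 to three outputs
ring_topology(switch, [20, 21, 22])  # 20 -> 21 -> 22 -> 20

inputs = [f"in{i}" for i in range(switch.size)]
outputs = switch.forward(inputs)
print(outputs[10], outputs[11], outputs[12])  # in0 in0 in0
print(outputs[21], outputs[22], outputs[20])  # in20 in21 in22
```

Changing the processing geometry in this model only means rewriting the routing table; nothing about the ports themselves changes, which is the sense in which the hardware is topology-agnostic.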

Page 7: TBD

Example 1 – CMS Trigger Upgrade

2-phase upgrade of trigger system:

Phase 1 (2011-2016):

Replacement of older components, HCAL FE & associated trigger hardware

Calorimeter trigger upgrade

Phase 2 (Some time after...):

Installation of upgraded tracker including TP generation

Integration of tracking information into enhanced trigger system

Page 8: TBD

Current CMS Trigger Architecture

Diagram legend: External link latency (BX); Link speed (Gb/s); Internal processing latency (with internal connections); Internal processing latency (without internal connections)

Timeline: Collision t=0; Detector readout t=131

Blocks: ECAL FE (4.5 / 4.5); HCAL FE (? / ?); ECAL TCC (17 / 17*); HCAL HTR (? / ?); RCT (20 / 18*); GCT (25 / 9); GT (11 / 6); TTC (2* / 2*)

Numeric annotations (diagram positions not recoverable): 1.5, 5.5, 0.08, 3.0, 4*, 1.25*, 19*, 0.5, 0.8, ?, 1.6, 10.04, 19*, 51.5 (37), 79.5, 56.5

Page 9: TBD

Current CMS Trigger Architecture

Processing subdivided into eta-phi regions / link (e.g. calorimeter trigger)

2 scaling problems with this approach:

Difficult to add new input sources (i.e. improved HCAL, tracking)

Data reduction layer doesn’t scale efficiently & balance boundary data sharing

CAL TPG

RCT

GCT

GT

Significant data reduction

400

20

φ

Page 10: TBD

Current-Revised CMS Trigger Architecture

Revisit calorimeter TPG principle:

Diagram, current scheme: CAL TPG blocks feeding the RCT

Diagram, revised scheme (time-multiplexed serialisation): CAL TPG blocks feeding the ROG

Diagram labels: ηφ=1, t=1; ηφ=3, t=3

Page 11: TBD

Data Serialisation in TPG

TPG multiplexes data into BX-serialised streams:

Diagram axes: η, φ; time slices: t0, t1, t2

Initial cost: lost time due to multiplexing

Later gain: Compact, redundant, time-multiplexed system up to GT

Overall latency DECREASES!
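
One way to picture the time-multiplexed scheme: instead of splitting each bunch crossing (BX) by η-φ region across many cards, the whole event for a given BX goes to a single processing node, and successive BX are dealt out round-robin. The sketch below assumes 16 nodes, matching the 16x serialisation quoted elsewhere in the talk; the function names and the toy data are hypothetical.

```python
from collections import defaultdict

N_NODES = 16  # assumption: one node per slot of a 16x time-multiplex period


def node_for_bx(bx, n_nodes=N_NODES):
    """Round-robin assignment: node k sees BX k, k + n_nodes, k + 2*n_nodes, ..."""
    return bx % n_nodes


def time_multiplex(events, n_nodes=N_NODES):
    """Group (bx, towers) pairs by destination node, keeping each event whole."""
    per_node = defaultdict(list)
    for bx, towers in events:
        per_node[node_for_bx(bx, n_nodes)].append((bx, towers))
    return per_node


# Toy data: 40 consecutive bunch crossings, each with a dummy tower list.
events = [(bx, [bx] * 4) for bx in range(40)]
per_node = time_multiplex(events)

print([bx for bx, _ in per_node[0]])  # [0, 16, 32]
print([bx for bx, _ in per_node[5]])  # [5, 21, 37]
```

In this picture each node has a full multiplex period to receive one complete event, which is the "lost time" cost, while no node needs boundary data from its neighbours.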

Page 12: TBD

Current CMS Trigger Architecture

Processing subdivided into eta-phi regions / link (e.g. calorimeter trigger)

CAL TPG

ROG

GT

More compact

Faster

Lower latency

Topological

400

20

1

Region / card increases

Eliminates GCT / RCT boundary

Space for additional future data

Inter-card data sharing decreases

OR

MUON TPG

Page 13: TBD

Doing the Numbers (Based on Current CT)

Post-TPG link speed ~3.75Gb/s ~8b * 9.375 / BX / fibre

16 x serialisation in TPG => ~75 towers (ECAL+HCAL) / BX / fibre

Eliminate phi-boundary (one fibre absorbs entire eta segment!)

Calorimeter dimensions 88 (eta) x 72 (phi) trigger towers

e.g. 1 matrix card = 16 (eta) x 72 (phi)

16 input channels => all inputs for jet trigger + overlap in current CMS

10 matrix cards for full-phi-granularity, coarse (4 tower) eta processing (x16 copies)

16 matrix cards for full-tower-granularity processing (x16 copies)

2 fibers => output for results (electrons, photons & jets)

32 input fibres into GT card
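
The figures on this slide can be reproduced with a short calculation. The sketch below is a reading of the arithmetic, not the author's own working: it assumes 8b/10b line coding, a 25 ns bunch-crossing period, and 8 bits each for ECAL and HCAL per tower, none of which is stated explicitly on the slide. The link speed, serialisation factor and calorimeter dimensions are taken from the slide.

```python
from fractions import Fraction

LINK_RATE_BPS = 3_750_000_000     # post-TPG link speed from the slide (~3.75 Gb/s)
ENCODING = Fraction(8, 10)        # assumption: 8b/10b line coding
BX_SECONDS = Fraction(25, 10**9)  # assumption: 25 ns bunch-crossing period
SERIALISATION = 16                # 16x serialisation in the TPG (from the slide)
BITS_PER_TOWER = 16               # assumption: 8 bits ECAL + 8 bits HCAL

# Exact rational arithmetic avoids floating-point noise in the results.
payload_bits_per_bx = LINK_RATE_BPS * ENCODING * BX_SECONDS
bytes_per_bx = payload_bits_per_bx / 8
towers_per_fibre = payload_bits_per_bx * SERIALISATION / BITS_PER_TOWER

ETA_TOWERS, PHI_TOWERS = 88, 72   # calorimeter dimensions from the slide
CARD_ETA = 16                     # eta rows per matrix card (from the slide)

print(float(bytes_per_bx))             # 9.375
print(int(towers_per_fibre))           # 75
print(towers_per_fibre >= PHI_TOWERS)  # True
print(CARD_ETA * PHI_TOWERS)           # 1152
print(SERIALISATION * ETA_TOWERS)      # 1408
```

Under these assumptions one fibre carries about 75 towers per event, enough for a full ring of 72 phi towers at one eta, which is consistent with the remark that one fibre absorbs an entire eta segment. Sixteen such fibres then cover the 16 x 72 = 1152 towers of one matrix card.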

Page 14: TBD

Processing Topology – New and Old

Diagram axes (both schemes): η, φ

Diagram labels: Processing, Fibers

New scheme

3x3 jet tower finder (full phi resolution)

4x4 calorimeter towers / jet tower

22x18 (88x72)

3.75Gb/s links

Data sharing – input fiber ratio: 160/88 = 1.82

Real input fiber count: 16x88 = ~1408

Old scheme – NN sharing

6.5Gb/s links

Data sharing – input fiber ratio: ~21888/680 = 32.19

Real input fiber count: 16x72x88/144 = ~680

Factor of two from link speed – need 6.5Gb/s to use old scheme
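
The two data-sharing ratios can be checked directly. In the sketch below, reading the numerator as the number of fibre inputs that must be delivered to the processing logic (copies included) and the denominator as the number of real input fibres is an interpretation, not something the slide states; the function name is hypothetical.

```python
def sharing_ratio(delivered_inputs, real_fibres):
    """Ratio of fibre inputs consumed by processing to real input fibres."""
    return delivered_inputs / real_fibres


new_ratio = sharing_ratio(160, 88)     # new scheme, numbers from the slide
old_ratio = sharing_ratio(21888, 680)  # old scheme, numbers from the slide

print(round(new_ratio, 2))              # 1.82
print(round(old_ratio, 2))              # 32.19
print(round(old_ratio / new_ratio, 1))  # 17.7
```

On these numbers the old nearest-neighbour scheme duplicates input data roughly 18 times more than the time-multiplexed scheme.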

Page 15: TBD

Can have a fully-redundant crate (spare fibres from TPG)

Redundant power & communications

Improvements in link speed = reduction in crate size or latency

Complete system test can be achieved with a small setup (e.g. debug)

The Modular Trigger Crate – 3.5Gb/s, Partial Granularity

Crate diagram, slot labels: PWR2, CMS AUX/DAQ, PWR1, MCH2, MATRIX (x8), MINI-T, MCH1, MATRIX (x2)

Signal labels: CLK, DATA

Numeric labels: 12 8 8 8 8 8 8 8 8 12; 20

Page 16: TBD

Example 2 – FERMI, Trieste

4th generation Free Electron Laser (FEL)

http://www.elettra.trieste.it/FERMI

Linear accelerator, VUV-XRAY (10-100nm)

Extremely challenging (3GHz) RF control system

Tolerance: 0.1% amplitude, 0.1 degree phase

Precision (~20fs accuracy over 24 hours / 200m distance) RF timing system

Control / diagnostic system for RF cavities will use matrix card and LLRF board

Control system accuracy: 50ps clock resolution, synchronised at 16 stations

This will be achieved without a dedicated timing interface

Page 17: TBD

Timing System Principle

A standard optical fiber has very similar path lengths (~ps) in each direction.

Any change in path length in one fiber of a TX/RX pair is matched by the other.

If you have a timing reference at each end of a serial link with guaranteed constant phase relationship between them, you can measure the loop time and use it to measure the propagation delay from the master board (matrix) to the slave (LLRF), and therefore compensate for the delay.

Such guaranteed phase can be achieved by either:

1) A shared reference clock at both ends of a link.

2) An extremely accurate OCXO that can be used to track the recovered serial clock at the receiving end.

Given the available components in a Xilinx FPGA, the loop time can be measured consistently to an accuracy of ~50ps (Xilinx DCM limited).

(N.B. With a few tricks, you can possibly do better)
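
The compensation principle described above reduces to halving the measured loop time. The sketch below is a minimal illustration under the slide's symmetry assumption (both fibres of a TX/RX pair have the same delay). The function names, the `fixed_overhead_ps` calibration constant and the example numbers are hypothetical; only the ~50ps resolution comes from the slide.

```python
def one_way_delay_ps(loop_time_ps, fixed_overhead_ps=0.0):
    """Estimate master-to-slave delay from a measured loop time.

    Assumes the outbound and return paths are equal, so each takes half
    of the round trip. fixed_overhead_ps is any known constant delay in
    the loop (a hypothetical calibration constant).
    """
    return (loop_time_ps - fixed_overhead_ps) / 2.0


def quantise(value_ps, step_ps=50.0):
    """Round to the measurement resolution (~50 ps quoted for the DCM)."""
    return round(value_ps / step_ps) * step_ps


def compensation_ps(loop_time_ps, target_delay_ps, fixed_overhead_ps=0.0):
    """Extra delay to add at the slave so the total delay equals the target."""
    measured = quantise(loop_time_ps)
    return target_delay_ps - one_way_delay_ps(measured, fixed_overhead_ps)


# Illustrative numbers only, not measurements from the system.
loop = 2_000_130.0
print(one_way_delay_ps(quantise(loop)))    # 1000075.0
print(compensation_ps(loop, 1_200_000.0))  # 199925.0
```

Padding every station up to a common target delay is one possible way to align several stations; the slides do not say how the compensation is applied in practice.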

Page 18: TBD

Round-Trip Phase Compensation, Version I

Diagram blocks and signals: MATRIX, LLRF, GTP TX, GTP RX, DCM, CLK BRIDGE, TXCLK, RF CLK, LLRF CLK, LLRF DATA, PIPELINE DELAY, CTRL

Delay terms: δc, δCB1, δCB2, δFL, δFCB, δFDCM, δT+δNTPCBi, δT+δFTPCBi, δR+δNRPCBi, δR+δFRPCBi

NOTE: Control logic not shown

Page 19: TBD

Round-Trip Phase Compensation, Version II

Diagram blocks and signals: MATRIX, LLRF, GTP TX, GTP RX, DPLL, OCXO, RXRECCLK, CMP, DCM, CLK BRIDGE, TXCLK, RF CLK, LLRF CLK, LLRF DATA, PIPELINE DELAY, CTRL

Delay terms: δc, δCB1, δCB2, δFL, δFCB, δFDCM, δT+δNTPCBi, δT+δFTPCBi, δR+δNRPCBi, δR+δFRPCBi

NOTE: Control logic not shown

Page 20: TBD

Timing System Details

Version I has been implemented and is mostly finished (calibration is done in software at the moment)

Caveat: a timing variability of 1 serial UI (unit interval) has been seen on one channel in the matrix card

This needs further study; it is hard to reproduce and doesn’t occur on all channels

Possible to correct for this effect using a loopback technique

Xilinx datasheet implies this is an artifact of the way V5 MGTs work…

…but they don’t tell you the details…

Backup: Use LVDS @ 1Gb/s, which has completely deterministic behaviour

Page 21: TBD

Conclusions

The Matrix Card is an extremely flexible device with many applications

A large part of this flexibility comes from evolution in FPGA technology

The addition of the cross-point switch provides significant extra flexibility