CMS Global Calorimeter Trigger Hardware Design 21/9/06.

download CMS Global Calorimeter Trigger Hardware Design 21/9/06.

of 29

  • date post

    17-Jan-2016
  • Category

    Documents

  • view

    212
  • download

    0

Embed Size (px)

Transcript of CMS Global Calorimeter Trigger Hardware Design 21/9/06.

  • CMS Global Calorimeter TriggerHardware Design21/9/06

  • CMS Level-1 TriggerPosition of the Global Calorimeter Trigger inThe CMS Level-1 Trigger system

  • RCT Data Output18 RCT Crates cover CMS Calorimeter barrelEach Crate covers 0.7 x 5 regionOutputs Electron and Jet Information to GCT01234567891011121314151617-55002

  • GCT RequirementsSorts Electrons4 highest EnergyFinds and sorts JetsTop 4 by energy, physical size, tau Sorting criteria may be changedSort by energy for initial implementationAlgorithm implements 3x3 sliding windowRequires contiguous data space spans RCT crate boundariesData sharing scheme needs to be implementedThis algorithm drives processing requirementsProcesses data at ~250 GbpsLatency requirement of 24 bunch crossings600 nSInterfaces to 18 RCT crates108, 68 pin parallel cables Differential ECL, not DC balancedNeed to maintain ground reference with entire RCTEntire row of racks on different floor

  • Jet finderParallel algorithm implements 3x3 sliding window over each RCT crates data spaceOverlaps neighboring cratesRCT data must be shared between InstancesInstances must be in close proximitySame Device if possibleHigh bandwidth connection if notNote overlap at 0Data duplicatedSent to both jet findersPoster covers this subject in detail 01234591011121314-5500Jet finder 1Jet finder 2Jet finder 3Revised CMS Global Calorimeter Trigger Functionality & Performance

  • Design TradeoffsDesign on a compressed scheduleReduce risk to the extent possibleBase on existing modulesConservative data ratesMinimize number of FPGAsReduce firmware riskNever split algorithmsAlgorithm size and complexity drives FPGA selectionEasier simulationBetter synthesis efficiencySuperior timing (and lower overall power consumption)Judicious use of serializer/deserializersMost efficient method of concentrating dataLatency penalty of at least 6 clocksOnly use on system boundariesRCT/GCTGCT/GTSignificant negative impact on complexity of designSeveral cards used mainly as signal plumbing

  • Design OverviewThree main elementsData transport and physical concentrationTrigger processingData plumbing and sortingData TransportCompress RCT data and provide electrical isolationFunctionally part of the RCTWhat we really wanted its output to beTrigger ProcessingImplement Jet findersModular processing elementData PlumbingLarge, physically complex boardsRequired to deal with multiple wide parallel bussesExcellent example of why SERDES technology was developedUnfortunately required due to latency constraints

  • Design OverviewModular design with 4 card typesSource card (based on Imperial College IDAQ VME module)Serializes RCT data and transmits on fiberElectrically isolates GCT from RCTReduces interface cabling, allows physical data concentrationResides in RCT racksAlso provides RCT readoutLeaf card (based on Los Alamos digital channelizer double PMC)Jet processing and fiber receiverLogic capacity driven by Jet finder algorithmAlso used for Electron sortWheel card (new CERN design)Only used for Jet processingCarries multiple Leaf cardsFacilitates Leaf data sharingSorts resulting JetsConcentrator card (new CERN Design)Electron and final Jet sortCarrier Leaf cards for Electron sortInterfaces with Wheel cards for final jet sortSlow control, TTC, and DAQ interface

  • GCT Block DiagramLeaf CardWheel CardConcentrator card63 source cards (not shown)8 Leaf cards2 Wheel cards1 Concentrator

  • Electrical InterfacesLVDS signalingUsed between Leaf cards, and from Wheel to ConcentratorCable based board/board connections40MHz DDR required, but can support fasterDirect FPGA drive in most casesDirect connections provide maximum flexibility and speedWheel/Concentrator Jet data passes through single ended/diff convertersPin limits due to large number of signalsSamtec QTS/H differential connectorsHigh density and speed, Rated for multi GHz operationCommercial cable assembliesDDR used for all single ended I/OFPGA intercommunication at 40MHzShort runs allow faster operationCommunication with Leaf cards at 40MHzPMC connectors limit speed hereLeaf cards utilize 2.5V LVCMOS with DCI (nominally 50 ohms at this point)Single ended outputs from FPGAs utilize DCI driversAllow controlled impedance driveNominally 50 ohms at this point

  • Clock DistributionTTCrxQPLL40MHz80MHz160MHzFan out4x4 Cross point switchExternal InputFPGA ControlPMC sitesWheel cardsLocal FPGAsConcentratorQPLL40MHz80MHzFan out4x4 Cross point switch40MHz from ConcentratorFPGA ControlPMC sitesLocal FPGAsWheel80MHz from ConcentratorOSCFan out4x4 Cross point switchLocal 80MHz reference FPGA ControlFPGASERDES refLeaf40, 80MHz from WheelFPGALogic clocks80MHzExternalinput Fully differential distribution tree Controlled by cross point switches Allows stand alone operation Use DLLs in FPGAs to tune if necessary

  • JTAG SystemJTAGJTAGSWITCH(CPLD)Front panelCPLDJTAGBoard headerSWITCHES

    DifferentialbuffersSamtec differentialConnector/cableFrom concentratorJTAG Mode6, 2.5V JTAG chains2, 3.3V JTAG chainsJTAG CPLD is a combinatorial flow-through design

    A single chain, or multiple chains (daisy chained) areselected via front manual mode switches

    JTAG source is either header or remote device,selected by mode switch

    Leaf JTAG is single chain (4 devices)JTAGJTAGSWITCH(CPLD)Board headerCPLDJTAGBoard headerSWITCHES

    DifferentialbuffersSamtec differentialConnector/cableTo wheelJTAG Mode6, 2.5V JTAG chains2, 3.3V JTAG chainsVMEConcentratorWheel

  • Source CardBased in IDAQ module designed at Imperial CollegeSimplified due to large number required for complete systemConverts ECL RCT input to SFP opticalTwo VHDCI SCSI inputs, 32 bits at 80MHzFour SFP fiber outputsSpartan 3 4 SERDES/SFP modules8b/10b encodingDC balanced1.6GbpsComma generation (sync)TTCrxTime synchronizationUSB slow controlProvides RCT readback

  • Block diagram3445728110 layer PCBUSB slow controlTTCrx and QPLLVHDCI SCSI inputsSerial SPF outputsAgilent HFBR-5720ALTLK2501 SERDESRated at 2.5 GbpsLinear power for SERDES/SFPSwitching power for FPGASpartan3 FPGA3S1000

  • Leaf CardBased on satellite channelizer design at Los Alamos LabModified to include V2Pro and MFP connectorsMain processing engine of GCTAccepts Jet data from 3 RCT cratesElectron data from 9 RCT crates32 fiber optic linksSNAP-12 MFP opticsRated at 2.5 GbpsJet algorithm drives capacity3M gates/jet finder10 fibers/crate2 V2Pro70 FPGAs

  • Block DiagramV2P70V2P703346664412557Clock DistributionLocal oscillatorPMC/coax inputsPMC connectors8 fully populated60 pair differential linksSwitching power supplies1.5V core and 2.5V I/OPhase and freq controlledLinear SERDES supplies12 channel optical receiversAgilent AFBR-742B14 layer PCB

  • ImplementationHigh density Optical Inputs Cannot fit enough SFP single channel modulesSnap 12 parallel receiver12 channels at 2.5GbpsIndustry standard short distance linkXilinx Embedded SERDES links (Rocket I/O)Virtex2 Pro devices selectedV2P70 with 16 links eachSupport improved differential I/OEasily obtainableNo external (off FPGA chip) memoryNice to have, but not required for GCT processingDouble PMC formatPower supply and basic layout retained from existing designElectrically compatible, but too high mechanicallyNot truly PMC compliant

  • Routing ParametersLength matchingDifferential lines matched to as a busYields on board/board connectionsIndividual pairs matched to a few milsMatched in groups of 8 pairsNot required for 40MHz DDRAllows significant speed increaseSingle ended lines matched to as a busMatched in groups of 8-12 linesNot required for 40MHz DDRAllows higher speed operationDifferential SERDES lines matched to 1-2 milsImpedance controlled and individually matchedBoard structure isolates SERDES with ground planes50 Ohm stripline

  • Test routing

  • Test routing

  • Power supplies15A, 1.5V switcher for each V2Pro VccIntDevices can be run at thermal limitFan headers on board if neededEstimated load less than 1/2 this figure40MHz, 100% utilization yields 6ASingle 15A, 2.5V switcher for I/OEstimated load is of this capacitySwitchers powered from 5VNot used for other logicPhase and frequency controlledCan optimize noise or efficiencySwitch out of phase to control surge currentsSeparate linear supplies for SERDESEach FPGA has local linear SERDES supplyOptical receivers powered directly from 3.3V PMC powerManufacturer claims this is acceptable

  • Wheel CardCarries 3 leaf cards (double PMC)Compresses (sorts) Jet dataCalculates Et and Jet countSingle ended electrical interface (DDR 40MHz)Interfaces to concentrator boardHigh speed cable interfaceLVDS electrical interfaceDDR 40Mhz required, but could support higher9U VME form factorPower only, no VME interfaceECAL backplane

  • Block Diagram EnergyFPGAJetFPGALeaf(x3)J12J14J13J11J22J21J24J23Finished Jet dataEnergy Sum DataJ1J2J3DiffBuffersDiffBuffersSlow control and readback

  • ImplementationAccepts parallel data from leafs on 3 DPMC sites278 signals on each site (total 834)186 signals/site to Jet FPGA, 92/site to Energy FPGASingle endedOutputs parallel dataElectrical interface240 differential pairs providedProcessingTwo Xilinx Virtex4 FPGAsXC4VLX100FF1513I/O (as opposed to logic) intensive designAdvanced Virtex4 I/O features reduce riskBetter double data rate supportImproved Differential supportOne for Jet sorting, one for Et and Jet countJet FPGA pin limitedRequires single ended output to meet signal countExternal differential buffers drive data to concentrator

  • Power supplies10A, 1.2V switcher for each V4 VccIntDevices can be run near thermal limitEstimated load less than 1/2 this figure at 40MHzTwo 10A, 2.5V switchers for I/OLeaf cards may be jumpered to provide own VIOWheel will only nee