Evaluation of HEP worker nodes, Michele Michelotto
8/2/2008 CSN5 Trieste michele michelotto - INFN Padova 2
Computing model
[Figure: tiered computing model. CERN Tier 0 feeds the Tier-1 centres (CERN Tier 1, Germany, UK, France, Italy, Japan, USA BNL, USA FNAL); below them are the Tier-2 sites (labs and universities), the Tier-3 physics departments and the desktops. Grids are formed for a regional group and for a physics study group.]
Computing Needs
• Tape Storage
– Very easy: events → Terabyte
• Disk Storage
– Easy again: events → Terabyte
– (1000x1000 or 1024x1024?)
– RAID protected or raw size?
• Computing Power
– Tricky: Event/sec? Sim or Reco?
– MIPS, CernUnit, MHz, Spec, SI2K…
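The two disk-size questions above can be made concrete with a little arithmetic. The sketch below compares a decimal terabyte with a binary one and derives a usable size from a raw size. The RAID layout (8 disks, 1 parity disk) is an illustrative assumption, not a figure from the slides, and the function names are hypothetical.

```python
# Decimal vs binary terabyte, and raw vs RAID-protected capacity.
TB = 1000 ** 4   # bytes in a decimal terabyte
TIB = 1024 ** 4  # bytes in a binary terabyte (tebibyte)


def tb_to_tib(tb):
    """Convert a size in decimal TB to binary TiB."""
    return tb * TB / TIB


def usable_tb(raw_tb, disks, parity_disks):
    """Usable capacity of one RAID set (assumed layout, e.g. RAID-5 has 1 parity disk)."""
    return raw_tb * (disks - parity_disks) / disks


print(f"1 TiB = {TIB / TB:.4f} TB")
print(f"1000 TB = {tb_to_tib(1000):.1f} TiB")
print(f"100 TB raw on 8 disks with 1 parity disk = {usable_tb(100, 8, 1)} TB usable")
```

The two definitions differ by about 10%, which is comparable to the parity overhead in this example, so a storage request needs to state both conventions.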
T1 requirements (CNAF Plan, September 2007)
Each year cell lists CPU [KSI2K], DISK [TB-N], TAPE [TB].
Experiment | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012
ALICE (22%) | 154 16 77 | 286 110 143 | 748 330 428 | 1727 789 888 | 3026 1512 1554 | 4431 2503 2691 | 5714 3426 3905
ATLAS (32%) | 224 40 112 | 416 160 208 | 1088 480 623 | 2513 1148 1292 | 4401 2199 2260 | 6446 3641 3913 | 8311 4983 5680
CMS (35%) | 245 86 123 | 455 175 228 | 1190 525 681 | 2748 1256 1413 | 4813 2405 2472 | 7050 3983 4280 | 9090 5450 6212
LHCB (11%) | 77 26 39 | 143 55 72 | 374 165 214 | 864 395 444 | 1513 756 777 | 2216 1252 1345 | 2857 1713 1952
Total LHC TIER1 | 700 168 350 | 1300 500 650 | 3400 1500 1946 | 7852 3588 4037 | 13753 6871 7061.8 | 20143 11380 12230 | 25972 15571 17748
BaBar | 585 149 0 | 680 200 0 | 1215 350 0 | 1215 350 0 | 1215 350 0 | 700 350 0 | 400 350 0
CDF | 900 66 0 | 820 100 15 | 1161 170 15 | 1290 220 15 | 1420 270 15 | 800 270 15 | 600 270 15
LHCB TIER2 | 0 0 0 | 150 0 0 | 600 0 0 | 900 0 0 | 1300 0 0 | 1600 0 0 | 1600 0 0
TOTAL GROUP I | 1485 214 0 | 1650 300 15 | 2976 520 15 | 3405 570 15 | 3935 620 15 | 3100 620 15 | 2600 620 15
AMS2 | 32 2 16 | 25 5 16 | 32 5 24 | 180 16 128 | 180 28 232 | 0 28 232 | 0 28 232
ARGO | 22 12 28 | 150 70 186 | 288 122 366 | 288 159 546 | 288 195 726 | 288 195 726 | 288 195 726
GLAST | - - - | 5 10 0 | 200 50 10 | 200 70 20 | 200 100 30 | 200 100 30 | 200 100 30
MAGIC | - 1 - | 20 5 4 | 25 5 8 | 25 6 12 | 25 6 16 | 25 6 16 | 25 6 16
PAMELA | - 4 - | 20 10 16 | 25 12 32 | 25 14 48 | 25 16 64 | 0 16 64 | 0 16 64
Virgo | 10 25 75 | 180 90 130 | 250 150 200 | 500 220 250 | 500 220 250 | 0 0 0 | 0 0 0
TOTAL GROUP II | 64 43 119 | 400 190 352 | 820 344 640 | 1218 485 1004 | 1218 565 1318 | 513 345 1068 | 513 345 1068
All experiments | 2249 426 469 | 3350 990 1017 | 7196 2364 2601 | 12475 4643 5056 | 18906 8056 8395 | 23756 12345 13313 | 29085 16536 18831
All w/ overlap factor | 1874 387 469 | 2792 900 1017 | 5997 2149 2601 | 10396 4221 5056 | 15755 7324 8395 | 19796 11222 13313 | 24237 15033 18831
CNAF TOTAL (PLAN) | 1874 387 469 | 3000 1000 1000 | 5997 2149 2601 | 10396 4221 5056 | 15755 7324 8395 | 19796 11222 13313 | 24237 15033 18831
CNAF ACTUAL | 1570 400 510 | 3000 1000 ? | | | | |
Relative contingency | - | 0% | 20% | 30% | 40% | 50% | 50%
Absolute contingency | - | 0 0 0 | 1199 429.8 520.2 | 3119 1266 1517 | 6302 2929 3358 | 9898 5611 6656 | 12119 7516 9416
Hard core (TOTAL - CONTINGENCY) | - | 3000 1000 1000 | 4797 1719 2081 | 7277 2954 3539 | 9453 4394 5037 | 9898 5611 6656 | 12119 7516 9416
INFN T1 P2P plans (CPU DISK TAPE triplets in slide order):
INFN T1 P2P 2005 | 1800 850 850 | 2400 1200 1000 | 5500 2500 2100 | 8000 4000 4100 | 11500 5800 6000
INFN T1 P2P 2007 | - - - | 1300 500 650 | 4500 2000 2100 | 6500 3200 3300 | 10000 5000 5000
INFN T1 P2P 2007 v3 | 3000 1300 1500 | 5500 2500 2600 | 8500 4100 4200 | 12000 6800 7100 | 16000 9500 11000
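The contingency rows of the plan can be reproduced from the CNAF TOTAL (PLAN) CPU figures and the relative contingency percentages (0%, 20%, 30%, 40%, 50%, 50% for 2007 to 2012). The following is a minimal sketch of that arithmetic; the half-up integer rounding is an assumption chosen to match the slide, and the names are hypothetical.

```python
# CPU plan in KSI2K and relative contingency in percent, as listed in the table.
PLAN_CPU = {2007: 3000, 2008: 5997, 2009: 10396, 2010: 15755, 2011: 19796, 2012: 24237}
REL_CONTINGENCY = {2007: 0, 2008: 20, 2009: 30, 2010: 40, 2011: 50, 2012: 50}


def contingency(plan, percent):
    """Absolute contingency, rounded half-up with integer arithmetic."""
    return (plan * percent + 50) // 100


abs_contingency = {y: contingency(PLAN_CPU[y], REL_CONTINGENCY[y]) for y in PLAN_CPU}
# "Hard core" is the plan minus the contingency.
hard_core = {y: PLAN_CPU[y] - abs_contingency[y] for y in PLAN_CPU}

for year in PLAN_CPU:
    print(year, abs_contingency[year], hard_core[year])
```

The hard-core values computed this way can differ by one unit from the slide in some years (2008 and 2012), presumably because the original spreadsheet worked with unrounded plan values.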
SI2K frozen
• SI2K is the benchmark used up to now to measure the computing power of all the HEP experiments
– Computing power requested by experiment
– Computing power provided by a Tier-[0,1,2]
• SI2K is the nickname for the SPEC CPU Int 2000 benchmark
– Came after Spec89, Spec Int 92 and Spec Int 95
– Declared obsolete by SPEC in 2006
– Replaced by SPEC with CPU Int 2006
Transition problem
• Impossible to find published SPEC Int 2000 results for the new processors (e.g. the not so new Clovertown 4-core)
• Impossible to find published SPEC Int 2006 results for old processors (before 2006)
– E.g. old P4 Xeon, P4, AMD 2xx
• You can’t convert from SI2000 to SI2006, but the ratio for the x86 architecture is in the 137 – 172 range
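Because the ratio is a range rather than a constant, a published SI2000 score maps only to a band of SI2006 values. The sketch below shows the width of that band. It reads the ratio as SI2000 score divided by SI2006 score, which is an assumption about the slide's convention, and the input score of 3000 is purely illustrative.

```python
# Ratio SI2000 / SI2006 quoted for x86 (assumed direction of the ratio).
RATIO_MIN, RATIO_MAX = 137, 172


def si2006_band(si2000):
    """Return the (lowest, highest) SI2006 value compatible with an SI2000 score."""
    return si2000 / RATIO_MAX, si2000 / RATIO_MIN


low, high = si2006_band(3000)  # 3000 is a hypothetical SI2000 score
print(f"SI2000 = 3000 corresponds to SI2006 between {low:.1f} and {high:.1f}")
print(f"relative width of the band: {(high - low) / low:.0%}")
```

The band is about a quarter of its lower edge wide, which is why a single conversion factor is not usable for site accounting.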
The SI2K inflation
• The main problem with SI2000 in our community: it is no longer proportional to the performance of HEP codes (as it used to be)
• You can buy processors with a huge SI2K number but with a smaller increase in real performance
Nominal SI vs real SI
• SI2K results for the latest-generation processors are affected by inflation
• So CERN (and FZK) started to use a new currency: SI2K measured with “gcc”, the GNU C compiler, using two flavours of optimization
– High tuning: gcc -O3 -funroll-loops -march=$ARCH
– Low tuning: gcc -O2 -fPIC -pthread
Nominal SI vs real SI
• CERN proposal: use as site rating the “Real SI”, obtained from the SI measured with gcc-low and increased by 50%
– Actually this makes sense only for a short period of time and for the latest generation of processors
• Run n copies in parallel
– Where n is the number of cores in the worker node
– To take into account the drop in performance of a multicore machine when fully loaded
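The "run n copies in parallel" step can be sketched with the standard library alone. This is only an illustration of the loading pattern, not the SPEC harness: the workload is a placeholder Python loop, `run_copies` is a hypothetical helper, and `os.cpu_count()` reports logical CPUs, which can exceed the number of physical cores when Hyperthreading is on.

```python
import os
import subprocess
import sys
import time


def run_copies(cmd, n=None):
    """Start n identical processes at once and wait for all of them."""
    if n is None:
        n = os.cpu_count() or 1  # logical CPUs, not necessarily physical cores
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    return_codes = [p.wait() for p in procs]
    elapsed = time.perf_counter() - start
    return return_codes, elapsed


# Placeholder workload: a short CPU-bound loop, not a SPEC component.
WORKLOAD = [sys.executable, "-c", "sum(i * i for i in range(200000))"]
codes, seconds = run_copies(WORKLOAD, n=2)
print(f"{len(codes)} copies finished in {seconds:.2f} s, exit codes {codes}")
```

Comparing the elapsed time for n = 1 with the time for n equal to the core count gives a rough view of how much a fully loaded node slows down each copy.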
Too many SI2K
• Take as an example a worker node with two Intel Woodcrest dual core 5160 at 3.06 GHz
• SI2K nominal: 2929 – 3089 (min – max)
• SI2K sum on 4 cores: 11716 - 12536
• SI2K gcc-low: 5523
• SI2K gcc-high: 7034
• SI2K gcc-low + 50%: 8284
Even more
• Actually all the gcc results in the previous slide are on i386 (32 bit)
• If you would like to know how your code runs on a 64-bit machine, you can measure SPEC Int 2000 with gcc on x86_64
• So, for the worker node with two Intel Woodcrest dual core 5160 at 3.06 GHz:
• SI2K nominal: 2929 – 3089 (min – max)
• SI2K on 4 cores: 11716 - 12536
• SI2K gcc-low: 6021
• SI2K gcc-high: 6409
• SI2K gcc-low + 50%: 9031
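The "gcc-low + 50%" figures for the Woodcrest node can be reproduced as follows. This is a sketch under two assumptions: the slide values look truncated rather than rounded, so integer division is used, and only the lower nominal bound is summed, because the quoted upper sum (12536) is not equal to 4 × 3089 = 12356.

```python
# Figures quoted for the dual Woodcrest 5160 worker node (4 cores in total).
CORES = 4
NOMINAL_MIN = 2929
GCC_LOW = {"i386": 5523, "x86_64": 6021}


def real_si(gcc_low):
    """CERN-style rating: gcc-low result increased by 50%, truncated to an integer."""
    return gcc_low * 3 // 2


nominal_sum = CORES * NOMINAL_MIN
for arch, low in GCC_LOW.items():
    rating = real_si(low)
    print(f"{arch}: real SI = {rating}, nominal sum / real SI = {nominal_sum / rating:.2f}")
```

Even after the 50% uplift, the summed nominal score remains roughly 30% to 40% above the rating, which is the "inflation" the slides describe.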
A scale factor
• All these numbers would be merely annoying in a world with a single architecture, in which only the clock improves over time
• You would be able to find a fixed ratio between all those numbers
• But in the real world, the ratio depends on the CPU producer (Intel vs AMD) and on the processor generation (old Xeon vs new “core” Xeon)
The nominal SI2K
Big differences between Intel and AMD when SI2K/GHz is plotted
Which is better?
• I started to measure the performance of HEP codes on several machines
• The goal was to find a commercially maintained benchmark to replace SI2K
• I compared HEP code with
– SI2K published results
– SI2K measured with gcc and “CERN” tuning
– SI2006 and SI2006 rate published results
– SI2006 and SI2006 rate with gcc4 (32 and 64 bit)
Babar TierA Results
[Figure "Babar Stroili": benchmark ratio (0% to 200%) for SI2K, SI2K CERN, SI2006, SI2006 gcc and BABAR on Opteron 2218, Opteron 275, Opteron 265, Xeon 5355, Xeon 5345, Xeon 5160, Xeon 2.8, Xeon 2.4 and PIII 1.26]
• If you normalize by core and clock, all new processors have the same performance
• That performance is double that of the older-generation CPUs
• SI2006 matches this pattern (published and gcc ratio constant)
• SI2000-cern is better than SI2K nominal
• SI2000 clearly doesn’t work
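Normalizing "by core and clock" means dividing a node-level score by the number of cores times the clock frequency. A minimal sketch, using the gcc-low i386 figure quoted earlier for the dual Woodcrest 5160 node (5523 on 4 cores at 3.06 GHz); the function names are hypothetical and the choice of score is an assumption, since the slides do not say which raw numbers went into the plot.

```python
def per_core_per_ghz(score, cores, ghz):
    """Score normalized by core count and clock frequency."""
    return score / (cores * ghz)


def relative_to(reference, value):
    """Express a normalized score as a percentage of a reference machine."""
    return 100.0 * value / reference


woodcrest = per_core_per_ghz(5523, cores=4, ghz=3.06)
print(f"Woodcrest 5160 node: {woodcrest:.1f} SI2K (gcc-low) per core per GHz")
print(f"relative to itself: {relative_to(woodcrest, woodcrest):.0f}%")
```

Applying the same normalization to the HEP application throughput and to each benchmark, then comparing the two percentages per machine, gives the kind of ratio shown in the plot.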
CMS sw SIM and Pythia
• CMS Monte Carlo simulation (32 bit) and Pythia (64 bit) show the same performance once normalized
• Both SPEC Int 2006 published and SPEC Int 2006 with gcc show the same behaviour
• SI2K published does not match HEP software
• SI2K cern is better, but not as good as SI2006
Atlas
• Here 100% is the Xeon 5160
• Few results for SI2006+gcc, but no difference from CMS and Babar
• Few results also for SI2006 published, because of several old architectures
• SI2K+gcc is not bad
• SI2K published heavily overestimates the new Xeon
• The Atlas simulation, once normalized, performs the same on the new Intel “core” or AMD “Opteron” (like CMS, Babar)
Many gaps
• Easy to find published SPEC results
– But only for new machines
• Difficult to measure:
– Not easy to get a machine on loan from a server reseller or producer
– Not easy to borrow a machine from colleagues
– Always for short periods of time
– A SPEC run can last 15-20 hours
• Need a set of dedicated worker nodes to make SPEC and HEP application measurements
– The set of WNs should be available to other INFN people who want to make similar measurements
Cache
• In the 80s the latency was 3-10 clock cycles
• Now the latency is 1000s of clock cycles
• Importance of the cache architecture
– 1st level, 2nd level, 3rd level
– Cache latency
– Cache bandwidth
– Shared or exclusive?
Load transactional
• Performance doesn’t drop on the new 4-core processor
• Clovertown drops with respect to Harpertown
• A dual-core processor keeps up only to Load3
Perf/watt
• AMD Barcelona at 65 nm: performance per watt similar to Intel Xeon at 45 nm
Cache behaviour
• The 54xx has lower latency even with a bigger cache
• The 3 processors behave very differently in the 4 MB to 64 MB range
• If your (HEP) application works in this range you will see a big change in performance when changing processor
Memory Intel vs AMD
• Access times are very similar
• At 1 GB (the typical footprint of a HEP application) the new AMD behaves better
• But the new Xeon 54xx are much better than the 53xx
Mem Intel vs AMD
• Who is faster?
• It depends on the block size
• In the red zones Intel is better
• In the green zone AMD is better
Cache behaviour
• We need to study the behaviour of typical HEP applications
– Simulation, event generation, reconstruction, analysis
– To understand how to write more efficient applications
Power issues
• Power consumption changes from one processor to another
– Clock, High-K dielectric, Active Power Management, clock throttling
HT on or off
• Turning off Hyperthreading causes a 10% drop in performance but also a 20% drop in power consumption
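The net effect on efficiency follows directly from the two percentages. A short worked example, assuming both drops are measured against the same Hyperthreading-on baseline:

```python
# Relative values with Hyperthreading off, baseline (HT on) = 1.0
perf_ht_off = 1.0 - 0.10   # 10% lower performance
power_ht_off = 1.0 - 0.20  # 20% lower power consumption

perf_per_watt_gain = perf_ht_off / power_ht_off
print(f"performance per watt with HT off: {perf_per_watt_gain:.3f} x baseline")
```

Under that assumption, performance per watt improves by about 12.5% with Hyperthreading off, at the price of 10% less throughput per box.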
What about HEP?
• Need to measure the power usage of HEP applications
• Example: a big Tier2 with 500 boxes needs 100 kW
– Like the whole computing centre (CED) of INFN Padova
– About 800 MWh in one year
– Energy cost of 0.12 Euro per kWh → energy bill of about 100 kEuro/year
– A 10% improvement in power efficiency means 10 kEuro/year in savings
– And savings on the infrastructure (power distribution, UPS, cooling)
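The Tier2 example can be checked with a few lines of arithmetic. The sketch assumes the 100 kW load is constant over the whole year (8760 hours); the slide rounds the results to about 800 MWh and 100 kEuro.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assuming continuous operation

boxes = 500
power_kw = 100.0
price_eur_per_kwh = 0.12

watts_per_box = power_kw * 1000 / boxes
energy_mwh = power_kw * HOURS_PER_YEAR / 1000
cost_keur = power_kw * HOURS_PER_YEAR * price_eur_per_kwh / 1000
saving_keur = 0.10 * cost_keur  # 10% better power efficiency

print(f"{watts_per_box:.0f} W per box")
print(f"{energy_mwh:.0f} MWh/year, {cost_keur:.1f} kEuro/year")
print(f"10% efficiency gain saves {saving_keur:.1f} kEuro/year")
```

The exact figures (876 MWh and about 105 kEuro per year) are slightly above the rounded values on the slide, and the saving is correspondingly about 10.5 kEuro per year.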
Power meter
• Need a device to measure voltage and current
• And logging capabilities
• E.g. Fluke 1735
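Once power readings are logged, the energy used by a benchmark run is the time integral of the power. The sketch below integrates a list of (time, power) samples with the trapezoidal rule. The sample values are invented for illustration, and the log format is an assumption that does not describe the export format of any specific meter. For AC loads the logged quantity should be real power, since voltage times current alone gives apparent power.

```python
def energy_kwh(samples):
    """Integrate (seconds, watts) samples with the trapezoidal rule; returns kWh."""
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J


# Hypothetical log: one reading per hour, in watts.
log = [(0, 200.0), (3600, 220.0), (7200, 210.0)]
print(f"energy over the run: {energy_kwh(log):.3f} kWh")
```

Dividing the number of events processed during the run by this energy would give an events-per-kWh figure for comparing worker nodes.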
Financial request
• Need to buy a new worker node each time a new processor is released in the dual-processor market segment
– Only if significant new features are present
– One or two each for Intel and AMD per year
– 4 kEuro each (dual processor, 2 GB/core, 1 disk)
– 2 boxes to start with
Manpower
• Padova:
– Michele Michelotto (Primo Tecnologo): 75%
– Alberto Crescente (CTER): 30%
– Roberto Ferrari (CTER): 30%
• Ferrara:
– Alberto Gianoli (Primo Tecnologo): 20%
• Bologna:
– Franco Brasolin (CTER): 20%
International Outlook
• HEPiX is an organization in which people from HEP computing centres meet twice a year (Spring in Europe, Fall in the USA)
• IHEPCCC asked HEPiX to form two working groups to study storage and CPU benchmarking
• Real work started at the end of 2007
• CPU group chaired by H. Meinhard (CERN)
Milestone
• 2008
– Understand SPEC 2006. Propose a new benchmark to replace SI2K
– Measure the performance of the current architectures for Monte Carlo SIM (evt/sec vs SPEC)
• 2008/2009
– Power performance
• 2009
– Cache profiling