Hugues Brun, Suzanne Gascon-Shotkin, Morgan Lethuillier ... · H → γγ sensitivity studies using...

IPNL-PKU Collaboration on ttH at CMS FCPPL - 22/05/2018, Marseille Nicolas Chanon - IPNL, CNRS/IN2P3


Page 1:

H → γγ sensitivity studies using RooStats

H → γγ W.G. meeting · Nicolas Chanon, ETH · Gregory Schott, KIT

Hugues Brun, Suzanne Gascon-Shotkin, Morgan Lethuillier, IPNL

ETH Zurich

11/02/2011

Nicolas Chanon H → γγ sensitivity studies using RooStats 1 / 7

IPNL-PKU Collaboration on ttH at CMS

FCPPL - 22/05/2018, Marseille

Nicolas Chanon - IPNL, CNRS/IN2P3

Page 2:

IPNL-PKU collaboration

N. Chanon for IPNL-PKU collaboration - 2

IPNL-PKU is the former IPHC-PKU collaboration

- The collaboration started in 2014 and was formalised in 2015 within the FCPPL framework
- In 2016 and 2017: Ms. Jing Li visited IPHC for 3 months twice (supported by FCPPL, IPHC, PKU)
- End of 2017 and beginning of 2018: Junho Lee visited IPNL for 1 week twice, with support from FCPPL


Goals of the collaboration

- New analysis techniques for CMS ttH multilepton analysis
- Development of the Matrix Element Method (MEM)
- Combination of MEM with multivariate techniques (NN, BDT)
- Development of new tools with Deep Learning

Page 3:

ttH multilepton

CMS HIG-17-004


Targeting final states with electrons and muons

- 2 same-sign leptons: ≥4 jets, ≥1 b-tag (same sign required to reduce Drell-Yan and ttZ)
- 3 leptons: ≥2 jets, ≥1 b-tag
- 4 leptons: same as 3 leptons, veto H→4ℓ (dedicated H→ZZ analysis)
- Veto presence of 𝝉h (dedicated H→𝝉𝝉 analysis, see later)


Analysis categories:

Figure 42: Diagram of all event categories in the analysis. Categories are based on lepton multiplicity and flavor, b-jet composition, and the sign of the sum of the lepton charges.

[Plots: event yields per category with Data/pred. ratio panels (stat. unc., total unc.); CMS Preliminary, 35.9 fb⁻¹ (13 TeV). Left: Data, ttH, TTW, TTZ, Rares, Fakes, Flips in the categories ee−, ee+, eµ bl−, eµ bl+, eµ bt−, eµ bt+, µµ bl−, µµ bl+, µµ bt−, µµ bt+. Right: Data, ttH, TTW, TTZ, WZ, Rares, Fakes in the categories bl−, bl+, bt−, bt+.]

Figure 43: Splitting in categories for the 2lss (left) and 3l (right) channels.

Analysis sensitivity:

- 2ℓss and 3ℓ categories: train 2 kinematic BDTs, against ttbar and ttW/Z
- Map 2D BDTs into 1D discriminant (group into bins with similar s/b)

Page 4:

ttH multilepton: backgrounds

CMS HIG-17-004


Reducible: mainly tt+jets

- Shape obtained from data
- O(30%) uncertainty
- Jets faking leptons: fake rate computed from QCD control region with loosened identification
- Charge mis-assignment (2ℓss only): flip rate from Z→ℓ±ℓ± data

Irreducible: tt+W/Z/γ*

- From Monte Carlo
- O(10%) uncertainty
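The data-driven estimate of the non-prompt background can be illustrated with a small sketch. It assumes the commonly used fake-factor weighting, in which sideband events containing leptons that pass the loosened identification but fail the tight one are reweighted by f/(1-f) per failing lepton, with an alternating sign for events with several failing leptons. The function name, the sign convention, and the numerical fake rates are assumptions made for illustration and are not taken from the slides.

```python
def fake_factor_weight(fake_rates_of_failing_leptons):
    """Weight applied to a sideband event (hypothetical helper).

    Takes one fake rate f (0 <= f < 1) per lepton that fails the tight
    selection. Returns (-1)^(n+1) * prod f_i / (1 - f_i), so an event with
    one failing lepton gets f/(1-f) and an event with two failing leptons
    gets a negative weight that corrects for double counting.
    """
    if not fake_rates_of_failing_leptons:
        raise ValueError("a sideband event needs at least one failing lepton")
    weight = -1.0
    for f in fake_rates_of_failing_leptons:
        if not 0.0 <= f < 1.0:
            raise ValueError("fake rate must satisfy 0 <= f < 1")
        weight *= -f / (1.0 - f)
    return weight


# Illustrative fake rates only; real values are measured in the control region.
single = fake_factor_weight([0.2])
double = fake_factor_weight([0.2, 0.5])
```

Summing these weights over the sideband events gives the predicted non-prompt yield in the signal region under the stated assumptions.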

ttH, multilepton

• Select events with ℓ±ℓ± or ≥3ℓ, plus jets and b-tags.
• Residual backgrounds are mainly
  – tt + W/Z/γ* production: irreducible except for jets & ν's.
    • Taken from theory predictions, with O(10%) uncertainty
  – reducible backgrounds, mostly from tt + jets with non-prompt leptons or charge mis-assignment
    • Estimated from data, with O(30%) uncertainty

[Diagrams: ttH → 3ℓ + X, ttZ → 3ℓ + X, tt → 3ℓ + X]

Moriond EWK, 2017, G. Petrucciani (CERN)


Page 5:


ttH multilepton discriminants

CMS HIG-17-004


3ℓ ttbar BDT

3ℓ ttV BDT

3ℓ

- 3ℓ vs ttW/Z: Includes Matrix Element Method likelihood ratio of ttH vs ttW+ttZ

[Plots: hadronic top tagger BDT score and H→lv+jet(s) tagger BDT score, ℓ±ℓ± channel, post-fit (SM prediction), with Data/pred. ratio panels; processes shown: Data, ttH, ttW, ttZ, WZ, Rares, W±W±, Conv., Non-prompt, Charge mis-m., Total unc.; CMS Preliminary, 35.9 fb⁻¹ (13 TeV)]

Figure 3: Distribution of the hadronic top and Higgs jet tagger BDT scores in the 2LSS channel. The distributions are shown after the fit to the data, with all processes constrained to the SM expectation.

[Plot: MEM ln(LR), 3l channel, post-fit (SM prediction), with Data/pred. ratio panel; processes shown: Data, ttH, ttW, ttZ, WZ, Rares, Conv., Non-prompt, Total unc.; CMS Preliminary, 35.9 fb⁻¹ (13 TeV)]

Figure 4: Distribution of the likelihood ratio of matrix element weights in the 3L channel. The distributions are shown after the fit to the data, with all processes constrained to the SM expectation.

3ℓ MEM

Page 6:

ttH multilepton: Matrix Element Method

The Hj discriminator is designed to identify jets originating from the Higgs decay products. The classifier is trained against a background of jets in ttW/ttZ events in the 2LSS category, and uses jet identification (CSV discriminator, quark-gluon jet likelihood) and geometric (ΔR with respect to the leptons) properties as input variables. It is not evaluated on jets compatible with top decay products according to the previous discriminator.

In the 3L event category only, the kinematic variables listed above are complemented by matrix element weights. A weight $w_{i,a}$ is computed for each hypothesis $a$ (where $a$ is either ttH, ttW, or ttZ) and for the event $i$ as follows:

$$
w_{i,a}(\Phi') = \frac{1}{\sigma_a} \int d\Phi_a \cdot \delta^4\Big(p_1^\mu + p_2^\mu - \sum_{k \geq 2} p_k^\mu\Big) \cdot \frac{f(x_1, \mu_F)\, f(x_2, \mu_F)}{x_1 x_2 s} \cdot \big|\mathcal{M}_a(p_k^\mu)\big|^2 \cdot W(\Phi' | \Phi_a),
$$

where $\sigma_a$ is the cross section; $\Phi'$ are the 4-momenta of the reconstructed particles; $d\Phi_a$ is the element of phase space corresponding to unmeasured quantities with momentum conservation enforced; $f(x, \mu_F)$ are the parton density functions, computed using NNPDF3.0 LO [26]; $|\mathcal{M}_a|^2$ is the squared matrix element, computed with MADGRAPH5_aMC@NLO standalone [27] at LO in the narrow-width approximation for $t$, $\bar{t}$ and H; and $W$ are the transfer functions for jet energy and $E_T^{miss}$, relating parton to reconstructed quantities, estimated from simulated ttH events.
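The structure of such a weight (a squared matrix element folded with a transfer function, integrated over the unmeasured quantities and normalised by the cross section) can be illustrated with a deliberately simplified one-dimensional toy. Everything in it is an assumption made for illustration: the Breit-Wigner shape standing in for the squared matrix element, the Gaussian transfer function, the omission of parton density functions, and the use of plain uniform Monte Carlo sampling instead of VEGAS. It is not the CMS framework.

```python
import math
import random


def breit_wigner(e, mass, width):
    """Toy stand-in for |M_a|^2: a resonance peak in the parton energy e."""
    return 1.0 / ((e * e - mass * mass) ** 2 + (mass * width) ** 2)


def transfer(e_reco, e_parton, resolution):
    """Gaussian transfer function W(e_reco | e_parton)."""
    z = (e_reco - e_parton) / resolution
    return math.exp(-0.5 * z * z) / (resolution * math.sqrt(2.0 * math.pi))


def toy_mem_weight(e_reco, mass, width=10.0, resolution=15.0,
                   e_min=0.0, e_max=400.0, n_points=20000, seed=1):
    """Normalised toy weight: int |M|^2 W de divided by int |M|^2 de.

    Both integrals use the same uniform sample points, so the integration
    volume cancels in the ratio.
    """
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_points):
        e = rng.uniform(e_min, e_max)
        m2 = breit_wigner(e, mass, width)
        numerator += m2 * transfer(e_reco, e, resolution)
        denominator += m2
    return numerator / denominator


# A reconstructed energy of 125 is more compatible with the mass-125
# hypothesis than with the mass-91 hypothesis (arbitrary toy units).
w_signal_like = toy_mem_weight(125.0, mass=125.0)
w_background_like = toy_mem_weight(125.0, mass=91.0)
```

The division by the integral of the toy matrix element plays the role of the $1/\sigma_a$ factor, which keeps weights under different hypotheses comparable.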

The two jets with the highest CSV tagging output are assigned to the two b quarks in the matrix element. Among the remaining jets, the pair with dijet mass closest to mW is selected. In ttH, for semileptonic decays of the Higgs daughters, the pair with lowest dijet mass is selected. If one or two jets needed to evaluate |Ma|² fail to be reconstructed, the weight is recovered by extending the integration phase space for the missing jets.

The final weight for each hypothesis a is taken as the average of the weights computed for each lepton and jet permutation. The MEM weights of signal and backgrounds are combined in a likelihood ratio that is used as an input variable to the BDT. Including the MEM weights in the BDT training against ttW/ttZ improves the background rejection power by about 10% for the three lepton category.
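The averaging over permutations can be sketched as follows. This is a minimal illustration, assuming that every ordered assignment of reconstructed objects to parton slots contributes with equal weight; the function names and the toy per-assignment weight are hypothetical and stand in for the full matrix element integration.

```python
from itertools import permutations


def average_weight_over_permutations(objects, n_slots, weight_for_assignment):
    """Average a per-assignment weight over all ordered assignments.

    objects: reconstructed objects (here simply numbers standing for jets).
    n_slots: number of parton slots to fill in the matrix element.
    weight_for_assignment: callable taking a tuple of n_slots objects.
    """
    assignments = list(permutations(objects, n_slots))
    if not assignments:
        raise ValueError("not enough objects for the requested parton slots")
    total = sum(weight_for_assignment(a) for a in assignments)
    return total / len(assignments)


# Toy usage: three jet energies, two parton slots, and a made-up weight
# that favours assignments with a harder first jet.
jets = [50.0, 80.0, 120.0]
average = average_weight_over_permutations(jets, 2, lambda a: a[0])
```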

The plane spanned by the outputs of the two BDT classifiers is binned using a method based on the likelihood ratio between signal and background. Starting from a fine binning allowed by considerations on the signal and background statistical uncertainties in each bin, the joint likelihood is approximated by the signal-to-background ratio in each bin, and then smoothed using Gaussian kernels. Each background event is associated to the value of the likelihood ratio in the bin the event belongs to; the cumulative distribution of the likelihood ratio for background events is then partitioned, based on its quantiles, in a certain number of regions of equal background content. The number of regions is chosen using a recursive application of the k-means clustering algorithm [28]. The resulting regions are finally interpreted as bins of a one-dimensional distribution, which features in a natural way a roughly constant number of background events and an increasing number of signal events.
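The quantile step of this procedure can be sketched in a few lines. The sketch assumes that each background event already carries the smoothed likelihood-ratio value of its two-dimensional bin, and it takes the number of regions as an input rather than deriving it with k-means; the function names are hypothetical.

```python
from bisect import bisect_right


def quantile_edges(background_lr, n_regions):
    """Inner edges that split the background into regions of equal content.

    background_lr: one likelihood-ratio value per background event.
    Returns n_regions - 1 edges in increasing order.
    """
    values = sorted(background_lr)
    if n_regions < 1 or len(values) < n_regions:
        raise ValueError("need at least one event per region")
    return [values[(i * len(values)) // n_regions] for i in range(1, n_regions)]


def region_index(lr_value, edges):
    """Index of the one-dimensional bin that an event falls into."""
    return bisect_right(edges, lr_value)


# Toy usage: 100 background events with distinct likelihood-ratio values.
background = [i / 100 for i in range(100)]
edges = quantile_edges(background, 4)
counts = [0, 0, 0, 0]
for value in background:
    counts[region_index(value, edges)] += 1
```

With distinct likelihood-ratio values the background content per region is flat by construction, while signal events, which populate high likelihood-ratio values, accumulate in the last regions.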

The distributions obtained in this way for each category are simultaneously fit to extract the signal normalization. Figures 2, 3, 4, and 5 show distributions of event observables and BDT classifier outputs in data, compared to the predicted background processes.

5 Signal and background modeling and systematic uncertainties

Signal ttH events are generated using the MADGRAPH5_aMC@NLO package (version 5.222) [27], which includes up to one additional hadronic jet at next-to-leading order (NLO) QCD accuracy.

Evaluate MEM weights under ttH, ttW, ttZ/γ* hypotheses:

- Custom framework in C++
- If jets are needed at ME level and are not reconstructed ("missing jets"): included as supplementary phase space to integrate
- MEM weight is the average weight of all possible lepton, jet, b-jet permutations

MEM weight:

- Integration with VEGAS
- Phase space enforcing 4-momentum conservation
- Matrix Element from Madgraph
- Parton distribution function from LHAPDF
- Transfer functions from CMS simulation

Jing Li (PKU), NC


Page 7:

ttH multilepton: MEM in 3l category

CMS HIG-16-022, HIG-17-004


Xavier COUBEZ ttH leptonic working meeting

Suggest to use likelihood instead of log weight


BDT 3l ttH vs ttV + MEM likelihood

BDT performances with MEM likelihood

Replacing log weight with MEM likelihood

(post ICHEP)

Moriond

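$$
-\log\left( \frac{\sigma_{ttZ}\, w_{ttZ} + k \cdot \sigma_{ttW}\, w_{ttW}}{\sigma_{ttH}\, w_{ttH} + \sigma_{ttZ}\, w_{ttZ} + k \cdot \sigma_{ttW}\, w_{ttW}} \right)
$$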

Already shown

Likelihood ratio of ttV vs ttH+ttV
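The discriminant on this slide reads as minus the logarithm of the ttV share, (σ_ttZ·w_ttZ + k·σ_ttW·w_ttW) divided by (σ_ttH·w_ttH + σ_ttZ·w_ttZ + k·σ_ttW·w_ttW). A minimal sketch of that reading follows. The function name is hypothetical, the cross-section values in the usage lines are placeholders rather than real numbers, and since the slide does not define the factor k it is treated here as a free parameter defaulting to 1.

```python
import math


def mem_likelihood_ratio(w_tth, w_ttz, w_ttw, xs_tth, xs_ttz, xs_ttw, k=1.0):
    """-log of the ttV share of the cross-section-weighted MEM weights.

    Returns 0 for an event with no ttH weight and grows as the event
    becomes more ttH-like.
    """
    background = xs_ttz * w_ttz + k * xs_ttw * w_ttw
    total = xs_tth * w_tth + background
    if background <= 0.0 or total <= 0.0:
        raise ValueError("weights and cross sections must give positive sums")
    return -math.log(background / total)


# Placeholder inputs in arbitrary units (not real cross sections).
signal_like = mem_likelihood_ratio(3.0, 1.0, 0.0, 1.0, 1.0, 1.0)
background_like = mem_likelihood_ratio(0.0, 1.0, 1.0, 1.0, 1.0, 1.0)
```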


MEM weights under ttH, ttW, ttZ/γ* hypotheses

ttH weight ttW weight

ttZ weight

HIG-16-022 (ICHEP 2016):

- Improved discrimination by 10% in 3ℓ category
- Include log(weights) as input to a kinematic BDT trained against ttV

HIG-17-004 (Moriond 2017): include the likelihood ratio of ttH vs ttV weights inside the ttW/Z BDT

Jing Li (PKU), Xavier Coubez (formerly IPHC), Daniel Bloch (IPHC), NC

Page 8:


ttH multilepton results

arXiv:1803.05485, submitted to JHEP


Significance, combining ttH multilepton and tau analyses: 3.2σ observed (2.8σ expected)

ttH observation

arXiv:1804.02610, accepted by Phys. Rev. Lett.

Full combination (Run 1, Run 2 ttH multilepton, 𝝉h+X, γγ, ZZ and bb): 5.2σ observed (4.2σ expected)

Our collaboration presented at

- Moriond QCD 2017 (NC)
- Poster at LHCP 2017 (Jing Li)

Page 9:

MEM in single top + Z

Phys. Lett. B 779 (2018) 358


tZq MEM weight

As a spin-off, include MEM in CMS tZq analysis with 2016 data

- tZq is a rare process in the SM
- Same MEM framework as developed for ttH multilepton analysis
- MEM is a powerful tool to use process kinematics: especially forward jet in tZq
- Include MEM weights and MEM as a kinematic fit
- Observed significance: 3.7σ (3.2σ expected)

MEM improves the analysis significance by 20%

Nicolas Tonon (IPHC), Jeremy Andrea (IPHC), NC


Page 10:

Going further with MEM (2017-)

(Preliminary)

Studying MEM as kinematic fit

- Given reconstructed particles in the event, a MEM score can be attributed to the most probable kinematic configuration
- Also obtain unreconstructed quantities: neutrino momenta, etc.
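The idea of using the MEM as a kinematic fit can be illustrated with a one-dimensional toy: instead of integrating the MEM integrand over an unmeasured quantity, one searches for the value that maximises it and keeps both the maximum (the score) and the maximising value (the inferred quantity). The sketch below is an assumption-laden illustration and not the framework used in the studies: the Breit-Wigner and Gaussian shapes, the function names, and the plain grid scan used as the maximiser are all chosen here for simplicity.

```python
import math


def log_integrand(e_parton, e_reco, mass, width=10.0, resolution=15.0):
    """Log of a toy MEM integrand: resonance shape times Gaussian transfer."""
    resonance = 1.0 / ((e_parton ** 2 - mass ** 2) ** 2 + (mass * width) ** 2)
    z = (e_reco - e_parton) / resolution
    return math.log(resonance) - 0.5 * z * z


def kinematic_fit(e_reco, mass, e_min=1.0, e_max=400.0, n_steps=4000):
    """Grid scan for the parton energy that maximises the toy integrand.

    Returns (best parton energy, log of the integrand at that energy).
    """
    best_e = e_min
    best_score = -math.inf
    for i in range(n_steps + 1):
        e = e_min + (e_max - e_min) * i / n_steps
        score = log_integrand(e, e_reco, mass)
        if score > best_score:
            best_e, best_score = e, score
    return best_e, best_score


# A jet reconstructed at 110 under a mass-125 hypothesis: the inferred
# parton energy is pulled from the measurement towards the resonance.
inferred_energy, fit_score = kinematic_fit(110.0, 125.0)
```

A grid scan only works for a toy with one free quantity; with several unmeasured quantities a dedicated minimiser is needed, which is why the choice of algorithm matters in practice.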


[Plot: inferred b-quark energy (B-jet Energy), 3l_2b_2j, Had. top and Lep. top (Delphes)]

[Plot: score of kin. fit, log(w_Kin,ttZ), ttZ in 3ℓ category, Delphes simulation; legend: Annealing -49.3, Max Int -42.8, Minuit2 -35.0, SubGradient -34.4, Simplex -34.2]

- Studies on behaviour of MEM function

- Explore several minimisation algorithms

- Target: phenomenology publication

Jing Li (PKU), NC


Page 11:

MEM and Deep Learning

(Exploratory work)

- Infer unknown parton-level quantities needed to compute the MEM, using neural network

- First studies performed with custom code (Jing Li): showed the need for more powerful tools

- Moving to modern Deep Learning tools : Keras + Tensorflow (Junho Lee)

- As a first step, working on a regression of MEM : can be useful to accelerate evaluation of MEM in CMS analysis

Jing Li (PKU), Junho Lee (PKU), Qiang Li (PKU), NC


8.10 Artificial Neural Networks (nonlinear discriminant analysis)

[Diagram: input layer (x1 to x4 plus bias), hidden layer (five neurons plus bias), output layer (one neuron, y_ANN), with weighted connections between layers]

Figure 15: Multilayer perceptron with one hidden layer.

[Diagram: inputs from layer l−1 with their weights feed the sum Σ and the response function ρ of one neuron, which produces the output]

Figure 16: Single neuron j in layer l with n input connections. The incoming connections carry a weight of $w_{ij}^{(l-1)}$.

perceptron is the input layer, the last one the output layer, and all others are hidden layers. For a classification problem with $n_{var}$ input variables the input layer consists of $n_{var}$ neurons that hold the input values, $x_1, \ldots, x_{n_{var}}$, and one neuron in the output layer that holds the output variable, the neural net estimator $y_{ANN}$.

For a regression problem the network structure is similar, except that for multi-target regression each of the targets is represented by one output neuron. A weight is associated to each directional connection between the output of one neuron and the input of another neuron. When calculating the input value to the response function of a neuron, the output values of all neurons connected to the given neuron are multiplied with these weights.
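The forward pass described here, for a perceptron with one hidden layer and a single regression output, can be written out in plain Python. This is a minimal sketch: the tanh response function in the hidden layer, the linear output neuron, the function names, and the example weights are assumptions made for illustration, since the excerpt does not fix them.

```python
import math


def layer_output(inputs, weights, biases, response):
    """Outputs of one layer.

    weights[j][i] is the weight on the connection from input i to neuron j;
    each neuron applies the response function to its weighted sum plus bias.
    """
    return [
        response(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Single-output multilayer perceptron with one hidden layer."""
    hidden = layer_output(x, hidden_weights, hidden_biases, math.tanh)
    return layer_output(hidden, [output_weights], [output_bias], lambda s: s)[0]


# Two inputs, two hidden neurons, one regression target (made-up weights).
y = mlp_forward(
    [1.0, 2.0],
    hidden_weights=[[0.5, -0.25], [0.1, 0.2]],
    hidden_biases=[0.0, -0.5],
    output_weights=[2.0, 1.0],
    output_bias=0.3,
)
```

In the example the first hidden neuron receives 0.5·1.0 − 0.25·2.0 = 0 and the second 0.1·1.0 + 0.2·2.0 − 0.5 = 0, so both hidden outputs vanish and the result reduces to the output bias.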

MEM

Working on new hybridisations of MEM and multivariate techniques

Page 12:


Perspectives


- Involvement in CMS ttH multilepton analysis with 2017 data and planned contribution to Run 2 legacy paper

- Working on a phenomenology publication to expose new analysis techniques, with MEM and Deep Learning

- Possible involvement in CMS tHq analysis with 2017 data

- Consider later possible contributions to VBS WW analysis

- FCPPL support in 2018 will allow a 3-month stay of Junho Lee at IPNL in Autumn

- PKU, IPHC and IPNL are grateful for continued FCPPL support since the start of our collaboration

Page 13:

Back-up slides
