Reconstruction and identification of tau lepton decays to ......2 CMS detector The central feature...

74
EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (CERN) CERN-PH-EP/2013-037 2016/02/22 CMS-TAU-14-001 Reconstruction and identification of τ lepton decays to hadrons and ν τ at CMS The CMS Collaboration * Abstract This paper describes the algorithms used by the CMS experiment to reconstruct and identify τ hadrons + ν τ decays during Run 1 of the LHC. The performance of the algorithms is studied in proton-proton collisions recorded at a centre-of-mass en- ergy of 8 TeV, corresponding to an integrated luminosity of 19.7 fb -1 . The algorithms achieve an identification efficiency of 50–60%, with misidentification rates for quark and gluon jets, electrons, and muons between per mille and per cent levels. Published in the Journal of Instrumentation as doi:10.1088/1748-0221/11/01/P01019. c 2016 CERN for the benefit of the CMS Collaboration. CC-BY-3.0 license * See Appendix A for the list of collaboration members arXiv:1510.07488v2 [physics.ins-det] 19 Feb 2016

Transcript of Reconstruction and identification of tau lepton decays to ......2 CMS detector The central feature...

  • EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (CERN)

    CERN-PH-EP/2013-0372016/02/22

    CMS-TAU-14-001

    Reconstruction and identification of τ lepton decays tohadrons and ντ at CMS

    The CMS Collaboration∗

    Abstract

    This paper describes the algorithms used by the CMS experiment to reconstruct andidentify τ → hadrons + ντ decays during Run 1 of the LHC. The performance ofthe algorithms is studied in proton-proton collisions recorded at a centre-of-mass en-ergy of 8 TeV, corresponding to an integrated luminosity of 19.7 fb−1. The algorithmsachieve an identification efficiency of 50–60%, with misidentification rates for quarkand gluon jets, electrons, and muons between per mille and per cent levels.

    Published in the Journal of Instrumentation as doi:10.1088/1748-0221/11/01/P01019.

    c© 2016 CERN for the benefit of the CMS Collaboration. CC-BY-3.0 license∗See Appendix A for the list of collaboration members

    arX

    iv:1

    510.

    0748

    8v2

    [ph

    ysic

    s.in

    s-de

    t] 1

    9 Fe

    b 20

    16

    http://dx.doi.org/10.1088/1748-0221/11/01/P01019http://creativecommons.org/licenses/by/3.0

  • 1

    1 IntroductionDecays of τ leptons provide an important experimental signature for analyses at the CERNLHC. Evidence for decays of the standard model (SM) Higgs boson (H) into ττ has been re-ported [1, 2], as have searches for neutral and charged Higgs bosons in decays to τ leptonsthat have special interest in the context of the minimal supersymmetric extension of the SM(MSSM) [3–8]. The CMS collaboration has published analyses of Drell–Yan (qq→ Z/γ∗ → ττ)and top quark pair production [9–11] in final states with τ leptons. Searches for supersymme-try, leptoquarks, W′ and Z′ bosons, as well as other non-SM Higgs bosons [12–17] benefit fromthe high performance τ reconstruction and identification capabilities of the CMS detector.

    With a mass of mτ = 1.777 GeV [18], the τ is the only lepton heavy enough to decay intohadrons (h), and it does so in about two thirds of the cases, typically into either one or threecharged pions or kaons and up to two neutral pions (π0), and one neutrino (ντ). The π0 mesondecays almost exclusively into γγ. In about 35% of the cases, τ leptons decay into an electronor muon and two neutrinos. The branching fractions for the main τ decay modes are givenin Table 1. The decays τ− → h− π0 ντ, τ− → h− π0 π0 ντ, and τ− → h− h+ h− ντ (with cor-responding channels for τ+) proceed via intermediate ρ(770) and a1(1260) meson resonances.The electrons and muons originating from τ decays are difficult to distinguish from electronsand muons produced directly in the primary proton-proton (pp) interaction, and are handledusing the standard CMS algorithms for electron and muon reconstruction and identification.The algorithms for τ reconstruction and identification presented in this paper focus on τ lep-ton decays to hadrons + ντ, that we refer to as “hadronic” τ decays and denote by τh. Thealgorithms provide the means for reconstructing individually the dominant τh decay modes.In comparing the energies of reconstructed τh candidates to their true energies, we refer to thecharged hadrons and neutral pions produced in the τ decay as “visible” τ decay products, andignore the ντ.

    Table 1: Approximate branching fractions (B) of different τ decay modes [18]. The generic sym-bol h− represents a charged hadron (either a pion or a kaon). Charge conjugation invariance isassumed in this paper.

    Decay mode Meson resonance B [%]τ− → e− νe ντ 17.8τ− → µ− νµ ντ 17.4τ− → h− ντ 11.5τ− → h− π0 ντ ρ(770) 26.0τ− → h− π0 π0 ντ a1(1260) 9.5τ− → h− h+ h− ντ a1(1260) 9.8τ− → h− h+ h− π0 ντ 4.8Other modes with hadrons 3.2All modes containing hadrons 64.8

    The mean lifetime of τ leptons at rest is 290× 10−15 s [18]. The distances that τ leptons travel be-tween their production and decay are small, but nevertheless significant compared to the trans-verse impact parameter and secondary-vertex resolution of the CMS tracking detector [19]. En-ergetic τ leptons originating from Z or SM Higgs boson decays typically traverse distances ofa few millimetres before decaying.

    The main challenge in identifying hadronic τ decays is distinguishing them from quark andgluon jet background. The cross section for multijet production from perturbative quantumchromodynamical (QCD) calculations exceeds by many orders of magnitude the rate at which

  • 2 2 CMS detector

    τ leptons are produced at the LHC. To reduce the background arising from quark and gluonjets, we exploit the fact that hadronic τ decays result in a lower particle multiplicity, and aremore collimated and isolated relative to other particles in the event. In some analyses, themisidentification of electrons or muons as τh candidates may constitute a sizeable problem,and dedicated algorithms have been developed to reduce this type of background.

    The performance of τh reconstruction and identification algorithms has been validated usingthe first LHC data recorded at

    √s = 7 TeV [20]. Since then, the algorithms have been further de-

    veloped, especially to improve their performance in dealing with additional inelastic pp inter-actions (pileup) that occur in the same bunch crossing as the hard scattering of interest. More-over, the rejection of backgrounds arising from misidentification of jets, electrons, and muonsas τh has improved significantly through the introduction of multivariate analysis (MVA) tech-niques. In this paper, we report on the performance of the improved algorithms used to analyzethe 8 TeV pp data at CMS, corresponding to an integrated luminosity of 19.7 fb−1.

    The paper is organized as follows. The CMS detector is described briefly in Section 2. Section 3describes the data and the Monte Carlo (MC) simulations used for studying the performanceof τh reconstruction and identification. The reconstruction of electrons, muons, and jets, alongwith various kinematic quantities is described in Section 4. The algorithms used for reconstruc-tion and identification of τh decays are detailed in Section 5. The performance of the algorithmsin simulated events is presented in Section 6. Sections 7–11 detail the validation of the algo-rithms with data. The results are summarized in Section 12.

    2 CMS detectorThe central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter,providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystalelectromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL),each composed of a barrel and two endcap sections, are positioned within the solenoid volume.Muons are measured and identified in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Extensive forward calorimetry complements the coverageprovided by the barrel and endcap detectors.

    The CMS tracker is a cylindrical detector of 5.5 m length and 2.5 m diameter, constructed en-tirely of silicon modules. It provides an active sensor area of about 200 m2 to reconstructcharged particles within the pseudorapidity range |η| < 2.5. The innermost region aroundthe interaction point, subject to the highest particle flux, is instrumented with silicon pixelsensors. The central part of the pixel detector consists of three cylindrical layers, installed attransverse radii of r = 4.4, 7.3, and 10.2 cm, which extend over a total length of 53 cm. Thecentral part is complemented by two forward endcap disks of radius 6 < r < 15 cm, located atlongitudinal distances |z| = 34.5 and 46.5 cm on either side of the interaction point. The cen-tral part of the silicon strip detector consists of ten cylindrical layers and twelve endcap disksthat surround the pixel detector volume. The cylindrical layers cover radial distances of up to108 cm and |z| < 109 cm, and the disks cover up to r < 113 cm and |z| < 280 cm. Tracks ofcharged hadrons are reconstructed with an efficiency of 75–95% that depends on the transversemomentum pT and η [21].

    The silicon tracker adds a significant amount of material in front of the ECAL, mainly becauseof the mechanical structure, the services, and the cooling system. Figure 1 shows, as a functionof η, the number of radiation lengths (X0) of material that particles produced at the interactionpoint must traverse before they reach the ECAL. This rises from about 0.4X0 at |η| ≈ 0 to about

  • 3

    2.0X0 at |η| ≈ 1.4, and decreases to about 1.3X0 at |η| ≈ 2.5. As a result, photons originatingfrom π0 → γγ decays have a high probability for converting to e+e− pairs within the volumeof the tracking detector.

    η-4 -3 -2 -1 0 1 2 3 4

    0t/X

    0

    0.5

    1

    1.5

    2

    2.5 Support Tube TOB PixelTEC TIB and TID Beam Pipe

    CMS Simulation

    Figure 1: The total material thickness (t) in units of radiation length X0, as a function of η,that a particle produced at the interaction point must traverse before it reaches the ECAL. Thematerial used for sensors, readout electronics, mechanical structures, cooling, and services isgiven separately for the silicon pixel detector and for individual components of the silicon stripdetector (“TEC”, “TOB”, “TIB and TID”) [21]. The material used for the beam pipe and for thesupport tube that separates the tracker from the ECAL is also shown separately.

    The ECAL is a homogeneous and hermetic calorimeter made of PbWO4 scintillating crystals. Itis composed of a central barrel, covering |η| < 1.48, and two endcaps covering 1.48 < |η| < 3.0.The barrel is made of 61 200 trapezoidal crystals of front-face transverse section 22×22 mm2,giving a granularity of 0.0174×0.0174 in η and azimuth φ, and a length of 230 mm (25.8X0).The crystals are organized in 36 supermodules, 18 on each side of η = 0. Each supermodulecontains 1700 crystals, covers π/9 radians in φ, and is made of four modules along η. Thisstructure has a few thin uninstrumented regions between the modules in η (at |η| = 0, 0.435,0.783, 1.131, and 1.479), and between the supermodules in φ (every π/9 radians). The crystalsare installed with a quasi-projective geometry, tilted by an angle of 3◦ relative to the projectiveaxis that passes through the centre of CMS (the nominal interaction point), to minimize thepassage of electrons or photons through uninstrumented regions. The endcaps consist of atotal of 14 648 trapezoidal crystals with front-face transverse sections of 28.62×28.62 mm2, andlengths of 220 mm (24.7X0). The small radiation length (X0 = 0.89 cm) and small Molièreradius (2.3 cm) of the PbWO4 crystals provide a compact calorimeter with excellent two-showerseparation.

    The HCAL is a sampling calorimeter, with brass as passive absorber, and plastic scintillatortiles serving as active material, and provides coverage for |η| < 2.9. The calorimeter cellsare grouped in projective towers of approximate size 0.087×0.087 in η × φ in the barrel and0.17×0.17 in the endcaps.

    The muon system is composed of a cylindrical barrel section, and two planar endcaps thatsurround the solenoid with about 25 000 m2 of detection planes. Drift tube (DT) and cathodestrip chamber (CSC) layers provide muon reconstruction, identification, and trigger capability

  • 4 3 Data samples and Monte Carlo simulation

    within |η| < 2.4. The muon system consists of four muon stations, located at different distancesfrom the centre of CMS, and separated by layers of steel plates. Drift tubes are installed inthe barrel region |η| < 1.2, where the muon rate is low and the magnetic field in the returnyoke is uniform. Each DT station contains eight layers of tubes that measure the position inthe transverse plane (r-φ), and four layers that provide position information in the r-z plane,except for the outermost station, which contains only eight r-φ layers. In the endcaps, wherethe muon rates as well as the background from neutron radiation are higher and the magneticfield is non-uniform, CSC detectors cover the region 0.9 < |η| < 2.4. Each CSC station containssix layers of anode wires and cathode planes to measure the position in the bending plane(precise in φ, coarse in r). The combination of DT and CSC detectors covers the pseudorapidityinterval |η| < 2.4 without any gaps in acceptance. The DT and CSC systems are complementedby a system of resistive-plate chambers (RPC) that provide precise timing signals for triggeringon muons within the region |η| < 1.6. Particles produced at the nominal interaction point musttraverse more than 10 and 15 interaction lengths (λ) of absorber material before they reach theirrespective innermost and outermost detection planes. This greatly reduces the contributionfrom punch-through particles.

    The first level of the CMS trigger system, based on special hardware processors, uses informa-tion from calorimeters and muon detectors to select the most interesting events in a fixed timeinterval of

  • 5

    the mass range 900–4000 GeV and 750–2500 GeV, respectively. The list of training samples iscomplemented by SM H → ττ events, generated using POWHEG. The QCD samples used forthe MVA training extend up to a scale of p̂T = 3000 GeV.

    [GeV]hτ

    TGenerated p

    20 40 60 80 100 120

    Rel

    ativ

    e yi

    eld

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    ττ → *γZ/

    CMSSimulation

    [GeV]hτ

    TGenerated p

    0 500 1000 1500

    Rel

    ativ

    e yi

    eld

    0

    0.01

    0.02

    0.03

    0.04

    ττ →Z'(2.5 TeV)

    CMSSimulation

    [GeV] jet

    TGenerated p

    100 200 300 400 500

    Rel

    ativ

    e yi

    eld

    2−10

    1−10

    1

    10

    210

    310

    410

    Multijets

    W+jets

    CMSSimulation

    Figure 2: Transverse momentum distributions of the visible decay products of τh decays, in(left) simulated Z/γ∗ → ττ events, (middle) Z′(2.5 TeV)→ ττ events, and (right) of quark andgluon jets in simulated W+jets and multijet events, at the generator level.

    The transverse momentum distribution of the visible τ decay products in simulated Z/γ∗ →ττ and Z′ → ττ events is shown in Fig. 2. The Z′ sample is generated for a mass of mZ′ =2.5 TeV, and used to study the efficiency to identify τh decays at high pT. The pT distributionof generator level quark and gluon jets in simulated W+jets and QCD multijet events is alsoshown in the figure. The jets are constructed using the anti-kT algorithm [35] with a distanceparameter of 0.5.

    On average, 21 inelastic pp interactions occur per LHC bunch crossing. Minimum bias eventsgenerated with PYTHIA are overlaid on all simulated events, according to the luminosity profileof the analyzed data.

    All generated events are passed through a detailed simulation of the CMS apparatus, based onGEANT4 [36], and are reconstructed using the same version of the CMS event reconstructionsoftware as used for data.

    Small differences between data and MC simulation are observed in selection efficiencies andin energy and momentum measurements of electrons and muons, as well as in the efficienciesfor electron, muon, and τh final states to pass the trigger requirements. These differences arecorrected by applying suitably-chosen weights to simulated events. The corrections are de-termined by comparing Z/γ∗ → `` events in simulation and data. Differences in responseand resolution of the missing transverse momentum in data and simulation are corrected asdescribed in Ref. [37].

    4 Event reconstructionThe information available from all CMS subdetectors is employed in the particle-flow (PF)algorithm [38–41] to identify and reconstruct individual particles in the event, namely muons,electrons, photons, and charged and neutral hadrons. These particles are used to reconstructjets, τh candidates, and the vector imbalance in transverse momentum in the event, referred toas ~pmissT , as well as to quantify the isolation of leptons.

    Electrons are reconstructed by matching tracks in the inner detector with energy depositions

  • 6 4 Event reconstruction

    in the ECAL [38, 42]. The tracks of electron candidates are reconstructed using a Gaussian-sum filter (GSF) [43] algorithm, which accounts for the emission of bremsstrahlung photonsalong the electron trajectory. Energy loss in bremsstrahlung is reconstructed by searching forenergy depositions in the ECAL located in directions tangential to the electron track. A multi-variate approach based on boosted decision trees (BDT) [44] is employed for electron identifi-cation [45]. Observables that quantify the quality of the electron track, the compactness of theelectron cluster in directions transverse and longitudinal to the electron track, and the compati-bility between the track momentum and the energy depositions in the ECAL are used as inputsto the BDT. Additional requirements are applied to reject electrons originating from photonconversions to e+e− pairs in detector material.

    The identification of muons is based on linking track segments reconstructed in the silicontracking detector and in the muon system [46]. The matching between track segments is doneoutside-in, starting from a track in the muon system, and inside-out, starting from a trackreconstructed in the inner detector. In case a link can be established, the track parametersare refitted using the combined hits in the inner and outer detectors, with the resulting trackreferred to as a global muon track. Quality criteria are applied on the multiplicity of hits, onthe number of matched segments, and on the fit quality of the global muon track, quantifiedthrough a χ2.

    Electrons and muons originating from decays of W and Z bosons are expected to be isolated,while leptons from heavy flavour (charm and bottom quark) decays, as well as from in-flightdecays of pions and kaons, are often reconstructed within jets. The signal is distinguishedfrom multijet background through the sum of scalar pT values of charged particles, neutralhadrons, and photons, reconstructed within a cone of size ∆R =

    √(∆η)2 + (∆φ)2 of 0.4, centred

    around the lepton direction, using the PF algorithm. Neutral hadrons and photons withinthe innermost region of the cone are excluded from the sum, to prevent the footprint of thelepton in ECAL and HCAL from causing the lepton to fail isolation criteria. Charged particlesclose to the direction of electrons are also excluded from the computation, to avoid countingtracks from converted photons emitted by bremsstrahlung. Efficiency loss due to pileup is keptminimal by considering only charged particles originating from the lepton production vertexin the isolation sum. The contribution of the neutral component of pileup to the isolation of thelepton is taken into account by means of so-called ∆β corrections:

    I` = ∑charged

    pT + max

    {0, ∑

    neutralspT − ∆β

    }, (1)

    where ` corresponds to either e or µ, and the sums extend over, respectively, the charged par-ticles that originate from the lepton production vertex and the neutral particles. Charged andneutral particles are required to be within a cone of size ∆R = 0.4 around the lepton direc-tion. The ∆β corrections are computed by summing the scalar pT of charged particles that arewithin a cone of size ∆R = 0.4 around the lepton direction and do not originate from the leptonproduction vertex, and scaling this sum down by a factor of two:

    ∆β = 0.5 ∑charged, pileup

    pT. (2)

    The factor of 0.5 approximates the phenomenological ratio of neutral-to-charged hadron pro-duction in the hadronization of inelastic pp collisions.

    Collision vertices are reconstructed using a deterministic annealing algorithm [47, 48]. Thereconstructed vertex position is required to be compatible with the location of the LHC beam

  • 7

    in the x-y plane. The primary collision vertex (PV) is taken to be the vertex that maximizes∑tracks p2T. The sum extends over all tracks associated with a given vertex.

    Jets within the range |η| < 4.7 are reconstructed using the anti-kT algorithm [35] with a distanceparameter of 0.5. As mentioned previously, the particles reconstructed by the PF algorithm areused as input to the jet reconstruction. Reconstructed jets are required not to overlap withidentified electrons, muons, or τh within ∆R < 0.5, and to pass two levels of jet identificationcriteria: (i) misidentified jets, mainly arising from calorimeter noise, are rejected by requiringreconstructed jets to pass a set of loose jet identification criteria [49] and (ii) jets originatingfrom pileup interactions are rejected through an MVA-based jet identification discriminant,relying on information about the vertex and energy distribution within the jet [50]. The energyof reconstructed jets is calibrated as a function of jet pT and η [51]. The contribution of pileupto the energy of jets originating from the hard scattering is compensated by determining amedian transverse momentum density (ρ) for each event, and subtracting the product of ρtimes the area of the jet, computed in the η− φ plane, from the reconstructed jet pT [52, 53]. Jetsoriginating from the hadronization of b quarks are identified through the combined secondaryvertex (CSV) algorithm [54], which exploits observables related to the long lifetime of b hadronsand the higher particle multiplicity and mass of b jets compared to light-quark and gluon jets.

    Two algorithms are used to reconstruct ~pmissT , the imbalance in transverse momentum in theevent, whose magnitude is referred to as EmissT . The standard algorithm computes the negativevectorial sum of all particle momenta reconstructed using the PF algorithm. In addition, amultivariate regression algorithm [37] has been developed to reduce the effect of pileup on theresolution in EmissT . The algorithm utilizes the fact that pileup predominantly produces jets oflow pT, while leptons and high-pT jets are produced almost exclusively in the hard-scatter.

    The transverse mass, mT, of the system constituted by an electron or a muon and EmissT is usedto either select or remove events that are due to W+jets and tt production. It is defined by:

    mT =√

    2p`TEmissT (1− cos ∆φ), (3)

    where the symbol ` refers to electron or muon and ∆φ denotes the difference in azimuthal anglebetween the lepton momentum and the ~pmissT vector.

    5 Algorithm for τh reconstruction and identificationThe τh decays are reconstructed and identified using the hadrons-plus-strips (HPS) algorithm [20].The algorithm is designed to reconstruct individual decay modes of the τ lepton, taking advan-tage of the excellent performance of the PF algorithm in reconstructing individual charged andneutral particles.

    The reconstruction and identification of τh decays in the HPS algorithm is performed in twosteps:

    1. Reconstruction: combinations of charged and neutral particles reconstructed by the PFalgorithm that are compatible with specific τh decays are constructed, and the four-momentum,expressed in terms of (pT, η, φ, and mass) of τh candidates, is computed.

    2. Identification: discriminators that separate τh decays from quark and gluon jets, andfrom electrons and muons, are computed. This provides a reduction in the jet → τh,e→ τh, and µ→ τh misidentification rates.

  • 8 5 Algorithm for τh reconstruction and identification

    The HPS algorithm is seeded by jets of pT > 14 GeV and |η| < 2.5, reconstructed using theanti-kT algorithm [35] with a distance parameter of 0.5. The pT criterion is applied on the jetmomentum given by the vectorial sum of all particle constituents of the jet, before the jet energycalibration and pileup corrections described in Section 4 are taken into account.

    5.1 Identification of decay modes

    Reconstruction of specific τh decay modes requires reconstruction of neutral pions that arepresent in most of the hadronic τ decays. The high probability for photons originating fromπ0 → γγ decays to convert to e+e− pairs within the volume of the CMS tracking detector istaken into account by clustering the photon and electron constituents of the τ-seeding jet into“strips” in the η − φ plane. The clustering of electrons and photons of pT > 0.5 GeV into stripsproceeds via an iterative procedure. The electron or photon of highest pT not yet included intoany strip is used to seed a new strip. The initial position of the strip in the η − φ plane is setaccording to the η and φ of the seed e or γ. The e or γ of next-highest pT that is within anη × φ window centred on the strip location is merged into the strip. The strip position is thenrecomputed as an energy-weighted average of all electrons and photons contained in the strip:

    ηstrip =1

    pstripT∑ pγTηγ

    φstrip =1

    pstripT∑ pγTφγ,

    with pstripT = ∑ pγT. The construction of the strip ends when no additional electrons or photons

    are found within an η × φ window of size 0.05× 0.20. In which case the clustering proceedsby constructing a new strip, which is seeded by the e or γ with next highest pT. The size ofthe window is enlarged in the φ direction to account for the bending of e+ and e− from photonconversions in the 3.8 T magnetic field. Strips with pT sums of electrons and photons in thestrip of >2.5 GeV are kept as π0 candidates.

    Hadronic τ candidates are formed by combining the strips with the charged-particle con-stituents of the jet. The charged particles are required to satisfy the condition pT > 0.5 GeV.The distance of closest approach between their tracks and the hypothetical production vertexof the τh candidate, taken to be the vertex closest to the charged particle of highest pT within thejet, is required to be less than 0.4 cm in the z direction and

  • 5.2 Tau-isolation discriminants 9

    reconstructed by summing the charges of all particles included in the construction of the τhcandidate, except for the electrons contained in strips. The probability for misreconstructingthe τh charge is ≈1%, with a moderate dependence on pT and η, for taus from Z decays.

    The following criteria are applied to assure the compatibility of each hypothesis with the sig-natures expected for the different τh decays in Table 1:

    1. h±h∓h±: Combination of three charged particles with mass 0.8 < mτh < 1.5 GeV. Thetracks are required to originate within ∆z < 0.4 cm of the same event vertex, and to havea total charge of one.

    2. h±π0π0: Combination of a single charged particle with two strips. The mass of the τhcandidate is required to satisfy the condition 0.4 < mτh < 1.2

    √pT [GeV]/100 GeV. The

    size of the mass window is enlarged for τh candidates of high pT to account for resolutioneffects. The upper limit on the mass window is constrained to be at least 1.2 and at most4.0 GeV.

    3. h±π0: Combination of one charged particle and one strip with mass 0.3 < mτh <1.3√

    pT [GeV]/100 GeV. The upper limit on the mass window is constrained to be at least1.3 and at most 4.2 GeV.

    4. h±: A single charged particle without any strips.

    The combinations of charged particles and strips considered by the HPS algorithm representall hadronic τ decay modes in Table 1, except τ− → h−h+h−π0ντ. The latter corresponds toa branching fraction of 4.8%, and is not considered in the present version of the algorithm,because of its contamination by jets. The h±π0 and h±π0π0 decays are analyzed together, andreferred to as h±π0s.

    Hypotheses that fail the mass window selection for the corresponding decay mode are dis-carded, as are hypotheses that have a charge different from unity, or hypotheses that includeany charged hadron or strip outside of a signal cone of ∆R = 3.0/pT [GeV] of the axis givenby the momentum vector of the τh candidate. The size of the cone takes into account the factthat decay products of energetic τ leptons are more collimated. When ∆R is smaller than 0.05or exceeds 0.10, a cone of size ∆R = 0.05 or ∆R = 0.10 is used as the limit, respectively.

    When multiple combinations of charged hadrons and strips pass the mass window and thesignal cone requirements, the hypothesis for the candidate with largest pT is retained. All othercombinations are discarded, resulting in a unique τh candidate to be associated to each jet.

    The distributions in the decay modes and in the mass of τh candidates in Z/γ∗ → ττ eventsare shown in Fig. 3. The contribution of the Z/γ∗ → ττ signal is split according to the recon-structed τh mode, as shown in the legend. For τh candidates reconstructed in the h±π0s andh±h∓h± modes, the mτh distribution peaks near the intermediate ρ(770) and a1(1260) mesonresonances (cf. Table 1), as expected. The narrow peak at the charged pion mass is due to τhcandidates reconstructed in the h± mode.

    5.2 Tau-isolation discriminants

    Requiring reconstructed τh candidates to pass strict isolation requirements constitutes the mainhandle for reducing the large multijet background. Tau leptons are usually isolated relative toother particles in the event, and so are their decay products, in contrast to quark and gluonjets. Two types of τh isolation discriminants have been developed, using simple cutoff-based

  • 10 5 Algorithm for τh reconstruction and identification

    decay modeτ±h s0π±h ±h

    ±

    h±h

    Eve

    nts

    0

    10000

    20000

    30000

    40000

    50000

    60000

    70000Observed

    ττ →Z tt

    ElectroweakMultijetsUncertainty

    CMS

    (8 TeV)-119.7 fb

    [GeV]hτm

    0 0.5 1 1.5

    [1/G

    eV]

    hτdN

    /dm

    0

    50

    100

    150

    200

    250310×

    Observed±h

    ±

    h±: hττ →Z s0π±: hττ →Z

    ±: hττ →Z tt

    ElectroweakMultijetsUncertainty

    CMS

    (8 TeV)-119.7 fb

    Figure 3: Distributions in (left) reconstructed τh decay modes and (right) τh candidate massesin Z/γ∗ → ττ events selected in data, compared to MC expectations. The Z/γ∗ → ττ eventsare selected in the decay channel of muon and τh, as described in Section 7.1.1. The τh arerequired to pass the medium working point of the MVA-based τh isolation discriminant. Themass of τh candidates reconstructed in simulated Z/γ∗ → ττ events is corrected for smalldata/MC differences in the τh energy scale, discussed in Section 9. The electroweak back-ground is dominated by W+jets production, with minor contributions arising from single topquark and diboson production. The shaded uncertainty band represents the sum of systematicand statistical uncertainties on the MC simulation.

    selections and an MVA approach. An overview of the discriminants, with their respectiveefficiencies and misidentification rates, is given in Table 2.

    5.2.1 Cutoff-based discriminants

    The isolation of τh candidates is computed by summing the scalar values of pT of charged par-ticles and photons with pT > 0.5 GeV, reconstructed with the PF algorithm, within an isolationcone of size ∆R = 0.5, centred on the τh direction. The effect of pileup is reduced by requiringthe tracks associated to charged particles considered in the isolation sum to be compatible withoriginating from the production vertex of the τh candidate within a distance of ∆z < 0.2 cm and∆r < 0.03 cm. Charged hadrons used to form the τh candidate are excluded from the isolationsum, as are electrons and photons used to construct any of the strips. The effect of pileup onphoton isolation is compensated on a statistical basis through the modified ∆β corrections:

    Iτ = ∑charged,∆z0.2 cm

    pT. (5)

  • 5.2 Tau-isolation discriminants 11

    Loose, medium, and tight working points (WP) are defined for the cutoff-based τh isolationdiscriminants by requiring the pT sum defined by Eq. (4) not to exceed thresholds of 2.0, 1.0,and 0.8 GeV, respectively.

    5.2.2 MVA-based discriminants

    In order to minimize the jet → τh background, the MVA-based τh identification discriminantutilizes the transverse impact parameter of the “leading” (highest pT) track of the τh candidate,defined as the distance of closest approach in the transverse plane of the track to the τh pro-duction vertex. It also uses, for τh candidates reconstructed in the h±h∓h± decay mode, thedistance between the τ production point and the decay vertex. A BDT is used to discriminateτh decays (“signal”) from quark and gluon jets (“background”). The variables used as inputsto the BDT are:

    1. The charged- and neutral-particle isolation sums defined in Eq. (4) as separate inputs.

    2. The reconstructed τh decay mode, represented by an integer that takes the value of 0 forτh candidates reconstructed in the h± decay mode, as 1 and 2 for candidates reconstructedin the h±π0 and h±π0π0 decay modes, respectively, and 10 for candidates reconstructedin the h±h∓h± decay mode.

    3. The transverse impact parameter d0 of the leading track of the τh candidate, and its valuedivided by its uncertainty, which corresponds to its significance d0/σd0 .

    4. The distance between the τ production and decay vertices, |~rSV −~rPV|, and its signifi-cance, |~rSV−~rPV|/σ|~rSV−~rPV|, and a flag indicating whether a decay vertex has successfullybeen reconstructed for a given τh candidate. The positions of the vertices,~rSV and~rPV, arereconstructed using the adaptive vertex fitter algorithm [48].

    The position of the primary event vertex is refitted after excluding the tracks associated withthe τh candidate. The discrimination power of individual input variables is illustrated in Fig. 4.

    The inputs are complemented by the pT and η of the τh candidate and by the ∆β correctiondefined in Eqs. (4) and (5). The purpose of the pT and η variables is to parameterize possibledependences of the other input variables on pT and η. The events used for the training of theBDT are reweighted such that the two-dimensional pT and η distribution of the τh candidatesfor signal and background are identical, which makes the MVA result independent of eventkinematics. The ∆β correction parameterizes the dependence on pileup, in particular, the pTsum of the neutral particles.

    The BDT is trained on event samples produced using MC simulation. Samples of Z/γ∗ → ττ,H → ττ, Z′ → ττ, and W′ → τντ events are used for the “signal” category. Reconstructedτh candidates are required to match τh decays within ∆R < 0.3 at the generator level. Multi-jet and W+jets events are used for the “background” category. The τh candidates that matchleptons originating from the W boson decays are excluded from the training. The samples con-tain ≈ 107 events in total, and cover the range 20–2000 GeV in τh candidate pT. Half of theavailable events are used for training, the other half for evaluating the MVA performance, andconducting overtraining checks. The distribution in MVA output is shown in Fig. 5.

    Different working points, corresponding to different τh identification efficiencies and jet →τh misidentification rates, are defined by changing the selections on the MVA output. Thethresholds are adjusted as function of the pT of the τh candidate, such that the τh identificationefficiency for each WP is constant as function of pT.

  • 12 5 Algorithm for τh reconstruction and identification

    z < 0.2 cm) [GeV]∆ (chargedT

    pΣ0 5 10 15 20 25 30 35 40

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτJets

    CMS Simulation

    [GeV]γT

    pΣ0 5 10 15 20 25 30 35 40

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτJets

    CMS Simulation

    decay modehτ0 2 4 6 8 10

    Rel

    ativ

    e yi

    eld

    / bin

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6 hτJets

    CMS Simulation

    ]1/2 [cm |0

    | d

    0 0.05 0.1 0.15 0.2

    Rel

    ativ

    e yi

    eld

    / bin

    0

    0.01

    0.02

    0.03

    0.04

    0.05hτ

    Jets

    CMS Simulation

    0dσ | /

    0| d

    0 2 4 6 8 10

    Rel

    ativ

    e yi

    eld

    / bin

    00.010.020.030.040.050.060.070.080.09

    hτJets

    CMS Simulation

    Rel

    ativ

    e yi

    eld

    / bin

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    no SV has SV

    hτJets

    CMS Simulation

    ]1/2 [cm |PVr - SVr|

    0 0.5 1 1.5 2 2.5 3

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτJets

    CMS Simulation

    |PVr - SVr| σ | / PVr - SVr|

    0 2 4 6 8 10

    Rel

    ativ

    e yi

    eld

    / bin

    0

    0.02

    0.04

    0.06

    0.08

    0.1

    0.12

    0.14

    hτJets

    CMS Simulation

    Figure 4: Distributions, normalized to unity, in observables used as input variables to the MVA-based isolation discriminant, for hadronic τ decays in simulated Z/γ∗ → ττ (blue), and jets insimulated W+jets (red) events. The τh candidates must have pT > 20 GeV and |η| < 2.3, and bereconstructed in one of the decay modes h±, h±π0, h±π0π0, or h±h∓h±. In the plot of the τhdecay mode on the upper right, an entry at 0 represents the decay mode h±, 1 and 2 representthe decay modes h±π0 and h±π0π0, respectively, and entry 10 represents the h±h∓h± decaymode.

  • 5.3 Discriminants against electrons and muons 13

    MVA output1− 0.5− 0 0.5 1

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτJets

    CMS Simulation

    Figure 5: Distribution of MVA output for the τh identification discriminant that includes life-time information for hadronic τ decays in simulated Z/γ∗ → ττ (blue), and jets in simulatedW+jets (red) events.

    5.3 Discriminants against electrons and muons

    Electrons and muons have a sizeable probability to get reconstructed in the h± decay mode.Electrons radiating a bremsstrahlung photon that subsequently converts may also get recon-structed in the h±π0 decay mode. In particular, electrons and muons originating from decays ofW and Z bosons, which are produced with cross sections of ≈100 nb at the LHC at

    √s = 8 TeV,

    have a high chance to pass isolation-based τh identification criteria. Dedicated discriminantshave been developed to separate τh from electrons and muons. The separation of τh from elec-trons is based on an MVA approach. A cutoff-based and an MVA based discriminant are usedto separate τh from muons.

    5.3.1 MVA-based electron discriminant

    A BDT discriminant is trained to separate τh decays from electrons. The algorithm utilizesobservables that quantify the distribution in energy depositions in the ECAL, in combinationwith observables sensitive to the amount of bremsstrahlung emitted along the leading track,and observables sensitive to the overall particle multiplicity, to distinguish electromagneticfrom hadronic showers. More specifically, the following variables are used as inputs to theBDT:

    1. Electromagnetic energy fraction, EECAL/(EECAL + EHCAL), defined as the ratio of energydepositions in the ECAL to the sum of energy in the ECAL and HCAL, associated withthe charged particles and photons that constitute the τh candidate.

    2. EECAL/p and EHCAL/p, defined as ratios of ECAL and HCAL energies relative to themomentum of the leading charged-particle track of the τh candidate.

    3.√

    ∑(∆η)2 pγT and√

    ∑(∆φ)2 pγT, the respective pT-weighted (in GeV) root-mean-squaredistances in η and φ between the photons in any strip and the leading charged particle.

  • 14 5 Algorithm for τh reconstruction and identification

    4. ∑ Eγ/Eτ, the fraction of τh energy carried by photons.

    5. Fbrem = (pin− pout)/pin, where pin and pout are measured by the curvature of the leadingtrack, reconstructed using the GSF algorithm, at the innermost and outermost positionsof the tracker.

    6. (Ee + ∑ Eγ)/pin, the ratio between the total ECAL energy and the inner track momen-tum. The quantities Ee and ∑ Eγ represent the energies of the electron cluster and ofbremsstrahlung photons, respectively. ∑ Eγ is reconstructed by summing the energy de-positions in ECAL clusters located along the tangent to the GSF track.

    7. ∑ Eγ/(pin − pout), the ratio of energies of the bremsstrahlung photons measured in theECAL and in the tracker.

    8. mτh , the mass of the τh candidate.

    9. (NGSFhits − NKFhits)/(NGSFhits + NKFhits), with NGSFhits and NKFhits representing, respectively, the num-ber of hits in the silicon pixel and strip tracking detector associated with the track recon-structed using, respectively, the GSF and Kalman filter (KF) track reconstruction algo-rithms. The KF algorithm is the standard algorithm for track reconstruction at CMS [21].The number of hits associated with GSF and KF track is sensitive to the emission of hardbremsstrahlung photons.

    10. χ2 per degree-of-freedom (DoF) of the GSF track.

    The discriminating power of these variables is illustrated in Fig. 6.

    The inputs are complemented by the pT and η of the τh candidate, the pT, σpT /pT, and η ofthe GSF track, and by the distances in η and in φ of the GSF track to the nearest boundarybetween ECAL modules. These variables are used to parameterize the dependence of the otherinput variables. Electrons entering the boundaries between ECAL modules are more difficult todiscriminate from τh decays, as their electromagnetic showers are often not well reconstructed,and the probability to reach the hadron calorimeter increases in these regions.

    Samples of simulated Z/γ∗ → ττ, Z/γ∗ → ee, W → τντ, W → eνe, tt, H → ττ, Z′ → ττ,Z′ → ee, W′ → τντ, and W′ → eνe events have been used to train the BDT. Reconstructed τhcandidates are considered as signal or background when they are matched, respectively, within∆R < 0.3 to a hadronic τ decay or to an electron at the generator level.

    Different WP are defined by changing the cutoff on the BDT output. The τh candidates recon-structed in the uninstrumented region between ECAL barrel and endcap, 1.45 < η < 1.56, arerejected in all cases.

    5.3.2 Cutoff-based muon discriminant

    The cutoff-based discriminant against muons vetoes τh candidates when signals in the muonsystem are found near the τh direction. Two working points are provided:

    1. Loose: τh candidates pass the cutoff on this discriminant, except when track segmentsare found in at least two muon stations within a cone of size ∆R = 0.3 centred on the τhdirection, or when the sum of the energies in the ECAL and HCAL corresponds to < 0.2of the momentum of the leading track of the τh candidate.

  • 5.3 Discriminants against electrons and muons 15

    )HCAL + EECAL / (EECALE0 0.2 0.4 0.6 0.8 1

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    / pECALE

    0 0.2 0.4 0.6 0.8 1R

    elat

    ive

    yiel

    d / b

    in4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    / pHCALE

    0 0.2 0.4 0.6 0.8 1

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    ]1/2 [GeVγT

    p⋅ 2)η∆ (Σ0 20 40 60 80 100

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    ]1/2 [GeVγT

    p⋅ 2)φ∆ (Σ0 20 40 60 80 100

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    τ / Eγ EΣ0 0.2 0.4 0.6 0.8 1

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    bremF

    0.2− 0 0.2 0.4 0.6 0.8 1

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    in) / pγ EΣ + e(E

    0 2 4 6 8 10

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    )out

    - pin

    / (pγ EΣ0 0.5 1 1.5 2 2.5 3 3.5 4

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    [GeV]hτm

    0 0.5 1 1.5 2

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    )KFhits + NGSFhits

    ) / (NKFhits - NGSFhits

    (N

    3− 2− 1− 0 1 2 3 4

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    10 hτe

    CMS Simulation

    / DoF2χGSF track 0 2 4 6 8 10

    Rel

    ativ

    e yi

    eld

    / bin

    4−10

    3−10

    2−10

    1−10

    1

    hτe

    CMS Simulation

    Figure 6: Distributions, normalized to unity, in observables that are used as inputs to the MVA-based electron discriminant, for hadronic τ decays in simulated Z/γ∗ → ττ (blue), and elec-trons in simulated Z/γ∗ → ee (red) events. The τh candidates must have pT > 20 GeV and|η| < 2.3, and be reconstructed in one of the decay modes h±, h±π0, h±π0π0, or h±h∓h±. Therightmost bin of the distributions is used as overflow bin.

  • 16 6 Expected performance

    2. Tight: τh candidates pass this discriminant restriction when they pass the loose WP, andno hits are present within a cone of ∆R = 0.3 around the τh direction in the CSC, DT, andRPC detectors located in the two outermost muon stations.

    5.3.3 MVA-based muon discriminant

    A multivariate BDT discriminant has also been trained to separate τh decays from muons. Thefollowing variables are used as BDT inputs:

    1. The calorimeter energy associated with the leading charged particle of the τh candidate,with separate energy sums computed for ECAL and HCAL.

    2. The calorimeter energy associated in the PF algorithm with any charged particle or pho-ton constituting the τh candidate, again, with separate energy sums computed for ECALand HCAL.

    3. The fraction of pT carried by the charged particle with highest pT.

    4. The number of track segments in the muon system reconstructed within a cone of size∆R = 0.5 around the τh direction.

    5. The number of muon stations with at least one hit detected within a cone of size ∆R = 0.5centred on the τh direction, computed separately for DT, CSC, and RPC detectors.

    The inputs are complemented by the η of the τh candidate, to parameterize the dependenceof the input variables on the DT, CSC, and RPC muon acceptance, and on the path length ofmuons traversed in the ECAL and HCAL.

    The BDT is trained using samples of simulated Z/γ∗ → ττ, Z/γ∗ → µµ, W→ τντ, W→ µνµ,tt, H→ ττ, Z′ → ττ, Z′ → µµ, W′ → τντ, and W′ → µνµ events. Reconstructed τh candidatesare considered as signal or background when they are matched, respectively, to generator-levelhadronic tau decays or muons within ∆R < 0.3.

    Different WP are defined by changing the cutoff on the MVA output.

    6 Expected performanceThe expected performance of the HPS τh identification algorithm is studied in terms of decaymodes and energy reconstruction, τh identification efficiency, and misidentification rates forjets, electrons, and muons using simulated samples of Z/γ∗ → `` (` = e, µ, τ), Z′ → ττ,W+jets, and multijet events.

    Tau identification efficiencies and misidentification rates in MC simulated events, averagedover pT and η, for pileup conditions characteristic of the data-taking period, are given in Ta-ble 2.

    6.1 Decay modes and energy reconstruction

    The τh decay mode reconstruction is studied in simulated Z/γ∗ → ττ events. The performanceis quantified by the correlation between reconstructed and generator-level τh decay modes.Figure 7 demonstrates that the true τ decay mode is reconstructed in about 90% of the cases,irrespective of pileup conditions, represented by the number of reconstructed vertices (Nvtx).The few per cent decrease in the fraction of τ leptons decaying to a single charged hadron that

  • 6.1 Decay modes and energy reconstruction 17

    Table 2: Expected efficiencies and misidentification rates of various τh identification discrim-inants, averaged over pT and η, for pileup conditions characteristic of the LHC Run 1 data-taking period. The DM-finding criterion refers to the requirement that the τh candidate bereconstructed in one of the decay modes h±, h±π0, h±π0π0, or h±h∓h± (cf. Section 5.1).

    DM-finding and τh isolation discriminants

    WPEfficiency Jet→ τh misidentification rate

    Z/γ∗ → ττ Z′(2.5 TeV)→ ττ W+jets MultijetCutoff-based

    Loose 49.0% 58.9% 9.09× 10−3 3.86× 10−3Medium 40.8% 50.8% 5.13× 10−3 2.06× 10−3Tight 38.1% 48.1% 4.38× 10−3 1.75× 10−3

    MVA-basedVery loose 55.9% 71.2% 1.29× 10−2 6.21× 10−3Loose 50.7% 64.3% 7.38× 10−3 3.21× 10−3Medium 39.6% 50.7% 3.32× 10−3 1.30× 10−3Tight 27.3% 36.4% 1.56× 10−3 4.43× 10−4

    Discriminant against electrons

    WPEfficiency e→ τh misidentification rate

    Z/γ∗ → ττ Z′(2.5 TeV )→ ττ Z/γ∗ → eeVery loose 94.3% 89.6% 2.38× 10−2Loose 90.6% 81.5% 4.43× 10−3Medium 84.8% 73.2% 1.38× 10−3Tight 78.3% 65.1% 6.21× 10−4Very tight 72.1% 60.0% 3.54× 10−4

    Discriminant against muons

    WPEfficiency µ→ τh misidentification rate

    Z/γ∗ → ττ Z′(2.5 TeV)→ ττ Z/γ∗ → µµCutoff-based

    Loose 99.3% 96.4% 1.77× 10−3Tight 99.1% 95.0% 7.74× 10−4

    MVA-basedLoose 99.5% 99.4% 5.20× 10−4Medium 99.0% 98.8% 3.67× 10−4Tight 98.0% 97.7% 3.18× 10−4

  • 18 6 Expected performance

    are reconstructed in the true decay mode is due to events in which particles from pileup depositenergy in the ECAL near the τ, causing the τ to be reconstructed in the h±π0 or h±π0π0 decaymodes.

    The performance of energy reconstruction is studied in simulated Z/γ∗ → ττ and Z′ → ττevents, and quantified in terms of response and resolution, defined as the mean and standarddeviation of the reconstructed momentum distribution relative to the generator-level momen-tum of the visible τ decay products. The distributions for τh decays in simulated Z/γ∗ → ττand Z′ → ττ events are shown in Fig. 8. The average response is below 1.0, because of anasymmetry of the 〈precT /p

    genT 〉 distribution, where precT and p

    genT refer, respectively, to the pT of

    the reconstructed τh candidate and to the pT of the vectorial momentum sum of the visible τdecay products at the generator level. The most probable value of the ratio 〈precT /p

    genT 〉 is close

    to 1.0. The effect of pileup on τ reconstruction is small.

    0.91 0.16 0.01

    0.09 0.83 0.02

    0.00 0.01 0.97

    decay modeτGenerated ±h s0π±h ±h

    ±

    h±h

    dec

    ay m

    ode

    τR

    econ

    stru

    cted

    ±h

    s0π±h

    ±h

    ±

    h±h

    CMS Simulation

    vtxN10 20 30 40

    Fra

    ctio

    n co

    rrec

    tly a

    ssig

    ned

    0.8

    0.9

    1

    ±hs0π±h

    ±h

    ±

    h±h

    CMS Simulation

    Figure 7: Left: Correlation between generated and reconstructed τh decay modes for τh decaysin Z/γ∗ → ττ events, simulated for pileup conditions characteristic of the LHC Run 1 data-taking period. Right: Fraction of generated τh reconstructed in the correct decay mode asfunction of Nvtx. Reconstructed τh candidates are required to be matched to hadronic τ decaysat the generator-level within ∆R < 0.3, to be reconstructed in one of the decay modes h±, h±π0,h±π0π0, or h±h∓h±, and pass pT > 20 GeV, |η| < 2.3, and the loose WP of the cutoff-based τhisolation discriminant.

    6.2 The τh identification efficiency

    The efficiency to pass the decay mode reconstruction and the different τh identification dis-criminants is determined for hadronic τ decays with visible decay products that satisfy theconditions pT > 20 GeV and |η| < 2.3 at the generator level. More specifically, the efficiency isdefined by the percentage of τh candidates that satisfy:

    ετ =precT > 20 GeV, |ηrec| < 2.3, DM-finding, τh ID discriminant

    pgenT > 20 GeV, |ηgen| < 2.3, (6)

    where ηrec and ηgen refer, respectively, to the η of the reconstructed τh candidate and to the ηof the vectorial momentum sum of the visible τ decay products at the generator level. TheDM-finding criterion refers to the requirement that the τh candidate be reconstructed in one

  • 6.2 The τh identification efficiency 19

    [GeV]hτ

    TGenerated p

    200 400 600 800 1000 1200

    >ge

    n

    T /

    pre

    cT

    < p

    0.9

    0.95

    1

    1.05

    12≤ vtxN

    17≤ vtx N≤13

    18≥ vtxN

    CMS Simulation

    [GeV]hτ

    TGenerated p

    200 400 600 800 1000 1200

    >ge

    n

    T/p

    rec

    T)

    / < p

    gen

    T/p

    rec

    T(pσ

    0

    0.1

    0.2

    0.3

    12≤ vtxN

    17≤ vtx N≤13

    18≥ vtxN

    CMS Simulation

    Figure 8: The τh energy response (left) and relative resolution (right) as function of generator-level visible τ pT in simulated Z′ → ττ events for different pileup conditions: Nvtx ≤ 12,13 ≤ Nvtx ≤ 17, and Nvtx ≥ 18. Reconstructed τh candidates are required to be matched tohadronic τ decays at the generator-level within ∆R < 0.3, to be reconstructed in one of thedecay modes h±, h±π0, h±π0π0 or h±h∓h±, and to pass pT > 20 GeV, |η| < 2.3, and the looseWP of the cutoff-based τh isolation discriminant.

    of the decay modes h±, h±π0, h±π0π0, or h±h∓h± (cf. Section 5.1), and τh ID refers to theτh identification discriminant used in the analysis. The p

    genT and ηgen selection criteria in the

    denominator are also applied in the numerator. Only those τh candidates matched to generator-level hadronic τ decays within ∆R < 0.3 are considered in the numerator.

    The efficiencies of the discriminants against electrons and muons are determined for τh candi-dates matched to generator-level τh decays within ∆R < 0.3, passing precT > 20 GeV, |ηrec| < 2.3,reconstructed in one of the decay modes h±, h±π0, h±π0π0, or h±h∓h±, and satisfying theloose WP of the cutoff-based τh isolation discriminant:

    ετ =lepton discriminant

    precT > 20 GeV, |ηrec| < 2.3, DM-finding, loose cutoff-based isolation. (7)

    The selection criteria in the denominators of Eqs. (6) and (7) are also applied in the numerators.

    The efficiency for τh decays to pass the cutoff-based and MVA-based τh identification discrim-inants are shown for simulated Z/γ∗ → ττ and Z′ → ττ events in Fig. 9.

    The efficiencies are higher in Z′ → ττ than in SM Z/γ∗ → ττ events, as the τ leptons havelarger pT in the former case. The expected efficiencies of the isolation discriminants rangebetween 40% and 70%, depending on whether tight or loose criteria are applied. The discrimi-nation against electrons and against muons have respective efficiencies between 60% and 95%,and between 95% and 99%.

  • 20 6 Expected performance

    [GeV]hτ

    TGenerated p

    20 40 60 80 100 120

    effi

    cien

    cyτ

    Exp

    ecte

    d

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2 Loose cutoff-based isolationMedium cutoff-based isolationTight cutoff-based isolation

    CMS Simulation

    ττ → *γZ/

    [GeV]hτ

    TGenerated p

    200 400 600 800 100012001400

    effi

    cien

    cyτ

    Exp

    ecte

    d

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2 Loose cutoff-based isolationMedium cutoff-based isolationTight cutoff-based isolation

    CMS Simulation

    ττ →Z' (2.5 TeV)

    [GeV]hτ

    TGenerated p

    20 40 60 80 100 120

    effi

    cien

    cyτ

    Exp

    ecte

    d

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2 Very loose MVA isolationLoose MVA isolationMedium MVA isolationTight MVA isolation

    CMS Simulation

    ττ → *γZ/

    [GeV]hτ

    TGenerated p

    200 400 600 800 100012001400

    effi

    cien

    cyτ

    Exp

    ecte

    d

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2 Very loose MVA isolationLoose MVA isolationMedium MVA isolationTight MVA isolation

    CMS Simulation

    ττ →Z' (2.5 TeV)

    Figure 9: Efficiency for τh decays in simulated Z/γ∗ → ττ (left) and Z′ → ττ (right) eventsto be reconstructed in one of the decay modes h±, h±π0, h±π0π0, or h±h∓h±, to satisfy theconditions pT > 20 GeV and |η| < 2.3, and to pass: the loose, medium and tight WP of thecutoff-based τh isolation discriminant (top) and the very loose, loose, medium and tight WP ofthe MVA-based tau isolation discriminant (bottom). The efficiency is shown as a function ofthe generator-level pT of the visible τ decay products in τh decays that are within |η| < 2.3.

  • 6.3 Misidentification rate for jets 21

    6.3 Misidentification rate for jets

    The rate at which quark and gluon jets are reconstructed as τh candidates passing τ identifica-tion is computed for jets with pjetT > 20 GeV and |ηjet| < 2.3 as follows:

    Pmisid =pτhT > 20 GeV, |ητh | < 2.3, DM-finding, τh ID discriminant

    pjetT > 20 GeV, |ηjet| < 2.3. (8)

    The pjetT and ηjet selection criteria of the denominator are also applied in the numerator. Notethat pT and η are different in the numerator and denominator, because p

    jetT and ηjet are com-

    puted by summing the momenta of all the particle constituents of the jet, while pτhT and ητhrefer to only the charged particles and photons included in the decay mode reconstruction ofthe τh candidate. Besides, jet energies are calibrated [51] and corrected for pileup effects [52, 53],whereas no energy calibration or pileup correction is applied to τh candidates.

    The rates of jet→ τh misidentification range from a few 10−4 to 10−2. They differ for W+jets andmultijet events, because of the different fractions of quark and gluon jets in the two samples,and because of differences in jet pT spectra, which are relevant due to the dependence of the jet→ τh misidentification rates on jet pT (cf. Section 10).

    The MVA-based τh identification discriminants that include lifetime information reduce the jet→ τh misidentification rate by about 40% relative to cutoff-based discriminants, while the τhidentification efficiencies are very similar.

    6.4 Misidentification rate for electrons and muons

    The misidentification rates for e → τh and µ → τh are determined for electrons and muonswith p`T > 20 GeV and |η`| < 2.3, and can be written as follows:

    Pmisid =pτhT > 20 GeV, |ητh |,< 2.3, DM-finding, loose cutoff-based isolation, lepton discriminant

    p`T > 20 GeV, |η`| < 2.3.

    (9)Only τh candidates reconstructed within ∆R < 0.3 of a generator-level electron or muon tra-jectory are considered for the numerator. The p`T and η` symbols refer to the generator-level pTand η of the electron or muon.

    Typical e → τh misidentification rates range from a few per mille to a few per cent. The ratesfor µ→ τh misidentification are at or below the per mille level.

    7 Validation with dataDifferent kinds of events are used to evaluate the τh reconstruction and identification in data.The τh identification efficiency and energy scale are validated using Z/γ∗ → ττ events. Theefficiency to reconstruct and identify τh of higher pT in more dense hadronic environments ismeasured using tt events. Samples of W+jets and multijet events are used to validate the rateswith which quark and gluon jets are misidentified as τh candidates. The misidentification ratesfor electrons and muons are measured using Z/γ∗ → ee and Z/γ∗ → µµ events.

    The selection of event samples is described in Section 7.1. Systematic uncertainties relevantto the validation of the τh reconstruction and identification are detailed in Section 7.2. Themeasurement of τh identification efficiency, as well as of the rates at which electrons and muonsare misidentified as τh candidates, is based on determining the yield of signal and background

  • 22 7 Validation with data

    processes, for which we use fits of simulated distributions (templates) to data, as described inSection 7.3.

    7.1 Event selection

    7.1.1 Z/γ∗→ ττ events

    The sample of Z/γ∗ → ττ events is selected in decay channels of τ leptons to muon and τhfinal states. Except for extracting the τh identification efficiency, Z/γ∗ → ττ → µτh events arerecorded using a trigger that demands the presence of a muon and τh [9]. The events used forthe τh identification efficiency measurement are recorded using a single-muon trigger [46], toavoid potential bias that may arise from requiring a τh at the trigger level. The reconstructedmuon is required to satisfy the conditions pT > 20 GeV and |η| < 2.1, to pass tight identificationcriteria, and to be isolated relative to other particles in the event by Iµ < 0.10p

    µT, computed

    according to Eq. (1). The τh candidates are required to be reconstructed in one of the decaymodes described in Section 5.1, to satisfy the conditions pT > 20 GeV and |η| < 2.3, and to passthe loose WP of the cutoff-based τh isolation discriminant, the tight WP of the cutoff-baseddiscriminant against muons, and the loose WP of the discriminant against electrons. The muonand τh candidate are required to be compatible with originating from the primary collisionvertex and be of opposite charge. In case multiple combinations of muon and τh exist in anevent, the combination with the highest sum in scalar pT is chosen. Background arising fromW+jets production is removed by requiring the transverse mass computed in Eq. (3) to satisfythe condition mT < 40 GeV. Events containing a second muon of pT > 15 GeV and |ηµ| < 2.4,passing loose identification and isolation criteria, are rejected to suppress Z/γ∗ → µµ Drell–Yan (DY) background.

    The transverse impact parameter d0 and the distance |~rSV −~rPV| between the τ production anddecay vertices in selected Z/γ∗ → ττ events are shown in Fig. 10. The normalization of theZ/γ∗ → ττ → µτh signal and of background processes is determined through a template fitto the data, as described in Section 7.3, using the visible mass of the muon and τh (mvis) asobservable in the fit. Separate fits are performed for events with τh candidates containing oneand three charged particles. The fitted mvis spectra are also shown in Fig. 10. The shaded areasrepresent the sum of statistical uncertainties of the MC samples and systematic uncertainties,added in quadrature, as discussed in Section 7.2. All distributions agree well with their respec-tive MC simulations.

    7.1.2 tt events

    A sample of tt events is also selected in the µτh channel. The tt→ bbµτh events are required topass a single-muon trigger and to contain a muon with pT > 25 GeV and |η| < 2.1. The muonis required to pass tight identification criteria and to be isolated at the level of Iµ < 0.10p

    µT. The

    τh candidate is required to be reconstructed in one of the decay modes described in Section 5.1,to satisfy the conditions pT > 20 GeV and |η| < 2.3, to pass the loose WP of the cutoff-basedτh isolation discriminant, and to be separated from the muon by ∆R > 0.5. The event is alsorequired to contain two jets of pT > 30 GeV and |η| < 2.5, separated from the muon and the τhcandidate by ∆R > 0.5. At least one of the jets is required to meet the b tagging criteria [54, 55].Background from Z/γ∗ → `` (` = e, µ, τ) events is reduced by requiring EmissT > 40 GeV.Events containing an electron of pT > 15 GeV and |η| < 2.3, or a second muon of pT > 10 GeVand |η| < 2.4 that pass loose identification and isolation criteria are rejected.

    The pT distribution of τh candidates in the tt sample is compared to the Z/γ∗ → ττ sample inFig. 11.

  • 7.1 Event selection 23

    [GeV]vism

    50 100 150 200

    [1/G

    eV]

    vis

    dN/d

    m

    0200400600800

    1000120014001600180020002200

    Observedττ →Z

    ttW+jetsDY othersMultijetsUncertainty

    CMS (8 TeV)-119.7 fb

    s0π±, h±h

    [GeV]vism50 100 150 200

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4 [GeV]vism

    50 100 150 200

    [1/G

    eV]

    vis

    dN/d

    m

    0

    100

    200

    300

    400

    500

    600

    700

    800 Observedττ →Z

    ttW+jetsDY othersMultijetsUncertainty

    CMS (8 TeV)-119.7 fb

    ±h

    ±

    h±h

    [GeV]vism50 100 150 200

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4

    [cm] 0d

    0.02− 0 0.02

    Eve

    nts

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    8000

    310×Observed

    ττ →Z tt

    W+jetsDY othersMultijetsUncertainty

    CMS

    (8 TeV)-119.7 fb

    s0π±, h±h

    [cm] 0d0.02− 0 0.02

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4 [cm] PVr - SVr

    0 0.5 1 1.5 2

    Eve

    nts

    210

    310

    410

    510

    Observedττ →Z

    ttW+jetsDY othersMultijetsUncertainty

    CMS (8 TeV)-119.7 fb

    ±h

    ±

    h±h

    [cm] PVr - SVr 0 0.5 1 1.5 2

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4

    Figure 10: Top: Distribution in the visible mass of Z/γ∗ → ττ → µτh candidate events,in which the reconstructed τh candidate contains (upper left) a single or (upper right) threecharged particles. Bottom: Distribution in (lower left) transverse impact parameter for eventsin which the τh candidate contains one charged particle and (lower right) in the distance be-tween the τ production and decay vertex for events in which the τh candidate contains threecharged particles. The Z/γ∗ → `` (` = e, µ, τ) events, in which either the reconstructed muonor the reconstructed τh candidate are misidentified, are denoted by “DY others”.

  • 24 7 Validation with data

    [GeV]hτT

    p20 30 40 50 60 70 80

    [1/G

    eV]

    hτ TdN

    /dp

    0

    500

    1000

    1500

    2000

    2500

    3000

    3500

    4000Observed

    ττ →Z DY otherstt

    W + jetsMultijetsUncertainty

    CMS (8 TeV)-119.7 fb

    [GeV]hτT

    p20 40 60 80

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.2−0

    0.2 [GeV]hτ

    Tp

    20 30 40 50 60 70 80 90 100 110 120

    [1/G

    eV]

    hτ TdN

    /dp

    0

    50

    100

    150

    200

    250

    300

    350

    400 Observed otherstt

    W + jets

    Z + jets

    Single top

    Diboson

    Multijets

    hτµb b→ tt

    Uncertainty

    CMS (8 TeV)-119.7 fb

    [GeV]hτT

    p20 40 60 80 100 120

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.2−0

    0.2

    Figure 11: Distribution in the pT of τh candidates in (left) Z/γ∗ → ττ and (right) tt eventsin data and in simulations. The Z/γ∗ → `` (` = e, µ, τ) and tt events, in which either thereconstructed muon or the reconstructed τh candidate is misidentified, are denoted in the MCsimulation by “DY others” and “tt others”, respectively.

    7.1.3 The W+jets sample

    Events selected for the W+jets sample are required to pass the single-muon trigger and to con-tain a muon with pT > 25 GeV and |η| < 2.1, passing tight identification and isolation criteriaIµ < 0.10p

    µT. The muon and E

    missT transverse mass, computed according to Eq. (3), is required

    to satisfy the condition mT > 50 GeV. Selected W+jets candidate events are further requiredto contain at least one jet with pT > 20 GeV and |η| < 2.3 that is separated from the muon by∆R > 0.5.

    7.1.4 Multijet sample

    The sample of multijet events is selected by requiring the events to pass a single-jet triggerwith the pT threshold of 320 GeV. The trigger was not prescaled during the whole data-takingperiod. The jet that passes the trigger is required to satisfy the conditions pT > 350 GeV and|η| < 2.5. In order to measure the jet → τh misidentification rate for jets unbiased by thetrigger selection, the following procedure is used: If only one jet in the event passes the triggerrequirement, that jet is excluded from the computation of the jet → τh misidentification rate,and the other jets with pT > 20 GeV and |η| < 2.3 in the event are used instead. When two ormore jets in the event pass the trigger requirement, all jets with pT > 20 GeV and |η| < 2.3 areincluded in the computation of the misidentification rate. Each jet is unbiased relative to thetrigger selection, because the event would have been triggered by another jet regardless of therest of the objects in the event.

    The pT distribution of jets considered for the computation of the jet→ τh misidentification rateis compared for W+jets and multijet samples in Fig. 12. The multijet sample provides more jets

  • 7.1 Event selection 25

    with large pT. Since the single-jet trigger used to select the multijet events requires at least onejet with pT greater than 320 GeV, the sample is enriched with events containing high pT jets thatare likely recoiling against each other. This is the reason for the increase in the jet pT spectrumin bin 300–400 GeV. The distributions observed in data agree with the MC expectation withinuncertainties.

    [GeV] jetT

    p0 50 100 150 200 250 300 350 400 450 500

    [1/G

    eV]

    jet

    TdN

    /dp

    510

    610

    Observed

    Multijets

    Uncertainty

    CMS

    (8 TeV)-119.7 fb

    [GeV] jetT

    p0 100 200 300 400 500

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4

    [GeV] jetT

    p0 50 100 150 200 250 300 350 400 450 500

    [1/G

    eV]

    jet

    TdN

    /dp

    210

    310

    410

    510

    610

    710Observed

    W + jets

    Other processes

    Uncertainty

    CMS

    (8 TeV)-119.7 fb

    [GeV] jetT

    p0 100 200 300 400 500

    Sim

    ulat

    ion

    Dat

    a -

    Sim

    ulat

    ion

    0.4−0.2−

    00.20.4

    Figure 12: Jet pT distribution in (left) multijet and (right) W+jets events observed in data, com-pared to the MC expectation. The uncertainty in the MC expectation is dominated by theuncertainty in the jet energy scale.

    7.1.5 The Z/γ∗→ ee and Z/γ∗→ µµ events

    A high-purity sample of Z/γ∗ → ee and Z/γ∗ → µµ events is selected by requiring candidateevents to contain at least one well-identified and isolated electron or muon, referred to as the“tag”, and one τh candidate that passes loose preselection criteria, referred to as the “probe”.The e → τh and µ → τh misidentification rates are given by the fraction of probes that passthe τ isolation criteria, as well as one of the dedicated discriminants for vetoing electrons ormuons.

    Tag electrons are required to pass a single-electron trigger, to satisfy the conditions pT > 30 GeVand |η| < 2.1, pass tight electron identification criteria, and isolation, with Ie < 0.10peT. Tagelectrons reconstructed in the transition region between ECAL barrel and endcap, 1.46 < |η| <1.56, are discarded. Similarly, tag muons are required to pass a single-muon trigger, to satisfythe conditions pT > 25 GeV and |η| < 2.1, pass tight muon identification criteria, and isolation,with Iµ < 0.10p

    µT.

    The probe is required to be reconstructed in one of the decay modes h±, h±π0, h±π0π0, orh±h∓h±, to satisfy the conditions pT > 20 GeV and |η| < 2.3, and to be separated from the tagelectron or muon by ∆R > 0.5. The pT and η requirements are applied on the values recon-structed using the HPS algorithm. When an event contains two electron or muon candidatesthat pass the tight selection criteria and qualify as tags, the event is selected if it contains at

  • 26 7 Validation with data

    least one combination of tag and probe leptons that are separated by ∆R > 0.5. In this case, allcombinations of tag and probe leptons separated by ∆R > 0.5 are considered in the analysis.

    The contribution of W+jets and tt backgrounds is reduced by requiring selected events to passa requirement on the transverse mass of a tag electron or muon and EmissT , computed accordingto Eq. (3), respectively, of mT < 25 GeV or < 40 GeV. The contribution of the W+jets and ttbackground to the Z/γ∗ → ee event sample is further suppressed by requiring EmissT < 25 GeV.

    7.2 Systematic uncertainties

    Various imprecisely known or simulated effects can affect the level of agreement between dataand simulation, irrespective of τh reconstruction and identification.

    Electron and muon trigger, identification, and isolation efficiencies are measured using Z/γ∗ →ee and Z/γ∗ → µµ events via the “tag-and-probe” method [56] with a precision of 2% [45, 46].The uncertainty on the τh trigger efficiency is ≈3%.

    The jet energy scale (JES) is validated using γ+jets, Z+jets, and dijet events [51]. The uncertaintyin JES ranges from 1% to 10%, depending on pT and η of the jet. The effect of uncertainty inenergy resolution is found to be small and is not considered in this analysis. The efficiency forb jets to pass the medium WP of the CSV b tagging algorithm, and the mistag rates for light-quark and gluon jets are measured using tt and multijet events, and are in the ranges of 2–7%and 10–20%, respectively [54, 55].

    The uncertainty in the integrated luminosity is 2.6% [57].

    The theoretical uncertainties on the production cross sections are 5% for Z/γ∗ → `` (` = e, µ, τ)DY production, and 15% for the tt, diboson, and single top quark backgrounds. These includethe uncertainties in PDF, estimated following the recommendation in Refs. [58, 59], and on themodelling of parton showers and of the underlying event.

    The uncertainty in reweighting of simulated tt MC events, described in Section 3, is estimatedby changing the weights from their nominal values to the square of the nominal values and tono reweighting.

    The energy scale of electrons and muons is calibrated using J/Ψ → ``, Υ → ``, and Z → `` (`= e, µ) events, and is known to an uncertainty of 1% [45, 46].

    The EmissT scale and resolution is known to a few per cent uncertainty from studies performedin Z/γ∗ → µµ, Z/γ∗ → ee, and γ+jets events [37].

    7.3 Template fits

    The measurements of the τh identification efficiency, of the τh energy scale, and of the misiden-tification rates for electrons and muons are based on fitting the distribution of some observablein data with templates representing signal and background processes.

    The likelihood function L used in the fit is given by the product of Poisson probabilities toobserve ni events in each bin i of the distribution, given a number νi events expected fromsignal and background processes in that bin:

    L (µ, θ) = P (data|µ, θ) p(θ̃|θ) = ∏i

    νniini!

    exp(−νi) p(θ̃|θ). (10)

    The number of expected events depends on the parameter of interest (POI) µ that we wishto measure, such as the τh identification efficiency, the energy scale, or the misidentification

  • 27

    rates for electrons and muons, and on the values of “nuisance” parameters θ that represent thesystematic uncertainties discussed in the previous section.

    The function p(θ̃|θ) represents the probability to observe a value θ̃ in an auxiliary measure-ment of the nuisance parameter, given that the true value is θ. The nuisance parameters aretreated via the frequentist paradigm, as described in Refs. [60, 61]. Constraints on nuisance pa-rameters that affect the normalization, but not the shape of the distribution are represented bylog-normal probability density functions. Systematic uncertainties that affect the distributionas well as the normalization are incorporated into the likelihood fit via the technique detailedin Ref. [62] and constrained by Gaussian probability density functions.

    Statistical uncertainties on the templates are accounted for by introducing additional nuisanceparameters into the likelihood fit that provide uncorrelated single-bin fluctuations of the back-ground expectation, following the method described in Ref. [63].

    The value of µ that maximizes the likelihood function L in Eq. (10) is taken as the best-fitestimate for the parameter of interest, referred to as µobs. The uncertainty in the measuredvalue µobs is obtained by determining lower and upper bounds, µmin and µmax, for which thenegative logarithm of the likelihood function exceeds the maximum by half a unit:

    − lnL(

    µmin, θ̂µmin)= − lnL

    (µobs, θ̂µobs

    )+ 0.5, (11)

    and similarly for µmax. The nuisance parameters are profiled, that is, the values θ̂µmin and θ̂µmaxare chosen such that the likelihood function reaches its local maximum, subject to the constraintthat the POI value equals µmin and µmax, respectively.

    The best-fit value of the POI that we obtain from one measurement, e.g. of the τh identifica-tion efficiency, can depend on the POI of another measurement, e.g. of the τh energy scale.Correlations of this kind are taken into account in the template fits by using the other POI mea-surements as nuisance parameters in the fit, with an uncertainty of 6% for the τh identificationefficiency, 3% for the τh energy scale, 20% for the jet→ τh misidentification rate, and 30% forthe e → τh and µ → τh misidentification rates. The rate for e → τh and µ → τh instrumentalbackground in the MC simulation is corrected by the data-to-MC ratios given in Tables 6 and 7.

    8 Measurement of the τh identification efficiencyThe efficiency to reconstruct and identify τh decays is measured in Z/γ∗ → ττ → µτh andtt→ bbµτh events.

    8.1 Tau identification efficiency in Z/γ∗→ ττ eventsThe measurement of the τh identification efficiency in Z/γ∗ → ττ events is based on selectinga sample of Z/γ∗ → ττ → µτh events without applying any τh identification criteria anddetermining the number of τh decays passing and failing the τh identification discriminant.

    Following the event selection criteria described in Section 7.1.1, candidate events are requiredto pass a single-muon trigger, a higher pT threshold for the muon with pT > 25 GeV, and,instead of requiring the event to contain a τh candidate that passes the τh identification dis-criminants, a loose τh candidate selection is applied as follows. Reconstructed jets are requiredto satisfy the conditions pjetT > 20 GeV and |ηjet| < 2.3, to be separated from the muon by∆R > 0.5, and to contain at least one track with pT > 5 GeV. The track of highest pT withinthe jet is required to have a charge opposite to that of the muon, and to be compatible with

  • 28 8 Measurement of the τh identification efficiency

    originating from the same vertex. When more than one jet passes the τh candidate selectioncriteria, the jet with largest pT is used for this check.

    In addition, tight kinematic criteria are applied to reduce contributions from background pro-cesses. The mT criterion described in Section 7.1.1 is complemented by a requirement on atopological discriminant. The topological discriminant [64] is based on the projections:

    Pζ =(~pµT + ~p

    τhT + ~p

    missT

    )·~ζ

    |~ζ|and Pvisζ =

    (~pµT + ~p

    τhT

    )·~ζ

    |~ζ|(12)

    on the axis ~ζ, given by the bisector of the momenta in the transverse plane of the visible decayproducts of the two τ leptons. The discriminant utilizes the fact that the angle between theneutrinos produced in τ decays and the visible τ decay products is typically small, forcingthe ~pmissT vector in Z/γ

    ∗ → ττ events to point in the direction of ~pµT + ~pτhT , which is often not

    the case in W+jets and tt events. Selected events are required to satisfy the condition Pζ −1.85Pvisζ > −15 GeV. This reduces the sum of backgrounds passing the mT criterion by abouta factor two. Background from tt production is reduced by vetoing events that contain jetsof pT > 20 GeV and |η| < 2.5 that pass b tagging criteria. The background contributionsarising from WW, WZ, and ZZ production are suppressed by rejecting events that contain anelectron with pT > 15 GeV and |η| < 2.4 or a second muon with pT > 5 GeV and |η| < 2.4.The electrons and muons considered for this veto are required to pass loose identification andisolation criteria.

    The τh identification efficiency, eτ, is obtained through a simultaneous fit of the number ofZ/γ∗ → ττ events, Nτpass and Nτfail, with τh candidates passing (“pass” region) and failing(“fail” region) the τh identification discriminant. The fit is performed as described in Section 7.3.The τh identification efficiency is taken as the parameter of interest µ in the fit. The numberof Z/γ∗ → ττ events in the pass and fail regions as a function of µ are given by Nτpass =µ NZ/γ

    ∗→ττ and Nτfail = (1− µ) NZ/γ∗→ττ, respectively. The normalization of the Z/γ∗ → ττ

    signal in the sum of pass and fail regions, NZ/γ∗→ττ, as well as the templates for signal in

    both regions are obtained from the MC simulation. The systematic uncertainties discussed inSection 7.2 are represented by nuisance parameters in the fit. An additional nuisance parameterwith an uncertainty of 3% is included in the fit to account for the uncertainty in the energy scaleof the τh decays.

    Contributions from background processes, especially to the fail region, are sizeable. The dis-tributions for Z/γ∗ → µµ, W+jets, tt, single top quark, and diboson backgrounds are obtainedfrom MC simulation. The uncertainty in the yield of Z/γ∗ → µµ and diboson (tt and single topquark) backgrounds is increased to 30% (20%), to account for the uncertainty in the rate withwhich muons (light-quark and gluon jets) are misidentified as τh decays.

    The normalization of the W+jets background that is used as input to the fit is determined fromdata, using a control region defined by inverting the mT < 40 GeV selection and requiringmT > 70 GeV instead. The contributions of other backgrounds to this control region, referredto as high-mT sideband, are subtracted, based on MC predictions, before extrapolating theevent yield observed in the control region into the signal region. The extrapolation factor frommT > 70 GeV to mT < 40 GeV is obtained from MC simulation. The uncertainty in the W+jetsbackground in the signal region, arising from the statistical uncertainty in the event yield inthe control region, and from the uncertainty in the extrapolation factor, amounts to 15%, and isrepresented by a nuisance parameter in the fit.

    The normalization and distribution of the multijet background is estimated from data, using

  • 8.1 Tau identification efficiency in Z/γ∗ → ττ events 29

    events in which the muon and loose τh candidate have the same charge. The extrapolationfactor from the same-sign (SS) to the opposite-sign (OS) region is measured in events in whichthe muon fails the isolation criterion. The contributions from DY, tt, single top quark, anddiboson backgrounds to the OS and SS event samples with non-isolated muons, and to the SSevent sample with isolated muons, are subtracted according to MC expectation. The numberof W+jets events subtracted is determined using a control region in which the muon and the τhcandidate have the same charge, and mT > 70 GeV. The procedure provides an estimate of themultijet background in the signal region with an uncertainty of 10%.

    Two alternative observables are used to perform the fit: (i) mvis, the visible mass of the muonand the τh candidate, and (ii) Ntracks, the multiplicity of tracks within a cone of size ∆R <0.5 centred on the τh direction. The main results are obtained using mvis. Fits of the Ntracksdistribution are used to measure the τh identification efficiency as function of pT and η of theτh candidate, and also as function of Nvtx.

    Two other uncertainties are considered when Ntracks is used in the fit. The track reconstructionefficiency is measured with an uncertainty of 3.9% [21], and an uncertainty of 10% is attributedto the multiplicity of tracks associated with the τh candidates that are from jet→ τh misidenti-fications. The 10% represents the uncertainty on the multiplicity of charged hadrons producedin the hadronization of quarks and gluons into jets. The uncertainties in track reconstructionefficiency and hadronization affect the Ntracks distributions obtained from the MC simulation.We account for these uncertainties by producing Ntracks distributions with means shifted by±3.9% and ±10%. The shifted distributions are produced as follows: for a given event, we setNshiftedtracks = Ntracks. We then iterate over the collection of reconstructed tracks. For each track,we sample from a uniform distribution, and when the random number thus selected is be-low the magnitude of the shift (either 0.039 or 0.10) we reduce or increase Nshiftedtracks by one unit,depending on whether we have, respectively, a downward- or upward-shifted template.

    A closure test is performed using pseudo-data, given by the sum of MC simulated signaland background events and the multijet background obtained from data. Different pseudo-experiments are generated so as to be able to change signal yields and verify that the fit de-termines the τh identification efficiency without bias when the signal fraction differs from thenominal value.

    An uncertainty of 3.9% is added in quadrature to the uncertainty in eτ determined in the fit.The value of 3.9% represents the uncertainty to pass the loose τh candidate selections, and inparticular to reconstruct a track with pT > 5 GeV.

    The τh identification efficiencies measured in the data are quoted relative to the MC expectation.The results are given in Table 3. The data-to-MC ratios obtained using mvis and Ntracks arecompatible. All ratios are compatibl