PDFs at the LHC: Lessons from Run I and preparation for Run II
1. Parton Distributions at the LHC Run II
Lessons from Run I and preparation for Run II
Juan Rojo
STFC Rutherford Fellow
Rudolf Peierls Center for Theoretical Physics
University of Oxford
ATLAS Standard Model Workshop
LAPP, Annecy, 05/02/2015
Juan Rojo ATLAS SM Workshop, LAPP Annecy, 07/02/2015
3. Parton Distributions and LHC phenomenology
1) PDFs are a fundamental limit for Higgs boson characterization in terms of couplings
2) Very large PDF uncertainties (>100%) for new heavy particle production, e.g. in supersymmetric QCD
3) PDFs are the dominant systematic for precision measurements, like the W boson mass, that test the internal consistency of the Standard Model
4. PDF sets at the dawn of LHC Run II
NNPDF3.0 (arXiv:1410.8849)
  Data: DIS, fixed-target DY, jets, top quarks, LHC DY
  Theory: approximate NNLO for jets; APPLgrid/aMCfast/FastNLO for LHC data; EW corrections; FONLL for heavy quarks
  Methodology: closure testing validation; artificial neural nets; Monte Carlo replicas; Bayesian reweighting
MMHT14 (arXiv:1410.3989)
  Data: DIS, fixed-target DY, jets, top quarks, LHC DY
  Theory: no LHC jets at NNLO; APPLgrid/FastNLO; EW corrections; TR for heavy quarks; deuteron corrections
  Methodology: Hessian eigenvectors; more flexible parametrisation (Chebyshev polynomials); MC and Hessian reweighting; dynamic tolerance
CT14 (preliminary)
  Data: DIS, fixed-target DY, jets, LHC DY
  Theory: no LHC jets at NNLO; APPLgrid/FastNLO; estimate of scale variations; ACOT for heavy quarks
  Methodology: Hessian eigenvectors; more flexible parametrisation; MC and Hessian reweighting; fixed tolerance
ABM12 (arXiv:1310.3059)
  Data: DIS, fixed-target DY, LHC DY
  Theory: Fixed-Flavor Number scheme for DIS; VFN for LHC; Fitted
  Methodology: Hessian eigenvectors; no tolerance
HERAPDF2.0 (preliminary)
  Data: HERA-I and HERA-II
  Theory: different HQ schemes; RT default for HQ
  Methodology: Hessian eigenvectors; model and parametrisation uncertainties; MC representation also
5. PDF sets at the dawn of LHC Run II
LHC data included in each PDF set, by experiment:
NNPDF3.0 (arXiv:1410.8849)
  ATLAS: 2010 7 TeV incl jets; 2.76 TeV incl jets; 2010 W, Z rap dist, W pt; 2011 high-mass DY; 7 & 8 TeV ttbar xsecs
  CMS: 2011 7 TeV incl jets; 2011 W asy; 2011 7 TeV Drell-Yan; 2011 7 TeV W+charm; 7 & 8 TeV ttbar xsecs
  LHCb: 2010 W rap; 2011 Z rap
MMHT14 (arXiv:1410.3989)
  ATLAS: 2010 7 TeV incl jets; 2.76 TeV incl jets; 2010 W, Z rap dist; 2011 high-mass DY; 7 & 8 TeV ttbar xsecs
  CMS: 2011 7 TeV incl jets; 2011 W asy; 2011 7 TeV Drell-Yan; 2010 Z rap; 7 & 8 TeV ttbar xsecs
  LHCb: 2010 W rap; 2011 Z rap
CT14 (preliminary)
  ATLAS: TBD
  CMS: TBD
  LHCb: TBD
ABM12 (arXiv:1310.3059)
  ATLAS: 2010 W, Z rap dist
  CMS: 2011 W asy
  LHCb: 2010 W rap; 2011 Z rap
HERAPDF2.0 (preliminary)
  ATLAS: None
  CMS: None
  LHCb: None
6. PDF benchmarking
In 2012, an extensive benchmark comparison was performed between NNPDF2.3, CT10, MSTW08, ABM11 and HERAPDF1.5.
Reasonable agreement between the three global PDF sets, with some important exceptions, like the gluon in the Higgs region or the large-x quarks, relevant for searches.
ABM systematically different from global sets: softer large-x gluon, harder small-x quarks. Understood from the different treatment of heavy flavours. N.B. comparisons performed for a common value of αS(MZ).
HERAPDF1.5 affected by very large uncertainties due to the reduced dataset (no constraints from hadronic data).
2012 Benchmarks, arXiv:1211.5142
8. PDF comparisons made easy: APFEL-Web
http://apfel.mi.infn.it/
Comparing different PDF sets is now really easy thanks to the new APFEL-Web online PDF plotter.
Just log in, select the PDF sets that you want to compare and the plotting settings, and have fun!
Bertone, Carrazza, J.R. 13
9. Benchmarks Revisited
[Figures: gluon-gluon and quark-antiquark luminosities, plotted as ratios versus MX [GeV] at √S = 1.30e+04 GeV, for NNPDF30_nnlo_as_0118, MMHT2014_nnlo_as_0118 and CT10_nnlo_as_0118; generated with APFEL 3.0.0 Web]
These benchmarks are now being revisited in the context of the updated PDF4LHC recommendations for Run II.
Improved agreement between global sets for some crucial processes, like gg → Higgs, but still important differences, e.g. in large-x antiquarks, crucial for searches.
Also, the PDF4LHC prescription for the combined PDF+αS uncertainty has been simplified (addition in quadrature, common αS value for all sets).
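The addition in quadrature mentioned above can be sketched in a few lines. This is only an illustration: the function name and the numbers are hypothetical, and the αS uncertainty is taken here as half the difference between predictions with αS(MZ) shifted up and down, which is one common convention and an assumption of this sketch rather than something stated in the talk.

```python
import math

def combined_pdf_alphas_uncertainty(delta_pdf, sigma_as_up, sigma_as_down):
    """Combine PDF and alpha_s uncertainties in quadrature.

    delta_pdf     : PDF uncertainty on the cross section
    sigma_as_up   : cross section computed with alpha_s(MZ) shifted up
    sigma_as_down : cross section computed with alpha_s(MZ) shifted down
    All three inputs are in the same units.
    """
    # Assumed convention: symmetric alpha_s uncertainty from the up/down shifts.
    delta_as = 0.5 * abs(sigma_as_up - sigma_as_down)
    return math.sqrt(delta_pdf**2 + delta_as**2)

# Hypothetical numbers in pb, for illustration only.
total = combined_pdf_alphas_uncertainty(delta_pdf=3.0, sigma_as_up=104.0, sigma_as_down=96.0)
print(f"PDF+alpha_s uncertainty: {total:.1f} pb")
```

Because all sets share a common αS value, the same shifted-αS members can be used for every PDF set entering the comparison.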
10. ABM vs global PDF fits
NNPDF, arXiv:1303.1189
Thorne, arXiv:1402.3536
The predictions of the ABM sets are systematically different from those of the global PDF fits, even once the different value of αS(MZ) is accounted for.
These differences are understood to be due to the combination of a Fixed-Flavor Number scheme used to fit DIS data (not suitable at high Q2) and the use of a limited dataset without jet data.
NNPDF2.3 DIS+FFN reproduces the ABM11 predictions well, but with a worse quality of fit to HERA data.
12. PDF fits in ATLAS: pros and cons
Pros
Allows detailed validation of PDF-sensitive measurements before the data are released, in particular the covariance matrix and the correlated systematics.
Internal tests of the consistency of measurements in terms of PDF constraints: for instance, top data should constrain the gluon at large x, but not the quarks.
Useful comparisons of the predictions between different PDF sets, which provide guidance for the global fits, and motivation to include new data.
Essential to extract fundamental SM parameters from data, e.g. the strong coupling constant, top mass or W mass.
Develop and maintain solid PDF expertise in the collaboration, which in turn ensures that relevant PDF-sensitive measurements will be carried out in ATLAS.
Cons
Might delay the availability of the data for global PDF fitters.
One should be careful not to sell an interpretation paper (a QCD fit done within ATLAS, based on ATLAS data) as an actual ATLAS measurement.
Sometimes unclear when and how the theory calculations used in an ATLAS fit (applgrid, K-factors) will be available to the PDF fitters.
PDF fits based on ATLAS data are by no means exclusive to ATLAS; other groups can do them as well (e.g. NNPDF).
Could be dangerous to assume that fits based on HERA and ATLAS data only are competitive with global fits.
IMHO, it is really excellent that ATLAS performs PDF fits, and these activities should be encouraged within the collaboration, but the scope and limitations of such fits should also be clear.
13. The ATLAS strangeness determination
First PDF determination performed in ATLAS: excellent measurement and very nice QCD analysis, which illustrates the potential of the W and Z rapidity distributions to constrain strangeness.
What would be my interpretation of this analysis? A QCD analysis of HERA and ATLAS W,Z data shows that this measurement is sensitive to the strange PDF, with a preference for a symmetric strange sea.
And what would be an incorrect interpretation? ATLAS has measured the strange sea to be symmetric.
A PDF fit within ATLAS is very useful, but one should never forget that in a global fit different datasets pull in different directions, and in addition that different PDF fitting methodologies might change the conclusion.
The NNPDF2.3 paper showed that a fit with only ATLAS and HERA data has very large uncertainties, and that it is possible to fit ATLAS W,Z and neutrino data at the same time.
[Figure: NNPDF re-analysis based on identical dataset]
14. Towards an ATLAS global fit
The wealth of PDF-sensitive measurements from ATLAS suggests that an ATLAS global fit might be feasible: starting with the usual backbone of HERA data, ATLAS data can be used for quark flavour separation (W, Z, low and high-mass Drell-Yan, …), gluons (jets, photons, top quarks, Z and W pT), and even for the photon PDF.
Useful to test the internal consistency of all the ATLAS datasets and their cross-correlations, which is important information for global PDF fits. E.g. do jet and top data pull in the same direction for the gluon? Do all DY datasets pull in the same direction for strangeness?
With available data, such an ATLAS global fit is still far from being competitive with global fits, as the NNPDF3.0 HERA+ATLAS fit shows.
But with more data, and including Run II, the scenario could be different …
[Figure: “ATLAS Global fit” based on NNPDF3.0]
15. Stress Tests of QCD Theory with PDF fits
In addition to quantifying the constraints on PDFs of new ATLAS data, many interesting PDF studies could be performed with the aim of testing advanced QCD theory at the LHC. Some examples:
Determination of the strong coupling constant and its running in the TeV range
Validate higher order calculations, e.g. NNLO for jets and top pair differential distributions. Is the fit quality improved?
High-energy resummation in a joint fit of ATLAS and HERA data: deviations from DGLAP?
Threshold resummation, and impact on Higgs production?
Determination of the intrinsic charm content of the proton
PDF fits with parton shower effects, specific PDFs for NLO event generators
aMCfast
To maximise the physics output of PDF fits in ATLAS, it is essential to optimise the interface between HERAfitter (used for PDF fits in ATLAS) and the relevant theory codes that implement some of the above features.
16. Some additional questions …
Q1 Should ATLAS PDF fits also include data from other experiments, and become a global fit?
No, ATLAS PDF fits should be restricted to the interpretation of their own data only, with the usual HERA backbone. Individuals can pursue more global fits, but these should take place outside ATLAS.
Q2 Should ATLAS, given that it is an experimental collaboration, stop doing PDF fits, which are interpretation papers?
No, interpretation papers can and should be pursued within ATLAS, just as is done for Higgs couplings or BSM searches; there is no reason not to do it for SM measurements!
Q3 Should ATLAS PDF fits estimate model and parametrisation variations?
Yes, this is really essential due to the reduced dataset. Otherwise, the quoted PDF uncertainties would be completely unrealistic, e.g. the Hessian HERAPDF2.0 uncertainties are similar to those of global fits …
Q4 Should ATLAS not pursue some measurements (and their interpretation) in cases where theory is not precise enough, like Z+1 jet or inclusive jets, where NNLO effects are important?
No, ATLAS should do the measurements of these important processes as accurately as possible, and then it is up to us theorists to keep pace. Perhaps the Z high-pT data can be used in PDF fits only 3 years from now, but the data will already be available.
Q5 Should we worry about TMD PDFs or TMD factorisation for precision measurements at Run II?
No. Given the current state of the art of TMD fits, tuning Monte Carlo generators is probably enough; there is no obvious improvement if a different theory framework, like TMDs, is used.
18. The PDF fitter’s wish-list for ATLAS
Inclusive jets
  Run I: 2011 7 TeV published but data not available; 2012 8 TeV not yet
  Run II: Explore. Constraints on large-x gluon and quarks
Dijets
  Run I: 2011 7 TeV published but data not available; 2012 8 TeV not yet
  Run II: Explore. Constraints on large-x gluon and quarks. Test NLO and NNLO QCD calculations
Inclusive photons
  Run I: 2011 7 TeV published (no covariance matrix); 2012 8 TeV not yet
  Run II: Explore. Constraints on medium-x gluon. Ratios to reduce scale dependence?
Top quark production
  Run I: 2011 7 TeV differential distributions available; 2012 8 TeV not yet
  Run II: Differential distributions competitive with jets? Need fast interface to NNLO calculation. Constrain gluon at large x
W,Z + jets
  Run I: 2011 7 TeV available; 2012 8 TeV not yet
  Run II: Explore. Impact on gluon PDF. Ratios of W and Z for quark flavour separation
Inclusive W,Z
  Run I: 2010 rapidity distributions available; 2011 rapidity distributions should follow soon
  Run II: Constrain quark flavour separation and strangeness. Limited by systematics?
High-mass DY
  Run I: 2011 7 TeV available
  Run II: Explore. Antiquarks at large x. Test NNLO QCD and NLO EW calculations
All these processes are of course very important for PDF studies, and should be pursued with high priority at Run II. Let me concentrate here on some less obvious possibilities …
19. Cross section Ratios between 7, 8 and 14 TeV
The staged increase of the LHC beam energy provides a new class of interesting observables: cross section ratios for different beam energies.
These ratios can be computed with very high precision due to the large degree of correlation of theoretical uncertainties at different energies.
Experimentally these ratios can also be measured accurately since many systematics, like luminosity or jet energy scale, cancel partially in the ratios.
These ratios allow stringent precision tests of the SM, like PDF discrimination.
Mangano, J. R., 12
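The partial cancellation of correlated uncertainties in a ratio can be illustrated with standard linear error propagation. This is a minimal sketch, not the procedure of the cited paper: the function name, the 4% relative uncertainties and the correlation values are hypothetical.

```python
import math

def ratio_relative_uncertainty(rel_a, rel_b, rho):
    """Relative uncertainty on R = sigma_a / sigma_b from linear error propagation.

    rel_a, rel_b : relative uncertainties on numerator and denominator
    rho          : correlation coefficient between the two (between -1 and 1)
    """
    variance = rel_a**2 + rel_b**2 - 2.0 * rho * rel_a * rel_b
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))

# Hypothetical 4% uncertainty on each cross section, for several correlations.
for rho in (0.0, 0.9, 1.0):
    print(f"rho = {rho:.1f}: relative uncertainty on ratio = "
          f"{ratio_relative_uncertainty(0.04, 0.04, rho):.4f}")
```

With uncorrelated inputs the ratio is less precise than either cross section, while for strongly correlated inputs most of the uncertainty drops out, which is the behaviour the ratios between beam energies exploit.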
20. Cross section Ratios between 7, 8 and 14 TeV
Juan Rojo NIKHEF Theory Seminar, Amsterdam, 22/01/2015
ATLAS: Gluon from 7 TeV / 2.76 TeV jet xsecs
CMS: Drell-Yan 8 TeV / 7 TeV ratio
21. Cross section Ratios between 7, 8 and 14 TeV
Czakon, Mitov, Mangano, J. R., 13
Top quark pair production: excellent PDF discrimination
[Figure: ratio σ(tt, 8 TeV)/σ(tt, 7 TeV) as a function of αS(MZ), LHC 8 over 7 TeV, comparing ATLAS with ABM11, CT10, HERAPDF, MSTW2008 and NNPDF2.3]
Effects like the value of the top mass largely cancel in the ratio.
Cross section Ratios between 8 and 13 TeV!
The ratios of cross-sections, inclusive or differential, between 13 TeV and 8 TeV, should be one of the
highest priorities in the first months of Run II data taking!
Allows to perform precision physics even if some systematics, like luminosity, are still large, since they
can be cancelled, for instance forming suitable double ratios, e.g., for top production!
!
!
!
Should be a useful ingredient for global PDF fits!!
!
!
!
!
!
!
!
!
Juan Rojo NIKHEF Theory Seminar, Amsterdam, 22/01/2015
Mangano, J. R., 12
23. Heavy Quark production
By now it is well established that top quark production can be used to constrain the large-x gluon PDF.
Thanks to the NNLO calculation, differential data can now also be used in fits.
However, it is less appreciated that charm and bottom production can be used to constrain the gluon, this time at small x. And the top NNLO calculation can also be applied here.
LHCb data are the most sensitive, but ATLAS should also provide a competitive measurement. Important to explore at Run II. Provide the full covariance matrix and the breakdown of systematics (not currently available).
[Figure: gluon PDF g(x, Q2 = 4 GeV2) versus x, from 10^-6 to 10^-1, NNPDF3.0 NLO with αs = 0.118: NNPDF3.0, NNPDF3.0 + LHCb D0 data (wgt), NNPDF3.0 + LHCb D0 data (unw)]
Gauld, J. R., Rottoli, Sarkar, Talbert, in prep; similar studies by the HERAfitter group
24. Jets in NNLO global fits
The recent calculation of the gluon-gluon channel NNLO jet cross sections is an important milestone towards the exact inclusion of jet data in NNLO PDF fits.
Large O(20-25%) enhancements wrt NLO results if the scale used is μ=pT,leading (pT of the leading jet).
Perturbative convergence is improved if the jet μ=pT is used instead: smaller K-factors.
On the other hand, the gg channel is small at medium and large pT at LHC energies.
Juan Rojo SMatLHC14, Madrid, 09/04/2014
Until the full NNLO result becomes available, approximate NNLO results can be derived from the improved threshold calculation: a reasonable approximation to the exact result at large pT and in the central region, which breaks down at small pT.
Assume the same K-factor holds for the other partonic channels.
Carrazza and Pires, arXiv:1407.7031
NNLO threshold: De Florian et al, arXiv:1310.7192
25. Jets in NNLO global fits
We can therefore compute approximate NNLO K-factors using the threshold approximation.
Comparison with the exact gg NNLO result can determine for which values of jet pT and η the NNLOthres calculation can be trusted.
At NNLO, the numerical value of χ2 depends sizeably on its definition, “experimental” vs “t0”.
It will be very interesting to revisit these issues for the ATLAS 2011 jet data.
[Figure: percentage difference between exact and approximate gg NNLO]
[Table: jet data included in NNPDF3.0 at NNLO, with columns χ2 NLO (exp), χ2 NLO (t0), χ2 NNLO (exp), χ2 NNLO (t0)]
At medium pT and non-central rapidities, most of the ATLAS data are cut from the NNLO NNPDF3.0 fit. Should be better with 2011 jets.
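The sensitivity of χ2 to its definition can be illustrated with a toy case. The sketch below assumes the usual distinction between the two definitions: in the “experimental” one, multiplicative (relative) systematics are scaled by the measured data, while in the “t0” one they are scaled by the predictions of a fixed reference fit. It is restricted to uncorrelated statistical errors plus a single fully correlated normalisation uncertainty, so that the covariance matrix can be inverted analytically with the Sherman-Morrison formula. All function names and numbers are hypothetical.

```python
def chi2_with_normalisation(data, theory, stat, rel_norm, reference):
    """chi2 for C_ij = delta_ij stat_i^2 + (rel_norm*ref_i)(rel_norm*ref_j).

    The rank-one update is inverted with the Sherman-Morrison formula:
    r^T C^-1 r = a - b^2 / (1 + c).
    """
    resid = [d - t for d, t in zip(data, theory)]
    shift = [rel_norm * x for x in reference]
    a = sum(r * r / s**2 for r, s in zip(resid, stat))
    b = sum(r * v / s**2 for r, v, s in zip(resid, shift, stat))
    c = sum(v * v / s**2 for v, s in zip(shift, stat))
    return a - b * b / (1.0 + c)

def chi2_experimental(data, theory, stat, rel_norm):
    """Normalisation uncertainty scaled by the measured data."""
    return chi2_with_normalisation(data, theory, stat, rel_norm, reference=data)

def chi2_t0(data, theory, stat, rel_norm, t0_theory):
    """Normalisation uncertainty scaled by a fixed reference prediction."""
    return chi2_with_normalisation(data, theory, stat, rel_norm, reference=t0_theory)

# Hypothetical three-bin measurement with a 10% normalisation uncertainty.
data = [10.0, 5.0, 2.0]
theory = [8.5, 4.4, 1.8]
stat = [0.5, 0.3, 0.2]
print("chi2 (exp):", chi2_experimental(data, theory, stat, 0.10))
print("chi2 (t0): ", chi2_t0(data, theory, stat, 0.10, t0_theory=theory))
```

The two numbers differ whenever data and reference predictions differ, even though the same data, theory and uncertainties enter both.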
26. PDF fits at NLO+PS accuracy
NLO+PS is the current standard for LHC event simulation, and improves in many directions over fixed-order NLO results: improved perturbative behaviour, direct relation with measured quantities, less need for kinematic cuts …
Using NLO+PS calculations in global PDF fits should have many important applications, like for the W mass among others, and is now technically possible thanks to aMCfast, the fast interface to MadGraph5_aMC@NLO based on the applgrid library.
aMCfast: Bertone, Frixione, Frederix, J.R., Sutton, arXiv:1406.7693 (for NLO), NLO+PS in preparation
One crucial aspect to explore is the role of the PDF used by the MC shower, since this is fixed even in the fast NLO+PS grid.
Quite small effect in most observables, except in extreme kinematics like forward rapidities.
28. PDF fits at NLO+PS accuracy
With such tools, one could directly include the Z pT distribution in global fits, which is not possible at fixed-order NLO.
29. PDFs with QED corrections
QED and electroweak corrections are essential for precision LHC phenomenology: W and Z production, W mass determination, WW boson pair production, TeV-scale jet and top quark pair production, searches for new W’, Z’ bosons.
Consistent inclusion of electroweak effects requires PDFs with QED corrections and a photon PDF.
NNPDF2.3 QED: first-ever determination of the photon PDF from LHC data.
Neglecting photon-initiated contributions leads to systematically underestimating theory errors in crucial BSM search channels.
[Figure, high-mass Drell-Yan: γ*/Z production @ LHC 8 TeV, dσ/dM (fb/GeV) relative to a reference versus Mll (GeV), for NNPDF2.3 QED (qq Born), NNPDF2.3 QED (full O(α) QED) and MRST04 QED (full O(α) QED)]
[Figure, high-mass WW production: WW production @ LHC 8 TeV, σ (fb) versus the cut on MWW (GeV), for NNPDF2.3 QED and MRST04 QED]
NNPDF 13
30. Pinning down the photon PDF
At Run II, some interesting EWK measurements have been proposed to constrain the photon PDF.
For instance, the triple-differential measurement of Drell-Yan cross-sections in invariant mass, rapidity and lepton transverse momentum allows photon PDF effects to be neatly disentangled from other EWK effects.
Boughezal et al, arXiv:1312.3972
The measurement of WW production at low transverse momentum and high invariant mass should also provide direct constraints on the photon PDF.
31. PDF uncertainties at the LHC made easy
[Figure: xg(x,Q) comparison between NNPDF3.0 NLO and NNPDF3.0 Hessian at Q = 1.00e+00 GeV, for x between 10^-5 and 1; generated with APFEL 3.0.0 Web]
32. Compressed Monte Carlo PDFs
Motivation: provide a practical implementation of the PDF4LHC recommendation, easy to use by the experiments and computationally less intensive than the original prescription.
Having a single combined PDF set (even with a large number of eigenvectors/replicas) would already be useful, since widely-used tools like MadGraph5_aMC@NLO, POWHEG or FEWZ provide the PDF uncertainties without any additional cost.
But this is not true for all theory tools used at the LHC, so there is still good motivation for a combined PDF set with a small number of eigenvectors/replicas.
Two approaches: META-PDFs (Gao and Nadolsky) and Compressed MC PDFs (CMC-PDFs).
CMC-PDFs are based on the Monte Carlo statistical combination of different PDF sets, followed by a compression algorithm to end up with a reduced number of replicas.
The Monte Carlo combination has a robust statistical interpretation, and in many cases leads to similar results, with somewhat smaller uncertainties, compared to the original PDF4LHC envelope.
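The difference between a Monte Carlo combination and an envelope can be seen in a toy example. This is a minimal sketch under stated assumptions: each set is represented by a short list of replica values of one PDF at fixed (x, Q), equal numbers of replicas are pooled so that each set carries equal weight, and the envelope is taken as the span of the one-sigma bands of the individual sets. The replica values and function names are hypothetical.

```python
import math

def mean_and_std(values):
    """Mean and sample standard deviation of a list of replica values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)

def mc_combination(replica_sets):
    """Pool the replicas of all sets; equal set sizes mean equal weights."""
    pooled = [value for replicas in replica_sets for value in replicas]
    return mean_and_std(pooled)

def envelope(replica_sets):
    """Midpoint and half-width of the envelope of the one-sigma bands."""
    stats = [mean_and_std(replicas) for replicas in replica_sets]
    upper = max(m + s for m, s in stats)
    lower = min(m - s for m, s in stats)
    return 0.5 * (upper + lower), 0.5 * (upper - lower)

# Hypothetical replica values of one PDF at fixed (x, Q), three replicas per set.
sets = [[0.9, 1.0, 1.1], [1.0, 1.1, 1.2]]
print("MC combination (mean, std):      ", mc_combination(sets))
print("Envelope (midpoint, half-width): ", envelope(sets))
```

In this toy case the pooled standard deviation comes out smaller than the envelope half-width, in line with the statement above that the MC combination tends to give somewhat smaller uncertainties.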
33. Basic strategy
The MC combination of PDF sets is easy, but the number of MC replicas is still too large.
Compress the original probability distribution to one with a smaller number of replicas, in such a way that all the relevant estimators (mean, variances, correlations, etc.) for the PDFs are reproduced.
The compression is applied at Q = 2 GeV, though the results are robust wrt other choices.
There are various options for defining the error function to be minimised, e.g. to reproduce central values add a term [equation].
The same is done for variances, correlations and higher moments.
The algorithm also minimises the Kolmogorov distance between the original and compressed distributions.
At the end, the optimal choice is decided by the resulting phenomenology.
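The compression strategy can be sketched on a toy prior. This is not the CMC-PDF implementation: the error function below only includes the mean and standard-deviation terms, normalised to the prior standard deviation (an assumed normalisation), and a simple random-swap search stands in for the minimiser, which the slides do not specify. The prior is a set of hypothetical Gaussian replicas on a small grid.

```python
import math
import random

def moments(replicas):
    """Mean and standard deviation at each grid point over a list of replicas."""
    n = len(replicas)
    means, stds = [], []
    for column in zip(*replicas):
        mean = sum(column) / n
        means.append(mean)
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in column) / n))
    return means, stds

def error_function(prior, subset):
    """Squared differences of means and stds, in units of the prior std."""
    mean_p, std_p = moments(prior)
    mean_c, std_c = moments([prior[i] for i in subset])
    return sum(((mc - mp) / sp) ** 2 + ((sc - sp) / sp) ** 2
               for mp, sp, mc, sc in zip(mean_p, std_p, mean_c, std_c))

def compress(prior, n_compressed, n_iterations=500, seed=0):
    """Random-swap search: swap one replica at a time, keep swaps that lower the error."""
    rng = random.Random(seed)
    subset = rng.sample(range(len(prior)), n_compressed)
    start = best = error_function(prior, subset)
    for _ in range(n_iterations):
        trial = list(subset)
        outside = [i for i in range(len(prior)) if i not in subset]
        trial[rng.randrange(n_compressed)] = rng.choice(outside)
        value = error_function(prior, trial)
        if value < best:
            subset, best = trial, value
    return sorted(subset), start, best

def toy_prior(n_replicas, n_points, seed=1):
    """Hypothetical prior: Gaussian fluctuations around 1 at each grid point."""
    rng = random.Random(seed)
    return [[rng.gauss(1.0, 0.1) for _ in range(n_points)] for _ in range(n_replicas)]

prior = toy_prior(n_replicas=60, n_points=4)
subset, start, best = compress(prior, n_compressed=10)
print(f"error function: random start {start:.3f} -> compressed {best:.3f}")
```

Terms for correlations, higher moments and the Kolmogorov distance would be added to `error_function` in the same way, each as an extra contribution to the sum.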
34. Gaussian vs non-Gaussian
Even if the original PDF sets in the combination are approximately Gaussian, their combination will in general be non-Gaussian, and linear propagation might not be adequate.
Working in the Gaussian approximation might not be reliable: e.g. skewness is not reproduced in the compression (even though central values and variances are) unless we explicitly include it in the minimised figure of merit.
Juan Rojo PDF4LHC pre-Meeting, 16/01/2014
[Figure panels: skewness included; skewness excluded]
35. LHC Phenomenology
The ultimate validation is of course to check that the compressed set reproduces the original combination for a wide variety of observables.
We have tested a very large number of processes, both at the inclusive and differential level, and always found that Nrep=20-30 replicas are enough for phenomenology.
Compression also works for fully differential distributions.
Tested on a large number of processes: jets, Drell-Yan, WW, W+charm, Z+jets, …
Calculations use fast NLO interfaces:
1. aMCfast/applgrid for MadGraph5_aMC@NLO
2. applgrid for MCFM/NLOjet++
Very flexible to redo the validation for any other compressed set.
CMC-PDFs: Carrazza, Latorre, J.R., Watt, in prep
36. Correlations
Agreement between original and CMC-PDFs also holds at the level of cross-section correlations.
Not an accident: selecting replicas at random fails to reproduce the correlations accurately enough.
[Figure: correlation coefficient for ttbar, Nrep = 40, comparing reference, compressed and random (68% CL) sets]
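For a Monte Carlo PDF set, the correlation coefficient between two observables can be estimated directly from their values on each replica. The sketch below uses the standard estimator; the function name and the replica values are hypothetical, and it is not taken from the CMC-PDF code.

```python
import math

def mc_correlation(obs_a, obs_b):
    """Correlation coefficient between two observables evaluated on the same replicas."""
    n = len(obs_a)
    mean_a = sum(obs_a) / n
    mean_b = sum(obs_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(obs_a, obs_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in obs_a) / n
    var_b = sum((b - mean_b) ** 2 for b in obs_b) / n
    return cov / math.sqrt(var_a * var_b)

# Hypothetical cross sections (in pb) for two processes on five replicas.
sigma_1 = [172.0, 176.5, 170.2, 174.8, 173.1]
sigma_2 = [19.1, 19.6, 18.9, 19.5, 19.2]
print(f"correlation coefficient: {mc_correlation(sigma_1, sigma_2):.3f}")
```

Evaluating the same estimator on the original and on the compressed replicas gives the kind of comparison shown in the figure above.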
37. MC2Hessian
For many important applications, a linearised Hessian version of Monte Carlo sets, based on orthogonal eigenvectors, would be useful.
A MC2Hessian algorithm would allow NNPDF and CMC-PDF sets to be used for profiling, as nuisance parameters, or to construct sets with a reduced number of eigenvectors for specific applications like the W mass …
… while also keeping the crucial possibility of testing for potential deviations from Gaussianity of the underlying probability distribution, quantifying the range of validity of the linear approximation.
Various options are possible, for example à la Meta-PDF: fit a functional form to each of the MC replicas.
But this is subject to the usual functional bias.
A more robust choice is to also use the MC replicas as the linear expansion basis: [equation]
The eigenvectors can then be determined from a χ2 defined in the space of PDFs: [equation]
Covariance matrix in the space of PDFs
38. MC2Hessian - Preliminary results
Preliminary results validate this strategy: it is possible to efficiently construct a Hessian representation of NNPDF3.0, or in general of any MC PDF set.
In particular, the CMC-PDFs will also be available in a Hessian representation.
[Figures: xg(x,Q) comparison between NNPDF3.0 NLO and NNPDF3.0 Hessian at Q = 1.00e+00 GeV, for x between 10^-5 and 1; generated with APFEL 3.0.0 Web]
The comparison between the original MC representation of a PDF set and its Hessian representation allows the range of validity of the latter to be determined.
For instance, in important cases like BSM searches at high mass, it is known that the Gaussian approximation is not adequate.
40. Summary and discussion
ATLAS has already provided a number of high-quality measurements for PDF fits, which provide important constraints in global PDF analyses …
… with many more still to come from Run I, and many new opportunities will open at Run II!
PDF fits within ATLAS are highly valuable: they validate crucial aspects of the measurements like the correlated systematics, test new theory calculations and tools, help understand the PDF impact of measurements and provide guidance for global fits …
… but it should also be clear that their scope is limited, and they are not meant to be an alternative to global fits. Variants of NNPDF3.0 already explore the potential of an ATLAS global fit.
Many interesting physics opportunities, not only in terms of PDF constraints from ATLAS data, but also for tests of QCD theory at the LHC.
We are making the estimation of PDF uncertainties at the LHC Run II easier: combining different PDF sets into a single compressed set, using a Hessian representation of MC sets, PDF sets with combined PDF+αS uncertainties, … If you have additional requests to make your life easier, just let us know!
Thanks a lot for your attention!
and now time for discussion…
43. Comparison with Meta-PDFs
To compare with the available Meta-PDFs in LHAPDF6, we have produced compressed sets based on MSTW08, CT10 and NNPDF2.3.
Reasonable agreement found for central values and variances, except perhaps at small and large x.
Need to redo the comparison when the two approaches use NNPDF3.0, MMHT14 and CT14.
44. 44
The combined PDF set!
Since there is reasonable agreement between CT10, MMHT14 and NNPDF3.0, the resulting
combined distribution is in general Gaussian, but there are also important cases where the non-
gaussianity of the combined PDFs is substantial
x
5
10 4
10
3
10 2
10 1
10
1
0
1
2
3
4
5
6
Q = 1.4142 GeV
NNPDF30_nnlo_as_0118
MMHT2014nnlo68cl_rand1002
CT10nnlo_rand1004
MCcompPDFnnlo
Q = 1.4142 GeV
x * PDF
0.76 0.78 0.8 0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96 0.98
Probabilityperbin
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Gluon PDF, x=0.1, Q=100 GeV
histo1
Entries 100
Mean 0.8887
RMS 0.0151
NNPDF3.0
CT10
MMHT14
MCcompPDFs
Gluon PDF, x=0.1, Q=100 GeV
!
It is possible to have smoother distributions by increasing the number of replicas for each set, but
this does not seem to be required by phenomenology
For typical applications, using Nrep=100 for each of the three PDF sets is enough
Note that in general, the combination of Gaussian distributions is not a Gaussian itself
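A minimal sketch of why a combination of Gaussians need not be Gaussian, not taken from the slides: the exact skewness and excess kurtosis of an equal-weight mixture are computed from the component means and widths. The function name `mixture_shape` and all numerical inputs are hypothetical illustrations, not values from any PDF set.

```python
def mixture_shape(components):
    """Skewness and excess kurtosis of an equal-weight Gaussian mixture.

    components: list of (mean, sigma) pairs, one per contributing set.
    Both quantities vanish for a single Gaussian.
    """
    w = 1.0 / len(components)
    mean = sum(w * mu for mu, _ in components)
    m2 = m3 = m4 = 0.0
    for mu, sigma in components:
        d = mu - mean  # shift of this component from the mixture mean
        m2 += w * (sigma**2 + d**2)
        m3 += w * (d**3 + 3 * d * sigma**2)
        m4 += w * (d**4 + 6 * d**2 * sigma**2 + 3 * sigma**4)
    skewness = m3 / m2**1.5
    excess_kurtosis = m4 / m2**2 - 3.0
    return skewness, excess_kurtosis


if __name__ == "__main__":
    # Hypothetical inputs: three sets that agree well, then three that do not.
    print(mixture_shape([(0.885, 0.015), (0.890, 0.015), (0.895, 0.015)]))
    print(mixture_shape([(0.85, 0.010), (0.89, 0.020), (0.93, 0.010)]))
```

When the component means are close compared to their widths, both numbers stay near zero, which is the "in general Gaussian" case; well-separated means or very different widths push them away from zero.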
45. Compression as a mathematical problem
The goal now is to compress the combined set of Nrep=300 replicas to a smaller subset, in such a way
that this subset reproduces the statistical properties of the original distribution
Mathematically, this is a well-defined problem:
compression is finding the subset that minimises
the distance between two probability
distributions
Many equally good minimisations possible,
so choice of minimisation algorithm not crucial
(similar to the travelling salesman problem)
Mathematically well-posed problem, with a
number of robust solutions:
1. Kolmogorov distance
2. Kullback-Leibler entropy
3. …
Optimal choice determined by the requirements
of the problem at hand, in this case LHC
phenomenology
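The idea can be sketched in a few lines, under loud assumptions: each replica is reduced to a single number (a stand-in for a PDF value at one x and Q), the distance is the Kolmogorov distance between empirical distributions, and the minimisation is a plain random search swapped in for simplicity. The names `kolmogorov_distance` and `compress` are hypothetical, and this is not the error function or the minimiser used for the results in this talk.

```python
import bisect
import random


def kolmogorov_distance(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    dist = 0.0
    for x in a + b:  # the gap can only change at data points
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        dist = max(dist, abs(fa - fb))
    return dist


def compress(values, n_keep, n_trials=500, seed=0):
    """Random search for the subset of replicas closest to the full set."""
    rng = random.Random(seed)
    best_idx, best_dist = None, float("inf")
    for _ in range(n_trials):
        idx = rng.sample(range(len(values)), n_keep)
        dist = kolmogorov_distance([values[i] for i in idx], values)
        if dist < best_dist:
            best_idx, best_dist = sorted(idx), dist
    return best_idx, best_dist


if __name__ == "__main__":
    toy = random.Random(1)
    prior = [toy.gauss(0.89, 0.015) for _ in range(300)]  # hypothetical replicas
    chosen, distance = compress(prior, 40)
    print(len(chosen), round(distance, 4))
```

Since many subsets are nearly equally good, a more refined minimiser mainly changes how quickly a good subset is found, not whether one exists.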
46. Results of the compression
To gauge improvements due to compression, compare various contributions to the error function in
the best compression and in random selections with the same number of replicas
Substantial improvements as
compared to random
compressions, typically by one
order of magnitude or more
Compression is also able to
successfully reproduce higher
moments like skewness or
kurtosis
Similar improvements for the
correlations and the Kolmogorov
distances
Horizontal dashed line:
lower limit of the 68% CL range for random
compressions with Nrep=100
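A rough sketch of how such a random baseline could be built, assuming a single illustrative contribution to the error function (the squared relative deviation of the standard deviation) and taking the 68% CL range as the 16th to 84th percentiles over random selections. The names `variance_error` and `random_baseline` are hypothetical, and the actual error function has more contributions than this.

```python
import random
import statistics


def variance_error(subset, prior_sigma):
    """Squared relative deviation of the subset standard deviation (illustrative)."""
    return (statistics.pstdev(subset) / prior_sigma - 1.0) ** 2


def random_baseline(prior, n_keep, n_random=500, seed=0):
    """16th, 50th and 84th percentiles of the error over random selections."""
    rng = random.Random(seed)
    prior_sigma = statistics.pstdev(prior)
    errors = sorted(
        variance_error(rng.sample(prior, n_keep), prior_sigma)
        for _ in range(n_random)
    )
    lower = errors[int(0.16 * n_random)]
    median = errors[n_random // 2]
    upper = errors[int(0.84 * n_random)]
    return lower, median, upper
```

A compressed set is then judged by whether its own error sits below `lower`, the analogue of the horizontal dashed line.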
47. Results of the compression
Juan Rojo PDF4LHC Meeting, 03/11/2014
For example, for Nrep=40 replicas the compressed and the original PDFs are virtually identical
[Figure, four panels at Q = 100 GeV, each showing the ratio to the prior (300 MC replicas) versus x: gluon and up quark for the compressed set with 40 MC replicas, up quark and gluon for the compressed set with 10 MC replicas]
As expected, for a very small number of replicas (10 in this case) agreement is much worse
48. The compressed PDF set
Since there is reasonable agreement between CT10, MMHT14 and NNPDF3.0, the resulting
combined distribution typically looks Gaussian
On average, the same number of replicas from each of the three sets is selected in the compressed
set, a further demonstration that the algorithm is unbiased
[Figure: CMC-PDF NLO, 25 replica distribution. Entries versus replica index (0 to 300) in the combined set; replicas selected per set: NNPDF3.0 9, CT10 8, MMHT14 8]
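A small sketch of this bookkeeping, assuming the combined prior is ordered as 100 NNPDF3.0 replicas, then 100 CT10, then 100 MMHT14 (an ordering suggested by the plot labels, not stated explicitly). The names `count_by_set` and `unbiased_expectation` are hypothetical; the expectation uses the hypergeometric mean and spread for an unbiased draw without replacement.

```python
import math
from collections import Counter

SET_NAMES = ("NNPDF3.0", "CT10", "MMHT14")  # assumed order in the combined prior


def count_by_set(selected, n_per_set=100):
    """Number of selected replica indices that come from each parent set."""
    counts = Counter(SET_NAMES[i // n_per_set] for i in selected)
    return {name: counts.get(name, 0) for name in SET_NAMES}


def unbiased_expectation(n_keep, n_per_set=100, n_sets=3):
    """Mean and spread of the per-set count for an unbiased random draw."""
    n_total = n_per_set * n_sets
    p = n_per_set / n_total
    mean = n_keep * p
    var = n_keep * p * (1 - p) * (n_total - n_keep) / (n_total - 1)
    return mean, math.sqrt(var)
```

For 25 replicas this gives an expected count of about 8.3 per set with a spread of about 2.3, so a 9/8/8 split is well within what an unbiased selection would produce.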
49. Compressing native MC sets
The compression algorithm can of course also be used on native MC sets, like NNPDF. We have
shown that starting from NNPDF3.0 with Nrep=1000 replicas we can compress down to 40-50 replicas
while maintaining all relevant statistical properties
Central values and variances are well reproduced and, non-trivially, so are higher moments and
correlations
Sets with Nrep=1000 replicas are still useful for other applications, like Bayesian reweighting
All plots done with the APFEL Web plotter
50. Phenomenology
The ultimate validation is of course to check that the compressed set reproduces the original PDF
combination for a wide variety of LHC observables
We have tested a very large number of processes, both at the inclusive and differential level, and
always found that Nrep=20-30 replicas are enough for phenomenology
NNLO cross-sections:
gg->H with ggHiggs
tt with top++
W,Z with Vrap
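A minimal sketch of such a check for one observable, assuming the usual Monte Carlo prescription: the central value is the mean over replicas and the PDF uncertainty is their standard deviation. The names `mc_prediction` and `relative_shift` are hypothetical, and the per-replica cross sections would come from the codes listed above, which are not called here.

```python
import statistics


def mc_prediction(values):
    """Central value and 1-sigma PDF uncertainty from Monte Carlo replicas."""
    return statistics.mean(values), statistics.stdev(values)


def relative_shift(prior_values, compressed_values):
    """Shift of the central value in units of the prior uncertainty,
    and relative change of the uncertainty itself."""
    c0, s0 = mc_prediction(prior_values)
    c1, s1 = mc_prediction(compressed_values)
    return (c1 - c0) / s0, s1 / s0 - 1.0
```

Both numbers returned by `relative_shift` should be small compared to one if the compressed set is adequate for that observable.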
51. Phenomenology
The ultimate validation is of course to check that the compressed set reproduces the original
combination for a wide variety of observables
We have tested a very large number of processes, both at the inclusive and differential level, and
always found that Nrep=20-30 replicas are enough for phenomenology
NLO inclusive cross sections with MCFM:
H VBF
WW
WH