











































































Cosmology and
Particle Physics
New York University
Practical Statistics for Particle Physics
Kyle Cranmer (NYU)
CERN School HEP, Romania, Sept. 2011
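A simple likelihood ratio with no free parameters
$$
Q = \frac{L(x|H_1)}{L(x|H_0)}
  = \frac{\prod_i^{N_{chan}} \mathrm{Pois}(n_i|s_i+b_i) \prod_j^{n_i} \frac{s_i f_s(x_{ij}) + b_i f_b(x_{ij})}{s_i+b_i}}
         {\prod_i^{N_{chan}} \mathrm{Pois}(n_i|b_i) \prod_j^{n_i} f_b(x_{ij})}
$$
$$
q = \ln Q = -s_{tot} + \sum_i^{N_{chan}} \sum_j^{n_i} \ln\left(1 + \frac{s_i f_s(x_{ij})}{b_i f_b(x_{ij})}\right)
$$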
[Figure (a), LEP: probability density of -2 ln(Q) for m_H = 115 GeV/c^2, showing the observed value, the distribution expected for background, and the distribution expected for signal plus background.]
[Figure, LEP: -2 ln(Q) versus m_H (GeV/c^2) from 106 to 120, showing the observed curve and the curves expected for background and for signal plus background.]
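As a rough illustration of how q = ln Q is evaluated, the sketch below sums the per-event terms over channels. The dictionary layout, the Gaussian signal shape, the flat background shape, and all numbers are assumptions made for this example only; they are not taken from the LEP analysis.

```python
import math

def log_likelihood_ratio(channels):
    """q = ln Q = -s_tot + sum_i sum_j ln(1 + s_i f_s(x_ij) / (b_i f_b(x_ij))).

    `channels` is a list of dicts (hypothetical layout for this sketch):
      "s", "b"   expected signal and background yields in the channel
      "fs", "fb" normalized densities of the discriminating variable x
      "x"        list of observed x values, one per event
    """
    q = 0.0
    for ch in channels:
        q -= ch["s"]  # contributes to -s_tot
        for x in ch["x"]:
            q += math.log1p(ch["s"] * ch["fs"](x) / (ch["b"] * ch["fb"](x)))
    return q

def gauss(mean, sigma):
    """Normalized Gaussian density, used here as an illustrative signal shape."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return lambda x: math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm

# Illustrative single channel: signal peak at 115, flat background on [100, 120].
channel = {
    "s": 3.0,
    "b": 10.0,
    "fs": gauss(115.0, 2.0),
    "fb": lambda x: 1.0 / 20.0,
    "x": [103.2, 108.9, 114.6, 115.3, 118.0],
}
q = log_likelihood_ratio([channel])
minus_two_ln_q = -2.0 * q  # the quantity on the axis of the LEP plots
```

Events near the assumed signal peak contribute large positive terms, so signal-like data push -2 ln(Q) towards negative values.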
Consider this schematic diagram
The “test statistic” is a single number that quantifies the entire experiment. It could just be the number of events observed, but often it's more sophisticated, like a likelihood ratio. What test statistic do we choose?
And how do we build the distribution? Usually with “toy Monte Carlo”, but what about the uncertainties? What do we do with the nuisance parameters?
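[Figure: schematic of the probability density of the test statistic under the background-only and signal + background hypotheses, with the observed value marked. The areas CL_{s+b} and 1 - CL_b are indicated, and the test-statistic axis runs from signal-like to background-like.]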
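A minimal toy Monte Carlo sketch of the schematic, assuming the simplest possible test statistic (the observed event count n) and no nuisance parameters. The convention used here, CL = fraction of toys with n at or below the observed count, is an assumption of this sketch, as are the yields and the helper names.

```python
import math
import random

def poisson(rng, mean):
    """Draw one Poisson variate (Knuth's method, adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def toy_cl(n_obs, s, b, n_toys=20000, seed=1):
    """Return (CL_s+b, 1 - CL_b) estimated from toy experiments.

    Test statistic: the event count n. A toy counts as "at least as
    background-like as the data" when n <= n_obs (convention of this sketch).
    """
    rng = random.Random(seed)
    toys_sb = [poisson(rng, s + b) for _ in range(n_toys)]
    toys_b = [poisson(rng, b) for _ in range(n_toys)]
    cl_sb = sum(n <= n_obs for n in toys_sb) / n_toys
    cl_b = sum(n <= n_obs for n in toys_b) / n_toys
    return cl_sb, 1.0 - cl_b

# Illustrative numbers only: 12 events observed, s = 5 and b = 10 expected.
cl_sb, one_minus_cl_b = toy_cl(n_obs=12, s=5.0, b=10.0)
```

A small CL_{s+b} disfavours the signal-plus-background hypothesis, while a small 1 - CL_b disfavours background-only.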
Properties of the Profile Likelihood Ratio
After a close look at the profile likelihood ratio
$$
\lambda(\mu) = \frac{L(\mu, \hat{\hat{\nu}}(\mu))}{L(\hat{\mu}, \hat{\nu})}
$$
one can see the function is independent of the true values of ν
‣ though its distribution might depend on them indirectly
Wilks's theorem states that under certain conditions the distribution of -2 ln λ(μ = μ0), given that the true value of μ is μ0, converges to a chi-square distribution
‣ more on this tomorrow, but the important points are:
‣ the “asymptotic distribution” is known and it is independent of ν!
● more complicated if parameters have boundaries (e.g. μ ≥ 0)
Thus, we can calculate the p-value for the background-only hypothesis without having to generate toy Monte Carlo!
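To make the asymptotic shortcut concrete, here is a small sketch that converts an observed value of -2 ln λ into a p-value and a significance using only the chi-square distribution with one degree of freedom. The boundary case uses the known "half chi-square" result for testing μ = 0 when μ ≥ 0 is imposed; that formula and the function names are additions for illustration, not something stated on the slide.

```python
import math
from statistics import NormalDist

def p_value_chi2_1dof(t):
    """P(chi2 with 1 dof > t) = erfc(sqrt(t / 2)); no boundary on the parameter."""
    return math.erfc(math.sqrt(t / 2.0))

def p_value_half_chi2(q0):
    """Background-only p-value when mu >= 0 is imposed (half chi-square).

    Half of the probability sits at q0 = 0, so p = 1 - Phi(sqrt(q0)),
    which equals 0.5 * erfc(sqrt(q0 / 2)).
    """
    return 0.5 * math.erfc(math.sqrt(q0 / 2.0))

def significance(p):
    """One-sided Gaussian significance Z such that 1 - Phi(Z) = p."""
    return NormalDist().inv_cdf(1.0 - p)

q0_obs = 25.0                      # illustrative observed -2 ln lambda(0)
p0 = p_value_half_chi2(q0_obs)     # no toys needed
z0 = significance(p0)              # equals sqrt(q0_obs) in this approximation
```

In the half chi-square case the significance is simply the square root of the observed test statistic, which is why no pseudo-experiments are required.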
[Figure: two panels showing distributions of the profile likelihood ratio test statistic for the signal-plus-background and background hypotheses, with the test statistic observed in data marked. The 2-channel panel is labelled 3.35σ and the 5-channel panel 4.4σ.]
Explicitly build the distribution by generating “toys” (pseudo-experiments) assuming a specific value of μ and ν.
‣ randomize both the main measurement m and the auxiliary measurements a
‣ fit the model twice, for the numerator and denominator of the profile likelihood ratio
‣ evaluate -2 ln λ(μ) and add to histogram
The choice of μ is straightforward: typically μ = 0 and μ = 1, but the choice of ν is less clear
‣ more on this tomorrow
This can be very time consuming. The plots use millions of toy pseudo-experiments on a model with ~50 parameters.
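The toy procedure can be sketched on a deliberately tiny model. The model here is an assumption of this example, not the ~50-parameter model of the plots: a main count n_main ~ Pois(μ s + b) and one auxiliary count n_aux ~ Pois(τ b), with the background b as the single nuisance parameter. Both fits are analytic in this model, so no minimizer is needed; the nuisance value used to generate toys is simply passed in by hand.

```python
import math
import random

def poisson(rng, mean):
    """Draw one Poisson variate (Knuth's method, adequate for means below ~100)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def xlogy(x, y):
    """x * ln(y), with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def log_l(mu, b, n_main, n_aux, s, tau):
    """Log-likelihood up to constants for the assumed two-count model."""
    return (xlogy(n_main, mu * s + b) - (mu * s + b)
            + xlogy(n_aux, tau * b) - tau * b)

def b_hat_hat(mu, n_main, n_aux, s, tau):
    """Conditional fit of b for fixed mu >= 0 (root of a quadratic)."""
    a = n_main + n_aux - (1.0 + tau) * mu * s
    disc = a * a + 4.0 * (1.0 + tau) * n_aux * mu * s
    return (a + math.sqrt(disc)) / (2.0 * (1.0 + tau))

def t_mu(mu, n_main, n_aux, s, tau):
    """-2 ln lambda(mu): the model is fit twice, conditionally and globally."""
    b_hat = n_aux / tau               # unconditional fit (denominator)
    mu_hat = (n_main - b_hat) / s     # mu_hat is allowed to be negative here
    num = log_l(mu, b_hat_hat(mu, n_main, n_aux, s, tau), n_main, n_aux, s, tau)
    den = log_l(mu_hat, b_hat, n_main, n_aux, s, tau)
    return max(0.0, 2.0 * (den - num))

def toy_distribution(mu, b, s, tau, n_toys, seed=1):
    """Randomize main and auxiliary measurements, refit, collect -2 ln lambda(mu)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_toys):
        n_main = poisson(rng, mu * s + b)
        n_aux = poisson(rng, tau * b)
        values.append(t_mu(mu, n_main, n_aux, s, tau))
    return values

# Illustrative numbers only.
toys = toy_distribution(mu=1.0, b=50.0, s=10.0, tau=1.0, n_toys=2000, seed=3)
tail = sum(t > 3.84 for t in toys) / len(toys)  # chi-square(1 dof) predicts about 0.05
```

Rerunning with a different generating value of b is a quick way to see how strongly the distribution depends on the nuisance parameter.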
So far this looks a bit like magic. How can you claim that you incorporated your systematic just by fitting the best value of your uncertain parameters and making a ratio?
It won't work unless the parametrization is sufficiently flexible.
So check by varying the settings of your simulation, and see if the profile likelihood ratio is still distributed as a chi-square.
[Figure, ATLAS, ∫L dt = 10 fb^-1: probability (log scale) versus log likelihood ratio from 0 to 20, comparing Nominal (Fast Sim), Smeared P_T^miss, Q^2 scale 1 to 4, Leading-order tt̄, Leading-order WWbb̄, and Full Simulation.]
Here it is pretty stable, but
it’s not perfect (and this is
a log plot, so it hides some
pretty big discrepancies)
For the distribution to be
independent of the nuisance
parameters your
parametrization must be
sufficiently flexible.
A very important point
If we keep pushing this point to the extreme, the physics problem goes beyond what we can handle practically.
The p-values are usually predicated on the assumption that the true distribution is in the family of functions being considered
‣ e.g. we have sufficiently flexible models of signal & background to incorporate all systematic effects
‣ but we don't believe we simulate everything perfectly
‣ and when we parametrize our models, usually we have further approximated our simulation.
● nature -> simulation -> parametrization
At some point these approaches are limited by honest systematic uncertainties (not statistical ones). Statistics can only help us so much after this point. Now we must be physicists!
What is a “Confidence Interval”?
‣ you see them all the time:
Want to say there is a 68% chance that the true value of (mW, mt) is in this interval
‣ but that's P(theory|data)!
Correct frequentist statement is that the interval covers the true value 68% of the time
‣ remember, the contour is a function of the data, which is random. So it moves around from experiment to experiment
[Figure: 68% CL contours in the (m_t, m_W) plane, both in GeV, for LEP1 and SLD and for LEP2 and Tevatron (prel.), with the dependence on m_H (114, 300 and 1000 GeV) and Δα indicated.]
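The frequentist statement can be checked numerically. The sketch below assumes a one-dimensional Gaussian measurement with known resolution and the interval [x - σ, x + σ]; the true value is held fixed while the interval moves from pseudo-experiment to pseudo-experiment. The numbers are illustrative only.

```python
import random

def coverage(theta_true, sigma, n_experiments, seed=0):
    """Fraction of repeated experiments whose interval contains theta_true."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_experiments):
        x = rng.gauss(theta_true, sigma)       # the data are random ...
        lo, hi = x - sigma, x + sigma          # ... so the interval is random
        if lo <= theta_true <= hi:
            covered += 1
    return covered / n_experiments

# Illustrative values: a mass of 80.4 measured with resolution 0.03.
frac = coverage(theta_true=80.4, sigma=0.03, n_experiments=20000)
```

The fraction comes out close to 68%, and that number describes the procedure, not the probability that any single interval contains the truth.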
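‣ A Bayesian “credible interval” does mean the probability that the parameter is in the interval. The procedure is very intuitive:
$$
P(\theta \in V) = \int_V \pi(\theta|x)\, d\theta
  = \int_V d\theta\, \frac{f(x|\theta)\,\pi(\theta)}{\int d\theta\, f(x|\theta)\,\pi(\theta)}
$$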
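A numerical sketch of the posterior integral, assuming a Poisson likelihood for an observed count and a flat prior on θ over a finite range. Both choices, the grid integration, and the function names are assumptions of this example.

```python
import math

def posterior_probability(likelihood, prior, lo, hi,
                          theta_min, theta_max, n_steps=30000):
    """P(theta in [lo, hi]) = int_V f*pi dtheta / int f*pi dtheta (midpoint rule)."""
    step = (theta_max - theta_min) / n_steps
    num = den = 0.0
    for i in range(n_steps):
        theta = theta_min + (i + 0.5) * step
        weight = likelihood(theta) * prior(theta)
        den += weight
        if lo <= theta <= hi:
            num += weight
    return num / den  # the common factor `step` cancels in the ratio

def poisson_likelihood(n_obs):
    """f(n_obs | theta) as a function of theta."""
    return lambda theta: theta ** n_obs * math.exp(-theta) / math.factorial(n_obs)

flat_prior = lambda theta: 1.0

# Zero events observed: probability that theta lies in V = [0, 3].
p_in_v = posterior_probability(poisson_likelihood(0), flat_prior,
                               lo=0.0, hi=3.0, theta_min=0.0, theta_max=30.0)
```

With zero observed events and a flat prior the posterior is exp(-θ), so the probability for V = [0, 3] is 1 - exp(-3), about 95%.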
For each value of θ consider f(x|θ)
[Figure: the density f(x|θ) plotted against x at each of the values θ0, θ1 and θ2 along the θ axis.]
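[Figure: the density f(x|θ0) against x, with an acceptance region of probability 1 − α and remaining probability α; the labels relate this to a 100(1 − α)% interval on θ.]
[Figure: f(x|θ0) with another choice of acceptance region of probability 1 − α.]
[Figure: f(x|θ0) with the acceptance region of probability 1 − α bounded by the likelihood ratio condition]
$$
\frac{f(x|\theta_0)}{f(x|\theta_{best}(x))} = k_\alpha
$$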
Now make an acceptance region for every value of θ
[Figure: acceptance regions in x drawn on f(x|θ) for each of θ0, θ1 and θ2.]
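The construction can be sketched for a Poisson count with mean θ and no background, ordering values of x by the ratio f(x|θ)/f(x|θ_best(x)) with θ_best(x) = x. The Poisson model, the θ grid, the truncation at x_max, and the final step of reading the interval off the belt at the observed x are assumptions added for illustration.

```python
import math

def pois(x, theta):
    """Poisson probability f(x | theta); gives 1 for x = 0, theta = 0."""
    return theta ** x * math.exp(-theta) / math.factorial(x)

def acceptance_region(theta, alpha, x_max=60):
    """Acceptance region in x for one theta, using likelihood-ratio ordering.

    Values of x are added in decreasing order of f(x|theta) / f(x|theta_best(x))
    until the region holds probability of at least 1 - alpha.
    """
    ranked = sorted(range(x_max + 1),
                    key=lambda x: pois(x, theta) / pois(x, float(x)),
                    reverse=True)
    region, prob = set(), 0.0
    for x in ranked:
        region.add(x)
        prob += pois(x, theta)
        if prob >= 1.0 - alpha:
            break
    return region, prob

def confidence_interval(x_obs, alpha, theta_max=15.0, step=0.01):
    """Scan theta on a grid and keep every theta whose region contains x_obs."""
    accepted = []
    n_points = int(round(theta_max / step))
    for i in range(n_points + 1):
        theta = i * step
        region, _ = acceptance_region(theta, alpha)
        if x_obs in region:
            accepted.append(theta)
    return min(accepted), max(accepted)

# Illustrative use: 90% interval for 3 observed events.
lo, hi = confidence_interval(x_obs=3, alpha=0.10)
```

Because x is discrete, each acceptance region holds at least 1 − α probability rather than exactly 1 − α, so the resulting intervals over-cover slightly.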