Practical Statistics for Particle Physics: Lecture 3 - Limits & Confidence Intervals

Kyle Cranmer (NYU) CERN School HEP, Romania, Sept. 2011
Center for Cosmology and Particle Physics
Kyle Cranmer, New York University
Practical Statistics for Particle Physics

Lecture 3


LEP Higgs

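A simple likelihood ratio with no free parameters

$$Q = \frac{L(x|H_1)}{L(x|H_0)} = \frac{\prod_i^{N_{\rm chan}} {\rm Pois}(n_i|s_i+b_i) \prod_j^{n_i} \frac{s_i f_s(x_{ij}) + b_i f_b(x_{ij})}{s_i+b_i}}{\prod_i^{N_{\rm chan}} {\rm Pois}(n_i|b_i) \prod_j^{n_i} f_b(x_{ij})}$$

$$q = \ln Q = -s_{\rm tot} + \sum_i^{N_{\rm chan}} \sum_j^{n_i} \ln\left(1 + \frac{s_i f_s(x_{ij})}{b_i f_b(x_{ij})}\right)$$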
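As a minimal sketch of how q = ln Q could be evaluated, the snippet below sums the per-event terms of the formula over channels. The channel contents, the Gaussian signal shape, the flat background shape and the event values are all invented for illustration; they are not LEP inputs.

```python
import math

def log_likelihood_ratio(channels):
    """q = ln Q = -s_tot + sum_i sum_j ln(1 + s_i f_s(x_ij) / (b_i f_b(x_ij)))."""
    q = -sum(ch["s"] for ch in channels)  # the -s_tot term
    for ch in channels:
        for x in ch["events"]:
            ratio = ch["s"] * ch["f_s"](x) / (ch["b"] * ch["f_b"](x))
            q += math.log1p(ratio)
    return q

def gauss(mean, sigma):
    """Hypothetical signal shape f_s: a normalized Gaussian density."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda x: norm * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def flat(lo, hi):
    """Hypothetical background shape f_b: a uniform density on [lo, hi]."""
    return lambda x: 1.0 / (hi - lo) if lo <= x <= hi else 0.0

# Two made-up channels with expected signal s, expected background b,
# and a handful of observed values x_ij (for example reconstructed masses).
channels = [
    {"s": 2.0, "b": 5.0, "f_s": gauss(115.0, 2.0), "f_b": flat(100.0, 130.0),
     "events": [114.2, 116.1, 103.5, 127.8, 109.0]},
    {"s": 1.0, "b": 3.0, "f_s": gauss(115.0, 3.0), "f_b": flat(100.0, 130.0),
     "events": [115.4, 121.9]},
]

q = log_likelihood_ratio(channels)
minus_two_ln_q = -2.0 * q  # the quantity plotted on the LEP figures
```

Events near the assumed signal peak push q up (more signal-like, so −2 ln Q more negative), while events far from it contribute almost nothing beyond the constant −s_tot.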

[Figure (a): probability density of −2 ln(Q), from −15 to 15, for LEP at m_H = 115 GeV/c², showing the observed value and the distributions expected for background and for signal plus background.]

[Figure: −2 ln(Q) versus m_H, from 106 to 120 GeV/c², for LEP, showing the observed curve and the curves expected for background and for signal plus background.]


The Test Statistic and its distribution

Consider this schematic diagram

The “test statistic” is a single number that quantifies the entire experiment. It could just be the number of events observed, but often it's more sophisticated, like a likelihood ratio. What test statistic do we choose?

And how do we build the distribution? Usually “toy Monte Carlo”, but what about the uncertainties... what do we do with the nuisance parameters?

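[Figure: schematic probability density versus test statistic for the background-only and signal + background hypotheses, with the observed value marked. The tail areas are labeled CL_{s+b} and 1 − CL_b, and the two ends of the axis are labeled “signal like” and “background like”.]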
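For the simplest case mentioned above, where the test statistic is just the number of events observed, the two tail areas in the diagram can be computed exactly from Poisson sums. The sketch below assumes that more events is more signal-like, takes CL_{s+b} = P(n ≤ n_obs | s+b), and uses P(n ≥ n_obs | b) as the background-only p-value (1 − CL_b up to the convention for the discrete boundary). The function names and the numbers are illustrative.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of n counts for a strictly positive mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson variable with the given mean."""
    return sum(poisson_pmf(k, mean) for k in range(n + 1))

def tail_probabilities(n_obs, s, b):
    """Return (CL_s+b, p_b) for a single-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, s + b)                      # background-like tail under s+b
    p_b = 1.0 - poisson_cdf(n_obs - 1, b) if n_obs > 0 else 1.0  # signal-like tail under b
    return cl_sb, p_b

# Made-up example: 12 events observed, 4 signal and 6 background events expected.
cl_sb, p_b = tail_probabilities(n_obs=12, s=4.0, b=6.0)
```

With a likelihood-ratio test statistic the same two areas are usually obtained from toy Monte Carlo distributions rather than closed-form sums.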

Properties of the Profile Likelihood Ratio

After a close look at the profile likelihood ratio

$$\lambda(\mu) = \frac{P(m, a|\mu, \hat{\hat{\nu}}(\mu; m, a))}{P(m, a|\hat{\mu}, \hat{\nu})}$$

one can see the function is independent of true values of ν

‣ though its distribution might depend indirectly

Wilks’s theorem states that under certain conditions the distribution of −2 ln λ(μ = μ0) given that the true value of μ is μ0 converges to a chi-square distribution

‣ more on this tomorrow, but the important points are:

‣ “asymptotic distribution” is known and it is independent of ν!

● more complicated if parameters have boundaries (eg. μ ≥ 0)

Thus, we can calculate the p-value for the background-only hypothesis without having to generate Toy Monte Carlo!
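A minimal sketch of the asymptotic shortcut, assuming a single parameter of interest so that the chi-square has one degree of freedom. The `one_sided` option halves the tail probability, which is a common treatment when μ is bounded at zero; it is included here as an assumption, not as something taken from these slides. Function names are illustrative.

```python
import math
from statistics import NormalDist

def chi2_sf_1dof(q):
    """P(chi-square with 1 degree of freedom > q) = erfc(sqrt(q / 2))."""
    return math.erfc(math.sqrt(q / 2.0))

def asymptotic_p_value(q_obs, one_sided=False):
    """p-value of an observed -2 ln lambda using the chi-square limit."""
    p = chi2_sf_1dof(q_obs)
    # Assumed boundary treatment: half of the chi-square tail when mu >= 0.
    return 0.5 * p if one_sided else p

def significance(p):
    """Convert a p-value to the equivalent one-sided Gaussian Z."""
    return -NormalDist().inv_cdf(p)

q_obs = 25.0  # hypothetical observed value of -2 ln lambda(mu = 0)
p = asymptotic_p_value(q_obs, one_sided=True)
z = significance(p)
```

In this one-sided convention Z is simply the square root of the observed −2 ln λ, so the hypothetical value 25 corresponds to 5σ.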


Toy Monte Carlo

[Figures: distributions of the profile likelihood ratio test statistic on a log scale for the signal-plus-background and background hypotheses, with the test statistic value in data marked. The 2-channel plot (horizontal axis 0 to 40) is labeled 3.35σ and the 5-channel plot (horizontal axis 0 to 45) is labeled 4.4σ.]

Explicitly build the distribution by generating “toys” (pseudo-experiments) assuming a specific value of μ and ν.

‣ randomize both the main measurement m and the auxiliary measurements a

‣ fit the model twice, for the numerator and the denominator of the profile likelihood ratio

‣ evaluate −2 ln λ(μ) and add it to the histogram

The choice of μ is straightforward: typically μ = 0 and μ = 1, but the choice of ν is less clear

‣ more on this tomorrow

This can be very time consuming. The plots for this slide use millions of toy pseudo-experiments on a model with ~50 parameters.
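The toy procedure can be sketched on a deliberately tiny model. Everything specific below is an assumption made for illustration: a main count m ~ Pois(μ·s + b), an auxiliary count a ~ Pois(τ·b) constraining the single nuisance parameter b, and closed-form "fits" in place of a numerical minimizer. The best-fit μ is left unbounded, so the statistic is two-sided.

```python
import math
import random

def poisson_sample(rng, mean):
    """Knuth's multiplication method; adequate for the modest means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def xlogy(x, y):
    return 0.0 if x == 0 else x * math.log(y)

def log_likelihood(m, a, mu, b, s, tau):
    """ln[Pois(m | mu*s + b) Pois(a | tau*b)] without the constant factorial terms."""
    rate = mu * s + b
    return xlogy(m, rate) - rate + xlogy(a, tau * b) - tau * b

def conditional_b(m, a, mu, s, tau):
    """Best b at fixed mu: positive root of (1+tau) b^2 + B b - a mu s = 0."""
    c = 1.0 + tau
    big_b = c * mu * s - m - a
    return (-big_b + math.sqrt(big_b * big_b + 4.0 * c * a * mu * s)) / (2.0 * c)

def minus_two_ln_lambda(m, a, mu, s, tau):
    """Fit twice (mu fixed, then mu free) and form -2 ln lambda(mu)."""
    b_hat = a / tau                      # unconditional fit
    mu_hat = (m - b_hat) / s             # unbounded, may be negative
    b_cond = conditional_b(m, a, mu, s, tau)
    free = log_likelihood(m, a, mu_hat, b_hat, s, tau)
    fixed = log_likelihood(m, a, mu, b_cond, s, tau)
    return 2.0 * (free - fixed)

def toy_distribution(mu_true, b_true, mu_test, s, tau, n_toys, seed=1):
    """Randomize m and a, evaluate -2 ln lambda(mu_test), collect the values."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_toys):
        m = poisson_sample(rng, mu_true * s + b_true)  # main measurement
        a = poisson_sample(rng, tau * b_true)          # auxiliary measurement
        values.append(minus_two_ln_lambda(m, a, mu_test, s, tau))
    return values

# Background-only toys (mu = 0) with made-up s, tau and true b.
toys = toy_distribution(mu_true=0.0, b_true=20.0, mu_test=0.0,
                        s=10.0, tau=2.0, n_toys=20000)
q_obs = minus_two_ln_lambda(m=35, a=40, mu=0.0, s=10.0, tau=2.0)
p_value = sum(q >= q_obs for q in toys) / len(toys)
```

Note that the toys need a true value of b to generate from, which is exactly the "choice of ν" question raised above; here it is simply fixed by hand.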


Experimentalist Justification

So far this looks a bit like magic. How can you claim that you incorporated your systematic just by fitting the best value of your uncertain parameters and making a ratio?

It won’t, unless the parametrization is sufficiently flexible.

So check by varying the settings of your simulation, and see if the profile likelihood ratio is still distributed as a chi-square.

[Figure (ATLAS, ∫L dt = 10 fb⁻¹): probability versus log likelihood ratio, from 0 to 20, on a log scale. Curves: Nominal (Fast Sim), Smeared P_T^miss, Q² scale 1, Q² scale 2, Q² scale 3, Q² scale 4, Leading-order tt̄, Leading-order WWbb̄, Full Simulation.]

Here it is pretty stable, but it’s not perfect (and this is a log plot, so it hides some pretty big discrepancies)

For the distribution to be independent of the nuisance parameters your parametrization must be sufficiently flexible.

A very important point

If we keep pushing this point to the extreme, the physics problem goes beyond what we can handle practically

The p-values are usually predicated on the assumption that the true distribution is in the family of functions being considered

‣ eg. we have sufficiently flexible models of signal & background to incorporate all systematic effects

‣ but we don’t believe we simulate everything perfectly

‣ ..and when we parametrize our models usually we have further approximated our simulation.

● nature -> simulation -> parametrization

At some point these approaches are limited by honest systematic uncertainties (not statistical ones). Statistics can only help us so much after this point. Now we must be physicists!


Confidence Interval

What is a “Confidence Interval”?

‣ you see them all the time:

Want to say there is a 68% chance that the true value of (mW, mt) is in this interval

‣ but that’s P(theory|data)!

Correct frequentist statement is that the interval covers the true value 68% of the time

‣ remember, the contour is a function of the data, which is random. So it moves around from experiment to experiment
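The frequentist statement can be checked with a small simulation. This is a one-dimensional sketch rather than the two-dimensional (mW, mt) contour: it assumes a Gaussian measurement with known resolution σ and the interval x ± σ, and the true value and σ are invented numbers.

```python
import random

def coverage(theta_true, sigma, n_experiments, seed=0):
    """Fraction of repeated experiments whose interval x +/- sigma contains theta_true."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_experiments):
        x = rng.gauss(theta_true, sigma)   # the data are random ...
        lo, hi = x - sigma, x + sigma      # ... so the interval moves around
        if lo <= theta_true <= hi:
            covered += 1
    return covered / n_experiments

# Hypothetical true value and resolution; the result should be close to 0.68.
frac = coverage(theta_true=80.4, sigma=0.03, n_experiments=20000)
```

The true value never changes inside the loop; only the interval does, which is why the 68% refers to the procedure and not to any single interval.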

[Figure: 68% CL contours in the (m_t, m_W) plane, both axes in GeV, for LEP1 and SLD and for LEP2 and Tevatron (prel.), with lines for m_H = 114, 300 and 1000 GeV and an arrow labeled ∆α.]

‣ Bayesian “credible interval” does mean probability parameter is in interval. The procedure is very intuitive:

$$P(\theta \in V) = \int_V \pi(\theta|x)\, d\theta = \int_V \frac{f(x|\theta)\,\pi(\theta)}{\int d\theta'\, f(x|\theta')\,\pi(\theta')}\, d\theta$$
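As a numerical illustration of the credible-interval integral, the sketch below takes V = [0, θ_up] and solves for θ_up on a grid. It assumes a Poisson counting model f(n|θ) with known background b and a flat prior π(θ) for θ ≥ 0; the model, the prior and the grid settings are choices made for this example only.

```python
import math

def credible_upper_limit(n_obs, b=0.0, cred=0.95, theta_max=50.0, steps=50000):
    """Upper edge of V = [0, theta_up] containing posterior probability `cred`."""
    d = theta_max / steps
    thetas = [(i + 0.5) * d for i in range(steps)]   # midpoint rule
    # f(n_obs|theta) * pi(theta) with flat pi; constant factors cancel in the ratio.
    weights = [math.exp(n_obs * math.log(t + b) - (t + b)) for t in thetas]
    total = sum(weights)                             # the normalizing integral
    running = 0.0
    for t, w in zip(thetas, weights):
        running += w
        if running / total >= cred:
            return t + 0.5 * d                       # upper edge of this grid cell
    return theta_max

# Zero observed events and no background: the flat-prior result is -ln(0.05).
limit = credible_upper_limit(n_obs=0)
```

Changing the prior changes the answer, which is the price paid for being able to make a direct probability statement about θ.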


Neyman Construction example

For each value of θ consider f(x|θ)

[Figure: the (x, θ) plane with the densities f(x|θ) drawn at θ0, θ1 and θ2.]


Neyman Construction example

Let’s focus on a particular point f(x|θ0)

‣ we want a test of size α

‣ equivalent to a 100(1 − α)% confidence interval on θ

‣ so we find an acceptance region with probability 1 − α

[Figure: the density f(x|θ0) versus x, with an acceptance region of probability 1 − α marked.]


Neyman Construction example

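Let’s focus on a particular point f(x|θ0)

‣ No unique choice of an acceptance region

‣ here’s an example of a lower limit

[Figure: the density f(x|θ0) versus x, with a one-sided acceptance region of probability 1 − α and the remaining probability α in one tail.]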
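To make the non-uniqueness concrete, the sketch below builds two different acceptance regions with the same probability 1 − α. It assumes f(x|θ0) is a Gaussian centered on θ0 with known width σ, which is a choice made for this example; the function names are illustrative.

```python
from statistics import NormalDist

def central_region(theta0, sigma, alpha):
    """Two-sided region with probability alpha/2 left out in each tail."""
    dist = NormalDist(theta0, sigma)
    return dist.inv_cdf(alpha / 2.0), dist.inv_cdf(1.0 - alpha / 2.0)

def one_sided_region(theta0, sigma, alpha):
    """Region (-inf, x_up] with all of alpha in the upper tail.

    Inverting it over theta keeps theta >= x_obs - z*sigma, i.e. a lower limit.
    """
    return float("-inf"), NormalDist(theta0, sigma).inv_cdf(1.0 - alpha)

def region_probability(region, theta0, sigma):
    """Probability content of [lo, hi] under f(x|theta0)."""
    dist = NormalDist(theta0, sigma)
    lo, hi = region
    cdf_lo = 0.0 if lo == float("-inf") else dist.cdf(lo)
    return dist.cdf(hi) - cdf_lo

alpha = 0.10
central = central_region(0.0, 1.0, alpha)      # roughly (-1.645, 1.645)
one_sided = one_sided_region(0.0, 1.0, alpha)  # roughly (-inf, 1.282)
probs = (region_probability(central, 0.0, 1.0),
         region_probability(one_sided, 0.0, 1.0))
```

Both regions define a valid test of size α; which one is used determines whether the resulting interval is two-sided or a one-sided limit.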


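Neyman Construction example

Let’s focus on a particular point f(x|θ0)

‣ choice of this region is called an ordering rule

‣ In Feldman-Cousins approach, ordering rule is the likelihood ratio. Find contour of L.R. that gives size α

$$\frac{f(x|\theta_0)}{f(x|\theta_{\rm best}(x))} = k_\alpha$$

[Figure: the density f(x|θ0) versus x, with the acceptance region of probability 1 − α selected by the likelihood ratio contour and the remaining probability α outside it.]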


Neyman Construction example

Now make acceptance region for every value of θ

[Figure: the (x, θ) plane with the densities f(x|θ) and their acceptance regions drawn at θ0, θ1 and θ2.]
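The full construction can be sketched for a simple case: a Poisson count n with signal mean μ (playing the role of θ) and known background b, using the likelihood-ratio ordering of the Feldman-Cousins approach. The grid spacing, the cutoff `n_max` and the scan range are assumptions of this sketch, and the interval is reported as the smallest and largest accepted μ without checking for gaps.

```python
import math

def poisson_pmf(n, mean):
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def acceptance_region(mu, b, alpha, n_max=100):
    """Set of counts n accepted for this mu, ranked by the likelihood ratio."""
    ranked = []
    for n in range(n_max + 1):
        mu_best = max(0.0, n - b)   # best-fit signal, restricted to mu >= 0
        ratio = poisson_pmf(n, mu + b) / poisson_pmf(n, mu_best + b)
        ranked.append((ratio, n))
    ranked.sort(reverse=True)       # highest likelihood ratio first
    accepted, prob = set(), 0.0
    for _, n in ranked:
        accepted.add(n)
        prob += poisson_pmf(n, mu + b)
        if prob >= 1.0 - alpha:     # stop once the region has probability 1 - alpha
            break
    return accepted, prob

def confidence_interval(n_obs, b, alpha=0.10, mu_max=15.0, step=0.01):
    """Keep every mu on the grid whose acceptance region contains n_obs."""
    n_steps = int(round(mu_max / step))
    inside = [i * step for i in range(n_steps + 1)
              if n_obs in acceptance_region(i * step, b, alpha)[0]]
    return min(inside), max(inside)

# Zero observed events, no background, 90% confidence level.
lo, hi = confidence_interval(n_obs=0, b=0.0)
```

Because the count is discrete, each acceptance region has probability at least 1 − α rather than exactly 1 − α, so the resulting intervals over-cover slightly.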