What is Probability
“...the key to the relation between statistics and truth may be found in a reasonable definition of probability” − R. von Mises (1928/1951) Probability, Statistics, and Truth
“Probability does not exist” − De Finetti (1970) Theory of Probability
James L. Wayman, Ph.D., FIET, FIEEE
Office of Research
San José State University
Outline
1. Equivocation
2. Probability
3. Conditional Probability
4. Inverse Probability
5. Likelihood Ratio
6. Bayes Factor
7. Options for Quantifying Evidence
Probability
Often heard (disjunctive?) taxonomies:
- Objective vs Subjective (aka Epistemic)
- Post-1950: Frequentist vs. Bayesian (aka Personal)
  Gillies (1987) “Was Bayes a Bayesian?”
  Glymour (1981) “Why I Am Not a Bayesian”
  Berger (2006) “The Case for Objective Bayesian Analysis”
Gillies (2000) Philosophical Theories of Probability
- Classical
- Logical
- Subjective
- Frequency
- Propensity
- Various approaches by Gillies
I will take a historical approach, following
- Stigler (1986) The History of Statistics
- Hacking (1990) The Taming of Chance; (2006) The Emergence of Probability
- Fienberg (1992) “A Brief History of Statistics in Three and One-Half Chapters”
- Howie (2002) Interpreting Probability: Controversies and Developments in the Early Twentieth Century
Circa 18th Century Probability
- Bernoulli/Leibniz/de Moivre/Bayes/Laplace
  Games of chance and legal testimony/juries
  Probability as a measure of ignorance − Locke (1690), Essay Concerning Human Understanding, Book IV
  “Classical” definition: Probability as proportion of equally possible cases
- Principle of: Insufficient Reason (Leibniz) or Indifference (Keynes)
  19th century objections by Venn and Boole (Ex nihilo nihil)
  20th century objections
    Is the book black/white/red? − Keynes (1921), Treatise on Probability: “necessary condition.. indivisible alternatives”
    Loaded dice do not have equiprobable states − von Mises (1928/1951) Probability, Statistics, and Truth
  Additional modern objection: Uniformity of the distribution of a parameter is not invariant under very reasonable transformations of the parameter
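The invariance objection can be made concrete with a small worked example. The parameter θ, its range (0, 1), and the transformation φ = θ² are illustrative assumptions, not taken from the slides. If indifference assigns θ a uniform density, the change-of-variables rule gives the density of φ:

```latex
f_\theta(\theta) = 1, \quad 0 < \theta < 1
\qquad\Longrightarrow\qquad
f_\phi(\phi) = f_\theta\!\left(\sqrt{\phi}\right)\left|\frac{d\theta}{d\phi}\right|
             = \frac{1}{2\sqrt{\phi}}, \quad 0 < \phi < 1 .
```

So φ is not uniformly distributed, even though knowing nothing about θ would seem to mean knowing nothing about θ² either.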
$P(x \text{ based on a finite Collective}) = \lim_{N \to \infty} \frac{\#x}{N}$
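A minimal sketch of the ratio #x / N for growing N, assuming simulated Bernoulli trials. The function name, the seed, and the value `p_true` are hypothetical. Positing `p_true` in advance is exactly what a strict frequentist would question, and a finite run can only suggest the limit, not establish it.

```python
import random

def relative_frequency(n_trials, p_true=0.5, seed=0):
    """Return #x / N for N simulated trials of an event with assumed chance p_true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_true)
    return hits / n_trials

# The ratio tends to settle as N grows, but no finite N reaches the limit.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```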
20th Century Probability
Von Mises (1928/1951) Probability, Statistics and Truth
  “… probability … applies only to problems in which either the same event repeats itself again and again or a great number of uniform elements are involved at the same time.”
  Non-repeatable events do not have a probability.
Von Mises’s response: “theory...not logically definable but sufficiently exact in practice”
Cambridge School: Russell => Keynes => Jeffreys (1939) Theory of Probability
  Branch of logic: formal logical relationships
  “degree of belief”
  “I believe I’ll have a beer” as a statement of probability
Fisher (1935), “The Logic of Inductive Inference”
  “I mean by mathematical probability only that objective quality of the individual which corresponds to frequency in the population, of which the individual is spoken of as a typical member”
  See also Zabell (1992) “R. A. Fisher and the Fiducial Argument”
Popper (1959), “The Propensity Interpretation of Probability”
  Objective, but not frequentist
  P(x based on repeatable conditions with a tendency to produce sequences with frequencies equal to probabilities)
  Which conditions are important?
20th Century Probability
Kolmogorov (1933), Foundations of the Theory of Probability
“Probability” as undefined “primitive”, like the concept of “point” in geometry
All probability as mathematics: built on a field of sets (a collection of subsets closed under union, intersection, and difference)
5 Axioms
I & II: F, a set of subsets of E (the collection of elementary events), is a field and contains E.
III. For each set A in F, P(A) ∈ ℝ and P(A) ≥ 0
IV. P(E) = 1
V. If A and B have no element in common (are mutually exclusive), then P(A∪B)= P(A) + P(B)
Added definition: P(A|B) ≝ P(A∩B) / P(B), for P(B) > 0
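A minimal sketch that checks Axioms III to V and the conditional-probability definition on a small finite space. The six-faced die and the equal-weight assignment are assumptions made for illustration; the axioms themselves do not say how P is to be assigned.

```python
from fractions import Fraction

E = frozenset(range(1, 7))  # elementary events: faces of a die (assumed example)

def P(A):
    """Probability of event A under an assumed equal-weight assignment on E."""
    return Fraction(len(A & E), len(E))

def conditional(A, B):
    """P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    if P(B) == 0:
        raise ValueError("P(A|B) is undefined when P(B) = 0")
    return P(A & B) / P(B)

even = frozenset({2, 4, 6})
odd = E - even
low = frozenset({1, 2, 3})

assert P(even) >= 0                        # Axiom III: non-negative
assert P(E) == 1                           # Axiom IV: whole space has probability 1
assert P(even | odd) == P(even) + P(odd)   # Axiom V: additivity for disjoint sets
print(conditional(even, low))              # P(even | low)
```

Here `even | odd` is Python's set union, not a conditional bar; the conditional is computed only by `conditional`.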
L. “Jimmy” Savage, Foundations of Statistics (1972); Frank Ramsey (1926?); De Finetti (1931)
“Personal” probability
How much would you bet to win $1?
Apologies for omitting Pearson², Neyman, Carnap, Jaynes…
Inverse Probability
P(H|E, I) calculated from P(E|H, I) via “Bayes’ Theorem”
Because H is either 100% true or 100% false, this P must certainly be subjective even if
P(E|H, I) is calculated from objective measures
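A minimal sketch of the inversion, using Bayes' theorem in the form P(H|E) = P(E|H) P(H) / P(E), with the background information I left implicit. All numbers are hypothetical. The same measured likelihoods are combined with two different priors to show where the subjective element enters.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) from P(E|H), P(E|not-H) and the prior P(H), by Bayes' theorem."""
    # Total probability of the evidence over the two exclusive, exhaustive cases.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    return p_e_given_h * prior_h / p_e

# Same (hypothetical) likelihoods, two different priors P(H).
sceptic = posterior(0.01, 0.99, 0.05)
believer = posterior(0.50, 0.99, 0.05)
print(round(sceptic, 3), round(believer, 3))
```

With these inputs the two posteriors come out near 0.17 and 0.95: identical evidence, very different conclusions, and the difference is entirely in the prior.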
“…the theory of inverse probability is founded upon error and must be wholly rejected” − Fisher (1925), Statistical Methods for Research Workers
“I know of only one case in mathematics of a doctrine which has been accepted and developed by
the most eminent men of their time, and is now perhaps accepted by men now living, which at
the same time has appeared to a succession of sound writers to be fundamentally false and
devoid of foundation. Yet that is quite exactly the position in respect of inverse
probability…reduces all probability to subjective judgement…The underlying cause is…that we
learn by experience that science has its inductive processes, so it is naturally thought that such
inductions, being uncertain, must be expressible in terms of probability” − Fisher (1930) “Inverse Probability”
See Zabell (1989) “R. A. Fisher on the History of Inverse Probability.”
Likelihood Ratio
Fisher’s attempt to avoid inverse probability
H and H̄ (not-H) are taken as exclusive and exhaustive
“When we speak of the probability of a certain object fulfilling a certain condition, we imagine all such objects to be divided into two classes, according as they do or do not fulfil the condition. This is the only characteristic in them of which we take cognizance”. − Fisher, “On the Mathematical Foundations of Theoretical Statistics” (1921)
If the evidence metric is continuous, P(E|H) is evaluated at the point on the CDF, given H, at which the value E′ is observed: P(E ≤ E′|H)
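A minimal sketch of a likelihood ratio for a continuous evidence metric. The two normal score models and the observed value are hypothetical assumptions, not from the slides. `cdf_ratio` follows the CDF-based evaluation described above; `density_ratio` is the density-based form that is also in common use, shown only for comparison.

```python
from statistics import NormalDist

# Hypothetical score models (assumptions for illustration only):
# a dissimilarity score E is N(1, 1) under H and N(4, 1) under not-H.
under_h = NormalDist(mu=1.0, sigma=1.0)
under_not_h = NormalDist(mu=4.0, sigma=1.0)

def cdf_ratio(e_obs):
    """P(E <= e_obs | H) / P(E <= e_obs | not-H), the CDF-based ratio."""
    return under_h.cdf(e_obs) / under_not_h.cdf(e_obs)

def density_ratio(e_obs):
    """f(e_obs | H) / f(e_obs | not-H), the density-based alternative."""
    return under_h.pdf(e_obs) / under_not_h.pdf(e_obs)

e_obs = 2.0  # hypothetical observed score E'
print(cdf_ratio(e_obs), density_ratio(e_obs))
```

The two ratios differ by nearly an order of magnitude on the same observation, so the choice between them has to be stated alongside any reported value.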
Bayes Factor with H2 as Proxy for H1
But only if H2 ⇒ H̄1 and H1 ⇒ H̄2 (exclusive)
Such that P(H1) + P(H2) = P(H̄1) + P(H̄2) = 1 (exhaustive)
Otherwise, by Ramsey−De Finetti Theorem, system is “incoherent/inconsistent”
(vulnerable to “Dutch book”)
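A minimal sketch of the Dutch book, assuming an agent who states prices q1 and q2 for $1 tickets on H1 and H2 and will buy or sell at those prices, and assuming exactly one of H1, H2 is true. The function name and the prices are hypothetical.

```python
from fractions import Fraction

def agent_payoffs(q1, q2):
    """Agent's net payoff in each possible world when an opponent exploits q1, q2."""
    total = q1 + q2
    # If total > 1 the opponent sells the agent both tickets (side = +1);
    # otherwise the opponent buys both tickets from the agent (side = -1).
    side = 1 if total > 1 else -1
    payoffs = {}
    for world in ("H1 true", "H2 true"):
        # Exactly one ticket pays $1 in either world, against a total price of q1 + q2.
        payoffs[world] = side * (1 - total)
    return payoffs

print(agent_payoffs(Fraction(7, 10), Fraction(1, 2)))   # prices sum to more than 1
print(agent_payoffs(Fraction(3, 10), Fraction(1, 2)))   # prices sum to less than 1
print(agent_payoffs(Fraction(7, 10), Fraction(3, 10)))  # coherent: prices sum to 1
```

In the two incoherent cases the agent loses the same amount whichever hypothesis is true; only prices summing to 1 leave no guaranteed loss.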
The “Reference Class Problem”
- Either H1 = “from this source” (subject specific)
  or “from someone who is the same source” (general population)
  (within-class variation homogeneous across population?)
- H2 = “from relevant population”, where “relevant” refers either to
  subject or to questioned (People v Pizarro, CA 5th Dist. Ct. Appeals, 1992)
Can H1 and H2 be chosen so as to maintain the exclusive/exhaustive requirements?
One More Alternative
Probabilities have no place in court
Tribe (1971), "Trial by mathematics: Precision and ritual in the legal process." Harvard Law Rev.
P. Tillers (2011), "Trial by mathematics—reconsidered." Law, probability and risk
“It could be…that judgements about the confirmation of a hypothesis by evidence are inherently
qualitative rather than quantitative in nature” − Gillies (2000) Philosophical Theories of Probability
“The (incorrect) argument runs somewhat as follows: a number of uncertain but useful
judgements can be expressed with exactitude in terms of probability; our judgements concerning
causes or hypotheses are uncertain, therefore our rational attitude towards them is expressible in
terms of probability” − Fisher (1930) “Inverse Probability”
R. v T [2010] EWCA Crim 2439; [2011] 1 Cr. App. R. 9
Quiz
Based on the evidence just presented, state which of the
following non-exclusive, non-exhaustive hypotheses is
most probable:
H1: We can all agree on a single definition of probability
H2: I can agree on a single definition of probability
H3: I’ll have a beer
H4: Sorry, I’m a strict frequentist