Introduction to Probability (Blitzstein/Hwang) Selected Problems Solved, Exercises of Probability and Statistics

All the problems marked with S in Introduction to Probability Blitzstein, Hwang are solved

Typology: Exercises

2020/2021

Uploaded on 05/27/2021

ekasha

Solutions to Exercises Marked with s
from the book
Introduction to Probability by
Joseph K. Blitzstein and Jessica Hwang
Departments of Statistics, Harvard University and Stanford
University

Chapter 1: Probability and counting

Counting

  1. ©s (a) How many ways are there to split a dozen people into 3 teams, where one team has 2 people, and the other two teams have 5 people each?

(b) How many ways are there to split a dozen people into 3 teams, where each team has 4 people?

Solution:

(a) Pick any 2 of the 12 people to make the 2-person team, then any 5 of the remaining 10 for the first team of 5; the remaining 5 are on the other team of 5. This overcounts by a factor of 2, though, since there is no designated “first” team of 5. So the number of possibilities is $\binom{12}{2}\binom{10}{5}/2 = 8316$. Alternatively, politely ask the 12 people to line up, and then let the first 2 be the team of 2, the next 5 be a team of 5, and the last 5 be a team of 5. There are 12! ways for them to line up, but it does not matter which order they line up in within each group, nor does the order of the 2 teams of 5 matter, so the number of possibilities is $\frac{12!}{2!\,5!\,5! \cdot 2} = 8316$.

(b) By either of the approaches above, there are $\frac{12!}{4!\,4!\,4!}$ ways to divide the people into a Team A, a Team B, and a Team C, if we care about which team is which (this is called a multinomial coefficient). Since here it doesn’t matter which team is which, this overcounts by a factor of 3!, so the number of possibilities is $\frac{12!}{4!\,4!\,4!\,3!} = 5775$.
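The counts in both parts can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not from the original solution.

```python
from math import comb, factorial

# (a) One team of 2 and two unlabeled teams of 5, counted two ways.
teams_a_choose = comb(12, 2) * comb(10, 5) // 2
teams_a_lineup = factorial(12) // (factorial(2) * factorial(5) ** 2 * 2)

# (b) Three unlabeled teams of 4: multinomial coefficient divided by 3!.
teams_b = factorial(12) // (factorial(4) ** 3 * factorial(3))

print(teams_a_choose, teams_a_lineup, teams_b)
```

Both approaches to part (a) should print the same value, which is a quick guard against overcounting mistakes.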

  1. ©s (a) How many paths are there from the point (0, 0) to the point (110, 111) in the plane such that each step either consists of going one unit up or one unit to the right?

(b) How many paths are there from (0, 0) to (210, 211), where each step consists of going one unit up or one unit to the right, and the path has to go through (110, 111)?

Solution:

(a) Encode a path as a sequence of U’s and R’s, like URURURUUUR...UR, where U and R stand for “up” and “right” respectively. The sequence must consist of 110 R’s and 111 U’s, and to determine the sequence we just need to specify where the R’s are located. So there are $\binom{221}{110}$ possible paths.

(b) There are $\binom{221}{110}$ paths to (110, 111), as above. From there, we need 100 R’s and 100 U’s to get to (210, 211), so by the multiplication rule the number of possible paths is $\binom{221}{110} \cdot \binom{200}{100}$.
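As a cross-check that does not rely on the binomial formula, the paths can be counted by dynamic programming over the grid. This is only a sketch; `count_paths` is a hypothetical helper name, not part of the original solution.

```python
from math import comb

def count_paths(right: int, up: int) -> int:
    """Count up/right lattice paths from (0, 0) to (right, up) without using a formula."""
    row = [1] * (right + 1)          # exactly one path to each point on the bottom edge
    for _ in range(up):
        for x in range(1, right + 1):
            row[x] += row[x - 1]     # arrive from below (old row[x]) or from the left
    return row[right]

paths_a = count_paths(110, 111)               # part (a)
paths_b = paths_a * count_paths(100, 100)     # part (b), by the multiplication rule

print(paths_a == comb(221, 110))
```

The second factor in part (b) counts paths from (110, 111) to (210, 211), which is the same as counting paths from (0, 0) to (100, 100).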

Story proofs

  1. ©s Give a story proof that $\sum_{k=0}^{n} \binom{n}{k} = 2^n$.

Solution: Consider picking a subset of n people. There are $\binom{n}{k}$ choices with size k, on the one hand, and on the other hand there are $2^n$ subsets by the multiplication rule.


chosen. In general, if there are j people in the full group who are younger than Aemon, then there are $\binom{j}{k}$ possible choices for the rest of the subgroup. Thus,

$$\sum_{j=k}^{n} \binom{j}{k} = \binom{n+1}{k+1}.$$

(b) For a pack of i gummi bears, there are $\binom{5+i-1}{i} = \binom{i+4}{i} = \binom{i+4}{4}$ possibilities since the situation is equivalent to getting a sample of size i from the n = 5 flavors (with replacement, and with order not mattering). So the total number of possibilities is

$$\sum_{i=30}^{50} \binom{i+4}{4} = \sum_{j=34}^{54} \binom{j}{4}.$$

Applying the previous part, we can simplify this by writing

$$\sum_{j=34}^{54} \binom{j}{4} = \sum_{j=4}^{54} \binom{j}{4} - \sum_{j=4}^{33} \binom{j}{4} = \binom{55}{5} - \binom{34}{5}.$$

(This works out to 3200505 possibilities!)
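The total can be verified directly. The sketch below assumes pack sizes from 30 to 50 bears, which is the range consistent with the stated total of 3200505; it also spot-checks the identity from the previous part on a few small cases.

```python
from math import comb

# Direct sum over pack sizes i = 30, ..., 50 (assumed range) with 5 flavors.
total = sum(comb(i + 4, 4) for i in range(30, 51))

# Closed form obtained from the identity sum_{j=k}^{n} C(j, k) = C(n+1, k+1).
closed_form = comb(55, 5) - comb(34, 5)

# Spot-check the identity itself for small n and k.
identity_holds = all(
    sum(comb(j, k) for j in range(k, n + 1)) == comb(n + 1, k + 1)
    for n in range(0, 12)
    for k in range(0, n + 1)
)

print(total, closed_form, identity_holds)
```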

Naive definition of probability

  1. ©s A certain family has 6 children, consisting of 3 boys and 3 girls. Assuming that all birth orders are equally likely, what is the probability that the 3 eldest children are the 3 girls?

Solution: Label the girls as 1, 2, 3 and the boys as 4, 5, 6. Think of the birth order as a permutation of 1, 2, 3, 4, 5, 6; e.g., we can interpret 314265 as meaning that child 3 was born first, then child 1, etc. The number of possible permutations of the birth orders is 6!. Now we need to count how many of these have all of 1, 2, 3 appear before all of 4, 5, 6. This means that the sequence must be a permutation of 1, 2, 3 followed by a permutation of 4, 5, 6. So with all birth orders equally likely, we have

$$P(\text{the 3 girls are the 3 eldest children}) = \frac{(3!)^2}{6!} = \frac{1}{20}.$$

Alternatively, we can use the fact that there are $\binom{6}{3}$ ways to choose where the girls appear in the birth order (without taking into account the ordering of the girls amongst themselves). These are all equally likely. Of these possibilities, there is only 1 where the 3 girls are the 3 eldest children. So again the probability is $1/\binom{6}{3} = \frac{1}{20}$.
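Since there are only 6! = 720 birth orders, the naive definition can be applied by brute force. A minimal sketch, using the same labeling (girls 1 to 3, boys 4 to 6); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

orders = list(permutations(range(1, 7)))      # each tuple lists children eldest first
favorable = sum(1 for order in orders if set(order[:3]) == {1, 2, 3})
prob_girls_eldest = Fraction(favorable, len(orders))

print(prob_girls_eldest, Fraction(1, comb(6, 3)))
```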

  1. ©s A city with 6 districts has 6 robberies in a particular week. Assume the robberies are located randomly, with all possibilities for which robbery occurred where equally likely. What is the probability that some district had more than 1 robbery?

Solution: There are $6^6$ possible configurations for which robbery occurred where. There are 6! configurations where each district had exactly 1 of the 6, so the probability of the complement of the desired event is $6!/6^6$. So the probability of some district having more than 1 robbery is $1 - 6!/6^6 \approx 0.9846$.

Note that this also says that if a fair die is rolled 6 times, there’s over a 98% chance that some value is repeated!
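The complement argument can be confirmed by listing all configurations, which is feasible because there are only 46656 of them. This is a sketch with illustrative names.

```python
from fractions import Fraction
from itertools import product
from math import factorial

# Each configuration records the district (0-5) of each of the 6 robberies.
configs = list(product(range(6), repeat=6))
with_repeat = sum(1 for c in configs if len(set(c)) < 6)
prob_repeat = Fraction(with_repeat, len(configs))

print(float(prob_repeat))
```

The same enumeration describes six rolls of a fair die, matching the remark about repeated values.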


  1. ©s A college has 10 (non-overlapping) time slots for its courses, and blithely assigns courses to time slots randomly and independently. A student randomly chooses 3 of the courses to enroll in. What is the probability that there is a conflict in the student’s schedule?

Solution: The probability of no conflict is $\frac{10 \cdot 9 \cdot 8}{10^3} = 0.72$. So the probability of there being at least one scheduling conflict is 0.28.
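The same answer falls out of enumerating the time slots of the three chosen courses, since each course is independently equally likely to be in any of the 10 slots. A short sketch:

```python
from fractions import Fraction
from itertools import product

# Time slots (0-9) of the student's 3 courses; all 10^3 assignments are equally likely.
assignments = list(product(range(10), repeat=3))
conflicts = sum(1 for slots in assignments if len(set(slots)) < 3)
prob_conflict = Fraction(conflicts, len(assignments))

print(prob_conflict, float(prob_conflict))
```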

  1. ©s For each part, decide whether the blank should be filled in with =, <, or >, and give a clear explanation.

(a) (probability that the total after rolling 4 fair dice is 21) ____ (probability that the total after rolling 4 fair dice is 22)

(b) (probability that a random 2-letter word is a palindrome^1) ____ (probability that a random 3-letter word is a palindrome)

Solution:

(a) >. All ordered outcomes are equally likely here. So for example with two dice, obtaining a total of 9 is more likely than obtaining a total of 10 since there are two ways to get a 5 and a 4, and only one way to get two 5’s. To get a 21, the outcome must be a permutation of (6, 6, 6, 3) (4 possibilities), (6, 5, 5, 5) (4 possibilities), or (6, 6, 5, 4) (4!/2 = 12 possibilities). To get a 22, the outcome must be a permutation of (6, 6, 6, 4) (4 possibilities), or (6, 6, 5, 5) ($4!/2^2 = 6$ possibilities). So getting a 21 is more likely; in fact, it is exactly twice as likely as getting a 22.

(b) =. The probabilities are equal, since for both 2-letter and 3-letter words, being a palindrome means that the first and last letter are the same.
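Both comparisons are small enough to enumerate. The sketch below counts ordered outcomes of 4 dice and checks every 2-letter and 3-letter word over a 26-letter lowercase alphabet; `palindrome_prob` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# (a) Count ordered outcomes of 4 fair dice by their total.
totals = Counter(sum(roll) for roll in product(range(1, 7), repeat=4))
ways_21, ways_22 = totals[21], totals[22]

# (b) Probability that a uniformly random word of a given length is a palindrome.
def palindrome_prob(length: int, letters: str = "abcdefghijklmnopqrstuvwxyz") -> Fraction:
    hits = sum(1 for w in product(letters, repeat=length) if w == w[::-1])
    return Fraction(hits, len(letters) ** length)

print(ways_21, ways_22, palindrome_prob(2), palindrome_prob(3))
```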

  1. ©s Elk dwell in a certain forest. There are N elk, of which a simple random sample of size n are captured and tagged (“simple random sample” means that all $\binom{N}{n}$ sets of n elk are equally likely). The captured elk are returned to the population, and then a new sample is drawn, this time with size m. This is an important method that is widely used in ecology, known as capture-recapture. What is the probability that exactly k of the m elk in the new sample were previously tagged? (Assume that an elk that was captured before doesn’t become more or less likely to be captured again.)

Solution: We can use the naive definition here since we’re assuming all samples of size m are equally likely. To have exactly k be tagged elk, we need to choose k of the n tagged elk, and then m − k from the N − n untagged elk. So the probability is

$$\frac{\binom{n}{k}\binom{N-n}{m-k}}{\binom{N}{m}}$$

for k such that 0 ≤ k ≤ n and 0 ≤ m − k ≤ N − n, and the probability is 0 for all other values of k (for example, if k > n the probability is 0 since then there aren’t even k tagged elk in the entire population!). This is known as a Hypergeometric probability; we will encounter it again in Chapter 3.
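The formula translates directly into a small function. This is a sketch only: `tagged_prob` is a hypothetical name, and the population sizes in the usage line (N = 20, n = 5, m = 7) are made-up numbers for illustration.

```python
from fractions import Fraction
from math import comb

def tagged_prob(N: int, n: int, m: int, k: int) -> Fraction:
    """P(exactly k of the m recaptured elk are tagged), with n of the N elk tagged."""
    if not (0 <= k <= n and 0 <= m - k <= N - n):
        return Fraction(0)                     # impossible values of k
    return Fraction(comb(n, k) * comb(N - n, m - k), comb(N, m))

# Illustrative values (not from the text): the probabilities over all k sum to 1.
pmf = [tagged_prob(20, 5, 7, k) for k in range(0, 8)]
print(sum(pmf))
```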

  1. ©s A jar contains r red balls and g green balls, where r and g are fixed positive integers. A ball is drawn from the jar randomly (with all possibilities equally likely), and then a second ball is drawn randomly.

^1 A palindrome is an expression such as “A man, a plan, a canal: Panama” that reads the same backwards as forwards (ignoring spaces, capitalization, and punctuation). Assume for this problem that all words of the specified length are equally likely, that there are no spaces or punctuation, and that the alphabet consists of the lowercase letters a, b, ..., z.


  1. ©s A random 5-card poker hand is dealt from a standard deck of cards. Find the probability of each of the following possibilities (in terms of binomial coefficients).

(a) A flush (all 5 cards being of the same suit; do not count a royal flush, which is a flush with an ace, king, queen, jack, and 10).

(b) Two pair (e.g., two 3’s, two 7’s, and an ace).

Solution:

(a) A flush can occur in any of the 4 suits (imagine the tree, and for concreteness suppose the suit is Hearts); there are $\binom{13}{5}$ ways to choose the cards in that suit, except for one way to have a royal flush in that suit. So the probability is

$$\frac{4\left(\binom{13}{5} - 1\right)}{\binom{52}{5}}.$$

(b) Choose the two ranks of the pairs, which specific cards to have for those 4 cards, and then choose the extraneous card (which can be any of the 52 − 8 cards not of the two chosen ranks). This gives that the probability of getting two pairs is

$$\frac{\binom{13}{2}\binom{4}{2}^2 \cdot 44}{\binom{52}{5}}.$$
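To get a feel for the sizes involved, the two binomial-coefficient expressions can be evaluated exactly. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction
from math import comb

hands = comb(52, 5)                                   # all 5-card hands

# (a) Flush in one of 4 suits, excluding the single royal flush per suit.
p_flush = Fraction(4 * (comb(13, 5) - 1), hands)

# (b) Two ranks for the pairs, 2 of 4 suits for each pair, then 1 of the 44 other cards.
p_two_pair = Fraction(comb(13, 2) * comb(4, 2) ** 2 * 44, hands)

print(float(p_flush), float(p_two_pair))
```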

  1. ©s A norepeatword is a sequence of at least one (and possibly all) of the usual 26 letters a,b,c,... ,z, with repetitions not allowed. For example, “course” is a norepeatword, but “statistics” is not. Order matters, e.g., “course” is not the same as “source”. A norepeatword is chosen randomly, with all norepeatwords equally likely. Show that the probability that it uses all 26 letters is very close to 1/e.

Solution: The number of norepeatwords having all 26 letters is the number of ordered arrangements of 26 letters: 26!. To construct a norepeatword with k ≤ 26 letters, we first select k letters from the alphabet ($\binom{26}{k}$ selections) and then arrange them into a word (k! arrangements). Hence there are $\binom{26}{k} k!$ norepeatwords with k letters, with k ranging from 1 to 26. With all norepeatwords equally likely, we have

$$P(\text{norepeatword having all 26 letters}) = \frac{\#\,\text{norepeatwords having all 26 letters}}{\#\,\text{norepeatwords}} = \frac{26!}{\sum_{k=1}^{26} \binom{26}{k} k!} = \frac{26!}{\sum_{k=1}^{26} \frac{26!}{k!\,(26-k)!}\, k!} = \frac{1}{\frac{1}{25!} + \frac{1}{24!} + \cdots + \frac{1}{1!} + 1}.$$

The denominator is the first 26 terms in the Taylor series $e^x = 1 + x + x^2/2! + \cdots$, evaluated at x = 1. Thus the probability is approximately 1/e (this is an extremely good approximation since the series for e converges very quickly; the approximation for e differs from the truth by less than $10^{-26}$).
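The closeness to 1/e can be seen by computing the probability exactly with rational arithmetic and comparing it with the floating-point value of 1/e. This is a sketch; the comparison is limited by double precision, so it cannot show the full accuracy claimed above.

```python
from fractions import Fraction
from math import comb, e, factorial

# Number of norepeatwords of every length k = 1, ..., 26.
total_words = sum(comb(26, k) * factorial(k) for k in range(1, 27))
p_all_26 = Fraction(factorial(26), total_words)

# The reciprocal should equal the partial sum 1/0! + 1/1! + ... + 1/25!.
partial_sum = sum(Fraction(1, factorial(j)) for j in range(26))

print(float(p_all_26), 1 / e)
```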

Axioms of probability

  1. ©s Arby has a belief system assigning a number $P_{\text{Arby}}(A)$ between 0 and 1 to every event A (for some sample space). This represents Arby’s degree of belief about how likely A is to occur. For any event A, Arby is willing to pay a price of $1000 \cdot P_{\text{Arby}}(A)$ dollars to buy a certificate such as the one shown below:


Certificate

The owner of this certificate can redeem it for $1000 if A occurs. No value if A does not occur, except as required by federal, state, or local law. No expiration date.

Likewise, Arby is willing to sell such a certificate at the same price. Indeed, Arby is willing to buy or sell any number of certificates at this price, as Arby considers it the “fair” price. Arby stubbornly refuses to accept the axioms of probability. In particular, suppose that there are two disjoint events A and B with

$$P_{\text{Arby}}(A \cup B) \neq P_{\text{Arby}}(A) + P_{\text{Arby}}(B).$$

Show how to make Arby go bankrupt, by giving a list of transactions Arby is willing to make that will guarantee that Arby will lose money (you can assume it will be known whether A occurred and whether B occurred the day after any certificates are bought/sold).

Solution: Suppose first that

$$P_{\text{Arby}}(A \cup B) < P_{\text{Arby}}(A) + P_{\text{Arby}}(B).$$

Call a certificate like the one shown above, with any event C in place of A, a C-certificate. Measuring money in units of thousands of dollars, Arby is willing to pay $P_{\text{Arby}}(A) + P_{\text{Arby}}(B)$ to buy an A-certificate and a B-certificate, and is willing to sell an (A ∪ B)-certificate for $P_{\text{Arby}}(A \cup B)$. In those transactions, Arby loses $P_{\text{Arby}}(A) + P_{\text{Arby}}(B) - P_{\text{Arby}}(A \cup B)$ and will not recoup any of that loss because if A or B occurs, Arby will have to pay out an amount equal to the amount Arby receives (since it’s impossible for both A and B to occur).

Now suppose instead that

$$P_{\text{Arby}}(A \cup B) > P_{\text{Arby}}(A) + P_{\text{Arby}}(B).$$

Measuring money in units of thousands of dollars, Arby is willing to sell an A-certificate for $P_{\text{Arby}}(A)$, sell a B-certificate for $P_{\text{Arby}}(B)$, and buy an (A ∪ B)-certificate for $P_{\text{Arby}}(A \cup B)$. In so doing, Arby loses $P_{\text{Arby}}(A \cup B) - (P_{\text{Arby}}(A) + P_{\text{Arby}}(B))$, and Arby won’t recoup any of this loss, similarly to the above. (In fact, in this case, even if A and B are not disjoint, Arby will not recoup any of the loss, and will lose more money if both A and B occur.)

By buying/selling a sufficiently large number of certificates from/to Arby as described above, you can guarantee that you’ll get all of Arby’s money; this is called an arbitrage opportunity. This problem illustrates the fact that the axioms of probability are not arbitrary, but rather are essential for coherent thought (at least the first axiom, and the second with finite unions rather than countably infinite unions).

Arbitrary axioms allow arbitrage attacks; principled properties and perspectives on probability potentially prevent perdition.
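The two cases of the solution can be traced through in code. The sketch below computes your profit from one round of the trades described above, in thousands of dollars, for every outcome in which A and B are disjoint. The function name and the belief values used in the check are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def your_profit(p_a, p_b, p_union, a_occurs: bool, b_occurs: bool):
    """Your profit (Arby's loss) from one round of trades, for disjoint events A and B."""
    if a_occurs and b_occurs:
        raise ValueError("A and B are assumed disjoint")
    pay_a = 1 if a_occurs else 0
    pay_b = 1 if b_occurs else 0
    pay_union = 1 if (a_occurs or b_occurs) else 0
    if p_union < p_a + p_b:
        # Sell Arby an A- and a B-certificate; buy an (A ∪ B)-certificate from Arby.
        return (p_a + p_b - p_union) + pay_union - pay_a - pay_b
    # Otherwise buy A- and B-certificates from Arby; sell Arby an (A ∪ B)-certificate.
    return (p_union - p_a - p_b) + pay_a + pay_b - pay_union

outcomes = [(False, False), (True, False), (False, True)]
# Hypothetical beliefs: P(A) = 3/10, P(B) = 4/10, and a union belief that is too low or too high.
low = [your_profit(Fraction(3, 10), Fraction(4, 10), Fraction(1, 2), a, b) for a, b in outcomes]
high = [your_profit(Fraction(3, 10), Fraction(4, 10), Fraction(9, 10), a, b) for a, b in outcomes]
print(low, high)
```

In both cases the profit is the same positive amount whatever happens, which is what makes the trade an arbitrage.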

Inclusion-exclusion

  1. ©s A card player is dealt a 13-card hand from a well-shuffled, standard deck of cards. What is the probability that the hand is void in at least one suit (“void in a suit” means having no cards of that suit)?


Mixed practice

  1. ©s There are 100 passengers lined up to board an airplane with 100 seats (with each seat assigned to one of the passengers). The first passenger in line crazily decides to sit in a randomly chosen seat (with all seats equally likely). Each subsequent passenger takes his or her assigned seat if available, and otherwise sits in a random available seat. What is the probability that the last passenger in line gets to sit in his or her assigned seat? (This is a common interview problem, and a beautiful example of the power of symmetry.) Hint: Call the seat assigned to the jth passenger in line “seat j” (regardless of whether the airline calls it seat 23A or whatever). What are the possibilities for which seats are available to the last passenger in line, and what is the probability of each of these possibilities?

Solution: The seat for the last passenger is either seat 1 or seat 100; for example, seat 42 can’t be available to the last passenger since the 42nd passenger in line would have sat there if possible. Seat 1 and seat 100 are equally likely to be available to the last passenger, since the previous 99 passengers view these two seats symmetrically. So the probability that the last passenger gets seat 100 is 1/2.
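The symmetry argument can be checked exactly for small planes by recursing over every random seat choice. This sketch uses a hypothetical function name and is exponential in the number of passengers, so it is only meant for small n, not for n = 100.

```python
from fractions import Fraction

def last_gets_own_seat(n: int) -> Fraction:
    """Exact probability that passenger n sits in seat n, for a plane with n seats."""
    def board(j: int, free: frozenset) -> Fraction:
        if j == n:                                   # last passenger: is seat n still free?
            return Fraction(1 if n in free else 0)
        if j > 1 and j in free:                      # assigned seat available: take it
            return board(j + 1, free - {j})
        # Passenger 1, or a displaced passenger, picks uniformly among the free seats.
        return sum(board(j + 1, free - {s}) for s in free) / len(free)
    return board(1, frozenset(range(1, n + 1)))

print([last_gets_own_seat(n) for n in range(2, 9)])
```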

Chapter 2: Conditional probability

increase the chance of guilt (so $P(G|E_1) > P(G)$ and $P(G|E_2) > P(G)$), but together they decrease the chance of guilt (so $P(G|E_1, E_2) < P(G)$)?

Solution: Yes, this is possible. In fact, it is possible to have two events which separately provide evidence in favor of G, yet which together preclude G! For example, suppose that the crime was committed between 1 pm and 3 pm on a certain day. Let $E_1$ be the event that the suspect was at a specific nearby coffeeshop from 1 pm to 2 pm that day, and let $E_2$ be the event that the suspect was at the nearby coffeeshop from 2 pm to 3 pm that day. Then $P(G|E_1) > P(G)$, $P(G|E_2) > P(G)$ (assuming that being in the vicinity helps show that the suspect had the opportunity to commit the crime), yet $P(G|E_1 \cap E_2) < P(G)$ (as being in the coffeeshop from 1 pm to 3 pm gives the suspect an alibi for the full time).

  1. ©s A crime is committed by one of two suspects, A and B. Initially, there is equal evidence against both of them. In further investigation at the crime scene, it is found that the guilty party had a blood type found in 10% of the population. Suspect A does match this blood type, whereas the blood type of Suspect B is unknown.

(a) Given this new information, what is the probability that A is the guilty party?

(b) Given this new information, what is the probability that B’s blood type matches that found at the crime scene?

Solution:

(a) Let M be the event that A’s blood type matches the guilty party’s and for brevity, write A for “A is guilty” and B for “B is guilty”. By Bayes’ Rule,

$$P(A|M) = \frac{P(M|A)P(A)}{P(M|A)P(A) + P(M|B)P(B)} = \frac{1 \cdot \frac{1}{2}}{1 \cdot \frac{1}{2} + \frac{1}{10} \cdot \frac{1}{2}} = \frac{10}{11}.$$

(We have P(M|B) = 1/10 since, given that B is guilty, the probability that A’s blood type matches the guilty party’s is the same probability as for the general population.)

(b) Let C be the event that B’s blood type matches, and condition on whether B is guilty. This gives

$$P(C|M) = P(C|M, A)P(A|M) + P(C|M, B)P(B|M) = \frac{1}{10} \cdot \frac{10}{11} + 1 \cdot \frac{1}{11} = \frac{2}{11}.$$
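Both parts can be worked through with exact fractions. The sketch below follows the events M and C as defined in the solution; the variable names are illustrative.

```python
from fractions import Fraction

prior_a = prior_b = Fraction(1, 2)        # equal evidence against A and B initially
match_rate = Fraction(1, 10)              # blood type frequency in the population

# (a) M: A's blood type matches. P(M|A) = 1, P(M|B) = 1/10.
p_m = 1 * prior_a + match_rate * prior_b
p_a_given_m = 1 * prior_a / p_m
p_b_given_m = 1 - p_a_given_m

# (b) C: B's blood type matches. P(C|M, A) = 1/10, P(C|M, B) = 1.
p_c_given_m = match_rate * p_a_given_m + 1 * p_b_given_m

print(p_a_given_m, p_c_given_m)
```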

  1. ©s To battle against spam, Bob installs two anti-spam programs. An email arrives, which is either legitimate (event $L$) or spam (event $L^c$), and which program j marks as legitimate (event $M_j$) or marks as spam (event $M_j^c$) for $j \in \{1, 2\}$. Assume that 10% of Bob’s email is legitimate and that the two programs are each “90% accurate” in the sense that $P(M_j|L) = P(M_j^c|L^c) = 9/10$. Also assume that given whether an email is spam, the two programs’ outputs are conditionally independent.

(a) Find the probability that the email is legitimate, given that the 1st program marks it as legitimate (simplify).

(b) Find the probability that the email is legitimate, given that both programs mark it as legitimate (simplify).

(c) Bob runs the 1st program and $M_1$ occurs. He updates his probabilities and then runs the 2nd program. Let $\tilde{P}(A) = P(A|M_1)$ be the updated probability function after running the 1st program. Explain briefly in words whether or not $\tilde{P}(L|M_2) = P(L|M_1 \cap M_2)$: is conditioning on $M_1 \cap M_2$ in one step equivalent to first conditioning on $M_1$, then updating probabilities, and then conditioning on $M_2$?

Solution:


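(a) By Bayes’ rule,

$$P(L|M_1) = \frac{P(M_1|L)P(L)}{P(M_1)} = \frac{\frac{9}{10} \cdot \frac{1}{10}}{\frac{9}{10} \cdot \frac{1}{10} + \frac{1}{10} \cdot \frac{9}{10}} = \frac{1}{2}.$$

(b) By Bayes’ rule,

$$P(L|M_1, M_2) = \frac{P(M_1, M_2|L)P(L)}{P(M_1, M_2)} = \frac{\left(\frac{9}{10}\right)^2 \cdot \frac{1}{10}}{\left(\frac{9}{10}\right)^2 \cdot \frac{1}{10} + \left(\frac{1}{10}\right)^2 \cdot \frac{9}{10}} = \frac{9}{10}.$$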

(c) Yes, they are the same, since Bayes’ rule is coherent. The probability of an event given various pieces of evidence does not depend on the order in which the pieces of evidence are incorporated into the updated probabilities.
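The coherence claim in part (c) can be illustrated numerically: updating on both marks at once and updating one mark at a time give the same posterior. This is a sketch that relies on the conditional independence assumption in the problem; `posterior_legit` is a hypothetical helper.

```python
from fractions import Fraction

accuracy = Fraction(9, 10)     # P(M_j | L) = P(M_j^c | L^c)
prior_legit = Fraction(1, 10)  # 10% of Bob's email is legitimate

def posterior_legit(prior: Fraction, marks: int) -> Fraction:
    """P(L | `marks` conditionally independent programs all mark the email legitimate)."""
    numerator = accuracy ** marks * prior
    return numerator / (numerator + (1 - accuracy) ** marks * (1 - prior))

one_step = posterior_legit(prior_legit, 2)                    # condition on M_1 and M_2 together
after_first = posterior_legit(prior_legit, 1)                 # condition on M_1
two_steps = posterior_legit(after_first, 1)                   # then condition on M_2

print(after_first, one_step, two_steps)
```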

Independence and conditional independence

  1. ©s A family has 3 children, creatively named A, B, and C.

(a) Discuss intuitively (but clearly) whether the event “A is older than B” is independent of the event “A is older than C”.

(b) Find the probability that A is older than B, given that A is older than C.

Solution:

(a) They are not independent: knowing that A is older than B makes it more likely that A is older than C, since if A is older than B, then the only way that A can be younger than C is if the birth order is CAB, whereas the birth orders ABC and ACB are both compatible with A being older than B. To make this more intuitive, think of an extreme case where there are 100 children instead of 3; call them $A_1, \ldots, A_{100}$. Given that $A_1$ is older than all of $A_2, A_3, \ldots, A_{99}$, it’s clear that $A_1$ is very old (relatively), whereas there isn’t evidence about where $A_{100}$ fits into the birth order.

(b) Writing x > y to mean that x is older than y,

$$P(A > B \mid A > C) = \frac{P(A > B, A > C)}{P(A > C)} = \frac{1/3}{1/2} = \frac{2}{3},$$

since $P(A > B, A > C) = P(\text{A is the eldest child}) = 1/3$ (unconditionally, any of the 3 children is equally likely to be the eldest).
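With only 3! = 6 equally likely birth orders, the conditional probability can be read off by enumeration. A minimal sketch, with `older` as a hypothetical helper:

```python
from fractions import Fraction
from itertools import permutations

orders = list(permutations("ABC"))            # each tuple lists the children eldest first

def older(order, x, y) -> bool:
    """True if child x is older than child y in the given birth order."""
    return order.index(x) < order.index(y)

a_older_than_c = [o for o in orders if older(o, "A", "C")]
both = [o for o in a_older_than_c if older(o, "A", "B")]

p_conditional = Fraction(len(both), len(a_older_than_c))
p_unconditional = Fraction(sum(older(o, "A", "B") for o in orders), len(orders))
print(p_conditional, p_unconditional)
```

The conditional probability exceeds the unconditional one, which is the dependence described in part (a).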

  1. ©s Is it possible that an event is independent of itself? If so, when is this the case?

Solution: Let A be an event. If A is independent of itself, then $P(A) = P(A \cap A) = P(A)^2$, so P(A) is 0 or 1. So this is only possible in the extreme cases that the event has probability 0 or 1.

  1. ©s Consider four nonstandard dice (the Efron dice), whose sides are labeled as follows (the 6 sides on each die are equally likely).

A: 4, 4, 4, 4, 0, 0
B: 3, 3, 3, 3, 3, 3
C: 6, 6, 2, 2, 2, 2
D: 5, 5, 5, 1, 1, 1

These four dice are each rolled once. Let A be the result for die A, B be the result for die B, etc.


Monty Hall

  1. ©s (a) Consider the following 7-door version of the Monty Hall problem. There are 7 doors, behind one of which there is a car (which you want), and behind the rest of which there are goats (which you don’t want). Initially, all possibilities are equally likely for where the car is. You choose a door. Monty Hall then opens 3 goat doors, and offers you the option of switching to any of the remaining 3 doors. Assume that Monty Hall knows which door has the car, will always open 3 goat doors and offer the option of switching, and that Monty chooses with equal probabilities from all his choices of which goat doors to open. Should you switch? What is your probability of success if you switch to one of the remaining 3 doors?

(b) Generalize the above to a Monty Hall problem where there are n ≥ 3 doors, of which Monty opens m goat doors, with 1 ≤ m ≤ n − 2.

Solution:

(a) Assume the doors are labeled such that you choose door 1 (to simplify notation), and suppose first that you follow the “stick to your original choice” strategy. Let S be the event of success in getting the car, and let $C_j$ be the event that the car is behind door j. Conditioning on which door has the car, we have

$$P(S) = P(S|C_1)P(C_1) + \cdots + P(S|C_7)P(C_7) = P(C_1) = \frac{1}{7}.$$

Let $M_{ijk}$ be the event that Monty opens doors i, j, k. Then

$$P(S) = \sum_{i,j,k} P(S|M_{ijk})P(M_{ijk})$$

(summed over all i, j, k with 2 ≤ i < j < k ≤ 7). By symmetry, this gives

$$P(S|M_{ijk}) = P(S) = \frac{1}{7}$$

for all i, j, k with 2 ≤ i < j < k ≤ 7. Thus, the conditional probability that the car is behind 1 of the remaining 3 doors is 6/7, which gives 2/7 for each. So you should switch, thus making your probability of success 2/7 rather than 1/7.

(b) By the same reasoning, the probability of success for “stick to your original choice” is $\frac{1}{n}$, both unconditionally and conditionally. Each of the n − m − 1 remaining doors has conditional probability $\frac{n-1}{(n-m-1)n}$ of having the car. This value is greater than $\frac{1}{n}$, so you should switch, thus obtaining probability $\frac{n-1}{(n-m-1)n}$ of success (both conditionally and unconditionally).
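The general formula can be compared against an exact enumeration over car locations and Monty's possible choices. The sketch below assumes you always pick door 1 and, when switching, choose uniformly among the unopened doors other than your own; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def switch_success_enumerated(n: int, m: int) -> Fraction:
    """Exact P(win by switching) with n doors, of which Monty opens m goat doors."""
    if not (n >= 3 and 1 <= m <= n - 2):
        raise ValueError("need n >= 3 and 1 <= m <= n - 2")
    total = Fraction(0)
    for car in range(1, n + 1):                               # car location, probability 1/n each
        goats = [d for d in range(2, n + 1) if d != car]      # doors Monty is allowed to open
        openings = list(combinations(goats, m))               # Monty chooses uniformly among these
        for opened in openings:
            remaining = [d for d in range(2, n + 1) if d not in opened]
            win = Fraction(remaining.count(car), len(remaining))
            total += Fraction(1, n) * Fraction(1, len(openings)) * win
    return total

print(switch_success_enumerated(7, 3))
```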

  1. ©s Consider the Monty Hall problem, except that Monty enjoys opening door 2 more than he enjoys opening door 3, and if he has a choice between opening these two doors, he opens door 2 with probability p, where $\frac{1}{2} \le p \le 1$. To recap: there are three doors, behind one of which there is a car (which you want), and behind the other two of which there are goats (which you don’t want). Initially, all possibilities are equally likely for where the car is. You choose a door, which for concreteness we assume is door 1. Monty Hall then opens a door to reveal a goat, and offers you the option of switching. Assume that Monty Hall knows which door has the car, will always open a goat door and offer the option of switching, and as above assume that if Monty Hall has a choice between opening door 2 and door 3, he chooses door 2 with probability p (with $\frac{1}{2} \le p \le 1$).

(a) Find the unconditional probability that the strategy of always switching succeeds (unconditional in the sense that we do not condition on which of doors 2 or 3 Monty opens).


(b) Find the probability that the strategy of always switching succeeds, given that Monty opens door 2.

(c) Find the probability that the strategy of always switching succeeds, given that Monty opens door 3.

Solution:

(a) Let Cj be the event that the car is hidden behind door j and let W be the event that we win using the switching strategy. Using the law of total probability, we can find the unconditional probability of winning in the same way as in class:

$$P(W) = P(W|C_1)P(C_1) + P(W|C_2)P(C_2) + P(W|C_3)P(C_3) = 0 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} = \frac{2}{3}.$$

(b) A tree method works well here (delete the paths which are no longer relevant after the conditioning, and reweight the remaining values by dividing by their sum), or we can use Bayes’ rule and the law of total probability (as below).

Let $D_i$ be the event that Monty opens Door i. Note that we are looking for $P(W|D_2)$, which is the same as $P(C_3|D_2)$ as we first choose Door 1 and then switch to Door 3. By Bayes’ rule and the law of total probability,

$$P(C_3|D_2) = \frac{P(D_2|C_3)P(C_3)}{P(D_2)} = \frac{P(D_2|C_3)P(C_3)}{P(D_2|C_1)P(C_1) + P(D_2|C_2)P(C_2) + P(D_2|C_3)P(C_3)} = \frac{1 \cdot 1/3}{p \cdot 1/3 + 0 \cdot 1/3 + 1 \cdot 1/3} = \frac{1}{1+p}.$$

(c) The structure of the problem is the same as part (b) (except for the condition that p ≥ 1/2, which was not needed above). Imagine repainting doors 2 and 3, reversing which is called which. By part (b) with 1 − p in place of p, $P(C_2|D_3) = \frac{1}{1+(1-p)} = \frac{1}{2-p}$.
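Parts (a) through (c) fit together through the law of total probability, which the following sketch checks with exact fractions. The function name is hypothetical, and p = 3/4 is an arbitrary illustrative value.

```python
from fractions import Fraction

def switch_win_given_opened(p: Fraction, opened: int) -> Fraction:
    """P(switching wins | Monty opens `opened`), having picked door 1; `opened` is 2 or 3."""
    third = Fraction(1, 3)
    # P(Monty opens door 2 | car behind door c)
    open2 = {1: p, 2: Fraction(0), 3: Fraction(1)}
    if opened == 2:
        likelihood, winning_door = open2, 3
    else:
        likelihood, winning_door = {c: 1 - v for c, v in open2.items()}, 2
    evidence = sum(likelihood[c] * third for c in (1, 2, 3))      # P(Monty opens `opened`)
    return likelihood[winning_door] * third / evidence

p = Fraction(3, 4)                          # illustrative value with 1/2 <= p <= 1
given_2 = switch_win_given_opened(p, 2)     # should equal 1 / (1 + p)
given_3 = switch_win_given_opened(p, 3)     # should equal 1 / (2 - p)
p_open2 = (1 + p) / 3
overall = p_open2 * given_2 + (1 - p_open2) * given_3
print(given_2, given_3, overall)
```

Whatever p is, the weighted average of the two conditional answers recovers the unconditional 2/3 from part (a).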

First-step analysis and gambler’s ruin

  1. ©s A fair die is rolled repeatedly, and a running total is kept (which is, at each time, the total of all the rolls up until that time). Let $p_n$ be the probability that the running total is ever exactly n (assume the die will always be rolled enough times so that the running total will eventually exceed n, but it may or may not ever equal n).

(a) Write down a recursive equation for $p_n$ (relating $p_n$ to earlier terms $p_k$ in a simple way). Your equation should be true for all positive integers n, so give a definition of $p_0$ and $p_k$ for k < 0 so that the recursive equation is true for small values of n.

(b) Find $p_7$.

(c) Give an intuitive explanation for the fact that $p_n \to 1/3.5 = 2/7$ as $n \to \infty$.

Solution:

(a) We will find something to condition on to reduce the case of interest to earlier, simpler cases. This is achieved by the useful strategy of first step analysis. Let $p_n$ be the probability that the running total is ever exactly n. Note that if, for example, the first


$\frac{1}{2}$ for $p = \frac{1}{2}$. Also, it makes sense that the probability of Hobbes winning, which is $1 - P(C) = \frac{q^2}{p^2 + q^2}$, can also be obtained by swapping p and q.

(b) The problem can be thought of as a gambler’s ruin where each player starts out with \$2. So the probability that Calvin wins the match is

$$\frac{1 - (q/p)^2}{1 - (q/p)^4} = \frac{(p^2 - q^2)/p^2}{(p^4 - q^4)/p^4} = \frac{(p^2 - q^2)/p^2}{(p^2 - q^2)(p^2 + q^2)/p^4} = \frac{p^2}{p^2 + q^2},$$

which agrees with the above.
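The algebra in part (b) can be spot-checked by evaluating the gambler's ruin expression and the simplified form for several values of p. This is a sketch; the function name is hypothetical, and the fair case p = q is handled separately because the ratio formula is 0/0 there.

```python
from fractions import Fraction

def calvin_wins_ruin(p: Fraction) -> Fraction:
    """Gambler's ruin probability with both players starting at 2 and Calvin winning each round w.p. p > 0."""
    q = 1 - p
    if p == q:
        return Fraction(1, 2)            # fair case, where the ratio formula is undefined
    r = q / p
    return (1 - r ** 2) / (1 - r ** 4)

test_values = [Fraction(k, 10) for k in range(1, 11)]
agreement = all(
    calvin_wins_ruin(p) == p ** 2 / (p ** 2 + (1 - p) ** 2) for p in test_values
)
print(calvin_wins_ruin(Fraction(2, 3)), agreement)
```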

Simpson’s paradox

  1. ©s (a) Is it possible to have events A, B, C such that $P(A|C) < P(B|C)$ and $P(A|C^c) < P(B|C^c)$, yet $P(A) > P(B)$? That is, A is less likely than B given that C is true, and also less likely than B given that C is false, yet A is more likely than B if we’re given no information about C. Show this is impossible (with a short proof) or find a counterexample (with a story interpreting A, B, C).

(b) If the scenario in (a) is possible, is it a special case of Simpson’s paradox, equivalent to Simpson’s paradox, or neither? If it is impossible, explain intuitively why it is impossible even though Simpson’s paradox is possible.

Solution:

(a) It is not possible, as seen using the law of total probability:

$$P(A) = P(A|C)P(C) + P(A|C^c)P(C^c) < P(B|C)P(C) + P(B|C^c)P(C^c) = P(B).$$

(b) In Simpson’s paradox, using the notation from the chapter, we can expand out $P(A|B)$ and $P(A|B^c)$ using LOTP to condition on C, but the inequality can flip because of the weights such as $P(C|B)$ on the terms (e.g., Dr. Nick performs a lot more Band-Aid removals than Dr. Hibbert). In this problem, the weights $P(C)$ and $P(C^c)$ are the same in both expansions, so the inequality is preserved.

  1. ©s Consider the following conversation from an episode of The Simpsons:

Lisa: Dad, I think he’s an ivory dealer! His boots are ivory, his hat is ivory, and I’m pretty sure that check is ivory.

Homer: Lisa, a guy who has lots of ivory is less likely to hurt Stampy than a guy whose ivory supplies are low.

Here Homer and Lisa are debating the question of whether or not the man (named Blackheart) is likely to hurt Stampy the Elephant if they sell Stampy to him. They clearly disagree about how to use their observations about Blackheart to learn about the probability (conditional on the evidence) that Blackheart will hurt Stampy.

(a) Define clear notation for the various events of interest here.

(b) Express Lisa’s and Homer’s arguments (Lisa’s is partly implicit) as conditional probability statements in terms of your notation from (a).

(c) Assume it is true that someone who has a lot of a commodity will have less desire to acquire more of the commodity. Explain what is wrong with Homer’s reasoning that the evidence about Blackheart makes it less likely that he will harm Stampy.

Solution:

(a) Let H be the event that the man will hurt Stampy, let L be the event that a man has lots of ivory, and let D be the event that the man is an ivory dealer.


(b) Lisa observes that L is true. She suggests (reasonably) that this evidence makes D more likely, i.e., $P(D|L) > P(D)$. Implicitly, she suggests that this makes it likely that the man will hurt Stampy, i.e.,

$$P(H|L) > P(H|L^c).$$

Homer argues that $P(H|L) < P(H|L^c)$.

(c) Homer does not realize that observing that Blackheart has so much ivory makes it much more likely that Blackheart is an ivory dealer, which in turn makes it more likely that the man will hurt Stampy. This is an example of Simpson’s paradox. It may be true that, controlling for whether or not Blackheart is a dealer, having high ivory supplies makes it less likely that he will harm Stampy: $P(H|L, D) < P(H|L^c, D)$ and $P(H|L, D^c) < P(H|L^c, D^c)$. However, this does not imply that $P(H|L) < P(H|L^c)$.

  1. ©s The book Red State, Blue State, Rich State, Poor State by Andrew Gelman [?] discusses the following election phenomenon: within any U.S. state, a wealthy voter is more likely to vote for a Republican than a poor voter, yet the wealthier states tend to favor Democratic candidates! In short: rich individuals (in any state) tend to vote for Republicans, while states with a higher percentage of rich people tend to favor Democrats.

(a) Assume for simplicity that there are only 2 states (called Red and Blue), each of which has 100 people, and that each person is either rich or poor, and either a Democrat or a Republican. Make up numbers consistent with the above, showing how this phenomenon is possible, by giving a 2 × 2 table for each state (listing how many people in each state are rich Democrats, etc.).

(b) In the setup of (a) (not necessarily with the numbers you made up there), let D be the event that a randomly chosen person is a Democrat (with all 200 people equally likely), and B be the event that the person lives in the Blue State. Suppose that 10 people move from the Blue State to the Red State. Write $P_{\text{old}}$ and $P_{\text{new}}$ for probabilities before and after they move. Assume that people do not change parties, so we have $P_{\text{new}}(D) = P_{\text{old}}(D)$. Is it possible that both $P_{\text{new}}(D|B) > P_{\text{old}}(D|B)$ and $P_{\text{new}}(D|B^c) > P_{\text{old}}(D|B^c)$ are true? If so, explain how it is possible and why it does not contradict the law of total probability $P(D) = P(D|B)P(B) + P(D|B^c)P(B^c)$; if not, show that it is impossible.

Solution:

(a) Here are two tables that are as desired:

Red     Dem   Rep   Total
Rich      5    25      30
Poor     20    50      70
Total    25    75     100

Blue    Dem   Rep   Total
Rich     45    15      60
Poor     35     5      40
Total    80    20     100
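The numbers in the two tables can be checked against the three requirements of the problem. The sketch below stores the counts in a hypothetical dictionary layout and uses illustrative helper names.

```python
from fractions import Fraction

# counts[state][wealth] = (number of Democrats, number of Republicans)
counts = {
    "Red":  {"Rich": (5, 25),  "Poor": (20, 50)},
    "Blue": {"Rich": (45, 15), "Poor": (35, 5)},
}

def rep_share(state: str, wealth: str) -> Fraction:
    """Fraction of Republicans among people of the given wealth in the given state."""
    dem, rep = counts[state][wealth]
    return Fraction(rep, dem + rep)

def dem_share(state: str) -> Fraction:
    """Fraction of Democrats in the whole state."""
    dems = sum(dem for dem, _ in counts[state].values())
    people = sum(dem + rep for dem, rep in counts[state].values())
    return Fraction(dems, people)

def rich_count(state: str) -> int:
    return sum(counts[state]["Rich"])

rich_lean_rep = all(rep_share(s, "Rich") > rep_share(s, "Poor") for s in counts)
blue_is_richer = rich_count("Blue") > rich_count("Red")
blue_more_dem = dem_share("Blue") > dem_share("Red")
print(rich_lean_rep, blue_is_richer, blue_more_dem)
```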

In these tables, within each state a rich person is more likely to be a Republican than a poor person; but the richer state has a higher percentage of Democrats than the poorer state. Of course, there are many possible tables that work. The above example is a form of Simpson’s paradox: aggregating the two tables seems to give different conclusions than conditioning on which state a person is in. Letting D, W, B be the events that a randomly chosen person is a Democrat, wealthy, and from the Blue State (respectively), for the above numbers we have $P(D|W, B) < P(D|W^c, B)$ and $P(D|W, B^c) < P(D|W^c, B^c)$ (controlling for whether the person is in the Red State or the Blue State, a poor person is more likely to be a Democrat than a rich person),