






Examples of probability and expected value calculations using conditional probabilities and joint outcomes. It includes calculations of joint probabilities, expected values, variances, and conditional probabilities. It also discusses the concept of conditional independence and the representativeness heuristic. The examples are based on problems related to adventuresomeness, stock prices, unit sales, carcinogenic risk, and the Monty Hall problem.
σY = √[0.445 (-5.55)^2 + 0.555 (4.45)^2] = √24.70 ≈ 4.97
Thus, the correlation is ρXY =
7.11. a. The basic setup for this problem is the same as it is for the previous problem. We already have the joint probabilities, so we can start by calculating the expected values of X and Y :
Thus, we have the following table:

  X    Y    X - E(X)    Y - E(Y)    (X - E(X))(Y - E(Y))
 -2    2      -2          0.8            -1.6
 -1    1      -1         -0.2             0.2
  0    0       0         -1.2             0
  1    1       1         -0.2            -0.2
  2    2       2          0.8             1.6
The covariance is the expected value of the numbers in the last column, each of which can occur with probability 1/5. Calculating this expected value gives a covariance of zero. Likewise, the correlation equals zero.
b. P(Y = 2) = 0.4, but P(Y = 2 | X = -2) = P(Y = 2 | X = 2) = 1.0 and P(Y = 2 | X = 0) = 0.
c. P(X = -2) = 0.2, but P(X = -2 | Y = 2) = 0.5 and P(X = -2 | Y = 0) = 0.
d. Clearly, X and Y are dependent. In fact, Y = |X|. But the relationship is not linear, and covariance, which measures only linear association, does not capture it.
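The zero covariance in part a can be checked directly from the five equally likely (X, Y) pairs. The following is a minimal sketch; the variable names are illustrative and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# The five joint outcomes from the table, each with probability 1/5.
outcomes = [(-2, 2), (-1, 1), (0, 0), (1, 1), (2, 2)]
prob = Fraction(1, len(outcomes))

e_x = sum(prob * x for x, _ in outcomes)
e_y = sum(prob * y for _, y in outcomes)
cov_xy = sum(prob * (x - e_x) * (y - e_y) for x, y in outcomes)

# Y is completely determined by X, yet the covariance is exactly zero.
y_is_abs_x = all(y == abs(x) for x, y in outcomes)

print("E(X) =", e_x, " E(Y) =", e_y, " Cov(X, Y) =", cov_xy)
print("Y == |X| for every outcome:", y_is_abs_x)
```

Zero covariance here rules out a linear relationship only; the dependence shown in parts b and c is untouched by it.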
7.12. The influence diagram would show conditional independence between hemlines and stock prices, given adventuresomeness:
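[Influence diagram: one arc runs from Adventuresomeness to Hemlines and another from Adventuresomeness to Stock Prices; there is no arc between Hemlines and Stock Prices.]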
Thus (blatantly ignoring the clarity test), the probability statements would be P(Adventuresomeness), P(Hemlines | Adventuresomeness), and P(Stock prices | Adventuresomeness).
7.13. In many cases, it is not feasible to use a discrete model because of the large number of possible outcomes. The continuous model is a “convenient fiction” that allows us to construct a model and analyze it.
P(B and A) = P(B | A) P(A) = 0.30 (0.68) = 0.204
P(B and A–) = P(B | A–) P(A–) = 0.02 (0.32) = 0.0064
P(B | A) = P(B and A) / P(A)
P(B | A–) = P(B and A–) / P(A–)
P(B) = P(B and A) + P(B and A–) = 0.204 + 0.0064 = 0.2104
P(A | B) = P(A and B) / P(B)
P(A and B
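The joint probabilities, the total-probability step, and the resulting posterior can be reproduced numerically. This is a sketch that uses only the inputs stated above (P(A) = 0.68, P(B | A) = 0.30, P(B | A–) = 0.02); the variable names are illustrative.

```python
# Inputs taken from the worked lines above.
p_a = 0.68
p_b_given_a = 0.30
p_b_given_not_a = 0.02

p_not_a = 1 - p_a

# Joint probabilities via the multiplication rule.
p_b_and_a = p_b_given_a * p_a
p_b_and_not_a = p_b_given_not_a * p_not_a

# Law of total probability, then Bayes' theorem.
p_b = p_b_and_a + p_b_and_not_a
p_a_given_b = p_b_and_a / p_b

print(f"P(B and A)  = {p_b_and_a:.4f}")
print(f"P(B and A-) = {p_b_and_not_a:.4f}")
print(f"P(B)        = {p_b:.4f}")
print(f"P(A | B)    = {p_a_given_b:.4f}")
```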
7.15. P(offer) = 0. P(good interview | offer) = 0. P(good interview | no offer) = 0.
P(offer | good interview) = P(offer | good)
= P(good | offer) P(offer) / [P(good | offer) P(offer) + P(good | no offer) P(no offer)]
b. E(Total revenue) = $3.50 (2000) + $2.00 (10,000) + $1.87 (8500) = $42,895
Var(Total revenue) = 3.50^2 (1000) + 2.00^2 (6400) + 1.87^2 (1150) = 41,871 “dollars squared”
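Both lines apply the rules for a linear combination: E(ΣaX) = Σa E(X), and, when the quantities are uncorrelated, Var(ΣaX) = Σa^2 Var(X). The sketch below reproduces the arithmetic. Reading the coefficients as prices and the other numbers as means and variances of unit sales is an interpretation of the formulas, and the absence of covariance terms is an assumption carried over from the variance line.

```python
# Coefficients, means, and variances as they appear in the two formulas above.
prices = [3.50, 2.00, 1.87]
mean_units = [2000, 10_000, 8500]
var_units = [1000, 6400, 1150]

# E(sum a_i X_i) = sum a_i E(X_i)
expected_revenue = sum(a * m for a, m in zip(prices, mean_units))

# Var(sum a_i X_i) = sum a_i^2 Var(X_i), assuming no covariance terms.
variance_revenue = sum(a**2 * v for a, v in zip(prices, var_units))

print(f"E(Total revenue)   = ${expected_revenue:,.0f}")
print(f"Var(Total revenue) = {variance_revenue:,.0f} dollars squared")
```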
7.20. Let X1 = random number of breakdowns for Computer 1, and X2 = random number of breakdowns for Computer 2.
Cost = $200 (X1) + $165 (X2)
E(Cost) = $200 E(X1) + $165 E(X2) = $200 (5) + $165 (3.6) = $1,594
If X1 and X2 are independent, then
Var(Cost) = 200^2 Var(X1) + 165^2 Var(X2) = 200^2 (6) + 165^2 (7) = 430,575 “dollars squared”
σCost = √(430,575 “dollars squared”) = $656.18
The assumption made for the variance computation is that the computers break down independently of one another. Given that they are in separate buildings and operated separately, this seems like a reasonable assumption.
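To see how much the independence assumption matters, the sketch below computes the mean, variance, and standard deviation of the cost and lets a correlation between the two breakdown counts be supplied. The function name `cost_moments` and the correlation of 0.5 in the second call are hypothetical, used only for illustration.

```python
import math

def cost_moments(a, b, mean1, mean2, var1, var2, corr=0.0):
    """Mean, variance, and standard deviation of a*X1 + b*X2.

    corr is the correlation between X1 and X2; corr=0.0 gives the
    variance formula used under independence.
    """
    mean = a * mean1 + b * mean2
    cov = corr * math.sqrt(var1 * var2)
    var = a**2 * var1 + b**2 * var2 + 2 * a * b * cov
    return mean, var, math.sqrt(var)

# Values from the problem: costs $200 and $165, means 5 and 3.6, variances 6 and 7.
mean_cost, var_cost, sd_cost = cost_moments(200, 165, 5, 3.6, 6, 7)
print(f"E(Cost) = ${mean_cost:,.2f}, Var = {var_cost:,.0f}, SD = ${sd_cost:,.2f}")

# Hypothetical: breakdowns positively correlated (corr = 0.5 is made up).
_, var_corr, sd_corr = cost_moments(200, 165, 5, 3.6, 6, 7, corr=0.5)
print(f"With corr = 0.5: Var = {var_corr:,.0f}, SD = ${sd_corr:,.2f}")
```

The expected cost is unaffected by correlation; only the variance changes.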
7.21. The possible values for revenue are 100 ($3) = $300 and 300 ($2) = $600, each with probability 0.5. Thus, the expected revenue is 0.5 ($300) + 0.5 ($600) = $450. The manager’s mistake is in thinking that the expected value of a product equals the product of the expected values. That shortcut is guaranteed when the two variables are independent, but here price and unit sales are clearly dependent: the high price always goes with low sales.
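The gap between E(price x units) and E(price) x E(units) can be computed from the two scenarios in the problem. This is a minimal sketch with illustrative variable names.

```python
# Two equally likely scenarios: (units sold, price per unit, probability).
scenarios = [(100, 3.0, 0.5), (300, 2.0, 0.5)]

expected_revenue = sum(p * units * price for units, price, p in scenarios)
expected_units = sum(p * units for units, _, p in scenarios)
expected_price = sum(p * price for _, price, p in scenarios)

# The shortcut multiplies the two expectations instead.
naive_revenue = expected_units * expected_price

# The difference is the covariance between units and price.
covariance = expected_revenue - naive_revenue

print(f"E(units x price)   = ${expected_revenue:.0f}")
print(f"E(units) x E(price) = ${naive_revenue:.0f}")
print(f"Cov(units, price)  = {covariance:.0f}")
```

The negative covariance reflects that high prices coincide with low sales, which is why the shortcut overstates expected revenue.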
7.22. Notation: “Pos” = positive, “Neg” = negative, “D” = disease, “D–” = no disease.
P(Pos) = P(Pos | D) P(D) + P(Pos | D–) P(D–) = 0.0239
P(Neg) = 1 - P(Pos) = 1 - 0.0239 = 0.9761
P(D | Pos) = P(Pos | D) P(D) / P(Pos)
P(D | Neg) = P(Neg | D) P(D) / P(Neg)
Probability tree (test result first):
Test positive (0.0239): Disease (0.795), No disease (0.205)
Test negative (0.9761): Disease (0.0010), No disease (0.9990)
[Probability table with columns Pos and Neg: entries not recoverable.]
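The tree probabilities can be reproduced with Bayes' theorem. The problem's input values do not appear in this excerpt, so the sketch below assumes P(D) = 0.02, P(Pos | D) = 0.95, and P(Pos | D–) = 0.005. These are assumptions, chosen only because they match the quoted results (0.0239, 0.795, 0.0010).

```python
# ASSUMED inputs (not stated in this excerpt); they reproduce the tree values.
p_d = 0.02               # prior probability of disease
p_pos_given_d = 0.95     # test sensitivity
p_pos_given_not_d = 0.005  # false-positive rate

p_not_d = 1 - p_d

# Law of total probability for the test result.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * p_not_d
p_neg = 1 - p_pos

# Bayes' theorem for each test result.
p_d_given_pos = p_pos_given_d * p_d / p_pos
p_d_given_neg = (1 - p_pos_given_d) * p_d / p_neg

print(f"P(Pos)     = {p_pos:.4f}")
print(f"P(Neg)     = {p_neg:.4f}")
print(f"P(D | Pos) = {p_d_given_pos:.3f}")
print(f"P(D | Neg) = {p_d_given_neg:.4f}")
```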
7.23. Test results and field results are conditionally independent given the level of carcinogenic risk. Alternatively, given the level of carcinogenic risk, knowing the test results will not help specify the field results.
7.24. P(TR+ and FR+ | CP high) = P(TR+ | FR+ and CP high) P(FR+ | CP high)
= P(TR+ | CP high) P(FR+ | CP high)
The second equality follows because FR and TR are conditionally independent given CP. In other words, we just multiply the probabilities together. This is true for all four of the probabilities required:
P(TR+ and FR+ | CP high) = 0.82 (0.95) = 0.779
P(TR+ and FR- | CP high) = 0.82 (0.05) = 0.041
P(TR- and FR- | CP low) = 0.79 (0.83) = 0.6557
P(TR- and FR+ | CP low) = 0.79 (0.17) = 0.1343
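Under conditional independence, the full joint table of test results (TR) and field results (FR) for each level of carcinogenic potential (CP) is just the product of the two conditional marginals. The sketch below builds those tables from the four probabilities used above; the dictionary layout and function name are illustrative.

```python
# Conditional probabilities of a positive result, given the CP level.
# The "low" entries are derived from P(TR- | low) = 0.79 and P(FR- | low) = 0.83.
p_tr_pos = {"high": 0.82, "low": 1 - 0.79}
p_fr_pos = {"high": 0.95, "low": 1 - 0.83}

def joint_given_cp(cp):
    """Joint distribution of (TR, FR) given CP, assuming conditional independence."""
    t, f = p_tr_pos[cp], p_fr_pos[cp]
    return {
        ("+", "+"): t * f,
        ("+", "-"): t * (1 - f),
        ("-", "+"): (1 - t) * f,
        ("-", "-"): (1 - t) * (1 - f),
    }

joint_high = joint_given_cp("high")
joint_low = joint_given_cp("low")

for cp, table in (("high", joint_high), ("low", joint_low)):
    for (tr, fr), prob in table.items():
        print(f"P(TR{tr} and FR{fr} | CP {cp}) = {prob:.4f}")
```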
7.25. Students’ answers will vary considerably here, depending on their opinions. However, most will rate h as more likely than f. Tversky and Kahneman (1982) (see reference in text) found that as many as 85% of experimental subjects ranked the statements in this way, which is inconsistent with the idea of joint probability (see the next question). Moreover, this phenomenon was found to occur consistently regardless of the degree of statistical sophistication of the subject.
7.26. a. The students’ explanations will vary, but many of them argue on the basis of the degree to which Linda’s description is consistent with the possible classifications. Her description makes her sound not much like a bank teller and a lot like an active feminist. Thus, statement h (bank teller and feminist) is more consistent with the description than f (bank teller). Tversky and Kahneman claim that the conjunction effect observed in the responses to problem 7.25 stems from the representativeness heuristic. This heuristic is discussed in Chapter 8 of Making Hard Decisions with DecisionTools.
7.28. The host is proposing a decision tree that looks like this:
[Decision tree: Keep yields x; Switch yields x/2 or 2x with equal probability.]
But this is not correct. Suppose that x is equal to $100. Then the host is saying that if you swap, you have equally likely chances at an envelope with $200 and an envelope with $50. But that’s not the case! (If it were true, you would definitely want to switch.) Labeling the two envelopes A and B, the contestant correctly understands that the decision tree is as follows:
[Decision tree:
Keep A: A has x (0.5) yields x; B has x (0.5) yields x/2
Switch to B: A has x (0.5) yields x/2; B has x (0.5) yields x]
The two decision branches are equivalent from the point of view of the decision maker.
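The equivalence of the two branches can be verified by computing the expected payoff of each. The sketch assumes, as in the contestant's tree, that one envelope holds x and the other x/2, with either assignment equally likely. The function name and the choice x = 100 are illustrative.

```python
from fractions import Fraction

def expected_payoffs(x):
    """Expected payoff of keeping envelope A versus switching to B.

    ASSUMPTION: one envelope holds x and the other x/2, each assignment
    having probability 1/2.
    """
    half = Fraction(1, 2)
    # Each state: (probability, amount in A, amount in B).
    states = [
        (half, x, x * half),   # A has x, so B has x/2
        (half, x * half, x),   # B has x, so A has x/2
    ]
    keep = sum(p * a for p, a, _ in states)
    switch = sum(p * b for p, _, b in states)
    return keep, switch

keep_a, switch_b = expected_payoffs(Fraction(100))
print("Keep A:", keep_a, " Switch to B:", switch_b)
```

The host's tree goes wrong by treating the amount in the chosen envelope as fixed at x while letting the other envelope hold either x/2 or 2x.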
7.29. Here are two possible influence diagrams:
[Influence diagram 1: nodes Site?, Dome at Site 1?, Amount of oil, and Value. Site? has alternatives Site 1 and Site 2. Dome at Site 1? has outcomes Yes and No. Amount of oil has outcomes High, Low, and Dry for Site 1 (with or without a dome) and Low and Dry for Site 2. The Value table lists 500 and 150; the remaining table entries are not recoverable.]
The two diagrams are equivalent. However, while the first is somewhat more compact, the second may represent the uncertainty in the problem better. Also, with the second it would be more straightforward to calculate the expected value of information for the uncertain events.
The second version is modeled in the Excel file “Problem 7.29.xls”. The optimal solution is to choose the first site, and the expected value is $10,000.
7.30.
[Probability tree and its collapsed form:]

Delay?                        Sales        Joint probability       Profit
No delay, high price (0.95)   High (0.4)   0.38 = 0.95 x 0.4       $8 M
No delay, high price (0.95)   Low (0.6)    0.57 = 0.95 x 0.6       0
Delay, low price (0.05)       High (0.5)   0.025 = 0.05 x 0.5      $3.5 M
Delay, low price (0.05)       Low (0.5)    0.025 = 0.05 x 0.5      $1 M
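Collapsing the tree amounts to multiplying probabilities along each path. The sketch below does that and also computes the expected profit. Pairing each profit with a path follows the order in which the values appear in the tree and should be read as an assumption; the data layout is illustrative.

```python
# Stage 1: delay or not; stage 2: sales level given the delay outcome.
# Each sales entry is (conditional probability, profit in $ millions).
# ASSUMPTION: profits are matched to paths in the order shown in the tree.
tree = {
    "no delay, high price": (0.95, {"high": (0.4, 8.0), "low": (0.6, 0.0)}),
    "delay, low price":     (0.05, {"high": (0.5, 3.5), "low": (0.5, 1.0)}),
}

path_probs = {}
expected_profit = 0.0
for delay_outcome, (p_delay, sales) in tree.items():
    for level, (p_sales, profit) in sales.items():
        joint = p_delay * p_sales            # multiply along the path
        path_probs[(delay_outcome, level)] = joint
        expected_profit += joint * profit

for path, prob in path_probs.items():
    print(path, f"{prob:.3f}")
print(f"Expected profit = ${expected_profit:.4f} M")
```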
[Influence diagram (oil-site choice, version with a separate amount node for each site): nodes Site?, Dome at Site 1?, Amount of oil at Site 1, Amount of oil at Site 2, and Value. Site? has alternatives Site 1 and Site 2. Dome at Site 1? has outcomes Yes and No. Amount of oil at Site 1 has outcomes High, Low, and Dry; Amount of oil at Site 2 has outcomes Low and Dry. The Value table lists 500 and 150; the remaining table entries are not recoverable.]
P(No Dome | -) = 1 - 0.017 = 0.983
We can now calculate the EMV for Site 1, given test results are negative:
EMV(Site 1 | -) = (EMV | Dome) P(Dome | -) + (EMV | No dome) P(No dome | -) = ($52.50 K) 0.017 + (-$53.75 K) 0.983 = -$51.944 K
EMV(Site 1 | -) is less than the EMV(Site 2 | -). If the test gives a negative result, choose Site 2.
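The EMV comparison after a negative test can be sketched as follows. The posterior 0.017 and the two conditional EMVs come from the lines above. Taking EMV(Site 2 | -) = 0 is an assumption based on the value EMV(Site 2) = 0 used in Problem 7.34; the variable names are illustrative.

```python
# Values in $ thousands, from the worked solution.
p_dome_given_neg = 0.017
emv_site1_if_dome = 52.50
emv_site1_if_no_dome = -53.75

# ASSUMPTION: Site 2's EMV is 0 and does not depend on the test result.
emv_site2_given_neg = 0.0

p_no_dome_given_neg = 1 - p_dome_given_neg
emv_site1_given_neg = (emv_site1_if_dome * p_dome_given_neg
                       + emv_site1_if_no_dome * p_no_dome_given_neg)

choice = "Site 1" if emv_site1_given_neg > emv_site2_given_neg else "Site 2"
print(f"EMV(Site 1 | -) = {emv_site1_given_neg:.3f} K  ->  choose {choice}")
```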
7.33. P(+ and Dome) = P(+ | Dome) P(Dome) = 0.99 (0.60) = 0.594.
P(+ and Dome and Dry) = P(Dry | + and Dome) P(+ and Dome)
But P(Dry | + and Dome) = P(Dry | Dome) = 0.60. That is, the presence or absence of the dome is what matters, not the test results themselves. Therefore:
P(+ and Dome and Dry) = 0.60 (0.594) = 0.3564
Finally,
P(Dome | + and Dry) = P(Dome and + and Dry) / P(+ and Dry)
But
P(+ and Dry) = P(+ and Dry | Dome) P(Dome) + P(+ and Dry | No dome) P(No dome)
and
P(+ and Dry | Dome) = P(Dry | + and Dome) P(+ | Dome) = P(Dry | Dome) P(+ | Dome) = 0.6 (0.99)
P(+ and Dry | No dome) = P(Dry | + and No dome) P(+ | No dome) = P(Dry | No dome) P(+ | No dome) = 0.85 (0.15)
Now we can substitute back in:
P(+ and Dry) = 0.6 (0.99) (0.6) + 0.85 (0.15) (0.4) = 0.4074
and
P(Dome | + and Dry) = 0.3564 / 0.4074 = 0.875
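The chain of substitutions in Problem 7.33 can be checked numerically. The sketch uses the probabilities stated in the solution and relies on the same conditional independence (the test result and the amount of oil are independent given the dome status); the variable names are illustrative.

```python
# Probabilities stated in the solution.
p_dome = 0.60
p_pos_given_dome = 0.99
p_pos_given_no_dome = 0.15
p_dry_given_dome = 0.60
p_dry_given_no_dome = 0.85

p_no_dome = 1 - p_dome

# Given dome status, test result and dryness are independent,
# so P(+ and Dry | status) is a simple product.
p_pos_dry_dome = p_dry_given_dome * p_pos_given_dome * p_dome
p_pos_dry_no_dome = p_dry_given_no_dome * p_pos_given_no_dome * p_no_dome

p_pos_and_dry = p_pos_dry_dome + p_pos_dry_no_dome
p_dome_given_pos_dry = p_pos_dry_dome / p_pos_and_dry

print(f"P(+ and Dome and Dry) = {p_pos_dry_dome:.4f}")
print(f"P(+ and Dry)          = {p_pos_and_dry:.4f}")
print(f"P(Dome | + and Dry)   = {p_dome_given_pos_dry:.4f}")
```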
7.34. EMV(Site 1) = p (52.50) + (1 - p)(-53.75)
Set this equal to EMV(Site 2) = 0, and solve for p:
p (52.50) + (1 - p)(-53.75) = 0
52.50 p + 53.75 p = 53.75
p = 53.75 / 106.25 = 0.506
If 0.55 < P(Dome) < 0.65, then P(Dome) exceeds the break-even value of 0.506 throughout, so the optimal choice for the entire region is to drill at Site #1.
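The break-even probability and the conclusion for the range 0.55 to 0.65 can be confirmed with a short calculation. This is a sketch with illustrative names; values are in $ thousands as in the solution.

```python
EMV_IF_DOME = 52.50
EMV_IF_NO_DOME = -53.75
EMV_SITE_2 = 0.0

def emv_site1(p_dome):
    """EMV of drilling at Site 1 as a function of P(Dome)."""
    return p_dome * EMV_IF_DOME + (1 - p_dome) * EMV_IF_NO_DOME

# Solve p*52.50 + (1 - p)*(-53.75) = 0 for p.
break_even = -EMV_IF_NO_DOME / (EMV_IF_DOME - EMV_IF_NO_DOME)

# Site 1 is preferred at both ends of the range, and EMV is linear in p,
# so it is preferred throughout the range.
site1_preferred = all(emv_site1(p) > EMV_SITE_2 for p in (0.55, 0.65))

print(f"Break-even P(Dome) = {break_even:.4f}")
print("Site 1 preferred on [0.55, 0.65]:", site1_preferred)
```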
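7.35. Choose Site 1 if
EMV(Site 1) > EMV(Site 2)
q (52.50) + (1 - q)(-53.75) > p (-200) + (1 - p)(50)
106.25 q - 53.75 > 50 - 250 p
q > -2.3529 p + 0.9765
[Graph: axes p = P(Dry at Site 2) and q = P(Dome at Site 1), each running from 0 to 1. The line q = -2.3529 p + 0.9765 divides the unit square into two regions: Site 1 is preferred where q lies above the line, and Site 2 where it lies below.]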
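The two-way sensitivity analysis reduces to comparing q with a threshold that is linear in p. The sketch below derives the slope and intercept from the EMV expressions and classifies a point (p, q). The function names and the sample points are illustrative.

```python
def emv_site1(q):
    """EMV of Site 1 ($ thousands) given q = P(Dome at Site 1)."""
    return q * 52.50 + (1 - q) * (-53.75)

def emv_site2(p):
    """EMV of Site 2 ($ thousands) given p = P(Dry at Site 2)."""
    return p * (-200) + (1 - p) * 50

# Rearranging emv_site1(q) > emv_site2(p) gives q > slope * p + intercept.
slope = -250 / 106.25
intercept = (50 + 53.75) / 106.25

def preferred_site(p, q):
    """Return the site with the higher EMV at the point (p, q)."""
    return "Site 1" if emv_site1(q) > emv_site2(p) else "Site 2"

print(f"q > {slope:.4f} p + {intercept:.4f}")
print(preferred_site(p=0.2, q=0.6))   # above the line
print(preferred_site(p=0.0, q=0.5))   # below the line
```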
7.36. P(FR pos) = P(FR pos | CP High) P(CP High) + P(FR pos | CP Low) P(CP Low) = 0.95 (0.27) + 0.17 (0.73) = 0.3806
P(FR + | TR +) = P(FR + | CP High and TR +) P(CP High | TR +) + P(FR + | CP Low and TR +) P(CP Low | TR +)
But FR and TR are conditionally independent given CP, so
P(FR + | CP High and TR +) = P(FR + | CP High) = 0.95