6.034f Probabilistic Inference Notes

Patrick Winston

Draft of December 9, 2010

The Joint Probability Table

Given a set of n binary variables, with values T or F, you can construct a table of size 2^n to keep track of the value combinations observed. In the following table, for example, there are three binary variables, so there are 2^3 = 8 rows.

Dog barks   Burglar   Raccoon   Tally     P       Selected
false       false     false       405   0.405
false       false     true        225   0.225
false       true      false         0   0.000
false       true      true          0   0.000
true        false     false        45   0.045
true        false     true        225   0.225
true        true      false        50   0.050
true        true      true         50   0.050
T/F?        T/F?      T/F?       1000   1.000   0.000

Tallying enables you, if you are a frequentist, to construct occurrence frequencies for the rows in the table, and you refer to those frequencies as probabilities. Alternatively, if you are a subjectivist, you can provide the probabilities by guessing what the frequencies should be.

Given the table, you can calculate the probability of any combination of rows by adding together their probabilities. You can limit your calculations to rows in which some criterion is satisfied. For example, the following table shows the probability that there is a raccoon present, given that the dog barks.

Dog barks   Burglar   Raccoon   Tally     P       Selected
false       false     false         0   0.000
false       false     true          0   0.000
false       true      false         0   0.000
false       true      true          0   0.000
true        false     false        45   0.122
true        false     true        225   0.608
true        true      false        50   0.135
true        true      true         50   0.135
T/F?        T/F?      T/F?        370   1.000   0.743
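
To make the row-summing mechanics concrete, here is a minimal Python sketch, added for illustration and not part of the original notes: it stores the tallies from the first table and computes the probability that a raccoon is present given that the dog barks by restricting attention to the rows in which the dog barks.

    # Joint tallies indexed by (dog_barks, burglar, raccoon), from the first table.
    tallies = {
        (False, False, False): 405,
        (False, False, True): 225,
        (False, True, False): 0,
        (False, True, True): 0,
        (True, False, False): 45,
        (True, False, True): 225,
        (True, True, False): 50,
        (True, True, True): 50,
    }

    def probability(event, given=lambda row: True):
        """P(event | given), computed by summing tallies over the selected rows."""
        total = sum(t for row, t in tallies.items() if given(row))
        selected = sum(t for row, t in tallies.items() if given(row) and event(row))
        return selected / total

    # P(raccoon | dog barks) = 275 / 370, about 0.743.
    print(round(probability(event=lambda row: row[2], given=lambda row: row[0]), 3))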

Unfortunately, the size of the table grows exponentially, so often there are too many probabilities to extract from frequency data or to estimate subjectively. You have to find another way that takes you through the axioms of probability, the definition of conditional probability, and the idea of independence.

The Axioms of Probability

The axioms of probability make sense intuitively given the capacity to draw Venn diagrams filled with a colored-pencil crosshatching. The first axiom states that probabilities are always equal to or greater than zero and less than or equal to one:

0 ≤ P(a) ≤ 1.0

Another axiom captures the idea that certainty means a probability of one; impossible, zero:

P(F) = 0.0    P(T) = 1.0

Finally, you have an axiom relating the either (∨) to the both (∧):

P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

Conjunction is generally indicated by a comma, rather than ∧:

P(a, b) = P(a ∧ b)
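
For example, using the joint probability table above: P(burglar) = 0.100, P(raccoon) = 0.500, and P(burglar, raccoon) = 0.050, so P(burglar ∨ raccoon) = 0.100 + 0.500 − 0.050 = 0.550, which matches what you get by adding the six rows in which either variable is true.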

Inference Nets

An inference net is a loop-free diagram that provides a convenient way to assert independence conditions. Such nets often, but not always, reflect causal pathways:

[Inference net diagram: Burglar and Raccoon have no parents; Dog barks has parents Burglar and Raccoon; Police called has parent Dog barks; Trash can has parent Raccoon. Each node is annotated with its probability, or its conditional probability for each combination of parent values, tallied from the data.]

When you draw such a net, you suggest that the influences on a variable all flow through the variable's parents, thus enabling the following to be said: Each variable in an inference net is independent of all nondescendant variables, given the variable's parents.

Note that the burglar and raccoon each appear with probabilities that do not depend on anything else, but the dog barks with differing probabilities depending on whether the burglar or the raccoon or both or neither are present.

The probabilities and conditional probabilities in the diagram are determined using the data to provide frequencies for all the possibilities, just as when creating the joint probability table.

Using the inference net, there are far fewer numbers to determine with frequency data or to invent subjectively. Here, there are just 10 instead of 2^5 = 32 numbers to make up. In general, if there are n variables, and no variable depends on more than p_max parents, then you are talking about n × 2^p_max numbers rather than 2^n, a huge, exponential difference.
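
To see where the 10 comes from: 1 number for Burglar, 1 for Raccoon, 4 for Dog barks given Burglar and Raccoon, 2 for Police called given Dog barks, and 2 for Trash can given Raccoon, for a total of 1 + 1 + 4 + 2 + 2 = 10.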

Generating a Joint Probability Table

Is the inference net enough to do calculation? You know that the joint probability table is enough, so it follows, via the chain rule, that an inference net is enough, because you can generate the rows in the joint probability table from the corresponding inference net.

To see why, note that, because inference nets have no loops, each inference net must have a variable without any descendants. Pick such a variable to be first in an ordering of the variables. Then, delete that variable from the diagram and pick another. There will always be one with no still-around descendants until you have constructed a complete ordering. No variable in your list can have any descendants to its right; the descendants, by virtue of how you constructed the list, are all to the left.
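
Here is a minimal Python sketch of that ordering procedure, added for illustration and not part of the original notes: repeatedly pick a variable that no remaining variable lists as a parent (so it has no still-around descendants), append it to the ordering, and delete it from consideration.

    def descendant_free_ordering(parents):
        """Order variables so that every variable's descendants lie to its left.

        parents maps each variable to the list of its parents in the net; a
        variable that nobody still lists as a parent has no remaining descendants.
        """
        remaining = set(parents)
        ordering = []
        while remaining:
            node = next(v for v in remaining
                        if not any(v in parents[w] for w in remaining if w != v))
            ordering.append(node)
            remaining.remove(node)
        return ordering

    parents = {
        "Burglar": [],
        "Raccoon": [],
        "Dog barks": ["Burglar", "Raccoon"],
        "Police called": ["Dog barks"],
        "Trash can": ["Raccoon"],
    }

    print(descendant_free_ordering(parents))
    # One valid result: ['Police called', 'Trash can', 'Dog barks', 'Burglar', 'Raccoon']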

Next, you use the chain rule to write out the probability of any row in the joint probability table in terms of the variables in your inference net, ordered as you have just laid them out.

For example, you can order the variables in the evolving example by chewing away at the variables without still-around descendants, producing, say, C, D, B, T, R. Then, using the chain rule, you produce the following equation:

P(C, D, B, T, R) = P(C | D, B, T, R) P(D | B, T, R) P(B | T, R) P(T | R) P(R)

With this ordering, all the conditional dependencies are on nondescendants. Then, knowing that the variables are independent of all nondescendants given their parents, you can strike out a lot of the apparent dependencies, leaving only dependencies on parents:

P(C, D, B, T, R) = P(C | D) P(D | B, R) P(B) P(T | R) P(R)

Thus, it is easy to get the probability of any row in the joint probability table; thus, it is easy to construct the table; thus, anything you need to infer can be inferred via the inference net.

You need not actually create the full joint probability table, but it is comforting to know that you can, in principle. You don't want to, in practice, because there are ways of performing your inference calculations that are more efficient, especially if your net has at most one path from any variable to any other.
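
The following Python sketch, added for illustration with made-up probability values rather than the numbers from the notes, multiplies out P(C | D) P(D | B, R) P(B) P(T | R) P(R) for every assignment of the five variables; the resulting 2^5 = 32 row probabilities form a joint probability table and sum to one.

    from itertools import product

    # Invented placeholder probabilities that each variable is True,
    # indexed by the values of its parents in the net.
    p_B = 0.1                                      # P(Burglar)
    p_R = 0.5                                      # P(Raccoon)
    p_D = {(False, False): 0.1, (False, True): 0.5,
           (True, False): 0.8, (True, True): 0.9}  # P(Dog barks | B, R)
    p_C = {False: 0.05, True: 0.6}                 # P(Police called | D)
    p_T = {False: 0.1, True: 0.7}                  # P(Trash can | R)

    def bernoulli(p_true, value):
        """Probability that a binary variable takes value when P(True) = p_true."""
        return p_true if value else 1.0 - p_true

    def row_probability(c, d, b, t, r):
        """P(C, D, B, T, R) = P(C | D) P(D | B, R) P(B) P(T | R) P(R)."""
        return (bernoulli(p_C[d], c) * bernoulli(p_D[(b, r)], d) *
                bernoulli(p_B, b) * bernoulli(p_T[r], t) * bernoulli(p_R, r))

    rows = list(product([False, True], repeat=5))
    print(sum(row_probability(*row) for row in rows))  # 1.0, up to rounding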

Naive Bayes Inference

Now, it is time to revisit the definition of conditional probability and take a walk on a path that will soon come back to inference nets. By symmetry, note that there are two ways to recast P(a, b):

P(a, b) = P(a | b) P(b)
P(a, b) = P(b | a) P(a)

This leads to the famous Bayes rule:

P(a | b) = P(b | a) P(a) / P(b)
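
For example, using the joint probability table from earlier: P(barks | raccoon) = P(barks, raccoon) / P(raccoon) = 0.275 / 0.500 = 0.55, and P(barks) = 0.370, so Bayes rule gives P(raccoon | barks) = 0.55 × 0.500 / 0.370 ≈ 0.743, the same answer obtained by summing rows directly.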

Now, suppose you are interested in classifying the cause of some observed evidence. You use Bayes rule to turn the probability of a class, c_i, given evidence into the probability of the evidence given the class, c_i:

P(c_i | e) = P(e | c_i) P(c_i) / P(e)

Then, if the evidence consists of a variety of independent observations, you can write the naive Bayes classifier, so called because the independence assumption is often unjustified:
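
The usual statement of the classifier, added here for illustration, treats each piece of evidence as conditionally independent given the class, so that P(c_i | e_1, ..., e_n) is proportional to P(c_i) P(e_1 | c_i) ⋯ P(e_n | c_i). A minimal Python sketch, with invented numbers purely for illustration:

    from math import prod

    def naive_bayes(priors, likelihoods, evidence):
        """Score each class c by P(c) times the product of P(e | c) over the
        observed evidence, then normalize the scores so they sum to one."""
        scores = {c: priors[c] * prod(likelihoods[c][e] for e in evidence)
                  for c in priors}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    # Invented numbers purely for illustration.
    priors = {"burglar": 0.1, "raccoon": 0.9}
    likelihoods = {
        "burglar": {"dog barks": 0.8, "trash tipped": 0.1},
        "raccoon": {"dog barks": 0.5, "trash tipped": 0.7},
    }
    print(naive_bayes(priors, likelihoods, ["dog barks", "trash tipped"]))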

and a way to favor fewer connections over more connections, which you can think of as a special case of Occam's razor.