













































If it looks like a duck, walks like a duck, and quacks like a duck, then it's a duck. This is usually good reasoning. It's probably a duck. Just don't assume that it must be a duck for these reasons. The line of reasoning is not sure-fire. It is strong inductive reasoning but it is not strong enough to be deductively valid. Deductive arguments are arguments judged by the deductive standard of "Do the premises force the conclusion to be true?" Inductive arguments are arguments judged by the inductive standard of "Do the premises make the conclusion probable?" So the strengths of inductive arguments range from very weak to very strong. With inductively strong arguments there is a small probability that the conclusion is false even if the premises are true, unlike with deductively valid arguments. An inductive argument can be affected by acquiring new premises (evidence), but a deductive argument cannot be. This chapter focuses specifically on the nature of the inductive process because inductive arguments play such a central role in our lives. We will begin with a very important and very common kind of inductive argument, generalizing from a sample. Then later we will consider the wide variety of inductive arguments.
Scientists collect data not because they are in the business of gathering facts at random but because they hope to establish a generalization that goes beyond the individual facts. The scientist is in the business of sampling a part of nature and then looking for a pattern in the data that holds for nature as a whole. A sociologist collects data about murders in order to draw a general conclusion, such as "Most murders involve guns used on acquaintances." A statistician
would say that the scientist has sampled some cases of murder in order to draw a general conclusion about the whole population of murders. The terms sample and population are technical terms. The population need not be people; in our example it is the set of all murders. A sample is a subset of the population. The population is the set of things you are interested in generalizing about. The sample is examined to get a clue to what the whole population is like.
The goal in drawing a generalization based on a sample is for the sample to be representative of the population, to be just like it. If your method of selecting the sample is likely to be unrepresentative, then you are using a biased method, and that will cause you to commit the fallacy of biased generalization. If you draw the conclusion that the vast majority of philosophers write about the meaning of life because the web pages of all the philosophers at your university do, then you've got a biased method of sampling philosophers' writings.
Whenever a generalization is produced by generalizing on a sample, the reasoning process (or the general conclusion itself) is said to be an inductive generalization. It is also called an induction by enumeration or an empirical generalization. Inductive generalizations are a kind of argument by analogy with the implicit assumption that the sample is analogous to the population. The more analogous or representative the sample, the stronger the inductive argument.
Generalizations may be statistical or non-statistical. The generalization, "Most murders involve guns," contains no statistics. Replacing the term most with the statistic 80 percent would transform it into a statistical generalization. The statement "80 percent of murders involve guns" is called a simple statistical claim because it has the form
x percent of the group G has characteristic C.
In the example, x = 80, G = murders, and C = involving guns.
A general claim, whether statistical or not, is called an inductive generalization only if it is obtained by a process of generalizing from a sample. If the statistical claim about murders were obtained by looking at police records, it would be an inductive generalization, but if it were deduced from a more general principle of social psychology, then it would not be an inductive generalization, although it would still be a generalization.
Is the generalization "Most emeralds are green" a statistical generalization? Is it an inductive generalization?
────^306
306 It is not statistical, but you cannot tell whether it is an inductive generalization just by looking. It all depends on where it came from. If it was the product of sampling, it's an inductive generalization; if it was deduced from some other principle, it is not.
Random Sample
Statisticians have discovered several techniques for avoiding bias. The first is to obtain a random sample. When you sample at random, you don't favor any one member of the population over another. For example, when sampling tomato sauce cans, you don't pick the first three cans you see.
Definition A random sample is any sample obtained by using a random sampling method.
Definition A random sampling method is taking a sample from a target population in such a way that any member of the population has an equal chance of being chosen.
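Here is a minimal Python sketch of the random sampling method just defined, assuming the population is already available as a list. The can identifiers and the numbers are invented for illustration.

import random

# Hypothetical population: identifiers for 10,000 cans of tomato sauce.
population = [f"can_{i}" for i in range(10_000)]

# random.sample draws without replacement and gives every member of the
# population an equal chance of being chosen, which is what the definition
# of a random sampling method requires.
sample = random.sample(population, k=50)
print(sample[:5])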
It is easy to recognize the value of obtaining a random sample, but achieving this goal can be difficult. If you want to poll students for their views on canceling the school's intercollegiate athletics program in the face of the latest school budget crisis, how do you give everybody an equal chance to be polled? Some students are less apt to want to talk with you when you walk up to them with your clipboard. If you ask all your questions in three spots on campus, you may not be giving an equal chance to students who are never at those spots. Then there are problems with the poll questions themselves. The way the questions are constructed might influence the answers you get, and so you won't be getting a random sample of students' views even if you do get a random sample of students.
Purposely not using a random sample is perhaps the main way to lie with statistics. For one example, newspapers occasionally report that students in American middle schools and high schools are especially poor at math and science when compared to students in other countries. This surprising statistical generalization is probably based on a biased sample. It is quite true that those American students taking the international standardized tests of mathematics and science achievement do score worse than foreign students. The problem is that school administrators in other countries try too hard to do well on these tests. "In many countries, to look good is very good for international prestige. Some restrict the students taking the test to elite schools," says Harold Hodgkinson, the director of the Center for Demographic Policy in Washington and a former director of the National Institute of Education. For example, whereas the United States tests almost all of its students, Hong Kong does not. By the 12th grade, Hong Kong has eliminated all but the top 3 percent of its students from taking mathematics and thus from taking the standardized tests. In Japan, only 12 percent of their 12th grade students take any mathematics. Canada has especially good test results for the same reason. According to Hodgkinson, the United States doesn't look so bad when you take the above into account.
The following passage describes a non-statistical generalization from a sample. Try to spot the conclusion, the population, the sample, and any bias.
David went to the grocery store to get three cartons of strawberries. He briefly looked at the top layer of strawberries in each of the first three cartons in the strawberry section and noticed no fuzz on the berries. Confident that the berries in his three cartons were fuzz-free, he bought all three.
A sample S is less representative of a population P with respect to a characteristic C according to the degree to which the percentage of S that are C deviates from the percentage of P that are C.
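That definition can be turned into a simple computation. The sketch below, with invented strawberry data echoing David's shopping trip, measures how far a sample's percentage of a characteristic C deviates from the population's percentage.

def percent_with(group, has_c):
    # Percentage of the group's members that have characteristic C.
    return 100.0 * sum(1 for member in group if has_c(member)) / len(group)

def representativeness_deviation(sample, population, has_c):
    # 0 means the sample is perfectly representative with respect to C;
    # the larger the number, the less representative the sample.
    return abs(percent_with(sample, has_c) - percent_with(population, has_c))

# Invented data: C is the characteristic of being fuzzy.
population = ["fuzzy"] * 20 + ["clean"] * 80   # 20 percent of all berries are fuzzy
sample = ["clean"] * 10                        # top-layer sample: 0 percent fuzzy

print(representativeness_deviation(sample, population, lambda berry: berry == "fuzzy"))
# Prints 20.0: the top-layer sample deviates by 20 percentage points.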
If you are about to do some sampling, what can you do to improve your chances of getting a representative sample? The answer is to follow these four procedures, if you can:
1. Obtain a random sample.
2. Obtain a large sample.
3. Obtain a diverse sample.
4. Stratify the sample.
We've already discussed how to obtain a random sample. After we explore the other procedures, we'll be in a better position to appreciate why some random samples are to be avoided.
Which is the strongest and which is the weakest argument? The four arguments differ only in their use of the words random and about.
a. Twenty percent of a random sample of our university's students want library fines to be lower; so, 20 percent of our university's students want library fines to be lower.
b. Twenty percent of a sample of our university's students want library fines to be lower; so, 20 percent of our university's students want library fines to be lower.
c. Twenty percent of a random sample of our university's students want library fines to be lower; so, about 20 percent of our university's students want library fines to be lower.
d. Twenty percent of a sample of our university's students want library fines to be lower; so, about 20 percent of our university's students want library fines to be lower.
────^307
307 Answer (c) is strongest and (b) is the weakest. The word about in the conclusions of (c) and
(d) makes their conclusions less precise and thus more likely to be true, all other things being equal. For this reason, arguments (c) and (d) are better than arguments (a) and (b). Within each
of these pairs, the argument whose premises speak about a random sample is better than the
one whose premises don't speak about this. So (c) is better than (d), and (b) is worse than (a). Answers (d) and (b) are worse because you lack information about whether the samples are
random; however, not being told whether they are random does not permit you to conclude
that they are not random.
For the following statistical report, (a) identify the sample, (b) identify the population, (c) discuss the quality of the sampling method, and (d) find other problems either with the study or with your knowledge of the study.
Voluntary tests of 25,000 drivers throughout the United States showed that 25 percent of them use some drug while driving and that 85 percent use no drugs at all while driving. The conclusion was that 25 percent of U.S. drivers do use drugs while driving. A remarkable conclusion. The tests were taken at random times of the day at randomly selected freeway restaurants.
Sample Size
If you hear a TV commercial say that four out of five doctors recommend the pain reliever in the
drug being advertised, you might be impressed with the drug. However, if you learn that only
five doctors were interviewed, you would be much less impressed. Sample size is important.
308 (a) The sample is the 25,000 U.S. drivers who were tested. (b) The population is all U.S. drivers. (c) The sample size
is large enough, but it is not random, for four reasons: (1) Drivers who do not stop at roadside
restaurants did not have a chance of being sampled, (2) the study overemphasized freeway drivers rather than other drivers, (3) it overemphasized volunteers, (4) it overemphasized
drivers who drive at 4 a.m. (d) The most obvious error in the survey, or in the report of the
survey, is that 25 percent plus 85 percent is greater than 100 percent. Even though the survey
said these percentages are approximate, the 110 percent is still too high. Also, the reader would like more information in order to assess the quality of the study. In particular, how did the
study decide what counts as a drug, that is, how did it operationalize the concept of a drug? Are
these drugs: Aspirin? Caffeine? Vitamins? Alcohol? Only illegal drugs? Did the questionnaire ask whether the driver had ever used drugs while driving, or had ever used drugs period? Did
the pollster do the sampling on one day or over many days? Still, lack of information about the
survey is not necessarily a sign of error in the survey itself.
At any rate, whether we can be specific or not, the greater the margin of error we can
permit, the smaller the sample size we need. This result is an instance of the principle that the
less specific the conclusion of our argument, the stronger the argument.
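The tradeoff can be made concrete with the standard sample-size formula for a simple random sample, n = z^2 * p * (1 - p) / E^2 with the worst case p = 0.5, where E is the margin of error and z depends on the confidence level. This formula is a general statistics result rather than something derived in this chapter, so treat the numbers as illustrative.

from math import ceil

Z = {0.95: 1.96, 0.99: 2.576}   # z-scores for two common confidence levels

def sample_size(margin_of_error, confidence=0.95):
    # Worst-case required size for estimating a percentage from a simple random sample.
    z = Z[confidence]
    return ceil(z**2 * 0.25 / margin_of_error**2)

for e in (0.10, 0.05, 0.02):
    print(f"+/-{e:.0%} margin at 95% confidence: n = {sample_size(e)}")
# Roughly 97, 385, and 2401 respectively: the looser the margin, the smaller the sample.

The figure of 500 voters mentioned later in the chapter relies on stratification, which is what lets a pollster get by with a smaller sample than this unstratified formula suggests.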
This chapter will have more to say about sample size, but first we need to consider other ways
of improving the sampling process.
Sample Diversity
In addition to selecting a random, large sample, you can also improve your chances of selecting a representative sample by sampling a wide variety of members of the population. That is, aim for diversity, so that diversity in the sample is just like the diversity in the population. If you are interested in how Ohio citizens will vote in the next election, will you trust a pollster who took a random sample and ended up talking only to white, female voters? No. Even though those 50 white women were picked at random, you know you want to throw them out and pick 50 more. You want to force the sample to be diverse. The greater the diversity of relevant characteristics in your sample, the better the inductive generalization, all other things being equal.
Because one purpose of getting a large, random sample is to get one that is sufficiently diverse, if you already know that the population is homogeneous — that is, not especially diverse — then you don't need a big sample, or a particularly random one. For example, in 1906 the Chicago physicist R. A. Millikan measured the electric charge on electrons in his newly invented oil-drop device. His measurements clustered around a precise value for the electron's charge. Referring to this experiment, science teachers tell students that all electrons have this same charge. Yet Millikan did not test all electrons; he tested only a few and then generalized from that sample. His sample was very small and was not selected randomly. Is this grounds for worry about whether untested electrons might have a different charge? Did he commit the fallacy of hasty generalization? No, because physical theory at the time said that all electrons should have the same charge. There was absolutely no reason to worry that Tuesday's electrons would be different from Wednesday's, or that English electrons would be different from American ones. However, if this theoretical backup weren't there, Millikan's work with such a small, nonrandom sample would have committed the fallacy of hasty generalization. The moral: Relying on background knowledge about a population's lack of diversity can reduce the sample size needed for the generalization, and it can reduce the need for a random sampling procedure.
When you are sampling electrons, if you've seen one you've seen them all, so to speak. The diversity just isn't there, unlike with, say, Republican voters, who vary greatly from each other. If you want to sample Republican voters' opinions, you can't talk to one and assume that
his or her opinions are those of all the other Republicans. Republicans are heterogeneous, the fancy term for being diverse.
A group having considerable diversity in the relevant factors affecting the outcome of interest is said to be a heterogeneous group. A group with a relatively insignificant amount of diversity is said to be a homogeneous group. For example, in predicting the outcome of measuring the average height of two groups, Americans and Japanese, the diversity of American ethnicity makes Americans a heterogeneous group compared to the more homogeneous Japanese group. It is easier to make predictions for homogeneous groups than for heterogeneous groups.
Being homogeneous is relative, however. The Japanese might be more homogeneous than
Americans relative to measurements about height, but the Japanese might be more
heterogeneous than Americans when it comes to attitudes about socialism and about how to
care for infants.
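A short simulation can show why predictions are easier for homogeneous groups: averages computed from random samples bounce around less when the population itself varies less. The population means and spreads below are invented purely for illustration.

import random
import statistics

random.seed(0)

def spread_of_sample_means(pop_mean, pop_sd, sample_size=100, trials=1000):
    # Draw many random samples and report how much their averages vary.
    means = [statistics.mean(random.gauss(pop_mean, pop_sd) for _ in range(sample_size))
             for _ in range(trials)]
    return statistics.stdev(means)

print(spread_of_sample_means(165, 4))   # homogeneous population: sample means vary by about 0.4
print(spread_of_sample_means(173, 8))   # heterogeneous population: about 0.8, twice as much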
The most important goal in sampling is
a. randomness b. representativeness c. diversity d. large sample size
────^309
Suppose you know the average height of Japanese men and of American men. If you randomly pick a hundred Japanese businessmen, you can be more sure of their average height than you can be if you pick American businessmen. Explain why.
────^310
309 b
310 The variety of the Japanese data is less than that of the American data because Japan is a more homogeneous society. The American people are more ethnically diverse and so are more
genetically diverse, and genes affect human growth. Suppose the average Japanese man is 5' 5",
and the average American man is 5' 8". Then the average height of your hundred randomly chosen Japanese businessmen is more likely to be close to 5' 5" than the average height of a hundred randomly chosen American businessmen is to be close to 5' 8".
Stratified Samples

Suppose you are polling voters and you know in advance that 70 percent of the registered voters are white and 25 percent are black. You can use all this information about the voting population to take a better sample by making sure that your random sample contains exactly 70 percent white voters and exactly 25 percent black voters. If your poll actually were to contain 73 percent white voters, you would be well advised to randomly throw away some of the white voters' responses until you get the number down to 70 percent. The resulting stratification on race will improve the chances that your sample is representative. Stratification on the voters' soft drink preference would not help, however.
The definition of stratification uses the helpful concept of a variable. Roughly speaking, a variable is anything that comes in various types or amounts. There are different types of races, so race is a variable; there are different amounts of salaries, so salary is a variable; and so forth. Each type or amount of the variable is called a possible value of the variable. White and black are two values of the race variable. Suppose a population (say, of people) could be divided into different groups or strata, according to some variable characteristic (such as race). Suppose each group's members have the same value for that variable (for example, all the members of one group are black, all the members of another group are white, and so on). Suppose a sample is taken under the requirement that the percentage that has a given value (black) of the variable (race) must be the same as the known percentage of the value for the population as a whole. If so, then a stratified sample has been taken from that population, and the sample is said to be stratified on that variable.
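Here is a minimal sketch of taking a sample stratified on one variable, using the race percentages from the voting example. The voter records are invented, and the 5 percent "other" stratum is added here only so the quotas sum to the whole sample.

import random
from collections import defaultdict

def stratified_sample(population, get_value, known_percents, n):
    # Draw a sample of size n whose makeup on one variable matches the
    # known percentages for the population as a whole.
    strata = defaultdict(list)
    for member in population:
        strata[get_value(member)].append(member)
    sample = []
    for value, percent in known_percents.items():
        quota = round(n * percent / 100)           # how many members with this value we need
        sample.extend(random.sample(strata[value], quota))
    return sample

voters = [{"race": "white"}] * 7000 + [{"race": "black"}] * 2500 + [{"race": "other"}] * 500
poll = stratified_sample(voters, lambda v: v["race"],
                         {"white": 70, "black": 25, "other": 5}, n=500)
print(sum(1 for v in poll if v["race"] == "white") / len(poll))   # 0.7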
Stratification is a key to reducing sample size, thereby saving time and money. If you want to know how people are going to vote for the Republican candidate in the next presidential election, talking to only one randomly selected voter would obviously be too small a sample. However, getting a big enough sample is usually less of a problem than you might expect when you pay careful attention to stratification on groups that are likely to vote similarly. Most nonprofessionals believe that tens of thousands of people would need to be sampled. I asked my next-door neighbor how many he thought would be needed, and he said, "Oh, at least a hundred thousand." Surprisingly, 500 would be enough if the sample were stratified on race, income, employment type, political party, and other important variables. This 500 figure assumes the pollster need only be 95 percent sure that the results aren't off by more than 2 percent. If you can live with a greater margin of error than 2 percent and less confidence than 95 percent, then you can use a much smaller sample size.
The most important variables affecting voting are the voters' party, race, sex, income, and age. The more of these variables there are, the bigger the sample must be to make sure that enough voters representative of each value get polled. If the pollster has no idea what the variables are that will influence the results, he or she cannot know whether the sample is diverse in regard to these variables, so a very large sample will be needed. For example, if you wanted to know what percentage of jelly beans in an opaque jar are lime or licorice flavored, then all you can do is shake the jar and take as big a sample as you can.
Your quality control engineer conducts a weekly inspection of your company's new beverage. He gathers a random sample of 100 bottles produced on Mondays or Tuesdays. Over several
weeks, at most he finds one or two sampled bottles each week to be faulty. So you conclude that your manufacturing process is doing well on average every week, since your goal was to have at least 98 percent of the beverage be OK.
Suppose, however, that the quality control engineer knows that your plant produces an equal amount of the beverage on each weekday and that it produces beverages only on weekdays. Describe the best way for the quality control engineer to improve the sampling by paying attention to stratification.
a. Sample one beverage from each weekday.
b. Pick a larger and more random sample.
c. Take an equal number of samples on Saturdays and Sundays as well.
d. Make sure that 20 percent of the sample comes from each weekday.
e. Sample more of the bottles that will be delivered to your most valued customers.
────^312
Statistical Significance
Frequently, the conclusions of inductive generalizations are simple statistical claims. Our premise is "x percent of the sample is la-de-da." From this we conclude, "The same percent of the population is, too." When the argument is inductively strong, statisticians say the percent is statistically significant. A statistically significant statistic is one that probably is not due to chance. The number need not be significant in the sense of being important; that is the non-technical sense of the word significant.
Suppose you are interested in determining the percentage of left-handers in the world, and you aren't willing to trust the results of other people who have guessed at this percentage. Unless you have some deep insight into the genetic basis of left-handedness, you will have to obtain your answer from sampling. You will have to take a sample and use the fraction of people in your sample who are left-handed as your guess of the value of the target number. The target number is what statisticians call a parameter. The number you use to guess the parameter is called the statistic. Your statistic will have to meet higher standards the more confident you must be that it is a reliable estimate of the parameter.
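The parameter/statistic distinction can be seen in a small simulation. The 10 percent figure below is an assumed value used only to build a pretend population; in real sampling the parameter is exactly the unknown number you are trying to estimate.

import random

random.seed(1)

TRUE_PARAMETER = 0.10   # assumed fraction of left-handers, for the simulation only
population = [random.random() < TRUE_PARAMETER for _ in range(1_000_000)]

# The statistic: the fraction of left-handers in a random sample, used as
# the guess (estimate) of the parameter.
sample = random.sample(population, k=2_000)
statistic = sum(sample) / len(sample)
print(f"parameter = {TRUE_PARAMETER}, statistic = {statistic:.3f}")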
312 Answer (d). The suggestion in (b) would be good to do, but it has nothing to do with
stratification.
Designing a Paired Comparison Test
Suppose you own a food business and are considering marketing what your researcher/cook
says is a better version of one of your old food products, say, a vegetarian burrito. The main
factor in your decision will be whether your customers will like the taste of the new product better than the taste of the old one. You can make your marketing decision by guessing, by
letting your cook choose, by asking advice from your friends, or by some other method. You
decide to use another method: ask your own customers which of the two vegetarian burritos
they like better. Why not? If the customers in your sample prefer the new product, you will believe that the whole population will, too, and you will replace the old product with the new
one.
A good way to do this testing would be to use a procedure called paired comparison. In this kind of test, you remove the identifying labels from the old and new burrito products and then
give a few tasters the pairs of products in random orders. That is, some tasters get to taste the
new burrito first; some, the old one first. In neither case are they told which product they are
tasting. Then ask your taster/judges which product they like better. If a great many of them like the new one better than the old one, you can go with the new product.
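The randomizing step is easy to automate. A tiny sketch, with placeholder names for the tasters and for the two unlabeled products:

import random

tasters = ["taster_1", "taster_2", "taster_3", "taster_4", "taster_5"]

for taster in tasters:
    # Present the two unlabeled burritos in a random order to each taster,
    # so that neither recipe is systematically tasted first.
    first, second = random.sample(["new recipe", "old recipe"], k=2)
    print(taster, "tastes the", first, "then the", second)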
How many tasters do you need in order to get useful results? And if most of the tasters like the
new product but many do not, then how much disagreement can you accept and still be sure your customers generally will like the new product better? If three out of five tasters say the
new product is better but two out of five disagree, would a conclusion that over half your
customers would prefer the new burrito product be a statistically significant result? These are
difficult questions, but they have been studied extensively by statisticians, and the answers are clear.
Before those difficult questions can be answered, you need to settle another issue. How sure do
you have to be that your tasters' decision is correct, in the sense of accurately representing the
tastes of the general population of your customers? If you need to be 99 percent sure, you will need more tasters than if you need only to be 95 percent sure. Let's suppose you decide on 95
percent. Then, if you have, say, twenty tasters, how many of them would have to prefer the new
product before you can be 95 percent sure that your customers will like the new product better, too? If your taster-judges are picked randomly from among your population of customers and
aren't professionals in the tasting business, then statistical theory says you would need at least
75 percent (fifteen) of your twenty judges to prefer the new product. However, if you had more
judges, you wouldn't need this much agreement. For example, with sixty judges, you would need only 65 percent (thirty-nine) of your judges to give a positive response in order for you to
be confident that your customers will prefer the new product. What this statistic of thirty-nine
out of sixty means is that even if twenty-one out of your sixty judges were to say that your new
burrito is awful, you could be 95 percent sure that most consumers would disagree with them.
Yet many business persons who are not versed in such statistical reasoning would probably
worry unnecessarily about their new burrito if twenty-one of sixty testers disliked the product.
Statistical theory also indicates how much agreement among the judges would be required to
raise your confidence level from 95 percent to 99 percent. To be 99 percent sure that your
customers would prefer the new product to the old, you would need seventeen positive responses from your twenty judges, or forty-one positive responses from sixty judges.
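The chapter does not say which statistical test produces these thresholds, but they match an exact two-sided binomial (sign) test against a 50/50 split. The sketch below reproduces the 15-of-20 and 39-of-60 figures for 95 percent confidence, and the 17-of-20 and 41-of-60 figures for 99 percent, under that assumption.

from math import comb

def tail_probability(n, k):
    # P(X >= k) when X is Binomial(n, 0.5): the chance that k or more of n
    # judges would prefer the new product even if tastes were really split 50/50.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def judges_needed(n, confidence):
    # Smallest number of positive responses out of n judges that keeps the
    # chance of a 50/50 fluke below (1 - confidence) / 2.
    alpha = (1 - confidence) / 2
    for k in range(n // 2, n + 1):
        if tail_probability(n, k) <= alpha:
            return k

print(judges_needed(20, 0.95), judges_needed(60, 0.95))   # 15 39
print(judges_needed(20, 0.99), judges_needed(60, 0.99))   # 17 41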
Let's try another example. You recently purchased a new service station (gas station) and have
decided on an advertising campaign both to increase your visibility in the community and to
encourage new customers to use the station. You plan to advertise a free gift to every customer purchasing $10 or more of gasoline any time during the next two weeks. The problem now is to
select the gift. You have business connections enabling you to make an inexpensive purchase of
a large supply of either six-packs of Pepsi or engraved ballpoint pens with the name of a local sports team. You could advertise that you will give away free Pepsi, or else you could advertise
that you will give away the pens. The cost to you would be the same. You decide to choose
between the two on the basis of what you predict your potential customers would prefer. To do
this, you could, and should, use a paired comparison test. You decide you would like to be 95 percent sure of the result before you select the gift. You randomly choose twenty potential
customers and offer them their choice of free Pepsi or a free ballpoint pen. Ten are told they can
have the Pepsi or the pen; ten are told they can have the pen or the Pepsi. You analyze the
results. Three customers say they don't care which gift they get. Five say that they strongly prefer Pepsi to the pen because they don't like the sports team. Six say they would be happy
with either gift but would barely prefer the Pepsi. Four customers choose Pepsi because they
have enough pens. The rest choose pens with no comment. From this result, can you be
confident that it would be a mistake to go with the ballpoint pen?
Yes, you can be sure it would be a mistake. Your paired comparison test shows fifteen of twenty
prefer Pepsi. At the 95 percent confidence level, you can be sure that over 50 percent of your
customers would prefer the Pepsi. By the way, this information about numbers is for illustrative purposes. You as a student aren‘t in a statistics class, so you won‘t be quizzed on making these
calculations. But if you did own that service station you should use a paired comparison test
and get some number advice by looking up the info on the Internet or by asking somebody who
has taken a statistics class.
Suppose you learn that your favorite TV program was canceled because the A. C. Nielsen
Corporation reported to CBS that only 25 percent of the viewers were tuned to your program
last week. CBS wanted a 30 percent program in that time slot. You then learn more about the Nielsen test. Nielsen polled 400 viewers, 100 of whom said they were watching your program.
Knowing that the United States has 100 million TV sets, you might be shocked by CBS's making
a major financial decision based on the simple statistical claim that 100 out of 400 viewers prefer your program. Poll results can also be affected by small differences in how the questions are asked, such as the difference between asking "Do you favor Jones or Smith?" and "Do you favor Smith or Jones?" The moral is that natural obstacles and sloppy methodology combine to produce unreliable data and so to reduce the significance of our statistics.
Varieties of Inductive Arguments
We have just completed our analysis of one kind of inductive argument, generalizing from a sample. There are other kinds. The study of inductive logic is more complex than the study of deductive logic, and it is not as well developed. It consists merely of several independent topical areas, each of which focuses on a particular kind of inductive argument. This section of the chapter briefly introduces the different kinds. Some inductive arguments are of more than one kind.
Argument from Authority
Suppose a high school science teacher says to you,
The scientists I've read agree that Neptune is a cold planet compared to Mars, Earth, and Venus. So, Neptune is definitely a cold planet.
This argument from authority does not jump to conclusions. The high school teacher offers expert testimony although it is secondhand. It might be called hearsay in a courtroom, but it is reasonable grounds for accepting the conclusion. So, the conclusion follows with probability.
But with how much probability? Nobody knows, not even the scientists. Nobody can say authoritatively whether the conclusion is 85 percent probable or instead 90 percent probable. All they can properly say is that the appeal to authority makes the conclusion a safe bet because the proper authorities have been consulted, they have been quoted correctly, and it is well known that the experts do not significantly disagree with each other about this.
The conclusion of the following argument is not such a safe bet:
The scientists say astral travel is impossible. That is, our spiritual bodies can't temporarily leave our physical bodies and travel to other places. So they say. However, my neighbor and several of her friends told me they separately traveled to Egypt while their physical bodies were asleep last night. They visited the pyramids. These people are sincere and reliable. Therefore, the scientists are wrong about astral travel.
Is this a successful inductive argument? The arguer asks us to accept stories from his neighbor and her friends. These anecdotes are pitted against the claims of the scientists. Which should you believe? Scientists have been wrong many times before; couldn't they be wrong here, too? Yes, they could, but it wouldn't be a good bet. If you had some evidence that could
convincingly show the scientists to be wrong, then you, yourself, would likely soon become a famous scientist. You should be cautious about jumping to the conclusion that the scientists are wrong. The stories are so extraordinary that you really need extraordinarily good evidence to believe them. The only evidence in favor of the stories is the fact that the neighbors and friends, who are presumed to be reasonable, agree on their stories and the fact that several times in history other persons also have claimed to be astral travelers.
The neighbor might say that she does have evidence that could convincingly show the scientists to be wrong but that she couldn't get a fair hearing from the scientists because their minds are
closed to these possibilities of expanding their consciousness. Yes, the scientists probably would
give her the brush-off, but by and large the scientific community is open to new ideas. She
wouldn't get the scientists' attention because they are as busy as the rest of us, and they don't want to spend much time on unproductive projects. However, if the neighbor were to produce
some knowledge about the Egyptian pyramids that she probably couldn't have gotten until she
did her astral traveling, then the scientists would look more closely at what she is saying. Until
then, she will continue to be ignored by the establishment.
[Photo: Egypt’s Giza Pyramid]
Most of what we know we got from believing what the experts said, either firsthand or, more likely, secondhand. Not being experts ourselves, our problem is to be careful about sorting out
the claims of experts from the other claims that bombard us, while being aware of the
possibility that experts are misinterpreted, that on some topics they disagree, and that
occasionally they themselves cannot be trusted to speak straightforwardly. Sensitive to the possibility of misinterpreting experts, we prefer firsthand testimony to secondhand, and
secondhand to third hand. Sensitive to disagreement among the experts, we prefer unanimity
and believe that the greater the consensus, the stronger the argument from authority.