






Material Type: Exam; Professor: Toppen; Class: Scope and Methods; Subject: Political Science; University: Hope College; Term: Unknown 1989;
Typology: Exams
Module 2: Comparing Two Categorical Variables
In Module 1, we looked at describing a single variable. In Module 2, we will start to compare variables to one another and look for a relationship between them. Before making such comparisons, we need to note why random sampling matters and say a few words about what makes a good hypothesis.
Random Sampling
Most of the data used by political scientists come from a sample of the population they are trying to measure. A population is every single case of what the researcher wants to study. Usually it is not possible to obtain data on every member of a population, so a researcher uses a sample instead. A sample is a smaller set of cases drawn from the population. For a sample to be acceptable for statistical testing, it must be representative of the population as a whole. To ensure that a sample is representative, researchers use a technique known as random sampling. The idea is to arrange all the members of the population in a list, randomly select a number of them, and obtain data from them. The key is that every member of the population must have an equal chance of being selected; if this were not true, the sample would not be truly random and therefore could not be assumed to be representative of the population. Most statistical tests, including all of those discussed in this guide, assume that the data you are testing come from a random sample that is representative of the population.
Suppose that a researcher wants to do a study on the American public. He obtains a list of the telephone numbers of every person in Alabama, Colorado, Indiana, Maine, and Oregon. He then randomly selects 1,000 numbers and obtains the data he needs from them. Is this sample representative of his target population? No. Only people in those five states have a chance of being surveyed, whereas someone in Michigan has none. Therefore, the population his sample represents is the public in those five states, not the entire American public.
Now suppose that the same researcher is able to get a list of the phone numbers of everyone in America. To obtain his sample, he selects every 10,000th person on the list and surveys him or her. Is this an acceptable method? Again, the answer is no. The method is not random. Not everyone on the list has an equal chance of being selected, because only every 10,000th person is chosen. The selection is calculated rather than random.
The most popular way of getting a random sample is by using a table of random numbers (available in most statistics textbooks). Other ways of making selections random include rolling dice or using a computer's random number generator.
The survey data used in this guide can be assumed to be random samples. GSS1998.dta and NES2000.dta are prime examples of random samples obtained by professional organizations. STATES.dta and WORLD.dta are examples of cases in which it is possible to collect data on an entire population, so we do not have to concern ourselves with whether they are representative. Keep in mind that even representative samples contain some sampling error. The statistical tests that we will be exploring show how to account for this error and draw a conclusion about a relationship with some degree of certainty.
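The contrast between random selection and the "every 10,000th person" approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbered list below is a hypothetical stand-in for a phone list, and the function names `simple_random_sample` and `every_kth` are invented here, not taken from the guide or its data sets.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement so that every case has an equal chance."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, n)


def every_kth(population, k):
    """Calculated selection: always take the k-th, 2k-th, 3k-th, ... case."""
    return population[k - 1::k]


# Hypothetical population: 100,000 numbered cases standing in for a phone list.
population = list(range(1, 100_001))

random_cases = simple_random_sample(population, 10, seed=42)
fixed_cases = every_kth(population, 10_000)

print("random draw:", sorted(random_cases))
print("every 10,000th:", fixed_cases)
```

With the fixed rule, the same ten positions are chosen every time and every other case on the list can never be selected. With the random draw, any case on the list could have been chosen.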
A Good Hypothesis
Every empirical study starts with a good hypothesis. A hypothesis explains the results that a researcher thinks he will obtain from his testing. Usually a hypothesis has been well researched and involves a theory that the researcher hopes to support with data. A good hypothesis provides three important pieces of information about the study: the population, the variables involved, and the expected direction of the relationship. A good rough skeleton of a hypothesis is:
In comparing (insert population), those (cases) with a higher (independent variable) will have a (higher/lower) (dependent variable) than will those with a lower (independent variable).
The independent variable is the cause in the relationship, whereas the dependent variable is the effect of a change in the independent variable. On a graph, the x-axis shows the independent variable and the y-axis shows the dependent variable. A good way to tell the two apart is to remember that the value of the dependent variable depends on the value of the independent variable. A few examples of good hypotheses are:
In comparing states in the United States, those states with a higher high school graduation rate will have a higher voting rate than will those states with a lower high school graduation rate.
In comparing individuals in the United States, men are more likely to oppose same-sex marriages than are women.
Notice how the wording of the second example differs to accommodate the fact that its independent variable is nominal. With some practice, you will learn to write good hypotheses that identify the population, the variables, and the direction of the relationship. When performing statistical tests, it is important to note that we do not test the hypothesis directly; we test the null hypothesis. The null hypothesis (represented by H0) is essentially the opposite of your hypothesis: it states that there is no relationship between the variables. Your own hypothesis is referred to as the alternative hypothesis and is represented by Ha. In statistical testing, we seek not to prove the alternative, but rather to provide evidence against the null.
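The hypothesis skeleton given earlier in this section can be treated as a fill-in-the-blank template. The sketch below is one way to express that in Python; the function names and parameters are made up for illustration, and the wording of the null statement is an assumption based on the description above (no relationship between the variables).

```python
def alternative_hypothesis(population, cases, independent, dependent, direction):
    """Fill in the skeleton; direction must be 'higher' or 'lower'."""
    if direction not in ("higher", "lower"):
        raise ValueError("direction must be 'higher' or 'lower'")
    return (
        f"In comparing {population}, those {cases} with a higher {independent} "
        f"will have a {direction} {dependent} than will those with a lower {independent}."
    )


def null_hypothesis(population, independent, dependent):
    """The null states that the two variables are unrelated."""
    return (
        f"In comparing {population}, there is no relationship between "
        f"{independent} and {dependent}."
    )


ha = alternative_hypothesis(
    population="states in the United States",
    cases="states",
    independent="high school graduation rate",
    dependent="voting rate",
    direction="higher",
)
h0 = null_hypothesis(
    "states in the United States", "high school graduation rate", "voting rate"
)
print("Ha:", ha)
print("H0:", h0)
```

The template only fits hypotheses whose independent variable can be described as higher or lower. A nominal independent variable, as in the same-sex marriage example, needs its own wording.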
Comparing Two Categorical Variables
Now that we know our data come from a random sample and you know how to write a good hypothesis, we can begin to make comparisons between two variables. Depending on the level of measurement of the variables, there are different statistical tests we can use to test for a relationship. In this module, you will learn to compare two categorical (nominal or ordinal) variables and see whether they are related.
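One common way to compare two categorical variables is a cross-tabulation that reports, within each category of the independent variable, the percentage of cases falling in each category of the dependent variable. The sketch below shows the idea in plain Python. The twenty responses are invented toy data, not drawn from GSS1998.dta or NES2000.dta, and the function name `crosstab_percent` is hypothetical.

```python
from collections import Counter


def crosstab_percent(pairs):
    """pairs: list of (independent_value, dependent_value) tuples.

    Returns {independent_value: {dependent_value: percent}}, with percentages
    computed within each category of the independent variable.
    """
    cell_counts = Counter(pairs)
    group_totals = Counter(ind for ind, _ in pairs)
    table = {}
    for (ind, dep), n in cell_counts.items():
        table.setdefault(ind, {})[dep] = 100.0 * n / group_totals[ind]
    return table


# Invented toy responses (gender, opinion), for illustration only.
toy_survey = (
    [("men", "oppose")] * 6
    + [("men", "favor")] * 4
    + [("women", "oppose")] * 3
    + [("women", "favor")] * 7
)

table = crosstab_percent(toy_survey)
for group, row in table.items():
    print(group, row)
```

Percentaging within the independent variable makes the groups comparable even when they differ in size. Whether a difference between groups reflects more than sampling error is a separate question for a statistical test.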
[Figure 2.2: chart not reproduced. Vertical axis from 0 to 100; category labels "favor" and "oppose", each appearing twice.]