Statistics and Probability: Sampling Distributions and Hypothesis Testing

An overview of statistical concepts related to sampling distributions and hypothesis testing. It covers topics such as empirical and theoretical probability distributions, discrete and continuous distributions, population parameters, expected value and mean, variance and standard deviation, Chebyshev's inequality, the normal distribution, hypothesis testing with known and unknown standard errors, and the t distribution. The document also discusses the central limit theorem and the sampling distribution of sample means.


Lecture

POLI 7962: Seminar in Quantitative Political Analysis

August 2-6, 2007

I. Sample Estimation of Population Parameters

A. Introduction

  1. In testing hypotheses about the real world, it is unusual for the researcher to have collected data that involves the population of all possible relevant cases. For instance, if we are interested in the determinants of voting behavior for American citizens, it is impossible to collect data on all eligible citizens of voting age. Instead, researchers must rely on samples of the population in order to make inferences about the parameters (characteristics) of that population. The researcher hopes that the distribution of outcomes observed is a good approximation of the population distribution from which the sample was drawn. If it is a good approximation, the sample mean and standard deviation, for example, would also be expected to be good approximations of the population mean and standard deviation.

a. Random sampling

  2. Inferential statistics: numbers that represent generalizations, or inferences, drawn about some characteristic (parameter) of a population, based on evidence from a sample of observations from the population.

a. Inference: the process of making generalizations or drawing conclusions about the attributes of a population based on evidence contained in a sample.

b. More broadly, an inference is the process of drawing conclusions about that which is not observed directly.

B. Probability Distributions

  1. A probability distribution is a set of outcomes, each of which has an associated probability of occurrence.

a. Empirical probability distribution: a probability distribution for a set of empirical observations.

b. Theoretical probability distribution: a probability distribution for a set of theoretical observations.

  2. Discrete vs. continuous probability distributions.

C. Describing Discrete Probability Distributions

  1. Population parameters: descriptive characteristics of a population (such as mean, variance, or correlation), usually designated by a Greek letter.
  2. Expected value and mean of a probability distribution

a. The single outcome that best describes a probability distribution is its expected value, which is also the mean of the probability distribution:

E(Y) = Σ Y_i p(Y_i)

μ = E(Y)

b. If one is looking for a single number that (1) characterizes a given distribution, (2) minimizes the sum of squared deviations, and (3) represents the best “guess” of a randomly selected number from a given distribution, the mean (i.e., expected value) is that number.

  3. Variance (σ²) and standard deviation (σ) of the population: because researchers do not ordinarily observe populations, these parameters are largely of theoretical interest.
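
To make these definitions concrete, here is a minimal Python sketch (the outcomes and probabilities are invented for illustration, not from the original notes) that computes the expected value, variance, and standard deviation of a discrete probability distribution:

```python
# Expected value, variance, and standard deviation of a discrete
# probability distribution (hypothetical outcomes and probabilities).
outcomes = [0, 1, 2, 3]          # possible values Y_i
probs    = [0.1, 0.3, 0.4, 0.2]  # p(Y_i); must sum to 1

# E(Y) = Σ Y_i p(Y_i)
mu = sum(y * p for y, p in zip(outcomes, probs))

# σ² = Σ (Y_i - μ)² p(Y_i)
var = sum((y - mu) ** 2 * p for y, p in zip(outcomes, probs))
sd = var ** 0.5

print(f"E(Y) = {mu:.2f}, variance = {var:.2f}, sd = {sd:.2f}")
# E(Y) = 1.70, variance = 0.81, sd = 0.90
```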

D. Chebyshev's Inequality Theorem

  1. Observations that are distant from the mean of a distribution occur, on average, with less frequency than those close to the mean. In general, the more distant an outcome is from its mean, the lower the probability of observing it. Hence extreme scores are less likely to occur.
  2. Chebyshev's Inequality: a theorem which states that, regardless of the shape of a distribution, the probability of an observation falling k or more standard deviations above (or below) the population mean is less than or equal to 1/k².
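
A quick empirical check of the bound (an illustrative sketch, not part of the original notes): for k = 2, no more than 1/4 of any distribution can lie two or more standard deviations from its mean, and even a heavily skewed distribution respects the bound:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=100_000)  # skewed, non-normal data

mu, sigma = y.mean(), y.std()
for k in (2, 3):
    observed = np.mean(np.abs(y - mu) >= k * sigma)
    print(f"k={k}: observed share {observed:.3f} <= bound {1 / k**2:.3f}")
```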

E. Normal Distributions

  1. Normal distribution: a smooth, bell-shaped theoretical probability distribution for continuous variables that ranges from -∞ to +∞.

a. The shape of any given normal curve can be determined by two values: the population mean and variance.

b. The values of any normal distribution can easily be converted to Z-scores using the formula:

Z = (Y - μ_Y) / σ_Y
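
For instance (a small sketch with made-up values for μ_Y and σ_Y):

```python
# Convert raw scores to Z-scores given hypothetical population parameters.
mu_y, sigma_y = 50.0, 10.0
scores = [35.0, 50.0, 72.0]

z_scores = [(y - mu_y) / sigma_y for y in scores]
print(z_scores)  # [-1.5, 0.0, 2.2]
```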

F. The Central Limit Theorem

  1. Central limit theorem: a mathematical theorem which states that if repeated random samples of size N are selected from any population with mean = μ and standard deviation = σ, then the means of the samples will be normally distributed with mean = μ and standard error = σ/√N as N gets large. In other words, the mean of the distribution of all the sample means of a given sample size drawn at random will equal the mean of the population from which the samples were drawn. Moreover, the standard deviation of this new hypothetical distribution is smaller than the original population standard deviation by exactly a factor of 1/√N. The theorem does not make any assumption about the shape of the population from which the samples are drawn.
  2. Sampling distribution of sample means: the population distribution of all possible means for samples of size N selected from a population.
  3. Standard error: the standard deviation of the sampling distribution. For the μ parameter, the standard error is represented by the following equation:

σ_M = σ / √N

a. The central limit theorem guarantees that a given sample mean can be made to come arbitrarily close to the population mean simply by choosing an N large enough, since the variance of the sampling distribution of means becomes small as N gets larger.

b. Based on information pertaining to normal distributions, one should expect that 95% of all sample means will fall within 1.96 standard errors of the population mean.
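
The theorem is easy to verify by simulation. The sketch below (illustrative, not from the original notes) draws 10,000 samples of size N = 100 from a heavily skewed population with μ = 1 and σ = 1; the sample means come out centered on μ with spread close to σ/√N, and about 95% of them fall within 1.96 standard errors of μ:

```python
import numpy as np

rng = np.random.default_rng(42)
N, n_samples = 100, 10_000

# A skewed "population": exponential with mu = 1 and sigma = 1.
sample_means = rng.exponential(scale=1.0, size=(n_samples, N)).mean(axis=1)

print(f"mean of sample means: {sample_means.mean():.3f} (mu = 1)")
print(f"sd of sample means:   {sample_means.std():.3f} "
      f"(sigma/sqrt(N) = {1 / np.sqrt(N):.3f})")

# Share of sample means within 1.96 standard errors of mu:
inside = np.mean(np.abs(sample_means - 1.0) <= 1.96 / np.sqrt(N))
print(f"within 1.96 SE: {inside:.3f}")  # roughly 0.95
```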

G. Sample Point Estimates and Confidence Intervals

  1. Point estimate: a sample statistic used to estimate a population parameter. For instance, the sample mean (M) can be used to estimate the population mean (μ). This estimate has some level of uncertainty associated with it, so it becomes crucial to specify that uncertainty so that one can judge the quality of the parameter estimate.
  2. Confidence interval: a range of values constructed around a point estimate which makes it possible to state the probability that the interval contains the population parameter between its upper and lower confidence limits.

a. In general:

M ± (Z_α/2)(σ_M)

b. For 95% confidence interval:

M ± (1.96)(σ_M)

  3. In general, the larger the sample size, the smaller the interval around the sample mean for a given confidence level, primarily because the standard error is an inverse function of the square root of the sample size.
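
A minimal sketch of constructing a 95% confidence interval, assuming a known population standard deviation (the sample values and σ below are invented):

```python
import numpy as np

sample = np.array([48.0, 52.5, 61.0, 45.5, 55.0, 49.0, 58.5, 50.5])
sigma = 6.0                      # assumed known population sd
N = len(sample)

m = sample.mean()                # point estimate of mu
se = sigma / np.sqrt(N)          # standard error of the mean
lower, upper = m - 1.96 * se, m + 1.96 * se
print(f"M = {m:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
# Quadrupling N would halve the width of this interval.
```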
J. Hypothesis testing when the standard error is known

  1. In testing hypotheses relating to means, the following are the steps in the process:

(1) The first step in hypothesis testing is to specify the null hypothesis and the alternative hypothesis. In testing hypotheses about μ, the null hypothesis (H0) is a hypothesized value of μ, usually associated with a null effect. The alternative hypothesis (H1) represents the research hypothesis.

(2) The second step is to choose a significance level (α): the probability of falsely rejecting the null hypothesis that one is willing to accept. By convention, the .05 level is normally chosen.

(3) The third step is to establish the critical test statistic (in this case, a Z value) associated with the level of significance from step #2. This critical value will be compared to a test statistic that determines whether the observed mean is sufficiently different from the null hypothesized mean to justify rejecting the null hypothesis.

(4) The fourth step is to calculate the test Z score, using the following formula:

Z = (M - μ_H0) / σ_M

(5) In the fifth step, one compares the critical Z and test Z. If the test statistic is further away from 0 than the critical Z, then one rejects the null hypothesis and “accepts” the working hypothesis. Otherwise one is unable to reject the null hypothesis.
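
Putting the five steps together, a hedged sketch with invented numbers (a two-tailed test of H0: μ = 50 with a known population standard deviation):

```python
import numpy as np
from scipy import stats

# Step 1: H0: mu = 50 vs. H1: mu != 50 (hypothetical data).
mu_h0 = 50.0
sample = np.array([53.0, 55.5, 49.0, 58.0, 52.5,
                   54.0, 51.5, 56.0, 50.5, 57.0])
sigma = 5.0                              # assumed known population sd

# Step 2: choose the significance level.
alpha = 0.05

# Step 3: critical Z for a two-tailed test at alpha = .05 (about 1.96).
z_crit = stats.norm.ppf(1 - alpha / 2)

# Step 4: test Z = (M - mu_H0) / sigma_M.
sigma_m = sigma / np.sqrt(len(sample))
z_test = (sample.mean() - mu_h0) / sigma_m

# Step 5: compare the test Z with the critical Z.
print(f"test Z = {z_test:.2f}, critical Z = {z_crit:.2f}")
print("reject H0" if abs(z_test) > z_crit else "fail to reject H0")
```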

K. Hypothesis testing when the standard error is unknown: The t distribution

  1. Thus far it has been assumed that the researcher knows the standard error of the sampling distribution of the mean, or that N is large enough so that we can use the sample variance as a reasonable estimate of the population variance. If either of these assumptions is not met, then the Z distribution is inappropriate for hypothesis testing. If this is the case, researchers must rely on the t distribution:

a. t distribution: a test statistic used with small samples selected from a normally distributed population or, for large samples, drawn from a population with any shape.

  2. t variable (t score): a transformation of the scores of a continuous frequency distribution derived by subtracting the mean and dividing by the estimated standard error.

For hypothesis tests involving the mean, the following equation is used to calculate the test t statistic:

t = (M - μ_H0) / s_M

  3. The t and Z distributions are very similar, with the major difference being that t involves the use of the standard deviation of the sample in calculating the standard error, while Z assumes knowledge of the population standard deviation. In addition, there are many t distributions, with their shapes varying with the sample size (the degrees of freedom). All t distributions, like Z-transformed normal distributions, are bell-shaped and have a mean of zero. But there are two other important differences:

a. For small samples (N < 100), the use of the t distribution to test hypotheses assumes that the sample is drawn from a normally distributed population. While this assumption appears to be restrictive, research has demonstrated that violations of this assumption have only minor effects on the computations of the test statistic. Therefore, unless there is evidence that the underlying population distribution is grossly non-normal, one can use the t test to test hypotheses even when N is small.

b. A t distribution for a given sample size has a larger variance than a normal Z distribution. Therefore, the standard error of a t distribution is larger than that of a normal Z distribution. However, as N becomes large (i.e. in the range of 100), a t distribution becomes increasingly similar to a normal Z distribution in shape. Therefore, as N gets large, the standard error of the t distribution approaches that of a normal Z distribution. This means that for N = 100, the probabilities associated with outcomes in the two distributions are virtually identical.

  4. Degrees of freedom: the number of values free to vary when estimating a parameter; for one-sample tests involving the mean, df = N - 1.
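
A sketch of a one-sample t test on hypothetical data, computing t = (M - μ_H0) / s_M by hand with df = N - 1 and checking it against scipy:

```python
import numpy as np
from scipy import stats

sample = np.array([53.0, 47.5, 56.0, 51.0, 49.5, 58.0, 52.5, 54.0])
mu_h0 = 50.0
df = len(sample) - 1                     # degrees of freedom = N - 1

# By hand: s_M = s / sqrt(N), then t = (M - mu_H0) / s_M.
s_m = sample.std(ddof=1) / np.sqrt(len(sample))
t_by_hand = (sample.mean() - mu_h0) / s_m

# The same test via scipy (two-tailed p-value).
t_stat, p_value = stats.ttest_1samp(sample, mu_h0)
print(f"t = {t_by_hand:.3f} (scipy: {t_stat:.3f}), df = {df}, p = {p_value:.3f}")
```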

L. One- and two-tailed hypothesis tests

  1. One-tailed tests involve directional alternative hypotheses. In these instances, we are primarily concerned with levels of confidence that are based on all of the potential error being placed in one tail of the Z or t distribution. For instance, if one is interested in testing the null hypothesis that μ = 50 against an alternative hypothesis that μ > 50, one does not really care about the potential for error that is in the left-hand tail of the distribution. If one obtained a negative Z or t, one could pretty much eliminate H1 without too much difficulty. In cases such as this, the entire α is placed on the right side of the distribution.
  2. Alternatively, when one is testing a nondirectional hypothesis, a two-tailed hypothesis test is utilized. For instance, if one is interested in testing the null hypothesis that μ = 50 against an alternative hypothesis that μ does not equal 50, one does care about the potential for error that is in both tails of the distribution. In such cases one divides α between the left and right tails. So, when α = .05, one would place 0.025 in the left tail and 0.025 in the right tail.
  3. What this means is that, for a given α level, it is more difficult to obtain statistical significance in a two-tailed test than in a one-tailed test. This is because the critical value necessary to achieve significance at a given α level will be lower for a one-tailed test than for a two-tailed test.
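
For example, with α = .05 the one-tailed critical Z is about 1.645 while the two-tailed critical Z is about 1.96; a short sketch computing both:

```python
from scipy import stats

alpha = 0.05
z_one_tailed = stats.norm.ppf(1 - alpha)      # ~1.645: all of alpha in one tail
z_two_tailed = stats.norm.ppf(1 - alpha / 2)  # ~1.960: alpha split across tails
print(f"one-tailed critical Z: {z_one_tailed:.3f}")
print(f"two-tailed critical Z: {z_two_tailed:.3f}")
```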