











This AP Statistics study guide covers a wide range of topics, including the difference between population and sample, sampling error and bias, types of sampling bias, response bias, hypothesis testing, probability distributions, regression analysis, and more. It offers detailed explanations and answers to help students understand the key concepts and prepare for the AP Statistics exam, including essential topics such as the central limit theorem, confidence intervals, and the interpretation of statistical results.
Typology: Exams
Difference between Population and Sample - ANS The population is the entire group of individuals you want information about; measuring every individual in it is a census. A sample is the subset of individuals in the population that you actually collect data from.
What do Samples allow you to do? - ANS They allow you to draw conclusions about the whole population from the sample, without measuring every individual.
Bias - ANS Anything that causes a sample to be systematically unrepresentative of the population of interest.
Types of Sampling Bias - ANS - Under-coverage Bias
Multistage Sample - ANS Taking the sample in stages, using smaller sampling units at each stage.
Stratified Random Sample - ANS A method of sampling from a population that can be partitioned into sub-populations (strata): a simple random sample is taken within each stratum and the results are combined.
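As a rough illustration of the stratified random sample card above, the sketch below groups a made-up population into strata and draws a simple random sample from each one. The student list, the grade-level strata, and the function name `stratified_sample` are all hypothetical and not part of the original guide.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of n_per_stratum individuals from each stratum."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    strata = {}
    for individual in population:
        # Partition the population into sub-populations (strata).
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for name in sorted(strata):
        # Sample without replacement inside each stratum, then combine.
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical population: 60 students, 15 in each of grades 9 to 12.
students = [(f"student{i}", 9 + i % 4) for i in range(60)]
chosen = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=3)
print(chosen)
```

Every grade level is guaranteed to appear in the sample, which a single simple random sample of 12 students would not guarantee.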
Confounding - ANS Occurs when a variable influences both the dependent (response) variable and the independent (explanatory) variable, causing a spurious association.
Experimental Units - ANS The physical entities that can be assigned at random to a treatment.
RBD ("Blocking") Advantages - ANS Flexibility: can have any number of treatments and blocks. More accurate results. Allows for calculation of unbiased error for specific treatments.
RBD ("Blocking") Disadvantages - ANS Not suitable for a large number of treatments, because a complete block then contains considerable variability, which increases error.
Matched Pairs Design - ANS Used when the experiment has only two treatment conditions and subjects can be grouped into pairs.
Matched Pairs Design Advantages - ANS Fewer participant variables, no order effects, lower risk of demand characteristics, and the same tests/materials can be used in every condition.
Matched Pairs Disadvantages - ANS Participants cannot be matched on every characteristic.
Generalizability - ANS The extent to which the results from a sample (or experimental group) can be applied to a certain population.
Boxplot - ANS Shows min, Q1, median, Q3, and max. Does not show the detailed shape of the distribution (such as peaks or gaps). Outliers are marked with an asterisk (*).
Stemplot - ANS Include a key to show what the stems and leaves represent. Do not skip stems. Read the stem first, then the leaf.
Dotplot - ANS Each data value is shown as a dot above a labeled x-axis. Easy to construct.
Histogram - ANS A graph of vertical bars representing the frequency distribution of a set of data. The x-axis shows INTERVALS; the y-axis shows FREQUENCY (the number of data points that belong in that interval). To find the median: find the total number of data points n, use (n + 1) / 2 to find the position of the median, then see which interval contains that position. That interval is your answer.
Normal Distribution - ANS A symmetrical spread of frequency data that forms a bell-shaped pattern. It is a theoretical model, because real data are never exactly normal.
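The median steps on the histogram card above can be sketched in a few lines. The interval labels and frequencies below are invented for illustration, and `median_interval` is a hypothetical helper name.

```python
def median_interval(intervals, frequencies):
    """Return the histogram interval that contains the median position."""
    n = sum(frequencies)          # total number of data points
    position = (n + 1) / 2        # position of the median in sorted order
    running = 0
    for interval, freq in zip(intervals, frequencies):
        running += freq           # cumulative count through this interval
        if running >= position:
            return interval
    raise ValueError("histogram has no data")

# Hypothetical histogram with 21 data points.
intervals = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [3, 8, 6, 4]
print(median_interval(intervals, frequencies))
```

When n is even and the two middle values fall in different intervals, this sketch returns the upper of the two; in that case the median sits on the boundary between them.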
Empirical Rule - ANS The rule gives the approximate % of observations within 1, 2, and 3 standard deviations of the mean in a normal distribution: about 68%, 95%, and 99.7%.
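The percentages usually quoted for the empirical rule (about 68%, 95%, and 99.7% within 1, 2, and 3 standard deviations) can be checked against the standard normal model. This is a small sketch using Python's standard library; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```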
Dependent events: P(A & B) = P(A) * P(B | A), where P(B | A) is a conditional probability.
Using nCr function - ANS Answering the question: "From a set of n different items, how many ways can you select r of them?" Order does not matter.
Conditional Probability - ANS You are given additional information to figure out the likelihood of an event occurring under that circumstance. P(A | B) = P(A ∩ B) / P(B)
Discrete Random Variable - ANS A random variable that takes on one of a list of possible values; typically counts. The list is finite or countable.
Continuous Random Variable - ANS A random variable that may assume any numerical value in an interval or collection of intervals. The probability of getting exactly one given outcome = 0.
Calculate Expected Value - ANS E(X) = x1 * P(x1) + x2 * P(x2) + ..., repeated for each possible value.
Calculate SD (spread) by hand - ANS Variance is standard deviation squared: Var(X) = SD^2. Var(X) = (x1 - E(X))^2 * P(x1) + (x2 - E(X))^2 * P(x2) + ..., and SD = sqrt(Var(X)).
n = when it occurs
Transforming and Combining Random Variables - ANS Adding or subtracting a constant: Center (mean): the constant is added or subtracted. Spread (SD): no change.
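The expected value, standard deviation, and add-a-constant cards above can be worked through on one small example. The probability table below is made up for illustration, and `mean_and_sd` is a hypothetical helper, not something from the original guide.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Expected value and standard deviation of a discrete random variable."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, sqrt(variance)

# Hypothetical distribution: X takes the values 0..3 with these probabilities.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

mean, sd = mean_and_sd(values, probs)

# Adding a constant (here 5) to every value shifts the mean but not the SD.
shifted_mean, shifted_sd = mean_and_sd([x + 5 for x in values], probs)

print(mean, sd)
print(shifted_mean, shifted_sd)
```

Up to floating-point rounding, the mean moves from 1.7 to 6.7 while the SD stays at 0.9.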
Shape of a Geometric Distribution - ANS Always unimodal and skewed right. The probability that the first success occurs on a given trial gets smaller as the trial number increases.
Explanatory Variable - ANS Helps to predict or explain changes in a response variable (x).
Response Variable - ANS Measures the outcome of a study (y).
Characteristics of Bivariate Data - ANS Shape (form): linear or nonlinear (curved); r assumes the form is linear. Strength: strong, moderate, weak, or none; r closer to 1 or -1 = stronger, closer to 0 = weaker. Direction: positive or negative; the sign of r matches the sign of the slope. Outliers: note any, especially if they substantially alter the equation of the regression line (line of best fit). Context: always state which two variables are being examined.
x and y correlation - ANS Correlation does not imply causation: an association between x and y does not show that x causes y.
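For the geometric distribution card in this group, a short calculation shows why the shape is always skewed right. The success probability 0.2 is an arbitrary choice for illustration, and `geometric_pmf` is a hypothetical helper name.

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k) when each trial succeeds with probability p."""
    return (1 - p) ** (k - 1) * p

# Probabilities that the first success lands on trial 1, 2, ..., 10.
probs = [geometric_pmf(k, 0.2) for k in range(1, 11)]
for k, prob in enumerate(probs, start=1):
    print(k, round(prob, 4))
```

Each probability is the previous one multiplied by (1 - p), so the tallest bar is always at trial 1 and the bars shrink to the right.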
Line of Best Fit Equation - ANS Y(hat) = mx + b
Definition of Y(hat) - ANS The predicted value of y for a given value of x.
Interpretation of Slope (m) - ANS For every 1-unit increase in x, the model predicts that y will change by m on average.
Interpretation of Y-intercept - ANS When the value of x is zero, y is predicted to be b (the y-intercept).
r^2 Value ("coefficient of determination") - ANS The proportion of the variation in y that is explained by the LSRL (least-squares regression line).
Extrapolation - ANS The use of a regression line for prediction outside the interval of values of the explanatory variable x used to obtain the line.
Residual - ANS The difference between an actual value and the value predicted by the LSRL. Written here as r (not the same as the correlation coefficient r).
Calculating a Residual - ANS Actual y value - predicted y value (Y hat) = r
Residual Plot - ANS Shows how the actual data relate to the predicted data: how far off each prediction is (the difference) and whether there is a pattern. No pattern suggests a linear model is appropriate.
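The regression cards above can be tied together with a small hand computation. This is a sketch on invented data, using the standard least-squares formulas and the usual convention residual = actual y - predicted y; the function name `least_squares_line` is hypothetical.

```python
from math import sqrt

def least_squares_line(xs, ys):
    """Return slope m, intercept b, and correlation r for the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx                 # slope
    b = y_bar - m * x_bar         # y-intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return m, b, r

# Hypothetical data set.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b, r = least_squares_line(xs, ys)
predicted = [m * x + b for x in xs]                 # y-hat for each x
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print("slope:", m, "intercept:", b)
print("r:", r, "r^2:", r ** 2)
print("residuals:", residuals)
```

Here the slope is 0.6 and the intercept is 2.2, r^2 is 0.6 (60% of the variation in y is explained by the line), and the residuals sum to zero, as they always do for a least-squares line with an intercept.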
Sampling Distribution P Hat - ANS Describes the distribution of values taken by the sample proportion P Hat in all possible samples of the same size from the population.
Central Limit Theorem - ANS (CLT) As the sample size n increases, the sampling distribution of the mean of randomly selected samples of size n approaches a normal distribution, whatever the shape of the population.
Sample Distribution - ANS A graph of data taken from one sample.
Sampling Distribution - ANS A graph of a statistic computed from many samples of the same size.
CLT Conditions - ANS Only go through the process if the problem does not say "assume all conditions are met". Sample size: The sampling distribution is normal
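The Central Limit Theorem card above can be explored with a quick simulation. This is only a sketch: the right-skewed population, the sample sizes 2 and 50, the seeds, and the helper names are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev, stdev

def sample_means(population, n, reps, seed=0):
    """Means of `reps` random samples of size n (drawn with replacement)."""
    rng = random.Random(seed)
    return [mean(rng.choices(population, k=n)) for _ in range(reps)]

def skewness(data):
    """Simple skewness measure: average cubed z-score (0 for a symmetric shape)."""
    m = mean(data)
    s = pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

# Hypothetical strongly right-skewed population.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means_n2 = sample_means(population, n=2, reps=2000)
means_n50 = sample_means(population, n=50, reps=2000)

print("skewness, n=2: ", skewness(means_n2))
print("skewness, n=50:", skewness(means_n50))
print("spread,   n=2: ", stdev(means_n2))
print("spread,   n=50:", stdev(means_n50))
```

With the larger sample size, the sample means are less skewed and less spread out, which is the behavior the theorem describes.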