Sampling and Data - Lecture Slides | MATH 170, Study notes of Statistics

Material Type: Notes; Professor: Chu; Class: Elementary Statistics; Subject: Mathematics; University: Eastern Michigan University; Term: Fall 2009;

Typology: Study notes

2009/2010

Uploaded on 02/24/2010

koofers-user-mpo

Chapter 1

Sampling and Data

What Is (Are?) Statistics?

Statistics (a discipline) is the science of dealing with data. It consists of tools and methods to collect data, organize data, and interpret the information or draw conclusions from data.

Note: Statistics (plural) sometimes refers to particular calculations made from data. For instance, the mean, median, and percentage are statistics, since they are numbers calculated from a set of collected sample data.

Basic Terms

  • Population: A collection, or set, of individuals or objects or events whose properties are to be analyzed.
  • Sample: A subset of the population.
  • Parameter: A numerical value summarizing all the data of an entire population, for instance, a population mean.
  • Statistic: A numerical value summarizing the sample data, for instance, a sample mean.
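The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below (commute times in minutes) is hypothetical and generated only for illustration; the names `population` and `sample` are assumptions of this sketch, not part of the original notes.

```python
import random
import statistics

# Hypothetical population: commute times (minutes) for 1,000 people.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.randint(5, 90) for _ in range(1000)]

# Parameter: summarizes every element of the population.
population_mean = statistics.mean(population)

# Statistic: summarizes only the sample (a subset of the population).
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two means will usually be close but not equal, which is why a statistic is treated as an estimate of the corresponding parameter.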

Two Areas of Statistics

  • Descriptive Statistics: collection, presentation, and description of sample data.
  • Inferential Statistics: making decisions and drawing conclusions about populations.

What is a Variable?

• Variables are characteristics recorded about each individual or thing.

• Each variable should have a name that identifies what has been measured.

What is an Observational Unit?

The person or thing on which the variable is observed or measured, such as a student in the class, is called the observational/experimental unit or simply a case.

What Are Data?

• Data can be numbers, record names, or other labels recorded for the observational unit.

• Not all data represented by numbers are numerical data (e.g., 1 = male, 2 = female, where 1 and 2 are indicators of gender).

Data Tables

• The following data table clearly shows the context of the data presented:

• Notice that this data table tells us the variables (columns) and observational units (rows) for these data.

What is Statistics Really About?

Statistics is about variation. Different observational units may have different data values for a variable. Statistics helps us to deal with variation in order to make sense of data.

Two Kinds of Variables

  • Qualitative, or Attribute, or Categorical, Variable: A variable that identifies a category for each case, for example, gender. Note: Arithmetic operations, such as addition and averaging, are not meaningful for data resulting from a qualitative variable.
  • Quantitative, or Numerical, Variable: A variable that records measurements or amounts of something and must have measuring units, for example, height measured in inches. Note: Arithmetic operations, such as addition and averaging, are meaningful for data resulting from a quantitative variable.
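A small sketch of why arithmetic suits one kind of variable and not the other. The five records are made up for illustration, and the coding 1 = male, 2 = female follows the convention mentioned in the notes.

```python
from collections import Counter
import statistics

# Hypothetical records: gender coded as 1 = male, 2 = female; height in inches.
gender_codes = [1, 2, 2, 1, 2]
heights_in = [70.0, 64.5, 66.0, 72.5, 63.0]

# Qualitative: the codes are only labels, so count categories instead of averaging.
labels = {1: "male", 2: "female"}
gender_counts = Counter(labels[code] for code in gender_codes)

# Quantitative: averaging is meaningful and keeps the measuring unit (inches).
mean_height = statistics.mean(heights_in)

print(dict(gender_counts))
print(mean_height)
```

Averaging `gender_codes` would produce 1.6, a number that describes no category, whereas the mean height is a sensible summary in inches.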

Subdividing Variables Further

  • Qualitative and quantitative variables may be further subdivided:

Variable

  • Qualitative: Nominal, Ordinal
  • Quantitative: Discrete, Continuous

Key Definitions

  • Nominal Variable: A qualitative variable that categorizes (or describes, or names) an element of a population, for example, color of a car purchased.
  • Ordinal Variable: A qualitative variable that incorporates an ordered position, or ranking. For instance, the variable Age may be recorded as young, middle, and old, three possible categories of values.
  • Discrete Variable: A quantitative variable that can assume a countable number of values. That is, the values are the counts, for example, number of cars owned. So, a discrete variable can assume values corresponding to integer values along a number line.
  • Continuous Variable: A quantitative variable whose values are measurements, such as height or weight. The precision of the recorded values depends on the measuring scale used. A weight recorded as 120 lbs may actually be 120.1 lbs, 120.14 lbs, or 120.143 lbs if a more accurate scale is used. Therefore, a continuous variable can assume any value within an interval along a number line, including every possible value between any two values.

Methods Used to Collect Data

Data can be collected by performing an experiment, a survey, or a census.

Experiment: The investigator controls or modifies the environment and observes the effect on the variable under study.

Census: A 100% survey. Every element of the population is listed. Seldom used: difficult and time-consuming to compile, and expensive.

Survey: Data are obtained by sampling some of the population of interest. The investigator does not modify the environment.

Sampling Frame: A list of the elements belonging to the population from which the sample will be drawn.

Note: It is important that the sampling frame be representative of the population.

Sample Design: The process of selecting sample elements from the sampling frame.

Note: There are many different types of sample designs. Usually they all fit into two categories: judgment samples and probability samples.

Two Types of Sample Designs

Probability Samples: Samples in which the elements to be selected are drawn on the basis of probability. Each element in a population has a certain probability of being selected as part of the sample.

Judgment Samples: Samples that are selected on the basis of being “typical.”

  • Items are selected that are representative of the population. The validity of the results from a judgment sample reflects the soundness of the collector’s judgment.

Probability Sampling

Probability sampling includes random sampling, systematic sampling, stratified sampling, proportional sampling, and cluster sampling.

Random Sampling

Random Samples: A sample selected in such a way that every element in the population has an equal probability of being chosen. Equivalently, all samples of size n have an equal chance of being selected. Random samples are obtained either by sampling with replacement from a finite population or by sampling without replacement from an infinite population.

Notes:

  • Inherent in the concept of randomness: the next result (or occurrence) is not predictable.
  • Proper procedure for selecting a random sample: use a random number generator or a table of random numbers.

Example

An employer is interested in the time it takes each employee to commute to work each morning. A random sample of 35 employees will be selected and their commuting times will be recorded.

  1. There are 2712 employees
  2. Each employee is numbered: 0001, 0002, 0003, etc., up to 2712
  3. Using four-digit random numbers, a sample is identified: 1315, 0987, 1125, etc.
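The selection in this example could be carried out with a random number generator instead of a printed table. This is a minimal sketch: the seed value is arbitrary, and drawing without repeats (so no employee is picked twice) is an assumption of the sketch, not something the example specifies.

```python
import random

N_EMPLOYEES = 2712  # number of employees in the example
SAMPLE_SIZE = 35    # sample size in the example

rng = random.Random(2009)  # arbitrary seed, fixed only for repeatability

# Employee numbers 1..2712; random.sample never returns the same number twice.
employee_numbers = range(1, N_EMPLOYEES + 1)
chosen = rng.sample(employee_numbers, SAMPLE_SIZE)

# Show the identifiers as four-digit labels such as 0987.
labels = sorted(f"{n:04d}" for n in chosen)
print(labels)
```

Every employee number has the same chance of appearing in `chosen`, which is the defining property of a random sample.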

Systematic Sampling

Systematic Sample: A sample in which every kth item of the sampling frame is selected, starting from a first element that is randomly selected from the first k elements.

Note: The systematic technique is easy to execute. However, it has some inherent dangers when the sampling frame is repetitive or cyclical in nature. In these situations the results may not approximate a simple random sample.

Example

Suppose you want to obtain a systematic sample of 8 houses from a street of 120 houses.

  • First, since 120/8 = 15, choose a random starting point between 1 and 15. Let’s say, 11.
  • Then, choose every 15th house after the 11th house.

The houses selected are 11, 26, 41, 56, 71, 86, 101, and 116.
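The house example can be reproduced with a short function. The name `systematic_sample` and its parameters are hypothetical, chosen for this sketch; it assumes the frame size is a multiple of the sample size, as it is here (120/8 = 15).

```python
import random

def systematic_sample(frame_size, sample_size, start=None, rng=random):
    """Pick every kth position, beginning at a start chosen from 1..k."""
    k = frame_size // sample_size      # sampling interval, 15 in the example
    if start is None:
        start = rng.randint(1, k)      # random starting point among the first k
    return list(range(start, frame_size + 1, k))[:sample_size]

# Reproduce the example by fixing the starting point at house 11.
houses = systematic_sample(120, 8, start=11)
print(houses)  # [11, 26, 41, 56, 71, 86, 101, 116]
```

Calling `systematic_sample(120, 8)` without `start` picks the starting house at random, which is how the method would be used in practice.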

Stratified Sampling

Stratified Random Sample: A sample obtained by stratifying or grouping the sampling frame and then selecting a fixed number of items from each of the strata/groups by means of a simple random sampling technique.
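A minimal sketch of the definition above, assuming a hypothetical sampling frame of 60 workers tagged by shift. The function name `stratified_sample`, the shift labels, and the choice of 5 per stratum are all illustrative assumptions.

```python
import random

def stratified_sample(frame, stratum_of, per_stratum, rng):
    """Group the frame into strata, then draw a fixed number from each."""
    strata = {}
    for item in frame:
        strata.setdefault(stratum_of(item), []).append(item)
    # Simple random sample of the same size within every stratum.
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical frame: (worker id, shift); every third worker is on the night shift.
frame = [(i, "night" if i % 3 == 0 else "day") for i in range(1, 61)]
rng = random.Random(1)  # arbitrary seed for repeatability
picked = stratified_sample(frame, lambda rec: rec[1], 5, rng)

for shift, members in picked.items():
    print(shift, members)
```

Each stratum contributes exactly 5 workers even though the day shift is twice as large as the night shift.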

Proportional Sampling

Proportional Sample (or Quota Sample): A sample obtained by stratifying the sampling frame and then selecting a number of items in proportion to the size of the strata (or by quota) from each stratum by means of a simple random sampling technique.

Example

Suppose that a company has 180 staff, divided as follows:

Male, full time 90

Male, part time 18

Female, full time 9

Female, part time 63

We are asked to take a proportional sample of 40 staff, stratified according to the above categories.

  • The first step is to calculate the percentage of staff in each group:
    % male, full time = (90/180) x 100 = 0.5 x 100 = 50
    % male, part time = (18/180) x 100 = 0.1 x 100 = 10
    % female, full time = (9/180) x 100 = 0.05 x 100 = 5
    % female, part time = (63/180) x 100 = 0.35 x 100 = 35
  • This tells us that of our sample of 40, 50% should be male, full time; 10% should be male, part time; 5% should be female, full time; and 35% should be female, part time. 50% of 40 is 20, 10% of 40 is 4, 5% of 40 is 2, and 35% of 40 is 14. Therefore, we need to select 20 full-time males, 4 part-time males, 2 full-time females, and 14 part-time females.
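The proportional allocation for the staff example can be checked with a few lines of Python. The group sizes and the sample size of 40 come from the example; the variable names are assumptions of this sketch.

```python
# Group sizes from the example (180 staff in total).
strata = {
    "male, full time": 90,
    "male, part time": 18,
    "female, full time": 9,
    "female, part time": 63,
}
sample_size = 40
total = sum(strata.values())

# Each group's share of the sample equals its share of the population.
allocation = {name: round(sample_size * count / total)
              for name, count in strata.items()}

for name, n in allocation.items():
    print(f"{name}: {n}")
```

Here every share is a whole number, so the allocation sums to exactly 40. With other group sizes, rounding could leave the total one or two off and would need a manual adjustment.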

Cluster Sampling

Cluster Sample: A sample obtained by first dividing the sampling frame into clusters and then randomly selecting some of the clusters. The sample then includes either all elements or a simple random sample of the elements in each selected cluster.

Note: The difference between stratified and cluster sampling: all strata are represented in the sample, but only a subset of the clusters is.
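Both variants of cluster sampling can be sketched as follows. The city blocks and households are hypothetical, and `cluster_sample` is an illustrative name for this sketch rather than a library function.

```python
import random

def cluster_sample(clusters, n_clusters, rng, per_cluster=None):
    """Randomly pick clusters, then keep all or a random part of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample[name] = list(members)                      # every element
        else:
            sample[name] = rng.sample(members, per_cluster)   # random subset
    return sample

# Hypothetical frame: 6 city blocks with 10 households each.
clusters = {f"block {b}": [f"b{b}-h{h}" for h in range(1, 11)]
            for b in range(1, 7)}
rng = random.Random(7)  # arbitrary seed for repeatability

whole = cluster_sample(clusters, 2, rng)                   # all households in 2 blocks
partial = cluster_sample(clusters, 2, rng, per_cluster=3)  # 3 households per block
print(sorted(whole), sorted(partial))
```

Only 2 of the 6 blocks appear in either result, in contrast to stratified sampling, where every group contributes to the sample.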