






















$\sum x$ represents the sum of all the data values of x.
$\sum x^2$ is the sum of the data values after squaring them.
In general, $\sum x^2 \neq \left(\sum x\right)^2$.
Measure: Mean
Description: The sum of the data values divided by the total number of values.
Statistic and Parameter: The sample mean is denoted by $\bar{x}$ and calculated using the formula $\bar{x} = \frac{\sum x}{n}$. The population mean is denoted by $\mu$ and is found with the formula $\mu = \frac{\sum x}{N}$. The mean should be rounded to one more decimal place than occurs in the raw data.
Notes and Insights: The mean is the balance point of the data. When the data is skewed, the mean is pulled in the direction of the longer tail. The mean is used in computing other statistics such as variance and standard deviation. The mean is highly affected by outliers and may not be an appropriate statistic to use when an outlier is present.

Measure: Median
Description: The middle number of the data set when the values are ordered from smallest to largest.
Statistic and Parameter: Arrange the data in order. If n is odd, the median is the middle number. If n is even, the median is the mean of the middle two numbers. We use the symbol MD for median.
Notes and Insights: The median is robust against outliers (less affected by them). The median is used when one must find the center value of a data set.

Measure: Mode
Description: The value that occurs most often in a data set.
Statistic and Parameter: This is where the “peaks” occur in a histogram.
Unimodal – when a data set has only one mode
Bimodal – when a data set has 2 modes
Multimodal – when a data set has more than 2 modes
No Mode – when no data value occurs more than once
Notes and Insights: The mode is used when the most typical case is desired.
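The contrast between the mean and the median when an outlier is present can be illustrated with a short Python sketch. It uses only the standard `statistics` module. The five fee values are the park charges from the exercise in these notes; the extra value of 200 is a hypothetical outlier added purely for illustration.

```python
from statistics import mean, median, multimode

# Daily vehicle pass charges (dollars) for five parks.
fees = [25, 15, 15, 20, 25]

# Hypothetical outlier added for illustration only (not in the notes).
fees_with_outlier = fees + [200]

print("mean:", mean(fees), "->", mean(fees_with_outlier))
print("median:", median(fees), "->", median(fees_with_outlier))

# multimode returns every value tied for most frequent,
# so a bimodal data set yields two modes.
print("modes:", multimode(fees))
```

The single outlier moves the mean from 20 to 50, while the median only moves from 20 to 22.5. The fee data has two modes (15 and 25), so it is bimodal.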
Find the mean, median, and midrange of the daily vehicle pass charge for five U.S. National Parks. The costs are $25, $15, $15, $20, and $25.

Find the mean, median, and midrange of the numbers of water-line breaks per month in the last two winter seasons for the city of Brownsville, Minnesota: 2, 3, 6, 8, 4, 1.
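One way to check answers to the two exercises above is the following minimal Python sketch. It assumes the usual definition of the midrange, (lowest value + highest value) / 2, which this excerpt does not spell out. The helper name `midrange` is ours.

```python
from statistics import mean, median

def midrange(values):
    """Midrange under the assumed definition: (lowest + highest) / 2."""
    return (min(values) + max(values)) / 2

park_fees = [25, 15, 15, 20, 25]     # dollars
water_breaks = [2, 3, 6, 8, 4, 1]    # breaks per month

for name, data in [("park fees", park_fees), ("water-line breaks", water_breaks)]:
    print(name, "mean:", mean(data),
          "median:", median(data),
          "midrange:", midrange(data))
```

For the park fees all three measures equal 20. For the water-line breaks the mean is 4, the median is 3.5, and the midrange is 4.5.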
$\bar{x}$: 3.613524, 6.2846, 18.46
Notice how the statistics compare to each other for each variable, e.g., mean, median and midrange are all close to each other for the room variable. Why? Why is this not the case for the other variables?
Ways to Measure Spread:

Measure: Range
Description: The difference between the largest and smallest observations.
Denoted by R. R = high value – low value

Measure: Variance
Description: The average of the squares of the distance each value is from the mean.
Sample: The sample variance is an estimate of the population variance calculated from a sample. It is denoted by $s^2$. The formula to calculate the sample variance is
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
or
$s^2 = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}$
Population: Population variance is denoted by $\sigma^2$. It is commonly used in statistics because it has nice theoretical properties. The formula for the population variance is
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Notes: In practice, we don’t know the population values or parameters, so we cannot calculate $\sigma^2$ or $\sigma$. We end up calculating the variance and standard deviation of a sample. Be careful to notice the difference between n − 1 (sample) and N (population) in the denominator.

Measure: Standard deviation
Description: The “typical” deviation from the sample mean.
Sample: The square root of the sample variance. It is denoted by s. The formula to calculate the sample standard deviation is
$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
or
$s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$
Population: The square root of the population variance. The symbol for the population standard deviation is $\sigma$. The formula for the population standard deviation is
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Notes: The greater the spread of the data, the larger the value of s. s = 0 only when all observations take the same value. s can be influenced by outliers because outliers influence the mean and because outliers have large deviations from the mean.
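The two sample-variance formulas give the same result, which can be checked with a short Python sketch. The function names are ours, and the data are the water-line break counts from the earlier exercise. Treating the same six values as a whole population in the last function is for illustration only.

```python
import math

def sample_variance_definition(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Shortcut form: (n * sum of x^2 - (sum of x)^2) / (n * (n - 1))."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

def population_variance(xs):
    """Same squared deviations, but divided by N instead of n - 1."""
    N = len(xs)
    mu = sum(xs) / N
    return sum((x - mu) ** 2 for x in xs) / N

breaks = [2, 3, 6, 8, 4, 1]
s2 = sample_variance_definition(breaks)
s = math.sqrt(s2)  # the sample standard deviation is the square root of s^2

print("s^2 (definition):", s2)
print("s^2 (shortcut):  ", sample_variance_shortcut(breaks))
print("s:", round(s, 2))
print("variance with N in the denominator:", round(population_variance(breaks), 3))
```

Here the mean is 4 and the squared deviations sum to 34, so $s^2 = 34/5 = 6.8$ under both formulas. Dividing by N = 6 instead gives a smaller value, which is why the denominator matters.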
Steps for Calculating Sample Variance and Standard Deviation
And to check your answer, John’s statistics are:
$\bar{x} = 185$, $s^2 = 13.6$, $s = 3.$…
Uses of the Variance and Standard Deviation
Example: Mothers’ Heights An article in 1903 published the heights of 1052 mothers. The sample mean was 62.484 inches and the standard deviation was 2.390 inches. Note the summary table below regarding the actual percentages and the empirical rule.
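The intervals that the empirical rule refers to can be computed directly from the reported mean and standard deviation. The sketch below is a minimal Python illustration. It assumes the usual statement of the empirical rule for bell-shaped data (about 68%, 95%, and 99.7% of values within 1, 2, and 3 standard deviations of the mean). It does not reproduce the actual percentages from the article, which are not available in this excerpt.

```python
mean_height = 62.484   # inches, sample mean from the example
sd_height = 2.390      # inches, sample standard deviation from the example

# Approximate percentages stated by the empirical rule (assumed standard form).
empirical_rule = {1: 68.0, 2: 95.0, 3: 99.7}

def interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)

for k, pct in empirical_rule.items():
    low, high = interval(mean_height, sd_height, k)
    print(f"within {k} SD: {low:.3f} to {high:.3f} inches (about {pct}% expected)")
```

For example, the interval within two standard deviations runs from 57.704 to 67.264 inches.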
Section 3-3: Measures of Position (we need some of this section for use in Section 3-4)
Quartiles – values that divide the distribution into four groups, separated by Q1, Q2 (median), and Q3.
Q1 is the 25th percentile. Q2 is the 50th percentile (the median). Q3 is the 75th percentile.
Interquartile Range (IQR) – the difference between Q1 and Q3. This is the range of the middle 50% of the data.
IQR = Q3 − Q1
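A minimal Python sketch of the quartiles and the IQR follows. Textbooks and software use several slightly different quartile conventions, and this excerpt does not say which one the course uses. The sketch assumes the "median of each half" convention: Q2 is the median, Q1 is the median of the values below it, and Q3 is the median of the values above it (the median itself is left out of both halves when n is odd). The function name `quartiles` is ours.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return median(lower), median(xs), median(upper)

breaks = [2, 3, 6, 8, 4, 1]   # water-line breaks from the earlier exercise
q1, q2, q3 = quartiles(breaks)
iqr = q3 - q1                 # IQR = Q3 - Q1

print("Q1:", q1, "Q2:", q2, "Q3:", q3, "IQR:", iqr)
```

With the sorted data 1, 2, 3, 4, 6, 8 this gives Q1 = 2, Q2 = 3.5, Q3 = 6, and IQR = 4. Other conventions (for example, interpolation-based percentiles) can give slightly different quartiles for the same data.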