

This guide shows how to calculate the five-number summary, interquartile range (IQR), and outliers of a data set, using MLB team payrolls (in millions) as the example. It gives the steps to determine the minimum, first quartile, median, third quartile, and maximum values, as well as the IQR and the outlier boundaries.
Typology: Study Guides, Projects, Research
order from smallest to largest –
boundaries. These boundaries are then used to determine whether a data set has any actual outliers.
in a data set. Data values that are smaller than the lower outlier boundary or larger than the upper outlier
boundary are outliers. Some data sets do not have any outliers. Outliers that are determined to be the
result of an error should be removed from the data set.
For the MLB team payrolls (in millions) listed below, find a) the 5 number summary, b) the IQR, c) the upper and lower outlier boundaries, and d) any outliers. Note – data should be sorted from lowest to highest if it is not already provided that way. Sorting makes it easy to identify the min, max, median, and individual data positions within the set.
MLB Team Payrolls (in millions)
 #  Team       Payroll     #  Team       Payroll     #  Team       Payroll     #  Team       Payroll
 1  Padres          55     9  Rockies         78    17  Mets            93    25  Rangers        121
 2  Athletics       55    10  Indians         78    18  Twins           94    26  Tigers         132
 3  Astros          61    11  Nationals       81    19  Dodgers         95    27  Angels         154
 4  Royals          61    12  Orioles         81    20  W Sox           97    28  Red Sox        173
 5  Pirates         63    13  Mariners        82    21  Brewers         98    29  Phillies       175
 6  Rays            64    14  Reds            82    22  Cardinals      110    30  Yankees        198
 7  D Backs         74    15  Braves          83    23  Giants         118
 8  Blue Jays       75    16  Cubs            88    24  Marlins        118
1. Minimum – The smallest value in the data set.
2. First Quartile – Separates the lowest 25% of the data in a set from the highest 75%. It is typically denoted as $Q_1$. Position of $Q_1 = \frac{25}{100}(n)$, where $n$ is the number of data values.
3. Median – The middle value in a sorted (smallest to largest) data set. If there is an even number of values, it is calculated by averaging the two middle values. The Median is also referred to as the Second Quartile ($Q_2$) because it separates the lower 50% of data in a set from the upper 50%.
4. Third Quartile – Separates the lowest 75% of the data in a set from the highest 25%. It is typically denoted as $Q_3$. Position of $Q_3 = \frac{75}{100}(n)$.
5. Maximum – The largest value in the data set.
Interquartile Range: $IQR = Q_3 - Q_1$
Lower Outlier Boundary $= Q_1 - 1.5(IQR)$
Upper Outlier Boundary $= Q_3 + 1.5(IQR)$
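As a quick check of the IQR and boundary formulas, here is a minimal Python sketch. The function name `outlier_boundaries` is an illustrative choice and not part of the original handout; the quartile values 75 and 118 are the ones this guide finds for the payroll data.

```python
def outlier_boundaries(q1, q3):
    """Return (IQR, lower boundary, upper boundary) from the two quartiles."""
    iqr = q3 - q1             # interquartile range
    lower = q1 - 1.5 * iqr    # data values below this are outliers
    upper = q3 + 1.5 * iqr    # data values above this are outliers
    return iqr, lower, upper


# Quartiles of the MLB payroll data (in millions)
iqr, lower, upper = outlier_boundaries(75, 118)
print(iqr, lower, upper)  # 43 10.5 182.5
```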
a) 5 Number Summary – These values can be calculated by hand (shown below) OR they can be found
using the “1-Var Stats” button from the Stat Menu on a TI-83 or TI-84 calculator.
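b) IQR – $IQR = Q_3 - Q_1 = 118 - 75 = 43$
c) Upper and Lower Outlier Boundaries –
Lower Outlier Boundary $= Q_1 - 1.5(IQR) = 75 - 1.5(43) = 10.5$
Upper Outlier Boundary $= Q_3 + 1.5(IQR) = 118 + 1.5(43) = 182.5$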
d) Outliers –
Lower Outliers: None (there are no individual data points smaller than the lower boundary of 10.5).
Upper Outliers: 198 (Yankees) (this data value is larger than the upper boundary of 182.5).
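The outlier check in part d) can be sketched in Python as follows. This is only an illustration: the dictionary and variable names are hypothetical, the payroll figures are copied from the table in this guide, and the boundaries 10.5 and 182.5 are the values from part c).

```python
# MLB team payrolls (in millions), copied from the table in this guide
payrolls = {
    "Padres": 55, "Athletics": 55, "Astros": 61, "Royals": 61,
    "Pirates": 63, "Rays": 64, "D Backs": 74, "Blue Jays": 75,
    "Rockies": 78, "Indians": 78, "Nationals": 81, "Orioles": 81,
    "Mariners": 82, "Reds": 82, "Braves": 83, "Cubs": 88,
    "Mets": 93, "Twins": 94, "Dodgers": 95, "W Sox": 97,
    "Brewers": 98, "Cardinals": 110, "Giants": 118, "Marlins": 118,
    "Rangers": 121, "Tigers": 132, "Angels": 154, "Red Sox": 173,
    "Phillies": 175, "Yankees": 198,
}

LOWER_BOUNDARY = 10.5   # Q1 - 1.5(IQR)
UPPER_BOUNDARY = 182.5  # Q3 + 1.5(IQR)

# A value is an outlier only if it falls strictly outside a boundary
lower_outliers = {t: p for t, p in payrolls.items() if p < LOWER_BOUNDARY}
upper_outliers = {t: p for t, p in payrolls.items() if p > UPPER_BOUNDARY}

print(lower_outliers)  # {}
print(upper_outliers)  # {'Yankees': 198}
```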
whether the data set is symmetric or skewed.
[Figure: boxplot of MLB Team Payrolls (in millions); horizontal axis from 55 to 195, outlier marked with an “x”]
Minimum = 55
Position of $Q_1 = \frac{25}{100}(30) = 7.5$, so use the 8th position: $Q_1 = 75$ (represents the 25th percentile)
Median $= \frac{83 + 88}{2} = 85.5$ (average of the 2 middle data points in the set)
Position of $Q_3 = \frac{75}{100}(30) = 22.5$, so use the 23rd position: $Q_3 = 118$ (represents the 75th percentile)
Maximum = 198
Boxplot notes:
The box is marked at $Q_1$, the Median, and $Q_3$.
Draw the left whisker out to the smallest data value that is larger than the lower boundary.
Draw the right whisker out to the largest data value that is smaller than the upper boundary.
Mark outliers with an “x”.
This data set is skewed RIGHT.
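The whisker rule can be checked with a short Python sketch. The variable names are illustrative, the payroll values come from the table in this guide, and the boundaries 10.5 and 182.5 are the ones calculated in part c).

```python
# MLB team payrolls (in millions), sorted from lowest to highest
payrolls = [55, 55, 61, 61, 63, 64, 74, 75, 78, 78,
            81, 81, 82, 82, 83, 88, 93, 94, 95, 97,
            98, 110, 118, 118, 121, 132, 154, 173, 175, 198]

lower_boundary, upper_boundary = 10.5, 182.5

# Keep only the values that lie strictly between the two boundaries
inside = [p for p in payrolls if lower_boundary < p < upper_boundary]

left_whisker = min(inside)    # smallest value larger than the lower boundary
right_whisker = max(inside)   # largest value smaller than the upper boundary
outliers = [p for p in payrolls
            if not lower_boundary < p < upper_boundary]  # marked with an "x"

print(left_whisker, right_whisker, outliers)  # 55 175 [198]
```

With $Q_1 = 75$ and $Q_3 = 118$, the left whisker spans 20 (from 75 down to 55) while the right whisker spans 57 (from 118 up to 175), which is consistent with the data set being skewed right.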
If the “position” calculation results in a decimal, round up to the next whole number to determine the position. If the calculation results in a whole number, average that position’s data value with the next data value.
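The position rule can be written as a small Python sketch. The function name `value_at_percentile` is hypothetical, and the sketch assumes a whole-number percent between 1 and 99 and a list that is already sorted. It uses integer arithmetic so that "whole number or decimal" can be tested exactly.

```python
def value_at_percentile(sorted_values, percent):
    """Apply the guide's position rule to an already sorted list."""
    n = len(sorted_values)
    # position = (percent / 100) * n, split into whole part and remainder
    whole, remainder = divmod(percent * n, 100)
    if remainder == 0:
        # Whole-number position: average that value with the next one
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Decimal position: round up, so use 1-based position whole + 1,
    # which is index `whole` in a 0-based Python list
    return sorted_values[whole]


# MLB team payrolls (in millions), sorted from lowest to highest
payrolls = [55, 55, 61, 61, 63, 64, 74, 75, 78, 78,
            81, 81, 82, 82, 83, 88, 93, 94, 95, 97,
            98, 110, 118, 118, 121, 132, 154, 173, 175, 198]

five_number_summary = (
    payrolls[0],                         # minimum
    value_at_percentile(payrolls, 25),   # Q1: position 7.5 -> 8th value
    value_at_percentile(payrolls, 50),   # median: position 15 -> average 15th and 16th
    value_at_percentile(payrolls, 75),   # Q3: position 22.5 -> 23rd value
    payrolls[-1],                        # maximum
)
print(five_number_summary)  # (55, 75, 85.5, 118, 198)
```

Other tools (for example a calculator's built-in statistics functions or a library quantile routine) may use a different interpolation rule and can return slightly different quartiles for the same data.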