Identifying Outliers: IQR Method for Calculating Fences, Study notes of Statistics

These notes describe the IQR method for identifying outliers in a dataset. They discuss how to calculate the fences for potential and extreme outliers using the interquartile range (IQR) and the first (Q1) and third (Q3) quartiles. The document also covers the significance of outliers and their potential impact on data analysis.

Typology: Study notes

2021/2022

Uploaded on 09/27/2022

avni

We will look at two ways to identify outliers, the first of which we discuss now. This method, which we will call the IQR method, is based on the five-number summary. In particular, it uses the interquartile range (IQR = Q3 − Q1) and the 1st and 3rd quartiles, Q1 and Q3.

To locate potential or suspected outliers, we need to calculate two values, sometimes called “fences.” These values are not necessarily data points; they simply mark the ends of an interval. The fences are found by going beyond Q1 and Q3 by 1.5 times the IQR: the lower fence is Q1 minus 1.5 times the IQR, and the upper fence is Q3 plus 1.5 times the IQR. Any observation falling outside the fences (more toward the extremes) is a potential outlier.
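The fence calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `iqr_fences` and `potential_outliers` and the sample data are made up for this example, and the quartiles come from the "inclusive" method of Python's `statistics.quantiles`. Textbooks and software use slightly different quartile conventions, so the fences may differ a little from a hand calculation.

```python
from statistics import quantiles


def iqr_fences(data, k=1.5):
    """Return (lower_fence, upper_fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    # Assumption: quartiles use the "inclusive" convention; other
    # conventions can give slightly different Q1 and Q3.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def potential_outliers(data, k=1.5):
    """Return the observations that fall outside the fences."""
    lower, upper = iqr_fences(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical sample, not taken from the notes.
sample = [2, 4, 5, 6, 7, 8, 9, 10, 30]

print(iqr_fences(sample))          # (-1.0, 15.0): Q1 = 5, Q3 = 9, IQR = 4
print(potential_outliers(sample))  # [30]
```

Note that the lower fence here, -1, is not a data point. It only marks the edge of the interval, which is why 2 is not flagged while 30 is.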

If an outlier was produced by essentially the same process as the rest of the data, and if such extremes are expected to occur again eventually, then the outlier contains something important and interesting about the process, and it should be kept in the data.

If the outlier was produced under different conditions from the rest of the data (or by a different process), it can be removed from the data if your goal is to investigate only the process that produced the rest of the data.

Identifying outliers is an important component of describing the distribution of one quantitative variable. Next, we will see that this process of identifying outliers is used when creating boxplots.