Statistics and Probability: Measures of Central Tendency and Dispersion, Lecture notes of Mathematical Statistics

WHAT IS STATISTICS?
• That science which enables us to draw conclusions about various phenomena on the basis of real data collected on a sample basis
• A tool for data-based research
• Also known as Quantitative Analysis
• Has many applications in a wide variety of disciplines: Agriculture, Anthropology, Astronomy, Biology, Economics, Engineering, Environment, Geology, Genetics, Medicine, Physics, Psychology, Sociology, Zoology, ... Virtually every single subject from Anthropology to Zoology ...

Typology: Lecture notes

2009/2010

Available from 01/03/2022

khurramrafiq21
STA301 Statistics and Probability
Virtual University of Pakistan

TABLE OF CONTENTS

LECTURE NO. 1
Definition of Statistics
Observation and Variable
Types of Variables
Measurement Scales
Error of Measurement

LECTURE NO. 2
Data collection
Sampling

LECTURE NO. 3
Types of Data
Tabulation and Presentation of Data
Frequency distribution of Discrete variable

LECTURE NO. 4
Frequency distribution of continuous variable

LECTURE NO. 5
Types of frequency Curves
Cumulative frequency Distribution

LECTURE NO. 6
Stem and Leaf
Introduction to Measures of Central Tendency
Mode

LECTURE NO. 7
Arithmetic Mean
Weighted Mean
Median in case of ungrouped Data

LECTURE NO. 8
Median in case of grouped Data
Median in case of an open-ended frequency distribution
Empirical relation between the mean, median and the mode
Quantiles (quartiles, deciles & percentiles)
Graphic location of Quantiles

LECTURE NO. 9
Geometric mean
Harmonic mean
Relation between the arithmetic, geometric and harmonic means
Some other measures of central tendency

LECTURE NO. 10
Concept of dispersion
Absolute and relative measures of dispersion
Range
Coefficient of dispersion
Quartile deviation
Coefficient of quartile deviation

LECTURE NO. 11
Mean Deviation
Standard Deviation and Variance
Coefficient of variation

LECTURE NO. 12
Chebychev’s Inequality
The Empirical Rule
The Five-Number Summary

LECTURE NO. 13

[...]

Chebychev’s Inequality
Concept of Continuous Probability Distribution
Mathematical Expectation, Variance & Moments of a Continuous Probability Distribution

LECTURE NO. 25
Mathematical Expectation, Variance & Moments of a Continuous Probability Distribution
Bivariate Probability Distribution

LECTURE NO. 26
Bivariate Probability Distributions (Discrete and Continuous)
Properties of Expected Values in the case of Bivariate Probability Distributions

LECTURE NO. 27
Properties of Expected Values in the case of Bivariate Probability Distributions
Covariance & Correlation
Some Well-known Discrete Probability Distributions:
Discrete Uniform Distribution
An Introduction to the Binomial Distribution

LECTURE NO. 28
Binomial Distribution
Fitting a Binomial Distribution to Real Data
An Introduction to the Hypergeometric Distribution

LECTURE NO. 29
Hypergeometric Distribution
Poisson distribution and limiting approximation to Binomial
Poisson Process
Continuous Uniform Distribution

LECTURE NO. 30
Normal Distribution
The Standard Normal Distribution
Normal Approximation to the Binomial Distribution

LECTURE NO. 31
Sampling Distribution of X̄
Mean and Standard Deviation of the Sampling Distribution of X̄
Central Limit Theorem

LECTURE NO. 32
Sampling Distribution of p̂
Sampling Distribution of X̄1 − X̄2

LECTURE NO. 33
Point Estimation
Desirable Qualities of a Good Point Estimator

LECTURE NO. 34
Methods of Point Estimation
Interval Estimation

LECTURE NO. 35
Confidence Interval for μ
Confidence Interval for μ1 − μ2

LECTURE NO. 36
Large Sample Confidence Intervals for p and p1 − p2
Determination of Sample Size (with reference to Interval Estimation)
Hypothesis-Testing (An Introduction)

LECTURE NO. 37
Hypothesis-Testing (continuation of basic concepts)
Hypothesis-Testing regarding μ (based on Z-statistic)

LECTURE NO. 38
Hypothesis-Testing regarding μ1 − μ2 (based on Z-statistic)
Hypothesis Testing regarding p (based on Z-statistic)

LECTURE NO. 39
Hypothesis Testing regarding p1 − p2 (based on Z-statistic)
The Student’s t-distribution
Confidence Interval for μ based on the t-distribution

LECTURE NO. 40
Tests and Confidence Intervals based on the t-distribution
t-distribution in case of paired observations

LECTURE NO. 41
Hypothesis-Testing regarding Two Population Means in the Case of Paired Observations (t-distribution)
The Chi-square Distribution
Hypothesis Testing and Interval Estimation Regarding a Population Variance (based on Chi-square Distribution)

LECTURE NO. 42
The F-Distribution
Hypothesis Testing and Interval Estimation in order to compare the Variances of Two Normal Populations (based on F-Distribution)

LECTURE NO. 43
Analysis of Variance
Experimental Design

LECTURE NO. 44
Randomized Complete Block Design
The Least Significant Difference (LSD) Test
Chi-Square Test of Goodness of Fit

LECTURE NO. 45
Chi-Square Test of Independence
The Concept of Degrees of Freedom
P-value
Relationship between Confidence Interval and Tests of Hypothesis
An Overview of the Science of Statistics in Today’s World (including Latest

uncertainty. It should of course be borne in mind that uncertainty does not imply ignorance; it refers to the incompleteness and the instability of the data available. In this sense, the word statistics is used in the singular. As it embodies more or less all stages of the general process of learning, sometimes called the scientific method, statistics is characterized as a science. Thus the word statistics, used in the plural, refers to a set of numerical information and, in the singular, denotes the science of basing decisions on numerical data. It should be noted that statistics as a subject is mathematical in character.

Thirdly, the word statistics refers to numerical quantities calculated from sample observations; a single quantity that has been so calculated is called a statistic. The mean of a sample, for instance, is a statistic. The word statistics is plural when used in this sense.

CHARACTERISTICS OF THE SCIENCE OF STATISTICS

Statistics is a discipline in its own right. It would therefore be desirable to know the characteristic features of statistics in order to appreciate and understand its general nature. Some of its important characteristics are given below:

• Statistics deals with the behaviour of aggregates or large groups of data. It has nothing to do with what is happening to a particular individual or object of the aggregate.

• Statistics deals with aggregates of observations of the same kind rather than isolated figures.

• Statistics deals with variability that obscures underlying patterns. No two objects in this universe are exactly alike. If they were, there would have been no statistical problem.

• Statistics deals with uncertainties, as every process of getting observations, whether controlled or uncontrolled, involves deficiencies or chance variation. That is why we have to talk in terms of probability.

• Statistics deals with those characteristics or aspects of things which can be described numerically, either by counts or by measurements.

• Statistics deals with those aggregates which are subject to a number of random causes, e.g. the heights of persons are subject to a number of causes such as race, ancestry, age, diet, habits, climate and so forth.

• Statistical laws are valid on the average or in the long run. There is no guarantee that a certain law will hold in all cases. Statistical inference is therefore made in the face of uncertainty.

• Statistical results might be misleading or incorrect if sufficient care in collecting, processing and interpreting the data is not exercised, or if the statistical data are handled by a person who is not well versed in the subject matter of statistics.

THE WAY IN WHICH STATISTICS WORKS

As it is such an important area of knowledge, it is definitely useful to have a fairly good idea about the way in which it works, and this is exactly the purpose of this introductory course. The following points indicate some of the main functions of this science:

• Statistics assists in summarizing large sets of data in a form that is easily understandable.

• Statistics assists in the efficient design of laboratory and field experiments as well as surveys.

• Statistics assists in sound and effective planning in any field of inquiry.

• Statistics assists in drawing general conclusions and in making predictions of how much of a thing will happen under given conditions.

IMPORTANCE OF STATISTICS IN VARIOUS FIELDS

As stated earlier, Statistics is a discipline that finds application in the most diverse fields of activity. It is perhaps a subject that should be used by everybody. Statistical techniques, being powerful tools for analyzing numerical data, are used in almost every branch of learning. In all areas, statistical techniques are being increasingly used, and are developing very rapidly.

• A modern administrator, whether in the public or the private sector, leans on statistical data to provide a factual basis for decisions.

• A politician uses statistics advantageously to lend support and credence to his arguments while elucidating the problems he handles.

• A businessman, an industrialist and a research worker all employ statistical methods in their work. Banks, insurance companies and governments all have their statistics departments.

• A social scientist uses statistical methods in various areas of the socio-economic life of a nation. It is sometimes said that “a social scientist without an adequate understanding of statistics is often like the blind man groping in a dark room for a black cat that is not there”.

THE MEANING OF DATA

The word “data” appears in many contexts and frequently is used in ordinary conversation. Although the word carries something of an aura of scientific mystique, its meaning is quite simple and mundane. It is Latin for “those that are given” (the singular form is “datum”). Data may therefore be thought of as the results of observation.

EXAMPLES OF DATA

Data are collected in many aspects of everyday life.

• Statements given to a police officer or physician or psychologist during an interview are data.

• So are the correct and incorrect answers given by a student on a final examination.

• Almost any athletic event produces data:
    • the time required by a runner to complete a marathon,
    • the number of errors committed by a baseball team in nine innings of play.

• And, of course, data are obtained in the course of scientific inquiry:
    • the positions of artifacts and fossils in an archaeological site,
    • the number of interactions between two members of an animal colony during a period of observation,
    • the spectral composition of light emitted by a star.

OBSERVATIONS AND VARIABLES

In statistics, an observation often means any sort of numerical recording of information, whether it is a physical measurement such as height or weight, a classification such as heads or tails, or an answer to a question such as yes or no.

VARIABLES

A characteristic that varies with an individual or an object is called a variable. For example, age is a variable as it varies from person to person. A variable can assume a number of values. The given set of all possible values from which the variable takes on a value is called its Domain. If for a given problem, the domain of a variable contains only one value, then the variable is referred to as a constant.

QUANTITATIVE AND QUALITATIVE VARIABLES

Variables may be classified into quantitative and qualitative according to the form of the characteristic of interest. A variable is called a quantitative variable when the characteristic can be expressed numerically, such as age, weight, income or number of children. On the other hand, if the characteristic is non-numerical, such as education, sex, eye-colour, quality, intelligence, poverty, satisfaction, etc., the variable is referred to as a qualitative variable. A qualitative characteristic is also called an attribute. An individual or an object with such a characteristic can be counted or enumerated after having been assigned to one of several mutually exclusive classes or categories.

BIASED AND RANDOM ERRORS

An error is said to be biased when the observed value is consistently and constantly higher or lower than the true value. Biased errors arise from the personal limitations of the observer, the imperfection in the instruments used or some other conditions which control the measurements. These errors are not revealed by repeating the measurements. They are cumulative in nature, that is, the greater the number of measurements, the greater would be the magnitude of the error. They are thus more troublesome. These errors are also called cumulative or systematic errors.

An error, on the other hand, is said to be unbiased when the deviations, i.e. the excesses and defects, from the true value tend to occur equally often. Unbiased errors are revealed when measurements are repeated, and they tend to cancel out in the long run. These errors are therefore compensating and are also known as random errors or accidental errors.
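The contrast between the two kinds of error can be illustrated with a small simulation. This is only a sketch: the true value, the size of the bias and the spread of the random noise are hypothetical numbers chosen for illustration, not taken from the lecture.

```python
import random
import statistics

random.seed(301)  # fixed seed so that the run is repeatable

TRUE_VALUE = 50.0  # hypothetical true length, in cm
BIAS = 0.4         # hypothetical constant offset of a faulty instrument


def measure(n, bias):
    """Return n simulated readings: true value + bias + random noise."""
    return [TRUE_VALUE + bias + random.gauss(0.0, 0.5) for _ in range(n)]


unbiased = measure(10_000, bias=0.0)
biased = measure(10_000, bias=BIAS)

# Random errors tend to cancel out: the mean error is close to zero.
mean_random_error = statistics.mean(unbiased) - TRUE_VALUE
# A biased error survives averaging: the mean error stays close to BIAS.
mean_biased_error = statistics.mean(biased) - TRUE_VALUE
# Biased errors accumulate: the total error grows with the number of readings.
total_biased_error = sum(x - TRUE_VALUE for x in biased)

print(f"mean error, random only : {mean_random_error:+.3f}")
print(f"mean error, with bias   : {mean_biased_error:+.3f}")
print(f"total error, with bias  : {total_biased_error:+.1f}")
```

Repeating the measurements shrinks the average random error but leaves the biased error untouched, which is why repetition alone does not reveal a systematic error.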

LECTURE NO. 2

• Steps involved in a Statistical Research-Project
• Collection of Data:
    • Primary Data
    • Secondary Data
• Sampling:
    • Concept of Sampling
    • Non-Random Versus Random Sampling
    • Simple Random Sampling
    • Other Types of Random Sampling

STEPS INVOLVED IN ANY STATISTICAL RESEARCH

• Topic and significance of the study
• Objective of your study
• Methodology for data-collection
    • Source of your data
    • Sampling methodology
    • Instrument for collecting data

As far as the objectives of your research are concerned, they should be stated in such a way that you are absolutely clear about the goal of your study: EXACTLY WHAT IS IT THAT YOU ARE TRYING TO FIND OUT? As far as the methodology for DATA-COLLECTION is concerned, you need to consider:

• Source of your data (the statistical population)
• Sampling methodology
• Instrument for collecting data

COLLECTION OF DATA

The most important part of statistical work is perhaps the collection of data. Statistical data are collected either by a COMPLETE enumeration of the whole field, called a CENSUS, which in many cases would be too costly and too time-consuming as it requires a large number of enumerators and supervisory staff, or by a PARTIAL enumeration associated with a SAMPLE, which saves much time and money.

PRIMARY AND SECONDARY DATA

Data that have been originally collected (raw data) and have not undergone any sort of statistical treatment, are called PRIMARY data. Data that have undergone any sort of treatment by statistical methods at least ONCE, i.e. the data that have been collected, classified, tabulated or presented in some form for a certain purpose, are called SECONDARY data.

COLLECTION OF PRIMARY DATA

One or more of the following methods are employed to collect primary data:

• Direct Personal Investigation
• Indirect Investigation
• Collection through Questionnaires
• Collection through Enumerators
• Collection through Local Sources

DIRECT PERSONAL INVESTIGATION

In this method, an investigator collects the information personally from the individuals concerned. Since he interviews the informants himself, the information collected is generally considered quite accurate and complete. This method may prove very costly and time-consuming when the area to be covered is vast. However, it is useful for laboratory experiments or localized inquiries. Errors are likely to enter the results due to personal bias of the investigator.

INDIRECT INVESTIGATION

Sometimes the direct sources do not exist or the informants hesitate to respond for some reason or other. In such a case, third parties or witnesses having information are interviewed. Moreover, due allowance is to be made for the personal bias. This method is useful when the information desired is complex or there is reluctance or indifference on the part of the informants. It can be adopted for extensive inquiries.

For example:

1) All the possible outcomes from the throw of a die: however long we throw the die and record the results, we could always continue to do so for a still longer period. This is a theoretical concept, one which has no existence in reality.

2) The number of ways in which a football team of 11 players can be selected from the 16 possible members named by the Club Manager.

We also need to differentiate between the sampled population and the target population. The sampled population is that from which a sample is chosen, whereas the population about which information is sought is called the target population. Thus our population will consist of the total number of students in all the colleges in the Punjab. Suppose that, on account of shortage of resources or of time, we are able to conduct such a survey on only 5 colleges scattered throughout the province. In this case, the students of all the colleges will constitute the target population, whereas the students of those 5 colleges from which the sample of students will be selected will constitute the sampled population.

From the above discussion regarding the population, you must have realized how important it is to have a very well-defined population. The next question is: how will we draw a sample from our population? The answer is that, in order to draw a random sample from a finite population, the first thing that we need is the complete list of all the elements in our population. This list is technically called the FRAME.
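Returning to the second example above, the number of possible teams can be checked directly. The short sketch below assumes that the order in which players are picked does not matter, so that the count is the number of combinations of 11 out of 16.

```python
import math

# Number of distinct teams of 11 players that can be chosen from 16
# named players, when the order of selection does not matter.
n_teams = math.comb(16, 11)
print(n_teams)  # 4368
```

Unlike the outcomes of repeated throws of a die, this population is finite, so a complete list of its elements could in principle be written down.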

SAMPLING FRAME

A sampling frame is a complete list of all the elements in the population. For example:

• The complete list of the BCS students of Virtual University of Pakistan on February 15, 2003

Speaking of the sampling frame, it must be kept in mind that, as far as possible, our frame should be free from various types of defects. A good frame:

• does not contain inaccurate elements,
• is not incomplete,
• is free from duplication, and
• is not out of date.

Next, let’s talk about the sample that we are going to draw from this population. As you all know, a sample is only a part of a statistical population, and hence it can represent the population only to some extent. Of course, it is intuitively logical that the larger the sample, the more likely it is to represent the population. Obviously, the limiting case is that, when the sample size tends to the population size, the sample will tend to be identical to the population. But, of course, in general, the sample is much smaller than the population.

The point is that, in general, statistical sampling seeks to determine how accurate a description of the population the sample and its properties will provide. We may have to compromise on accuracy, but sampling has certain advantages because of which it has an extremely important place in data-based research studies.
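The remark that a larger sample is more likely to represent the population can be illustrated with a small simulation. This is a sketch under stated assumptions: the population of 5000 values, the sample sizes and the number of repetitions are all hypothetical and are not part of the lecture.

```python
import random
import statistics

random.seed(2003)  # fixed seed so that the run is repeatable

# Hypothetical finite population of 5000 values (not from the lecture).
population = [random.gauss(160.0, 10.0) for _ in range(5000)]
mu = statistics.mean(population)


def mean_abs_error(n, repeats=200):
    """Average |sample mean - population mean| over repeated random samples."""
    errors = []
    for _ in range(repeats):
        sample = random.sample(population, n)  # drawn without replacement
        errors.append(abs(statistics.mean(sample) - mu))
    return statistics.mean(errors)


results = {n: mean_abs_error(n) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"n = {n:>4}: average absolute error = {err:.3f}")

# Limiting case: a "sample" containing every unit reproduces the population.
census = random.sample(population, len(population))
census_error = abs(statistics.mean(census) - mu)
print(f"n = {len(population)}: error = {census_error:.3f}")
```

The average error shrinks as the sample size grows and vanishes when the sample coincides with the population, which is the limiting case described above.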

ADVANTAGES OF SAMPLING

1. Savings in time and money.
    • Although the cost per unit in a sample is greater than in a complete investigation, the total cost will be less (because the sample will be so much smaller than the statistical population from which it has been drawn).
    • A sample survey can be completed faster than a full investigation, so that variations from sample unit to sample unit over time will largely be eliminated.
    • Also, the results can be processed and analyzed with increased speed and precision because there are fewer of them.
2. More detailed information may be obtained from each sample unit.
3. Possibility of follow-up: after detailed checking, queries and omissions can be followed up, a procedure which might prove impossible in a complete survey.
4. Sampling is the only feasible possibility where tests to destruction are undertaken or where the population is effectively infinite.

The next two important concepts that need to be considered are those of sampling and non-sampling errors.

SAMPLING & NON-SAMPLING ERRORS
1. SAMPLING ERROR

The difference between the estimate derived from the sample (i.e. the statistic) and the true population value (i.e. the parameter) is technically called the sampling error. For example, when the sample mean is used to estimate the population mean, the sampling error is the difference between the two.

Sampling error arises due to the fact that a sample cannot exactly represent the population, even if it is drawn in a correct manner.

2. NON-SAMPLING ERROR

Besides sampling errors, there are certain errors which are not attributable to sampling but arise in the process of data collection, even if a complete count is carried out. The main sources of non-sampling errors are:

• Defects in the sampling frame.
• Faulty reporting of facts due to personal preferences.
• Negligence or indifference of the investigators.
• Non-response to mail questionnaires.

These (non-sampling) errors can be avoided through:

• Following up the non-response,
• Proper training of the investigators,
• Correct manipulation of the collected information.

Let us now consider exactly what is meant by ‘non-response’. We can say that there are two types of non-response: partial non-response and total non-response. ‘Partial non-response’ implies that the respondent refuses to answer some of the questions. On the other hand, ‘total non-response’ implies that the respondent refuses to answer any of the questions. Of course, the problem of late returns and non-response of the kind that I have just mentioned occurs in the case of HUMAN populations. Although refusal of sample units to cooperate is encountered in interview surveys, it is far more of a problem in mail surveys. It is not uncommon to find the response rate to mail questionnaires as low as 15 or 20%.

The provision of INFORMATION ABOUT THE PURPOSE OF THE SURVEY helps in stimulating interest, thus increasing the chances of greater response, particularly if it can be shown that the work will be to the ADVANTAGE of the respondent IN THE LONG RUN. Similarly, the respondent will be encouraged to reply if a pre-paid and addressed ENVELOPE is sent out with the questionnaire. But in spite of these ways of reducing non-response, we are bound to have some amount of non-response. Hence, a decision has to be taken about how many RECALLS should be made. The term ‘recall’ implies that we approach the respondent more than once in order to persuade him to respond to our queries.

Another point worth considering is: how long should the process of data collection be continued? Obviously, no such process can be carried out for an indefinite period of time! In fact, the longer the time period over which the survey is conducted, the greater will be the potential VARIATIONS in attitudes and opinions of the respondents. Hence, a well-defined cut-off date generally needs to be established.

Let us now look at the various ways in which we can select a sample from our population. We begin by looking at the difference between non-random and RANDOM sampling. First of all, what do we mean by non-random sampling?

NONRANDOM SAMPLING

‘Nonrandom sampling’ implies that kind of sampling in which the population units are drawn into the sample by using one’s personal judgment. This type of sampling is also known as purposive sampling. Within this category, one very important type of sampling is known as Quota Sampling.

QUOTA SAMPLING

In this type of sampling, the selection of the sampling unit from the population is no longer dictated by chance. A sampling frame is not used at all, and the choice of the actual sample units to be interviewed is left to the discretion of the interviewer. However, the interviewer is restricted by quota controls. For example, one particular interviewer may be told to interview ten married women between thirty and forty years of age living in town X, whose husbands are professional workers, and five unmarried professional women of the same age living in the same town. Quota sampling is often used in commercial surveys such as consumer market-research. Also, it is often used in public opinion polls.

ADVANTAGES OF QUOTA SAMPLING

• There is no need to construct a frame.
• It is a very quick form of investigation.
• Cost reduction.

Sampling error = $\bar{X} - \mu$

ONE THOUSAND RANDOM DIGITS

Actually, Random Number Tables are constructed according to certain mathematical principles so that each digit has the same chance of selection. Of course, nowadays randomness may be achieved electronically: computers have programmes by which we can generate random numbers.
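As an illustration of generating random numbers electronically, the sketch below uses Python's standard `random` module to draw ten distinct three-digit numbers, much as one would read them off a random number table. The seed and the choice of ten numbers in the range 000 to 999 are assumptions made for the example.

```python
import random

rng = random.Random(42)  # seeded only so that the example is repeatable

# Ten distinct sampling numbers between 000 and 999, each equally likely.
numbers = rng.sample(range(1000), k=10)
labels = [f"{n:03d}" for n in numbers]
print(labels)
```

Because `sample` draws without replacement, no sampling number can be selected twice, which corresponds to skipping repeated numbers when reading a printed table.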

EXAMPLE

The following frequency distribution gives the ages of a population of 1000 teen-age college students in a particular country. Select a sample of 10 students using the random numbers table. Find the sample mean age and compare it with the population mean age.

How will we proceed to select our sample of size 10 from this population of size 1000?

Age (X)    No. of Students (f)
  13               6
  14              61
  15             270
  16             491
  17             153
  18              15
  19               4
Total           1000

Student-Population of a College

The first step is to allocate to each student in this population a sampling number. For this purpose, we will begin by constructing a column of cumulative frequencies.

Now that we have the cumulative frequency of each class, we are in a position to allocate the sampling numbers to all the values in a class. As the frequency as well as the cumulative frequency of the first class is 6, we allocate numbers 000 to 005 to the six students who belong to this class.

As the cumulative frequency of the second class is 67 while that of the first class was 6, we allocate sampling numbers 006 to 066 to the 61 students who belong to this class.

Age (X)    No. of Students (f)    Cumulative Frequency (cf)
  13               6                       6
  14              61                      67
  15             270                     337
  16             491                     828
  17             153                     981
  18              15                     996
  19               4                    1000
Total           1000

Age (X)    No. of Students (f)    cf      Sampling Numbers
  13               6                6     000 – 005
  14              61               67
  15             270              337
  16             491              828
  17             153              981
  18              15              996
  19               4             1000
Total           1000

Age (X)    No. of Students (f)    cf      Sampling Numbers
  13               6                6     000 – 005
  14              61               67     006 – 066
  15             270              337
  16             491              828
  17             153              981
  18              15              996
  19               4             1000
Total           1000
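The allocation of sampling numbers can be sketched in a few lines of code. The ages and frequencies are those of the example; the helper `age_for` and the rule of giving each class the numbers from the previous cumulative frequency up to its own cumulative frequency minus one are written here as an illustration of the procedure described above, extended in the same way to the remaining classes.

```python
from bisect import bisect_right
from itertools import accumulate

ages = [13, 14, 15, 16, 17, 18, 19]
freqs = [6, 61, 270, 491, 153, 15, 4]

# Cumulative frequencies: 6, 67, 337, 828, 981, 996, 1000
cum_freqs = list(accumulate(freqs))

# Each class receives the numbers from the previous cf up to its own cf - 1.
starts = [0] + cum_freqs[:-1]
ranges = [(lo, cf - 1) for lo, cf in zip(starts, cum_freqs)]

for age, f, cf, (lo, hi) in zip(ages, freqs, cum_freqs, ranges):
    print(f"{age:>3} {f:>5} {cf:>5}   {lo:03d} - {hi:03d}")


def age_for(sampling_number):
    """Return the age of the student holding a sampling number (0 to 999)."""
    return ages[bisect_right(cum_freqs, sampling_number)]


print(age_for(42))  # sampling number 042 lies in 006 - 066, so the age is 14
```

Once every student has a sampling number, each three-digit number read from the random number table identifies exactly one student and hence one age.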

The age of each student in this class is 14 years; hence, obviously, the age of the 42nd student is also 14 years. This is how we are able to ascertain the ages of all the students who have been selected in our sample. You will recall that in this example, our aim was to draw a sample from the population of college students, and to compare the sample’s mean age with the population mean age. The population mean age comes out to be 15.785 years.

The population mean age is:

Age (X)    No. of Students (f)      fX
  13               6                  78
  14              61                 854
  15             270                4050
  16             491                7856
  17             153                2601
  18              15                 270
  19               4                  76
Total           1000               15785

$$\mu = \frac{\sum fX}{\sum f} = \frac{15785}{1000} = 15.785 \text{ years}$$

The above formula is a slightly modified form of the basic formula that you have used ever since your school days, i.e. the mean is equal to the sum of all the observations divided by the total number of observations. Next, we compute the sample mean age.

Ages of students selected in the sample (in years): 14, 15, 16, 15, 16, 16, 17, 15, 16, 16

Adding the 10 values and dividing by 10, we obtain:

$$\bar{X} = \frac{\sum X}{n} = \frac{156}{10} = 15.6 \text{ years}$$

Hence the sample mean age is 15.6 years. Comparing the sample mean age of 15.6 years with the population mean age of 15.785 years, we note that the difference is really quite slight, and hence the sampling error is equal to

$$\bar{X} - \mu = 15.6 - 15.785 = -0.185 \text{ years}$$

And the reason for such a small error is that we have adopted the RANDOM sampling method. The basic advantage of random sampling is that the probability is very high that the sample will be a good representative of the population from which it has been drawn, and any quantity computed from the sample will be a good estimate of the corresponding quantity computed from the population! Actually, a sample is supposed to be a MINIATURE REPLICA of the population. As stated earlier, there are various other types of random sampling.
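The arithmetic of the worked example can be verified with a short script. It uses only the frequency table and the ten sampled ages given in the example; the variable names are chosen here for illustration.

```python
ages = [13, 14, 15, 16, 17, 18, 19]
freqs = [6, 61, 270, 491, 153, 15, 4]
sample = [14, 15, 16, 15, 16, 16, 17, 15, 16, 16]

# Population mean from the frequency table: sum of fX divided by sum of f.
sum_fx = sum(f * x for f, x in zip(freqs, ages))
population_mean = sum_fx / sum(freqs)

# Sample mean: sum of the observations divided by their number.
sample_mean = sum(sample) / len(sample)

# Sampling error: statistic minus parameter.
sampling_error = sample_mean - population_mean

print(f"population mean = {population_mean:.3f} years")
print(f"sample mean     = {sample_mean:.1f} years")
print(f"sampling error  = {sampling_error:.3f} years")
```

The negative sign simply indicates that this particular sample slightly underestimates the population mean; a different random sample would give a different error.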

OTHER TYPES OF RANDOM SAMPLING

• Stratified sampling (if the population is heterogeneous)
• Systematic sampling (practically, more convenient than simple random sampling)
• Cluster sampling (sometimes the sampling units exist in natural clusters)
• Multi-stage sampling

All these designs rest upon random or quasi-random sampling. They are various forms of PROBABILITY sampling, that in which each sampling unit has a known (but not necessarily equal) probability of being selected. Because of this knowledge, there exist methods by which the precision and the reliability of the estimates can be calculated OBJECTIVELY. It should be realized that in practice, several sampling techniques are incorporated into each survey design, and only rarely will a simple random sample be used, or a multi-stage design be employed, without stratification. The point to remember is that, whatever method is adopted, care should be exercised at every step so as to make the results as reliable as possible.
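Two of the designs listed above can be sketched briefly. This is a minimal illustration, not a full survey design: the frame of 1000 numbered units, the two strata labelled "urban" and "rural", and the 1% sampling fraction are hypothetical, and the functions `systematic_sample` and `stratified_sample` are written for this example only.

```python
import random

rng = random.Random(7)  # seeded only so that the example is repeatable


def systematic_sample(frame, n):
    """Pick every k-th unit after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction):
    """Draw a simple random sample of the same fraction from every stratum."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample


frame = list(range(1000))  # hypothetical frame of 1000 numbered units
sys_sample = systematic_sample(frame, 10)

# Hypothetical strata: units 0-599 are "urban", units 600-999 are "rural".
strata = {"urban": list(range(0, 600)), "rural": list(range(600, 1000))}
strat_sample = stratified_sample(strata, 0.01)

print(sys_sample)
print(strat_sample)
```

In the systematic sample only the starting point is random, which is what makes the method convenient in practice; in the stratified sample each stratum is represented in proportion to its size, which matters when the population is heterogeneous.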