## What is Sampling?

The term “sampling” refers to selecting a part of a group or aggregate in order to obtain information about the whole. This aggregate, the totality of all members, is known as the population, although its members need not be human beings. The selected part, which is used to ascertain the characteristics of the population, is called the sample.

While choosing a sample, the population is assumed to be composed of individual units or members, some of which are included in the sample. The total number of members of the population is called Population Size and the number included in the sample is called Sample Size.

Researchers usually cannot make direct observations of every individual in the population they are studying. Instead, they collect data from a subset of individuals – a sample – and use those observations to make inferences about the entire population.

Ideally, the sample corresponds to the larger population on the characteristic(s) of interest. In that case, the researcher’s conclusions from the sample are probably applicable to the entire population.

This type of correspondence between the sample and the larger population is most important when a researcher wants to know what proportion of the population has a certain characteristic, such as a particular opinion or a demographic feature. Public opinion polls that try to describe the percentage of the population that plans to vote for a particular candidate, for example, require a sample that is highly representative of the population.
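As a rough illustration of the polling case, the sketch below draws a simple random sample from a made-up population and compares the sample proportion with the population proportion. The population size, the number of supporters, and the sample size are all assumptions chosen for the example, not figures from any real poll.

```python
import random

# Hypothetical population: 100,000 voters, 52,000 of whom support candidate A.
# All of these numbers are invented for illustration.
rng = random.Random(42)  # fixed seed so the run is repeatable
POPULATION_SIZE = 100_000
SUPPORTERS = 52_000

# 1 = supports candidate A, 0 = does not.
population = [1] * SUPPORTERS + [0] * (POPULATION_SIZE - SUPPORTERS)


def estimate_support(sample_size: int) -> float:
    """Draw a simple random sample (without replacement) and return
    the proportion of supporters found in it."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


population_proportion = sum(population) / POPULATION_SIZE
sample_proportion = estimate_support(1_000)

print(f"population proportion: {population_proportion:.3f}")
print(f"sample proportion:     {sample_proportion:.3f}")
```

Because the sample is drawn at random, the sample proportion will usually land close to the population proportion, but it will rarely match it exactly.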

## Need of Sampling

To draw conclusions about populations from samples, we must use inferential statistics which enables us to determine a population’s characteristics by directly observing only a portion (or sample) of the population. We obtain a sample rather than a complete enumeration (a census) of the population for many reasons.

Obviously, it is cheaper to observe a part rather than the whole, but we should prepare ourselves to cope with the dangers of using samples. In this tutorial, we will investigate various kinds of sampling procedures. Some are better than others but all may yield samples that are inaccurate and unreliable. We will learn how to minimize these dangers, but some potential error is the price we must pay for the convenience and savings the samples provide.

## Essentials of Sampling

In order to reach a clear conclusion, the sampling should possess the following essentials:

### It must be representative

The sample selected should possess similar characteristics to the original universe from which it has been drawn.

### Homogeneity

Samples selected from the universe should be similar in nature and should not differ materially from the universe when compared with it.

### Adequate samples

In order to have a more reliable and representative result, a good number of items are to be included in the sample.

### Optimization

Every effort should be made to get the best results in terms of both cost and efficiency. A larger sample gives better efficiency but also costs more, so the sample size should be chosen to balance the two.

## Advantages of Sampling

Sampling selects only a part of the units from the population for study. For a variety of reasons, it has a number of advantages over complete enumeration.

Sampling has the following advantages:

- Cost effective
- Time-saving
- Testing of Accuracy
- Detailed Research is Possible
- Reliability
- Exclusive methods in many circumstances
- Administrative convenience
- More scientific

### Cost effective

This method is cheaper than census research because only a fraction of the population is studied.

### Time-saving

There is a saving in time not only in conducting the sampling enquiry but also in the decision making process.

### Testing of Accuracy

Testing of accuracy of samples drawn can be made by comparing two or more samples.
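One simple way to picture this check is to draw two independent samples from the same population and compare a summary statistic. The sketch below does this with invented data; the population values, the sample size of 400, and the choice of the mean as the statistic are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 measurements (values are made up).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Two independent simple random samples of the same size.
sample_a = rng.sample(population, 400)
sample_b = rng.sample(population, 400)

mean_a = statistics.fmean(sample_a)
mean_b = statistics.fmean(sample_b)
difference = abs(mean_a - mean_b)

print(f"mean of sample A: {mean_a:.2f}")
print(f"mean of sample B: {mean_b:.2f}")
print(f"absolute difference: {difference:.2f}")
```

If the two samples agree closely, that is some evidence the sampling procedure is stable; a large gap suggests the samples are too small or the selection is not random.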

### Detailed Research is Possible

Since the data collected under this method are limited but homogeneous, more time can be spent on decision making.

### Reliability

If samples are of a proper size and are selected on proper grounds, the results of sampling will be almost the same as those that would have been obtained by the census method.

### Exclusive methods in many circumstances

Where the population is infinite, then the sampling method is the only method of effective research. Also, if the population is perishable or testing units are destructive, then we have to complete our research only through sampling. Example: Estimation of expiry dates of medicines.

### Administrative convenience

The organization and administration of a sample survey are easy, for the reasons discussed earlier.

### More scientific

Since the methods used to collect data are based on scientific theory and results obtained can be tested, sampling is a more scientific method of collecting data.

## Limitations of Sampling

It is not that sampling is free from demerits or shortcomings. There are certain limitations of this method which are discussed below:

- Biased Conclusion
- Experienced Researcher is required
- Not suited for Heterogeneous Population
- Small Population
- Sample Not Representative
- Lack of Experts
- Conditions of Complete Coverage

### Biased Conclusion

If the sample has not been properly taken, then the data collected and the decisions based on such data will lead to wrong conclusions. Samples are like medicines: they can be harmful when they are taken carelessly or without knowledge of their effects.

### Experienced Researcher is required

Efficient sampling requires the services of qualified, skilled and experienced personnel. In their absence, the results of the research will be biased.

### Not suited for Heterogeneous Population

If the populations are mixed or varied, then this method is not suited for research.

### Small Population

Sampling is not appropriate when the population is very small; in that case a complete enumeration is usually feasible.

### Illusory Conclusion

If a sample enquiry is not carefully planned and executed, the conclusions may be inaccurate and misleading.

### Sample Not Representative

To make the sample representative is a difficult task. If a representative sample is taken from the universe, the result is applicable to the whole population. If the sample is not representative of the universe the result may be false and misleading.

### Lack of Experts

Where there is a lack of experts to plan, conduct, and analyze a sample survey, its results will be unsatisfactory and untrustworthy.

### Conditions of Complete Coverage

If the information is required for each and every item of the universe, then a complete enumeration survey is better.

## Some Fundamental Definitions

Some fundamental concepts related to sampling are discussed as follows:

- Universe or Population
- Sample
- Sampling Unit
- Sampling
- Parameter
- Statistic
- Standard Error
- Sampling Frame
- Sampling Design
- Sampling Error
- Sample Distribution
- Population Distribution

### Universe or Population

The total number of items in any field of study is called the universe. The population refers to the total units or items about which information is required. The attributes that are the object of the study are called the characteristics and the units possessing them are known as elementary units. The aggregate of such units is the population.

All units in any field of study constitute the universe. All elementary units are the population. The two terms are often used interchangeably; research, however, needs a distinction.

The population or universe can be of two types:

- A **finite population** consists of a fixed number of elements, all of which can be enumerated, e.g., the number of students in a state. The symbol N denotes the number of elements or items in a finite population.
- An **infinite population** is one in which all the elements cannot be observed, at least theoretically, e.g., the number of stars in the sky. For practical purposes, a very large finite population is often treated as an infinite population.

### Sample

It is a subset of the population. It comprises only some elements of the population. If out of the 350 mechanical engineers employed in an organization, 30 are surveyed regarding their intention to leave the organization in the next six months, these 30 members would constitute the sample.
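The engineer example can be sketched in a few lines. The employee IDs below are hypothetical placeholders, and simple random sampling without replacement is assumed as the selection method; the text itself does not say how the 30 were chosen.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical IDs for the 350 mechanical engineers (the population).
engineers = [f"ENG-{i:03d}" for i in range(1, 351)]

# Simple random sample of 30 engineers, drawn without replacement,
# so no engineer can be selected twice.
sample = rng.sample(engineers, 30)

print(f"population size N = {len(engineers)}")
print(f"sample size n = {len(sample)}")
print("first five sampled IDs:", sample[:5])
```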

### Sampling Unit

A sampling unit is a single member of the sample. If a sample of 50 students is taken from a population of 200 MBA students in a business school, then each of the 50 students is a sampling unit. Another example could be that if a sample of 50 patients is taken from a hospital to understand their perception about the services of the hospital, each of the 50 patients is a sampling unit.

### Sampling

It is a process of selecting an adequate number of elements from the population so that the study of the sample will not only help in understanding the characteristics of the population but will also enable us to generalize the results. We will see later that there are two types of sampling designs—probability sampling design and non-probability sampling design.

### Parameter

By definition, a parameter is an arbitrary constant whose value characterizes a member of a system (such as a family of curves); it is also a quantity (such as a mean or variance) that describes a statistical population. A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic.

For example, the population mean is a parameter that is often used to indicate the average value of a quantity. Within a population, a parameter is a fixed value which does not vary. Each sample drawn from the population has its own value of any statistic that is used to estimate this parameter.

For example, the mean of the data in a sample is used to give information about the overall mean in the population from which that sample was drawn. Parameters are often assigned Greek letters (e.g., σ), whereas statistics are assigned Roman letters (e.g., s).

A statistical parameter is a parameter that indexes a family of probability distributions. It can be regarded as a numerical characteristic of a population or a model.
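The distinction between a fixed parameter and a varying statistic can be seen in a small simulation. The sketch below uses an invented population; the variable names `mu` and `sample_means`, the population values, and the sample size of 50 are assumptions made for the example.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 5,000 values between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(5_000)]

# Parameter: the population mean. It is a single fixed number.
mu = statistics.fmean(population)

# Statistic: the sample mean. Each new sample gives its own value.
sample_means = [
    statistics.fmean(rng.sample(population, 50)) for _ in range(5)
]

print(f"population mean (parameter): {mu:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean (statistic):  {m:.2f}")
```

In practice the population mean would be unknown, and any one of the sample means would serve as an estimate of it.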

### Statistic

A statistic (singular) is a single measure of some attribute of a sample, for example, its arithmetic mean. It is calculated by applying a function (statistical algorithm) to the values of the items in the sample, which together are known as a set of data.

More formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the sample’s distribution; that is, the function can be stated before the realization of the data. The term statistic is used both for the function and for the value of the function on a given sample.

A statistic is distinct from a statistical parameter, which usually cannot be computed directly because the population is often much too large to examine and measure all its items. A statistic that is used to estimate a population parameter is called an estimator. For example, the sample mean is a statistic that estimates the population mean, which is a parameter.

When a statistic (a function) is being used for a specific purpose, it may be referred to by a name indicating its purpose: in descriptive statistics, a descriptive statistic is used to describe the data; in estimation theory, an estimator is used to estimate a parameter of the distribution (population); in statistical hypothesis testing, a test statistic is used to test a hypothesis.

However, a single statistic can be used for multiple purposes – for example, the sample mean can be used to describe a data set, to estimate the population mean, or to test a hypothesis.

### Standard Error

By definition, the standard error is the standard deviation of the sampling distribution of a statistic. It measures the accuracy with which a sample statistic estimates the corresponding population value.

A sample mean generally deviates from the actual mean of the population; the standard error describes the typical size of this deviation across repeated samples. The term “standard error” is thus used to refer to the standard deviation of various sample statistics, such as the mean or median.

The standard error of the mean refers to the standard deviation of the distribution of sample means taken from a population. The smaller the standard error, the more precisely the sample mean estimates the population mean. The standard error is inversely proportional to the square root of the sample size: the larger the sample, the smaller the standard error, because the statistic tends to lie closer to the actual value.
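To make the relationship with sample size concrete, the sketch below compares the usual formula for the standard error of the mean, σ/√n for n independent draws, with the standard deviation of many simulated sample means. The population, the helper names `theoretical_se` and `empirical_se`, and the sample sizes are assumptions for illustration; sampling is done with replacement so that the formula applies without a finite-population correction.

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical population of 20,000 values (made up for illustration).
population = [rng.gauss(100, 15) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation


def theoretical_se(n: int) -> float:
    """Standard error of the mean for n independent draws: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def empirical_se(n: int, repeats: int = 2_000) -> float:
    """Standard deviation of the means of many simulated samples of size n,
    each drawn with replacement."""
    means = [
        statistics.fmean(rng.choices(population, k=n)) for _ in range(repeats)
    ]
    return statistics.stdev(means)


for n in (25, 100):
    print(
        f"n = {n:3d}: formula = {theoretical_se(n):.3f}, "
        f"simulated = {empirical_se(n):.3f}"
    )
```

Quadrupling the sample size from 25 to 100 halves the standard error rather than cutting it to a quarter, which is what the square-root relationship predicts.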

### Sampling Frame

The elementary units that form the basis of the sampling process are known as sampling units. A list of all such sampling units is referred to as the sampling frame. The sampling frame is a list of items from which the sample is drawn.

For research, a frame of the population is to be constructed which will enable the researcher to draw the sample, e.g., names from the census records or telephone directory, etc., for conducting a study on a sample that is drawn from the frame. A telephone directory is a frame, from which names are drawn to get the sample.

### Sampling Design

Sampling design helps in obtaining a sample from the frame. It is the procedure or technique for obtaining those sampling units from which inferences can be made. The sampling design has to be prepared well in advance before undertaking any research.

Statistic(s) and Parameter(s): A statistic is the characteristic of the sample whereas the parameter is the characteristic of the population. Sampling analysis involves estimating the parameter from the statistic.

### Sampling Error

This refers to any inaccuracy in the information collected that arises because only a portion of the population is included in the study. Sampling errors are also known as error variances. They arise out of sampling and are usually random variations in the sample estimates around the true population values.
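The idea of random variation around the true value can be sketched by recording the signed error of each sample estimate. Everything in the example is assumed for illustration: the population of invented measurements, the sample size of 300, and the 1,000 repeated samples.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed so the run is repeatable

# Hypothetical population of 30,000 measurements (values are made up).
population = [rng.gauss(170, 8) for _ in range(30_000)]
true_mean = statistics.fmean(population)

# Sampling error of each estimate = sample mean minus true population mean.
errors = [
    statistics.fmean(rng.sample(population, 300)) - true_mean
    for _ in range(1_000)
]

average_error = statistics.fmean(errors)  # should be close to zero
error_spread = statistics.stdev(errors)   # typical size of a single error

print(f"average sampling error: {average_error:+.4f}")
print(f"spread of sampling errors: {error_spread:.4f}")
print(f"largest overestimate: {max(errors):+.3f}")
print(f"largest underestimate: {min(errors):+.3f}")
```

With random selection the errors fall on both sides of zero and roughly cancel on average, which is what distinguishes random sampling error from a systematic bias.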

### Sample Distribution

Suppose that, from a population of 30,000, a random sample of 300 people is chosen for a given study. The observed data, e.g., fertility rates, are arranged in a frequency distribution. This type of distribution is called a sample distribution.

### Population Distribution

If the fertility rates of all 30,000 people in the population are obtained and arranged in a frequency distribution, it is known as the population distribution. Since the form and parameters of the population distribution are not ordinarily known, these two characteristics are estimated from the sample distribution. So, if the sample is random and its distribution is approximately normal, it is reasonable to assume that the population distribution is also approximately normal.
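The two distributions can be compared side by side in a small simulation. The sketch below uses a hypothetical variable (number of children per person) with invented weights in place of real fertility data; only the population size of 30,000 and the sample size of 300 come from the example in the text.

```python
import random
from collections import Counter

rng = random.Random(11)  # fixed seed so the run is repeatable

# Hypothetical variable: number of children per person.
# The categories and weights are made up for illustration.
values = [0, 1, 2, 3, 4]
weights = [0.20, 0.25, 0.30, 0.15, 0.10]

population = rng.choices(values, weights=weights, k=30_000)
sample = rng.sample(population, 300)  # simple random sample of 300


def relative_frequencies(data):
    """Return the share of observations falling in each category."""
    counts = Counter(data)
    return {v: counts.get(v, 0) / len(data) for v in values}


population_distribution = relative_frequencies(population)
sample_distribution = relative_frequencies(sample)

print("value  population  sample")
for v in values:
    print(
        f"{v:5d}  {population_distribution[v]:10.3f}  "
        f"{sample_distribution[v]:6.3f}"
    )
```

The sample frequencies follow the shape of the population frequencies without matching them exactly, which is why the sample distribution is used as an estimate of the population distribution rather than a replacement for it.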