# Probability Sampling

A comprehensive study of a large population is costly and time-consuming. More often than not, it is not feasible to study the whole population thoroughly. Instead, the characteristics of the population are estimated from a representative sample. The sampling methods applied are broadly classified as:
1.    Probability sampling
2.    Non-probability sampling

## Definition of Probability Sampling

Probability sampling is a method of sampling in which the selection of elements involves randomness.

The chance process assigns every element of the population a definite probability of being included in the sample. In addition to random selection, the sampling process tries to ensure that the different units constituting the population are fairly represented in the sample. This makes the sample representative of the population.

Using a statistic computed from a probability sample as a point estimate, confidence intervals for the corresponding population parameter can be constructed.
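As a rough illustration of the last point, the sketch below computes a sample mean and an approximate 95% confidence interval around it. It assumes a normal approximation (z = 1.96) and ignores the finite population correction; the function name `mean_confidence_interval` and the simulated population are hypothetical and used only for illustration.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    std_err = statistics.stdev(sample) / math.sqrt(n)
    margin = z * std_err
    return mean, mean - margin, mean + margin

# Hypothetical population of 2000 values, simulated for illustration only.
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(2000)]

# Draw a simple random sample of 100 and estimate the population mean.
sample = rng.sample(population, 100)
mean, lower, upper = mean_confidence_interval(sample)
print(f"point estimate: {mean:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

The interval is only meaningful because the sample was drawn by a random process; the same arithmetic applied to a convenience sample would not carry the same interpretation.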

## Types of Probability Sampling

Probability sampling methods can be broadly classified into four basic types:
1.    Simple random sampling
2.    Stratified random sampling
3.    Systematic random sampling
4.    Cluster random sampling

### Simple Random Sampling

If a sample of size ‘n’ is to be selected from a population of size ‘N’, simple random sampling ensures that all possible groups of size n in the population have an equal chance of being selected as the sample. The primitive method of selecting the elements for the sample is a lottery-like procedure. A more professional approach makes use of random numbers. Suppose the population consists of 2000 elements. Each member of the population is assigned a number from 1 to 2000. If the sample size is set at 100, then 100 numbers from 1 to 2000 are selected by a random process, using a random number table or random number generating software. The data collected or available for these sample members is then analyzed using statistical techniques. Simple random sampling thus includes the two features of probability sampling:
1.    The randomness applied in the selection process.
2.    The probability of being taken as the sample is the same for every group of 100 that can be formed from the population.
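The selection of 100 numbers from 1 to 2000 can be sketched with Python's standard `random` module. This is a minimal sketch, not a prescribed procedure; the function name `simple_random_sample` and the seed value are assumptions made for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct labels from 1..population_size at random."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every subset of the
    # requested size is equally likely to be chosen.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Population numbered 1 to 2000, sample of 100, as in the text.
chosen = simple_random_sample(2000, 100, seed=42)
print(len(chosen), chosen[:5])
```

Fixing the seed makes the draw reproducible, which is convenient for checking results; leaving it as `None` gives a different sample on each run.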

### Stratified Random Sampling

Stratified random sampling, in the form described here, is also called proportional random sampling. The population is divided into homogeneous subgroups (strata), and elements from each group are selected using simple random sampling. If the population size is N and the population is divided into $I$ groups of sizes $N_1, N_2, \dots, N_I$, then $N_1 + N_2 + \dots + N_I = N$. If the sample size is fixed as ‘n’, then the number of elements to be included from the $i$th group is $N_i \times \frac{n}{N}$. The members of the sample are then selected from each group using the simple random method explained above.
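The allocation rule $N_i \times \frac{n}{N}$ followed by a simple random draw within each group can be sketched as follows. The strata names and sizes (1000, 600 and 400 out of 2000) are hypothetical, as are the function names; rounding to whole units is an assumption of this sketch.

```python
import random

def proportional_allocation(stratum_sizes, sample_size):
    """Number of units to draw from each stratum: N_i * n / N, rounded."""
    total = sum(stratum_sizes.values())
    return {name: round(size * sample_size / total)
            for name, size in stratum_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample inside every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical strata covering a population numbered 1 to 2000.
strata = {
    "stratum_1": list(range(1, 1001)),     # N_1 = 1000
    "stratum_2": list(range(1001, 1601)),  # N_2 = 600
    "stratum_3": list(range(1601, 2001)),  # N_3 = 400
}
sample = stratified_sample(strata, 100, seed=1)
print({name: len(units) for name, units in sample.items()})
# -> {'stratum_1': 50, 'stratum_2': 30, 'stratum_3': 20}
```

When the products $N_i \times \frac{n}{N}$ are not whole numbers, rounding can make the group counts add up to slightly more or less than n, so a real design would need a rule for adjusting the totals.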

### Systematic Random Sampling

In systematic random sampling the elements are picked from the numbered population members at regular intervals, with randomness applied only to the first selection. If N and n are the population and sample sizes, the interval size is k = $\frac{N}{n}$. The first unit to be included in the sample is decided by selecting a number between 1 and k at random. Then every kth number after it is included in the sample. Suppose the population and sample sizes are 2000 and 100; then k = $\frac{2000}{100}$ = 20. A random number is generated between 1 and 20. If the random number generated is 16, then the numbers included in the sample follow the sequence 16, 36, 56, 76, 96, ...

Systematic random sampling is the easiest way to obtain a sample. But for the sample to be truly representative of the population, the members of the population should have been arranged in some random order.
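The procedure can be sketched in a few lines of Python. The function name `systematic_sample` and its `start` parameter are hypothetical, and the sketch assumes that N is an exact multiple of n so that the interval k is a whole number.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick every k-th label, beginning at a random start between 1 and k."""
    k = population_size // sample_size  # sampling interval k = N / n
    if start is None:
        # randint includes both endpoints, so start lies in 1..k.
        start = random.Random(seed).randint(1, k)
    return list(range(start, population_size + 1, k))[:sample_size]

# Reproduce the example from the text: N = 2000, n = 100, random start 16.
picked = systematic_sample(2000, 100, start=16)
print(picked[:5], picked[-1])
# -> [16, 36, 56, 76, 96] 1996
```

Passing `start` explicitly is only for reproducing the worked example; in practice the start would be left to the random draw.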

### Cluster Sampling

Cluster sampling is done mainly to reduce cost. Clusters are natural groups within the population, mostly formed geographically. Suppose the population consists of two-member working families in the entire state of California. If the sample is drawn from the entire population, a lot of travelling is needed to collect the data. Hence, to reduce the cost involved in sampling and to save time, the entire state is viewed as a union of clusters of areas divided by zip codes. We can select 4 or 5 sample areas as representative of the population using random methods. The data then has to be collected from all units in the selected clusters. If the survey or data collection can be done by mail, telephone or e-mail, cluster sampling need not be preferred for economizing the cost of sampling.
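A minimal sketch of this idea: clusters are chosen at random, and then every unit in each chosen cluster is surveyed. The cluster labels (`zip_01`, `zip_02`, ...) and family identifiers below are placeholders invented for the example, not real zip codes or data.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and return every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    # All units of a selected cluster are included; no sampling within it.
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population: 20 zip-code areas with 5 families each.
clusters = {
    f"zip_{i:02d}": [f"zip_{i:02d}_family_{j}" for j in range(1, 6)]
    for i in range(1, 21)
}
chosen, units = cluster_sample(clusters, 4, seed=7)
print(chosen)
print(len(units))
```

Only the 4 selected areas need to be visited, which is where the saving in travel cost comes from.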

### Multi-stage Sampling

In actual practice the four methods stated above are not generally used in isolation. A combination of two or more of the methods is used, which divides the sampling into stages. This method of combining different sampling types for efficient use of available resources is called multi-stage sampling.
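One possible combination, shown here only as a sketch, is a two-stage design: clusters are chosen at random in the first stage, and a simple random sample is drawn inside each chosen cluster in the second stage. The function name, cluster labels and sizes are hypothetical.

```python
import random

def two_stage_sample(clusters, num_clusters, units_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: simple random sample in each."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(clusters), num_clusters)
    return {name: rng.sample(clusters[name], units_per_cluster)
            for name in stage_one}

# Hypothetical population: 20 areas with 50 numbered households each.
clusters = {f"area_{i:02d}": list(range(1, 51)) for i in range(1, 21)}
selected = two_stage_sample(clusters, num_clusters=5, units_per_cluster=10, seed=11)
for area, households in selected.items():
    print(area, sorted(households))
```

Compared with plain cluster sampling, the second stage reduces the number of units surveyed in each selected cluster, at the price of an extra source of sampling variation.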

## Non-Probability Sampling

In non-probability sampling, samples are taken from an accessible (convenient) population rather than from the theoretical population. The sample collected in this process can still be representative of the theoretical population. In contrast to probability sampling, randomness does not play a role in the inclusion of elements in the sample.

### Example of Non-Probability Sampling

Suppose that, for a statistics project, you need to study the spending behavior of university students. As you have to pay for the project purely from your own pocket, the sampling method applied should be the least expensive one. So you prepare a questionnaire and survey the students who visit the neighboring departmental store. You may use the data thus collected for drawing inferences, though only theoretically. You cannot statistically extend the results to the population as a whole.

## Probability vs Non-Probability Sampling

| Probability sampling | Non-probability sampling |
| --- | --- |
| Randomness is applied to make the sample a good representative of the population. | No effort is made to make the sample representative of the population, although the sample may still turn out to be representative. |
| The probability that the population is well represented by the sample is measurable. | It is hard to find out how well the sample represents the population. |
| The inferences drawn from the sample can be statistically generalized to the population. | The inferences or findings from the sample cannot be generalized to the population. The results are valid only for the sample. |
| Free from the bias that can arise in sample selection. | There is a possible bias in sample selection. |
| The cost involved in sampling cannot be kept low. | The costs involved are much lower than those of probability sampling. |

## Probability Sampling Examples

In the banking sector, different clients hold different types of accounts, such as SB accounts, RD accounts, current accounts, trade accounts and many more. These clients can be placed in different groups according to account type, and the sampling probability can then be calculated.
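One way to read this example, offered here only as an assumption, is to treat each account type as a group and compute the probability that a client in that group is selected when a fixed number of clients is drawn from it at random. The client counts and sample sizes below are invented purely for illustration.

```python
from fractions import Fraction

def inclusion_probabilities(group_sizes, sample_sizes):
    """Probability that a given client is selected, per group: n_i / N_i."""
    return {name: Fraction(sample_sizes[name], group_sizes[name])
            for name in group_sizes}

# Hypothetical numbers of clients per account type (not real data).
accounts = {"SB": 5000, "RD": 2000, "Current": 2500, "Trade": 500}
# Hypothetical number of clients drawn at random from each account type.
drawn = {"SB": 250, "RD": 100, "Current": 125, "Trade": 25}

probabilities = inclusion_probabilities(accounts, drawn)
for name, p in probabilities.items():
    print(name, p, float(p))
```

With these illustrative counts each group is sampled in proportion to its size, so every client has the same selection probability of 1/20 regardless of account type.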