In statistics, central tendency refers to the average, the balancing
point, or the most common value of a data set. Each of these
characteristics is useful in analyzing the data and drawing
inferences from it. Descriptive statistics defines measures of
central tendency and provides methods to evaluate them for a data set.
Inferential statistics makes use of measures of central tendency
calculated for a sample to draw conclusions about the population
characteristics.

The term central tendency indicates the middle value of a distribution and is measured using the mean, median and mode. Each of these measures is calculated differently, and each suits different situations depending on the nature of the data. Central tendency reflects the degree of clustering of the values of the distribution and is quantified using the above measures.

Some examples of central tendency are given below.

Example 1: Find the mean, median and mode of the set 82, 89, 83, 81, 82, 10.

The items of the data when arranged in order (say from least to greatest) are,

10, 81, 82, 82, 83, 89

The mean of the data = $\frac{10 + 81 + 82 + 82 + 83 + 89}{6}$ = $\frac{427}{6}$ ≈ 71.17

There are two middle terms, 82 and 82, and hence the median is $\frac{(82 + 82)}{2}$ = 82

The number 82 occurs most and hence the mode is 82.

The number 10 is far from the other numbers in the set. Such an item is called an *outlier* in the set. If we ignore the outlier and recalculate the mean, it becomes $\frac{81 + 82 + 82 + 83 + 89}{5}$ = $\frac{417}{5}$ = 83.4. It can be seen that an outlier has a great influence on the mean but little influence on the median or mode. *Hence, when outliers are present, the median is a more realistic measure of central tendency.*

Suppose the given data is the set of scores of a student. The median gives a better report about the student than the mean. The score of 10 may have been awarded incorrectly, or the student might have taken that particular test under difficult circumstances. But that score drastically reduces the student's mean. Normally a good judge will go by the median in such cases.

Suppose instead that the numbers are the code numbers of different commodities sold by a shop. The mode 82 tells us that the commodity with code 82 is more popular than the rest of the commodities.
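The calculations in Example 1 can be checked with a short Python sketch. It assumes Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median, mode

scores = [82, 89, 83, 81, 82, 10]

# Central tendency of the full data set, outlier included
with_outlier = (mean(scores), median(scores), mode(scores))

# Drop the outlier (the score of 10) and recompute
kept = [s for s in scores if s != 10]
without_outlier = (mean(kept), median(kept), mode(kept))

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
```

Only the mean moves noticeably when the outlier is dropped; the median and mode stay at 82.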

Example 2: The quiz scores of students are given below. Find the mean, median and mode of the data set.

3, 6, 7, 4, 9, 5, 8, 10, 4, 5, 6, 6.

Mean of the data $\overline{x}$ = $\frac{3+6+7+4+9+5+8+10+4+5+6+6}{12}$ = $\frac{73}{12}$ ≈ 6.08

To find the median, the data values are arranged in ascending order:

3, 4, 4, 5, 5, 6, 6, 6, 7, 8, 9, 10

As there is an even number of values (12), the median is the mean of the 6th and 7th values: $\frac{6 + 6}{2}$ = 6.

The value 6 occurs most (three times) in the data set. Hence the mode of the data values = 6.

Mean: The mean is the average or arithmetic mean of the data values in the distribution. Simply put, the mean is calculated by dividing the sum of the data values by the number of values in the data set. The symbol μ is used to represent the population mean, while the mean of a sample is indicated by $\overline{x}$.

The mean $\overline{x}$ of the data set $x_{1}, x_{2}, x_{3}, \ldots, x_{n}$ is given by

$\overline{x}$ = $\frac{x_{1}+x_{2}+x_{3}+\cdots+x_{n}}{n}$

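Median: The median is the middle value of an ordered data set. To find the median, the values of a data set are first ordered from the lowest to the highest. If there is an odd number of terms, the median is the $\frac{n+1}{2}$th term. If there is an even number of terms, the median is the mean of the two middle terms, the $\frac{n}{2}$th and the $\left(\frac{n}{2}+1\right)$th.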

Mode: The mode is the most frequently occurring value in the data set. A data set can have more than one mode.
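The mean and median definitions above translate almost directly into code. The following is a minimal sketch written from scratch; the function names are illustrative, and the quiz scores from Example 2 serve as test data.

```python
def mean(values):
    """Sum of the data values divided by the number of values."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data; mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the (n + 1) / 2-th term, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2

quiz_scores = [3, 6, 7, 4, 9, 5, 8, 10, 4, 5, 6, 6]
print(round(mean(quiz_scores), 2), median(quiz_scores))
```

Sorting a copy with `sorted` leaves the caller's list unchanged, which is usually the safer choice for a helper like this.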

The most commonly used measure of central tendency is the mean. It is also called the average. The mean is defined as the sum of the items of a data set divided by the number of items in the data set.

**For example**, consider a data set of the numbers 2, 5, 6, 10, 12. The sum of the items is 2 + 5 + 6 + 10 + 12 = 35 and the number of items is 5. Hence the mean of this data is $\frac{35}{5}$ = 7.

The median of a data set is the middle item when all the items are arranged in order.

**For example**, consider a data set of the numbers 2, 8, 6, 12, 10.

The items of the data when arranged in order (say from least to greatest) are,

2, 6 , 8, 10, 12

The middle item is 8 and hence 8 is the median of this data.

For a data set with an even number of items, the median is the mean of the two middle terms.

The median is a better measure of central tendency than the mean when the data are skewed, since it is hardly influenced by the presence of outliers. Example 1 illustrates this.

The mode of a data set is the item or items that occur most often among all the items of the data. Some data sets have no mode, and some have more than one mode.

**For example**, consider a data set of the numbers 2, 8, 6, 12, 10.

The items of the data when arranged in order (say from least to greatest) are,

2, 6 , 8, 10, 12

All the items in the data occur only once. Hence there is no mode for this data.

But consider a data set of the numbers 2, 8, 6, 2, 6, 12, 10, 6, 8, 3, 8

The items of the data when arranged in order (say from least to greatest) are,

2, 2, 3, 6, 6, 6, 8, 8, 8, 10, 12

In this data, the number 2 occurs twice, the number 6 occurs thrice and also the number 8 occurs thrice. Hence there are two modes for this data which are 6 and 8.
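The two cases above (no mode and more than one mode) can be sketched in Python with `statistics.multimode` (available from Python 3.8). The helper name `modes_or_none` is hypothetical, and treating "every value occurs once" as "no mode" follows the convention used in this text.

```python
from statistics import multimode

def modes_or_none(values):
    """Return the sorted list of modes, or an empty list when no value repeats."""
    if len(set(values)) == len(values):
        return []  # every item occurs only once: no mode
    return sorted(multimode(values))

print(modes_or_none([2, 8, 6, 12, 10]))
print(modes_or_none([2, 8, 6, 2, 6, 12, 10, 6, 8, 3, 8]))
```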

When the data represent categories, the mode of the data tells us the most popular item.

The common measures of dispersion which are used in statistical analysis are the range, variance, standard deviation and interquartile range.

Range is the simplest measure of variation; it is the difference between the highest and the lowest values in the data set. However, it is not very useful in describing the data, as it does not convey any information about how the data values are related to a measure of central tendency like the mean or median.

Variance is the measure of dispersion used along with the mean in describing the data. Indeed, the mean is used in the computation of the variance. Standard deviation is the positive square root of the variance and is often used with the mean. While the population standard deviation is denoted by σ, the sample standard deviation is represented by 's'.

The formulas for finding the variance and standard deviation are as follows:

Variance = $\sigma^{2}$ = $\sum_{i=1}^{N}\frac{(x_{i}-\overline{x})^{2}}{N}$

Standard deviation σ = $\sqrt{\sum_{i=1}^{N}\frac{(x_{i}-\overline{x})^{2}}{N}}$

Interquartile range (IQR) is the measure of variability used with the median to describe the data set. IQR is the difference between the third and first quartiles and gives the range of the central 50% of the data set.
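The measures of dispersion described above can be computed as in the following sketch, which uses the quiz scores from Example 2 as sample input. The variance divides by N, matching the formula above. The quartiles come from Python's `statistics.quantiles` with its default method; other quartile conventions exist and can give slightly different values, so the IQR here is one reasonable choice rather than the only one.

```python
from math import sqrt
from statistics import quantiles

data = [3, 6, 7, 4, 9, 5, 8, 10, 4, 5, 6, 6]
n = len(data)
x_bar = sum(data) / n

# Range: highest value minus lowest value
data_range = max(data) - min(data)

# Variance: mean of the squared deviations from the mean (divides by N)
variance = sum((x - x_bar) ** 2 for x in data) / n

# Standard deviation: positive square root of the variance
std_dev = sqrt(variance)

# Quartiles Q1, Q2 (the median) and Q3; IQR spans the central 50% of the data
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(data_range, round(variance, 3), round(std_dev, 3), iqr)
```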

According to the central limit theorem, the distribution of sample means approaches a normal distribution as the sample size increases.

The central limit theorem is stated below:

As the sample size n increases, the distribution of sample means taken with replacement from a population with mean μ and standard deviation σ will approach a normal distribution. The mean of the distribution of sample means = μ, with standard deviation = standard error of the mean = $\frac{\sigma }{\sqrt{n}}$.

The distribution of sample means can thus be assumed to be approximately normal for large sample sizes, even if the population is not known to be normally distributed. Thus the central limit theorem provides a basis for hypothesis testing using a test statistic and a critical value.
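A small simulation can make the theorem concrete. The sketch below is illustrative only: the population (the faces of a fair die), the sample sizes, and the number of trials are all assumptions chosen for the demonstration. It compares the spread of the simulated sample means with the predicted standard error $\frac{\sigma }{\sqrt{n}}$.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Illustrative non-normal population: the faces of a fair die
population = [1, 2, 3, 4, 5, 6]
mu = mean(population)        # 3.5
sigma = pstdev(population)   # population standard deviation

def sample_means(n, trials=5000):
    """Means of `trials` samples of size n, drawn with replacement."""
    return [sum(random.choices(population, k=n)) / n for _ in range(trials)]

for n in (2, 10, 50):
    means = sample_means(n)
    # Columns: n, mean of sample means, their spread, predicted sigma / sqrt(n)
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(sigma / sqrt(n), 3))
```

As n grows, the mean of the sample means should stay close to μ while their spread shrinks roughly in line with the predicted standard error.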