Percentile

Measures of position are used to locate the relative position of a data value within a data set. The measures of position commonly used in statistical studies are the Z-score, percentiles, deciles and quartiles.

Of these measures, the Z-score tells how many standard deviations a data value lies away from the mean. The Z-score is positive if the data value is greater than the mean and negative if it is less than the mean. A Z-score can be related to a percentile through the probabilities listed in Z-score tables.
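For reference, the verbal description above corresponds to the standard Z-score formula. The symbols $x$ (data value), $\mu$ (mean) and $\sigma$ (standard deviation) are notation introduced here for illustration and do not appear in the original text.

$z = \frac{x - \mu}{\sigma}$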
Percentiles, deciles and quartiles divide the data distribution into equal parts.

Percentile Definition

Percentiles divide a ranked or ordered data set into 100 equal groups.
The kth percentile mark of a data set is a value such that k% of the values fall at or below it, where k is a whole number between 1 and 99 called the percentile or percentile rank.

(100-k)% of data values fall above the kth percentile mark.

A diagrammatic explanation of the position of the kth percentile mark and the distribution of data around it is given below.

Problems involving percentiles generally come in two types.

The percentile mark (a value in the ordered data set) is given, and the percentile of that value is to be found.

The percentile is given, and the value corresponding to that percentile is to be found.

Percentile Formula

The percentile corresponding to a given value ‘X’ in the data set is given by
Percentile = $\frac{\text{Number of values below } X + 0.5}{\text{Total number of values in the data set}} \times 100$

Examples for Finding the Percentile

Example 1: 10 students took a 30 point test. The scores of the students are given below. Find the percentile rank of the score 20 in the data set.
12, 22, 18, 15, 20, 27, 21, 25, 16, 29.

Solution:
First the data set is to be ordered from the lowest to the highest as follows:
12, 15, 16, 18, 20, 21, 22, 25, 27, 29.
Number of values below the score 20 = 4 and the total number of scores in the data set = 10
Percentile rank for the score 20 = $\frac{\text{Number of values below } 20 + 0.5}{\text{Total number of values in the data set}} \times 100$

= $\frac{4+0.5}{10} \times 100$ = 0.45 $\times$ 100 = 45, that is, the 45th percentile

This means the student with a score of 20 did better in the test than 45% of the students.

A percentile chart aligning the scores with the percentiles can be made as follows:

| Mark | 12 | 15 | 16 | 18 | 20 | 21 | 22 | 25 | 27 | 29 |
|---|---|---|---|---|---|---|---|---|---|---|
| Percentile rank | P5 | P15 | P25 | P35 | P45 | P55 | P65 | P75 | P85 | P95 |

If the calculated answer happens to include decimals, it is rounded to the nearest integer.
Example 2: The following are the ages of 15 students in a statistics class. Find the percentile rank of age 23.
18    21     25   21   28   23   21   19   24   26   21   24   18   27   23

Solution:
The data set has to be arranged from the lowest to the highest.
18  18  19  21  21  21  21  23  23  24  24  25  26  27  28
Number of ages below 23 = 7 and the total number of students = 15.
Percentile rank for the age 23 = $\frac{\text{Number of values below } 23 + 0.5}{\text{Total number of values in the data set}} \times 100$
= $\frac{7+0.5}{15} \times 100$ = 50, that is, the 50th percentile.
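The percentile formula and the two worked examples can be checked with a short Python sketch. The function name `percentile_rank` is illustrative, and rounding halves upward is an assumption, since the text only says to round to the nearest integer.

```python
import math

def percentile_rank(data, x):
    """Percentile rank of x: (values below x + 0.5) / n * 100, rounded."""
    below = sum(1 for v in data if v < x)   # values strictly below x
    raw = (below + 0.5) * 100 / len(data)
    return math.floor(raw + 0.5)            # assumed rule: round half up

scores = [12, 22, 18, 15, 20, 27, 21, 25, 16, 29]                      # Example 1
ages = [18, 21, 25, 21, 28, 23, 21, 19, 24, 26, 21, 24, 18, 27, 23]    # Example 2

print(percentile_rank(scores, 20))   # percentile rank of the score 20
print(percentile_rank(ages, 23))     # percentile rank of the age 23

# Percentile chart for Example 1: every mark paired with its percentile rank
chart = {mark: percentile_rank(scores, mark) for mark in sorted(scores)}
print(chart)
```

Repeated values, such as the age 21 in Example 2, need no special handling because only values strictly below X are counted.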

Often we need to find the value at the kth percentile of the distribution. The steps for finding the percentile mark in a data set are given below.
1.    Order the data from the lowest to the highest.
2.    Compute the value c = $\frac{nk}{100}$, where n is the total number of items in the data set.
3.    If c is not a whole number, round it to the nearest whole number.
4.    Starting from the lowest data value, the value in the cth position (using the rounded c) is the required mark.
5.    If c is already a whole number, instead take the average of the values in the cth and the (c+1)th positions.

Examples for finding the percentile marks for the given percentile

The following are the recorded scores of 15 students for a quiz with a maximum score of 10. Find the score corresponding to the 25th percentile.
2 3 4 5 5 5 5 6 6 7 8 8 8 9 10
Solution:
The data is already ranked. Let us compute c, where n = 15 and k = 25.
c = $\frac{nk}{100}$ = $\frac{15\times25}{100}$ = 3.75. Rounding to the nearest integer, c = 4.
Counting to the 4th position from the lowest score, the required percentile mark = 5.
This tells us that about 25% of the class scored less than 5 in the quiz.
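The five steps and the quiz example can be sketched in Python as follows. The function name `percentile_mark` is illustrative. Rounding halves upward and clamping c to at least 1 are assumptions that the steps in the text do not spell out.

```python
import math

def percentile_mark(data, k):
    """Value at the k-th percentile, following the five steps in the text."""
    if not 1 <= k <= 99:
        raise ValueError("k must be a whole number between 1 and 99")
    ordered = sorted(data)                  # step 1: order the data
    n = len(ordered)
    if (n * k) % 100 == 0:                  # c = n*k/100 is a whole number
        c = n * k // 100
        # step 5: average of the c-th and (c+1)-th values (1-based positions)
        return (ordered[c - 1] + ordered[c]) / 2
    c = math.floor(n * k / 100 + 0.5)       # step 3: nearest whole number (half up, assumed)
    c = max(c, 1)                           # assumption: never fall below the first position
    return ordered[c - 1]                   # step 4: value in the c-th position

quiz = [2, 3, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 8, 9, 10]
print(percentile_mark(quiz, 25))   # c = 3.75, rounded to 4
print(percentile_mark(quiz, 20))   # c = 3, so the 3rd and 4th values are averaged
```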

Percentiles Problems With Frequency Distribution

When the data are given as a frequency distribution, the required rank can be found by assuming that the data are uniformly distributed within each class and by using cumulative percentage frequencies.

Example:
The table below gives the scores of 150 students in a national talent test.
a)    Find the approximate percentile rank of the score 275
b)    Find the approximate score corresponding to the 60th percentile
| Score | Frequency |
|---|---|
| 199.5 – 219.5 | 12 |
| 219.5 – 239.5 | 42 |
| 239.5 – 259.5 | 58 |
| 259.5 – 279.5 | 28 |
| 279.5 – 299.5 | 10 |

Now let us redo the table, adding the cumulative frequency and the cumulative percentage frequency.

| Score | Frequency | Cumulative frequency | Cumulative percentage |
|---|---|---|---|
| 199.5 – 219.5 | 12 | 12 | 8% |
| 219.5 – 239.5 | 42 | 54 | 36% |
| 239.5 – 259.5 | 58 | 112 | 74.7% |
| 259.5 – 279.5 | 28 | 140 | 93.3% |
| 279.5 – 299.5 | 10 | 150 | 100% |
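The cumulative columns can be recomputed from the class frequencies with a short Python sketch. The variable names are illustrative, and rounding the percentages to one decimal place is an assumption about the table's convention.

```python
from itertools import accumulate

classes = ["199.5-219.5", "219.5-239.5", "239.5-259.5", "259.5-279.5", "279.5-299.5"]
freqs = [12, 42, 58, 28, 10]

cum_freq = list(accumulate(freqs))                        # running total of the frequencies
total = cum_freq[-1]                                      # 150 students in all
cum_pct = [round(100 * c / total, 1) for c in cum_freq]   # cumulative percentage

for row in zip(classes, freqs, cum_freq, cum_pct):
    print(*row)
```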

a) To find the percentile rank for the score 275.

The score 275 falls in the class 259.5 – 279.5. The cumulative percentage corresponding to the previous class is 74.7%.
Since the scores are whole numbers, this class covers the 20 scores from 260 to 279, and 15 of them (260 to 274) are less than 275. The frequency 28 for this class means that 28 marks lie in it. Assuming these marks are spread evenly over the class, the number of marks less than 275 can be estimated by proportion as follows:

= $\frac{15\times28}{20}$ = 21

The difference in cumulative percentage between this class and the previous one = 93.3 – 74.7 = 18.6
This increase corresponds to a total increase in frequency of 28.
Hence, corresponding to an increase of 21 in frequency (the number of marks less than 275) in the class, the increase in cumulative percentage
= $\frac{18.6\times21}{28}$ = 13.95

Hence the cumulative percentage corresponding to the score 275 = 74.7 + 13.95 = 88.65
Rounding to the nearest integer, the percentile rank for the score 275 = 89.
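The proportional reasoning of part a) can be sketched in Python. The function name `grouped_percentile_rank` is illustrative, and each class is described by its lowest whole-number score (200, 220, ...) with a width of 20, which follows the worked solution rather than the class boundaries. The sketch does not round intermediate percentages, so it gives about 88.67 instead of 88.65 before the final rounding.

```python
def grouped_percentile_rank(score, lower_limits, freqs, width):
    """Estimate the percentile rank of a score, assuming marks are spread
    evenly inside each class."""
    total = sum(freqs)
    cum_before = 0                                   # marks in all earlier classes
    for low, f in zip(lower_limits, freqs):
        if low <= score < low + width:
            below_in_class = (score - low) * f / width   # e.g. 15 * 28 / 20 = 21
            return 100 * (cum_before + below_in_class) / total
        cum_before += f
    raise ValueError("score lies outside the tabulated classes")

lower_limits = [200, 220, 240, 260, 280]   # lowest whole-number score of each class
freqs = [12, 42, 58, 28, 10]

rank = grouped_percentile_rank(275, lower_limits, freqs, 20)
print(round(rank))
```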

b)    To find the mark corresponding to the percentile rank 60.
From the table, the class containing the required mark is 239.5 – 259.5, as this class contains the marks with percentile ranks between 36 and 75.
The increase in cumulative percentage for the class = 74.7 – 36 = 38.7
There are 58 items in the class. This means 58 marks increase the cumulative percentage by 38.7.
To reach the 60th percentile from the start of the class, the cumulative percentage must increase by 60 – 36 = 24.
Hence the number of marks in the class needed to increase the cumulative percentage by 24
= $\frac{24\times58}{38.7}$ = 36 when rounded to the nearest integer.
The lowest mark in the class is taken as 240 and the highest as 260, so the 58 entries are spread over an interval of 20.
So the increase in mark that cuts off 36 entries from the lowest = $\frac{36\times20}{58}$ = 12.41, which is 12 when rounded to the nearest whole number.
Hence, the score corresponding to the 60th percentile rank = 240 + 12 = 252
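Part b) reverses the same interpolation, and a Python sketch of it might look like this. The function name `grouped_percentile_score` is illustrative, and classes are again described by their lowest whole-number score with a width of 20, as in the worked solution. It works with frequencies directly instead of rounded percentages, so the intermediate figures are exact.

```python
def grouped_percentile_score(k, lower_limits, freqs, width):
    """Estimate the score at the k-th percentile, assuming marks are spread
    evenly inside each class."""
    if not 0 < k <= 100:
        raise ValueError("k must be in (0, 100]")
    total = sum(freqs)
    target = k * total / 100                 # number of marks that must lie below the answer
    cum_before = 0                           # marks in all earlier classes
    for low, f in zip(lower_limits, freqs):
        if f > 0 and cum_before + f >= target:
            needed = target - cum_before     # marks to cut off inside this class
            return low + needed * width / f
        cum_before += f
    raise ValueError("no class reaches the requested percentile")

lower_limits = [200, 220, 240, 260, 280]   # lowest whole-number score of each class
freqs = [12, 42, 58, 28, 10]

score = grouped_percentile_score(60, lower_limits, freqs, 20)
print(round(score))
```

For k = 60 the target is 90 marks, of which 54 lie in earlier classes, leaving the 36 marks used in the hand calculation.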

Percentile Graphical Solution

These problems can also be solved graphically. The cumulative percentage graph for the frequency distribution is drawn with the class intervals on the x-axis and the cumulative percentages on the y-axis. The graph can then be used to find either the percentile mark or the percentile rank when the other is given. The graph for the above problem, with the solutions marked, is given below. The answers read from the graph agree with those calculated earlier.