Percentile calculation in different contexts
Textbooks, statistics libraries, and other sources calculate percentiles in slightly different ways. Let's understand the contexts behind those differences.
Context 1: (Not a sample)
Imagine a set of milestones along a road.
There are three milestones. This is not a sample; it is the whole population. You start at 1, go to 2, and reach 3. How many steps did you cover?
Two steps: 1 -> 2 and 2 -> 3
For n milestones there are n-1 steps, so 100% of the journey is n-1 steps. In this case, to find the position (rank) of the p-th percentile, you have to cover (p/100) * (n-1) steps from the first milestone.
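The n-1 rule can be sketched in a few lines of Python. This is only an illustration: the function name percentile_n_minus_1 is hypothetical, positions are counted from 0, and the linear interpolation between two neighbouring milestones (used when the step count is not a whole number) is an assumption that the text above does not spell out.

```python
def percentile_n_minus_1(values, p):
    """Value found after covering (p/100) * (n-1) steps from the first item."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    steps = (p / 100) * (n - 1)   # 100% of the journey is n-1 steps
    lower = int(steps)            # last whole milestone already passed
    frac = steps - lower          # how far we are into the next step
    if lower + 1 >= n:            # already at the last milestone
        return float(data[-1])
    # Assumption: interpolate linearly between the two neighbours.
    return data[lower] + frac * (data[lower + 1] - data[lower])


milestones = [1, 2, 3]
print(percentile_n_minus_1(milestones, 50))   # 2.0
print(percentile_n_minus_1(milestones, 25))   # 1.5
```

With three milestones, 50% of the journey is exactly one step, which lands on milestone 2.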
Context 2: (Statistics/Sample)
You are conducting an opinion poll and have collected n responses. These are draws from a larger population, not the population itself. How can we visualize these draws? Each draw cuts the space at one point, so n draws cut it at n points, which means you have created n+1 partitions.
Note: it was n-1 in the previous context and it is n+1 in this one.
For n points there are n+1 steps, so 100% of the journey is n+1 steps. In this case, to find the position (rank) of the p-th percentile, you have to cover (p/100) * (n+1) steps.
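Here is a similar sketch for the n+1 rule. Again the name percentile_n_plus_1 is hypothetical. Two choices in it are assumptions rather than something stated above: ranks are counted from 1, and a rank that falls before the first sample or after the last one is clamped to the nearest sample, since there is no data beyond the ends to interpolate with.

```python
def percentile_n_plus_1(values, p):
    """Value at 1-based rank (p/100) * (n+1) in the sorted sample."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("values must not be empty")
    rank = (p / 100) * (n + 1)             # 100% of the journey is n+1 steps
    # Assumption: clamp ranks that fall outside the observed samples.
    rank = min(max(rank, 1.0), float(n))
    lower = int(rank)                      # 1-based rank of the sample below
    frac = rank - lower
    if lower >= n:                         # at (or clamped to) the last sample
        return float(data[-1])
    # Assumption: interpolate linearly between neighbouring samples.
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])


poll = [10, 20, 30, 40]
print(percentile_n_plus_1(poll, 50))   # 25.0
```

For four samples, 50% of n+1 = 5 steps is rank 2.5, halfway between the second and third samples.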
Context 3: (Sorted array of numbers)
Here the question is: how many values are less than me? At index 5, there are 5 values less than 18. At index 2, there are 2 values less than 8. In general, at index i (counting from 0) there are i smaller values.
Percentile = (number of smaller values / total values) * 100
Percentile of index i = (i / n) * 100
Position (index) of the p-th percentile = (p * n) / 100
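The three formulas above translate directly into code. In this sketch the function names and the array are hypothetical; the array is made up so that 8 sits at index 2 and 18 at index 5, as in the description, and it assumes all values are distinct. Counting the smaller values is done with bisect_left from the standard library, which returns how many items in a sorted list are strictly less than the given value.

```python
from bisect import bisect_left


def percentile_of_index(i, n):
    """Percentile of the item at 0-based index i: (i / n) * 100."""
    return 100 * i / n


def index_of_percentile(p, n):
    """0-based position of the p-th percentile: (p * n) / 100."""
    return p * n / 100


def percentile_of_value(sorted_values, x):
    """(number of values smaller than x / total values) * 100."""
    smaller = bisect_left(sorted_values, x)   # count of items strictly below x
    return 100 * smaller / len(sorted_values)


# Hypothetical sorted array: 8 is at index 2 and 18 is at index 5.
values = [2, 5, 8, 11, 14, 18, 21, 25, 30, 40]
n = len(values)

print(percentile_of_index(5, n))         # 50.0
print(percentile_of_value(values, 18))   # 50.0
print(index_of_percentile(50, n))        # 5.0
```

Note that the largest value never reaches 100 under this rule, because it is never smaller than itself.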
So notice how the step count was n-1, n+1, and n in the three contexts.
Final conclusion: as n grows large, all three methods converge to similar values, so there is nothing to worry about :-)
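A rough way to see the convergence is to compare the three step counts for the same p as n grows. The helper names below are hypothetical, and "relative spread" is just one illustrative way to measure the gap: the difference between the largest and smallest step count, divided by the middle one. Algebraically this is 2/n, so it shrinks as n grows.

```python
def step_counts(p, n):
    """Steps to cover for the p-th percentile under the n-1, n and n+1 rules."""
    f = p / 100
    return f * (n - 1), f * n, f * (n + 1)


def relative_spread(p, n):
    """Gap between the n+1 and n-1 rules, relative to the n rule."""
    low, mid, high = step_counts(p, n)
    return (high - low) / mid


for n in (10, 100, 1000, 10000):
    print(n, step_counts(50, n), relative_spread(50, n))
```

The absolute gap between the rules stays at 2 * p/100 steps, but compared with the size of the data it becomes negligible.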