Percentile calculation in different contexts

Aravind Brahmadevara
Aug 20, 2023

Different textbooks and statistics libraries calculate percentiles in different ways. Let's understand the contexts behind each.

Context 1: (Not a sample)

Imagine you have a set of milestones on the road.

Image source: odishatv.in

There are three milestones. This is not a sample. This is a population. You start at 1, go to 2, and reach 3. How many steps did we cover?

Two steps: 1 -> 2 and 2 -> 3

For n milestones you have n-1 steps, so 100% of the journey is n-1 steps. In this case, to find the position (rank) of the p-th percentile, you have to cover (p/100) * (n-1) steps from the first milestone.
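The n-1 rule can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_n_minus_1` and the milestone values are made up, and the linear interpolation between neighbouring milestones is an assumption, since the text above only says how many steps to cover.

```python
def percentile_n_minus_1(sorted_values, p):
    """p-th percentile when the full journey is n - 1 steps (Context 1).

    Assumption: when the position falls between two milestones,
    interpolate linearly between them.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")

    steps = (p / 100) * (n - 1)   # steps covered from the first milestone
    lower = int(steps)            # whole steps completed (0-based index)
    frac = steps - lower          # fraction of the next step

    if lower + 1 >= n:            # already at the last milestone (p == 100)
        return float(sorted_values[-1])
    left, right = sorted_values[lower], sorted_values[lower + 1]
    return left + frac * (right - left)


milestones = [10, 20, 30]                      # hypothetical values
print(percentile_n_minus_1(milestones, 50))    # 20.0 (one full step of two)
print(percentile_n_minus_1(milestones, 25))    # 15.0 (half of the first step)
```

With three milestones, 50% is exactly one step, so it lands on the middle milestone.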

Context 2: (Statistics/Sample)

You are conducting an opinion poll and have collected n responses. These are draws from a larger population, that is, a sample. How can we visualize these draws? Each draw cuts the space at one point, so n draws cut it at n points, which creates n+1 partitions.

Note: it was n-1 previously and it is n+1 in this context.

For n points you have n+1 steps, so 100% of the journey is n+1 steps. In this case, to find the position (rank) of the p-th percentile, you have to cover (p/100) * (n+1) steps.

18.3 — Sample Percentiles | STAT 415 (psu.edu)
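Here is a minimal sketch of the n+1 rule. The function name `percentile_n_plus_1` and the data are hypothetical. Two things are assumptions rather than part of the text above: ranks are treated as 1-based with linear interpolation between neighbours, and ranks that fall before the first or after the last observation are clamped to the ends.

```python
def percentile_n_plus_1(sorted_values, p):
    """p-th percentile when the full journey is n + 1 steps (Context 2).

    Assumptions: rank is 1-based, fractional ranks are interpolated
    linearly, and out-of-range ranks are clamped to the first/last value.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")

    rank = (p / 100) * (n + 1)    # 1-based rank among the n observations
    if rank <= 1:                 # falls in the first partition
        return float(sorted_values[0])
    if rank >= n:                 # falls in the last partition
        return float(sorted_values[-1])

    lower = int(rank)             # 1-based rank of the left neighbour
    frac = rank - lower
    left, right = sorted_values[lower - 1], sorted_values[lower]
    return left + frac * (right - left)


responses = [10, 20, 30, 40]                  # hypothetical sorted sample
print(percentile_n_plus_1(responses, 50))     # 25.0 (rank 2.5)
print(percentile_n_plus_1(responses, 25))     # 12.5 (rank 1.25)
```

For comparison, the n-1 rule on the same four values puts the 25th percentile at 0.75 steps from the start, so the two contexts give different answers on small data.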

Context 3: (Sorted array of numbers)

C Arrays — GeeksforGeeks

How many values are less than me? At index 5, there are 5 values less than 18. At index 2, there are 2 values less than 8. In general, at zero-based index i of a sorted array (with no duplicates), there are i smaller values.

Percentile = (number of values less than me / total number of values) * 100

Percentile of index i = (i / n) * 100

Rank (index) of percentile p => (p / 100) * n
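The two directions of this rule can be sketched as follows. The array is a hypothetical one chosen so that 8 sits at index 2 and 18 at index 5, matching the description above. The helper names are made up, and flooring the index and clamping it to the last element are assumptions.

```python
from bisect import bisect_left
import math


def percentile_of_value(sorted_values, x):
    """Percentage of values strictly less than x (Context 3)."""
    smaller = bisect_left(sorted_values, x)   # count of values < x
    return smaller / len(sorted_values) * 100


def index_for_percentile(sorted_values, p):
    """Zero-based index for percentile p, using index = p * n / 100.

    Assumptions: the index is floored, and p = 100 is clamped to the
    last element so the result is always a valid index.
    """
    n = len(sorted_values)
    return min(math.floor(p * n / 100), n - 1)


values = [2, 4, 8, 12, 16, 18]                      # hypothetical sorted array
print(round(percentile_of_value(values, 8), 1))     # 33.3 (2 of 6 are smaller)
print(round(percentile_of_value(values, 18), 1))    # 83.3 (5 of 6 are smaller)
print(index_for_percentile(values, 50))             # 3
```

Using `bisect_left` keeps the count correct even when the array contains duplicates, because it counts only the values strictly smaller than `x`.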

So notice how we had n-1, n+1, and n in the three different contexts.

Final conclusion: as n grows large, all three methods converge to similar values, so there is nothing to worry about :-)
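A rough way to see the convergence is to put all three rules on the same zero-based scale and compare the positions they pick. The function name `positions` is hypothetical, and shifting the n+1 rank down by one to make it zero-based is an assumption about how that rank is counted.

```python
def positions(n, p):
    """Zero-based positions chosen by the three contexts for percentile p."""
    f = p / 100
    return (
        f * (n - 1),        # Context 1: n - 1 steps
        f * (n + 1) - 1,    # Context 2: n + 1 steps, 1-based rank shifted to 0-based
        f * n,              # Context 3: n values
    )


for n in (10, 1_000, 100_000):
    pos = positions(n, 90)
    gap = max(pos) - min(pos)             # largest disagreement, in positions
    print(n, [round(x, 2) for x in pos], f"relative gap = {gap / n:.6f}")
```

The three positions never differ by more than one slot in the sorted data, so relative to n the disagreement shrinks toward zero as n grows.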

You can reach me at

Aravind Brahmadevara | LinkedIn

aravind-deva (aravind brahmadevara) (github.com)
