[Image credit: Repapetilto, Adobe Illustrator. Previously published: Unpublished. License: CC BY-SA 3.0]

P-Value in less than a minute

Confused by reading from many sources? You are at the right pit stop :-). Many articles focus on explaining what a p-value is. I will try to focus on why we need a p-value.

Aravind Brahmadevara
Jul 31, 2022


Why p-value:

Assume that we want to prove a theory — Boiling water produces steam

How do we prove it? We repeat the experiment again and again.

Reality in Statistics: We generally observe some data that does not come from a controlled experiment. Otherwise, we record experimental data, where all other factors are held constant. Example: if the temperature is 15 degrees C, will it rain? Can we artificially create 15 degrees C weather in a lab-controlled experiment, or are we better off using data observed outside of an experiment?

Questions:

Are these data accurate without biases? — No Guarantee

Do these data represent the whole population? — No. It is a sample (like a poll)

Judgement error — Since we can’t completely rely on the data, there is every chance that we make wrong conclusions.

Repeatability: Can we conclude anything without repeating the experiment? If we draw conclusions from the data without repetition, then there is a possibility of making wrong conclusions.

Final Intuition

Assume I got a sample of data, and I am going to make the wrong conclusion (favoring the alternate hypothesis).

I made a wrong conclusion without repeating the experiment. So, what is the probability of making the same wrong conclusion if I HAD repeated the experiment? When would I make the same mistake? Answer: if I get data as bad as this or worse (which is nothing but data points at least as extreme as this one), then I would make the same error.

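p-value: the probability of making the same wrong conclusion if I hypothetically repeated the experiment. In other words, it is the probability of getting data as extreme as this or more extreme when the null hypothesis is actually true.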
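To make the "hypothetical repetition" concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: a coin-flip experiment, a null hypothesis that the coin is fair, an observed result of 60 heads in 100 flips, and the function name simulated_p_value. It repeats the experiment many times in a world where the null hypothesis is true and counts how often the data come out as extreme as what we observed.

```python
import random


def simulated_p_value(observed_heads, n_flips, n_repeats=10_000, seed=0):
    """Estimate a one-sided p-value by repeating the experiment under the null.

    Null hypothesis (an assumption for this sketch): the coin is fair.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    as_extreme = 0
    for _ in range(n_repeats):
        # One hypothetical repetition: flip a fair coin n_flips times.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        # Would this repetition look as extreme as the observed data, or more?
        if heads >= observed_heads:
            as_extreme += 1
    # Fraction of repetitions that would lead to the same (wrong) conclusion.
    return as_extreme / n_repeats


p = simulated_p_value(observed_heads=60, n_flips=100)
print(f"simulated p-value: {p:.3f}")
```

The printed value is an estimate and changes slightly with the seed and the number of repetitions. The point is only that a small fraction of the fair-coin repetitions produce 60 or more heads.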

Significance level: the cutoff for the probability of a wrong conclusion that we are willing to tolerate (0.05 is a common choice).

If the p-value is more than the cutoff, then there is not much evidence to favor the alternate theory.
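The comparison against the cutoff can be sketched in a few lines of Python. This is an illustration only, under assumed conditions: a hypothetical coin-flip experiment with a fair coin as the null hypothesis, a cutoff of 0.05, and the made-up helper names exact_p_value and decide.

```python
from math import comb


def exact_p_value(observed_heads, n_flips):
    """One-sided p-value: P(heads >= observed_heads) if the coin is fair."""
    favorable = sum(comb(n_flips, k) for k in range(observed_heads, n_flips + 1))
    return favorable / 2 ** n_flips


def decide(p_value, alpha=0.05):
    """Compare the p-value with the significance level (the cutoff)."""
    if p_value > alpha:
        return "not enough evidence to favor the alternate theory"
    return "evidence favors the alternate theory"


# Two hypothetical observations out of 100 flips each.
for heads in (55, 60):
    p = exact_p_value(heads, 100)
    print(heads, "heads:", f"p-value = {p:.4f},", decide(p))
```

With 55 heads the p-value stays above the cutoff, so we do not favor the alternate theory. With 60 heads it drops below the cutoff.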

Note: There are alternate explanations such as ‘Assuming the null hypothesis is true, the probability of observing this data point or worse (by chance/luck) when the experiment is conducted’. This is mathematically correct but intuitively not accurate. We assume that the data point we observed is not due to luck/chance. It is only about how confident we are of making the correct conclusion, using the significance level as the cutoff.

Follow me: https://www.linkedin.com/in/aravind-deva/
