Bayes’ Theorem Simplified
My goal: read once and remember for a lifetime. My focus is on stressing the right foundational principles.
Sample space: All possible results (outcomes) of a random experiment.
Event: A group of outcomes (a subset of the sample space).
Exclusive events: Non-overlapping events (no outcome belongs to both).
Exclusive and exhaustive events: Non-overlapping events that together cover the entire sample space. Mathematically, this is called a partition of the sample space.
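To make these definitions concrete, here is a small sketch that treats events as Python sets. The die roll and the helper name is_partition are my own illustrative assumptions, not part of any library.

```python
# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def is_partition(events, space):
    """Return True if the events never overlap and together cover the space."""
    covered = set()
    for event in events:
        if covered & event:      # shares an outcome with an earlier event
            return False         # -> not exclusive
        covered |= event
    return covered == space      # exhaustive only if nothing is left out

odd, even = {1, 3, 5}, {2, 4, 6}
low = {1, 2}

print(is_partition([odd, even], sample_space))  # True
print(is_partition([odd, low], sample_space))   # False (they overlap at 1)
```

Odd and even form a partition; odd and low do not, because they share the outcome 1 and also leave 4 and 6 uncovered.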
Simple Binary Case:
One event: A (the hypothesis)
Another event: B (the evidence), which could come from a different random variable
To keep things simple, assume for now that event B is also from the same sample space (same random variable).
Question to yourself: Do A and B cover the entire sample space? No.
Do A and A’ (A’s complement) cover the entire sample space? Yes.
So we are dealing with A and A’ here, and B can occur within A or within A’. (The name "binary" comes from A and A’, not from A and B.)
Now comes Bayes’ theorem. We have evidence (B); what is the probability that A happened?
Analytical understanding (SINGLE EVENT)
We know B falls in A or A’. Suppose ONE event happened and it was an A, which has probability P(A). Within A, what is the probability that the same event was also a B? That is P(B|A), so the probability of both together is:
P(A and B) = P(A) * P(B|A)
Remember we are still in the scope of one event, which is why the conditional probability kicks in.
If there were two sequential events, it would have been something like P(A) * P(B), where A is drawing a King from a pack of cards and B is a separate event such as rolling a 4 on a 6-sided die. (Here A and B are independent. If the sequential events depend on each other, conditional probability kicks in again.)
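The difference between the two situations can be checked with exact fractions. In this sketch the single-event example (A = red card, B = heart, both read off one drawn card) is my own assumption for illustration; the King and die example is the one from the text.

```python
from fractions import Fraction
from itertools import product

# Single event: ONE card is drawn, and we ask two things about that same card.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))                          # 52 outcomes

A = {c for c in deck if c[1] in ("hearts", "diamonds")}     # card is red
B = {c for c in deck if c[1] == "hearts"}                   # card is a heart

p_a = Fraction(len(A), len(deck))            # P(A)   = 26/52
p_b_given_a = Fraction(len(A & B), len(A))   # P(B|A) = 13/26
p_a_and_b = p_a * p_b_given_a                # P(A) * P(B|A)
print(p_a_and_b)                             # 1/4

# Two independent sequential events: draw a King, then roll a 4 on a fair die.
p_king = Fraction(4, 52)
p_four = Fraction(1, 6)
print(p_king * p_four)                       # 1/78
```

In the first case the product matches the direct count of red hearts (13 out of 52), which is what P(A) * P(B|A) is supposed to measure.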
Single Event Continued:
Either A happens or A’ happens; the two are exclusive. If A happens, the probability that the event also belongs to B is P(B|A), which gives the joint probability P(A) * P(B|A).
If A’ happened instead, with probability P(A’), the probability of B within A’ is P(B|A’), so the joint probability is P(A’) * P(B|A’).
So the total probability of B across the two exclusive events (A or A’) is:
P(B) = P(A) * P(B|A) + P(A’) * P(B|A’)
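As a quick numeric check of the total probability formula, here is a minimal sketch. The three input probabilities are made-up numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (assumptions for illustration only).
p_a = Fraction(3, 10)              # P(A)
p_b_given_a = Fraction(8, 10)      # P(B|A)
p_b_given_not_a = Fraction(1, 10)  # P(B|A')

p_not_a = 1 - p_a                  # A and A' partition the sample space

# P(B) = P(A) * P(B|A) + P(A') * P(B|A')
p_b = p_a * p_b_given_a + p_not_a * p_b_given_not_a
print(p_b)                         # 31/100
```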
Possibilities for B = P(B) * number of trials, i.e. the expected number of trials in which B occurs. (The number of trials can be taken as 1 in a subtle sense, but that subtlety can be skipped for now.)
Now, let’s come to the actual event. We know that B happened (the evidence). Total possibilities for B are P(B) * number of trials.
Intuition: Assume event A has occurred. What are the possibilities for B within A? P(A) * P(B|A) * number of trials
P(A|B) = P(A) * P(B|A) * number of trials / Total possibilities for B
The number of trials cancels in the numerator and denominator:
P(A|B) = P(A) * P(B|A) / (P(A) * P(B|A) + P(A’) * P(B|A’))
Intuition: Assume event A’ has occurred. What are the possibilities for B within A’? P(A’) * P(B|A’) * number of trials
P(A’|B) = P(A’) * P(B|A’) * number of trials / Total possibilities for B
The number of trials cancels in the numerator and denominator:
P(A’|B) = P(A’) * P(B|A’) / (P(A) * P(B|A) + P(A’) * P(B|A’))
This is the core Bayes’ theorem.
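The counting intuition can be sketched in a few lines. The function name posterior_binary and the input numbers are my own assumptions for illustration; the point is that the answer does not change with the number of trials.

```python
from fractions import Fraction

def posterior_binary(p_a, p_b_given_a, p_b_given_not_a, trials=1):
    """Return (P(A|B), P(A'|B)) by counting expected B outcomes inside A and A'.

    Assumes B is possible at all, i.e. the total count of B is not zero.
    """
    b_in_a = p_a * p_b_given_a * trials                 # possibilities for B within A
    b_in_not_a = (1 - p_a) * p_b_given_not_a * trials   # possibilities for B within A'
    total_b = b_in_a + b_in_not_a                       # total possibilities for B
    return b_in_a / total_b, b_in_not_a / total_b

# Hypothetical inputs (assumptions for illustration only).
p_a = Fraction(3, 10)
p_b_given_a = Fraction(8, 10)
p_b_given_not_a = Fraction(1, 10)

post_a, post_not_a = posterior_binary(p_a, p_b_given_a, p_b_given_not_a)
print(post_a, post_not_a)          # 24/31 7/31

# With 1000 trials the counts are 240 and 70 out of 310, and the ratio is the same.
print(posterior_binary(p_a, p_b_given_a, p_b_given_not_a, trials=1000) == (post_a, post_not_a))  # True
```

Note that the two posteriors add up to 1, because once B is known to have happened, it must have happened inside A or inside A’.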
Multi-event case: This is rarely explained in detail in public articles.
Let’s say there are three events: X, Y and Z.
X, Y and Z are exclusive and together cover the entire sample space (a partition).
P(X|B) = P(X) * P(B|X) / (P(X) * P(B|X) + P(Y) * P(B|Y) + P(Z) * P(B|Z))
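The same pattern extends to any number of exclusive and exhaustive events. Below is a minimal sketch; the function name posterior and all the probabilities are assumptions made up for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(H|B) for every hypothesis H in a partition.

    priors[h] is P(h); likelihoods[h] is P(B|h). The priors must sum to 1.
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}   # P(h) * P(B|h)
    p_b = sum(joint.values())                                 # total probability of B
    return {h: j / p_b for h, j in joint.items()}

# Hypothetical inputs (assumptions for illustration only).
priors = {"X": Fraction(1, 2), "Y": Fraction(3, 10), "Z": Fraction(1, 5)}
likelihoods = {"X": Fraction(1, 10), "Y": Fraction(2, 5), "Z": Fraction(7, 10)}

result = posterior(priors, likelihoods)
print(result["X"], result["Y"], result["Z"])   # 5/31 12/31 14/31
```

Even though X starts with the largest prior, the evidence B is most likely under Z, so Z ends up with the largest posterior.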
Multi-event case, but B comes from another random variable (Y), just like our dice example:
Here B does not fall in the same sample space at all. It is from a different random variable (Y). Note that X and Y now name random variables, not single events.
The same equation holds. In this case, imagine a table with the values of X as the row index and the values of Y as the column index, where each cell holds the joint probability P(X=x, Y=y). X and Y can be independent or dependent.
Table:
        Y=10    Y=11    Y=12
X=1
X=2
X=3
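Here is a sketch of how such a table can be used. The joint probabilities filling the cells are invented for illustration (they only need to add up to 1), and the helper name posterior_x_given_y is hypothetical.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(X=x, Y=y), written in hundredths.
# Rows are X = 1, 2, 3; columns are Y = 10, 11, 12.
raw = {
    1: {10: 10, 11: 5,  12: 15},
    2: {10: 20, 11: 10, 12: 5},
    3: {10: 5,  11: 25, 12: 5},
}
joint = {x: {y: Fraction(v, 100) for y, v in row.items()} for x, row in raw.items()}

def posterior_x_given_y(joint, y):
    """Return P(X=x | Y=y) for every x: take column y and divide by its total."""
    column = {x: row[y] for x, row in joint.items()}   # P(X=x, Y=y) = P(X=x) * P(Y=y|X=x)
    p_y = sum(column.values())                         # P(Y=y), the column total
    return {x: p / p_y for x, p in column.items()}

post = posterior_x_given_y(joint, 11)
print(post[1], post[2], post[3])   # 1/8 1/4 5/8
```

Each cell already equals P(X=x) * P(Y=y|X=x), so dividing a cell by its column total is exactly the multi-event formula with B taken as the event Y=11.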
Continuous Case:
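When X is continuous and the evidence B is discrete (x is a particular value of X):
P.D.F(X=x | B=b) = P.D.F(X=x) * P(B=b | X=x) / P(B=b)
When both X and B are continuous:
P.D.F(X|B) = P.D.F(X) * P.D.F(B|X) / P.D.F(B)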
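A small numeric sketch of the first formula (continuous X, discrete evidence B). The setup is my own assumption: X is the unknown bias of a coin with a uniform prior on [0, 1], and the evidence B is "one toss came up heads". The denominator P(B=b) is the total probability again, only now the sum over events becomes an integral over x, approximated here with a midpoint rule.

```python
def prior_pdf(x):
    """Assumed prior: uniform density on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def likelihood_heads(x):
    """P(B=heads | X=x) for a coin whose bias is x."""
    return x

def evidence(n=1000):
    """Approximate P(B=heads) = integral of prior_pdf(x) * likelihood_heads(x) dx."""
    dx = 1.0 / n
    midpoints = ((i + 0.5) * dx for i in range(n))
    return sum(prior_pdf(x) * likelihood_heads(x) for x in midpoints) * dx

P_HEADS = evidence()

def posterior_pdf(x):
    """P.D.F(X=x | B=heads) = P.D.F(X=x) * P(B=heads | X=x) / P(B=heads)."""
    return prior_pdf(x) * likelihood_heads(x) / P_HEADS

print(round(P_HEADS, 6))              # 0.5
print(round(posterior_pdf(0.75), 6))  # 1.5
```

Under these assumptions the posterior density works out to 2x: after seeing heads, larger biases become more plausible than smaller ones.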