Bayes’ Theorem Simplified

Aravind Brahmadevara
3 min read · Aug 19, 2023

My Goal: Read once and Remember for a lifetime. My focus is on stressing the right foundational principles.

Sample space: All possible results (outcomes) of a random experiment

Event: A group of outcomes (a set)

Exclusive events: Non-overlapping events (no outcome belongs to more than one of them)

Exclusive and exhaustive events: Non-overlapping events that together cover the entire sample space. Mathematically, this is called a partition of the sample space.
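These definitions map directly onto Python sets. Here is a minimal sketch, assuming one roll of a six-sided die as the random experiment. The die, the event names and the two helper functions are illustrative choices, not part of the definitions above.

```python
# Sample space: every outcome of one roll of a six-sided die (assumed example).
sample_space = {1, 2, 3, 4, 5, 6}

# Events are subsets (groups of outcomes) of the sample space.
even = {2, 4, 6}
odd = {1, 3, 5}
at_least_five = {5, 6}


def are_exclusive(*events):
    """True if no outcome belongs to more than one of the events."""
    seen = set()
    for event in events:
        if seen & event:      # an outcome already seen in an earlier event
            return False
        seen |= event
    return True


def is_partition(space, *events):
    """True if the events are exclusive and together cover the whole space."""
    return are_exclusive(*events) and set().union(*events) == space


print(are_exclusive(even, odd))                          # True
print(is_partition(sample_space, even, odd))             # True
print(is_partition(sample_space, even, at_least_five))   # False (6 is in both, 1 and 3 in neither)
```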

Simple Binary Case:

One event: A (hypothesis)

Another event: B (evidence). In general, B could come from a different random variable.

To keep things simple, assume event B is also from the same sample space (same random variable).

Question to yourself: Do A and B cover the entire sample space? No

Do A and A’ (A’s complement) cover the entire sample space? Yes

So, we are dealing with A and A’ here. B can occur in A or A’. (The name binary comes from A and A’, not from A and B.)

Now comes Bayes’ theorem. We have evidence (B); what is the probability that A happened?

Analytical understanding (SINGLE EVENT)

We know B falls in A or A’. Say the ONE event turns out to be in A, which has probability P(A). Then, within A, what is the probability that the same event is also in B? That is P(B|A). So the probability that the event is in both A and B is

P(A and B) = P(A) * P(B|A)

Remember we are still in the scope of one event, so conditional probability kicks in.

If there were two sequential events, it would have been something like P(A) * P(B), where A is drawing a King from a pack of cards and B is a separate event such as rolling a 4 on a 6-sided die. (Here A and B are independent. If there is a dependency, conditional probability kicks in again for sequential events.)
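The card-and-die example can be checked with exact fractions. This is a small sketch that assumes a standard 52-card pack and a fair die. The second half, drawing two Kings in a row without replacement, is an added illustration of the dependent case and is not taken from the text.

```python
from fractions import Fraction

p_king = Fraction(4, 52)   # 4 Kings in a standard 52-card pack (assumed)
p_four = Fraction(1, 6)    # one face out of six on a fair die (assumed)

# Independent sequential events: the joint probability is the plain product.
p_king_then_four = p_king * p_four
print(p_king_then_four)    # 1/78

# Dependent sequential events (illustrative): two Kings in a row, no replacement.
# The second factor is conditional on the first draw having been a King.
p_second_king_given_first = Fraction(3, 51)
p_two_kings = p_king * p_second_king_given_first
print(p_two_kings)         # 1/221
```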

Single Event Continued:

Either A happens or A’ happens. Both are exclusive. If A happens, what is the probability of B happening (that is, the event also belongs to B)? As above, the probability of A and B together is P(A) * P(B|A).

Now say A’ happened, with probability P(A’). Within A’, the probability of B is P(B|A’), so the probability of A’ and B together is P(A’) * P(B|A’).

So the total probability of B across the two exclusive events (A or A’) is

P(B) = P(A) * P(B|A) + P(A’) * P(B|A’)
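The total probability formula above is a one-line function. The sketch below uses made-up numbers (P(A) = 1/100, P(B|A) = 95/100, P(B|A’) = 5/100) purely to show the arithmetic. The function name is also just an illustrative choice.

```python
from fractions import Fraction


def total_probability(p_a, p_b_given_a, p_b_given_not_a):
    """P(B) = P(A) * P(B|A) + P(A') * P(B|A'), with P(A') = 1 - P(A)."""
    p_not_a = 1 - p_a
    return p_a * p_b_given_a + p_not_a * p_b_given_not_a


# Illustrative numbers only.
p_b = total_probability(
    p_a=Fraction(1, 100),
    p_b_given_a=Fraction(95, 100),
    p_b_given_not_a=Fraction(5, 100),
)
print(p_b)  # 59/1000
```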

Possibilities for B = P(B) * number of trials. This is the expected number of times B occurs if the experiment is repeated that many times. For a single event the number of trials is 1, but that subtlety can be skipped for now.

Now, let’s come to the actual event. We know that B happened (evidence). Total possibilities for B are P(B) * number of trials.

Intuition: Assume event A has occurred. What are the possibilities of B within A? P(A) * P(B|A) * number of trials

P(A|B) = P(A) * P(B|A) * number of trials / Total possibilities for B

The number of trials cancels in the numerator and denominator, leaving

P(A|B) = P(A) * P(B|A) / (P(A) * P(B|A) + P(A’) * P(B|A’))

Intuition: Assume event A’ has occurred. What are the possibilities of B within A’? P(A’) * P(B|A’) * number of trials

P(A’|B) = P(A’) * P(B|A’) * number of trials / Total possibilities for B

The number of trials cancels in the numerator and denominator, leaving

P(A’|B) = P(A’) * P(B|A’) / (P(A) * P(B|A) + P(A’) * P(B|A’))

This is the core Bayes’ theorem.
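The "possibilities" argument can be replayed with expected counts. The sketch below uses illustrative numbers (P(A) = 1/100, P(B|A) = 95/100, P(B|A’) = 5/100) and a hypothetical helper name. It shows that the number of trials drops out and that the two posteriors add up to 1.

```python
from fractions import Fraction


def posterior_by_counting(p_a, p_b_given_a, p_b_given_not_a, trials):
    """Return (P(A|B), P(A'|B)) from expected counts over `trials` repetitions."""
    b_within_a = p_a * p_b_given_a * trials                # possibilities of B within A
    b_within_not_a = (1 - p_a) * p_b_given_not_a * trials  # possibilities of B within A'
    total_b = b_within_a + b_within_not_a                  # total possibilities for B
    return b_within_a / total_b, b_within_not_a / total_b


# Illustrative numbers only.
args = (Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))

p_a_given_b, p_not_a_given_b = posterior_by_counting(*args, trials=10_000)
print(p_a_given_b, p_not_a_given_b)      # 19/118 99/118
print(p_a_given_b + p_not_a_given_b)     # 1

# A different number of trials gives the same answer: it cancels out.
print(posterior_by_counting(*args, trials=1) == (p_a_given_b, p_not_a_given_b))  # True
```

With 10,000 trials the expected counts are 95 cases of B inside A and 495 cases of B inside A’, so P(A|B) = 95 / 590 = 19/118.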

Multi-event case: This is not explained in detail in most public articles.

Let’s say there are three events: X, Y, Z

X, Y and Z are exclusive and together cover the entire sample space (a partition)

P(X|B) = P(X) * P(B|X) / (P(X) * P(B|X) + P(Y) * P(B|Y) + P(Z) * P(B|Z))
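The same formula works for any number of hypotheses, as long as they partition the sample space. This sketch stores the priors and likelihoods in dictionaries. The function name and all the numbers are assumptions made for illustration.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Posterior P(H|B) for every hypothesis H in a partition.

    priors[h]      -> P(H)
    likelihoods[h] -> P(B|H)
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}   # P(H) * P(B|H)
    evidence = sum(joint.values())                            # P(B), total probability
    return {h: joint[h] / evidence for h in joint}


# Illustrative numbers only; the priors must add up to 1.
priors = {"X": Fraction(1, 2), "Y": Fraction(3, 10), "Z": Fraction(1, 5)}
likelihoods = {"X": Fraction(1, 100), "Y": Fraction(2, 100), "Z": Fraction(3, 100)}

posteriors = bayes_posteriors(priors, likelihoods)
for name, value in posteriors.items():
    print(name, value)
# X 5/17
# Y 6/17
# Z 6/17
```

Note that X has the largest prior but the smallest posterior here, because B is least likely within X.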

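Multi-event case, but B comes from another random variable, Y (just like our dice example)

Here B does not fall in the same sample space at all. It is from a different random variable, Y. (From here on, X and Y denote random variables, not the events of the previous case.)

The same equation holds. In this case, imagine a table of joint probabilities with the values of X as the row index and the values of Y as the column index. Here X and Y can be independent or dependent. The hypotheses are the rows (X=1, X=2, X=3) and the evidence B is one column, such as Y=11.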

Table:

        Y=10    Y=11    Y=12
X=1
X=2
X=3
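To make the table concrete, the sketch below fills it with made-up joint probabilities P(X=x, Y=y) that add up to 1. It then computes P(X=x | Y=11) in two ways: by normalising the Y=11 column, and by the Bayes formula with row totals as priors. The cell values and function names are assumptions for illustration only.

```python
from fractions import Fraction

xs = [1, 2, 3]       # row index
ys = [10, 11, 12]    # column index

# Made-up cell values, in twentieths, one list per row of the table.
cells = {1: [2, 1, 1], 2: [3, 4, 1], 3: [1, 5, 2]}
joint = {(x, y): Fraction(c, 20) for x in xs for y, c in zip(ys, cells[x])}


def posterior_from_column(joint, xs, y):
    """P(X=x | Y=y): take column y and divide by its total, P(Y=y)."""
    column_total = sum(joint[(x, y)] for x in xs)
    return {x: joint[(x, y)] / column_total for x in xs}


def posterior_from_bayes(joint, xs, ys, y):
    """Same quantity via P(X) * P(Y=y|X) / sum over rows of P(X) * P(Y=y|X)."""
    prior = {x: sum(joint[(x, other)] for other in ys) for x in xs}   # row totals
    likelihood = {x: joint[(x, y)] / prior[x] for x in xs}            # P(Y=y | X=x)
    evidence = sum(prior[x] * likelihood[x] for x in xs)              # P(Y=y)
    return {x: prior[x] * likelihood[x] / evidence for x in xs}


by_column = posterior_from_column(joint, xs, 11)
by_bayes = posterior_from_bayes(joint, xs, ys, 11)
for x in xs:
    print(x, by_column[x])
# 1 1/10
# 2 2/5
# 3 1/2
print(by_column == by_bayes)  # True
```

Nothing in this calculation requires X and Y to be independent; the table only has to be a valid joint distribution.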

Continuous Case:

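When X is continuous and the evidence B is discrete:

P.D.F(X=x | B=b) = P.D.F(X=x) * P(B=b | X=x) / P(B=b)

Here P(B=b) is the integral over all x of P.D.F(X=x) * P(B=b | X=x), which plays the role of the sum in the denominator of the discrete case.

When both X and B are continuous:

P.D.F(X=x | B=b) = P.D.F(X=x) * P.D.F(B=b | X=x) / P.D.F(B=b)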
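The continuous case with discrete evidence can be approximated numerically by replacing the integral in the denominator with a sum over a fine grid. The sketch below assumes X lives on the interval [0, 1]. The example is an assumption, not from the text: X is the unknown heads-probability of a coin with a uniform prior, and the evidence B is "one toss came up heads". The exact posterior for that example is 2x.

```python
def posterior_pdf_on_grid(prior_pdf, likelihood, n=1000):
    """Approximate P.D.F(X=x | B=b) at n midpoints of the interval [0, 1].

    prior_pdf(x)  -> P.D.F(X=x)
    likelihood(x) -> P(B=b | X=x) for the fixed, observed b
    """
    width = 1.0 / n
    grid = [(i + 0.5) * width for i in range(n)]             # midpoints of n slices
    unnormalised = [prior_pdf(x) * likelihood(x) for x in grid]
    evidence = sum(unnormalised) * width                     # approximates P(B=b)
    posterior = [u / evidence for u in unnormalised]
    return grid, posterior, evidence


# Assumed example: uniform prior, evidence "one toss came up heads".
grid, posterior, evidence = posterior_pdf_on_grid(
    prior_pdf=lambda x: 1.0,   # uniform P.D.F on [0, 1]
    likelihood=lambda x: x,    # P(heads | X=x) = x
)

print(round(evidence, 6))         # 0.5
print(round(posterior[750], 3))   # about 1.501, i.e. 2 * x at x = 0.7505
```

The grid size n is a tuning choice; a finer grid gives a closer approximation for priors and likelihoods that are less smooth than this one.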

You can reach me at

Aravind Brahmadevara | LinkedIn

aravind-deva (aravind brahmadevara) (github.com)
