I Taught My Mom Bayes’ Theorem In 10 Minutes. She Taught Me Patience.

Bayesian Inference Explained Like You’re Guessing Who Ate Your Leftovers

Let me tell you about the worst Wednesday of my life.

I’m sitting at my desk. My left sock is wet for reasons I still don’t understand. My dog is losing his mind at a squirrel outside. And I’m staring at a Wikipedia page about “Bayesian inference” that reads like it was written by someone who actively hates the concept of fun.

You’ve been doing Bayesian inference your whole life. Someone owes you a degree.

Here’s the sentence that broke me:

“Bayesian inference is a method of statistical inference in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence or information becomes available.”

Cool. Very helpful. I’ll just file that next to my will to live.

But here’s the thing. That sentence? It’s actually describing something you did this morning. And yesterday. And every single day of your chaotic little life.

Let me prove it.

You wake up. You look outside. The sky is grey. You think: “Probably gonna rain.” That’s your prior. It’s your starting guess based on vibes and experience.

Then you check your phone. The weather app says 90% chance of rain. That new information? That’s your likelihood. It’s the evidence.

Now your brain goes: “OK, grey sky PLUS the weather app says rain? Yeah, I’m grabbing the umbrella.” That updated belief? That’s your posterior.

Congrats. You just did Bayesian inference. No PhD required. No Greek letters. Just a grey sky and a phone.
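Want to see those vibes as actual numbers? Here's a tiny sketch of the umbrella decision. Every number in it is invented for illustration, but the update step is the real deal:

```python
# The umbrella decision as actual Bayes (all numbers invented).
p_rain = 0.5  # prior: grey skies mean rain about half the time, in my experience

# Likelihood: how often the app says "rain" in each world (assumed accuracy).
p_app_rain_given_rain = 0.9  # the app usually catches real rain
p_app_rain_given_dry = 0.2   # but it cries wolf sometimes

# Total chance of seeing "rain" on the app today.
p_app_rain = (p_app_rain_given_rain * p_rain
              + p_app_rain_given_dry * (1 - p_rain))

# Posterior: Bayes' theorem, your belief after checking the phone.
posterior = p_app_rain_given_rain * p_rain / p_app_rain
print(f"Chance of rain after checking the app: {posterior:.0%}")  # 82%
```

Grey sky alone said 50%. Grey sky plus the app says 82%. That's the whole update.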

You’ve been a statistician this whole time. Someone owes you back pay.

The entire field of Bayesian statistics is built on this one dumb, beautiful loop:

Start with a guess. See some evidence. Update your guess.

That’s the whole game. The rest is just fancy notation to make professors feel important. (Love you, professors. Please don’t email me.)

Let’s go deeper, but like, in a fun way. Not in an “I’m going to explain set theory for 40 minutes” way.

Imagine you’re at a party. Your friend Jake has not shown up yet. You have some prior beliefs about Jake:

  • Jake is late to literally everything. (High prior for “Jake is late.”)
  • Jake said he was coming. (Mild prior for “Jake will show up eventually.”)
  • Jake’s car is held together by duct tape and prayer. (Moderate prior for “Jake’s car broke down.”)

Now someone texts you: “Jake’s car is on the side of the highway.”

BOOM. That’s your likelihood. New evidence just dropped.

Your brain instantly updates. The probability of “Jake’s car broke down” shoots through the roof. The probability of “Jake is just fashionably late” drops. Your posterior belief is now: “Jake is definitely stranded. Someone should call him.”

You didn’t do any math. But your brain just ran Bayes’ theorem in the background like a silent app you never downloaded.
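Your brain did this silently, but we can make it write a receipt. Every number below is invented (sorry, Jake); only the update mechanics are real:

```python
# The Jake update, written out (every number here is invented).
priors = {"just late": 0.6, "car broke down": 0.3, "not coming": 0.1}

# How likely is the text "Jake's car is on the side of the highway"
# under each story?
likelihoods = {"just late": 0.05, "car broke down": 0.9, "not coming": 0.05}

# Posterior is proportional to prior times likelihood; normalize to sum to 1.
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: p / total for h, p in unnormalized.items()}

for hypothesis, p in posteriors.items():
    print(f"{hypothesis}: {p:.0%}")
```

"Car broke down" jumps from 30% to nearly 90%. No calculator was harmed.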

Jake’s posterior probability of showing up just hit rock bottom. Pour one out.

This Is a Vending Machine

OK. Time for the official “smart person” definition. Brace yourself.

“In Bayesian statistics, the posterior distribution is proportional to the product of the prior distribution and the likelihood function, as described by Bayes’ theorem: P(θ|D) = P(D|θ) × P(θ) / P(D).”

I’m sorry. WHAT.

That definition is technically correct. It’s also completely useless if you don’t already know what it means. It’s like explaining a joke by reading the patent filing for humor.

Here’s why that definition is garbage for beginners: It assumes you know what θ means. It assumes you know what “proportional” means in context. It assumes you know what a “distribution” is. It assumes you ENJOY reading things that look like alien tax forms.

You don’t need any of that right now.

Here’s my version. Bayesian inference is a vending machine.

You walk up to the vending machine with some coins in your pocket. Those coins are your prior. It’s what you bring to the situation. Your existing belief. Maybe you think the vending machine mostly gives out chips because that’s what you got last time.

Now you press a button. The machine makes a weird noise. A light flickers. That whole process is the likelihood. It’s the machine reacting to reality. It’s the evidence grinding through the gears.

Then something falls out. That snack is your posterior. It’s your updated belief. Maybe you pressed the chips button but a candy bar fell out. Now you update: “Oh, this machine is a little unpredictable.”

Prior (what you walked in with) × Likelihood (what the machine showed you) = Posterior (what you walk away believing).

That’s the entire formula. That’s Bayes’ theorem. Two things multiplied to make a third, give or take a normalizing constant that makes the probabilities add up to 1. The Greek letters are just there to scare off people who would otherwise realize how simple this is.

The most important machine in statistics looks like it belongs in a bowling alley.

Let me kill one more myth while I’m here.

People think “prior” means “bias” and that’s bad. Nope. A prior is just your starting point. Everyone has one. Pretending you DON’T have a prior doesn’t make you objective. It makes you a liar. Bayesian inference is honest about the fact that you walked into the room with opinions. Then it forces you to update those opinions with data. That’s not bias. That’s growth.

The first step is admitting you walked in with opinions. The second step is math.

Let’s Do Some Code

Alright. Enough philosophy. Let’s build this thing step by step. I’m going to walk you through Bayesian inference using a real example, real (tiny) code, and absolutely zero suffering.

The Scenario: You run a small pizza shop. (Stay with me.) You want to know what percentage of your customers are going to order pineapple pizza. You have some beliefs, and you have some data. Let’s be Bayesian about it.

My cat just stepped on my keyboard and typed “rrrrrrr” into my code editor. I’m keeping it as a comment. He’s part of the team now.

Step 1: Set Your Prior (“What Do You Believe Before Seeing Data?”)

Before you look at a single order, what do you think? Maybe from your experience, about 30% of people order pineapple. Maybe you just have a gut feeling.

That gut feeling IS your prior. In math world, we’d say you have a Beta distribution centered around 0.3. But honestly? Just think of it as: “I believe roughly 30% of people are pineapple criminals.”

Why this matters for your sanity: The prior isn’t a fact. It’s a starting point. It’s OK to be wrong. That’s the whole point. The data will fix you.
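If you want to see that gut feeling as an actual object, here's a sketch. The 6 and 14 are my choice, not gospel; they just center the curve on 0.3:

```python
from scipy import stats

# My gut feeling as an actual object: a Beta(6, 14) prior.
# (The 6 and 14 are my choice; they center the curve on 0.3.)
prior = stats.beta(6, 14)
print(f"Prior mean: {prior.mean():.0%}")  # 6 / (6 + 14) = 30%

# The prior is a whole curve, not one number: it also encodes how
# sure I am. Beta(6, 14) carries roughly the confidence of having
# already watched 20 customers order.
```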

Step 1: Admit what you believe. Even if it’s controversial.

Step 2: Collect Evidence (“Look at the Actual Data”)

Now you open the register. Out of your last 20 customers, 8 ordered pineapple. That’s 40%.

Huh. Higher than you thought.

This data is your likelihood. It tells you how probable this specific data is, given different possible values of the true pineapple rate. The data doesn’t care about your feelings. It just shows up and tells the truth.

Why this matters for your sanity: You can’t argue with orders. The data is the data. Your job now is to let it shift your belief. Not replace it. Shift it.
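You can ask the likelihood question directly: under which "true" pineapple rate would 8-out-of-20 be least surprising? A quick sketch:

```python
from scipy import stats

# The likelihood question: under which "true" pineapple rate would
# 8 yes-votes out of 20 customers be least surprising?
data_yes, n = 8, 20
likelihood = {rate: stats.binom.pmf(data_yes, n, rate)
              for rate in (0.2, 0.3, 0.4, 0.5)}

for rate, like in likelihood.items():
    print(f"true rate {rate:.0%}: P(8 of 20) = {like:.3f}")
```

No prizes for guessing: the data is least surprised when the true rate is 40%, exactly the fraction we observed. That's the likelihood showing up with receipts.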

The data doesn’t care about your priors. It just shows up with receipts. Literally.

Step 3: Update Your Belief (“Let the Math Cook”)

Here’s where the vending machine does its thing.

Your prior said 30%. The data said 40%. The posterior is going to land somewhere in between. Not exactly at 30%. Not exactly at 40%. Somewhere that respects BOTH your experience AND the evidence.

That’s the beauty. Bayesian inference doesn’t throw away your prior knowledge. It blends it with new evidence. If you have a LOT of prior experience, your belief won’t move much. If you have very little prior experience but a TON of data, the data wins.

It’s like a tug of war between your gut and reality. And the posterior is wherever the rope ends up.

Why this matters for your sanity: You don’t have to choose between “trust the data” and “trust your experience.” Bayesian inference says: trust both. But proportionally.
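Here's the tug of war with actual rope. Both priors below are centered on 30% (their strengths are invented for illustration); the only difference is how hard they pull:

```python
# The tug of war, with actual rope. Both priors are centered on 30%;
# their strengths are invented for illustration.
data_yes, data_no = 8, 12  # 40% pineapple in the register

weak_a, weak_b = 3, 7         # "30%-ish, but I've barely watched anyone"
strong_a, strong_b = 60, 140  # "30%, and I've run this shop for years"

# Posterior mean of the updated Beta: (a + yes) / (a + b + yes + no).
weak_mean = (weak_a + data_yes) / (weak_a + weak_b + data_yes + data_no)
strong_mean = (strong_a + data_yes) / (strong_a + strong_b + data_yes + data_no)

print(f"Weak prior, posterior mean:   {weak_mean:.1%}")   # pulled toward 40%
print(f"Strong prior, posterior mean: {strong_mean:.1%}") # barely moves off 30%
```

Weak prior: the posterior lands near the data's 40%. Strong prior: it barely budges off 30%. Same rope, different grip.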

The most civilized fight in all of mathematics.

Step 4: Write the Code

Here it is. The whole thing. In Python. Breathe.

from scipy import stats

# My prior: I THINK about 30% of people like pineapple pizza.
# Using a Beta distribution because it's bounded between 0 and 1.
prior_a, prior_b = 6, 14  # shape params that center the prior on 0.3

# The data: 8 pineapple orders out of 20 customers.
pineapple_yes = 8   # the brave ones
pineapple_no = 12   # the cowards (kidding)

# THE UPDATE: just add the data counts to the prior params.
# This is the entire magic trick. Seriously. That's it.
posterior_a = prior_a + pineapple_yes  # 6 + 8 = 14
posterior_b = prior_b + pineapple_no   # 14 + 12 = 26

# What does our updated belief look like?
posterior = stats.beta(posterior_a, posterior_b)
print(f"Updated belief: ~{posterior.mean():.1%} of people want pineapple")
# Output: Updated belief: ~35.0% of people want pineapple

# rrrrrrr (cat contribution, keeping it)

That’s it. That is the entire Bayesian update. You added the data to the prior. The posterior appeared. No calculus. No tears. Just addition.

Why this matters for your sanity: The Beta-Binomial model is the “Hello World” of Bayesian inference. If you understand THIS, you understand the core mechanic. Everything else is just fancier vending machines.

A handful of lines. No calculus. One cat. This is peak data science.

Step 5: Repeat Forever (“Today’s Posterior Is Tomorrow’s Prior”)

Here’s the best part. Think you’re done?

NOPE.

Tomorrow you get 20 more customers. You take today’s posterior and use it as tomorrow’s prior. Then you update again. And again. And again.

Every single day, your belief gets sharper. More data, less chaos. Your vending machine gets better at predicting what snack is going to fall out.

This is the Bayesian cycle. It never stops. And that’s the whole point. You’re never “done” learning. You’re always updating.

Why this matters for your sanity: Bayesian inference isn’t a one-time calculation. It’s a lifestyle. (I’m sorry for how that sounded, but it’s true.)
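Here's a week of the loop, with made-up daily order counts. Watch today's posterior become tomorrow's prior, over and over:

```python
# A week of the loop, with made-up daily orders. Today's posterior
# becomes tomorrow's prior, over and over.
a, b = 6, 14  # starting prior, centered on 30%
week = [(8, 12), (7, 13), (9, 11), (6, 14), (10, 10)]  # (yes, no) per day

for day, (yes, no) in enumerate(week, start=1):
    a, b = a + yes, b + no  # the whole update, again
    print(f"Day {day}: belief = {a / (a + b):.1%}")
```

Each day the counts grow, and the belief curve gets narrower. More data, less chaos.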

The Bayesian cycle: guess, learn, update, repeat. Also known as ‘being a functioning human.’

The Documentation Secret

Here’s what the textbooks won’t tell you because it would ruin the mystique:

Bayesian inference with conjugate priors (like the Beta-Binomial we just used) is literally just addition.

That’s it. That’s the secret. The “scary” part of Bayesian stats (choosing priors, computing posteriors) has a massive shortcut for a huge number of real-world problems. You pick a prior from the same family as your likelihood, and the update becomes plug-and-chug arithmetic.

The official documentation buries this under 50 pages of theory because if you knew how easy it was, you’d finish the chapter in 10 minutes and they couldn’t sell you a 400-page textbook.
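Don't believe me? Here's the "massive shortcut" as a function. The entire Beta-Binomial update, no 400 pages required:

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """The entire conjugate Beta-Binomial update. Yes, really."""
    return prior_a + successes, prior_b + failures

# Yesterday's pineapple numbers, one line, no calculus:
print(beta_binomial_update(6, 14, 8, 12))  # (14, 26)
```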

I found this out at 1 AM on a Tuesday while my roommate was microwaving something that smelled like regret. Changed my entire relationship with statistics.

The Golden Rule:

If you can guess, and you can count, you can do Bayesian inference. Everything else is just decoration.

400 pages to hide three words. The academic-industrial complex strikes again.

Scars-Over-Theory Tips (Stuff I Learned the Hard Way)

  • Your prior doesn’t need to be perfect. It needs to exist. A bad prior with good data still converges to the truth. A “no prior” is just a hidden prior you’re pretending isn’t there.
  • Start with the Beta-Binomial. It’s the gateway drug of Bayesian inference. Get comfortable here before you wander into Markov Chain Monte Carlo territory. Trust me.
  • If your posterior looks identical to your prior, you don’t have enough data. Go collect more. This is not a math problem. This is a “go outside and talk to customers” problem.
  • Bayes’ theorem is not just for nerds. Doctors use it to diagnose diseases. Spam filters use it to catch junk mail. Your brain uses it to decide if that noise downstairs is a burglar or your cat knocking over a glass. Again.
  • Read the Wikipedia page for “conjugate prior.” It’s a cheat sheet for which priors make the math easy. Bookmark it. Love it. It will save you hours.
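That doctor bullet deserves its own numbers. Here's the classic rare-disease sketch (all rates invented): even a decent test gives a surprisingly low posterior, because the base rate is the prior, and the prior pulls hard.

```python
# The doctor bullet, with invented rates: a disease 1% of people
# have, and a test that's 95% sensitive with a 5% false-alarm rate.
p_sick = 0.01
p_pos_given_sick = 0.95
p_pos_given_healthy = 0.05

# Total chance of a positive test, sick or not.
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# Bayes: how worried should you actually be after a positive test?
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(f"Chance you're actually sick: {p_sick_given_pos:.0%}")  # 16%
```

A positive test takes you from 1% to about 16%, not to 95%. Skipping the prior is how people get this wrong.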

Here’s what nobody tells you about Bayesian inference.

It’s freeing.

Frequentist statistics asks you to pretend you have no opinions. To be a blank slate. To only speak when the p-value gives you permission.

Bayesian statistics says: “Bring your whole self. Bring your experience. Bring your gut. Bring your messy, imperfect, human intuition. And then let the data help you grow.”

That’s not just a math framework. That’s a philosophy. It says it’s OK to be wrong. It says learning is a process, not a destination. It says your past experience has value, and new evidence has value, and the truth is usually somewhere in between.

I don’t know about you, but that feels a lot more like real life than any p-value ever did.

You were never meant to be a blank slate. You were meant to update.

Bayesian inference is not a math problem. It’s the math of being a person who learns.

If this made Bayesian inference even 1% less scary for you, hit that clap button like it’s a prior you’re updating. Follow me for more “I can’t believe that’s all it was” moments in data science.

And seriously, go check out the conjugate prior Wikipedia page. You’ll thank me later.


I Taught My Mom Bayes’ Theorem In 10 Minutes. She Taught Me Patience. was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
