I think about random variables a lot.

Well, that’s not entirely true.

I think about random variables every time I find myself in this really specific social situation: when someone asks me how tall I am.

You see, the thing is, I honestly think I’m about 5’8.5”. I usually say “like, five-eight-and-a-half?” When I’ve gone to get a physical in the past, the staff will measure my height and either say “about five-eight” or “about five-nine” depending on my hair that day or how they’re feeling or whatever else. But when you’re a (slightly) shorter-than-average-height guy, you can’t really claim any half-inch bullshit, ‘cause it sorta seems like you’re trying to pass yourself off as taller than you are. Which I’m not really (at least not consciously), I just think I’m between five-eight and five-nine and how you call it sorta depends on the day.

And that makes me think about random variables.

## Discrete vs. continuous

So we’re gonna need a little bit of probability theory here. (Bear with me, I’m going somewhere with this.)

For our purposes, you can think of a random variable as something that tells you the probability that something will happen.

There’s two kinds of random variable, based on two distinct kinds of probabilities you might want to estimate.

One is discrete. Discrete random variables give you probabilities for things where there’s only so many outcomes that are possible. (Technically, “a finite or countably infinite number of outcomes”, but if you know the difference then you prolly don’t need to read this section.) The classic example is the result of a coin toss: it’ll either be heads or tails with a 50% chance for either. You could say x was your random variable for the chance of the coin landing “heads”, and p(x) would be the probability of x, eventually getting you to

$$p(x) = 0.5$$

(The standard practice is to use decimals instead of percentages in equations, because reasons. 0.5 = 50%, 0.75 = 75%, and so on, with 0 = 0% and 1 = 100%.)

Even if you wanna be that kid and ask about a coin landing perfectly on its side, you still get only three possible outcomes with something like a 49.9995%, 49.9995%, 0.001% split among them. So then p(x) would be 0.499995, unless you decided x was the probability of the coin landing on its side, in which case you’d have

$$p(x) = 0.00001$$

And so on.
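If it helps to see it spelled out, here's a minimal sketch of that three-outcome coin as a lookup table. The `coin` dictionary and the `p` helper are names I made up for illustration, and the numbers are just the made-up split from above. The one real rule on display is that the probabilities of all the possible outcomes have to add up to 1.

```python
import math

# Made-up split from the text: 49.9995% heads, 49.9995% tails, 0.001% side.
coin = {"heads": 0.499995, "tails": 0.499995, "side": 0.00001}

def p(outcome):
    """Look up the probability of a single outcome (hypothetical helper)."""
    return coin[outcome]

total = sum(coin.values())

print(p("heads"))
print(p("side"))
# The outcomes cover everything that can happen, so they should sum to 1
# (compared with a tolerance, since these are floating-point numbers).
print(math.isclose(total, 1.0))
```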

The other kind of random variable is continuous. Continuous random variables express the probability of something happening when there’s an infinite number of possible outcomes/values. While discrete random variables are most often used to talk about things in game theory and other artificial constructs, continuous random variables are used to talk about things that occur naturally, where there’s little-to-no inherent limitations on what could happen.

A good example would be something like temperature. We could ask “what’s the probability that it’ll be 62°F today?”, but what we probably mean by that is “what’s the probability that it’ll be around 62°F today?” After all, if it was 61.85°F, and someone asked you the temperature, you’d probably just tell them 62.

Any time something could take a whole range of values and we naturally sort of round those values up or down (like we do for temperature), it tends to be a continuous random variable.

Like if, say, you were a physician measuring the height of someone who was 68.5 inches tall—you’re about five-eight / about five-nine.

## What’s hiding in the math

So there’s the two kinds of random variables, and (as you might suspect) they end up having different math behind them.

Generally, discrete random variables end up being easier to work with because you can always just manually list out the possibilities and multiply the odds together. The chance that you’ll flip a coin three times in a row and get “heads” each time, for example, would be something like:

\begin{align} p(\text{three heads in a row}) & = p(\text{heads}) \times p(\text{heads}) \times p(\text{heads}) \\ & = 0.5 \times 0.5 \times 0.5 \\ & = 0.125 \\ & = 12.5\% \end{align}

…Which both intuitively seems about right and is easy to read in an equation.
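Here's a quick sketch of the "manually list out the possibilities" approach, assuming a fair coin. It gets the answer two ways: by multiplying the odds, and by literally listing all eight sequences of three flips and counting the one that's all heads. The variable names are mine, not anything standard.

```python
from itertools import product

P_HEADS = 0.5  # assuming a fair coin

# Way 1: multiply the odds of each flip together.
p_multiply = P_HEADS * P_HEADS * P_HEADS

# Way 2: list every possible sequence of three flips and count the winners.
flips = list(product("HT", repeat=3))  # ('H','H','H'), ('H','H','T'), ...
winners = [seq for seq in flips if seq == ("H", "H", "H")]
p_count = len(winners) / len(flips)

print(len(flips))   # 8
print(p_multiply)   # 0.125
print(p_count)      # 0.125
```

Counting only works this neatly because every sequence is equally likely with a fair coin.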

You can’t do that for continuous random variables.

For continuous random variables, you need calculus, because the probabilities are defined in terms of integrals. To calculate the probability that the temperature will be between 60 and 70°F today, you’d have to do something like this:

$$\int_{60}^{70} f(x) dx$$

…Where f(x) is a probability density function: a curve where the area underneath it between two values is the probability of landing between those values.
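To make that integral a little less abstract, here's a rough sketch that approximates it by adding up a bunch of skinny rectangles under the curve. Big assumption: I'm pretending today's temperature follows a normal curve centered at 65°F with a spread of 5°F, which I picked purely for illustration. `prob_between` is a made-up helper; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Assumed for illustration only: mean 65°F, standard deviation 5°F.
temperature = NormalDist(mu=65, sigma=5)

def prob_between(dist, low, high, steps=10_000):
    """Approximate the integral of dist.pdf from low to high (midpoint rule)."""
    width = (high - low) / steps
    heights = (dist.pdf(low + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width

approx = prob_between(temperature, 60, 70)
# Cross-check against the built-in cumulative distribution function.
exact = temperature.cdf(70) - temperature.cdf(60)

print(round(approx, 4))
print(round(exact, 4))
```

With these made-up numbers, both come out to roughly 0.68, or about a 68% chance of a day between 60 and 70°F.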

And it only gets more complicated from there. This is the equation behind the normal random variable, aka the most common type of continuous random variable:

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2} } e^{ -\frac{(x-\mu)^2}{2\sigma^2} }$$

This is how I remember feeling the first time I had to work with that equation:

But the interesting thing that’s buried in there—something that’s actually sort of glossed over a lot—is that the probability that a continuous random variable will be any exact, specific value is always 0.

So there might be a 5% chance that it’ll be around 62°F today, but there is an absolutely 0% chance that it will be exactly 62.0000000°F. In fact, whatever exact, specific temperature it is right now had a 0% chance of happening.
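You can watch this happen by shrinking what "around 62°F" means. This is a sketch under the same kind of made-up assumption as before (a normal curve with mean 65°F and spread 5°F, chosen only for illustration, so the actual percentages don't mean anything). `prob_within` is a hypothetical helper.

```python
from statistics import NormalDist

# Assumed for illustration only: mean 65°F, standard deviation 5°F.
temperature = NormalDist(mu=65, sigma=5)

def prob_within(dist, center, half_width):
    """Probability of landing within half_width of center."""
    return dist.cdf(center + half_width) - dist.cdf(center - half_width)

# Tighten the definition of "around 62" until it means "exactly 62".
half_widths = [0.5, 0.05, 0.005, 0.0005, 0.0]
probs = [prob_within(temperature, 62, w) for w in half_widths]

for w, pr in zip(half_widths, probs):
    print(f"62 ± {w}°F: {pr:.8f}")
```

Every time the window gets ten times narrower, the probability gets about ten times smaller, and at a width of zero there's nothing left.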

We wake up, every day, in a world filled with things that are statistically impossible.

We ourselves are made up of things (like a 68.5” height) that are statistically impossible.

## Beat the odds, beat the Feds, wouldn’t be wise to bet against the kid

As a general thing, I don’t really like when people talk about divisions in how we make sense of the world.

I might have hinted at this before when I ragged on Atwood’s reductionist take on a scene from Her, but it really gets me when people talk about the “rational” thinking of math or the sciences vs. those heathens who read books or—even worse!—go to church.

There is no purely “rational” basis in mathematics; arguably one of the most rational things about it is that it’s provably incomplete (thank you, Kurt Gödel!), and so we rationally know that there are things we just have to accept as true.

The probability that a continuous random variable takes an exact value cannot be mathematically defined as anything other than 0, for all kinds of reasons. An integral over a single point has zero width, giving every one of infinitely many points a positive probability would keep the total from coming out to 1, it breaks the same kind of logic where 0.999… = 1, etc. The probability that any specific value emerges has to be 0, even as the probability that it ends up in a given range can be greater than 0.
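In symbols, it looks something like this. I'm assuming a random variable X with density f (the same f(x) from the integral earlier), and a and b are just placeholder endpoints I'm introducing here:

$$P(X = a) = \int_{a}^{a} f(x)\,dx = 0 \qquad \text{but} \qquad P(a \le X \le b) = \int_{a}^{b} f(x)\,dx > 0 \quad \text{when } a < b \text{ and } f > 0 \text{ on } [a, b]$$

The first integral has no width, so there's no area to measure. The second one has width, so there is.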

At some point, all the zero probabilities in a range merge into something (literally) greater than themselves, and become a non-zero value.

Continuous random variables are a small corner in mathematics where something emerges from nothing, and we can’t do anything other than accept the something as axiomatically true.

This is something that happens constantly in life, and there are times when it’s useful to find either the emergent something or the base layer of nothing, but it’s important to not get too caught up in assigning too much weight either way.

Stay flexible.

Focus on finding what’s useful.