A floating point number is a real number that uses decimals to represent a fraction. The term floating point refers to the decimal point.
When reading this, my math-teacher brain said, “Okay, so…a decimal. Cool.” But a few months later, I ran into a StackOverflow question that challenged my simple categorization of floating point numbers. The heart of the question was this image:
The original StackOverflow question had an excellent response that explained a lot of the logistical reasons for these odd outputs. I recommend reading the top answer there in its original form, but to paraphrase: computers have to store numbers, and they have to strike a balance between precision and space. If I am 1/3″ taller than 5’9″ and I want to save my height as a variable, it’s unreasonable (and impossible) for the computer to save that never-ending 0.333… decimal…because it doesn’t really matter and would take up A TON of (well…infinite) space. Enter floating point to strike a compromise.
The next bit of this explanation comes thanks to The Floating Point Guide. A floating point number is made up of two parts: the significand stores the number’s digits and can be positive or negative. The exponent says where the decimal point is placed in relation to the significand. If you think back to learning about scientific notation in high school chemistry, you’re on the right track. Here’s a handy diagram, again from The Floating Point Guide:
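The significand-plus-exponent split can also be poked at in code. This is only a sketch, and it leans on an assumption worth flagging: it uses Python’s base-10 Decimal type (not a binary float), because that’s the closest thing to the scientific notation from chemistry class. The number -123.45 is just an example value.

```python
from decimal import Decimal

# Decimal keeps a number as a sign, a run of digits, and an exponent,
# which is the same idea as scientific notation.
sign, digits, exponent = Decimal("-123.45").as_tuple()

print(sign)      # 1 (1 means negative, 0 means positive)
print(digits)    # (1, 2, 3, 4, 5)
print(exponent)  # -2 (the decimal point sits two places from the right)

# Putting the three parts back together gives the original number.
rebuilt = Decimal((sign, digits, exponent))
print(rebuilt)   # -123.45
```

The digits never change when the point moves; only the exponent does. That’s the “floating” part.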
There’s also a whole standard called IEEE 754 that dictates exactly how all of this goes down, which you are free to dive into. After reading all about scientific notation, I still wanted to know EXACTLY WHY I wasn’t getting 0.3 when I typed in 0.1 + 0.2. WHY?
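Here’s that exact surprise as a quick sketch in Python (my pick for illustration; the same thing happens in any language that follows IEEE 754). One thing to keep in mind: Python’s float is a 64-bit double, so the digits below are specific to that size.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
print(total)         # 0.30000000000000004
print(total == 0.3)  # False

# Decimal(some_float) reveals the exact value the computer stored.
print(Decimal(0.1))  # slightly more than 0.1
print(Decimal(0.2))  # slightly more than 0.2
print(Decimal(0.3))  # slightly less than 0.3

# The usual workaround: compare with a tolerance instead of ==.
print(math.isclose(total, 0.3))  # True
```

Neither 0.1 nor 0.2 is stored exactly, so the sum lands on a neighbor of the float that 0.3 is stored as.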
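Well…floating point numbers are just a variation of scientific notation. Essentially, they use base 2 instead of base 10, and in the single-precision format they’re stored as 32 bits. The first bit is for the sign, the next 8 bits are for the exponent, and the remaining 23 bits (called the mantissa…nerdy baby name anyone?) are for the significant digits of the number. (Double precision, which is what Python and JavaScript use for numbers like 0.1, works the same way with 64 bits: 1 for the sign, 11 for the exponent, and 52 for the mantissa.) And that’s the WHY: just like 1/3 turns into a never-ending 0.333… in base 10, 0.1 and 0.2 turn into never-ending fractions in base 2, so they get rounded to fit the available bits, and those tiny rounding errors show up in the sum. Here’s a visual: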
There’s an equation to calculate the value of the number given the sign bit, the mantissa, and the exponent, but I’ll spare you.
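If you’d rather not be spared, here’s a rough sketch of pulling those three fields out of a 32-bit float and plugging them back into the equation. The helper names decompose_f32 and rebuild_f32 are made up for this example, and the rebuild step only handles “normal” numbers (it ignores zero, subnormals, infinity, and NaN).

```python
import struct

def decompose_f32(x: float):
    """Split x, stored as a 32-bit float, into its three bit fields."""
    # Pack as a 4-byte float, then reread the same bytes as an integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31               # top 1 bit
    exponent = (bits >> 23) & 0xFF  # next 8 bits
    mantissa = bits & 0x7FFFFF      # last 23 bits
    return sign, exponent, mantissa

def rebuild_f32(sign: int, exponent: int, mantissa: int) -> float:
    """Value of a normal float32 (exponent field not 0 or 255)."""
    # The stored exponent is offset by 127, and the mantissa has an
    # implied leading 1 in front of its 23 bits.
    return (-1) ** sign * (1 + mantissa / 2**23) * 2.0 ** (exponent - 127)

sign, exponent, mantissa = decompose_f32(0.1)
print(f"{sign:01b} {exponent:08b} {mantissa:023b}")
print(rebuild_f32(sign, exponent, mantissa))  # close to 0.1, but not exactly 0.1
```

The rebuilt value is the nearest number to 0.1 that 23 mantissa bits can express, which is why it comes out a hair off.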
Getting bored? Cool. Let’s talk Mario! Seriously.
Super Mario 64 has a crazy glitch that is CAUSED BY FLOATING POINT NUMBERS BEING CRAZY. Up above, we described the floating point system as a compromise: I want to be able to store a ton of numbers but not use up a ton of space. One of the weird implications, though, is that the numbers you can store are not evenly spaced across a number line. As it turns out, I can represent a ton of numbers that are close to 0, but fewer and fewer as we move out towards infinity.
Every stretch between one power of 2 and the next (1 to 2, 2 to 4, 4 to 8, and so on) contains the same number of possible float values, which means that each time you move up a power of 2, the distance between neighboring float values doubles. That adds up fast (exponentially!). Cool. For small-ish numbers, this really doesn’t matter much. If Mario’s coordinates are represented as X and Y where both are floating point numbers, we can just round to the nearest one. If Mario is very close to (0, 0), the rounding is going to be teeny tiny and completely unnoticeable. As he gets farther away, the game might look a bit jumpy as his X coordinate stays the same…then suddenly rounds up to the next floating point. At some point, he can’t move anymore! The distance between floating point numbers is so great that every step he tries to take gets rounded right back to where he started, and poor Mario is stuck.
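To see the widening gaps for yourself, here’s a small sketch that measures the distance to the next 32-bit float at a few positions. Big caveat: the positions and the 0.5-unit step are numbers I picked for illustration, not values from the actual game, and to_f32 and gap_above are hypothetical helpers written just for this example.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (64 bits) to the nearest 32-bit float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def gap_above(x: float) -> float:
    """Distance from a positive x to the next larger 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Adding 1 to the raw bit pattern steps to the neighboring float.
    next_up = struct.unpack(">f", struct.pack(">I", bits + 1))[0]
    return next_up - to_f32(x)

for position in (1.0, 1024.0, 1048576.0, 16777216.0):
    print(f"near {position:>12,.0f} the next float is {gap_above(position)} away")

# A made-up Mario taking 0.5-unit steps very far from the origin.
x = 16777216.0  # 2**24
step = 0.5
moved = to_f32(x + step)
print(moved == x)  # True: the step rounds back to where he started
```

Near 1, the neighbors are about a ten-millionth apart. Out at 2**24 they are 2 whole units apart, so a half-unit step has nowhere to land.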
For a truly awesome explanation of floating point math and examples of what this looks like in gameplay, check out this video by UncommentatedPannen.