I remember my progression through science classes in school all feeling pretty good, until late in high school, when I got to relativity and subatomic theory.
“I know,” my physics teacher informed the class one day, “that we’ve been talking about protons and neutrons and electrons as though they’re discrete things, like miniature billiard balls, but we’re about to learn that that’s not quite how things are.” She then proceeded to explain that, to take electrons as just one example, they’re actually never observed between the different energy levels we’d learned about. Some time prior, we’d learned all about how electrons orbit their nucleus at different distances, and each distance corresponded to an energy level. And that moving between the energy levels was something that electrons did. But, in the world of our experience, there’s always a “between.” Not so, with electrons. Electrons move between two positions without ever appearing anywhere in between them. They just pop out of existence at energy level one and into existence at energy level two simultaneously.
“But wait,” our teacher informed us. “It gets even better.” Electrons can’t even be said to be spheres. We can’t even really pin them down exactly at any particular point above the “surface” of the atom’s nucleus. We know that there’s something that exists somewhere at a particular distance from the center of the atom, but it’s impossible to say exactly where, so it’s more accurate to say that there’s a kind of hazy “electron shell” that’s simultaneously “everywhere” around the atom at a particular distance from it.
Mind blown.
The subatomic world operates in ways that are completely foreign to our experience at our comparatively macroscopic level of existence. This realization was further reinforced by everything I learned about relativity and my brief introduction to quantum theory. It’s exactly this mood of uncertainty and unpredictable behavior that I tried to capture in my novel Schrödinger’s City.
For nearly a year now, I’ve been engaged in the pursuit of a degree in computer science (my reasons for this could fill an entire blog post, so I’ll set that aside for now). Given that I already have fourteen years of professional experience in software, and much more non-professional experience besides, I specifically chose a program that would allow me to accelerate through the material for which I could already demonstrate competency. I spent much of the first six months starting courses and passing them in a matter of a few days to a few weeks. However, the program hasn’t been without novel material.
One such course was Computer Architecture, which is essentially about how computer hardware works at a very low level, and about Assembly language, which is the programming language of instructions that are actually executable by a processor. When most software engineers talk about “coding,” they’re likely not talking about Assembly. Most programming these days is done with “higher-level” languages (think C++, Java, JavaScript, Rust, Python, etc.). These languages get automatically translated into Assembly-level instructions by tools called compilers (or run more directly by interpreters), and those low-level instructions are what the processor actually executes.
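To make that translation step concrete, here’s a tiny example of my own (not from the course material). The Assembly in the comment is roughly what a typical x86-64 compiler might emit, though the exact output varies by compiler and optimization settings:

// A one-line C function...
int add(int a, int b) {
    return a + b;
}

// ...which a typical optimizing x86-64 compiler might turn into something like:
//
//   add:
//       lea  eax, [rdi + rsi]   ; eax = a + b
//       ret                     ; hand the result back to the caller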
When I started learning the details of how those instructions get executed, I was in for the kind of surprises that sent me back to my high school days of trying to wrap my head around the fact that electrons could transport themselves through empty space and simultaneously be at all points that were a particular distance from an atom’s nucleus.
The first surprise was negative numbers. Computers are good at math, right? They’re certainly super fast at it. I definitely wasn’t expecting any surprises here. So, the computer is all 1s and 0s underneath, entities called “bits.” I figured that there was just a bit for which 1 meant “positive” and 0 meant “negative,” and that would be that.
Well, that scheme was tried, my textbook explained, and it was promptly abandoned. The biggest problem had to do with memory overflow, which is when a computation causes the computer to need to read or write more bits than it has access to in the moment (and even though memory sizes are really, really large at the RAM level these days, they remain extremely restricted at the level of the processor). The second problem had to do with representing the number zero itself, because having a bit for positive or negative meant that you’d end up with “negative zero” and “positive zero” floating around your system, which was apparently a source of bugs and a giant headache.
The solution? Well, it turns out that the best way to make the computer do simple addition and subtraction correctly is to “wrap” the negative numbers around from the top of the range allowed by the number of bits available, a scheme called two’s complement. So, instead of a “sign bit,” you end up with a bit that means (for a 64-bit number) something like, “subtract 9,223,372,036,854,775,808 from this number to get the actual intended number.” This monstrosity somehow works better than a bit that means, “make this number negative.”
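Here’s a little C sketch of my own (not from the textbook) that shows the trick in action: the same 64 bits read two different ways, with the “sign” bit really behaving like “subtract 9,223,372,036,854,775,808”:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int64_t x = -5;

    // The same 64 bits, read as a plain unsigned number, are a huge value:
    // -5 is stored as 2^64 - 5.
    uint64_t bits = (uint64_t)x;
    printf("signed reading:   %lld\n", (long long)x);
    printf("unsigned reading: %llu\n", (unsigned long long)bits);

    // Two's complement in one line: the top bit counts as
    // -9,223,372,036,854,775,808 (that is, -2^63) instead of +2^63;
    // every other bit keeps its usual value.
    uint64_t low63 = bits & 0x7FFFFFFFFFFFFFFFULL; // value of the lower 63 bits
    int top_set = (int)(bits >> 63);               // is the top bit on?
    long long rebuilt = (long long)low63 + (top_set ? INT64_MIN : 0);
    printf("rebuilt value:    %lld\n", rebuilt);   // prints -5 again

    return 0;
}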
But this is small potatoes compared to how the processor handles instructions more generally.
I think that by now, most people, even those not familiar with code, are aware that, at a very basic level, a program is a sequence of instructions, and a CPU processes those instructions one by one until it’s done. This is essentially true, but the devil truly is in the details.
You see, executing an instruction, my textbook informed me, isn’t the “one and done” operation I had blithely assumed that it was. It’s actually a multi-stage process. The processor has to:

1. Fetch the instruction from memory.
2. Read the values (registers or memory locations) the instruction needs.
3. Execute the actual operation.
4. Write the result back.
A processor could, theoretically, push an entire instruction through this whole pipeline before even starting on the next one. However, that would be super slow. It would be like putting a load of laundry in the washer, then the dryer, and then folding it, all before starting your second load in the washer. If you put your second load of laundry into the washer immediately after the first load goes into the dryer, the work of laundry washing gets done a lot faster. Obviously.
And so, this is what our processors do all the time. The processor just keeps fetching instructions and pushing them through, even before the previous instruction is complete. This works wonderfully for all of math, all of string manipulation, all of input and output, nearly every aspect of programming you can think of… except for branches.
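If you like seeing the arithmetic, here’s a toy calculation of my own (not from the textbook), assuming a simple four-stage pipeline like the one above and one clock cycle per stage:

#include <stdio.h>

// Toy model: a 4-stage pipeline (fetch, read, execute, write back)
// running n instructions, one stage per cycle.
int main(void) {
    int n = 10;      // a hypothetical run of ten instructions
    int stages = 4;

    // One at a time: every instruction occupies the whole pipeline by itself.
    int serial = n * stages;

    // Pipelined: once the pipeline fills, one instruction finishes per cycle.
    int pipelined = stages + (n - 1);

    printf("one at a time: %d cycles\n", serial);    // 40
    printf("pipelined:     %d cycles\n", pipelined); // 13
    return 0;
}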
This is an example of a simple branch:
if (blarg > frangle) {
print blarg;
} else {
print frangle;
}
This code has two variables (presumably numbers), one called “blarg” and another called “frangle.” If the number in blarg is greater than the number in frangle, it prints the blarg number. Otherwise, it prints the frangle number. This code branches. It makes a choice about which of two paths to take based on some conditional logic.
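For anyone curious what that looks like closer to the metal, here’s my own runnable C version of the same logic, with a comment sketching the compare-and-conditional-jump shape a compiler typically produces (the exact instructions depend on the compiler and the architecture):

#include <stdio.h>

// The same choice, in real C (blarg and frangle keep their names from above).
void pick(int blarg, int frangle) {
    if (blarg > frangle) {
        printf("%d\n", blarg);
    } else {
        printf("%d\n", frangle);
    }
}

int main(void) {
    pick(7, 3);   // prints 7
    pick(2, 9);   // prints 9
    return 0;
}

// At the Assembly level, the branch typically becomes something like:
//
//   cmp  edi, esi        ; compare blarg with frangle
//   jle  .else_branch    ; if blarg <= frangle, jump to the else code
//       ...print blarg...
//   jmp  .done
// .else_branch:
//       ...print frangle...
// .done:
//
// That conditional jump is the instruction the pipeline can't resolve until
// late, which is exactly the problem described next.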
Now, let’s imagine this code going through the processor pipeline I described above. The first instruction comes in. For the sake of simplicity, let’s imagine this is something like if (blarg > frangle). The processor has fetched that and moved it on to the next step, where it’s going to read the memory it needs (presumably the locations holding blarg and frangle). At the same time, it’s going to fetch the next instruction.
…
What’s the next instruction?!
If blarg is greater than frangle, then the next instruction is print blarg;. But if not, then the next instruction is print frangle;. The processor can’t select the next instruction, because it won’t know what the next instruction will be until the if instruction makes it all the way to the end of the pipeline.
So, what’s the solution? Do our processors simply stall out and waste 75% of potentially productive time waiting for instructions to resolve every time we programmers write an if statement?
No. The reality is far weirder than anything I’ve dreamed up for my science fiction novels.
When the processor encounters a branch statement in the “fetch instruction” phase, it simply guesses which branch will be taken and queues up that instruction next. To put that concretely, it means that, in the example above, the processor “guesses” that blarg will in fact be greater and queues up the print blarg; instruction for processing.
Now, of course, this raises the question: what happens when the processor guesses wrong? This is my favorite part.
If the processor gets the if (blarg > frangle) statement to the end of the pipeline after queuing up the print blarg; instruction, and it turns out that frangle was actually the larger value, well, all of the subsequent instructions are only partially complete. So the processor can simply flush all of its in-progress instructions away, reset itself to just after if (blarg > frangle) in the instruction queue, and, this time, process the correct instruction.
Modern processors, my textbook assures me, have gotten so good at branch prediction (i.e. using statistics about a program’s past behavior to determine whether a given branch statement will take path A or path B) that they can predict the correct branch 85% of the time, minimizing the need to throw away work in the way I just described.
Still, even with that, imagine the speed at which processors operate, and how many calculations they do, particularly for applications like real-time 3D games. The number of operations that must get calculated, discarded, and reset every second in a modern computer must be staggering.
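You can actually feel this on an ordinary laptop. Here’s a common demonstration, my own sketch rather than anything from the textbook: the same branchy loop runs over data the predictor can learn (sorted) and data it can’t (random), and the random version typically comes out noticeably slower (though an aggressive optimizer may remove the branch entirely, so results vary):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10000000

// Sum only the "big" values; the if statement is the branch the
// processor has to predict on every single iteration.
static long long sum_big_values(const int *data, int n) {
    long long sum = 0;
    for (int i = 0; i < n; i++) {
        if (data[i] >= 128) {
            sum += data[i];
        }
    }
    return sum;
}

static int compare_ints(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

int main(void) {
    int *data = malloc(N * sizeof(int));
    for (int i = 0; i < N; i++) {
        data[i] = rand() % 256;
    }

    // Random order: the branch goes either way unpredictably,
    // so the processor guesses wrong roughly half the time.
    clock_t start = clock();
    long long random_sum = sum_big_values(data, N);
    double random_secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    // Sorted order: the same branch becomes almost perfectly predictable
    // (all the "skip it" cases first, then all the "add it" cases).
    qsort(data, N, sizeof(int), compare_ints);
    start = clock();
    long long sorted_sum = sum_big_values(data, N);
    double sorted_secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    printf("random data: %.3f seconds (sum %lld)\n", random_secs, random_sum);
    printf("sorted data: %.3f seconds (sum %lld)\n", sorted_secs, sorted_sum);

    free(data);
    return 0;
}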
I’m certainly glad to be learning these things. I love learning in general, and software specifically. It’s cool to get a glimpse of how this stuff works under the hood. I used to think that computation was this concrete, definite thing. But, much like the physics of the subatomic, it’s a weirder world down there than I realized. My most important insight from my degree has nothing to do with code in particular; rather, I’d say that I’ve developed a renewed respect for the power of programmatic computation. One of the best things science fiction literature can do is invoke awe and wonder. The same can be said for education.