While I was playing Control the other night, the Old Gods of Asgard told me to Take Control.
It’s one of the coolest moments in video games.
When the publisher announced the title it caught my attention, because control has been on my mind for years. I control computers for a living, but I often feel that computers really control me. What is control, anyway?
To most programmers, “control” means control structures in a programming language. But it’s funny that only part of a language is called “control” when an article in Communications of the ACM asserted a few years ago that
The purpose of programs is to control machines, not to provide a means of algorithmic self-expression for programmers.
According to these authors, we should refer to the whole of every language as control structures (and we should prohibit expressions). Why not? What part of a program isn’t for control?
The usual answer is “data,” the inert and passive things in the program that the “control” parts move around and transform. Control and data; algorithms and data structures; they form a pleasant duality to think about.
Yet the first part of that quotation is correct. Every line of our programs—control, data, or otherwise—exists ultimately for a computer to read and act upon. An expression defining a hash table is really as much of a control structure as it is a data structure. These separate categories are in the end just mnemonics.
From that perspective it’s clear that the second part of the quotation is dead wrong. These mnemonics exist only to encode a programmer’s message about what the computer should do. Programs control machines by virtue of conveying programmers’ intent to them. Programmers can get by without self-expression only when they do the computation themselves—that is, when they are not programmers at all.
This is not just metaphysics for its own sake. It shines light on why programmers so often don’t feel in control, even though by all indications they should have total control. My words have power to control computers inasmuch as they have power to convey my message to them. When I don’t feel in control, sometimes I am precisely articulating the wrong message by accident. However, more often I am trying to put the right message through the wrong transmitter: I am trying to encode it as the wrong control structures.
How, really, do my programs control the computer? In essence they work the same way as they did back in 1997, when I learned programming in BASIC. The first thing I learned was that the computer interprets each statement and produces some corresponding effect. So, my choice of statement was an element of control. The line
10 PRINT "I AM THE RADIO SHACK TRS-80"
does something different than the line
10 PRINT "I AM THE COMPAQ PRESARIO 1630"
And then, when it is done producing the effect, the computer drops down a line and does the same with the next statement. So, the order of my statements was another element of control. The program
10 X = 43
20 PRINT "I AM "; X; " YEARS OLD"
does something different than
10 X = 23
20 PRINT "I AM "; X; " YEARS OLD"
So far so good. I could write lots of programs this way, so already the machine was under my control. It’s not a one-button toy, but rather something I could configure to do what I want. With automatic machines, control is the same as configuration, and programs are configuration files.
There are many levels of configuration. Some statements change the control flow[1]. These statements change the rule for determining which statement comes next. So in the program
10 IF X < 26 THEN
20 PRINT "INNOVATOR"
30 ELSE
40 PRINT "CAROUSEL"
50 END IF
the computer chooses which line comes next, 20 or 40, based on the value of a variable. Earlier the computer followed a strict rule that the next statement is always the one with the next largest number after the current one. But these control statements actually change the rule! Moreover, the new rule is a function of the input. What power!
Some control statements even let us make our own rules. In the program
10 PRINT "MACHINE"
20 GOTO 10
line 20 instructs the computer to start over from line 10. GOTO lets the programmer define an arbitrary special case for the next-statement rule. More power! Unlike the IF statement, the effect of GOTO is never a function of the input. But when combined, IF and GOTO together can compute all kinds of control rules from the input at run time.
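The next-statement rule is small enough to model directly. Here is a minimal Python sketch of it, not a description of how any real BASIC works: the tuple-based statement format, the IF_LT statement, and the run function are all names I made up for illustration. By default the interpreter moves to the next largest line number; GOTO overrides that rule unconditionally, and IF_LT overrides it depending on a variable.

```python
def run(program, variables, max_steps=100):
    """Run a toy line-numbered program and return the list of printed strings.

    `program` maps line numbers to hypothetical statements:
      ("PRINT", text)            record text, then fall through
      ("GOTO", line)             always jump to `line`
      ("IF_LT", var, n, line)    jump to `line` if variables[var] < n
      ("END",)                   stop
    """
    lines = sorted(program)
    output = []
    pc = lines[0]  # start at the smallest line number
    for _ in range(max_steps):  # guard against programs that never stop
        stmt = program[pc]
        # Default rule: the next statement is the next largest line number.
        idx = lines.index(pc) + 1
        next_pc = lines[idx] if idx < len(lines) else None
        if stmt[0] == "PRINT":
            output.append(stmt[1])
        elif stmt[0] == "GOTO":
            next_pc = stmt[1]  # special case: ignore the default rule
        elif stmt[0] == "IF_LT":
            _, var, limit, target = stmt
            if variables[var] < limit:
                next_pc = target  # the rule now depends on the input
        elif stmt[0] == "END":
            break
        if next_pc is None:
            break
        pc = next_pc
    return output


PROGRAM = {
    10: ("IF_LT", "X", 26, 40),
    20: ("PRINT", "CAROUSEL"),
    30: ("GOTO", 50),
    40: ("PRINT", "INNOVATOR"),
    50: ("END",),
}

print(run(PROGRAM, {"X": 23}))  # ['INNOVATOR']
print(run(PROGRAM, {"X": 43}))  # ['CAROUSEL']
```

The same two statements are enough to express the MACHINE loop as well; the max_steps argument exists only so that such a program eventually stops in the sketch.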
Even back then I could see that this let me make a truly interactive program that could be explored. The program could now be a map of possibilities, and with GOTO I could connect them with routes for the computer to navigate. I realized that this allowed me to turn a single thread of instructions into the ornate tapestries with which I had fallen in love on my game consoles.
Imagine my irritation, then, at the persistent misconception that any use of goto is bad programming. Programmers hold this taboo widely and deeply enough to suggest that gotos are a scourge threatening all of computing. This is just not the case these days, and it was probably not the case in 1997. From what I gather, this was only true before many of today’s critics were born, in an era when the current doctrine of function calls was scorned as wasteful. The goto scare that arose from Dijkstra’s letter in Communications lives on like a ghost with an inexhaustible grudge.
In reality, while many programs are better without goto statements, the essential concept of goto remains indispensable. If, goto, for loops, and every other control structure in every language implemented for modern machines all rely on more powerful machine instructions that can set the next machine instruction to be any location, even one computed from the input. These are usually called jump or branch instructions. A jump might as well be called goto, because they both mean “go somewhere else.”
The machine can read its input as an address, and then it can just YOLO over to that address and start executing instructions.
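You can see a small-scale analogy of this without leaving a high-level language. The following is a minimal Python sketch using the standard dis module to list the jump instructions that an ordinary if statement compiles to. Note the assumptions: this is bytecode for Python's virtual machine, not instructions for real hardware; the classify and jump_opcodes functions are names I made up; and the exact opcode names differ between Python versions, so I don't show the output.

```python
import dis


def classify(age):
    # An ordinary structured "if" with no goto in sight.
    if age < 26:
        return "INNOVATOR"
    return "CAROUSEL"


def jump_opcodes(func):
    """Return the names of the bytecode instructions in `func` that are jumps."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if "JUMP" in ins.opname
    ]


# The structured "if" is implemented with at least one conditional jump.
print(jump_opcodes(classify))
```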
So what should we do if our honest message for the computer is that we want this to happen? Why is that degree of control usually unavailable to programmers above the level of the machine language? In short, it’s because programs for computing payroll don’t need it and suffered from its misuse…more than half a century ago. In response, programming languages adopted the philosophy of structured programming and provided more restricted access to jumps in the form of the familiar control structures. For example, calling a function is just jumping to it and then jumping back:
.... | # Set up arguments
1000 | jal QUICKSORT # Call QUICKSORT
.... | ....
.... | ....
2000 | QUICKSORT: # Code for quicksort
.... | .....
.... | .....
20ff | jr $ra # Return back
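To make the pairing concrete, here is a minimal Python sketch of a machine with a program counter and a single return-address register. The instruction names mirror the listing above, but the tuple encoding, the execute function, and the six-instruction program are assumptions invented for illustration. A real machine also has to cope with nested calls, which this sketch ignores.

```python
def execute(memory, start=0):
    """Run a toy program and return the trace of executed addresses."""
    pc = start   # program counter: address of the current instruction
    ra = None    # return-address register, set by "jal"
    trace = []
    while True:
        op, *args = memory[pc]
        trace.append(pc)
        if op == "jal":      # jump and link: remember where to come back
            ra = pc + 1
            pc = args[0]
        elif op == "jr_ra":  # jump to whatever address is stored in ra
            pc = ra
        elif op == "halt":
            return trace
        else:                # any other instruction falls through
            pc += 1


MEMORY = [
    ("nop",),     # 0: set up arguments
    ("jal", 4),   # 1: call the routine at address 4
    ("nop",),     # 2: continue after the call
    ("halt",),    # 3: stop
    ("nop",),     # 4: body of the routine
    ("jr_ra",),   # 5: return
]

print(execute(MEMORY))  # [0, 1, 4, 5, 2, 3]
```

The trace shows the two jumps of the pair: from address 1 to 4 for the call, and from 5 back to 2 for the return.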
Since then, entire industries of computer applications have risen and fallen. Core memory is unknown to engineers who have spent their entire lives using semiconductors. The ARPANET took over the entire world. And we are still indoctrinating programmers as structured programming soldiers. We tell them that their jump message is just the wrong message, whether or not their program computes payroll. Well, sometimes it’s the right message now.
Midway through those intervening years, around the time I started programming, a video game led a phenomenon of pop culture that endures to this day: Pokémon. This was not a payroll program. This was a program that could be explored interactively, and moreover one whose entire purpose was this kind of exploration. And what were the control structures that defined the paths of navigation? Were they structured programs, pure and true?
No! This game was scripted with unstructured jumps in assembler language! Look at this part:
CeladonMartRoofScript_GiveDrinkToGirl:
...
...
.gaveSodaPop
CheckEvent EVENT_GOT_TM48
jr nz, .alreadyGaveDrink
ld hl, CeladonMartRoofText_48504
call PrintText
call RemoveItemByIDBank12
lb bc, TM_48, 1
call GiveItem
jr nc, .bagFull
ld hl, CeladonMartRoofText_4850a
call PrintText
SetEvent EVENT_GOT_TM48
ret
The jr opcodes denote relative jumps. They jump out of the normal call stack when certain conditions hold. Just as significant are the call opcodes, indicating calls to subroutines like PrintText and GiveItem that last several seconds while animated text is printed. This game isn’t even structured as a main loop that updates and draws all of the game at once! For all I know, there is no expression of a main loop at all, and the repetitive parts of the game just emerge from temporary cycles of jumps.
Nonetheless I can understand this routine decently well given my experience playing the game. The programmer’s message is visible, and I can imagine easily modifying this and putting my message in with the original one.
And that’s what I call REAL Ultimate Power!
Received opinion suggests that a transfer to a different block should be a function call, not a jump. As we saw, a function call intentionally restricts the programmer’s control by forcing jumps to come in pairs: one for the call, and one for the return. For many purposes this is very useful, and I have happily written many functions and function calls. But how do you weave a tapestry when both ends of your thread are fixed? How can you use a function call to “go to” the next activity of the game when your programming language says that the process absolutely must “return” to what it was doing before?
You can’t. If function calls are the only way to manipulate your programming language’s internal control information, that information simply doesn’t represent the overall state of your game. Instead, you have to represent the state of your game with regular data. You ultimately simulate jumps in terms of other control structures that, like if, are informed by these data.
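Here is a minimal Python sketch of that workaround, loosely modeled on the soda pop exchange above. Everything in it is an assumption for illustration: the state names, the game dictionary, and the dialogue strings are placeholders of mine, not the game's data. The point is the shape: every "jump" becomes a return to a driver loop followed by a dispatch on a plain string.

```python
def step(state, game):
    """Handle one state and return the name of the next one."""
    if state == "offer_drink":
        if game["got_tm48"]:
            return "already_gave"
        game["log"].append("She takes the drink.")
        return "give_item"
    elif state == "give_item":
        if len(game["bag"]) >= game["bag_limit"]:
            return "bag_full"
        game["bag"].append("TM48")
        game["got_tm48"] = True
        game["log"].append("Received TM48!")
        return "done"
    elif state == "already_gave":
        game["log"].append("I'm not thirsty anymore.")
        return "done"
    elif state == "bag_full":
        game["log"].append("You have no room for this.")
        return "done"
    raise ValueError(f"unknown state: {state}")


def play(game, state="offer_drink"):
    """Driver loop: the state variable, not the call stack, says what is next."""
    while state != "done":
        state = step(state, game)
    return game["log"]


game = {"got_tm48": False, "bag": [], "bag_limit": 20, "log": []}
print(play(game))  # ['She takes the drink.', 'Received TM48!']
```

Even at this size, understanding one transition means reading the whole if chain to see who interprets each state string.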
The state of most games requires an intricate representation. In a good game, each state can lead to many other states that are surprisingly, intriguingly, playfully different. Any state space you construct to model that will be accordingly complex. In my experience the state representation is usually a wad of ad-hoc local and global variables filling in the gaps that remain in half-useful configurations for systems attempting to define common behaviors like cinematics, menus, missions, and the like. To understand each change of control, you have to understand how this or that jerky little state variable is interpreted potentially anywhere in the program. I think we have all seen today’s best practices yield horrors that make CeladonMartRoofScript_GiveDrinkToGirl look like hello world.
It would be better to unify the representation and interpretation of the game state by using them to implement a higher-level language. It can be embedded right in the language you would use otherwise. The language could explicitly provide the jump commands that the programmers are really trying to communicate in the first place. Give them a straightforward way to express this instead of asking them to encode it in spaghetti.
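One possible shape for such an embedded language, sketched in Python under heavy assumptions: each scene is a function, a hypothetical goto command names the scene that runs next, and a small run loop follows the jumps so that no scene ever has to return to its caller's business. The scene names, the game dictionary, and the dialogue strings are placeholders I invented, and this is only one of many ways to do it.

```python
def goto(scene):
    """Hypothetical jump command: name the scene that should run next."""
    return scene


def run(scene, game):
    """Follow jumps from scene to scene until one of them ends the script."""
    while scene is not None:
        scene = scene(game)
    return game["log"]


def rooftop_girl(game):
    if game["got_tm48"]:
        return goto(already_gave)
    game["log"].append("She takes the drink.")
    return goto(give_tm)


def give_tm(game):
    if len(game["bag"]) >= game["bag_limit"]:
        return goto(bag_full)
    game["bag"].append("TM48")
    game["got_tm48"] = True
    game["log"].append("Received TM48!")
    return None  # end of script


def already_gave(game):
    game["log"].append("I'm not thirsty anymore.")
    return None


def bag_full(game):
    game["log"].append("You have no room for this.")
    return None


game = {"got_tm48": False, "bag": [], "bag_limit": 20, "log": []}
print(run(rooftop_girl, game))  # ['She takes the drink.', 'Received TM48!']
```

Each scene now states its own jump targets where it makes the decision, which is roughly the message the assembly routine was sending with its jr instructions.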
This is the motivation behind tools for artists, dashboards for managers, config files for system administrators, CRUD interfaces for salespeople, and countless other means of self-expression for anyone who needs to control the behavior of a computer system. Each of these things constitutes a language whose semantics packages up control in such a way that each user can express their message about it relatively directly.
What makes programmers different from those other users is not that they have generally greater capacity to process information. They benefit just as much from a language whose semantics closely fits what they have to say. The difference is that they usually have a wider variety of messages to convey, and they are expected to have such expertise that they can express these messages in basically any code necessary, whether it is a straightforward code or an obscenely complex code.
The basic control structures underlying these codes have, by and large, changed little in decades. Yet the messages we are transmitting with them have become drastically more complicated and diversified. As the message diverges from the expressions, its transmission becomes more susceptible to loss and we are liable to lose control. We can recover it if we can adapt our programming systems to provide more direct expressions for our higher-level messages. Today most of us lack that level of control, but I can confirm that it is possible.
[1] The metaphor here is that some statement is “in control” while the computer is executing it, and each statement hands off control of the machine to the next one; we imagine the control flowing through these handoffs like a wave in a line of dominoes.