control

While I was playing Control the other night, the Old Gods of Asgard told me to Take Control. It’s one of the coolest moments in video games.

When the publisher announced the title it caught my attention, because control has been on my mind for years. I control computers for a living, but I often feel that computers really control me. What is control, anyway?

According to these authors, we should refer to the whole of every language as control structures (and we should prohibit expressions). Why not? What part of a program isn’t for control?

The usual answer is data, the inert and passive things in the program that the control parts move around and transform. Control and data; algorithms and data structures; they form a pleasant duality to think about.

Yet the first part of that quotation is correct. Every line of our programs—control, data, or otherwise—exists ultimately for a computer to read and act upon. An expression defining a hash table is really as much of a control structure as it is a data structure. These separate categories are in the end just mnemonics.

From that perspective it’s clear that the second part of the quotation is dead wrong. These mnemonics exist only to encode a programmer’s message about what the computer should do. Programs control machines by virtue of conveying programmers’ intent to them. Programmers can get by without self-expression only when they do the computation themselves—that is, when they are not programmers at all.

This is not just metaphysics for its own sake. It shines light on why programmers so often don’t feel in control, even though by all indications they should have total control. My words have power to control computers inasmuch as they have power to convey my message to them. When I don’t feel in control, sometimes I am precisely articulating the wrong message by accident. However, more often I am trying to put the right message through the wrong transmitter: I am trying to encode it as the wrong control structures.

Quick basics

How, really, do my programs control the computer? In essence they work the same way as they did back in 1997, when I learned programming in BASIC. The first thing I learned was that the computer interprets each statement and produces some corresponding effect. So, my choice of statement was an element of control. The line

And then, when it is done producing the effect, the computer drops down a line and does the same with the next statement. So, the order of my statements was another element of control. The program

So far so good. I could write lots of programs this way, so already the machine was under my control. It’s not a one-button toy, but rather something I could configure to do what I want. With automatic machines, control is the same as configuration, and programs are configuration files.

Control flow

There are many levels of configuration. Some statements change the control flow¹. These statements change the rule for determining which statement comes next. So in the program

the computer chooses which line comes next, 20 or 40, based on the value of a variable. Earlier the computer followed a strict rule that the next statement is always the one with the next largest number after the current one. But these control statements actually change the rule! Moreover, the new rule is a function of the input. What power!

line 20 instructs the computer to start over from line 10. GOTO lets the programmer define an arbitrary special case for the next-statement rule. More power! Unlike the IF statement, the effect of GOTO is never a function of the input. But when combined, IF and GOTO together can compute all kinds of control rules from the input at run time.
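To make the combination of IF and GOTO concrete, here is a minimal sketch in C rather than BASIC. The function name `count_up` and the labels `line10` and `done` are hypothetical, chosen only to echo line-numbered BASIC. The conditional decides from the data whether to leave, and the unconditional jump starts the sequence over.

```c
#include <stdio.h>

/* Hypothetical illustration: count from 1 to limit using only
   `if` and `goto`, the way a line-numbered BASIC program would. */
int count_up(int limit)
{
    int n = 0;

line10:                 /* plays the role of BASIC line 10 */
    if (n >= limit)     /* IF: the next statement depends on the data */
        goto done;
    n = n + 1;
    printf("%d\n", n);
    goto line10;        /* GOTO: unconditionally start over */

done:
    return n;           /* number of iterations performed */
}
```

Neither statement loops on its own. The loop only emerges from the pair: GOTO supplies the cycle and IF supplies the exit.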

Even back then I could see that this let me make a truly interactive program that could be explored. The program could now be a map of possibilities, and with GOTO I could connect them with routes for the computer to navigate. I realized that this allowed me to turn a single thread of instructions into the ornate tapestries with which I had fallen in love on my game consoles.

Jump instruction considered essential

Imagine my irritation, then, at the persistent misconception that any use of goto is bad programming. Programmers hold this taboo widely and deeply enough to suggest that gotos are a scourge threatening all of computing. This is just not the case these days, and it was probably not the case in 1997. From what I gather, this was only true before many of today’s critics were born, in an era when the current doctrine of function calls was scorned as wasteful. The goto scare that arose from Dijkstra’s 1968 letter in Communications of the ACM lives on like a ghost with an inexhaustible grudge.

In reality, while many programs are better without goto statements, the essential concept of goto remains indispensable. The if statement, goto, for loops, and every other control structure in every language implemented for modern machines all rely on more powerful machine instructions that can set the next instruction to execute to any location, even one computed from the input. These are usually called jump or branch instructions. A jump might as well be called goto, because they both mean go somewhere else. The machine can read its input as an address, and then it can just YOLO over to that address and start executing instructions.
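Standard C does not expose a raw computed jump, but a table of function pointers comes close: the input selects a code address and the machine transfers control there. The sketch below is only an approximation, since an indirect call also returns. The names `dispatch`, `handlers`, and the three handler functions are hypothetical.

```c
#include <stddef.h>

typedef const char *(*handler_fn)(void);

static const char *go_north(void) { return "north"; }
static const char *go_south(void) { return "south"; }
static const char *stay_put(void) { return "stay"; }

/* The table maps an input value to a code address. */
static const handler_fn handlers[] = { go_north, go_south, stay_put };

/* Transfers control to the handler selected by `input`.
   Out-of-range inputs are rejected instead of jumping blindly. */
const char *dispatch(unsigned input)
{
    size_t count = sizeof handlers / sizeof handlers[0];

    if (input >= count)
        return "invalid";
    return handlers[input]();   /* indirect call: target computed from input */
}
```

The bounds check is the part a bare machine jump does not give you. Without it, an unexpected input would send execution to whatever happens to sit past the end of the table.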

So what should we do if our honest message for the computer is that we want this to happen? Why is that degree of control usually unavailable to programmers above the level of the machine language? In short, it’s because programs for computing payroll don’t need it and suffered from its misuse…more than half a century ago. In response, programming languages adopted the philosophy of structured programming and provided more restricted access to jumps in the form of the familiar control structures. For example, calling a function is just jumping to it and then jumping back:
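One way to picture the pairing is to spell out both jumps by hand. The following C sketch is hypothetical and deliberately unidiomatic: the variable `return_to` stands in for the return address that a real call instruction would save, and the label `double_it` plays the part of the function body.

```c
/* Hypothetical sketch: a "call" written as two explicit jumps.
   `return_to` stands in for the return address a real call saves. */
int double_twice(int x)
{
    int return_to;

    return_to = 1;
    goto double_it;         /* first "call": jump to the subroutine */
after_first:
    return_to = 2;
    goto double_it;         /* second "call" from a different place */

double_it:                  /* the subroutine body */
    x = x * 2;
    switch (return_to) {    /* the "return": jump back to the caller */
    case 1:  goto after_first;
    default: goto after_second;
    }

after_second:
    return x;
}
```

Calling `double_twice(3)` doubles the value twice and yields 12. A real function call does the same bookkeeping, except that the hardware or the compiler manages the return address so that the jump back can never go anywhere else.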

Since then, entire industries of computer applications have risen and fallen. Core memory is unknown to engineers who have spent their entire lives using semiconductors. The ARPANET took over the entire world. And we are still indoctrinating programmers as structured programming soldiers. We tell them that their jump message is just the wrong message, whether or not their program computes payroll. Well, sometimes it’s the right message now.

Game programming

Midway through those intervening years, around the time I started programming, a video game led a phenomenon of pop culture that endures to this day: Pokémon. This was not a payroll program. This was a program that could be explored interactively, and moreover one whose entire purpose was this kind of exploration. And what were the control structures that defined the paths of navigation? Were they structured programs, pure and true?

The jr opcodes denote relative jumps. They jump out of the normal call stack when certain conditions hold. Just as significant are the call opcodes, indicating calls to subroutines like PrintText and GiveItem that last several seconds while animated text is printed. This game isn’t even structured as a main loop that updates and draws all of the game at once! For all I know, there is no expression of a main loop at all, and the repetitive parts of the game just emerge from temporary cycles of jumps.

Nonetheless I can understand this routine decently well given my experience playing the game. The programmer’s message is visible, and I can imagine easily modifying this and putting my message in with the original one.

Received opinion suggests that a transfer to a different block should be a function call, not a jump. As we saw, a function call intentionally restricts the programmer’s control by forcing jumps to come in pairs: one for the call, and one for the return. For many purposes this is very useful, and I have happily written many functions and function calls. But how do you weave a tapestry when both ends of your thread are fixed? How can you use a function call to go to the next activity of the game when your programming language says that the process absolutely must return to what it was doing before?

You can’t. If function calls are the only way to manipulate your programming language’s internal control information, that information simply doesn’t represent the overall state of your game. Instead, you have to represent the state of your game with regular data. You ultimately simulate jumps in terms of other control structures that, like if, are informed by these data.
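A minimal sketch of that simulation in C, assuming a hypothetical game with four states. Where a jump would simply say "go to the battle", the structured version stores BATTLE in a variable, returns, and lets a loop dispatch on the variable again. The names `game_state`, `step`, and `run` are invented for illustration.

```c
typedef enum { TITLE, OVERWORLD, BATTLE, GAME_OVER } game_state;

/* One step of the game. Instead of jumping to the next activity,
   the function returns a value that names it. */
game_state step(game_state s, int player_hp, int met_enemy)
{
    switch (s) {
    case TITLE:     return OVERWORLD;
    case OVERWORLD: return met_enemy ? BATTLE : OVERWORLD;
    case BATTLE:    return player_hp > 0 ? OVERWORLD : GAME_OVER;
    default:        return GAME_OVER;
    }
}

/* The main loop re-dispatches on the state variable every iteration.
   Returns the number of steps taken before GAME_OVER or max_steps.
   For simplicity, an enemy is met on every overworld step. */
int run(int player_hp, int max_steps)
{
    game_state s = TITLE;
    int steps = 0;

    while (s != GAME_OVER && steps < max_steps) {
        s = step(s, player_hp, 1);
        steps++;
    }
    return steps;
}
```

With four states this is tidy. The trouble starts when the real state is spread over dozens of variables, and every `switch` has to know how to interpret all of them.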

The state of most games requires an intricate representation. In a good game, each state can lead to many other states that are surprisingly, intriguingly, playfully different. Any state space you construct to model that will be accordingly complex. In my experience the state representation is usually a wad of ad-hoc local and global variables filling in the gaps that remain in half-useful configurations for systems attempting to define common behaviors like cinematics, menus, missions, and the like. To understand each change of control, you have to understand how this or that jerky little state variable is interpreted potentially anywhere in the program. I think we have all seen today’s best practices yield horrors that make CeladonMartRoofScript_GiveDrinkToGirl look like hello world.

Control language

It would be better to unify the representation and interpretation of the game state by using them to implement a higher-level language. It can be embedded right in the language you would use otherwise. The language could explicitly provide the jump commands that the programmers are really trying to communicate in the first place. Give them a straightforward way to express this instead of asking them to encode it in spaghetti.
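As a rough illustration of what such an embedded language could look like, here is a tiny interpreter in C whose scripts are plain arrays and whose commands include explicit jumps. Everything here is hypothetical: the opcodes, the `instr` layout, and the `run_script` function are a sketch under my own assumptions, not a description of any particular engine.

```c
#include <stddef.h>

typedef enum { OP_ADD, OP_JUMP, OP_JUMP_IF_LT, OP_HALT } opcode;

typedef struct {
    opcode op;
    int    arg;     /* amount to add, or threshold for OP_JUMP_IF_LT */
    size_t target;  /* instruction index used by the jump commands */
} instr;

/* Runs a script until OP_HALT, the end of the script, or max_steps
   instructions have executed. Returns the accumulator. */
int run_script(const instr *script, size_t len, int max_steps)
{
    int acc = 0;
    size_t pc = 0;  /* index of the next instruction */

    while (pc < len && max_steps-- > 0) {
        const instr *i = &script[pc];
        switch (i->op) {
        case OP_ADD:        acc += i->arg; pc++; break;
        case OP_JUMP:       pc = i->target; break;
        case OP_JUMP_IF_LT: pc = (acc < i->arg) ? i->target : pc + 1; break;
        case OP_HALT:       return acc;
        }
    }
    return acc;
}

/* Example script: add 2 until the accumulator reaches 10. */
static const instr count_to_ten[] = {
    { OP_ADD,        2,  0 },   /* 0: acc += 2           */
    { OP_JUMP_IF_LT, 10, 0 },   /* 1: if acc < 10 goto 0 */
    { OP_HALT,       0,  0 },   /* 2: stop               */
};
```

The control state of a script is just `pc`, so it can be saved, inspected, or redirected like any other value. The `max_steps` budget is a design choice for the sketch: it keeps a script with a bad jump from hanging the host program.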

This is the motivation behind tools for artists, dashboards for managers, config files for system administrators, CRUD interfaces for salespeople, and countless other means of self-expression for anyone who needs to control the behavior of a computer system. Each of these things constitutes a language whose semantics packages up control in such a way that each user can express their message about it relatively directly.

What makes programmers different from those other users is not that they have generally greater capacity to process information. They benefit just as much from a language whose semantics closely fits what they have to say. The difference is that they usually have a wider variety of messages to convey, and they are expected to have such expertise that they can express these messages in basically any code necessary, whether it is a straightforward code or an obscenely complex code.

The basic control structures underlying these codes have, by and large, changed little in decades. Yet the messages we are transmitting with them have become drastically more complicated and diversified. As the message diverges from the expressions, its transmission becomes more susceptible to loss and we are liable to lose control. We can recover it if we can adapt our programming systems to provide more direct expressions for our higher-level messages. Today most of us lack that level of control, but I can confirm that it is possible.

¹ The metaphor here is that some statement is in control while the computer is executing it, and each statement hands off control of the machine to the next one; we imagine the control flowing through these handoffs like a wave in a line of dominoes.