On Goto, Continue, and Break



goto Through the Ages

Long ago, in the Pleistocene (or what passes for it in computer history), computer languages had very simple flow control: conditions and gotos. That's because under the hood, that's effectively all there is. Regardless of which processor you (or your compiler) are writing code for, there's a test-and-branch and an unconditional branch.

If you've ever developed in assembly language, you know you could jump just about anywhere in your code. Limitations were typically architectural rather than structural. That is to say, some architectures may have limited your conditional branches to a small range of bytes before or after the current instruction, but wouldn't prevent you from jumping, say, outside of your "current function"... a notion only introduced later with the advent of higher-level languages. (Arguably, the BASIC of yore was not such a beast. It had line numbers, goto, and gosub. Exactly where a "function" (or rather, subroutine) began and ended was really rather fluid, and it was quite easy to goto a line beyond your return statement, into an entirely different chunk of code.) Unconditional jumps were doubly unconditional: the code would always jump, and the code could jump anywhere: gimme an address, and I'll make that be the Next Sequential Instruction. Done. Instant worm-hole to anywhere in memory.

Just as strange beasts roamed the Pleistocene, so too strange and wonderful programs, riddled with all manner of jumps-to-anywhere, existed in the early computing era. Dissecting such a beast to understand how it worked was... difficult.

Modern Flow Control

As programming evolved, certain patterns of logic flow developed, and programming languages adopted them: we now have pre- and post-condition loops (manifested as while and do-while, which early BASIC never had) and various types of more highly structured enumerative loops (a la for, which early BASIC did have). But because old habits die hard, and there was just enough reluctance to pry that ultimate power from the hands of developers, the notion of unconditional jumps anywhere, in the form of the goto keyword, lived on. With it came all the power-for-good-or-ill.

And yes, gradually the goto came with more and more restrictions, effectively confining it into smaller and smaller paddocks to jump to (but never quite small enough for some). The goto statement was pretty universally regarded as bad, and language designers chipped away at the edges of questionable use to bring it more under control.

Thing is, as we began to understand these control flow structures, we determined that nearly all these varieties of goto were easily lumped into specific types: jumping to the "beginning" of the loop, or vacating it entirely. These took the forms of continue and break, respectively. In theory, the goto statement could be completely eliminated.
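To make that mapping concrete, here is a minimal sketch in C. The two functions are hypothetical (the names and the task are mine, chosen only for illustration): each sums the positive entries of an array and stops at the first zero. The first spells out the jumps with labels; the second expresses the same two jumps as continue and break.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum the positive values, stopping at the first zero.
   Every jump is written out by hand with labels and goto. */
int sum_positives_goto(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int sum = 0;
    size_t i = 0;
loop_top:
    if (i >= count) goto loop_end;          /* loop condition no longer holds */
    if (values[i] == 0) goto loop_end;      /* vacate the loop entirely */
    if (values[i] < 0) {                    /* skip ahead to the next iteration */
        i++;
        goto loop_top;
    }
    sum += values[i];
    i++;
    goto loop_top;
loop_end:
    return sum;
}

/* The same logic, with the two kinds of jump named for what they do. */
int sum_positives_structured(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] == 0) break;          /* vacate the loop entirely */
        if (values[i] < 0) continue;        /* skip ahead to the next iteration */
        sum += values[i];
    }
    return sum;
}
```

Both functions return the same result for the same input. The difference is that in the second one the compiler, not the reader, pins down where each jump lands.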

Yada-yada. Thing is, the goto never did die off. It exists in modern languages like Java (albeit inert, as a frozen specimen that could be revived some time in the future) and C# certainly hasn't been able to rid itself of goto's influence. But its demise is long overdue. Honestly, I wouldn't particularly mourn its extinction as a higher-level-language construct; in the past nigh-on four decades of coding in various curly-brace languages, I have had almost no cause to use it... it is possible to write clear goto-free code! In fact, I see relying on goto as a crutch, and writing poorly-structured sans-goto code as the mark of a novice.

goto Is Evil (?)

The ill effects of goto are widely known, even as adherents tout its great utility.

As should be clear by now, I come down on the side that goto weighs in as a liability: I have seen far more instances of its abuse than its proper use.

Even from that fundamental concern of readability, when you see

 goto some_label;

it is harder to reason about where you are going, precisely because that label can be (more or less) anywhere.

So strongly held is this position that major safety-based compliance regimens such as MISRA, HIS Metrics, and JSF outright ban its use or else strongly advise against it.

continue is Evil (?)

There is a camp that also calls continue evil. The rationale here is "call a goto a goto".

And I disagree with that. Following that rationale to its extreme, one could reason that one should code this:

 loop_top: if(done)
             goto loop_end;
           do_something();
           goto loop_top;

 loop_end: // after the loop

rather than

           while ( ! done) {
             do_something();
           }

           // after the loop

because, hey: call a goto a goto, and a while loop has two of them: one to exit the loop when the condition no longer holds, and another to head back up to the top of the loop.

So "call a goto a goto" holds no sway with me: structured code exists to obviate the goto, replacing it with concise, controlled mechanisms that make it easier for a reader to reason about the logic.

In this way, continue and break are not mere goto statements but, like while, do-while, and for, have very precise meanings that are easy to reason about. The compiler enforces exactly where continue and break take the flow of control. Not so with goto, which requires a leap of faith to reason about the destination.

Nevertheless, JSF also bans continue and some uses of break. Some of this comes about because of the contextually ambiguous (and not entirely parallel) behavior of each: continue applies only to loops, while break also applies to switch statements, so it is possible to have something like this:

 while( ! done) {

   // code

   switch(x) {
     case X:
       continue;  // applies to loop, goes to next iteration
     case Y:
       break;     // not a parallel to continue: will not exit loop
     case Z:
       do_something();
   }

   // more code

 }

where—because of the switch statement—the continue and break do not both apply to the same control structure.


The dangers of goto
- Thomas M. Tuerke

So, if you poke your head in on the CERT-C guidelines, you see the recommendation to use a goto chain. They cite a non-compliant example which forgets to close a file resource. That's correct: the non-compliant example has a flaw.

But they suggest that the right thing to do is rely on goto to overcome that. And they suggest something like the following.

errno_t do_something(void) {
  FILE *fin1, *fin2;
  object_t *obj;
  errno_t ret_val = NOERR; /* Initially assume a successful return value */
 
  fin1 = fopen("some_file", "r");
  if (fin1 == NULL) {
    ret_val = errno;
    goto FAIL_FIN1;
  }
 
  fin2 = fopen("some_other_file", "r");
  if (fin2 == NULL) {
    ret_val = errno;
    goto FAIL_FIN2;
  }
 
  obj = malloc(sizeof(object_t));
  if (obj == NULL) {
    ret_val = errno;
    goto FAIL_OBJ;
  }
 
  /* ... More code ... */
 
SUCCESS:     /* Clean up everything */
  free(obj);

FAIL_FIN2:
  fclose(fin1);
 
FAIL_OBJ:   /* Otherwise, close only the resources we opened */
  fclose(fin2);

FAIL_FIN1:
  return ret_val;
}

I say something like, because I've slightly altered it, in a way that is easy to miss, that a compiler will not detect, and, more importantly, that is easy to do by accident. Moreover, you're only likely to try to find the problem now that I've called it out: if you haven't found it before reading this paragraph, gotcha.

The point is, the example above looks correct. It compiles. It even runs fine for the normal case and a few error cases. But it has a flaw.

So is that goto-chain the best that we can do? I think not.

Now, I'm not fond of deeply-indented code. But I'd rather have that indentation, and the support of the compiler, than a bunch of goto statements and labels (plus a heavy dose of faith in the integrity of hand-stitched logic flow). All I have done is use the curly braces of the else clause to do the same thing as a goto, without the risk of mis-expression, like having the goto target the wrong label, or having the labels be out of order.

errno_t do_something(void) {
  FILE *fin1, *fin2;
  object_t *obj;
  errno_t ret_val = NOERR; /* Initially assume a successful return value */
 
  fin1 = fopen("some_file", "r");
  if (fin1 == NULL) {
    ret_val = errno;
  }
  else {                                        // fin1 is open
    fin2 = fopen("some_other_file", "r");
    if (fin2 == NULL) {
      ret_val = errno;
    }
    else {                                    // fin2 is open
      obj = malloc(sizeof(object_t));
      if (obj == NULL) {
        ret_val = errno;
      }
      else {                                // obj allocated 
      
        /* ... More code ... */
        
        free(obj);                          // obj deallocated
      }
      fclose(fin2);                           // fin2 is closed
    }
    fclose(fin1);                               // fin1 is closed
  }

  return ret_val;
}

Now granted, this has an upper bound: the amount of indentation. In fact, the CERT-C site goes on to illustrate a rather extreme case, with a 17-goto instance drawn from the Linux kernel. All well and good. But it's brittle. And an extreme outlier (although there are an extraordinary number of megamoth functions in Linux). For such extreme cases, exceptions and extreme caution are the watchwords. But for more pedestrian (and much more common) instances it makes sense to utilize the compiler to its fullest extent. I'm not a willing vassal of the hobgoblins of foolish consistency.

In a way, a limit like indentation is also a good thing. Safe coding also strongly discourages complex functions, with metrics such as Cyclomatic Complexity (roughly, the number of branches in your code) frequently being restricted to a very low level. Deep indentation is frequently a reminder that you're getting close to this ceiling, and should apply some effort to re-expressing your solution, lest it turn into a tangled, unmanageable megamoth.
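One possible way to re-express the nested version when the indentation starts to pile up is to give each resource its own small function, so that every function acquires one thing, delegates, and releases it. The sketch below is only an illustration under stated assumptions: the function names are invented, the file names are passed in as parameters, plain int stands in for errno_t, object_t is a placeholder struct, and errno is assumed to be set when fopen fails (as it is on POSIX systems).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct { int value; } object_t;   /* placeholder: the real type is not shown */

/* Innermost step: both files are already open; this function owns only obj. */
static int with_object(FILE *fin1, FILE *fin2) {
    int ret_val = 0;
    object_t *obj = malloc(sizeof *obj);
    (void)fin1;                            /* unused in this sketch */
    (void)fin2;
    if (obj == NULL) {
        ret_val = ENOMEM;                  /* POSIX error code for allocation failure */
    }
    else {
        /* ... More code ... */
        free(obj);                         /* obj deallocated */
    }
    return ret_val;
}

/* Middle step: fin1 is already open; this function owns only fin2. */
static int with_second_file(FILE *fin1, const char *path2) {
    int ret_val = 0;
    FILE *fin2 = fopen(path2, "r");
    if (fin2 == NULL) {
        ret_val = errno;
    }
    else {
        ret_val = with_object(fin1, fin2);
        fclose(fin2);                      /* fin2 is closed */
    }
    return ret_val;
}

/* Outer step: owns only fin1. */
int do_something_split(const char *path1, const char *path2) {
    int ret_val = 0;
    FILE *fin1 = fopen(path1, "r");
    if (fin1 == NULL) {
        ret_val = errno;
    }
    else {
        ret_val = with_second_file(fin1, path2);
        fclose(fin1);                      /* fin1 is closed */
    }
    return ret_val;
}
```

Each function stays one level deep and has a single exit, and the acquire/release pairing is still enforced by the braces rather than by the order of labels. The cost is more function signatures to read, so this is a trade-off rather than a rule.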
