21.4.2 Why Two Passes

The need for two passes was not immediately evident during the design and implementation of the FFE code that was to produce GBEL. Only after a few kludges had been implemented, to handle things like the incorrectly guessed nature of ASSIGN labels, did enough evidence accumulate to make it clear that std.c had to be introduced to intercept, save, and then revisit, as part of a second pass, the digested contents of a program unit.

Other such missteps have occurred during the evolution of the FFE, because of the different goals of the FFE and the GBE.

Because the GBE's original, and still primary, goal was to directly support the GNU C language, the GBEL, and the GBE itself, require more complexity on the part of most front ends than they require of gcc's.

For example, the GBEL offers an interface that permits the gcc front end to implement most, or all, of the language features it supports, without the front end having to make use of non-user-defined variables. (It's almost certainly the case that all of K&R C, and probably ANSI C as well, is handled by the gcc front end without declaring such variables.)

The FFE, on the other hand, must resort to a variety of “tricks” to achieve its goals.

Consider the following C code:

     int
     foo (int a, int b)
     {
       int c = 0;
     
       if ((c = bar (c)) == 0)
         goto done;
     
       quux (c << 1);
     
     done:
       return c;
     }

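Note what kinds of objects are declared, or defined, before their use, and before any actual code generation involving them would normally take place: the return type of the function (`int'), its entry point (`foo'), its dummy arguments (`a' and `b'), its variables (`c'), and the initial values of those variables (`0').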

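Whereas, the following items can, and do, suddenly appear “out of the blue” in C: label references (`goto done' precedes the definition of `done') and function references (`bar' and `quux' are called with no prior declaration in sight).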

Not surprisingly, the GBE faithfully permits the latter set of items to be “discovered” partway through GBEL “programs”, just as they are permitted to in C.

Yet the GBE has tended, at least in the past, to be reluctant to fully support similar “late” discovery of items in the former set.

This makes Fortran a poor fit for the “safe” subset of GBEL. Consider:

           FUNCTION X (A, ARRAY, ID1)
           CHARACTER*(*) A
           DOUBLE PRECISION X, Y, Z, TMP, EE, PI
           REAL ARRAY(ID1*ID2)
           COMMON ID2
           EXTERNAL FRED
     
           ASSIGN 100 TO J
           CALL FOO (I)
           IF (I .EQ. 0) PRINT *, A(0)
           GOTO 200
     
           ENTRY Y (Z)
           ASSIGN 101 TO J
     200   PRINT *, A(1)
           READ *, TMP
           GOTO J
     100   X = TMP * EE
           RETURN
     101   Y = TMP * PI
           CALL FRED
           DATA EE, PI /2.71D0, 3.14D0/
           END

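Here are some observations about the above code, which, while somewhat contrived, conforms to the FORTRAN 77 and Fortran 90 standards:

   * The return type of function `X' is not known until the `DOUBLE PRECISION' statement appears.

   * Whether `A' is a variable or a function is not known until the `PRINT *, A(0)' statement references it with an argument list.

   * The bounds of `ARRAY' depend on the dummy argument `ID1' and on `ID2', which is not known to be a member of blank common until the `COMMON' statement that follows.

   * Whether `Y' and `Z' are local variables, an additional entry point, or a dummy argument of an additional entry point is not known until the `ENTRY' statement appears.

   * The initial values of `EE' and `PI' are not known until the `DATA' statement, which appears after all of the executable code that uses them.

   * Whether `FRED' is a subroutine or a function is not known until the `CALL FRED' statement appears.

   * Whether labels `100' and `101' belong to executable statements or to `FORMAT' statements is not known when the `ASSIGN' statements that reference them are seen, only when the labels are later defined.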

Very few of these “discoveries” can be accommodated by the GBE as it has evolved over the years. The GBEL doesn't support several of them, and those it might appear to support don't always work properly, especially in combination with other GBEL and GBE features, as implemented in the GBE.

(Had the GBE and its GBEL originally evolved to support g77, the shoe would be on the other foot, so to speak: most, if not all, of the above would be directly supported by the GBEL, and a few C constructs that are supported in reality probably would not be. Both this mythical GBE and today's real one cater to their GBEL by sometimes scrambling around, cleaning up after themselves, after discovering that assumptions made earlier during code generation are incorrect. That's not a great design, since it implies significant code paths that might be rarely tested yet are used in some key production environments.)

So, the FFE handles these discrepancies—between the order in which it discovers facts about the code it is compiling, and the order in which the GBEL and GBE support such discoveries—by performing what amounts to two passes over each program unit.

(A few ambiguities can remain at that point, such as whether, given `EXTERNAL BAZ' and no other reference to `BAZ' in the program unit, it is a subroutine, a function, or a block-data—which, in C-speak, governs its declared return type. Fortunately, these distinctions are easily finessed for the procedure, library, and object-file interfaces supported by g77.)
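
The two-pass idea described above can be illustrated with a minimal sketch in C. It is not taken from std.c or any other FFE source: the types `struct stmt', `enum stmt_kind', and `enum label_nature', the functions `collect_labels' and `resolve_assigns', and the bound `MAX_LABEL' are all hypothetical names invented for this illustration. The sketch assumes a program unit has already been saved as an array of statements, and shows how a first pass can record the nature of each label so that a second pass never has to guess what an ASSIGN refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a saved program unit; none of these
   names come from the FFE sources.  */
enum stmt_kind { STMT_ASSIGN, STMT_FORMAT, STMT_EXEC };
enum label_nature { LABEL_UNKNOWN, LABEL_FORMAT, LABEL_EXEC };

#define MAX_LABEL 1000   /* assumed upper bound on label numbers */

struct stmt
{
  enum stmt_kind kind;
  int label;    /* label defined on this statement, or 0 */
  int target;   /* label referenced by an ASSIGN, or 0 */
};

/* Pass 1: walk the saved statements and record what each label turns
   out to be at the point where it is defined.  */
void
collect_labels (const struct stmt *unit, size_t n, enum label_nature *nature)
{
  for (size_t i = 0; i < n; i++)
    {
      if (unit[i].label == 0)
        continue;
      assert (unit[i].label > 0 && unit[i].label < MAX_LABEL);
      nature[unit[i].label]
        = (unit[i].kind == STMT_FORMAT) ? LABEL_FORMAT : LABEL_EXEC;
    }
}

/* Pass 2: revisit the same statements.  Every ASSIGN can now look up the
   nature of its target instead of guessing.  Stores up to CAP results in
   OUT and returns how many were stored.  */
size_t
resolve_assigns (const struct stmt *unit, size_t n,
                 const enum label_nature *nature,
                 enum label_nature *out, size_t cap)
{
  size_t count = 0;

  for (size_t i = 0; i < n && count < cap; i++)
    {
      if (unit[i].kind != STMT_ASSIGN)
        continue;
      assert (unit[i].target > 0 && unit[i].target < MAX_LABEL);
      out[count++] = nature[unit[i].target];
    }
  return count;
}
```

If the Fortran example above is reduced to this model, its two ASSIGN statements target labels 100 and 101, both of which are later defined on executable statements, so the second pass would report both targets as executable labels. A one-pass design would have had to guess at each ASSIGN and patch things up afterward.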