Language elements

by Michael Metcalf / CERN CN-ASD

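The basic components of the Fortran language are the members of its character set:

     the letters A ... Z and a ... z (which are equivalent outside a character context);
     the numerals 0 ... 9;
     the underscore _; and
     the special characters  =  :  +  blank  -  *  /  (  )  ,  .  $  '  !  "  %  &  ;  <  >  ?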

From these components, we build the tokens that have a syntactic meaning to the compiler. There are six classes of token:
     Label:      123
     Constant:   123.456789_long
     Keyword:    ALLOCATABLE
     Operator:   .add.
     Name:       solve_equation (up to 31 characters, including _)
     Separator:  /   (   )   (/   /)   ,   =   =>   :   ::   ;   %

From the tokens, we can build statements. These can be coded using the new free source form, which does not require positioning in a rigid column structure:

FUNCTION string_concat(s1, s2)                ! This is a comment
   TYPE (string), INTENT(IN) :: s1, s2
   TYPE (string) string_concat
   string_concat%string_data = s1%string_data(1:s1%length) // &
      s2%string_data(1:s2%length)             ! This is a continuation
   string_concat%length = s1%length + s2%length
END FUNCTION string_concat
Note the trailing comments and the trailing continuation mark. A statement may have up to 39 continuation lines, and a line may hold up to 132 characters. Blanks are significant. Where a token or character constant is split across two lines:
               ...        start_of&
        &_name
               ...   'a very long &
        &string'
a leading & on the continued line is also required.
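As a minimal sketch of these continuation rules, the two fragments above can be placed in a complete program. The program name and the variables start_of_name and text are chosen here purely for illustration.

```fortran
PROGRAM continuation_demo
   IMPLICIT NONE
   INTEGER           :: start_of_name    ! illustrative name
   CHARACTER(LEN=18) :: text             ! illustrative name

   ! A name split across two lines: the first line ends with & and the
   ! continued line starts with &, so the two halves form one token.
   start_of&
        &_name = 42

   ! A character constant split in the same way.
   text = 'a very long &
        &string'

   PRINT *, start_of_name                ! this is a trailing comment
   PRINT *, text
END PROGRAM continuation_demo
```

The leading & matters here: without it, the blanks at the start of the continued line would become part of the statement, breaking the name in two and padding the string.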

Automatic conversion of source form for existing programs can be carried out by convert.f90. Its options are:

Fortran has five intrinsic data types. For each there is a corresponding form of literal constant. For the three numeric intrinsic types they are:

INTEGER

        1   0   -999   32767   +10
for the default kind; but we may also define, for instance for a desired range of -10**4 to +10**4, a named constant, say two_bytes:
        INTEGER, PARAMETER :: two_bytes = SELECTED_INT_KIND(4)
that allows us to define constants of the form
        -1234_two_bytes   +1_two_bytes
Here, two_bytes is the kind type parameter; it can also be a default integer literal constant, like
        -1234_2
but use of an explicit literal constant would be non-portable.

The KIND function supplies the value of a kind type parameter:

        KIND(1)            KIND(1_two_bytes)
and the RANGE function supplies the actual decimal range (so the user must make the actual mapping to bytes):
        RANGE(1_two_bytes)
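The following sketch gathers the integer kind facilities into one small program. The kind values printed are processor dependent, so no particular output should be assumed; the only guarantee is that the decimal range is at least the 4 requested. The variable small is introduced here for illustration.

```fortran
PROGRAM integer_kinds
   IMPLICIT NONE
   ! Ask for a kind able to hold at least -10**4 to +10**4.
   INTEGER, PARAMETER      :: two_bytes = SELECTED_INT_KIND(4)
   INTEGER(KIND=two_bytes) :: small      ! illustrative variable

   small = -1234_two_bytes

   PRINT *, 'kind of default integer: ', KIND(1)
   PRINT *, 'kind of two_bytes:       ', KIND(1_two_bytes)
   PRINT *, 'decimal range:           ', RANGE(1_two_bytes)
   PRINT *, 'largest value:           ', HUGE(small)
   PRINT *, 'small =                  ', small
END PROGRAM integer_kinds
```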

Also, in DATA statements, binary, octal and hexadecimal constants may be used:

        B'01010101'   O'01234567'   Z'10fa'
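A short sketch of how such constants might appear in a DATA statement follows; the variable names b, o and z are illustrative. It assumes the default integer kind is wide enough for the octal value, which holds for the usual 32-bit default.

```fortran
PROGRAM boz_demo
   IMPLICIT NONE
   INTEGER :: b, o, z                    ! illustrative names

   ! Binary, octal and hexadecimal constants initializing integers.
   DATA b / B'01010101' /, o / O'01234567' /, z / Z'10fa' /

   ! The decimal equivalents are 85, 342391 and 4346.
   PRINT *, b, o, z
END PROGRAM boz_demo
```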

REAL

There are at least two real kinds - the default, and one with greater precision (this replaces DOUBLE PRECISION). We might specify
        INTEGER, PARAMETER :: long = SELECTED_REAL_KIND(9, 99)
for at least 9 decimal digits of precision and a range of 10**(-99) to 10**99, allowing
        1.7_long
Also, we have the intrinsic functions
        KIND(1.7_long)   PRECISION(1.7_long)   RANGE(1.7_long)
that give in turn the kind type value, the actual precision (here at least 9), and the actual range (here at least 99).
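Put together as a small program, this might look as follows. It is a sketch only: the values printed depend on the processor, subject to the lower bounds of 9 and 99 requested. The variable x is illustrative.

```fortran
PROGRAM real_kinds
   IMPLICIT NONE
   ! At least 9 decimal digits and a decimal exponent range of at least 99.
   INTEGER, PARAMETER :: long = SELECTED_REAL_KIND(9, 99)
   REAL(KIND=long)    :: x               ! illustrative variable

   x = 1.7_long

   PRINT *, 'kind value:        ', KIND(x)
   PRINT *, 'decimal precision: ', PRECISION(x)
   PRINT *, 'decimal range:     ', RANGE(x)
   PRINT *, 'x =                ', x
END PROGRAM real_kinds
```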

COMPLEX

This data type is built of two integer or real components:
        (1, 3.7_long)
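As an illustration, the constant above can be assigned to a complex variable of kind long and taken apart again with the intrinsic functions REAL and AIMAG. The variable c is introduced here for the sketch. When one part is an integer and the other real, the constant takes the kind of the real part.

```fortran
PROGRAM complex_demo
   IMPLICIT NONE
   INTEGER, PARAMETER :: long = SELECTED_REAL_KIND(9, 99)
   COMPLEX(KIND=long) :: c               ! illustrative variable

   ! The integer part 1 is converted to the kind of the real part.
   c = (1, 3.7_long)

   PRINT *, 'real part:      ', REAL(c)
   PRINT *, 'imaginary part: ', AIMAG(c)
   PRINT *, 'kind is long?   ', KIND(c) == long
END PROGRAM complex_demo
```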
The forms of literal constants for the two non-numeric data types are:

CHARACTER

        'A string'   "Another"   'A "quote"'   ''
(the last being a null string). Other kinds are allowed, especially for support of non-European languages:
        2_'   '
and again the kind value is given by the KIND function:
        KIND('ASCII')

LOGICAL

Here, there may also be different kinds (to allow for packing into bits):
        .FALSE.   .true._one_bit
and the KIND function operates as expected:
        KIND(.TRUE.)
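Since the non-default character and logical kinds available vary from one processor to another, the sketch below stays with the default kinds, captured in the illustrative named constants ck and lk. It shows the two placements of the kind parameter: in front of a character constant, but after a logical one.

```fortran
PROGRAM char_logical_kinds
   IMPLICIT NONE
   ! Named constants holding the default kinds (illustrative names).
   INTEGER, PARAMETER :: ck = KIND('A'), lk = KIND(.TRUE.)
   CHARACTER(LEN=9, KIND=ck) :: s
   LOGICAL(KIND=lk)          :: flag

   s    = ck_'A "quote"'     ! kind parameter precedes a character constant
   flag = .FALSE._lk         ! kind parameter follows a logical constant

   PRINT *, 'character kind: ', ck, '  logical kind: ', lk
   PRINT *, 'length of null string: ', LEN('')
   PRINT *, s, flag
END PROGRAM char_logical_kinds
```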

The numeric types are based on model numbers with associated inquiry functions (whose values are independent of the values of their arguments):

     DIGITS(X)               Number of significant digits
     EPSILON(X)              Almost negligible compared to one (real)
     HUGE(X)                 Largest number
     MAXEXPONENT(X)          Maximum model exponent (real)
     MINEXPONENT(X)          Minimum model exponent (real)
     PRECISION(X)            Decimal precision (real, complex)
     RADIX(X)                Base of the model
     RANGE(X)                Decimal exponent range
     TINY(X)                 Smallest positive number (real)
These functions are important for portable numerical software.
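A minimal sketch of the inquiry functions applied to the default real and integer kinds is given below. The arguments x and i are illustrative and are deliberately left undefined, since only their type and kind matter. All values printed are processor dependent.

```fortran
PROGRAM model_inquiry
   IMPLICIT NONE
   REAL    :: x      ! never assigned: only type and kind are inquired
   INTEGER :: i

   PRINT *, 'DIGITS:      ', DIGITS(x), DIGITS(i)
   PRINT *, 'RADIX:       ', RADIX(x), RADIX(i)
   PRINT *, 'RANGE:       ', RANGE(x), RANGE(i)
   PRINT *, 'HUGE:        ', HUGE(x), HUGE(i)
   PRINT *, 'EPSILON:     ', EPSILON(x)
   PRINT *, 'TINY:        ', TINY(x)
   PRINT *, 'PRECISION:   ', PRECISION(x)
   PRINT *, 'MAXEXPONENT: ', MAXEXPONENT(x)
   PRINT *, 'MINEXPONENT: ', MINEXPONENT(x)
END PROGRAM model_inquiry
```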

We can specify scalar variables corresponding to the five intrinsic types:

        INTEGER(KIND=2) i
        REAL(KIND=long) a
        COMPLEX         current
        LOGICAL         Pravda
        CHARACTER(LEN=20) word
        CHARACTER(LEN=2, KIND=Kanji) kanji_word
where the optional KIND parameter specifies a non-default kind, and the LEN= specifier replaces the *len form. The explicit KIND= and LEN= keywords are optional:
        CHARACTER(2, Kanji) kanji_word
works just as well.

For derived data types we must first define the form of the type:

        TYPE person
           CHARACTER(10) name
           REAL          age
        END TYPE person
and then create structures of that type:
        TYPE(person) you, me
To select components of a derived type, we use the % qualifier:
        you%age
and the form of a literal constant of a derived type is shown by:
        you = person('Smith', 23.5)
which is known as a structure constructor. Definitions may refer to a previously defined type:
        TYPE point
           REAL x, y
        END TYPE point
        TYPE triangle
           TYPE(point) a, b, c
        END TYPE triangle
and for a variable of type triangle, as in
        TYPE(triangle) t
we have components of type point:
        t%a   t%b   t%c
which, in turn, have ultimate components of type real:
        t%a%x   t%a%y   t%b%x   etc.
We note that the % qualifier was chosen rather than . because of ambiguity difficulties.
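The derived-type fragments above can be combined into one small program, sketched here. The coordinates given to the triangle are arbitrary illustrative values; note that structure constructors may be nested in the same way as the types themselves.

```fortran
PROGRAM derived_types
   IMPLICIT NONE
   TYPE person
      CHARACTER(10) name
      REAL          age
   END TYPE person
   TYPE point
      REAL x, y
   END TYPE point
   TYPE triangle
      TYPE(point) a, b, c
   END TYPE triangle

   TYPE(person)   :: you
   TYPE(triangle) :: t

   you = person('Smith', 23.5)                       ! structure constructor
   ! Nested constructors; the coordinates are arbitrary.
   t = triangle(point(0.0, 0.0), point(1.0, 0.0), point(0.0, 1.0))

   PRINT *, you%name, you%age
   PRINT *, t%b%x, t%c%y                             ! ultimate components
END PROGRAM derived_types
```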

Arrays are considered to be variables in their own right. Given

        REAL a(10)
        INTEGER, DIMENSION(0:100, -50:50) :: map
(the latter an example of the syntax that allows grouping of attributes to the left of :: and of variables sharing the attributes to the right), we have two arrays whose elements are in array element order (column major), but not necessarily in contiguous storage. Elements are, for example,
        a(1)               a(i*j)
and are scalars. The subscripts may be any scalar integer expression. Sections are
        a(i:j)               ! rank one
        map(i:j, k:l:m)      ! rank two
        a(map(i, k:l))       ! vector subscript
        a(3:2)               ! zero length
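To make the section notation concrete, here is a sketch with specific values assigned to the subscripts. The values stored in a and map, and the choices for i, j, k, l and m, are all illustrative.

```fortran
PROGRAM array_sections
   IMPLICIT NONE
   REAL    :: a(10)
   INTEGER, DIMENSION(0:100, -50:50) :: map
   INTEGER :: i, j, k, l, m, n

   DO n = 1, 10
      a(n) = REAL(n)                     ! a = 1.0, 2.0, ..., 10.0
   END DO
   map = 1
   map(0, 0:2) = (/ 2, 4, 6 /)           ! values used as subscripts below

   i = 0;  j = 3;  k = 0;  l = 2;  m = 2

   PRINT *, a(2:4)                       ! rank one: elements 2, 3 and 4
   PRINT *, SHAPE(map(i:j, k:l:m))       ! rank two, with shape 4 by 2
   PRINT *, a(map(i, k:l))               ! vector subscript: a(2), a(4), a(6)
   PRINT *, SIZE(a(3:2))                 ! zero length: size is 0
END PROGRAM array_sections
```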
Whole arrays and array sections are array-valued objects. Array-valued constants (constructors) are available:
        (/ 1, 2, 3, 4, 5 /)
        (/ (i, i = 1, 9, 2) /)
        (/ ( (/ 1, 2, 3 /), i = 1, 10) /)
        (/ (0, i = 1, 100) /)
        (/ (0.1*i, i = 1, 10) /)
making use of the implied-DO loop notation familiar from I/O lists. A derived data type may, of course, contain array components:
        TYPE triplet
           REAL, DIMENSION(3) :: vertex
        END TYPE triplet
        TYPE(triplet), DIMENSION(4) :: t
so that
       t(2)           is a scalar (a structure)
       t(2)%vertex    is an array component of a scalar
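The constructors and the array component can be tried out together, as in the following sketch. The arrays odd, rep and tenths are illustrative names, and the nested constructor is repeated only twice here to keep the output short.

```fortran
PROGRAM constructors
   IMPLICIT NONE
   TYPE triplet
      REAL, DIMENSION(3) :: vertex
   END TYPE triplet
   TYPE(triplet), DIMENSION(4) :: t
   INTEGER :: i
   INTEGER :: odd(5), rep(6)             ! illustrative names
   REAL    :: tenths(10)

   odd    = (/ (i, i = 1, 9, 2) /)              ! 1 3 5 7 9
   rep    = (/ ( (/ 1, 2, 3 /), i = 1, 2) /)    ! 1 2 3 1 2 3
   tenths = (/ (0.1*i, i = 1, 10) /)

   DO i = 1, 4
      t(i)%vertex = (/ 1.0, 2.0, 3.0 /) * i
   END DO

   PRINT *, odd
   PRINT *, rep
   PRINT *, tenths
   PRINT *, t(2)%vertex                  ! array component of a scalar
END PROGRAM constructors
```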

There are some other interesting character extensions. Just as a substring of an array element, as in

        CHARACTER(80), DIMENSION(60) :: page
        ... = page(j)(i:i)         ! substring
was already possible, so now are the substrings
        '0123456789'(i:i)
        you%name(1:2)
Also, zero-length strings are allowed:
        page(j)(i:i-1)       ! zero-length string
Finally, there are some new intrinsic character functions:
      ACHAR                 IACHAR   (for ASCII set)
      ADJUSTL               ADJUSTR
      LEN_TRIM              INDEX(s1, s2, BACK=.TRUE.)
      REPEAT                SCAN     (for one of a set)
      TRIM                  VERIFY   (for all of a set)
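A brief sketch exercising these substrings and functions follows. The variable word and the test strings are illustrative; the comments give the values expected, though the exact spacing of list-directed output varies between compilers.

```fortran
PROGRAM character_demo
   IMPLICIT NONE
   CHARACTER(LEN=10) :: word             ! illustrative variable
   INTEGER :: i

   word = '  abc'                        ! padded on the right with blanks
   i = 4

   PRINT *, '0123456789'(i:i)                     ! substring of a constant: 3
   PRINT *, LEN(word(i:i-1))                      ! zero-length string: 0
   PRINT *, '[' // ADJUSTL(word) // ']'           ! [abc       ]
   PRINT *, '[' // ADJUSTR(word) // ']'           ! [       abc]
   PRINT *, '[' // TRIM(word) // ']'              ! [  abc]
   PRINT *, LEN_TRIM(word)                        ! 5
   PRINT *, INDEX('banana', 'an', BACK=.TRUE.)    ! last occurrence: 4
   PRINT *, REPEAT('ab', 3)                       ! ababab
   PRINT *, SCAN('hello world', 'ow')             ! first of o or w: 5
   PRINT *, VERIFY('12a45', '0123456789')         ! first non-digit: 3
   PRINT *, IACHAR('A'), ACHAR(97)                ! 65 and a
END PROGRAM character_demo
```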


M.G. (October 19th 1995)