Chapter 2. Lexical Structure

The character set used in source files is defined by the Sather implementation, but it must include at least the characters that appear in the syntactic constructs in this specification. Sather implementations may be based on ASCII, but this is not required. The case of characters in source files is significant. All syntactic constructs except identifiers and certain literals may be separated by an arbitrary number of whitespace characters and comments. The seven whitespace characters are space, tab, newline, vertical tab, backspace, carriage return, and form feed. A Sather comment begins with two dashes '--' that occur outside of a string literal (see String literal expressions) or character literal (see Character literal expressions) and extends through all following text up to the next newline.
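The comment rule can be illustrated with a small Python sketch. It is not part of the specification: the name strip_comment is hypothetical, and the sketch assumes that string literals are delimited by double quotes, character literals by single quotes, and that a backslash inside either escapes the following character. The literal sections define the actual forms.

```python
# The seven whitespace characters listed in the text above.
WHITESPACE = {" ", "\t", "\n", "\v", "\b", "\r", "\f"}


def strip_comment(line: str) -> str:
    """Return `line` with a trailing '--' comment removed.

    Assumption (not from this chapter): literals are delimited by " or ',
    and a backslash inside a literal escapes the following character.
    """
    quote = None  # delimiter of the literal we are currently inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the backslash and the escaped character
                continue
            if ch == quote:
                quote = None  # the literal ends here
        elif ch in ('"', "'"):
            quote = ch  # a literal starts here
        elif line.startswith("--", i):
            return line[:i]  # the comment runs to the end of the line
        i += 1
    return line
```

Under these assumptions, dashes inside a literal are left alone, and only the first '--' found outside any literal starts a comment.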

Sather identifiers are used to name class features, method arguments, and local variables. Most consist of letters, decimal digits, and the underscore character, and begin with a letter. Iterator names additionally end with the '!' character. Abstract type names and class names are similar, but the letters must be uppercase and abstract type names begin with '$'. There are no restrictions on the lengths of Sather identifiers or class names. Identifiers, class names, and keywords must be followed by a character other than a letter, decimal digit, or underscore. This may force the use of whitespace after an identifier.

identifier ==>
        letter { letter | decimal_digit | _ }
uppercase_identifier ==>
        uppercase_letter { uppercase_letter | decimal_digit | _ }
abstract_class_name ==>
        $ uppercase_identifier
iter_name ==>
        [ identifier ] !
letter ==>
        lowercase_letter | uppercase_letter
lowercase_letter ==>
        a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
uppercase_letter ==>
        A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
decimal_digit ==>
        0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
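
The productions above can be transcribed into regular expressions, as in the following Python sketch. The function name classify and the order in which productions are tried are choices made for this illustration only. The sketch assumes that '{ ... }' denotes zero or more repetitions and '[ ... ]' an optional element, and it uses the ASCII letters and digits enumerated above.

```python
import re

# Regular expressions transcribed from the productions above.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
UPPERCASE_IDENTIFIER = re.compile(r"[A-Z][A-Z0-9_]*")
ABSTRACT_CLASS_NAME = re.compile(r"\$[A-Z][A-Z0-9_]*")
ITER_NAME = re.compile(r"(?:[A-Za-z][A-Za-z0-9_]*)?!")


def classify(token: str) -> str:
    """Name the most specific production that matches the whole token."""
    if ABSTRACT_CLASS_NAME.fullmatch(token):
        return "abstract_class_name"
    if ITER_NAME.fullmatch(token):
        return "iter_name"
    # Every uppercase_identifier is also an identifier, so test it first.
    if UPPERCASE_IDENTIFIER.fullmatch(token):
        return "uppercase_identifier"
    if IDENTIFIER.fullmatch(token):
        return "identifier"
    return "no match"
```

The sketch does not exclude keywords. A scanner that applies IDENTIFIER.match greedily consumes every trailing letter, digit, and underscore, which is one way to satisfy the rule that the character following an identifier must be something else.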

Sather keywords are used to identify the fundamental syntactic constructs and may not be used as identifiers. Some keywords are reserved for language extensions (see Sather 1.1 Extensions). The keywords are:

keyword ==>
        abstract | and | any | assert | attr | bind | break! | builtin | case | class | clusters | clusters! | cohort | const | else | elsif | end | exception | external | false | far | fork | guard | if | immutable | inout | include | initial | is | ITER | lock | loop | near | new | once | or | out | par | parloop | post | pre | private | protect | quit | raise | readonly | result | return | ROUT | SAME | self | shared | sync | then | true | typecase | unlock | until! | void | when | while! | with | yield
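
As an illustration of how a tool might apply this list, the following Python sketch stores the keywords in a set and tests tokens against it. The names KEYWORDS and is_keyword are hypothetical. The comparison is case-sensitive, in line with the rule that the case of characters is significant.

```python
# The keywords from the production above, including those ending in '!'.
KEYWORDS = frozenset("""
abstract and any assert attr bind break! builtin case class clusters
clusters! cohort const else elsif end exception external false far fork
guard if immutable inout include initial is ITER lock loop near new once
or out par parloop post pre private protect quit raise readonly result
return ROUT SAME self shared sync then true typecase unlock until! void
when while! with yield
""".split())


def is_keyword(token: str) -> bool:
    """Return True if `token` is exactly one of the listed keywords."""
    return token in KEYWORDS
```

With this sketch, is_keyword("ITER") is true while is_keyword("iter") is false, because only the uppercase spelling appears in the list.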

The syntax also makes use of the following special symbols:

special_symbol ==>
        ( | ) | [ | ] | { | } | , | . | ; | : | $ | _ | + | - | * | / | = | < | > | # | ^ | % | ~ | | | ! | /= | <= | >= | := | :: | -> | @ | :-
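
Several special symbols are prefixes of others (for example ':' of ':=', '::', and ':-'). This chapter does not state how a scanner should choose between them. The Python sketch below assumes the common longest-match convention, and the names SPECIAL_SYMBOLS and match_symbol are hypothetical.

```python
# The special symbols from the production above.
SPECIAL_SYMBOLS = [
    "(", ")", "[", "]", "{", "}", ",", ".", ";", ":", "$", "_",
    "+", "-", "*", "/", "=", "<", ">", "#", "^", "%", "~", "|", "!",
    "/=", "<=", ">=", ":=", "::", "->", "@", ":-",
]

# Assumption: try longer symbols first so ":=" is not read as ":" then "=".
_BY_LENGTH = sorted(SPECIAL_SYMBOLS, key=len, reverse=True)


def match_symbol(text: str, pos: int = 0):
    """Return the longest special symbol starting at `pos`, or None."""
    for sym in _BY_LENGTH:
        if text.startswith(sym, pos):
            return sym
    return None
```

The sketch looks only at special symbols. A full scanner would first recognize comments and literals, so that '--' is treated as the start of a comment and not as two '-' symbols.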