The lexical syntax determines how a character sequence is split into a sequence of lexemes, omitting non-significant portions such as comments and whitespace. The character sequence is assumed to be text according to the Unicode standard. Some lexemes of the lexical syntax, such as identifiers, representations of number objects, and strings, are syntactic data in the datum syntax, and thus represent objects. Besides the formal account of the syntax, this section also describes what datum values are represented by these syntactic data.
The lexical syntax, in the description of comments, contains a forward
reference to datum, which is described as part of the datum
syntax. Being comments, however, these datums do not play a
significant role in the syntax.
Case is significant except in representations of booleans, number
objects, and in hexadecimal numbers specifying Unicode scalar values.
For example, #x1A and #X1a are equivalent. The identifier
Foo is, however, distinct from the identifier FOO.
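The case rules above can be checked directly at a REPL. This is a minimal sketch, assuming a reader that follows the conventions described here; the symbol names are only illustrative.

```scheme
;; Radix prefixes and hexadecimal digits ignore case,
;; so both literals denote the same number.
(display (= #x1A #X1a))      ; prints #t
(newline)

;; Identifiers keep their case, so these are two different symbols.
(display (eq? 'Foo 'FOO))    ; prints #f
(newline)
```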
Interlexeme-space may occur on either side of any lexeme, but not
within a lexeme.
Identifiers, the dot (.), numbers, characters, and booleans must be
terminated by a delimiter or by the end of the input.
Line endings are significant in Scheme in single-line comments
and within string literals.
In Scheme source code, any of the line endings in line-ending
marks the end of a line. Moreover, the two-character line endings
carriage-return linefeed and carriage-return next-line each count
as a single line ending.
In a string literal, a line-ending not preceded by a \
stands for a linefeed character, which is the standard line-ending
character of Scheme.
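The string-literal rule can be illustrated as follows. This is a sketch only: the variable names are hypothetical, and the second half assumes the usual R6RS/R7RS behavior in which a backslash immediately before a line ending drops the line ending and the indentation that follows it.

```scheme
;; A string literal that spans two source lines contains one
;; linefeed (U+000A) at the break, whatever line ending the file uses.
(define two-lines "ab
cd")
(display (string-length two-lines))                  ; prints 5
(newline)
(display (char->integer (string-ref two-lines 2)))   ; prints 10
(newline)

;; Assumption: a backslash before the line ending suppresses the
;; linefeed, together with the leading whitespace on the next line.
(define one-line "ab\
    cd")
(display (string-length one-line))                   ; prints 4
(newline)
```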
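whitespace ::= character-tabulation | linefeed | line-tabulation
         | form-feed | carriage-return | next-line
         | any character whose category is Zs, Zl, or Zp
comment ::= ; all subsequent characters up to a line-ending
                or paragraph-separator
         | nested-comment
         | #; interlexeme-space datum
comment-text ::= character sequence not containing #| or |#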
Whitespace characters are spaces, linefeeds, carriage returns, character tabulations, form feeds, line tabulations, and any other character whose category is Zs, Zl, or Zp. Whitespace is used for improved readability and as necessary to separate lexemes from each other. Whitespace may occur between any two lexemes, but not within a lexeme. Whitespace may also occur inside a string, where it is significant.
The lexical syntax includes several comment forms. In all cases, comments are invisible to Scheme, except that they act as delimiters, so, for example, a comment cannot appear in the middle of an identifier or representation of a number object.
A semicolon (;) indicates the start of a line comment. The
comment continues to the end of the line on which the semicolon appears.
Another way to indicate a comment is to prefix a datum with #;,
possibly with interlexeme-space before the datum. The comment consists
of the comment prefix #; and the datum together. This
notation is useful for “commenting out” sections of code.
Block comments may be indicated with properly nested #| and |# pairs.
#|
   The FACT procedure computes the factorial
   of a non-negative integer.
|#
(define fact
  (lambda (n)
    ;; base case
    (if (= n 0)
        #;(= n 1)
        1       ; identity of *
        (* n (fact (- n 1))))))
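The datum and block comment forms can also be seen in isolation. The following is a small sketch with made-up expressions, showing that #; hides exactly one datum and that block comments nest.

```scheme
;; #; removes exactly one datum: the 2 is skipped by the reader.
(display (list 1 #;2 3))                    ; prints (1 3)
(newline)

;; Block comments nest, so the inner |# does not end the outer comment.
(display (+ 1 #| outer #| inner |# still a comment |# 2))   ; prints 3
(newline)

;; #; also accepts interlexeme space before the datum it hides.
(display (list 'a #; (ignored form) 'b))    ; prints (a b)
(newline)
```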
letter ::= a | b | c | ... | z
         | A | B | C | ... | Z
constituent ::= letter
         | any character whose Unicode scalar value is greater than
             127, and whose category is Lu, Ll, Lt, Lm, Lo, Mn,
             Nl, No, Pd, Pc, Po, Sc, Sm, Sk, So, or Co
subsequent ::= initial | digit
         | any character whose category is Nd, Mc, or Me
::= any character except
::= any character except
Most identifiers allowed by other programming languages are also
acceptable to Scheme. In general, a sequence of letters, digits, and
“extended alphabetic characters” is an identifier when it begins with
a character that cannot begin a representation of a number object. In
addition, +, -, and ... are identifiers, as is a
sequence of letters, digits, and extended alphabetic characters that
begins with the two-character sequence ->. Here are some
examples of identifiers:
lambda q soup list->vector + V17a <= a34kTMNs ->- the-word-recursion-has-many-meanings
Extended alphabetic characters may be used within identifiers as if they were letters. The following are extended alphabetic characters:
! $ % & * + - . / < = > ? @ ^ _ ~
Moreover, all characters whose Unicode scalar values are greater than
127 and whose Unicode category is Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd,
Nl, No, Pd, Pc, Po, Sc, Sm, Sk, So, or Co can be used within
identifiers. In addition, any character can be used within an
identifier when specified using an escape-sequence. For example,
the identifier H\x65;llo is the same as the identifier Hello.
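The identifier rules can be exercised with a few quoted symbols. This is an illustrative sketch that reuses identifiers from the examples in this section.

```scheme
;; \x65; is an inline hex escape for the character "e" (U+0065).
(display (eq? 'H\x65;llo 'Hello))           ; prints #t
(newline)

;; Extended alphabetic characters behave like letters.
(display (symbol->string 'list->vector))    ; prints list->vector
(newline)
(display (symbol? '<=))                     ; prints #t
(newline)

;; -> may begin an identifier even though - alone could start a number.
(display (symbol? '->-))                    ; prints #t
(newline)
```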
Kawa supports two additional non-R6RS ways of making
identifiers using special characters, both taken from Common Lisp:
Any character (except x) following a backslash is treated
as if it were a letter,
as is any character between a pair of vertical bars.
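A short sketch of both notations follows. The symbol names are invented for illustration, and the backslash form relies on the Kawa-specific rule stated above, so it may be rejected by other Scheme readers.

```scheme
;; Vertical bars let an identifier contain a space.
(display (symbol->string '|hello world|))   ; prints hello world
(newline)

;; Bars around an ordinary name denote the same symbol.
(display (eq? '|foo| 'foo))                 ; prints #t
(newline)

;; Backslash form: per the rule above, the space is treated as a letter,
;; so this is meant to read as a single symbol rather than two data.
(display (symbol? 'hello\ world))
(newline)
```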
Identifiers have two uses within Scheme programs:
In contrast with older versions of Scheme, the syntax distinguishes between upper and lower case in identifiers and in characters specified via their names, but not in numbers, nor in inline hex escapes used in the syntax of identifiers, characters, or strings. The following directives give explicit control over case folding.
These directives may appear anywhere comments are permitted and are
treated as comments, except that they affect the reading of subsequent
data. The #!fold-case directive causes the read
procedure to case-fold (as if by string-foldcase) each
identifier and character name subsequently read from the same
port. The #!no-fold-case directive causes the read
procedure to return to the default, non-folding behavior.
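The effect of the two directives can be sketched in a source file as below. The example assumes the file is read sequentially from a single port, so each directive applies to the forms that follow it.

```scheme
#!fold-case
;; While folding is on, Hello and hello read as the same identifier.
(display (eq? 'Hello 'hello))   ; prints #t
(newline)

#!no-fold-case
;; Back to the default: case is significant again.
(display (eq? 'Hello 'hello))   ; prints #f
(newline)
```

Because the directives act on the reader rather than on evaluation, they change how the quoted symbols are read, not how eq? compares them.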
Note that the colon (:) is treated specially for colon notation in
Kawa Scheme, though it is a special-initial in standard Scheme (R6RS).