4.3.1 Calling Convention for yylex

The value that yylex returns must be the positive numeric code for the kind of token it has just found; a zero or negative value signifies end-of-input.

When a token kind is referred to in the grammar rules by a name, that name becomes, in the parser implementation file, an enumerator of the enum yytoken_kind_t whose value is the proper numeric code for that token kind. So yylex should use the name to indicate that kind. See Symbols, Terminal and Nonterminal.

When a token is referred to in the grammar rules by a character literal, the numeric code for that character is also the code for the token kind. So yylex can simply return that character code, possibly converted to unsigned char to avoid sign-extension. The null character must not be used this way, because its code is zero and that signifies end-of-input.
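The conversion to unsigned char can be illustrated with a small sketch. The helper name char_token_code is hypothetical and not part of Bison; the sketch also assumes 8-bit characters.

```c
#include <assert.h>

/* Hypothetical helper (not part of Bison): map a character read from
   the input to the code of the matching character-literal token.  */
int
char_token_code (char c)
{
  /* The null character cannot be a token: code 0 means end-of-input.  */
  assert (c != '\0');

  /* Go through unsigned char so that a byte such as 0xE9 yields 233
     instead of a negative value on platforms where plain char is
     signed.  A negative return would be taken as end-of-input.  */
  return (unsigned char) c;
}
```

With this helper, an ASCII character such as '+' maps to its own code, and a byte above 127 still maps to a positive code.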

Here is an example showing these things:

int
yylex (void)
{
  …
  if (c == EOF)    /* Detect end-of-input. */
    return YYEOF;
  …
  else if (c == '+' || c == '-')
    return c;      /* Assume token kind for '+' is '+'. */
  …
  else
    return INT;    /* Return the kind of the token. */
  …
}

This interface has been designed so that the output from the lex utility can be used without change as the definition of yylex.
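For readers writing yylex by hand, the convention above can be sketched as a small self-contained scanner. This is only an illustration under stated assumptions: the enum below is a stand-in for the one Bison generates (its numeric values are assumed here), and set_input and the input pointer are hypothetical names. The scanner reads from a string, and semantic values are deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the enum that Bison generates in the parser
   implementation file; the values are assumptions of this sketch.  */
enum yytoken_kind_t { YYEOF = 0, INT = 258 };

/* Hypothetical input state: a null-terminated string to scan.  */
static const char *input = "";

/* Hypothetical helper: choose the text that yylex will scan.  */
void
set_input (const char *text)
{
  assert (text != NULL);
  input = text;
}

int
yylex (void)
{
  /* Skip white space between tokens.  */
  while (*input == ' ' || *input == '\t' || *input == '\n')
    input++;

  if (*input == '\0')           /* Detect end-of-input.  */
    return YYEOF;

  if (isdigit ((unsigned char) *input))
    {
      /* Consume the digits; the semantic value is ignored here.  */
      while (isdigit ((unsigned char) *input))
        input++;
      return INT;               /* Return the kind of the token.  */
    }

  /* Any other character is returned as its own token kind,
     converted to unsigned char to avoid sign-extension.  */
  return (unsigned char) *input++;
}
```

After set_input ("12 + 3"), successive calls return INT, '+', INT, and then YYEOF; further calls keep returning YYEOF.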