

3.7.2 Token Kind Names

The basic way to declare a token kind name (terminal symbol) is as follows:

%token name

Bison converts this into a definition in the parser, so that the function yylex (if it is in this file) can use the identifier name to stand for this token kind’s code.
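As a minimal sketch of what this means in practice, the grammar file below declares one token kind and returns it from a yylex placed in the epilogue. The rule input and the scanning logic are illustrative assumptions, not part of the declaration syntax itself:

%{
#include <ctype.h>
#include <stdio.h>
int yylex (void);
void yyerror (char const *msg);
%}

%token NUM

%%
input: /* empty */
     | input NUM
     ;
%%

/* Hypothetical scanner: skips blanks, returns NUM for a run of digits.  */
int
yylex (void)
{
  int c = getchar ();
  while (c == ' ' || c == '\t' || c == '\n')
    c = getchar ();
  if (c == EOF)
    return 0;                 /* 0 signals end of input */
  if (isdigit (c))
    {
      while (isdigit (c = getchar ()))
        continue;
      ungetc (c, stdin);
      return NUM;             /* the name declared by %token */
    }
  return c;                   /* any other character stands for itself */
}

void
yyerror (char const *msg)
{
  fprintf (stderr, "%s\n", msg);
}

int
main (void)
{
  return yyparse ();
}

Because yylex is defined after the grammar in the same file, the definition of NUM that Bison emits is already visible to it.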

Alternatively, you can use %left, %right, %precedence, or %nonassoc instead of %token, if you wish to specify associativity and precedence. See Operator Precedence. However, for clarity, we recommend using these directives only to declare associativity and precedence, and not to add string aliases, semantic types, etc.
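A short sketch of that recommended split, with hypothetical token names PLUS and TIMES: the tokens are introduced with %token, and the precedence directives only refer back to them.

%token PLUS TIMES
%left PLUS
%left TIMES      /* declared later, so it binds tighter than PLUS */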

You can explicitly specify the numeric code for a token kind by appending a nonnegative decimal or hexadecimal integer value in the field immediately following the token name:

%token NUM 300
%token XNUM 0x12d // a GNU extension

It is generally best, however, to let Bison choose the numeric codes for all token kinds. Bison will automatically select codes that don’t conflict with each other or with normal characters.

If the stack type is a union, you must augment the %token or other token declaration to include the data type alternative, delimited by angle brackets (see More Than One Value Type).

For example:

%union {              /* define stack type */
  double val;
  symrec *tptr;
}
%token <val> NUM      /* define token NUM and its type */
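To illustrate how the type tag is then used, here is a reduced sketch. The nonterminal exp and its %type declaration are assumptions added for the example, and the union is cut down to the single member the sketch needs:

%union {
  double val;
}
%token <val> NUM
%type  <val> exp

%%
exp: NUM   { $$ = $1; }   /* $1 and $$ both refer to the val member */
   ;

On the scanner side, yylex would store the value in the matching member before returning the token, for instance with yylval.val = 3.14; followed by return NUM;.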

You can associate a literal string token with a token kind name by writing the literal string at the end of a %token declaration which declares the name. For example:

%token ARROW "=>"

As another example, a grammar for the C language might specify these names with equivalent literal string tokens:

%token  <operator>  OR      "||"
%token  <operator>  LE 134  "<="
%left  OR  "<="

Once you equate the literal string and the token kind name, you can use them interchangeably in further declarations or the grammar rules. The yylex function can use the token name or the literal string to obtain the token kind code (see Calling Convention for yylex).
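A small sketch of that interchangeability, reusing ARROW from above. The tokens KEY and VALUE and the rule entry are hypothetical and exist only to give the two spellings somewhere to appear:

%token ARROW "=>"
%token KEY VALUE

%%
entry: KEY ARROW VALUE     /* written with the token name */
     | VALUE "=>" KEY      /* written with the alias; same token kind */
     ;

Both alternatives refer to the same terminal symbol, so the scanner returns ARROW in either case.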

String aliases allow for better error messages using the literal strings instead of the token names, such as ‘syntax error, unexpected ||, expecting number or (’ rather than ‘syntax error, unexpected OR, expecting NUM or LPAREN’.
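Messages of this form are only produced when detailed error reporting is enabled; by default the parser reports a plain ‘syntax error’. A minimal sketch of the declarations involved, assuming Bison 3.0 or later for the %define syntax:

%define parse.error verbose
%token OR "||" LPAREN "(" NUM "number"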

String aliases may also be marked for internationalization (see Token Internationalization):

%token
    OR     "||"
    LPAREN "("
    RPAREN ")"
    '\n'   _("end of line")
  <double>
    NUM    _("number")

would produce in French ‘erreur de syntaxe, || inattendu, attendait nombre ou (’ rather than ‘erreur de syntaxe, || inattendu, attendait number ou (’.

