

10.1.7.2 Complete Symbols

With both %define api.value.type variant and %define api.token.constructor, the parser defines the type symbol_type, and expects yylex to have the following prototype.

Function: parser::symbol_type yylex ()
Function: parser::symbol_type yylex (type1 arg1, …)

Return a complete symbol, aggregating its kind (i.e., the traditional value returned by yylex), its semantic value, and possibly its location. Each ‘%lex-param {type1 arg1}’ declaration adds one argument.
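As a rough illustration of the second prototype, the following self-contained sketch uses a mock parser class in place of the generated one. The scan_state type, its use as a ‘%lex-param’ argument, and the simplified symbol_type layout are all assumptions made for the example; a real symbol_type stores its value in a variant and may carry a location.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace yy
{
  // Hypothetical stand-in for the generated parser class.
  struct parser
  {
    struct symbol_type
    {
      int kind;          // token kind, the traditional yylex return value
      std::string value; // semantic value (a variant in a real parser)
    };
  };
}

// Hypothetical scanner state, as if the grammar contained
// %lex-param {scan_state& st}.
struct scan_state
{
  std::string input;
  std::size_t pos = 0;
};

// yylex receives the extra argument and returns a complete symbol:
// one symbol per input character, then kind 0 at end of input.
yy::parser::symbol_type yylex (scan_state& st)
{
  if (st.pos == st.input.size ())
    return {0, ""};
  char c = st.input[st.pos++];
  return {static_cast<unsigned char> (c), std::string (1, c)};
}
```

The point is only the shape of the signature: the extra argument comes first from the grammar's declaration, and the return value bundles kind and value in a single object.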

Type of parser: symbol_type

A “complete symbol”, that binds together its kind, value and (when applicable) location.

Method on symbol_type: symbol_kind_type kind () const

The kind of this symbol.

Method on symbol_type: const char * name () const

The name of the kind of this symbol.

Returns a std::string when parse.error is verbose.


Bison generates the following symbol_type constructors for terminal symbols.

Constructor on parser::symbol_type: symbol_type (int token, const value_type& value, const location_type& location)
Constructor on parser::symbol_type: symbol_type (int token, const location_type& location)
Constructor on parser::symbol_type: symbol_type (int token, const value_type& value)
Constructor on parser::symbol_type: symbol_type (int token)

Build a complete terminal symbol for the token kind token (including the api.token.prefix), whose semantic value, if it has one, is value, of the matching value_type. Pass location if and only if location tracking is enabled.

Consistency between token and value_type is checked via an assert.

For instance, given the following declarations:

%define api.token.prefix {TOK_}
%token <std::string> IDENTIFIER;
%token <int> INTEGER;
%token ':';

you may use these constructors:

symbol_type (int token, const std::string&, const location_type&);
symbol_type (int token, const int&, const location_type&);
symbol_type (int token, const location_type&);

Correct matching between token kinds and value types is checked via assert; for instance, ‘symbol_type (TOK_IDENTIFIER, 42)’ would abort. Named constructors are preferable (see below), as they offer better type safety (for instance, ‘make_IDENTIFIER (42)’ would not even compile), but symbol_type constructors may help when token kinds are discovered at run time, e.g.,

[a-z]+   {
           if (auto i = lookup_keyword (yytext))
             return yy::parser::symbol_type (i, loc);
           else
             return yy::parser::make_IDENTIFIER (yytext, loc);
         }
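The run-time check can be pictured with a self-contained mock. This is not Bison's generated code: mock_symbol, token_kind, expected_index, and rejected are hypothetical names, locations are left out, and the check throws instead of calling assert so that a mismatch can be observed without aborting the program.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical token kinds, mirroring the declarations above.
enum token_kind { TOK_IDENTIFIER = 258, TOK_INTEGER = 259 };

// Hypothetical stand-in for parser::symbol_type; locations are omitted.
class mock_symbol
{
public:
  explicit mock_symbol (int token)
    : kind_ (token)
  { check (); }

  mock_symbol (int token, const std::string& value)
    : kind_ (token), value_ (value)
  { check (); }

  mock_symbol (int token, const int& value)
    : kind_ (token), value_ (value)
  { check (); }

  int kind () const { return kind_; }

private:
  // Index of the variant alternative that each token kind must hold.
  static std::size_t expected_index (int token)
  {
    switch (token)
      {
      case TOK_IDENTIFIER: return 1; // std::string
      case TOK_INTEGER:    return 2; // int
      default:             return 0; // no semantic value, e.g. ':'
      }
  }

  // Stand-in for the assert in the generated constructors.
  void check () const
  {
    if (value_.index () != expected_index (kind_))
      throw std::logic_error ("token kind and value type do not match");
  }

  int kind_;
  std::variant<std::monostate, std::string, int> value_;
};

// True if building the symbol trips the consistency check.
template <typename... Args>
bool rejected (Args&&... args)
{
  try
    {
      mock_symbol s (std::forward<Args> (args)...);
      (void) s;
      return false;
    }
  catch (const std::logic_error&)
    {
      return true;
    }
}
```

In this mock, every overload compiles for every token kind, which is exactly why the mismatch can only be caught when the constructor runs.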

Note that it is possible to write and compile type-incorrect code (e.g., ‘symbol_type (':', yytext, loc)’). It fails at run time, provided assertions are enabled (i.e., -DNDEBUG was not passed to the compiler). Bison supports an alternative that guarantees that type-incorrect code does not even compile: it generates named constructors as follows.

Method on parser: symbol_type make_token (const value_type& value, const location_type& location)
Method on parser: symbol_type make_token (const location_type& location)
Method on parser: symbol_type make_token (const value_type& value)
Method on parser: symbol_type make_token ()

Build a complete terminal symbol for the token kind token (not including the api.token.prefix), whose semantic value, if it has one, is value, of the matching value_type. Pass location if and only if location tracking is enabled.

For instance, given the following declarations:

%define api.token.prefix {TOK_}
%token <std::string> IDENTIFIER;
%token <int> INTEGER;
%token COLON;
%token EOF 0;

Bison generates:

symbol_type make_IDENTIFIER (const std::string&, const location_type&);
symbol_type make_INTEGER (const int&, const location_type&);
symbol_type make_COLON (const location_type&);
symbol_type make_EOF (const location_type&);

which should be used in a scanner as follows.

[a-z]+   return yy::parser::make_IDENTIFIER (yytext, loc);
[0-9]+   return yy::parser::make_INTEGER (text_to_int (yytext), loc);
":"      return yy::parser::make_COLON (loc);
<<EOF>>  return yy::parser::make_EOF (loc);

Tokens that have no identifier get no named constructor: character literals such as ':' cannot be used this way. Every token returned through a named constructor must be declared with %token and given a name, including the end-of-file token.
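To see why named constructors catch mismatches at compile time, consider the following self-contained sketch. It is a mock, not Bison's output: mock_symbol, the token_kind values, and the scan helper are hypothetical, locations are omitted, and scan only approximates the scanner rules above (anything that is not empty, ":", or all digits is treated as an identifier).

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for parser::symbol_type; locations are omitted.
struct mock_symbol
{
  int kind;
  std::variant<std::monostate, std::string, int> value;
};

// Hypothetical token kinds, spelled with the api.token.prefix.
enum token_kind { TOK_EOF = 0, TOK_IDENTIFIER = 258, TOK_INTEGER, TOK_COLON };

// One named constructor per token kind. The parameter list fixes the
// value type, so a wrong argument type is rejected by the compiler.
mock_symbol make_IDENTIFIER (const std::string& v) { return {TOK_IDENTIFIER, v}; }
mock_symbol make_INTEGER (const int& v) { return {TOK_INTEGER, v}; }
mock_symbol make_COLON () { return {TOK_COLON, {}}; }
mock_symbol make_EOF () { return {TOK_EOF, {}}; }

// These calls would not compile, which the traits confirm.
static_assert (!std::is_invocable_v<decltype (&make_INTEGER), std::string>,
               "make_INTEGER must not accept a string");
static_assert (!std::is_invocable_v<decltype (&make_IDENTIFIER), int>,
               "make_IDENTIFIER must not accept an int");

// Rough hand-written counterpart of the scanner rules above.
mock_symbol scan (const std::string& text)
{
  if (text.empty ())
    return make_EOF ();
  if (text == ":")
    return make_COLON ();
  if (text.find_first_not_of ("0123456789") == std::string::npos)
    return make_INTEGER (std::stoi (text));
  return make_IDENTIFIER (text);
}
```

There is no make_ function for an unnamed ‘:’ token in this sketch either: the constructor name is built from the token's identifier, so a token without one has nothing to hang the name on.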

