3.4 Error recovery

The error recovery mechanism of the Wisent parser conforms to the one Bison uses. See (bison)Error Recovery in the Bison manual for details.

To recover from a syntax error you must write rules to recognize the special token error. This is a terminal symbol that is automatically defined and reserved for error handling.

When the parser encounters a syntax error, it pops the state stack until it finds a state that allows shifting the error token. After the error token has been shifted, if the old lookahead token cannot be shifted next, the parser reads and discards tokens until it finds one that is acceptable.

Strategies for error recovery depend on the choice of error rules in the grammar. A simple and useful strategy is to skip the rest of the current statement when an error is detected:

(statement (( error ?; )) ;; on error, skip until ';' is read
           )
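
In a full grammar the error rule usually sits beside the ordinary alternatives of the same nonterminal. A minimal sketch, in which expr is a hypothetical nonterminal standing for whatever a well-formed statement contains:

(statement ((expr ?;))    ;; ordinary statement
           ((error ?;)))  ;; on error, skip until ';' is read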

It is also useful to recover to the matching close-delimiter of an opening-delimiter that has already been parsed:

(primary (( ?{ expr  ?} ))
         (( ?{ error ?} ))
         )

Note that error recovery rules may have actions, just as any other rules can. Here are some predefined hooks, variables, functions or macros, useful in such actions:

Variable: wisent-nerrs

The number of parse errors encountered so far.

Variable: wisent-recovering

Non-nil means that the parser is recovering. This variable only has meaning in the scope of wisent-parse.

Function: wisent-error msg

Call the user-supplied error reporting function with message msg (see The error reporting function).

For an example of use, see wisent-skip-token.

Function: wisent-errok

Resume generating error messages immediately for subsequent syntax errors.

The parser suppresses error messages for syntax errors that happen shortly after the first, until three consecutive input tokens have been successfully shifted.

Calling wisent-errok in an action makes error messages resume immediately. No error messages will be suppressed if you call it in an error rule’s action.

For an example of use, see wisent-skip-token.
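
As a sketch of the second case, an error rule for statements might call wisent-errok once the terminating ';' has been shifted, so that an error in the very next statement is still reported. The statement nonterminal is illustrative only:

(statement ((error ?;)
            (wisent-errok)))  ;; report the next syntax error immediately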

Function: wisent-clearin

Discard the current lookahead token. This will cause a new lexical token to be read.

In an error rule’s action, the previous lookahead token is reanalyzed immediately. wisent-clearin may be called to clear this token.

For example, suppose that on a parse error, an error handling routine is called that advances the input stream to some point where parsing should once again commence. The next symbol returned by the lexical scanner is probably correct. The previous lookahead token ought to be discarded with wisent-clearin.

For an example of use, see wisent-skip-token.
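
The scenario above might be sketched as follows. Note that my-resync-input is a hypothetical routine, not part of Wisent, assumed to advance the input stream to a point where parsing can resume:

(statement ((error)
            (progn
              (my-resync-input)    ;; hypothetical: advance the input stream
              (wisent-clearin))))  ;; discard the stale lookahead token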

Function: wisent-abort

Abort parsing and save the lookahead token.

Function: wisent-set-region start end

Change the region of text matched by the current nonterminal. start and end are respectively the beginning and end positions of the region occupied by the group of components associated with this nonterminal. If start or end is not a valid position, the region is set to nil.

For an example of use, see wisent-skip-token.

Variable: wisent-discarding-token-functions

List of functions to be called when discarding a lexical token. Each function receives the discarded lexical token. When the parser encounters unexpected tokens, it can discard them as directed by the error recovery rules: either when it reads tokens until it finds one that can be shifted, or when a semantic action calls wisent-skip-token or wisent-skip-block. For language-specific hooks, make sure you define this as a local hook.

For example, in Semantic, this hook is set to the function wisent-collect-unmatched-syntax to collect unmatched lexical tokens (see Useful functions).
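
A minimal sketch of installing a buffer-local handler on this hook. The function my-log-discarded-token is hypothetical; add-hook and message are standard Emacs functions:

(defun my-log-discarded-token (token)
  "Hypothetical handler: log the discarded lexical TOKEN."
  (message "Discarded token: %S" token))

;; The final t makes the hook buffer-local, as recommended above.
(add-hook 'wisent-discarding-token-functions
          #'my-log-discarded-token nil t)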

Function: wisent-skip-token

Skip the lookahead token in order to resume parsing. Return nil. Must be used in error recovery semantic actions.

It typically looks like this:

(wisent-message "%s: skip %s" $action
                (wisent-token-to-string wisent-input))
(run-hook-with-args
 'wisent-discarding-token-functions wisent-input)
(wisent-clearin)
(wisent-errok)

Function: wisent-skip-block

Safely skip a block in order to resume parsing. Return nil. Must be used in error recovery semantic actions.

A block is data between an open-delimiter (syntax class ‘(’) and a matching close-delimiter (syntax class ‘)’):

(a parenthesized block)
[a block between brackets]
{a block between braces}

The following example uses wisent-skip-block to safely skip a block delimited by ‘LBRACE’ ({) and ‘RBRACE’ (}) tokens, when a syntax error occurs in ‘other-components’:

(block ((LBRACE other-components RBRACE))
       ((LBRACE RBRACE))
       ((LBRACE error)