3.1 What the parser must receive

It is important to understand that the parser does not parse characters but lexical tokens, and knows nothing about the characters in the text stream.

Reading input data to produce lexical tokens is performed by a lexer (also called a scanner) in a lexical analysis step, before the syntax analysis step performed by the parser. The parser automatically calls the lexer when it needs the next token to parse.

A Wisent lexer is an Emacs Lisp function that takes no arguments. It must return a valid lexical token of the form:

(token-class value [start . end])

token-class

Is a category of lexical token identifying a terminal as specified in the grammar (see Wisent Grammar). It can be a symbol or a character literal.

value

Is the value of the lexical token. It can be of any valid Emacs Lisp data type.

start
end

Are the optional beginning and ending positions of value in the input stream.
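As an illustration of the token form described above, here is a minimal sketch that takes a token apart. It assumes the brackets in the form mark the optional part, so that a token carrying positions is the list (token-class value start . end). The helper name my-token-parts and the token class NUM are hypothetical and not part of Wisent.

```elisp
;; Hypothetical helper, not part of Wisent: split a lexical token
;; into its token class, its value, and its optional bounds.
(defun my-token-parts (token)
  "Return a list (TOKEN-CLASS VALUE BOUNDS) for the lexical TOKEN.
BOUNDS is the cons (START . END), or nil when TOKEN has no positions."
  (list (car token)     ; token-class: a symbol or a character literal
        (cadr token)    ; value: any Emacs Lisp object
        (cddr token)))  ; (start . end), or nil

;; A NUM token with value 42 spanning positions 10 to 12.
(my-token-parts '(NUM 42 10 . 12))   ; => (NUM 42 (10 . 12))

;; The same token without positions.
(my-token-parts '(NUM 42))           ; => (NUM 42 nil)
```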

When there are no more tokens to read, the lexer must return the token (list wisent-eoi-term) on every subsequent request.

Variable: wisent-eoi-term

Predefined constant, End-Of-Input terminal symbol.
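The lexer contract described above can be sketched as follows. This is only an illustration under stated assumptions, not how wisent-lex is implemented: the function my-make-list-lexer and the token classes NUM and ?+ are hypothetical, the tokens come from a prepared list rather than from a buffer, and the feature semantic/wisent is assumed to be available from the CEDET bundled with Emacs. The file must use lexical binding so that the returned function can keep its own state.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'semantic/wisent)   ; assumed to provide `wisent-eoi-term'

;; Hypothetical helper, not part of Wisent.
(defun my-make-list-lexer (tokens)
  "Return a lexer function of no arguments reading from the list TOKENS.
Each call returns the next token.  Once TOKENS is exhausted, every
further call returns the end-of-input token."
  (let ((remaining tokens))
    (lambda ()
      (if remaining
          (pop remaining)
        ;; No more input: answer every request with the EOI token.
        (list wisent-eoi-term)))))

;; Tokens for the input "1 + 2", written as (token-class value start . end).
(defvar my-sample-lexer
  (my-make-list-lexer '((NUM 1 1 . 2) (?+ "+" 3 . 4) (NUM 2 5 . 6))))
```

Calling (funcall my-sample-lexer) three times returns the three tokens in order. Every call after that returns (list wisent-eoi-term), which is what the parser expects once the input is exhausted.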

wisent-lex is an example of a lexer that reads lexical tokens produced by a Semantic lexer and translates them into lexical tokens suitable for the Wisent parser. See also The Wisent Lex lexer.

To call the lexer in a semantic action, use the function wisent-lexer. See also Variables and macros useful in grammar actions.