
3.1 What the parser must receive

Note that the parser operates on lexical tokens, not on characters: it knows nothing about the characters in a text stream.

Reading input data to produce lexical tokens is performed by a lexer (also called a scanner) in a lexical analysis step, before the syntax analysis step performed by the parser. The parser automatically calls the lexer when it needs the next token to parse.

A Wisent lexer is an Emacs Lisp function that takes no arguments. It must return a valid lexical token of the form:

(token-class value [start . end])

token-class
Is a category of lexical token identifying a terminal as specified in the grammar (see Wisent Grammar). It can be a symbol or a character literal.
value
Is the value of the lexical token. It can be of any valid Emacs Lisp data type.
start
end
Are the optional beginning and ending positions of value in the input stream.

When there are no more tokens to read, the lexer must return the token (list wisent-eoi-term) on every subsequent request.

— Variable: wisent-eoi-term

Predefined constant, End-Of-Input terminal symbol.
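
As an illustration of the lexer contract described above, here is a minimal sketch of a lexer that hands out tokens from a prepared list. The names my-token-queue, my-make-token and my-lexer, and the terminals NUM and ?+, are hypothetical and not part of Wisent. The sketch assumes the Wisent bundled with Emacs, where wisent-eoi-term is provided by the semantic/wisent/wisent feature, and it assumes that the [start . end] notation means the positions are appended as a dotted pair, as wisent-lex appears to do.

```elisp
;; Assumption: the Emacs-bundled Wisent, which provides `wisent-eoi-term'.
(require 'semantic/wisent/wisent)

(defvar my-token-queue nil
  "Hypothetical queue of tokens not yet handed to the parser.")

(defun my-make-token (class value &optional start end)
  "Build a token of the form (CLASS VALUE [START . END]).
Hypothetical helper, not part of Wisent."
  (if (and start end)
      ;; Positions are appended as a dotted pair: (CLASS VALUE START . END).
      (cons class (cons value (cons start end)))
    (list class value)))

(defun my-lexer ()
  "Return the next queued token, or the end-of-input token when none remain."
  (if my-token-queue
      (pop my-token-queue)
    ;; Returned on every request once the input is exhausted.
    (list wisent-eoi-term)))

;; Tokens for the input text "1 + 2".  NUM is a symbol token class and
;; ?+ is a character-literal token class; both are hypothetical terminals.
(setq my-token-queue
      (list (my-make-token 'NUM 1 1 2)
            (my-make-token ?+ "+" 3 4)
            (my-make-token 'NUM 2 5 6)))
```

With this queue, the first three calls to my-lexer return the three tokens in order, and every later call returns the end-of-input token.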

wisent-lex is an example of a lexer that reads lexical tokens produced by a Semantic lexer and translates them into lexical tokens suitable for the Wisent parser. See also Wisent Lex.

To call the lexer from within a semantic action, use the function wisent-lexer. See also Actions goodies.