GNU Source-highlight Library 3.0


GNU Source-highlight Library

GNU Source-highlight, given a source file, produces a document with syntax highlighting.

This is Edition 3.0 of the Source-highlight Library manual.

This file documents GNU Source-highlight Library version 3.0.

This manual is for GNU Source-highlight Library (version 3.0, 10 May 2009), which, given a source file, produces a document with syntax highlighting.

Copyright © 2005-2008 Lorenzo Bettini

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being “A GNU Manual,” and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled “GNU Free Documentation License.”

(a) The FSF's Back-Cover Text is: “You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.”


1 Introduction

GNU Source-highlight, given a source file, produces a document with syntax highlighting. See the Introduction of the GNU Source-highlight manual for a wider introduction to GNU Source-highlight.

This file documents the library provided by GNU Source-highlight; its audience is therefore programmers who want to use source-highlight features inside their programs, not the users of Source-highlight. The library has been part of GNU Source-highlight since version 3.0.

The main principles of GNU Source-highlight are taken for granted here, together with all the notions needed for writing language definition files, output definition files, and so on. Again, we refer to the documentation of GNU Source-highlight for all these features.


2 Installation

GNU Source-highlight library is part of GNU Source-highlight, thus it is installed together with Source-highlight itself; see Installation in the GNU Source-highlight manual for further instructions on installing GNU Source-highlight. Here we detail only the parts concerning the library.

If you want to build and install the API documentation of the Source-highlight library, you need to run configure with the option --with-doxygen; building the documentation also requires the Doxygen program. The documentation will be installed in the following directory:

Library API documentation
library examples


3 Use of GNU Source-highlight Library

You can use GNU Source-highlight library in your programs by including its headers and linking to the file libsource-highlight.ext [1].

All the classes of the library are part of the namespace srchilite, and all the header files are in the subdirectory srchilite.


3.1 Using Automake and Autotools

The easiest way to use GNU Source-highlight library in your program is to rely on autotools, i.e., Automake, Autoconf, etc. In particular, the library is installed with a pkg-config configuration file (metadata file), source-highlight.pc.

pkg-config is a tool that helps when compiling applications and libraries. It inserts the correct compiler options on the command line, so an application can use Source-highlight library simply by running

     g++ -o test test.cpp `pkg-config --libs --cflags source-highlight`

rather than hard-coding the location of the library. Moreover, this also provides the correct compiler flags and libraries used by Source-highlight library itself, e.g., the Boost Regex library.

Note that pkg-config searches for .pc files in its standard directories. If you installed the library in a non-standard directory, you'll need to set the PKG_CONFIG_PATH environment variable accordingly. For instance, if I install the library into /usr/local/lib, the .pc file will be installed into /usr/local/lib/pkgconfig, and then I'll need to call pkg-config as follows:

     PKG_CONFIG_PATH=/usr/local/lib/pkgconfig \
             pkg-config --libs --cflags source-highlight

In your configure.ac you can use the autoconf macro provided by pkg-config; here is an example:

     # Checks for libraries.
     PKG_CHECK_MODULES(SRCHILITE, [source-highlight >= 3.0])

Then, you can use the variables SRCHILITE_CFLAGS and SRCHILITE_LIBS in your makefiles accordingly, for instance in the compiler and linker flags of the programs that use the library.



4 Main Classes

Here we present the main classes of the Source-highlight library, together with some examples of use. For the documentation of all the classes (and methods of the classes) we refer to the generated API documentation (see Installation).

You will note that methods and constructors of the classes of the library often do not take a pointer or a reference to a class, say MyClass, but an object of type MyClassPtr; these are shared pointers, in particular the ones provided by the Boost libraries (they are typedefs using, e.g., boost::shared_ptr<MyClass>). This avoids dangerous dangling pointers and possible memory leaks in the library.

If, on the contrary, a method or a constructor in a class of the library takes a standard pointer, say MyClass *, then that class will NEVER delete such a pointer. It is up to the actual owner of the MyClass object to delete it when it is not needed anymore.
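The two ownership conventions above can be illustrated with a minimal, self-contained sketch. It is not library code: Widget, SharingUser and ObservingUser are hypothetical names, and std::shared_ptr is used here only as a stand-in for boost::shared_ptr so that the sketch needs nothing beyond the standard library.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a library class (not part of srchilite).
struct Widget {
    std::string name;
    explicit Widget(const std::string &n) : name(n) {}
};

// Plays the role of a MyClassPtr typedef; std::shared_ptr is an assumption
// made for this sketch, the library itself uses boost::shared_ptr.
typedef std::shared_ptr<Widget> WidgetPtr;

// Takes a shared pointer: it shares ownership, so the Widget stays alive
// as long as either the caller or this object holds a copy.
class SharingUser {
    WidgetPtr widget;
public:
    explicit SharingUser(WidgetPtr w) : widget(w) {}
    long owners() const { return widget.use_count(); }
};

// Takes a raw pointer: it only observes the object and never deletes it;
// the caller remains the owner.
class ObservingUser {
    Widget *widget;
public:
    explicit ObservingUser(Widget *w) : widget(w) {}
    const std::string &name() const { return widget->name; }
};
```

With the raw-pointer convention, the caller must make sure that the pointed-to object outlives every object that observes it.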

The classes of the libraries can raise exceptions if errors are encountered (e.g., an input file cannot be opened, or a language definition file cannot be parsed); the exception classes can be found in the API documentation, and all exception classes inherit from std::exception class.
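Since all the exception classes inherit from std::exception, a single catch clause is enough to intercept any library error. The following self-contained sketch shows the pattern only; highlightOrThrow is a hypothetical stand-in for a library call that may fail, and the file name and the error message are invented for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a library call that can fail; in real code this
// would be a call into the library instead.
void highlightOrThrow(const std::string &inputFile) {
    if (inputFile == "missing.cpp")
        throw std::runtime_error("cannot open input file");
}

// Returns an empty string on success, or the error message on failure.
// Catching std::exception by const reference covers every derived class.
std::string tryHighlight(const std::string &inputFile) {
    try {
        highlightOrThrow(inputFile);
    } catch (const std::exception &e) {
        return e.what();
    }
    return "";
}
```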


4.1 SourceHighlight class

The SourceHighlight class implements essentially all the functionality used by the source-highlight program itself; thus it highlights an input file, generating an output file. It can be configured with many options, and it has get/set methods for essentially all the command line options of source-highlight (see also Invoking source-highlight in the GNU Source-highlight manual).

For instance, the following example (source-highlight-console-main.cpp) highlights an input file to the console (the colors are obtained through ANSI color escape sequences, so you need a console program that supports them):

     #include <iostream>
     #include <string>
     #include "srchilite/sourcehighlight.h"
     #include "srchilite/langmap.h"
     using namespace std;
     #ifndef DATADIR
     #define DATADIR ""
     #endif
     int main(int argc, char *argv[]) {
         // we highlight to the console, through ANSI escape sequences
         srchilite::SourceHighlight sourceHighlight("esc.outlang");
         // make sure we find the .lang and .outlang files
         sourceHighlight.setDataDir(DATADIR);
         // by default we highlight C++ code
         string inputLang = "cpp.lang";
         if (argc > 1) {
             // we have a file name so we detect the input source language,
             // using the map from file extensions to language definition files
             srchilite::LangMap langMap(DATADIR, "lang.map");
             string lang = langMap.getMappedFileNameFromFileName(argv[1]);
             if (lang != "") {
                 inputLang = lang;
             } // otherwise we default to C++
             // output file name is empty => cout
             sourceHighlight.highlight(argv[1], "", inputLang);
         } else {
             // input file name is empty => cin
             sourceHighlight.highlight("", "", inputLang);
         }
         return 0;
     }

Note that if a file name is passed at the command line, the program tries to detect the source language by using a LangMap class object, specifying the map file, i.e., the file that maps file extensions to language definition files (e.g., if the file name has extension .java it will use the corresponding java.lang). Otherwise, or if no mapping is found, we assume that we want to highlight a C++ file.
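The detect-or-default logic described above can be modelled with a small, self-contained sketch. This is not how the LangMap class is implemented; ExtensionMap, lookupLangFile and chooseInputLang are hypothetical names, and the table is filled by hand instead of being read from a map file.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory table: file extension -> language definition file.
typedef std::map<std::string, std::string> ExtensionMap;

// Returns the language definition file mapped to the extension of fileName,
// or an empty string when the name has no extension or no mapping exists.
std::string lookupLangFile(const ExtensionMap &extToLang,
        const std::string &fileName) {
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return "";
    ExtensionMap::const_iterator it = extToLang.find(fileName.substr(dot + 1));
    return it == extToLang.end() ? std::string() : it->second;
}

// Falls back to C++ when the language cannot be detected, as the example
// program above does.
std::string chooseInputLang(const ExtensionMap &extToLang,
        const std::string &fileName) {
    const std::string lang = lookupLangFile(extToLang, fileName);
    return lang.empty() ? std::string("cpp.lang") : lang;
}
```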

All the highlighting is performed by the highlight method; since we don't specify an output file name, it will output the highlighted result directly to the console. If we don't have an input file name either, the highlight method will read from the standard input. Since the highlighting takes place one line at a time, you can test the program this way: enter a line on the console and, when you press enter, the program will echo the same line highlighted.

Setting DATADIR is not mandatory, provided that you installed Source-highlight correctly or that you configured the data directory using the source-highlight-settings program.


4.2 Customizing Formatting

The formatting of Source-highlight library, i.e., how to actually perform the highlighting, or what to do when we need to highlight something, can be completely customized; the library detects (using regular expressions based on language definition files) that something must be highlighted as, say, a keyword, and you can then do whatever you want with this information. The default formatting strategy is to output highlighted text in a specific output format, but you're free to do something entirely different.

This formatting abstraction is provided by the Formatter class, which basically declares only the abstract method format; this method takes as parameters the string to format and further (possibly empty) additional parameters, represented by the FormatterParams class. Note that the format method is not told how the passed string must be formatted (e.g., as a keyword, as a type, etc.); this information must be stored in the formatter from the start. Indeed, the mapping between a language element and a formatter is performed by the FormatterManager class. An object of this class must be created by specifying a default formatter object, which is used when the formatter manager is queried for a formatter for a language element that it is not able to handle (in this case it falls back to returning the default formatter).
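To make the fallback behaviour concrete, here is a minimal, self-contained model of a manager that maps language elements to formatters and returns a default one for unknown elements. It is a sketch under stated assumptions, not the library implementation: ToyFormatter and ToyFormatterManager are hypothetical names, and std::shared_ptr stands in for the Boost shared pointers used by the library.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical formatter: it only remembers which element it represents.
struct ToyFormatter {
    std::string elem;
    explicit ToyFormatter(const std::string &e) : elem(e) {}
};

typedef std::shared_ptr<ToyFormatter> ToyFormatterPtr;

// Hypothetical manager: maps language elements to formatters and falls back
// to the default formatter given at construction time.
class ToyFormatterManager {
    ToyFormatterPtr defaultFormatter;
    std::map<std::string, ToyFormatterPtr> formatters;
public:
    explicit ToyFormatterManager(ToyFormatterPtr def) :
        defaultFormatter(def) {
    }
    void addFormatter(const std::string &elem, ToyFormatterPtr formatter) {
        formatters[elem] = formatter;
    }
    ToyFormatterPtr getFormatter(const std::string &elem) const {
        std::map<std::string, ToyFormatterPtr>::const_iterator it =
                formatters.find(elem);
        return it == formatters.end() ? defaultFormatter : it->second;
    }
};
```

Because the manager stores shared pointers, the same formatter object can be registered for several language elements without any ownership problem.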

For instance, this is a customized formatter (infoformatter.h) which, when requested to format a string, simply prints the string together with the kind of language element it represents and its position in the line (the start field in the FormatterParams class). Note that the language element is stored in a field of the class, and it is set at object creation time. We avoid writing anything if we are requested to format something as "normal", or if the string to format is empty.

     #include <iostream>
     #include <string>
     #include <boost/shared_ptr.hpp>
     #include "srchilite/formatter.h"
     #include "srchilite/formatterparams.h"
     class InfoFormatter: public srchilite::Formatter {
         /// the language element represented by this formatter
         std::string elem;
     public:
         InfoFormatter(const std::string &elem_ = "normal") :
             elem(elem_) {
         }
         virtual void format(const std::string &s,
                 const srchilite::FormatterParams *params = 0) {
             // do not print anything if normal or string to format is empty
             if (elem != "normal" && !s.empty()) {
                 std::cout << elem << ": " << s;
                 if (params)
                     std::cout << ", start: " << params->start;
                 std::cout << std::endl;
             }
         }
     };
     /// shared pointer for InfoFormatter
     typedef boost::shared_ptr<InfoFormatter> InfoFormatterPtr;

For convenience we also declare a typedef for the shared pointer (since the formatter manager takes only shared pointers to formatters).

In order to customize the formatting, some more steps are required; in particular, you cannot use the SourceHighlight class anymore, and you need to use several other classes instead.

First of all, you need the LangDefManager class, which takes care of building the regular expressions starting from a language definition file; in order to do this it uses a HighlightRuleFactory class object; for the moment, only the implementation based on Boost regular expressions exists, so you can simply pass an object of the RegexRuleFactory class. Once you have an object of the LangDefManager class, you can use the getHighlightState method to build the automaton that performs the highlighting (in particular the initial state of such an automaton, of class HighlightState), and you should pass this to an object that can use the automaton to perform the highlighting. To do this, you can use the SourceHighlighter class, whose objects can be used to highlight a line of text, using the highlightParagraph method.

You can then create a FormatterManager class object and populate it with your formatters and set it to the SourceHighlighter class object. The following example (infoformatter-main.cpp) shows how to perform these steps; note that we can share the same formatter for different language elements:

     #include <iostream>
     #include <string>
     #include "srchilite/langdefmanager.h"
     #include "srchilite/regexrulefactory.h"
     #include "srchilite/sourcehighlighter.h"
     #include "srchilite/formattermanager.h"
     #include "infoformatter.h"
     using namespace std;
     #ifndef DATADIR
     #define DATADIR ""
     #endif
     int main() {
         srchilite::RegexRuleFactory ruleFactory;
         srchilite::LangDefManager langDefManager(&ruleFactory);
         // we highlight C++ code for simplicity
         srchilite::SourceHighlighter highlighter(langDefManager.getHighlightState(
                 DATADIR, "cpp.lang"));
         srchilite::FormatterManager formatterManager(InfoFormatterPtr(
                 new InfoFormatter));
         InfoFormatterPtr keywordFormatter(new InfoFormatter("keyword"));
         formatterManager.addFormatter("keyword", keywordFormatter);
         formatterManager.addFormatter("string", InfoFormatterPtr(new InfoFormatter(
                 "string")));
         // for "type" we use the same formatter as for "keyword"
         formatterManager.addFormatter("type", keywordFormatter);
         formatterManager.addFormatter("comment", InfoFormatterPtr(
                 new InfoFormatter("comment")));
         formatterManager.addFormatter("symbol", InfoFormatterPtr(new InfoFormatter(
                 "symbol")));
         formatterManager.addFormatter("number", InfoFormatterPtr(new InfoFormatter(
                 "number")));
         formatterManager.addFormatter("preproc", InfoFormatterPtr(
                 new InfoFormatter("preproc")));
         // set the formatter manager to the highlighter
         highlighter.setFormatterManager(&formatterManager);
         // make sure it uses additional information
         srchilite::FormatterParams params;
         highlighter.setFormatterParams(&params);
         string line;
         // we now highlight one line at a time
         while (getline(cin, line)) {
             // reset position counter within a line
             params.start = 0;
             highlighter.highlightParagraph(line);
         }
         return 0;
     }

Note that, since we highlight one line at a time, we must reset the start field each time we start to examine a new line.

For simplicity this example highlights only C++ code and reads directly from the standard input and writes to the standard output. This is a run of the example reading from the standard input (so each time you insert a line you get the output of your formatters):

     // this is a comment
     comment: //, start: 0
     comment:  this is a comment, start: 2
     #include <foobar.h>
     preproc: #include, start: 0
     string: <foobar.h>, start: 9
     int abc = 100 + 5;
     keyword: int, start: 0
     symbol: =, start: 8
     number: 100, start: 10
     symbol: +, start: 14
     number: 5, start: 16
     symbol: ;, start: 17


4.3 Events and Listeners

During the highlighting (and regular expression matching) the library generates events that can be “listened” to by using a customized event listener. An event is represented by an object of the HighlightEvent class, which stores the HighlightToken class object and the type (a HighlightEventType enum) of the event.

A customized listener can be implemented by deriving from the HighlightEventListener class and by defining the virtual method notify, which takes a HighlightEvent class object as parameter.
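The listener mechanism follows the usual observer pattern. The self-contained sketch below models it with hypothetical classes (ToyEvent, ToyListener, FormatRecorder and ToyEventSource are not library names), and it reduces the token carried by an event to a plain string; it shows only the shape of a listener, not the library's own event classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified event: a type plus the matched text.
struct ToyEvent {
    enum Type { FORMAT, ENTERSTATE, EXITSTATE };
    Type type;
    std::string text;
    ToyEvent(Type t, const std::string &s) : type(t), text(s) {}
};

// Listener interface with a single virtual notify method.
class ToyListener {
public:
    virtual ~ToyListener() {}
    virtual void notify(const ToyEvent &event) = 0;
};

// Example listener: records the text of FORMAT events, ignores the others.
class FormatRecorder: public ToyListener {
public:
    std::vector<std::string> formatted;
    virtual void notify(const ToyEvent &event) {
        if (event.type == ToyEvent::FORMAT)
            formatted.push_back(event.text);
    }
};

// Hypothetical event source: it takes raw pointers, so it never deletes
// its listeners; the caller keeps owning them.
class ToyEventSource {
    std::vector<ToyListener *> listeners;
public:
    void addListener(ToyListener *listener) {
        listeners.push_back(listener);
    }
    void fire(const ToyEvent &event) {
        for (std::size_t i = 0; i < listeners.size(); ++i)
            listeners[i]->notify(event);
    }
};
```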

For instance, source-highlight implements the debugging functionalities by using a customized listener, DebugListener class, whose method implementation we report here as an example:

     void DebugListener::notify(const HighlightEvent &event) {
         switch (event.type) {
         case HighlightEvent::FORMAT:
             // print information about the rule
             if (event.token.rule) {
                 os << event.token.rule->getAdditionalInfo() << endl;
                 os << "expression: \"" << event.token.rule->toString() << "\""
                         << endl;
             }
             // now format the matched strings
             for (MatchedElements::const_iterator it = event.token.matched.begin(); it
                     != event.token.matched.end(); ++it) {
                 os << "formatting \"" << it->second << "\" as " << it->first
                         << endl;
             }
             break;
         case HighlightEvent::FORMATDEFAULT:
             os << "formatting \"" << event.token.matched.front().second
                     << "\" as default" << endl;
             break;
         case HighlightEvent::ENTERSTATE:
             os << "entering state: " << event.token.rule->getNextState()->getId()
                     << endl;
             break;
         case HighlightEvent::EXITSTATE: {
             // a negative exit level means that all states are exited
             int level = event.token.rule->getExitLevel();
             os << "exiting state, level: ";
             if (level < 0)
                 os << "all";
             else
                 os << level;
             os << endl;
             break;
         }
         }
     }


5 Reporting Bugs

If you find a bug in source-highlight, please send electronic mail to

bug-source-highlight at gnu dot org

Include the version number, which you can find by running ‘source-highlight --version’. Also include in your message the output that the program produced and the output you expected.

If you have other questions, comments or suggestions about source-highlight, contact the author via electronic mail. The author will try to help you out, although he may not have time to fix your problems.


6 Mailing Lists

The following mailing lists are available:

help-source-highlight at gnu dot org

for generic discussions about the program and for asking for help about it (open mailing list),

info-source-highlight at gnu dot org

for receiving information about new releases and features (read-only mailing list).

If you want to subscribe to a mailing list just go to the URL and follow the instructions, or send me an e-mail and I'll subscribe you.

I'll describe new features in new releases also in my blog, at this URL:


[1] The extension of course depends on whether the library is shared or static (e.g., .so, .la, .a) and on the system.