
3.6.1 Mapping file names to source languages

The language-map file, installed by default in $(prefix)/share/, contains rules for mapping file names to source languages. Each rule comprises three parts: a shell glob pattern, a language name, and language-specific scanner options.
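
As an illustration of the three-part rule format, the following Python sketch splits one language-map line into its pattern, language name, and scanner options. It is not the parser used by the ID utilities. The name parse_rule is hypothetical, and treating ‘#’ as a comment marker only at the start of a line or after white space is an assumption drawn from the sample file shown later in this section.

```python
import re

# Assumption: '#' starts a comment only at the start of a line or after
# white space, as in the sample language-map file.
_COMMENT = re.compile(r"(^|\s)#.*$")


def parse_rule(line):
    """Split one language-map line into (pattern, language, options).

    Returns None for blank and comment-only lines. This is a
    hypothetical helper, not part of the ID utilities.
    """
    text = _COMMENT.sub("", line).strip()
    if not text:
        return None
    # Pattern and language are single words; whatever follows is kept
    # verbatim as the scanner options.
    fields = text.split(None, 2)
    if len(fields) < 2:
        raise ValueError("rule needs a pattern and a language: %r" % line)
    options = fields[2] if len(fields) > 2 else ""
    return fields[0], fields[1], options


if __name__ == "__main__":
    print(parse_rule("*.[sS]\t\t\tasm --comment=;"))
    print(parse_rule("*.h\t\t\tC"))
```

A rule without scanner options yields an empty options string, so callers can treat every rule as a uniform three-element tuple.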

The special pattern ‘**’ denotes the default source language. This is the language that's assigned to file names that don't match any other pattern.
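
The fallback behaviour of ‘**’ can be sketched in Python as follows. This is only an illustration under stated assumptions, not the lookup code of the ID utilities: it assumes that the first matching rule wins, that patterns are compared with the last component of the file name, and that matching is case-sensitive (which the separate ‘*.c’ and ‘*.C’ rules in the sample file suggest). The name language_for is hypothetical.

```python
import os
from fnmatch import fnmatchcase


def language_for(file_name, rules):
    """Return the language of the first rule whose pattern matches.

    `rules` is a sequence of (pattern, language) pairs in file order.
    The '**' entry never matches directly; its language is remembered
    and returned only when no other pattern matches.
    Assumptions: first match wins, matching uses the base name only.
    """
    base = os.path.basename(file_name)
    default = None
    for pattern, language in rules:
        if pattern == "**":
            default = language
        elif fnmatchcase(base, pattern):
            return language
    return default


if __name__ == "__main__":
    rules = [("**", "IGNORE"), ("*.h", "C"), ("*.c", "C"), ("*.C", "C++")]
    for name in ("src/main.c", "Widget.C", "notes.txt"):
        print(name, language_for(name, rules))
```

Because the default is only consulted after the loop finds no match, it makes no difference where in the rule list the ‘**’ line appears.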

The special pattern ‘***’ should be followed by a file name. The named file should contain more language-map rules and is included at this point.

The order in which rules are presented in a language-map file is significant. This order influences the order in which files are displayed as the result of queries. For example, the distributed language-map file places all rules for C .h files ahead of .c files, so that in general, declarations will precede definitions in query output. The same thing is done for C++ and its many different source file name extensions.
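
One way to picture the effect of rule order is to rank each file by the position of the first rule it matches and list files in that rank order. The Python sketch below does exactly that. It is a simplified model for illustration only: the text above says that rule order influences output order, not that the ID utilities sort in precisely this way, and the names rule_rank and in_map_order are hypothetical.

```python
from fnmatch import fnmatchcase


def rule_rank(file_name, patterns):
    """Return the index of the first pattern that matches file_name.

    The default pattern '**' is skipped; files that match nothing rank
    after every explicit rule.
    """
    for index, pattern in enumerate(patterns):
        if pattern != "**" and fnmatchcase(file_name, pattern):
            return index
    return len(patterns)


def in_map_order(file_names, patterns):
    """Sort file names by the position of their matching rule.

    sorted() is stable, so files matching the same rule keep their
    original relative order.
    """
    return sorted(file_names, key=lambda name: rule_rank(name, patterns))


if __name__ == "__main__":
    patterns = ["**", "*.h", "*.H", "*.c", "*.C"]
    files = ["main.c", "Widget.C", "main.h", "Widget.H"]
    print(in_map_order(files, patterns))
```

With header patterns listed ahead of code patterns, as in the distributed file, the header files in this model come out ahead of the code files.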

Here is a pared-down version of the language-map file distributed with the ID utilities:

     # Default language
     **			IGNORE	# Although this is listed first,
     				# the default language pattern is
     				# logically matched last.
     # Backup files
     *~			IGNORE
     *.bak			IGNORE
     *.bk[0-9]		IGNORE
     # SCCS files
     [sp].*			IGNORE
     # list header files before code files
     *.h			C
     *.h.in			C
     *.H			C++
     *.hh			C++
     *.hpp			C++
     *.hxx			C++
     # list C `meta' files next
     *.l			C
     *.lex			C
     *.y			C
     *.yacc			C
     # list C code files after header files
     *.c			C
     *.C			C++
     *.cc			C++
     *.cpp			C++
     *.cxx			C++
     # list assembly language after C
     *.[sS]			asm --comment=;
     *.asm			asm --comment=;
     # [nt]roff
     *.[0-9]			roff
     *.ms			roff
     *.me			roff
     *.mm			roff
     # TeX and friends
     *.tex			TeX
     *.ltx			TeX
     *.texi			texinfo
     *.texinfo		texinfo