Introduction - ID database utilities

1 Introduction

An ID database is a binary file containing a list of file names, a list of tokens, and a sparse matrix indicating which tokens appear in which files.
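To make that structure concrete, here is a minimal Python sketch of the three parts named above. It is a conceptual model only: the names `ToyIDDatabase`, `add`, and `files_containing` are hypothetical, and the real database is a binary file whose layout is not described in this section.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIDDatabase:
    """Conceptual model only; not the on-disk format that mkid writes."""
    files: list[str] = field(default_factory=list)    # list of file names
    tokens: list[str] = field(default_factory=list)   # list of tokens
    # Sparse matrix: only (token, file) pairs that actually occur are stored.
    hits: dict[int, set[int]] = field(default_factory=dict)

    def add(self, file_name: str, file_tokens: list[str]) -> None:
        """Record that each token in file_tokens appears in file_name."""
        if file_name not in self.files:
            self.files.append(file_name)
        file_index = self.files.index(file_name)
        for token in file_tokens:
            if token not in self.tokens:
                self.tokens.append(token)
            token_index = self.tokens.index(token)
            self.hits.setdefault(token_index, set()).add(file_index)

    def files_containing(self, token: str) -> list[str]:
        """Return the names of files in which token appears."""
        if token not in self.tokens:
            return []
        token_index = self.tokens.index(token)
        return [self.files[i] for i in sorted(self.hits.get(token_index, set()))]


# Hypothetical sample data.
db = ToyIDDatabase()
db.add("main.c", ["main", "printf", "0"])
db.add("util.c", ["printf", "helper"])
print(db.files_containing("printf"))
```

Storing only the pairs that occur is what makes the matrix sparse: most tokens appear in only a few of the files, so the empty cells are never written down.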

With this database and some tools to query it (described in this manual), many text-searching tasks become simpler and faster. For example, you can list all files that reference a particular #include file throughout a huge source hierarchy, search for all the memos containing references to a project, or automatically invoke an editor on all files containing references to some function or variable. Anyone with a large software project to maintain, or a large set of text files to organize, can benefit from the ID utilities.

Although the name ‘ID’ is short for ‘identifier’, the ID utilities handle more than just identifiers: they also handle other kinds of tokens, most notably numeric constants and the contents of certain character strings. This manual therefore uses the word token to cover identifiers, numbers, and strings.
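As a rough illustration of this broader sense of token, the sketch below pulls identifiers, decimal numbers, and the contents of double-quoted strings out of a line of C-like text. The regular expression and the function name `toy_tokens` are assumptions made for this example; the scanners that mkid actually uses are not reproduced here and may split strings differently.

```python
import re

# Hypothetical pattern: a double-quoted string, an identifier, or a decimal number.
TOKEN_RE = re.compile(
    r'"(?P<string>[^"\\]*(?:\\.[^"\\]*)*)"'
    r'|(?P<ident>[A-Za-z_]\w*)'
    r'|(?P<number>\d+)'
)


def toy_tokens(text: str) -> list[tuple[str, str]]:
    """Return (kind, token) pairs found in text, in order of appearance."""
    found = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        found.append((kind, match.group(kind)))
    return found


print(toy_tokens('count = 42; puts("hello");'))
```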

There are several programs in the ID utilities family:

mkid: scans files for tokens and builds the ID database file.
lid: queries the ID database for tokens, then reports matching file names or matching lines.
fid: lists all tokens recorded in the database for given files, or tokens common to two files.
fnid: matches the file names in the database, rather than the tokens.
xtokid: extracts raw tokens; useful for testing new mkid scanners.
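The difference between the three query programs can be pictured with a small in-memory model. The Python sketch below is hypothetical throughout: `toy_db`, `toy_lid`, `toy_fid`, and `toy_fnid` are names invented for this illustration, and the use of shell-style wildcards for file-name matching is an assumption rather than something stated in this section.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical data: file name -> set of tokens found in that file.
toy_db = {
    "src/main.c": {"main", "printf", "helper"},
    "src/util.c": {"helper", "strlen"},
    "doc/notes.txt": {"helper", "TODO"},
}


def toy_lid(token: str) -> list[str]:
    """lid-like query: which files contain this token?"""
    return sorted(name for name, toks in toy_db.items() if token in toks)


def toy_fid(first: str, second: Optional[str] = None) -> list[str]:
    """fid-like query: tokens in one file, or tokens common to two files."""
    toks = toy_db[first]
    if second is not None:
        toks = toks & toy_db[second]
    return sorted(toks)


def toy_fnid(pattern: str) -> list[str]:
    """fnid-like query: match file names rather than tokens (glob is assumed)."""
    return sorted(name for name in toy_db if fnmatchcase(name, pattern))


print(toy_lid("helper"))
print(toy_fid("src/main.c", "src/util.c"))
print(toy_fnid("*.c"))
```

In this model lid looks up by token, fid looks up by file, and fnid never consults the tokens at all.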

In addition, the ID utilities have historically provided several query programs which are specializations of lid:

gid: (alias for ‘lid -R grep’) lists all lines containing the requested pattern.
eid: (alias for ‘lid -R edit’) invokes an editor on all files containing the requested pattern, and if possible, initiates a text search for that pattern.
aid: (alias for ‘lid -ils’) treats the requested pattern as a case-insensitive literal substring.
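The matching rule described for aid can be sketched in a few lines of Python. This only illustrates "case-insensitive literal substring" matching over an in-memory list; the function name `toy_aid_match` and the sample tokens are hypothetical, and the real program queries the ID database rather than a Python list.

```python
def toy_aid_match(pattern: str, tokens: list[str]) -> list[str]:
    """Return tokens containing pattern as a literal substring, ignoring case."""
    needle = pattern.casefold()
    return [token for token in tokens if needle in token.casefold()]


# Hypothetical sample tokens.
sample_tokens = ["getLine", "LINE_MAX", "outline", "main", "v1.0"]
print(toy_aid_match("line", sample_tokens))
# "." is taken literally here, not as a regular-expression wildcard.
print(toy_aid_match(".", sample_tokens))
```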

Please report bugs to ‘bug-idutils@gnu.org’. Remember to include the version number, machine architecture, input files, and any other information needed to reproduce the bug: your input, what you expected, what you got, and why it is wrong. Diffs are welcome, but please include a description of the problem as well, since the problem is sometimes difficult to infer from a diff alone. See Bugs.