Introduction - The GNU libextractor Reference Manual


1 Introduction

GNU libextractor is GNU's library for extracting meta data from files. Meta data includes format information (such as mime type, image dimensions, color depth, recording frequency), content descriptions (such as document title or document description) and copyright information (such as license, author and contributors). Meta data extraction is an inherently uncertain business: a parse error can be caused by a corrupt file, an incompatibility in the file format version, an entirely different file format or a bug in the parser. Because of this uncertainty, GNU libextractor deliberately never reports errors. Unexpected file contents simply result in less or possibly no meta data being extracted.

GNU libextractor uses plugins to handle various file formats. Technically a plugin can support multiple file formats; however, most plugins only support one particular format. By default, GNU libextractor will use all plugins that are available and found in the plugin installation directory. Applications can request the use of only specific plugins or the exclusion of certain plugins.

GNU libextractor is distributed with the extract command¹, a command-line tool for extracting meta data. extract is given a list of filenames and prints the resulting meta data to the console. The extract source code also serves as an advanced example of how to use GNU libextractor.

This manual focuses on providing documentation for writing software with GNU libextractor. The only parts relevant to end-users are the chapter on compiling and installing GNU libextractor (see Preparation) and, possibly, the chapter on existing plugins (see Existing Plugins). Additional documentation for end-users can be found in the man page for extract (run man extract).

GNU libextractor is licensed under the GNU General Public License. Specifically, since version 0.7, it is licensed under GPLv3 or any later version.

Footnotes

[1] Some distributions ship extract in a separate package.