
3 Generalities

3.1 Introduction to the “extract” command

The extract command takes a list of file names as arguments, extracts meta data from each of those files and prints the result to the console. By default, extract will use all available plugins and print all (non-binary) meta data that is found.

The set of plugins used by extract can be controlled using the “-l” and “-n” options. Use “-n” to skip loading the default plugins. Use “-l NAME” to load a specific plugin. For example, specify “-n -l mime” to use only the MIME plugin.
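
The combination of “-n” and “-l” described above can be wrapped in a small shell function. This is only a sketch: the function name only_plugin and the EXTRACT variable are hypothetical helpers introduced for illustration and are not part of extract itself. Only the “-n” and “-l NAME” options come from the text above.

```shell
#!/bin/sh
# EXTRACT names the program to run. This variable is an assumption of this
# sketch (it allows substituting a stub for a dry run), not an extract feature.
EXTRACT=${EXTRACT:-extract}

# only_plugin PLUGIN FILE...
# Run extract with the default plugins disabled ("-n") and exactly one
# plugin loaded ("-l PLUGIN").
only_plugin() {
    plugin=$1
    shift
    "$EXTRACT" -n -l "$plugin" "$@"
}
```

With this definition, “only_plugin mime test/test.jpg” runs the same command as “extract -n -l mime test/test.jpg”.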

Using the “-p” option, the output of extract can be limited to certain keyword types. Similarly, using the “-x” option, certain keyword types can be excluded. A list of all known keyword types can be obtained using the “-L” option.

The output format of extract can be influenced with the “-V” (more verbose, lists filenames), “-g” (grep-friendly, all meta data on a single line per file) and “-b” (BibTeX style) options.
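
Because “-g” places all meta data for a file on a single line, its output can be piped through grep to select whole files by their meta data. The following is a minimal sketch: the function name grep_meta and the EXTRACT variable are hypothetical, and the exact layout of each “-g” line is not specified here, so the pattern should match only on text expected to appear in the meta data.

```shell
#!/bin/sh
# EXTRACT names the program to run. This variable is an assumption of this
# sketch, not an extract feature.
EXTRACT=${EXTRACT:-extract}

# grep_meta PATTERN FILE...
# Print the single "-g" output line of every file whose meta data matches
# PATTERN (a basic regular expression, as understood by grep).
grep_meta() {
    pattern=$1
    shift
    "$EXTRACT" -g "$@" | grep -e "$pattern"
}
```

For example, “grep_meta image/jpeg test/test.jpg test/test.png” would print only the lines that mention image/jpeg.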

3.2 Common usage examples for “extract”

     $ extract test/test.jpg
     comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
     mimetype - image/jpeg
     
     $ extract -V -x comment test/test.jpg
     Keywords for file test/test.jpg:
     mimetype - image/jpeg
     
     $ extract -p comment test/test.jpg
     comment - (C) 2001 by Christian Grothoff, using gimp 1.2 1
     
     $ extract -nV -l png.so -p comment test/test.jpg test/test.png
     Keywords for file test/test.jpg:
     Keywords for file test/test.png:
     comment - Testing keyword extraction
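
The default output shown in these examples uses the form “type - value”, which makes it easy to pick out a single value in a script. The sketch below assumes that every line follows that form without leading whitespace, and that header lines such as “Keywords for file ...:” contain no “ - ” sequence. The function name meta_value is hypothetical.

```shell
#!/bin/sh
# meta_value TYPE
# Read extract's default output on standard input and print the value of
# every line whose keyword type equals TYPE, one value per line.
meta_value() {
    awk -v want="$1" '
        {
            sep = index($0, " - ")      # first " - " separates type and value
            if (sep == 0)
                next                    # header line, no keyword on it
            if (substr($0, 1, sep - 1) == want)
                print substr($0, sep + 3)
        }'
}
```

A typical use is “extract test/test.jpg | meta_value mimetype”. Splitting at the first “ - ” only means that values which themselves contain “ - ” are kept intact.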

3.3 Introduction to the libextractor library

Each public symbol exported by GNU libextractor has the prefix EXTRACTOR_. All-caps names are used for constants. For the impatient, the minimal C code for using GNU libextractor (on the file named by the first command-line argument) looks like this:

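#include <stdio.h>
#include <extractor.h>

int
main (int argc, char ** argv)
{
  struct EXTRACTOR_PluginList *plugins;

  if (argc < 2)
    {
      fprintf (stderr, "usage: %s FILE\n", argv[0]);
      return 1;
    }
  /* Load the default set of plugins. */
  plugins = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
  /* Extract from the named file (NULL, 0: no in-memory buffer) and
     print each item found to stdout. */
  EXTRACTOR_extract (plugins, argv[1],
                     NULL, 0,
                     &EXTRACTOR_meta_data_print, stdout);
  /* Unload the plugins and free the list. */
  EXTRACTOR_plugin_remove_all (plugins);
  return 0;
}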

The minimal API illustrated by this example is sufficient for many applications. The full external C API of GNU libextractor is described in the chapter “Extracting meta data”. Bindings for other languages are described in the chapter “Language bindings”. The API for writing new plugins is described in the chapter “Writing new Plugins”.