Adding new document types to be recognized by nndoc is not
difficult. You just have to whip up a definition of what the document
looks like, write a predicate function to recognize that document type,
and then hook into nndoc.
First, here's an example document type definition:
(mmdf
 (article-begin . "^\^A\^A\^A\^A\n")
 (body-end . "^\^A\^A\^A\^A\n"))
The definition is simply a unique name followed by a series of regexp pseudo-variable settings. Below are the possible variables—don't be daunted by the number of variables; most document types can be defined with very few settings:
first-article: If present, nndoc will skip past all text until it finds something that matches this regexp. All text before this will be totally ignored.
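article-begin: for cases a simple regexp cannot handle, use article-begin-function instead of this.
head-begin: for cases a simple regexp cannot handle, use head-begin-function instead of this.
body-begin: for cases a simple regexp cannot handle, use body-begin-function instead of this.
body-end: for cases a simple regexp cannot handle, use body-end-function instead of this.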
So, using these variables
nndoc is able to dissect a document
file into a series of articles, each with a head and a body. However, a
few more variables are needed since not all document types are all that
news-like—variables needed to transform the head or the body into
something that's palatable for Gnus:
Let's look at the most complicated example I can come up with—standard digests:
(standard-digest
 (first-article . ,(concat "^" (make-string 70 ?-) "\n\n+"))
 (article-begin . ,(concat "\n\n" (make-string 30 ?-) "\n\n+"))
 (prepare-body-function . nndoc-unquote-dashes)
 (body-end-function . nndoc-digest-body-end)
 (head-end . "^ ?$")
 (body-begin . "^ ?\n")
 (file-end . "^End of .*digest.*[0-9].*\n\\*\\*\\|^End of.*Digest *$")
 (subtype digest guess))
We see that all text before a 70-width line of dashes is ignored; all
text after a line that starts with ‘End of’ is also ignored;
each article begins with a 30-width line of dashes; the line separating
the head from the body may contain a single space; and the body is
run through nndoc-unquote-dashes before being delivered.
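To make the two dash-line regexps from the digest definition more concrete, here is a minimal sketch that builds them with the same concat expressions and tries them on a made-up digest fragment. The names my-first-article-re, my-article-begin-re, my-sample-digest and my-match-offsets are hypothetical and exist only for this illustration; nndoc itself matches in a buffer, not with string-match on a string.

```elisp
;; The two regexps, built exactly as in the standard-digest definition.
;; (The my- names are hypothetical, chosen only for this sketch.)
(defconst my-first-article-re
  (concat "^" (make-string 70 ?-) "\n\n+"))
(defconst my-article-begin-re
  (concat "\n\n" (make-string 30 ?-) "\n\n+"))

;; A made-up digest fragment: a preamble, a 70-dash line, one article,
;; a 30-dash separator, and a second article.
(defconst my-sample-digest
  (concat "Preamble that is skipped\n"
          (make-string 70 ?-) "\n\n"
          "Subject: one\n\nbody one\n\n"
          (make-string 30 ?-) "\n\n"
          "Subject: two\n\nbody two\n"))

(defun my-match-offsets ()
  "Return the offsets where the two regexps first match in the sample."
  (list (string-match my-first-article-re my-sample-digest)
        (string-match my-article-begin-re my-sample-digest)))
```

Calling (my-match-offsets) returns two offsets: the first marks the 70-dash line that ends the ignored preamble, the second marks the blank line in front of the 30-dash separator between the two articles.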
To hook your own document definition into
nndoc, use the
nndoc-add-type function. It takes two parameters—the first
is the definition itself and the second (optional) parameter says
where in the document type definition alist to put this definition.
The alist is traversed sequentially, and
nndoc-TYPE-type-p is called for a given type TYPE. For example,
nndoc-mmdf-type-p is called to see whether a document is of
mmdf type, and so on. These type predicates should return
nil if the document is not of the correct type;
t if it
is of the correct type; and a number if the document might be of the
correct type. A high number means high probability; a low number
means low probability with ‘0’ being the lowest valid number.
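The steps above can be sketched as follows for a made-up document type. Everything about the my-log format (its name, its ‘=== ARTICLE ===’ and ‘=== END ===’ marker lines, and the score of 10) is hypothetical. The sketch also assumes that the predicate is called with no arguments in a buffer holding the document, which is how the predicates shipped with nndoc appear to work.

```elisp
(require 'nndoc)

;; Predicate for the hypothetical `my-log' type.  The name follows the
;; nndoc-TYPE-type-p convention so that nndoc can find it.
(defun nndoc-my-log-type-p ()
  "Say whether the current buffer looks like a `my-log' document.
Return t for a certain match, a number for a possible match, and
nil otherwise."
  (save-excursion
    (goto-char (point-min))
    (cond
     ;; The marker is the very first line: certainly our type.
     ((looking-at "=== ARTICLE ===$") t)
     ;; The marker shows up only later: might be our type.
     ((re-search-forward "^=== ARTICLE ===$" nil t) 10)
     ;; No marker at all: not our type.
     (t nil))))

;; Register the definition.  With no second argument it is added at
;; the end of the document type definition alist.
(nndoc-add-type
 '(my-log
   (article-begin . "^=== ARTICLE ===\n")
   (body-end . "^=== END ===\n")))
```

Returning a number rather than t for the weaker match lets other document types that score higher on the same file take precedence.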