Kawa: XML literals

XML literals

You can write XML literals directly in Scheme code, following a #. Notice that the outermost element needs to be prefixed by #, but nested elements do not (and must not).

#<p>The result is <b>final</b>!</p>

Actually, these are not really literals since they can contain enclosed expressions:

#<em>The result is &{result}.</em>

The value of result is substituted into the output, in a similar way to quasi-quotation. (If you try to quote one of these “XML literals”, what you get is unspecified and is subject to change.)
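
A minimal sketch of how an enclosed expression might be used in a small program. The names result and msg are illustrative assumptions. element-name and symbol-local-name are the accessors used in the namespace example in this document.

```scheme
;; Illustrative names: `result` and `msg` are assumptions for this sketch.
(define result (* 6 7))

;; The enclosed expression &{result} is evaluated when the element
;; is constructed, so the current value of `result` is substituted.
(define msg #<em>The result is &{result}.</em>)

;; Inspect the tag of the constructed element.
(symbol-local-name (element-name msg))
```

Because the expression is evaluated at construction time, wrapping the literal in a procedure would give a fresh element for each call.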

An xml-literal is usually an element constructor, but there are some rarely used forms (processing instructions, comments, and CDATA sections) that we'll cover later.

xml-literal ::= #xml-constructor
xml-constructor ::= xml-element-constructor
  | xml-PI-constructor
  | xml-comment-constructor
  | xml-CDATA-constructor

Element constructors

xml-element-constructor ::=
    <QName xml-attribute*>xml-element-datum...</QName >
  | <xml-name-form xml-attribute*>xml-element-datum...</>
  | <xml-name-form xml-attribute*/>
xml-name-form ::= QName
  | xml-enclosed-expression
xml-enclosed-expression ::=
    {expression}
  | (expression...)

The first xml-element-constructor variant uses a literal QName, and looks like a standard non-empty XML element, where the starting QName and the ending QName must match exactly:

#<a href="next.html">Next</a>

As a convenience, you can leave out the name in the ending tag(s):

#<para>This is a paragraph in <emphasis>DocBook</> syntax.</>

You can use an expression to compute the element tag at runtime; in that case you must leave the name out of the ending tag:

#<p>This is <(if be-bold 'strong 'em)>important</>!</p>

You can use an arbitrary expression inside curly braces, as long as it evaluates to a symbol. You can leave out the curly braces if the expression is a simple parenthesised compound expression. The previous example is equivalent to:

#<p>This is <{(if be-bold 'strong 'em)}>important</>!</p>
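
As a sketch of how a computed tag might be used, the procedure below chooses the inner tag from an argument. The names emphasize, text, and be-bold are hypothetical and only serve this illustration.

```scheme
;; Hypothetical helper: `emphasize`, `text`, and `be-bold` are
;; names chosen for this sketch only.
(define (emphasize text be-bold)
  ;; The inner tag is computed at run time, so its ending tag
  ;; has to be written as </> with no name.
  #<p>This is <(if be-bold 'strong 'em)>&{text}</>!</p>)

(define bold-version (emphasize "important" #t))
(define plain-version (emphasize "important" #f))
```

The outer p element keeps a literal QName, so its ending tag can still be written in full.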

The third xml-element-constructor variant above is an XML “empty element”; it is equivalent to the second variant when there are no xml-element-datum items.

(Note that every well-formed XML element, as defined in the XML specifications, is a valid xml-element-constructor, but not vice versa.)

Element contents (children)

The “contents” (children) of an element are a sequence of character (text) data, and nested nodes. The characters &, <, and > are special, and need to be escaped.

xml-element-datum ::=
    any character except &, or <.
  | xml-constructor
  | xml-escaped
xml-escaped ::=
    &xml-enclosed-expression
  | &xml-entity-name;
  | xml-character-reference
xml-character-reference ::=
    &#digit+;
  | &#xhex-digit+;

Here is an example that shows both hex and decimal character references:

#<p>A&#66;C&#x44;E</p>  ⇒  <p>ABCDE</p>

xml-entity-name ::= identifier

Currently, the only supported values for xml-entity-name are the builtin XML names lt, gt, amp, quot, and apos, which stand for the characters <, >, &, ", and ', respectively. The following two expressions are equivalent:

#<p>&lt; &gt; &amp; &quot; &apos;</p>
#<p>&{"< > & \" '"}</p>

Attributes

xml-attribute ::=
    xml-name-form=xml-attribute-value
xml-attribute-value ::=
    "quot-attribute-datum*"
  | 'apos-attribute-datum*'
quot-attribute-datum ::=
    any character except ", &, or <.
  | xml-escaped
apos-attribute-datum ::=
    any character except ', &, or <.
  | xml-escaped

If the xml-name-form is either xmlns or a compound name with the prefix xmlns, then technically we have a namespace declaration, rather than an attribute.
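
For ordinary attributes, the grammar above allows xml-escaped items inside an attribute value, so an enclosed expression can supply all or part of it. The sketch below assumes a hypothetical helper named link. Its parameters url and label are illustrative.

```scheme
;; Hypothetical helper: `link`, `url`, and `label` are names chosen
;; for this sketch only.
(define (link url label)
  ;; &{url} is an xml-escaped item inside a quoted attribute value;
  ;; &{label} is an xml-escaped item in the element content.
  #<a href="&{url}">&{label}</a>)

(define next-link (link "next.html" "Next"))
```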

QNames and namespaces

The names of elements and attributes are qualified names (QNames), which are represented using compound symbols (see Namespaces). The lexical syntax for a QName is either a simple identifier, or a (prefix,local-name) pair:

QName ::= xml-local-part
  | xml-prefix:xml-local-part
xml-local-part ::= identifier
xml-prefix ::= identifier

An xml-prefix is an alias for a namespace-uri, and the mapping between them is defined by a namespace-declaration. You can either use a define-namespace form, or you can use a namespace declaration attribute:

xml-namespace-declaration-attribute ::=
    xmlns:xml-prefix=xml-attribute-value
  | xmlns=xml-attribute-value

The first form declares xml-prefix as a namespace alias for the namespace-uri specified by xml-attribute-value (which must be a compile-time constant). The second form declares that xml-attribute-value is the default namespace for simple (unprefixed) element tags. (A default namespace declaration is ignored for attribute names.)

(let ((qn (element-name #<gnu:b xmlns:gnu="http://gnu.org/"/>)))
  (list (symbol-local-name qn)
        (symbol-prefix qn)
        (symbol-namespace-uri qn)))
⇒ ("b" "gnu" "http://gnu.org/")
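
The define-namespace alternative mentioned above might look like the following sketch. It assumes that a prefix declared with define-namespace and a literal URI is visible to XML literals in the same scope, as the text describes. The name qn is illustrative.

```scheme
;; Assumption: a prefix declared with define-namespace can be used
;; in XML literals, so no xmlns:gnu attribute is needed here.
(define-namespace gnu "http://gnu.org/")

(define qn (element-name #<gnu:b/>))

;; Look at the parts of the resulting compound symbol.
(list (symbol-local-name qn)
      (symbol-namespace-uri qn))
```

Declaring the prefix once with define-namespace avoids repeating the namespace declaration attribute on every outermost element.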

Other XML types

Processing instructions

An xml-PI-constructor can be used to create an XML processing instruction, which can be used to pass instructions or annotations to an XML processor (or tool). (Alternatively, you can use the processing-instruction type constructor.)

xml-PI-constructor ::= <?xml-PI-target xml-PI-content?>
xml-PI-target ::= NCName (i.e. a simple (non-compound) identifier)
xml-PI-content ::= any characters, not containing ?>.

For example, the DocBook XSLT stylesheets can use the dbhtml instructions to specify that a specific chapter should be written to a named HTML file:

#<chapter><?dbhtml filename="intro.html" ?>
<title>Introduction</title>
...
</chapter>

XML comments

You can cause XML comments to be emitted in the XML output document. Such comments can be useful for humans reading the XML document, but are usually ignored by programs. (Alternatively, you can use the comment type constructor.)

xml-comment-constructor ::= <!--xml-comment-content-->
xml-comment-content ::= any characters, not containing --.
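
A short sketch of both ways to produce a comment. The names chapter and note are illustrative. The call to the comment type constructor assumes it accepts the comment text as a single string argument.

```scheme
;; Literal form: the comment is nested inside an element,
;; so it takes no # prefix of its own.
(define chapter
  #<chapter><!-- TODO: expand this chapter -->
<title>Introduction</title>
</chapter>)

;; Constructor form. Assumption: `comment` accepts the comment
;; text as a single string argument.
(define note (comment " TODO: expand this chapter "))
```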

CDATA sections

A CDATA section can be used to avoid excessive use of entity references such as &amp; in element content.

xml-CDATA-constructor ::= <![CDATA[xml-CDATA-content]]>
xml-CDATA-content ::= any characters, not containing ]]>.

The following are equivalent:

#<p>Special characters <![CDATA[< > & " ']]> here.</p>
#<p>Special characters &lt; &gt; &amp; &quot; &apos; here.</p>

Kawa remembers that you used a CDATA section in the xml-element-constructor and will write it out using a CDATA constructor.