
14.12.1 XML Input

The primary entry point for the XML parser is read-xml, which reads characters from a port and returns an XML document record. The character coding of the input is determined by reading the beginning of the input stream and looking for a byte order mark and/or an encoding in the XML declaration. The parser supports all ISO 8859 codings, as well as UTF-8, UTF-16, and UTF-32.

When an XHTML document is read, the parser provides entity definitions for all of the named XHTML characters; for example, it defines `&nbsp;' and `&copy;'. In order for a document to be recognized as XHTML, it must contain an XHTML DTD, such as this:

     <!DOCTYPE html
               PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

At present the parser recognizes XHTML Strict 1.0 and XHTML 1.1 documents.

— procedure: read-xml port [pi-handlers]

Read an XML document from port and return the corresponding XML document record.
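
As a minimal sketch of calling read-xml, the following reads a document from a string port and extracts the name of its root element. It assumes that the XML support has been loaded with (load-option 'xml) and that the accessors xml-document-root and xml-element-name are available for the document and element records; the helper xml-root-name is hypothetical and is not part of the library.

```scheme
;; Assumption: XML support is a loadable option.
(load-option 'xml)

;; Hypothetical helper: read an XML document from PORT and return the
;; name of its root element.
;; Assumption: xml-document-root and xml-element-name are the accessors
;; for the document and element records.
(define (xml-root-name port)
  (xml-element-name (xml-document-root (read-xml port))))

;; Any input port will do; a string port keeps the example self-contained.
(xml-root-name
 (open-input-string "<?xml version=\"1.0\"?><greeting>hello</greeting>"))
```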

Pi-handlers, if specified, must be an association list. Each element of pi-handlers must be a list of two elements: a symbol and a procedure. When the parser encounters a processing instruction whose name appears in pi-handlers, the corresponding procedure is called with one argument, which is the text of the processing instruction. The procedure must return a list of XML structure records that are legal in the context where the processing instruction appears.
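
The shape of a pi-handlers argument can be sketched as follows. The helper read-xml-collecting-notes and the processing instruction name note are hypothetical, chosen only for illustration. The sketch also assumes that the XML support is loaded with (load-option 'xml) and that returning the empty list is an acceptable way for a handler to contribute no records.

```scheme
;; Assumption: XML support is a loadable option.
(load-option 'xml)

;; Hypothetical helper: parse STRING and collect the text of every
;; <?note ...?> processing instruction the parser encounters.
;; Returns two values: the document record and the list of collected texts.
(define (read-xml-collecting-notes string)
  (let ((notes '()))
    (let ((document
           (read-xml (open-input-string string)
                     ;; One entry per handled name: (symbol procedure).
                     (list (list 'note
                                 (lambda (text)
                                   (set! notes (cons text notes))
                                   ;; Assumption: an empty list is a legal
                                   ;; result, contributing no records.
                                   '()))))))
      (values document (reverse notes)))))
```

A call such as (read-xml-collecting-notes "<doc><?note check this?></doc>") would then return the document together with the texts seen by the handler, in document order.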

— procedure: read-xml-file pathname [pi-handlers]

This convenience procedure simplifies reading XML from a file. It is roughly equivalent to

          (define (read-xml-file pathname #!optional pi-handlers)
            (call-with-input-file pathname
              (lambda (port)
                (read-xml port pi-handlers))))

— procedure: string->xml string [start [end [pi-handlers]]]

This convenience procedure simplifies reading XML from a string. The string argument may be a string or a wide string. It is roughly equivalent to

          (define (string->xml string #!optional start end pi-handlers)
            (read-xml (open-input-string string start end)