Types and the Web (Guile Reference Manual)

Next: Universal Resource Identifiers, Up: HTTP, the Web, and All That [Contents][Index]

7.3.1 Types and the Web

It is a truth universally acknowledged, that a program with good use of data types, will be free from many common bugs. Unfortunately, the common practice in web programming seems to ignore this maxim. This subsection makes the case for expressive data types in web programming.

By “expressive data types”, we mean that the data types say something about how a program solves a problem. For example, if we choose to represent dates using SRFI 19 date records (see SRFI-19 - Time/Date Library), this indicates that there is a part of the program that will always have valid dates. Error handling for a number of basic cases, like invalid dates, occurs on the boundary in which we produce a SRFI 19 date record from other types, like strings.

With regards to the web, data types are helpful in the two broad phases of HTTP messages: parsing and generation.

Consider a server, which has to parse a request, and produce a response. Guile will parse the request into an HTTP request object (see HTTP Requests), with each header parsed into an appropriate Scheme data type. This transition from an incoming stream of characters to typed data is a state change in a program—the strings might parse, or they might not, and something has to happen if they do not. (Guile throws an error in this case.) But after you have the parsed request, “client” code (code built on top of the Guile web framework) will not have to check for syntactic validity. The types already make this information manifest.

This state change on the parsing boundary makes programs more robust, as they themselves are freed from the need to do a number of common error checks, and they can use normal Scheme procedures to handle a request instead of ad-hoc string parsers.

The need for types on the response generation side (in a server) is more subtle, though not less important. Consider the example of a POST handler, which prints out the text that a user submits from a form. Such a handler might include a procedure like this:

;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))

(display (you-said "Hi!"))
-| <p>You said: Hi!</p>

This is a perfectly valid implementation, provided that the incoming text does not contain the special HTML characters ‘<’, ‘>’, or ‘&’. But this provision of a restricted character set is not reflected anywhere in the program itself: we must assume that the programmer understands this, and performs the check elsewhere.

Unfortunately, the short history of the practice of programming does not bear out this assumption. A cross-site scripting (XSS) vulnerability is just such a common error in which unfiltered user input is allowed into the output. A user could submit a crafted comment to your web site which results in visitors running malicious Javascript, within the security context of your domain:

(display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
-| <p>You said: <script src="http://bad.com/nasty.js" /></p>

The fundamental problem here is that both user data and the program template are represented using strings. This identity means that types can’t help the programmer to make a distinction between these two, so they get confused.

There are a number of possible solutions, but perhaps the best is to treat HTML not as strings, but as native s-expressions: as SXML. The basic idea is that HTML is either text, represented by a string, or an element, represented as a tagged list. So ‘foo’ becomes ‘"foo"’, and ‘<b>foo</b>’ becomes ‘(b "foo")’. Attributes, if present, go in a tagged list headed by ‘@’, like ‘(img (@ (src "http://example.com/foo.png")))’. See SXML, for more information.

The good thing about SXML is that HTML elements cannot be confused with text. Let’s make a new definition of para:

(define (para . contents)
  `(p ,@contents))

(use-modules (sxml simple))
(sxml->xml (you-said "Hi!"))
-| <p>You said: Hi!</p>

(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
-| <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>

So we see in the second example that HTML elements cannot be unwittingly introduced into the output. However it is now perfectly acceptable to pass SXML to you-said; in fact, that is the big advantage of SXML over everything-as-a-string.

(sxml->xml (you-said (you-said "<Hi!>")))
-| <p>You said: <p>You said: &lt;Hi!&gt;</p></p>

The SXML types allow procedures to compose. The types make manifest which parts are HTML elements, and which are text. So you needn’t worry about escaping user input; the type transition back to a string handles that for you. XSS vulnerabilities are a thing of the past.

Well. That’s all very nice and opinionated and such, but how do I use the thing? Read on!