Warning: This is the manual of the legacy Guile 2.0 series. You may want to read the manual of the current stable series instead.



7.3.2 Uniform Resource Identifiers

Guile provides a standard data type for Uniform Resource Identifiers (URIs), as defined in RFC 3986.

The generic URI syntax is as follows:

URI := scheme ":" ["//" [userinfo "@"] host [":" port]] path \
       [ "?" query ] [ "#" fragment ]

For example, in the URI, ‘http://www.gnu.org/help/’, the scheme is http, the host is www.gnu.org, the path is /help/, and there is no userinfo, port, query, or fragment. All URIs have a scheme and a path (though the path might be empty). Some URIs have a host, and some of those have ports and userinfo. Any URI might have a query part or a fragment.

Userinfo is something of an abstraction, as some legacy URI schemes allowed userinfo of the form username:passwd. But since passwords do not belong in URIs, the RFC does not want to condone this practice, so it calls anything before the @ sign userinfo.

Properly speaking, a fragment is not part of a URI. For example, when a web browser follows a link to ‘http://example.com/#foo’, it sends a request for ‘http://example.com/’, then looks in the resulting page for the part identified by foo. A fragment identifies a part of a resource, not the resource itself. But it is useful to have a fragment field in the URI record itself, so we hope you will forgive the inconsistency.

(use-modules (web uri))

The following procedures can be found in the (web uri) module. Load it into your Guile, using a form like the above, to have access to them.

Scheme Procedure: build-uri scheme [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]

Construct a URI object. scheme should be a symbol, port either a positive, exact integer or #f, and the rest of the fields are either strings or #f. If validate? is true, also run some consistency checks to make sure that the constructed URI is valid.
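As a small illustration of the keyword arguments, the following sketch builds a URI with every field filled in. The host, userinfo, and other values are made up for the example.

```scheme
(use-modules (web uri))

;; The scheme is a symbol, the port an exact integer, and the
;; remaining fields are strings.  All values here are illustrative.
(define example-uri
  (build-uri 'https
             #:userinfo "user"
             #:host "example.com"
             #:port 8443
             #:path "/a/b"
             #:query "x=1"
             #:fragment "top"))

(uri? example-uri)      ; → #t
(uri-port example-uri)  ; → 8443
```

Fields that are left out keep their defaults, so (build-uri 'http #:host "www.gnu.org" #:path "/help/") has no userinfo, port, query, or fragment.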

Scheme Procedure: uri? obj
Scheme Procedure: uri-scheme uri
Scheme Procedure: uri-userinfo uri
Scheme Procedure: uri-host uri
Scheme Procedure: uri-port uri
Scheme Procedure: uri-path uri
Scheme Procedure: uri-query uri
Scheme Procedure: uri-fragment uri

A predicate and field accessors for the URI record type. The URI scheme will be a symbol, the port either a positive, exact integer or #f, and the rest either strings or #f if not present.
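To see the accessors side by side, here is a sketch that collects every field of a URI into an association list. The helper name uri-fields is hypothetical and not part of (web uri).

```scheme
(use-modules (web uri))

;; Hypothetical helper: gather all fields of URI into an alist.
(define (uri-fields uri)
  `((scheme   . ,(uri-scheme uri))
    (userinfo . ,(uri-userinfo uri))
    (host     . ,(uri-host uri))
    (port     . ,(uri-port uri))
    (path     . ,(uri-path uri))
    (query    . ,(uri-query uri))
    (fragment . ,(uri-fragment uri))))

(define gnu-help
  (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

(assq-ref (uri-fields gnu-help) 'scheme)  ; → http
(assq-ref (uri-fields gnu-help) 'host)    ; → "www.gnu.org"
(assq-ref (uri-fields gnu-help) 'port)    ; → #f
```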

Scheme Procedure: string->uri string

Parse string into a URI object. Return #f if the string could not be parsed.
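Because string->uri returns #f rather than signalling an error, callers usually test the result before using it. A minimal sketch follows; the helper name host-of is an assumption made for this example.

```scheme
(use-modules (web uri))

;; Hypothetical helper: return the host of STR, or #f if STR does
;; not parse as a URI (or parses but has no host).
(define (host-of str)
  (let ((uri (string->uri str)))
    (and uri (uri-host uri))))

(host-of "http://www.gnu.org/help/")  ; → "www.gnu.org"
(host-of "not a uri")                 ; → #f
```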

Scheme Procedure: uri->string uri

Serialize uri to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization.
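The effect of the default-port rule can be seen in the following sketch. It assumes that port 80 is already registered as the default for http, which (web uri) does when it is loaded.

```scheme
(use-modules (web uri))

;; Port 80 is the default for http, so it is left out.
(uri->string (build-uri 'http #:host "example.com" #:port 80 #:path "/"))
;; → "http://example.com/"

;; Any other port is written out explicitly.
(uri->string (build-uri 'http #:host "example.com" #:port 8080 #:path "/"))
;; → "http://example.com:8080/"
```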

Scheme Procedure: declare-default-port! scheme port

Declare a default port for the given URI scheme.
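For instance, the sketch below registers a default port for a made-up scheme and shows how that changes serialization. Both the scheme name myproto and port 4000 are invented for the example.

```scheme
(use-modules (web uri))

(define demo-uri
  (build-uri 'myproto #:host "example.org" #:port 4000 #:path "/"))

;; Before the declaration, the port is part of the output.
(define before (uri->string demo-uri))  ; "myproto://example.org:4000/"

(declare-default-port! 'myproto 4000)

;; Afterwards, port 4000 is recognized as the default and omitted.
(define after (uri->string demo-uri))   ; "myproto://example.org/"
```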

Scheme Procedure: uri-decode str [#:encoding="utf-8"] [#:decode-plus-to-space?=#t]

Percent-decode the given str, according to encoding, which should be the name of a character encoding.

Note that this function should not generally be applied to a full URI string. For paths, use split-and-decode-uri-path instead. For query strings, split the query on & and = boundaries, and decode the components separately.
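The advice about query strings can be sketched as follows. The procedure decode-query is a hypothetical helper, not part of (web uri); it splits first and decodes afterwards, so that an encoded & or = inside a value is not mistaken for a delimiter.

```scheme
(use-modules (web uri))

;; Hypothetical helper: turn "k1=v1&k2=v2" into an alist of decoded
;; strings.  A component without "=" gets an empty string as value.
(define (decode-query query)
  (map (lambda (pair)
         (let ((idx (string-index pair #\=)))
           (if idx
               (cons (uri-decode (substring pair 0 idx))
                     (uri-decode (substring pair (+ idx 1))))
               (cons (uri-decode pair) ""))))
       (string-split query #\&)))

(decode-query "q=guile+scheme&lang=en")
;; → (("q" . "guile scheme") ("lang" . "en"))
```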

Note also that percent-encoded strings encode bytes, not characters. There is no guarantee that a given byte sequence is a valid string encoding. Therefore this routine may signal an error if the decoded bytes are not valid for the given encoding. Pass #f for encoding if you want the decoded bytes directly, as a bytevector. See set-port-encoding! for more information on character encodings.

If decode-plus-to-space? is true, which is the default, also replace instances of the plus character ‘+’ with a space character. This is needed when parsing application/x-www-form-urlencoded data.

Returns a string of the decoded characters, or a bytevector if encoding was #f.
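A few calls may make the options concrete. This is only a sketch; the input strings are arbitrary.

```scheme
(use-modules (web uri))

(uri-decode "hello%20world")                   ; → "hello world"

;; By default, "+" is treated as an encoded space.
(uri-decode "a+b")                             ; → "a b"
(uri-decode "a+b" #:decode-plus-to-space? #f)  ; → "a+b"

;; The three bytes E2 82 AC are the UTF-8 encoding of a single
;; character, the euro sign.
(string-length (uri-decode "%E2%82%AC"))       ; → 1

;; With #:encoding #f the raw bytes are returned instead.
(uri-decode "%E2%82%AC" #:encoding #f)         ; → #vu8(226 130 172)
```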


Scheme Procedure: uri-encode str [#:encoding="utf-8"] [#:unescaped-chars]

Percent-encode any character of str that is not in the character set unescaped-chars.

The default character set includes alphanumerics from ASCII, as well as the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any other character will be percent-encoded, by writing out the character to a bytevector within the given encoding, then encoding each byte as %HH, where HH is the hexadecimal representation of the byte.
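The following sketch shows the default behavior and a custom unescaped-chars argument. The character set path-safe is built here for illustration; it is not something (web uri) exports.

```scheme
(use-modules (web uri))

;; With the default set, spaces and ampersands are escaped.
(uri-encode "scrambled eggs")   ; → "scrambled%20eggs"
(uri-encode "biscuits&gravy")   ; → "biscuits%26gravy"

;; Illustrative character set: the default unescaped characters
;; plus "/", so that path separators survive encoding.
(define path-safe
  (char-set-union
   (string->char-set "-._~/")
   (char-set-intersection char-set:ascii char-set:letter+digit)))

(uri-encode "/docs/my file.txt" #:unescaped-chars path-safe)
;; → "/docs/my%20file.txt"
```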

Scheme Procedure: split-and-decode-uri-path path

Split path into its components, and decode each component, removing empty components.

For example, "/foo/bar%20baz/" decodes to the two-element list, ("foo" "bar baz").
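The sketch below repeats that example and adds two more cases: empty components are dropped, and an encoded slash stays inside its component because splitting happens before decoding.

```scheme
(use-modules (web uri))

(split-and-decode-uri-path "/foo/bar%20baz/")  ; → ("foo" "bar baz")

;; Doubled slashes produce empty components, which are removed.
(split-and-decode-uri-path "//foo//bar/")      ; → ("foo" "bar")

;; %2F decodes to "/", but only after the path has been split.
(split-and-decode-uri-path "/a%2Fb/c")         ; → ("a/b" "c")
```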

Scheme Procedure: encode-and-join-uri-path parts

URI-encode each element of parts, which should be a list of strings, and join the parts together with / as a delimiter.

For example, the list ("scrambled eggs" "biscuits&gravy") encodes as "scrambled%20eggs/biscuits%26gravy".
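As a sketch of how this fits with the other procedures, the code below encodes a list of parts, checks that split-and-decode-uri-path recovers them, and uses the result as the path of a URI. Note that the joined string has no leading /, so one is added by hand; the host name is made up.

```scheme
(use-modules (web uri))

(define parts '("scrambled eggs" "biscuits&gravy"))

(define joined (encode-and-join-uri-path parts))
;; joined is "scrambled%20eggs/biscuits%26gravy"

;; Splitting and decoding gives back the original parts.
(equal? (split-and-decode-uri-path joined) parts)  ; → #t

;; Prepend "/" to obtain an absolute path for build-uri.
(define menu-uri
  (build-uri 'http
             #:host "example.com"
             #:path (string-append "/" joined)))

(uri->string menu-uri)
;; → "http://example.com/scrambled%20eggs/biscuits%26gravy"
```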

