Guile provides a standard data type for Universal Resource Identifiers (URIs), as defined in RFC 3986.
The generic URI syntax is as follows:
URI := scheme ":" ["//" [userinfo "@"] host [":" port]] path \
       [ "?" query ] [ "#" fragment ]
For example, in the URI ‘http://www.gnu.org/help/’, the scheme is
http, the host is www.gnu.org, the path is /help/, and there is no
userinfo, port, query, or fragment. All URIs have a scheme and a path
(though the path might be empty). Some URIs have a host, and some of
those have ports and userinfo. Any URI might have a query part or a
fragment.
Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form username:password. But since passwords do
not belong in URIs, the RFC does not want to condone this practice, so
it calls anything before the @ sign userinfo.
Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to ‘http://example.com/#foo’, it sends a
request for ‘http://example.com/’, then looks in the resulting page for
the part identified by the fragment foo. A fragment identifies a part of
a resource, not the resource itself. But it is useful to have a fragment
field in the URI record itself, so we hope you will forgive the
inconsistency.
(use-modules (web uri))
The following procedures can be found in the (web uri) module. Load it
into your Guile, using a form like the above, to have access to them.
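Scheme Procedure: build-uri scheme [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Construct a URI object. scheme should be a symbol, port either a
positive, exact integer or #f, and the rest of the fields are either
strings or #f. If validate? is true, also run some consistency checks
to make sure that the constructed URI is valid.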
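As a rough illustration of the keyword arguments described above, the following sketch builds the example URI from the start of this section. The variable name home is made up for the example, and uri->string (described below) is used only to display the result.

```scheme
(use-modules (web uri))

;; Only the scheme is positional; every other field is a keyword argument.
;; `home' is an illustrative name, not part of the module.
(define home
  (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

(uri->string home)   ; "http://www.gnu.org/help/"
```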
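Scheme Procedure: uri? x
Scheme Procedure: uri-scheme uri
Scheme Procedure: uri-userinfo uri
Scheme Procedure: uri-host uri
Scheme Procedure: uri-port uri
Scheme Procedure: uri-path uri
Scheme Procedure: uri-query uri
Scheme Procedure: uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme
will be a symbol, the port either a positive, exact integer or #f, and
the rest either strings or #f if not present.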
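To make the return types concrete, here is a small sketch that builds a URI with build-uri and reads each field back. The host, port, and query values are arbitrary choices for the example.

```scheme
(use-modules (web uri))

;; `u' and its field values are illustrative only.
(define u
  (build-uri 'http #:host "www.gnu.org" #:port 8080
             #:path "/help/" #:query "lang=en"))

(uri? u)                       ; #t
(uri? "http://www.gnu.org/")   ; #f, a string is not a URI object
(uri-scheme u)                 ; http (a symbol)
(uri-host u)                   ; "www.gnu.org"
(uri-port u)                   ; 8080
(uri-path u)                   ; "/help/"
(uri-query u)                  ; "lang=en"
(uri-userinfo u)               ; #f, not present
(uri-fragment u)               ; #f, not present
```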
Scheme Procedure: string->uri string
Parse string into a URI object. Return #f if the string could not be
parsed.
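The following sketch parses the fragment example from earlier in this section and shows the #f result for a string that has no scheme. The input strings are chosen for illustration.

```scheme
(use-modules (web uri))

;; Parse a URI that carries a fragment.
(define u (string->uri "http://example.com/#foo"))

(uri-host u)       ; "example.com"
(uri-path u)       ; "/"
(uri-fragment u)   ; "foo"

;; No scheme, so this cannot be parsed as a URI.
(string->uri "not a uri")   ; #f
```

Because failure is reported as #f rather than as an error, callers would normally check the result before using the accessors.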
Scheme Procedure: uri->string uri
Serialize uri to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization.
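A short sketch of the default-port rule: port 80 is dropped for http, while a non-default port is kept. The last part uses declare-default-port! to register a default for another scheme; the gopher scheme and the example.org host are picked purely for illustration.

```scheme
(use-modules (web uri))

;; Port 80 is the default for http, so it is left out.
(uri->string (build-uri 'http #:host "example.com" #:port 80 #:path "/"))
;; "http://example.com/"

;; A non-default port is serialized.
(uri->string (build-uri 'http #:host "example.com" #:port 8080 #:path "/"))
;; "http://example.com:8080/"

;; Register 70 as the default port for an illustrative scheme.
(declare-default-port! 'gopher 70)
(uri->string (build-uri 'gopher #:host "example.org" #:port 70 #:path "/"))
;; "gopher://example.org/"
```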
Scheme Procedure: declare-default-port! scheme port
Declare a default port for the given URI scheme.
Scheme Procedure: uri-decode str [#:encoding="utf-8"] [#:decode-plus-to-space? #t]
Percent-decode the given str, according to encoding, which should be the name of a character encoding.
Note that this function should not generally be applied to a full URI
string. For paths, use split-and-decode-uri-path instead. For query
strings, split the query on & and = boundaries, and decode the
components separately.
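The advice for query strings can be sketched as follows. parse-query is a hypothetical helper written for this example, not a procedure exported by (web uri); it splits on & first, then on the first = of each pair, and only then decodes each piece.

```scheme
(use-modules (web uri))

;; Hypothetical helper: turn "k1=v1&k2=v2" into an association list.
;; Splitting happens before decoding, so an encoded & or = inside a
;; value cannot be mistaken for a delimiter.
(define (parse-query query)
  (map (lambda (pair)
         (let ((idx (string-index pair #\=)))
           (if idx
               (cons (uri-decode (substring pair 0 idx))
                     (uri-decode (substring pair (+ idx 1))))
               ;; A key without "=" gets an empty value.
               (cons (uri-decode pair) ""))))
       (string-split query #\&)))

(parse-query "name=Ada+Lovelace&lang=en%2Fgb")
;; (("name" . "Ada Lovelace") ("lang" . "en/gb"))
```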
Note also that percent-encoded strings encode bytes, not characters.
There is no guarantee that a given byte sequence is a valid string
encoding. Therefore this routine may signal an error if the decoded
bytes are not valid for the given encoding. Pass #f for encoding if you
want decoded bytes as a bytevector directly. See set-port-encoding! for
more information on character encodings.
If decode-plus-to-space? is true, which is the default, also replace
instances of the plus character ‘+’ with a space character. This is
needed when parsing application/x-www-form-urlencoded data.
Returns a string of the decoded characters, or a bytevector if encoding
was #f.
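A brief sketch of the options described above, assuming a Guile version whose uri-decode accepts the #:decode-plus-to-space? keyword and #f for #:encoding as this section states. The input strings are illustrative.

```scheme
(use-modules (web uri))

;; Two percent-encoded bytes decode to one UTF-8 character.
(uri-decode "caf%C3%A9")                        ; "café"

;; By default "+" becomes a space; this can be switched off.
(uri-decode "a+b")                              ; "a b"
(uri-decode "a+b" #:decode-plus-to-space? #f)   ; "a+b"

;; With #f for the encoding, the raw bytes come back as a bytevector.
(uri-decode "%C3%A9" #:encoding #f)             ; #vu8(195 169)
```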
Fixme: clarify return type. indicate default values. type of unescaped-chars.
Scheme Procedure: uri-encode str [#:encoding="utf-8"] [#:unescaped-chars]
Percent-encode any character not in the character set, unescaped-chars.
The default character set includes alphanumerics from ASCII, as well as
the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any other character will
be percent-encoded, by writing out the character to a bytevector within
the given encoding, then encoding each byte as %HH, where HH is the
hexadecimal representation of the byte.
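The following sketch shows the default behaviour on a few illustrative strings: unreserved characters pass through, and everything else is encoded byte by byte in the default UTF-8 encoding.

```scheme
(use-modules (web uri))

;; Characters in the default unescaped set are left alone.
(uri-encode "a-b.c_d~e")        ; "a-b.c_d~e"

;; Space and "&" are outside the set, so they are encoded.
(uri-encode "scrambled eggs")   ; "scrambled%20eggs"
(uri-encode "biscuits&gravy")   ; "biscuits%26gravy"

;; A non-ASCII character becomes one %HH group per UTF-8 byte.
(uri-encode "é")                ; "%C3%A9"
```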
Scheme Procedure: split-and-decode-uri-path path
Split path into its components, and decode each component, removing empty components.
For example, "/foo/bar%20baz/" decodes to the two-element list
("foo" "bar baz").
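As a small sketch, the example above can be run directly, and the same procedure can be applied to the path of a parsed URI. The URI string is the one used at the start of this section.

```scheme
(use-modules (web uri))

;; Leading and trailing slashes yield empty components, which are dropped.
(split-and-decode-uri-path "/foo/bar%20baz/")
;; ("foo" "bar baz")

;; Typical use: split the path field of a parsed URI.
(split-and-decode-uri-path
 (uri-path (string->uri "http://www.gnu.org/help/")))
;; ("help")
```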
Scheme Procedure: encode-and-join-uri-path parts
URI-encode each element of parts, which should be a list of strings,
and join the parts together with / as a delimiter.
For example, the list ("scrambled eggs" "biscuits&gravy") encodes as
"scrambled%20eggs/biscuits%26gravy".
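A sketch of how this fits with the other procedures: the joined result has no leading slash, so one is prepended here before passing it to build-uri as a path. The variable name parts and the example.com host are illustrative.

```scheme
(use-modules (web uri))

;; `parts' is an illustrative list of unencoded path components.
(define parts '("scrambled eggs" "biscuits&gravy"))

(encode-and-join-uri-path parts)
;; "scrambled%20eggs/biscuits%26gravy"

;; Splitting and decoding the result gives back the original list.
(split-and-decode-uri-path (encode-and-join-uri-path parts))
;; ("scrambled eggs" "biscuits&gravy")

;; Prepend "/" to form an absolute path for a URI with a host.
(uri->string
 (build-uri 'http #:host "example.com"
            #:path (string-append "/" (encode-and-join-uri-path parts))))
;; "http://example.com/scrambled%20eggs/biscuits%26gravy"
```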