Guile provides a standard data type for Uniform Resource Identifiers (URIs), as defined in RFC 3986.
The generic URI syntax is as follows:
URI := scheme ":" ["//" [userinfo "@"] host [":" port]] path
       [ "?" query ] [ "#" fragment ]
For example, in the URI ‘http://www.gnu.org/help/’, the scheme is
http, the host is www.gnu.org, the path is /help/, and there is no
userinfo, port, query, or fragment. All
URIs have a scheme and a path (though the path might be empty). Some
URIs have a host, and some of those have ports and userinfo. Any URI
might have a query part or a fragment.
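As a rough illustration of how these components map onto Guile's URI record, the following sketch parses the example URI with string->uri and reads each field back with the accessors from (web uri). The variable name gnu-uri is chosen only for this example.

```scheme
(use-modules (web uri))

;; Parse the example URI from the text above.
(define gnu-uri (string->uri "http://www.gnu.org/help/"))

(uri-scheme gnu-uri)    ; => http (a symbol)
(uri-host gnu-uri)      ; => "www.gnu.org"
(uri-path gnu-uri)      ; => "/help/"

;; Components that are absent from the string come back as #f.
(uri-userinfo gnu-uri)  ; => #f
(uri-port gnu-uri)      ; => #f
(uri-query gnu-uri)     ; => #f
(uri-fragment gnu-uri)  ; => #f
```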
There is also a “URI-reference” data type, which is the same as a URI
but where the scheme is optional. In this case, the scheme is taken to
be relative to some other related URI. A common use of URI references
is when you want to be vague regarding the choice of HTTP or HTTPS –
a served web page that refers to /foo.css will fetch it over HTTPS if
the page itself was loaded over HTTPS, or over HTTP otherwise.
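A small sketch of the difference, assuming a Guile version that provides string->uri-reference (described later in this section): a scheme-less string such as /foo.css is rejected by string->uri, but parses as a URI reference whose scheme is #f. The variable name css-ref is only illustrative.

```scheme
(use-modules (web uri))

;; A URI proper requires a scheme, so this yields #f.
(string->uri "/foo.css")            ; => #f

;; As a URI reference, the same string parses; the scheme is left open.
(define css-ref (string->uri-reference "/foo.css"))

(uri-scheme css-ref)                ; => #f
(uri-path css-ref)                  ; => "/foo.css"
```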
Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form username:password. But since passwords do
not belong in URIs, the RFC does not want to condone this practice, so
it calls anything before the @ sign userinfo.
Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to ‘http://example.com/#foo’, it
sends a request for ‘http://example.com/’, then looks in the
resulting page for the fragment identified by the foo reference. A
fragment identifies a part of a resource, not the resource itself. But
it is useful to have a fragment field in the URI record itself, so we
hope you will forgive the inconsistency.
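To make the browser example concrete, here is a minimal sketch that parses a URI with a fragment and then rebuilds the URI that would actually be requested, by copying every field except the fragment. The names link and request-uri are made up for the example.

```scheme
(use-modules (web uri))

(define link (string->uri "http://example.com/#foo"))

;; The fragment is stored in the record even though it is not sent
;; to the server.
(uri-fragment link)   ; => "foo"
(uri-path link)       ; => "/"

;; Rebuild the URI without the fragment, as a browser would request it.
(define request-uri
  (build-uri (uri-scheme link)
             #:userinfo (uri-userinfo link)
             #:host (uri-host link)
             #:port (uri-port link)
             #:path (uri-path link)
             #:query (uri-query link)))

(uri->string request-uri)   ; => "http://example.com/"
```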
(use-modules (web uri))
The following procedures can be found in the (web uri)
module. Load it into your Guile, using a form like the above, to have
access to them.
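Scheme Procedure: build-uri scheme [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Construct a URI object. scheme should be a symbol, port
either a positive, exact integer or #f, and the rest of the
fields are either strings or #f. If validate? is true,
also run some consistency checks to make sure that the constructed URI
is valid.
Scheme Procedure: build-uri-reference [#:scheme=#f] [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Like build-uri, but with an optional scheme.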
In Guile, both URI and URI reference data types are represented in the same way, as URI objects.
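The following sketch shows one way the two constructors might be used. The host, path, and query values are arbitrary examples, and build-uri-reference is assumed to be available in the Guile version at hand.

```scheme
(use-modules (web uri))

;; A full URI: the scheme is a required positional argument (a symbol),
;; everything else is passed by keyword.
(define guile-home
  (build-uri 'https
             #:host "www.gnu.org"
             #:path "/software/guile/"
             #:query "q=uri"))

(uri->string guile-home)
;; => "https://www.gnu.org/software/guile/?q=uri"

;; A URI reference: the scheme may be omitted entirely.
(define stylesheet (build-uri-reference #:path "/foo.css"))

(uri-scheme stylesheet)   ; => #f
(uri-path stylesheet)     ; => "/foo.css"
```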
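Scheme Procedure: uri? obj
Scheme Procedure: uri-scheme uri
Scheme Procedure: uri-userinfo uri
Scheme Procedure: uri-host uri
Scheme Procedure: uri-port uri
Scheme Procedure: uri-path uri
Scheme Procedure: uri-query uri
Scheme Procedure: uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme
will be a symbol, or #f if the object is a URI reference but not
a URI. The port will be either a positive, exact integer or #f,
and the rest of the fields will be either strings or #f if not
present.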
Scheme Procedure: string->uri string
Parse string into a URI object. Return #f if the string
could not be parsed.
Scheme Procedure: string->uri-reference string
Parse string into a URI object, while not requiring a scheme. Return
#f if the string could not be parsed.
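Because the parsers return #f rather than signalling an error, callers usually need to check the result. A minimal sketch, in which uri-string-or is a hypothetical helper written for this example:

```scheme
(use-modules (web uri))

;; Hypothetical helper: re-serialize STR if it parses as a URI,
;; otherwise return DEFAULT.
(define (uri-string-or str default)
  (let ((parsed (string->uri str)))
    (if parsed
        (uri->string parsed)
        default)))

(uri-string-or "http://www.gnu.org/help/" "invalid")
;; => "http://www.gnu.org/help/"

;; No scheme, so string->uri returns #f and the default is used.
(uri-string-or "www.gnu.org/help/" "invalid")
;; => "invalid"
```

Note that a bare host name such as www.gnu.org/help/ is not a URI in the RFC 3986 sense, since it lacks a scheme.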
Scheme Procedure: uri->string uri
Serialize uri to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization.
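The effect of default ports on serialization can be sketched as follows, assuming that port 80 is registered as the default for the http scheme, as is conventional. The host and path are arbitrary.

```scheme
(use-modules (web uri))

;; Port 80 is the default for http, so it is dropped on output.
(uri->string
 (build-uri 'http #:host "www.gnu.org" #:port 80 #:path "/help/"))
;; => "http://www.gnu.org/help/"

;; A non-default port is kept.
(uri->string
 (build-uri 'http #:host "www.gnu.org" #:port 8080 #:path "/help/"))
;; => "http://www.gnu.org:8080/help/"
```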
Scheme Procedure: declare-default-port! scheme port
Declare a default port for the given URI scheme.
Scheme Procedure: uri-decode str [#:encoding="utf-8"] [#:decode-plus-to-space? #t]
Percent-decode the given str, according to encoding, which should be the name of a character encoding.
Note that this function should not generally be applied to a full URI
string. For paths, use
split-and-decode-uri-path instead. For
query strings, split the query on & and = boundaries, and
decode the components separately.
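The advice on query strings could be followed along these lines. This is only a sketch: parse-query-string is a hypothetical helper, not part of (web uri), and it ignores details such as repeated keys.

```scheme
(use-modules (web uri))

;; Hypothetical helper: turn "k1=v1&k2=v2" into an association list.
;; Splitting happens first, decoding second, so that an encoded "&"
;; or "=" inside a key or value is not mistaken for a delimiter.
(define (parse-query-string query)
  (map (lambda (pair)
         (let ((eq-pos (string-index pair #\=)))
           (if eq-pos
               (cons (uri-decode (substring pair 0 eq-pos))
                     (uri-decode (substring pair (+ eq-pos 1))))
               ;; A key with no "=" gets #f as its value.
               (cons (uri-decode pair) #f))))
       (string-split query #\&)))

(parse-query-string "q=guile+scheme&lang=en%2Dus")
;; => (("q" . "guile scheme") ("lang" . "en-us"))
```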
Note also that percent-encoded strings encode bytes, not
characters. There is no guarantee that a given byte sequence is a valid
string encoding. Therefore this routine may signal an error if the
decoded bytes are not valid for the given encoding. Pass #f for
encoding if you want decoded bytes as a bytevector directly. See
set-port-encoding!, for more information on character encodings.
If decode-plus-to-space? is true, which is the default, also
replace instances of the plus character ‘+’ with a space character.
This is needed when parsing application/x-www-form-urlencoded data.
Returns a string of the decoded characters, or a bytevector if
encoding was #f.
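A short sketch of the uri-decode options described above. The input strings are arbitrary; the bytes C3 A9 are the UTF-8 encoding of the character é.

```scheme
(use-modules (web uri))

;; Default: bytes are decoded as UTF-8.
(uri-decode "caf%C3%A9")                          ; => "café"

;; Default: "+" becomes a space, as in form-encoded data.
(uri-decode "a+b")                                ; => "a b"

;; Leave "+" alone when decoding something that is not form data.
(uri-decode "a+b" #:decode-plus-to-space? #f)     ; => "a+b"

;; Passing #f for the encoding returns the raw bytes instead.
(uri-decode "%C3%A9" #:encoding #f)               ; => #vu8(195 169)
```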
Scheme Procedure: uri-encode str [#:encoding="utf-8"] [#:unescaped-chars]
Percent-encode any character not in the character set, unescaped-chars.
The default character set includes alphanumerics from ASCII, as well as
the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any
other character will be percent-encoded, by writing out the character to
a bytevector within the given encoding, then encoding each byte as
%HH, where HH is the hexadecimal representation of the byte.
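For illustration, a few calls to uri-encode with the default encoding and character set. The last call shows that a single non-ASCII character can expand to several %HH groups, one per UTF-8 byte.

```scheme
(use-modules (web uri))

;; Space and "&" are outside the default unescaped set.
(uri-encode "scrambled eggs")     ; => "scrambled%20eggs"
(uri-encode "biscuits&gravy")     ; => "biscuits%26gravy"

;; ASCII alphanumerics and - . _ ~ pass through unchanged.
(uri-encode "a-b.c_d~e")          ; => "a-b.c_d~e"

;; U+00E9 (é) is two bytes in UTF-8, so it yields two escapes.
(uri-encode (string #\xe9))       ; => "%C3%A9"
```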
Scheme Procedure: split-and-decode-uri-path path
Split path into its components, and decode each component, removing empty components.
For example, "/foo/bar%20baz/" decodes to the two-element list,
("foo" "bar baz").
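A sketch of how this might be applied to the path of a parsed URI, for instance when dispatching on path components in a web handler. The URI and the name components are illustrative only.

```scheme
(use-modules (web uri))

;; The example from the text.
(split-and-decode-uri-path "/foo/bar%20baz/")
;; => ("foo" "bar baz")

;; Typical use: split the path of a parsed URI.
(define components
  (split-and-decode-uri-path
   (uri-path (string->uri "http://example.com/docs/user%20guide/"))))

components
;; => ("docs" "user guide")
```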
Scheme Procedure: encode-and-join-uri-path parts
URI-encode each element of parts, which should be a list of
strings, and join the parts together with / as a delimiter.
For example, the list ("scrambled eggs" "biscuits&gravy") encodes
as "scrambled%20eggs/biscuits%26gravy".
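The following sketch shows the encoding direction and a round trip back through split-and-decode-uri-path. Note that the joined result has no leading slash, so one is prepended here to form an absolute path; the variable names are made up for the example.

```scheme
(use-modules (web uri))

(define parts '("scrambled eggs" "biscuits&gravy"))

(define joined (encode-and-join-uri-path parts))
joined
;; => "scrambled%20eggs/biscuits%26gravy"

;; Prepend "/" to obtain an absolute path suitable for build-uri.
(define absolute-path (string-append "/" joined))

;; Splitting and decoding recovers the original list.
(split-and-decode-uri-path absolute-path)
;; => ("scrambled eggs" "biscuits&gravy")
```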