Guile provides a standard data type for Uniform Resource Identifiers (URIs), as defined in RFC 3986.
The generic URI syntax is as follows:
URI := scheme ":" ["//" [userinfo "@"] host [":" port]] path \
[ "?" query ] [ "#" fragment ]
So, all URIs have a scheme and a path. Some URIs have a host, and some of those have ports and userinfo. Any URI might have a query part or a fragment.
Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form username:passwd.
Passwords don't belong in URIs, so the RFC does not want to condone
this, but neither can it say that what is before the @ sign is
just a username, so the RFC punts on the issue and calls it
userinfo.
Also, strictly speaking, a URI with a fragment is a URI reference. A fragment is typically not serialized when sending a URI over the wire; that is, it is not part of the identifier of a resource. It only identifies a part of a given resource. But it's useful to have a field for it in the URI record itself, so we hope you will forgive the inconsistency.
(use-modules (web uri))
The following procedures can be found in the (web uri)
module. Load it into your Guile, using a form like the above, to have
access to them.
Construct a URI object. If validate? is true, also run some consistency checks to make sure that the constructed URI is valid.
A predicate and field accessors for the URI record type.
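As a minimal sketch of the constructor and accessors described above: the procedure and keyword names used here (build-uri, uri?, uri-scheme, uri-host and so on, with #:host, #:path and #:query) are the ones exported by (web uri) as far as we know, but they are not spelled out in the text above, so check them against your Guile version. The host and path are arbitrary example values.

```scheme
(use-modules (web uri))

;; Build a URI from its parts.  Only the scheme (a symbol) is
;; positional; the other components are keyword arguments.
(define u
  (build-uri 'https
             #:host "www.gnu.org"
             #:path "/software/guile/"
             #:query "lang=en"))

(uri? u)          ; => #t
(uri-scheme u)    ; => https (a symbol)
(uri-host u)      ; => "www.gnu.org"
(uri-port u)      ; => #f, since no explicit port was given
(uri-path u)      ; => "/software/guile/"
(uri-query u)     ; => "lang=en"
(uri-fragment u)  ; => #f
```

Components that were not supplied come back as #f, except the path, which is always a string.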
Declare a default port for the given URI scheme.
Default ports are for printing URI objects: a default port is not printed.
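The effect of a default port on printing can be sketched as follows. This assumes the procedure is named declare-default-port! and that URIs are printed with uri->string, both taken from the (web uri) module rather than from the text above. The scheme myproto and port 7000 are made up for illustration.

```scheme
(use-modules (web uri))

;; A URI in a hypothetical scheme with an explicit port.
(define u
  (build-uri 'myproto #:host "example.org" #:port 7000 #:path "/"))

;; No default is known for myproto yet, so the port is printed.
(define before (uri->string u))   ; => "myproto://example.org:7000/"

;; Declare 7000 as the default port for myproto.
(declare-default-port! 'myproto 7000)

;; The same URI now prints without its port.
(define after (uri->string u))    ; => "myproto://example.org/"
```

The URI object itself is unchanged; (uri-port u) still returns 7000 after the declaration.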
Parse string into a URI object. Returns #f if the string could not be parsed.
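A short sketch of parsing, assuming the procedure is string->uri as exported by (web uri). The helper parse-or-default is hypothetical and only shows one way to handle the #f case.

```scheme
(use-modules (web uri))

;; Parse a full URI and pick out its components.
(define u
  (string->uri "http://www.gnu.org:8080/software/guile/?lang=en#top"))

(uri-scheme u)    ; => http (a symbol)
(uri-host u)      ; => "www.gnu.org"
(uri-port u)      ; => 8080
(uri-path u)      ; => "/software/guile/"
(uri-query u)     ; => "lang=en"
(uri-fragment u)  ; => "top"

;; A string with no scheme is not a URI, so the result is #f.
(define bad (string->uri "no scheme here"))   ; => #f

;; Hypothetical helper: fall back to DEFAULT when parsing fails.
(define (parse-or-default str default)
  (or (string->uri str) default))
```

Because failure is reported as #f rather than as an exception, callers should test the result before passing it to the accessors.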
Percent-decode the given str, according to charset.
Note that this function should not generally be applied to a full URI string. For paths, use split-and-decode-uri-path instead. For query strings, split the query on & and = boundaries, and decode the components separately.

Note that percent-encoded strings encode bytes, not characters. There is no guarantee that a given byte sequence is a valid string encoding. Therefore this routine may signal an error if the decoded bytes are not valid for the given encoding. Pass #f for charset if you want decoded bytes as a bytevector directly.
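The advice above can be sketched as follows, assuming the decoding procedure is uri-decode and relying on its default character encoding. The helper decode-query is hypothetical, not part of (web uri); it splits first and decodes afterwards, so that an encoded & or = inside a value is not mistaken for a separator.

```scheme
(use-modules (web uri))

;; Decode a single percent-encoded component.
(uri-decode "hello%20world")                   ; => "hello world"

;; For paths, split on "/" and decode each segment.
(split-and-decode-uri-path "/foo/bar%20baz/")  ; => ("foo" "bar baz")

;; Hypothetical helper: turn a raw query string into an association
;; list, splitting on & and = before decoding each piece.
(define (decode-query query)
  (map (lambda (pair)
         (let ((i (string-index pair #\=)))
           (if i
               (cons (uri-decode (substring pair 0 i))
                     (uri-decode (substring pair (+ i 1))))
               ;; A key with no "=" gets an empty value.
               (cons (uri-decode pair) ""))))
       (string-split query #\&)))

(decode-query "name=Guile%20Scheme&v=3")
;; => (("name" . "Guile Scheme") ("v" . "3"))
```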
Percent-encode any character not in unescaped-chars.
Percent-encoding first writes out the given character to a bytevector within the given charset, then encodes each byte as
%HH, where HH is the hexadecimal representation of the byte.
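A brief sketch of encoding, assuming the procedure is uri-encode and that it accepts an #:unescaped-chars keyword taking a SRFI-14 character set; both names come from the (web uri) module rather than from the text above. The character set passed in the second call is only an example.

```scheme
(use-modules (web uri))

;; With the default set of unescaped characters, a space is escaped.
(define encoded (uri-encode "hello world"))   ; => "hello%20world"

;; Supply a custom set: here only a, b, c and "/" pass through as-is.
(define custom
  (uri-encode "a/b c" #:unescaped-chars (string->char-set "abc/")))
;; => "a/b%20c"

;; Encoding followed by decoding gives back the original string.
(define original "a/b c?d")
(define round-trip (uri-decode (uri-encode original)))
(string=? original round-trip)                ; => #t
```

As with decoding, this is meant for individual components; encoding a complete URI string would also escape its delimiters.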