Kawa: Ports

Ports

Ports represent input and output devices. An input port is a Scheme object that can deliver data upon command, while an output port is a Scheme object that can accept data.

Different port types operate on different data:

A textual port supports reading or writing of individual characters from or to a backing store containing characters using read-char and write-char below, and it supports operations defined in terms of characters, such as read and write.
A binary port supports reading or writing of individual bytes from or to a backing store containing bytes using read-u8 and write-u8 below, as well as operations defined in terms of bytes (integers in the range 0 to 255).

All Kawa binary ports created by procedures documented here are also textual ports. Thus you can either read/write bytes as described above, or read/write characters whose scalar value is in the range 0 to 255 (i.e. the Latin-1 character set), using read-char and write-char.

A native binary port is a java.io.InputStream or java.io.OutputStream instance. These are not textual ports. You can use methods read-u8 and write-u8, but not read-char and write-char on native binary ports. (The functions input-port?, output-port?, binary-port?, and port? all currently return false on native binary ports, but that may change.)

Procedure: call-with-port port proc

The call-with-port procedure calls proc with port as an argument. If proc returns, then the port is closed automatically and the values yielded by the proc are returned.

If proc does not return, then the port must not be closed automatically unless it is possible to prove that the port will never again be used for a read or write operation.

As a Kawa extension, port may be any object that implements java.io.Closeable. It is an error if proc does not accept one argument.
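
As a minimal sketch of the behaviour described above, the following reads one datum from a string port and lets call-with-port close the port afterwards. The names port and first-datum are illustrative only, and the default Kawa environment is assumed.

```scheme
;; Create a textual input port over a string.
(define port (open-input-string "(1 2 3) trailing"))

;; call-with-port passes the port to the lambda, closes the port once
;; the lambda returns, and hands back the lambda's result.
(define first-datum
  (call-with-port port
    (lambda (p) (read p))))
;; first-datum is now the list (1 2 3)
```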

Procedure: call-with-input-file path proc

Procedure: call-with-output-file path proc

These procedures obtain a textual port by opening the named file for input or output as if by open-input-file or open-output-file. The port and proc are then passed to a procedure equivalent to call-with-port.

It is an error if proc does not accept one argument.
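
A small round trip may make this concrete. The sketch below writes a datum to a file and reads it back; the file name ports-demo.txt is a hypothetical choice for illustration, and the file is assumed not to contain anything worth keeping.

```scheme
(define demo-path "ports-demo.txt")   ; hypothetical file name

;; Write one datum followed by a newline; the port is closed when the
;; lambda returns.
(call-with-output-file demo-path
  (lambda (out)
    (write '(hello "world") out)
    (newline out)))

;; Read the datum back from the same file.
(define round-trip
  (call-with-input-file demo-path
    (lambda (in) (read in))))
;; round-trip is equal to (hello "world")
```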

Procedure: input-port? obj

Procedure: output-port? obj

Procedure: textual-port? obj

Procedure: binary-port? obj

Procedure: port? obj

These procedures return #t if obj is an input port, output port, textual port, binary port, or any kind of port, respectively. Otherwise they return #f.

These procedures currently return #f on native Java streams (java.io.InputStream or java.io.OutputStream), a native reader (a java.io.Reader that is not a gnu.mapping.Inport), or a native writer (a java.io.Writer that is not a gnu.mapping.Outport). This may change if conversions between native ports and Scheme ports become more seamless.
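
As an illustration, the predicates can be applied to a string port and a bytevector port. The variable names are assumptions made for this sketch, and the expected results in the comments follow from the descriptions above.

```scheme
(define sp (open-input-string "abc"))   ; textual input port
(define bp (open-output-bytevector))    ; binary output port

;; A string input port is a port, an input port and a textual port,
;; but not an output port.
(define string-port-checks
  (list (port? sp) (input-port? sp) (textual-port? sp) (output-port? sp)))
;; expected: (#t #t #t #f)

;; A bytevector output port is a port, an output port and a binary
;; port, but not an input port.
(define bytevector-port-checks
  (list (port? bp) (output-port? bp) (binary-port? bp) (input-port? bp)))
;; expected: (#t #t #t #f)
```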

Procedure: input-port-open? port

Procedure: output-port-open? port

Returns #t if port is still open and capable of performing input or output, respectively, and #f otherwise. (Not supported for native binary ports - i.e. java.io.InputStream or java.io.OutputStream.)
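
A short sketch of checking a port before and after closing it, assuming a string port behaves like any other Scheme input port in this respect. The names are illustrative.

```scheme
(define p (open-input-string "data"))

;; The port has just been created, so it should report as open.
(define open-before (input-port-open? p))

;; After closing, the port can no longer deliver data.
(close-input-port p)
(define open-after (input-port-open? p))
;; expected: open-before is #t, open-after is #f
```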

Procedure: current-input-port

Procedure: current-output-port

Procedure: current-error-port

Returns the current default input port, output port, or error port (an output port), respectively. (The error port is the port to which errors and warnings should be sent - the standard error in Unix and C terminology.) These procedures are parameter objects, which can be overridden with parameterize.

The initial bindings for (current-output-port) and (current-error-port) are hybrid textual/binary ports that wrap the values of the corresponding java.lang.System fields out and err. These, in turn, are bound to the standard output and error streams of the JVM process. This means you can write binary data to standard output using write-bytevector and write-u8.

The initial value of (current-input-port) similarly is a textual port that wraps the java.lang.System field in, which is bound to the standard input stream of the JVM process. It is a hybrid textual/binary port only if there is no console (as determined by (java.lang.System:console) returning #!null) - i.e. if standard input is not a tty.

Here is an example that copies standard input to standard output:
(let* ((in (current-input-port))
       (out (current-output-port))
       (blen ::int 2048)
       (buf (make-bytevector blen)))
  (let loop ()
    (define n (read-bytevector! buf in))
    (cond ((not (eof-object? n))
           (write-bytevector buf out 0 n)
           (loop)))))

Procedure: with-input-from-file path thunk

Procedure: with-output-to-file path thunk

The file is opened for input or output as if by open-input-file or open-output-file, and the new port is made to be the value returned by current-input-port or current-output-port (as used by (read), (write obj), and so forth). The thunk is then called with no arguments. When the thunk returns, the port is closed and the previous default is restored. It is an error if thunk does not accept zero arguments. Both procedures return the values yielded by thunk. If an escape procedure is used to escape from the continuation of these procedures, they behave exactly as if the current input or output port had been bound dynamically with parameterize.
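
The following sketch shows both procedures redirecting the default ports for the duration of a thunk. The file name with-ports-demo.txt is hypothetical.

```scheme
(define log-path "with-ports-demo.txt")   ; hypothetical file name

;; Inside the thunk, display and newline write to the file because
;; (current-output-port) is temporarily rebound.
(with-output-to-file log-path
  (lambda ()
    (display "first line")
    (newline)))

;; Inside this thunk, read-line with no port argument reads from the file.
(define first-line
  (with-input-from-file log-path
    (lambda () (read-line))))
;; first-line is "first line"
```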

Procedure: open-input-file path

Procedure: open-binary-input-file path

Takes a path naming an existing file and returns, respectively, a textual input port or a binary input port that is capable of delivering data from the file.

The procedure open-input-file checks the fluid variable port-char-encoding to determine how bytes are decoded into characters. The procedure open-binary-input-file is equivalent to calling open-input-file with port-char-encoding set to #f.

Procedure: open-output-file path

Procedure: open-binary-output-file path

Takes a path naming an output file to be created and returns respectively a textual output port or binary output port that is capable of writing data to a new file by that name. If a file with the given name already exists, the effect is unspecified.

The procedure open-output-file checks the fluid variable port-char-encoding to determine how characters are encoded as bytes. The procedure open-binary-output-file is equivalent to calling open-output-file with port-char-encoding set to #f.
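
To illustrate the binary variants, this sketch writes four bytes to a file and reads them back unchanged. The file name ports-demo.bin and the variable names are assumptions for the example.

```scheme
(define bin-path "ports-demo.bin")   ; hypothetical file name

;; Write raw bytes; no character encoding is applied.
(define out (open-binary-output-file bin-path))
(write-u8 202 out)
(write-bytevector (bytevector 254 0 255) out)
(close-port out)

;; Read the same bytes back.
(define in (open-binary-input-file bin-path))
(define bytes-read (read-bytevector 4 in))
(close-port in)
;; bytes-read is #u8(202 254 0 255)
```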

Procedure: close-port port

Procedure: close-input-port port

Procedure: close-output-port port

Closes the resource associated with port, rendering the port incapable of delivering or accepting data. It is an error to apply the last two procedures to a port which is not an input or output port, respectively. (Specifically, close-input-port requires a java.io.Reader, while close-output-port requires a java.io.Writer. In contrast close-port accepts any object whose class implements java.io.Closeable.)

These routines have no effect if the port has already been closed.

String and bytevector ports

Procedure: open-input-string string

Takes a string and returns a textual input port that delivers characters from the string. The port can be closed by close-input-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
(define p
  (open-input-string "(a . (b c . ())) 34"))

(input-port? p)                 ⇒  #t
(read p)                        ⇒  (a b c)
(read p)                        ⇒  34
(eof-object? (peek-char p))     ⇒  #t

Procedure: open-output-string

Returns a textual output port that will accumulate characters for retrieval by get-output-string. The port can be closed by the procedure close-output-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
(let ((q (open-output-string))
      (x '(a b c)))
  (write (car x) q)
  (write (cdr x) q)
  (get-output-string q))        ⇒  "a(b c)"

Procedure: get-output-string output-port

Given an output port created by open-output-string, returns a string consisting of the characters that have been output to the port so far in the order they were output. If the result string is modified, the effect is unspecified.
(parameterize
    ((current-output-port (open-output-string)))
    (display "piece")
    (display " by piece ")
    (display "by piece.")
    (newline)
    (get-output-string (current-output-port)))
        ⇒ "piece by piece by piece.\n"

Procedure: call-with-input-string string proc

Create an input port that gets its data from string, call proc with that port as its one argument, and return the result from the call of proc.

Procedure: call-with-output-string proc

Create an output port that writes its data to a string, and call proc with that port as its one argument. Return a string consisting of the data written to the port.
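
Both procedures can be sketched together: one collects output into a string, the other parses values out of a string. The names rendered and parsed are illustrative.

```scheme
;; Everything written to out is collected and returned as a string.
(define rendered
  (call-with-output-string
    (lambda (out)
      (write 'answer out)
      (display ": " out)
      (write 42 out))))
;; rendered is "answer: 42"

;; Two data are read from the string and summed; the sum is the result.
(define parsed
  (call-with-input-string "10 20"
    (lambda (in)
      (let* ((a (read in))
             (b (read in)))
        (+ a b)))))
;; parsed is 30
```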

Procedure: open-input-bytevector bytevector

Takes a bytevector and returns a binary input port that delivers bytes from the bytevector.

Procedure: open-output-bytevector

Returns a binary output port that will accumulate bytes for retrieval by get-output-bytevector.

Procedure: get-output-bytevector port

Returns a bytevector consisting of the bytes that have been output to the port so far in the order they were output. It is an error if port was not created with open-output-bytevector.
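
A minimal sketch of the bytevector ports working together: bytes are accumulated in an output port, retrieved, and then consumed through an input port. All variable names are assumptions for the example.

```scheme
;; Accumulate four bytes in memory.
(define bout (open-output-bytevector))
(write-u8 1 bout)
(write-bytevector (bytevector 2 3 4) bout)
(define collected (get-output-bytevector bout))   ; #u8(1 2 3 4)

;; Feed the collected bytes back through an input port.
(define bin (open-input-bytevector collected))
(define first-byte (read-u8 bin))                 ; 1
(define peeked (peek-u8 bin))                     ; 2, not consumed
(define remaining (read-bytevector 10 bin))       ; #u8(2 3 4), fewer than 10 available
(define at-end (eof-object? (read-u8 bin)))       ; #t
```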

Input

If port is omitted from any input procedure, it defaults to the value returned by (current-input-port). It is an error to attempt an input operation on a closed port.

Procedure: read [port]

The read procedure converts external representations of Scheme objects into the objects themselves. That is, it is a parser for the non-terminal datum. It returns the next object parsable from the given textual input port, updating port to point to the first character past the end of the external representation of the object.

If an end of file is encountered in the input before any characters are found that can begin an object, then an end-of-file object is returned. The port remains open, and further attempts to read will also return an end-of-file object. If an end of file is encountered after the beginning of an object’s external representation, but the external representation is incomplete and therefore not parsable, an error that satisfies read-error? is signaled.

Procedure: read-char [port]

Returns the next character available from the textual input port, updating the port to point to the following character. If no more characters are available, an end-of-file value is returned.

The result type is character-or-eof.

Procedure: peek-char [port]

Returns the next character available from the textual input port, but without updating the port to point to the following character. If no more characters are available, an end-of-file value is returned.

The result type is character-or-eof.

Note: The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char with the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.
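
The relationship between peek-char and read-char can be sketched with a two-character string port. The names cp and seen are illustrative.

```scheme
(define cp (open-input-string "ab"))

(define seen
  (let* ((c1 (peek-char cp))    ; looks at #\a without consuming it
         (c2 (read-char cp))    ; consumes #\a
         (c3 (read-char cp))    ; consumes #\b
         (c4 (read-char cp)))   ; nothing left: end-of-file value
    (list c1 c2 c3 (eof-object? c4))))
;; seen is (#\a #\a #\b #t)
```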

Procedure: read-line [port [handle-newline]]

Reads a line of input from the textual input port. The handle-newline parameter determines what is done with the terminating end-of-line delimiter. The default, 'trim, discards the delimiter; 'peek leaves the delimiter in the input stream; 'concat appends the delimiter to the returned value; and 'split returns the delimiter as a second value. You can use the last three options to tell whether the string was terminated by end-of-line or by end-of-file.

If an end of file is encountered before any end of line is read, but some characters have been read, a string containing those characters is returned. (In this case, 'trim, 'peek, and 'concat have the same result and effect. The 'split case returns two values: the characters read, and an empty string as the delimiter.) If an end of file is encountered before any characters are read, an end-of-file object is returned.

For the purpose of this procedure, an end of line consists of either a linefeed character, a carriage return character, or a sequence of a carriage return character followed by a linefeed character.
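
The handle-newline modes can be compared side by side. This is a sketch based on the description above; the helper first-line-with is hypothetical, and each call reads from a fresh string port so the modes do not interfere with one another.

```scheme
;; Hypothetical helper: read the first line of "one\ntwo" in the given mode.
(define (first-line-with mode)
  (read-line (open-input-string "one\ntwo") mode))

(define trimmed (first-line-with 'trim))     ; delimiter dropped: "one"
(define kept    (first-line-with 'concat))   ; delimiter appended: "one\n"

;; 'split yields two values, the text and the delimiter; collect them
;; into a list.
(define split-result
  (call-with-values
    (lambda () (read-line (open-input-string "one\ntwo") 'split))
    list))
;; expected: ("one" "\n")
```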

Procedure: eof-object? obj

Returns #t if obj is an end-of-file object, otherwise returns #f.

Performance note: If obj has type character-or-eof, this is compiled as an int comparison with -1.

Procedure: eof-object

Returns an end-of-file object.

Procedure: char-ready? [port]

Returns #t if a character is ready on the textual input port and returns #f otherwise. If char-ready? returns #t then the next read-char operation on the given port is guaranteed not to hang. If the port is at end of file then char-ready? returns #t.

Rationale: The char-ready? procedure exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must ensure that characters whose existence has been asserted by char-ready? cannot be removed from the input. If char-ready? were to return #f at end of file, a port at end-of-file would be indistinguishable from an interactive port that has no ready characters.

Procedure: read-string k [port]

Reads the next k characters, or as many as are available before the end of file, from the textual input port into a newly allocated string in left-to-right order and returns the string. If no characters are available before the end of file, an end-of-file object is returned.
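
For example, reading a five-character string in chunks of three shows both the short final chunk and the end-of-file result. The names are illustrative.

```scheme
(define rs (open-input-string "hello"))

(define chunk1 (read-string 3 rs))   ; "hel"
(define chunk2 (read-string 3 rs))   ; only two characters remain: "lo"
(define chunk3 (read-string 3 rs))   ; nothing remains: end-of-file object
```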

Procedure: read-u8 [port]

Returns the next byte available from the binary input port, updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

Procedure: peek-u8 [port]

Returns the next byte available from the binary input port, but without updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

Procedure: u8-ready? [port]

Returns #t if a byte is ready on the binary input port and returns #f otherwise. If u8-ready? returns #t then the next read-u8 operation on the given port is guaranteed not to hang. If the port is at end of file then u8-ready? returns #t.

Procedure: read-bytevector k [port]

Reads the next k bytes, or as many as are available before the end of file, from the binary input port into a newly allocated bytevector in left-to-right order and returns the bytevector. If no bytes are available before the end of file, an end-of-file object is returned.

Procedure: read-bytevector! bytevector [port [start [end]]]

Reads the next end − start bytes, or as many as are available before the end of file, from the binary input port into bytevector in left-to-right order beginning at the start position. If end is not supplied, reads until the end of bytevector has been reached. If start is not supplied, reads beginning at position 0. Returns the number of bytes read. If no bytes are available, an end-of-file object is returned.
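
The start and end arguments can be sketched with a small buffer. Here three bytes are read into the middle of a six-byte bytevector; the names buf, src, and count are assumptions for the example.

```scheme
(define buf (make-bytevector 6 0))
(define src (open-input-bytevector (bytevector 10 20 30 40)))

;; Fill positions 2, 3 and 4 of buf (end is exclusive).
(define count (read-bytevector! buf src 2 5))
;; count is 3 and buf is #u8(0 0 10 20 30 0)

;; One byte is still unread in src.
(define leftover (read-u8 src))                            ; 40

;; The source is now exhausted, so the result is an end-of-file object.
(define exhausted (eof-object? (read-bytevector! buf src)))
```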

Output

If port is omitted from any output procedure, it defaults to the value returned by (current-output-port). It is an error to attempt an output operation on a closed port.

The return type of these methods is void.

Procedure: write obj [port]

Writes a representation of obj to the given textual output port. Strings that appear in the written representation are enclosed in quotation marks, and within those strings backslash and quotation mark characters are escaped by backslashes. Symbols that contain non-ASCII characters are escaped with vertical lines. Character objects are written using the #\ notation.

If obj contains cycles which would cause an infinite loop using the normal written representation, then at least the objects that form part of the cycle must be represented using datum labels. Datum labels must not be used if there are no cycles.

Procedure: write-shared obj [port]

The write-shared procedure is the same as write, except that shared structure must be represented using datum labels for all pairs and vectors that appear more than once in the output.

Procedure: write-simple obj [port]

The write-simple procedure is the same as write, except that shared structure is never represented using datum labels. This can cause write-simple not to terminate if obj contains circular structure.

Procedure: display obj [port]

Writes a representation of obj to the given textual output port. Strings that appear in the written representation are output as if by write-string instead of by write. Symbols are not escaped. Character objects appear in the representation as if written by write-char instead of by write. The display representation of other objects is unspecified.
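
The difference between write and display is easiest to see by capturing the output of each. In this sketch the helper render is hypothetical; it collects what a given writer procedure emits for an object.

```scheme
;; Hypothetical helper: capture the output of (writer obj port) as a string.
(define (render writer obj)
  (call-with-output-string
    (lambda (out) (writer obj out))))

;; write quotes the string and escapes the inner quotation marks;
;; display emits the characters of the string as they are.
(define written   (render write   "say \"hi\""))   ; 12 characters, with quotes and backslashes
(define displayed (render display "say \"hi\""))   ; 8 characters: say "hi"

;; Characters follow the same pattern.
(define written-char   (render write   #\a))       ; the three characters #\a
(define displayed-char (render display #\a))       ; the single character a
```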

Procedure: newline [port]

Writes an end of line to the textual output port. This is done using the println method of the Java class java.io.PrintWriter.

Procedure: write-char char [port]

Writes the character char (not an external representation of the character) to the given textual output port.

Procedure: write-string string [port [start [end]]]

Writes the characters of string from start to end in left-to-right order to the textual output port.
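
A brief sketch of the optional start and end arguments, assuming the usual convention that start is inclusive and end is exclusive. The name slice is illustrative.

```scheme
(define slice
  (call-with-output-string
    (lambda (out)
      (write-string "hello world" out 6)       ; from index 6 to the end: world
      (write-char #\space out)
      (write-string "hello world" out 0 5))))  ; indices 0 through 4: hello
;; slice is "world hello"
```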

Procedure: write-u8 byte [port]

Writes the byte to the given binary output port.

Procedure: write-bytevector bytevector [port [start [end]]]

Writes the bytes of bytevector from start to end in left-to-right order to the binary output port.

Procedure: flush-output-port [port]

Procedure: force-output [port]

Forces any pending output on port to be delivered to the output file or device and returns an unspecified value. If the port argument is omitted it defaults to the value returned by (current-output-port). (The name force-output is older, while R6RS added flush-output-port. They have the same effect.)

Prompts for interactive consoles (REPLs)

When an interactive input port is used for a read-eval-print-loop (REPL or console) it is traditional for the REPL to print a short prompt string to signal that the user is expected to type an expression. These prompt strings can be customized.

Variable: input-prompt1

Variable: input-prompt2

These are fluid variables whose values are string templates with placeholders similar to printf-style format. The placeholders are expanded (depending on the current state), and the resulting string printed in front of the input line.

The input-prompt1 is used normally. For multi-line input commands (for example if the first line is incomplete), input-prompt1 is used for the first line of each command, while input-prompt2 is used for subsequent “continuation” lines.

The following placeholders are handled:
%%

A literal ‘%’.

%N

The current line number. This is (+ 1 (port-line port)).

%nPc

Insert padding at this position, repeating the following character c as needed to bring the total number of columns of the prompt to that specified by the digits n.

%Pc

Same as %nPc, but n defaults to the number of columns in the initial prompt from the expansion of input-prompt1. This is only meaningful when expanding input-prompt2 for continuation lines.

%{hidden%}

Same as hidden, but the characters of hidden are assumed to have zero visible width. Can be used for ANSI escape sequences to change color or style:
(set! input-prompt1 "%{\e[48;5;51m%}{Kawa:%N} %{\e[0m%}")
The above changes both the text and the background color (to a pale blue).

%Hcd

If running under DomTerm, use the characters c and d as a clickable mini-button to hide/show (fold) the command and its output. (When output is visible c is displayed; clicking on it hides the output. When output is hidden d is displayed; clicking on it shows the output.) Ignored if not running under DomTerm.

%M

Insert a “message” string. Not normally used by Kawa, but supported by JLine.

These variables can be initialized by the command-line arguments console:prompt1=prompt1 and console:prompt2=prompt2, respectively. If these are not specified, language-specific defaults are used. For example for Scheme the default value of input-prompt1 is "#|%H▼▶kawa:%N|# " and input-prompt2 is "#|%P.%N| ". These have the form of Scheme comments, to make it easier to cut-and-paste.

If input-prompt1 (respectively input-prompt2) does not contain an escape sequence (either "%{" or the escape character "\e"), then ANSI escape sequences are added to highlight the prompt. (Under DomTerm this sets the prompt style, which can be customised with CSS but defaults to a light green background; if using JLine the background is set to light green.)
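
As a sketch of how the placeholders combine, the following pair of templates is one possible configuration; the prompt text itself is an arbitrary choice for illustration. The continuation prompt uses %P followed by a period so that it is padded with periods to the width of the first-line prompt.

```scheme
;; First-line prompt: literal text plus the 1-origin line number.
(set! input-prompt1 "kawa[%N]> ")

;; Continuation prompt: pad with '.' up to the width of the expanded
;; input-prompt1, counting the trailing "> " as part of that width.
(set! input-prompt2 "%P.> ")
```

Because neither template contains "%{" or "\e", the automatic highlighting described above would still apply.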

For greater flexibility, you can also set a prompter procedure.

Procedure: set-input-port-prompter! port prompter

Set the prompt procedure associated with port to prompter, which must be a one-argument procedure taking an input port, and returning a string. The procedure is called before reading the first line of a command; its return value is used as the first-line prompt.

The prompt procedure can have side effects. In Bash shell terms: It combines the features of PROMPT_COMMAND and PS1.

The initial prompter is default-prompter, which returns the expansion of input-prompt1.
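
A prompter is an ordinary one-argument procedure. The sketch below defines a hypothetical prompter, line-count-prompter, that builds its prompt from the port's line number and falls back to "?" when the line number is unknown.

```scheme
;; Hypothetical prompter: takes an input port and returns the prompt string.
(define (line-count-prompter port)
  (let ((line (port-line port)))        ; 0-origin line number, or #f if unknown
    (string-append "kawa:"
                   (if line (number->string (+ 1 line)) "?")
                   "> ")))
```

In an interactive session it could then be installed with (set-input-port-prompter! (current-input-port) line-count-prompter), assuming the current input port is a console port that supports prompting.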

Procedure: input-port-prompter port

Get the prompt procedure associated with port.

Procedure: default-prompter port

The default prompt procedure. Normally (i.e. when input-port-read-state is a space) returns input-prompt1 after expanding the %-placeholders. Can also expand input-prompt2 when input-port-read-state is not whitespace.

Line numbers and other input port properties

Function: port-column input-port

Function: port-line input-port

Return the current column number or line number of input-port, using the current input port if none is specified. If the number is unknown, the result is #f. Otherwise, the result is a 0-origin integer - i.e. the first character of the first line is line 0, column 0. (However, when you display a file position, for example in an error message, we recommend you add 1 to get 1-origin integers. This is because lines and column numbers traditionally start with 1, and that is what non-programmers will find most natural.)
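
Following the recommendation above, a helper for error messages might convert the 0-origin values to 1-origin ones. The procedure name port-position-string is hypothetical, and the sketch handles the case where either number is unknown.

```scheme
;; Hypothetical helper: format a port position as "line:column" using
;; 1-origin numbers, or "unknown" if either value is #f.
(define (port-position-string port)
  (let ((line (port-line port))
        (column (port-column port)))
    (if (and line column)
        (string-append (number->string (+ 1 line))
                       ":"
                       (number->string (+ 1 column)))
        "unknown")))
```

For instance, calling it on a port positioned at the very start of its input would be expected to give "1:1" when the port tracks its position.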

Procedure: set-port-line! port line

Set the (0-origin) line number of the current line of port to line.

Procedure: input-port-line-number port

Get the line number of the current line of port, which must be a (non-binary) input port. The initial line is line 1. Deprecated; replaced by (+ 1 (port-line port)).

Procedure: set-input-port-line-number! port num

Set line number of the current line of port to num. Deprecated; replaced by (set-port-line! port (- num 1)).

Procedure: input-port-column-number port

Get the column number of the current line of port, which must be a (non-binary) input port. The initial column is column 1. Deprecated; replaced by (+ 1 (port-column port)).

Procedure: input-port-read-state port

Returns a character indicating the current read state of the port. Returns #\Return if not currently doing a read; #\" if reading a string; #\| if reading a comment; #\( if inside a list; and #\Space when otherwise in a read. The result is intended for use by prompt procedures, and is not necessarily correct except when reading a new-line.

Variable: symbol-read-case

A symbol that controls how read handles letters when reading a symbol. If the first letter is ‘U’, then letters in symbols are upper-cased. If the first letter is ‘D’ or ‘L’, then letters in symbols are down-cased. If the first letter is ‘I’, then the case of letters in symbols is inverted. Otherwise (the default), the letter is not changed. (Letters following a ‘\’ are always unchanged.) The value of symbol-read-case is only checked when a reader is created, not each time a symbol is read.

Miscellaneous

Variable: port-char-encoding

Controls how bytes in external files are converted to/from internal Unicode characters. Can be either a symbol or a boolean. If port-char-encoding is #f, the file is assumed to be a binary file and no conversion is done. Otherwise, the file is a text file. The default is #t, which uses a locale-dependent conversion. If port-char-encoding is a symbol, it must be the name of a character encoding known to Java. For all text files (that is, if port-char-encoding is not #f), on input a #\Return character or a #\Return followed by #\Newline is converted into a plain #\Newline.

This variable is checked when the file is opened; not when actually reading or writing. Here is an example of how you can safely change the encoding temporarily:
(define (open-binary-input-file name)
  (fluid-let ((port-char-encoding #f)) (open-input-file name)))

Variable: *print-base*

The number base (radix) to use by default when printing rational numbers. Must be an integer between 2 and 36, and the default is of course 10. For example setting *print-base* to 16 produces hexadecimal output.

Variable: *print-radix*

If true, prints an indicator of the radix used when printing rational numbers. If *print-base* is respectively 2, 8, or 16, then #b, #o or #x is written before the number; otherwise #Nr is written, where N is the base. An exception is when *print-base* is 10, in which case a period is written after the number, to match Common Lisp; this may be inappropriate for Scheme, so is likely to change.

Variable: *print-right-margin*

The right margin (or line width) to use when pretty-printing.

Variable: *print-miser-width*

If this is an integer, and the available width is less than or equal to this value, then the pretty printer switches to the more compact miser style.

Variable: *print-xml-indent*

When writing to XML, controls pretty-printing and indentation. If the value is 'always or 'yes, force each element to start on a new suitably-indented line. If the value is 'pretty, only force new lines for elements that won’t fit completely on a line. If the value is 'no or unset, don’t add extra whitespace.