Kawa: Mapping Scheme names to Java names

Programs use "names" to refer to various values and procedures. What counts as a "name" differs between programming languages. A name in Scheme (and other Lisp-like languages) can in principle contain any character (given a suitable quoting convention), but names typically consist of "words" (one or more letters) separated by hyphens, such as ‘make-temporary-file’. Digits and some special symbols are also used. Traditionally, Scheme is case-insensitive; this means that the names ‘loop’, ‘Loop’, and ‘LOOP’ are all the same name. Kawa is case-sensitive by default, but we recommend that you avoid upper-case letters as a general rule.

The Java language and the Java virtual machine use names for classes, variables, fields and methods. Names in the Java language can contain upper- and lower-case letters, digits, and the special symbols ‘_’ and ‘$’. The Java virtual machine (JVM) allows most characters, but still has some limitations.

Kawa translates class names, package names, field names, and local variable names using the “symbolic” convention, so most characters are unchanged. For example, the Scheme function ‘file-exists?’ becomes the field ‘file-exists?’, but ‘dotted.name’ becomes ‘dotted\,name’. Such names may not be valid Java names, so to access them from a Java program you might have to use reflection.

When translating procedure names to method names, Kawa uses a different translation, in order to achieve more “Java-like” names. This means translating a Scheme-style name like ‘make-temporary-file’ to "mixed-case" words, such as ‘makeTemporaryFile’. The basic rule is simple: Hyphens are dropped, and a letter that follows a hyphen is translated to its upper-case (actually "title-case") equivalent. Otherwise, letters are translated as is.
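The basic rule above can be sketched in a few lines of Java. This is only an illustration of the rule as described, not Kawa's actual implementation; the class name HyphenRuleSketch and the method toMethodName are hypothetical, and the sketch deliberately ignores the special characters covered below.

```java
// Illustrative sketch only; HyphenRuleSketch and toMethodName are
// hypothetical names, not part of Kawa.
final class HyphenRuleSketch {
    /** Drops hyphens and title-cases the character that follows each one. */
    static String toMethodName(String schemeName) {
        StringBuilder out = new StringBuilder();
        boolean afterHyphen = false;
        for (int i = 0; i < schemeName.length(); i++) {
            char c = schemeName.charAt(i);
            if (c == '-') {
                afterHyphen = true;           // the hyphen itself is dropped
            } else if (afterHyphen) {
                out.append(Character.toTitleCase(c));
                afterHyphen = false;
            } else {
                out.append(c);                // other characters pass through as is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toMethodName("make-temporary-file")); // makeTemporaryFile
    }
}
```

Character.toTitleCase is used rather than toUpperCase to match the "title-case" wording; for ordinary ASCII letters the two give the same result.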

Some characters are handled specially. A final ‘?’ is replaced by an initial ‘is’, with the following letter converted to title-case. Thus ‘number?’ is converted to ‘isNumber’ (which fits with Java conventions), and ‘file-exists?’ is converted to ‘isFileExists’ (which doesn’t really). The pair ‘->’ is translated to ‘$To$’. For example, ‘list->string’ is translated to ‘list$To$string’.
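The following Java sketch combines the two special cases just described with the hyphen rule. It is an assumption-laden illustration rather than Kawa's code: the names SpecialCharSketch and mangle are hypothetical, and the choice to rewrite ‘->’ before looking at lone hyphens is an assumption made so that the hyphen in ‘->’ is not dropped on its own.

```java
// Illustrative sketch only; SpecialCharSketch and mangle are hypothetical names.
final class SpecialCharSketch {
    static String mangle(String schemeName) {
        String s = schemeName;
        // A final '?' becomes an initial "is".
        boolean predicate = s.length() > 1 && s.endsWith("?");
        if (predicate) {
            s = s.substring(0, s.length() - 1);
        }
        // Assumption: "->" is rewritten before lone hyphens are handled.
        s = s.replace("->", "$To$");

        StringBuilder out = new StringBuilder(predicate ? "is" : "");
        boolean titleNext = predicate;   // title-case the letter after "is"
        for (char c : s.toCharArray()) {
            if (c == '-') {
                titleNext = true;        // drop the hyphen, title-case what follows
                continue;
            }
            out.append(titleNext ? Character.toTitleCase(c) : c);
            titleNext = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("number?"));      // isNumber
        System.out.println(mangle("file-exists?")); // isFileExists
        System.out.println(mangle("list->string")); // list$To$string
    }
}
```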

Some symbols are mapped to a mnemonic sequence, starting with a dollar-sign, followed by a two-character abbreviation. For example, the less-than symbol ‘<’ is mangled as ‘$Ls’. See the source code of the mangleName method in the gnu.expr.Mangling class for the full list. Characters that do not have a mnemonic abbreviation are mangled as ‘$’ followed by a four-hex-digit Unicode value. For example, ‘Tamil vowel sign ai’ (U+0BC8) is mangled as ‘$0bc8’.
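As a rough illustration of the fallback, the Java sketch below escapes a single character either with a mnemonic or with ‘$’ plus four hex digits. The names EscapeSketch and escape are hypothetical. Only the one mnemonic given in the text is included; the remaining entries, and the decision about which characters need escaping at all, belong to Kawa's own mangleName and are not modeled here.

```java
// Illustrative sketch only; EscapeSketch and escape are hypothetical names.
final class EscapeSketch {
    /** Escapes one character that is assumed to need mangling. */
    static String escape(char c) {
        switch (c) {
            case '<':
                return "$Ls";   // the only mnemonic stated in the text
            default:
                // No mnemonic known: '$' followed by the four-hex-digit code unit.
                return String.format("$%04x", (int) c);
        }
    }

    public static void main(String[] args) {
        System.out.println(escape('<'));       // $Ls
        System.out.println(escape('\u0BC8'));  // $0bc8
    }
}
```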

Note that this mapping may map different Scheme names to the same Java name. For example ‘string?’, ‘String?’, ‘is-string’, ‘is-String’, and ‘isString’ are all mapped to the same Java identifier ‘isString’. Code that uses such "Java-clashing" names is not supported. There is very partial support for renaming names in the case of a clash, and there may be better support in the future. However, some of the nice features of Kawa depend on being able to map Scheme names to Java names naturally, so we urge you not to write code that "mixes" naming conventions by using (say) the names ‘open-file’ and ‘openFile’ to name two different objects.
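To see why such names clash, it can help to run them through a simplified version of the mapping. The Java sketch below is hypothetical (NameClashSketch, mangle and clash are illustrative names, not Kawa APIs) and models only the hyphen rule and the final-‘?’ rule, which is enough to reproduce the examples above.

```java
// Illustrative sketch only; NameClashSketch and its methods are hypothetical names.
final class NameClashSketch {
    /** Simplified mangling: only the hyphen rule and the final-'?' rule. */
    static String mangle(String schemeName) {
        boolean predicate = schemeName.length() > 1 && schemeName.endsWith("?");
        String body = predicate
                ? schemeName.substring(0, schemeName.length() - 1)
                : schemeName;
        StringBuilder out = new StringBuilder(predicate ? "is" : "");
        boolean titleNext = predicate;
        for (char c : body.toCharArray()) {
            if (c == '-') {
                titleNext = true;
                continue;
            }
            out.append(titleNext ? Character.toTitleCase(c) : c);
            titleNext = false;
        }
        return out.toString();
    }

    /** True when two distinct Scheme names would share one Java method name. */
    static boolean clash(String a, String b) {
        return !a.equals(b) && mangle(a).equals(mangle(b));
    }

    public static void main(String[] args) {
        System.out.println(clash("string?", "is-string"));    // true
        System.out.println(clash("open-file", "openFile"));   // true
        System.out.println(clash("open-file", "close-file")); // false
    }
}
```

A check along these lines could be run over the procedure names in a module to spot accidental mixing of the two naming conventions before it causes trouble.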