Mapping Scheme names to Java names

Programs use "names" to refer to various values and procedures. What counts as a "name" differs from one programming language to another. A name in Scheme (and other Lisp-like languages) can in principle contain any character (if using a suitable quoting convention), but typically names consist of "words" (one or more letters) separated by hyphens, such as ‘make-temporary-file’. Digits and some special symbols are also used. Standard Scheme is case-insensitive; this means that the names ‘loop’, ‘Loop’, and ‘LOOP’ are all the same name. Kawa is by default case-sensitive, but we recommend that you avoid using upper-case letters as a general rule.

The Java language and the Java virtual machine use names for classes, variables, fields and methods. Names in the Java language can contain upper- and lower-case letters, digits, and the special symbols ‘_’ and ‘$’. The Java virtual machine allows most characters, but still has some limitations. Kawa limits characters in generated names to those allowed by the Java language (rather than those allowed by the virtual machine), for simplicity and Java interoperability.

Given a name in a Scheme program, Kawa needs to map that name into a valid Java name. A typical Scheme name such as ‘make-temporary-file’ is not a valid Java name. The convention for Java names is to use "mixed-case" words, such as ‘makeTemporaryFile’. So Kawa will translate a Scheme-style name into a Java-style name. The basic rule is simple: Hyphens are dropped, and a letter that follows a hyphen is translated to its upper-case (actually "title-case") equivalent. Otherwise, letters are translated as is.
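
The basic rule can be sketched in Java as follows. This is only an illustration under simplifying assumptions, not Kawa's actual implementation: the class ‘HyphenRuleSketch’ and its ‘mangle’ method are hypothetical names, and the sketch drops every hyphen without handling any other special character.

```java
// Hypothetical illustration of the basic rule; not Kawa's real mangler.
class HyphenRuleSketch {
    /** Drops each hyphen and title-cases the character that follows it. */
    static String mangle(String schemeName) {
        StringBuilder out = new StringBuilder();
        boolean titleCaseNext = false;
        for (int i = 0; i < schemeName.length(); i++) {
            char c = schemeName.charAt(i);
            if (c == '-') {
                titleCaseNext = true;   // drop the hyphen, remember it
                continue;
            }
            // All other characters are copied, title-cased only after a hyphen.
            out.append(titleCaseNext ? Character.toTitleCase(c) : c);
            titleCaseNext = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mangle("make-temporary-file"));
    }
}
```

Running ‘main’ prints the mixed-case form of ‘make-temporary-file’; a name without hyphens, such as ‘loop’, passes through unchanged.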

Some special characters are handled specially. A final ‘?’ is replaced by an initial ‘is’, with the following letter converted to titlecase. Thus ‘number?’ is converted to ‘isNumber’ (which fits with Java conventions), and ‘file-exists?’ is converted to ‘isFileExists’ (which doesn’t really). The pair ‘->’ is translated to ‘$To$’. For example ‘list->string’ is translated to ‘list$To$string’.
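
A minimal Java sketch of these two special cases, layered on the hyphen rule, might look like this. The class ‘SpecialCaseSketch’ and its ‘mangle’ method are hypothetical, and the order in which the rules are applied here is an assumption made for the illustration, not something taken from Kawa's source.

```java
// Hypothetical illustration of the '?' and '->' rules; not Kawa's real mangler.
class SpecialCaseSketch {
    static String mangle(String schemeName) {
        String name = schemeName;
        // A final '?' becomes an initial "is" (assumed to apply only when
        // something precedes the '?').
        boolean predicate = name.length() > 1 && name.endsWith("?");
        if (predicate) {
            name = name.substring(0, name.length() - 1);
        }
        // Handle "->" before hyphens so its '-' is not treated as a separator.
        name = name.replace("->", "$To$");

        StringBuilder out = new StringBuilder(predicate ? "is" : "");
        boolean titleCaseNext = predicate;  // the letter after "is" is title-cased
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '-') {
                titleCaseNext = true;
                continue;
            }
            out.append(titleCaseNext ? Character.toTitleCase(c) : c);
            titleCaseNext = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        for (String n : new String[] {"number?", "file-exists?", "list->string"}) {
            System.out.println(n + " => " + mangle(n));
        }
    }
}
```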

Some symbols are mapped to a mnemonic sequence, starting with a dollar-sign, followed by a two-character abbreviation. For example, the less-than symbol ‘<’ is mangled as ‘$Ls’. See the source code of the mangleName method in the gnu.expr.Compilation class for the full list. Characters that do not have a mnemonic abbreviation are mangled as ‘$’ followed by a four-hex-digit Unicode value. For example ‘Tamil vowel sign ai’ is mangled as ‘$0bc8’.
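
The per-character escaping step could be sketched as below. The class ‘EscapeSketch’ is hypothetical, its mnemonic table is deliberately incomplete (it contains only the ‘<’ entry quoted above), and the test used to decide which characters pass through unchanged is an assumption; the authoritative logic is the mangleName method itself.

```java
import java.util.Map;

// Hypothetical illustration of symbol escaping; not Kawa's real mangler.
class EscapeSketch {
    // Deliberately incomplete: only the abbreviation quoted in the text.
    static final Map<Character, String> MNEMONICS = Map.of('<', "Ls");

    static String escape(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            String mnemonic = MNEMONICS.get(c);
            if (Character.isLetterOrDigit(c) || c == '_') {
                out.append(c);                        // assumed pass-through set
            } else if (mnemonic != null) {
                out.append('$').append(mnemonic);     // '$' + two-character abbreviation
            } else {
                // No mnemonic: '$' + four lower-case hex digits of the code unit.
                out.append('$').append(String.format("%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b"));
        System.out.println(escape("\u0bc8"));  // Tamil vowel sign ai
    }
}
```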

Note that this mapping may map different Scheme names to the same Java name. For example ‘string?’, ‘String?’, ‘is-string’, ‘is-String’, and ‘isString’ are all mapped to the same Java identifier ‘isString’. Code that uses such "Java-clashing" names is not supported. There is very partial support for renaming names in the case of a clash, and there may be better support in the future. However, some of the nice features of Kawa depend on being able to map Scheme names to Java names naturally, so we urge you not to write code that "mixes" naming conventions by using (say) the names ‘open-file’ and ‘openFile’ to name two different objects.
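
To see how such clashes arise, one could group a set of Scheme names by the Java name they map to. The sketch below does this with a simplified, hypothetical ‘mangle’ method (hyphen rule plus the final ‘?’ rule only); the class ‘ClashSketch’ and its ‘clashes’ helper are illustrative and not part of Kawa.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of clash detection; not part of Kawa.
class ClashSketch {
    /** Simplified mangling: final '?' becomes initial "is"; hyphens are dropped. */
    static String mangle(String schemeName) {
        String name = schemeName;
        boolean predicate = name.length() > 1 && name.endsWith("?");
        if (predicate) {
            name = name.substring(0, name.length() - 1);
        }
        StringBuilder out = new StringBuilder(predicate ? "is" : "");
        boolean titleCaseNext = predicate;
        for (char c : name.toCharArray()) {
            if (c == '-') {
                titleCaseNext = true;
                continue;
            }
            out.append(titleCaseNext ? Character.toTitleCase(c) : c);
            titleCaseNext = false;
        }
        return out.toString();
    }

    /** Returns only those Java names that two or more Scheme names map to. */
    static Map<String, List<String>> clashes(List<String> schemeNames) {
        Map<String, List<String>> byJavaName = new LinkedHashMap<>();
        for (String n : schemeNames) {
            byJavaName.computeIfAbsent(mangle(n), k -> new ArrayList<>()).add(n);
        }
        byJavaName.values().removeIf(group -> group.size() < 2);
        return byJavaName;
    }

    public static void main(String[] args) {
        System.out.println(clashes(List.of(
            "string?", "String?", "is-string", "is-String", "isString",
            "open-file", "openFile", "loop")));
    }
}
```

In this sketch the five ‘isString’ variants end up in one group and ‘open-file’ and ‘openFile’ in another, while ‘loop’ is not reported because nothing else maps to it.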

The above mangling is used to generate Java method names. Each top-level definition is also mapped to a Java field. The name of this field is also mangled, but using a mostly reversible mapping: The Scheme function ‘file-exists?’ is mapped to the field name ‘file$Mnexists$Qu’. Because ‘$’ is used to encode special characters, you should avoid using it in names in your source file.
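
The idea of a reversible mapping can be illustrated with a small encode/decode pair. This is a sketch under stated assumptions: the class ‘ReversibleSketch’ is hypothetical, and its table holds only the two abbreviations visible in the example above (‘$Mn’ for ‘-’ and ‘$Qu’ for ‘?’), not Kawa's full list.

```java
import java.util.Map;

// Hypothetical illustration of a reversible encoding; not Kawa's real mangler.
class ReversibleSketch {
    // Only the two abbreviations visible in the example in the text.
    static final Map<Character, String> ENCODE = Map.of('-', "Mn", '?', "Qu");

    static String encode(String schemeName) {
        StringBuilder out = new StringBuilder();
        for (char c : schemeName.toCharArray()) {
            String code = ENCODE.get(c);
            if (code == null) {
                out.append(c);                    // no abbreviation: copy as is
            } else {
                out.append('$').append(code);
            }
        }
        return out.toString();
    }

    static String decode(String javaName) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < javaName.length()) {
            char c = javaName.charAt(i);
            Character original = null;
            if (c == '$' && i + 3 <= javaName.length()) {
                // Look up the two characters after '$' in the table.
                String code = javaName.substring(i + 1, i + 3);
                for (Map.Entry<Character, String> e : ENCODE.entrySet()) {
                    if (e.getValue().equals(code)) {
                        original = e.getKey();
                    }
                }
            }
            if (original != null) {
                out.append(original.charValue());
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String field = encode("file-exists?");
        System.out.println(field + " <=> " + decode(field));
    }
}
```

In this sketch a source name that already contains ‘$’, such as ‘a$Mnb’, encodes to itself but decodes to ‘a-b’, which illustrates one way the round trip can fail when ‘$’ appears in source names.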