This document describes installing and operating the GNU Smalltalk programming environment.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.
gnu Smalltalk is an implementation that closely follows the Smalltalk-80 language as described in the book Smalltalk-80: the Language and its Implementation by Adele Goldberg and David Robson, which will hereinafter be referred to as the Blue Book.
The Smalltalk programming language is an object-oriented programming language. This means, for one thing, that when programming you think not only of the data that an object contains, but also of the operations available on that object. The object's data representation capabilities and the operations available on the object are “inseparable”; the set of things that you can do with an object is defined precisely by the set of operations, which Smalltalk calls methods, that are available for that object. Each object belongs to a class (a datatype and the set of functions that operate on it) or, better, it is an instance of that class. You cannot even examine the contents of an object from the outside: to an outsider, the object is a black box that has some state and some operations available, but that's all you know. When you want to perform an operation on an object, you can only send it a message, and the object picks up the method that corresponds to that message.
In the Smalltalk language, everything is an object. This includes not
only numbers and all data structures, but even classes, methods,
pieces of code within a method (blocks or closures), stack
frames (contexts), etc. Even the if and while control structures
are implemented as messages sent to particular objects.
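A minimal sketch of what this means in practice: the conditional below is the ifTrue:ifFalse: message sent to a Boolean, and the loop is the whileTrue: message sent to a block. Both are standard Smalltalk-80 messages; the variable name i and the printed strings are only illustrative.

```smalltalk
Eval [
    | i |
    "A conditional is a message sent to the Boolean answered by 3 > 2."
    (3 > 2)
        ifTrue: ['3 is greater' displayNl]
        ifFalse: ['never printed' displayNl].
    "A loop is a message sent to a block that answers a Boolean."
    i := 1.
    [i <= 3] whileTrue: [
        i printNl.
        i := i + 1]
]
```

Because these are ordinary messages, no special statement syntax is involved: the code to run is passed as block arguments.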
Unlike other Smalltalks (including Smalltalk-80), GNU Smalltalk emphasizes Smalltalk's rapid prototyping features rather than the graphical and easy-to-use nature of the programming environment (did you know that the first GUIs ever ran under Smalltalk?). The availability of a large body of system classes, once you master them, makes it pretty easy to write complex programs which are usually a task for the so-called scripting languages. Therefore, even though we have a nice GUI environment including a class browser (see Blox), the goal of the GNU Smalltalk project is currently to produce a complete system to be used to write your scripts in a clear, aesthetically pleasing, and philosophically appealing programming language.
An example of what can be obtained with Smalltalk in this novel way can be found in Class reference. That part of the manual is entirely generated by a Smalltalk program, starting from the source code for the class libraries distributed together with the system.
The gnu Smalltalk virtual machine may be invoked via the following command:
gst [ flags ... ] [ file ... ]
When you invoke gnu Smalltalk, it will ensure that the binary image file (called gst.im) is up to date; if not, it will build a new one as described in Loading an image or creating a new one. Your first invocation should look something like this:
"Global garbage collection... done"
gnu Smalltalk ready
st>
If you specify one or more files, they will be read and executed in order, and Smalltalk will exit when end of file is reached. If you don't specify a file, GNU Smalltalk reads standard input, issuing a `st>' prompt if the standard input is a terminal. You may specify - for the name of a file to invoke an explicit read from standard input.
To exit while at the `st>' prompt, use Ctrl-d, or type ObjectMemory quit followed by <RET>. Use ObjectMemory snapshot first to save a new image that you can reload later, if you wish.
As is standard for GNU-style options, specifying -- stops the interpretation of options so that every argument that follows is considered a file name even if it begins with a `-'.
You can specify both short and long flags; for example, --version is exactly the same as -v, but is easier to remember. Short flags may be specified one at a time, or in a group. A short flag or a group of short flags always starts off with a single dash to indicate that what follows is a flag or set of flags instead of a file name; a long flag starts off with two consecutive dashes, without spaces between them.
In the current implementation the flags can be intermixed with file names, but their effect is as if they were all specified first. The various flags are interpreted as follows:
-a
Treat all options afterward as arguments to be given to Smalltalk code, retrievable with
Smalltalk arguments, ignoring them as arguments
to GNU Smalltalk itself.
Examples:
| Command line | Options seen by GNU Smalltalk | Smalltalk arguments |
| (empty) | (none) | #() |
| -Via foo bar | -Vi | #('foo' 'bar') |
| -Vai test | -Vi | #('test') |
| -Vaq | -Vq | #() |
| --verbose -aq -c | --verbose -q | #('-c') |
Directory kernel.
With -f, the following two command lines are equivalent:
gst -f file args...
gst -q file -a args...
This is meant to be used in the so-called “sharp-bang” sequence at the beginning of a file, as in
#! /usr/bin/gst -f
... Smalltalk source code ...
GNU Smalltalk treats the first line as a comment, and the -f option ensures that the arguments are passed properly to the script. Use this instead to avoid hard-coding the path to gst:
#! /bin/sh
"exec" "gst" "-f" "$0" "$@"
... Smalltalk source code ...
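As a hedged illustration, a complete script might do nothing more than echo its arguments. The file name args.st is hypothetical; Smalltalk arguments is the accessor that the examples table above shows answering an Array of strings.

```smalltalk
#! /usr/bin/gst -f
"Hypothetical script args.st: print each command-line argument on its own line."
Smalltalk arguments do: [:each | each displayNl]
```

After making the file executable, running it as ./args.st foo bar would be expected to print foo and bar, one per line, assuming gst is installed at the path given on the first line.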
Caveat: The startup sequence is pretty complicated. If you are not interested in its customization, you can skip the first two sections below. These two sections also don't apply when using the command-line option -I, unless also using --maybe-rebuild-image.
You can abort gnu Smalltalk at any time during this procedure with Ctrl-c.
When gnu Smalltalk is invoked, it first chooses two paths, the “image path” and the “kernel path”. The image path is set by considering these paths in succession:
The “kernel path” is the directory in which to look for Smalltalk code compiled into the base image. The possibilities in this case are:
gnu Smalltalk can load images created on any system with the same pointer size as its host system by approximately the same version of gnu Smalltalk, even if they have different endianness. For example, images created on 32-bit PowerPC can be loaded with a 32-bit x86 gst VM, provided that the gnu Smalltalk versions are similar enough. Such images are called compatible images. It cannot load images created on systems with different pointer sizes; for example, our x86 gst cannot load an image created on x86-64.
Unless the -i flag is used, gnu Smalltalk first tries to load the file named by --image-file, defaulting to gst.im in the image path. If this is found, gnu Smalltalk ensures the image is “not stale”, meaning its write date is newer than the write dates of all of the kernel method definition files. It also ensures that the image is “compatible”, as described above. If both tests pass, gnu Smalltalk loads the image and continues with After the image is created or restored.
If that fails, a new image has to be created. The image path may now be changed to the current directory if the previous choice is not writeable.
To build an image, gnu Smalltalk loads the set of files that make up the
kernel, one at a time. The list can be found in libgst/lib.c, in
the standard_files variable. You can override kernel files by
placing your own copies in ~/.st/kernel/. For
example, if you create a file ~/.st/kernel/Builtins.st, it will
be loaded instead of the Builtins.st in the kernel path.
To aid with image customization and local bug fixes, GNU Smalltalk loads two more files (if present) before saving the image. The first is site-pre.st, found in the parent directory of the kernel directory. Unless users at a site change the kernel directory when running gst, /usr/local/share/smalltalk/site-pre.st provides a convenient place for site-wide customization. The second is ~/.st/pre.st, which can be different for each user's home directory.
Before the next steps, gnu Smalltalk takes a snapshot of the new memory image, saving it over the old image file if it can, or in the current directory otherwise.
Next, gnu Smalltalk sends the returnFromSnapshot event to the dependents
of the special class ObjectMemory (see Memory access).
Afterwards, it loads ~/.st/init.st if available.
You can remember the difference between pre.st and init.st by remembering that pre.st is the pre-snapshot file and init.st is the post-image-load initialization file.
Finally, gnu Smalltalk loads files listed on the command line, or prompts for input at the terminal, as described in Command line arguments.
The language that GNU Smalltalk accepts is basically the same as that accepted by
other Smalltalk environments, and uses the same syntax as the Blue Book, also
known as Smalltalk-80: The Language and Its Implementation.
The return operator, which is represented in the Blue Book as an
up-arrow, is mapped to the ASCII caret symbol ^; the assignment
operator (left-arrow) is usually represented as :=.
Actually, the grammar of gnu Smalltalk is slightly different from the grammar of other Smalltalk environments in order to simplify interaction with the system in a command-line environment as well as in full-screen editors.
Statements are executed one by one; multiple statements are separated by a period. At end-of-line, if a valid statement is complete, a period is implicit. For example,
8r300. 16rFFFF
prints out the decimal value of octal 300 and hex FFFF,
each followed by a newline.
Multiple statements share the same local variables, which are automatically
declared. To delete the local variables, terminate a statement with
! rather than . or newline. Here,
a := 42
a!
a
the first two occurrences of a are printed as 42, but the third one
is uninitialized and thus printed as nil.
In order to evaluate multiple statements in a single block, wrap them into an eval block as follows:
Eval [
a := 42. a printString
]
This won't print the intermediate result (the integer 42), only the final
result (the string '42').
ObjectMemory quit
exits from the system. You can also type a C-d to exit from Smalltalk if it's reading statements from standard input.
GNU Smalltalk provides three extensions to the language that make it simpler to write complete programs in an editor. However, it is also compatible with the file out syntax as shown in the Green Book (also known as Smalltalk-80: Bits of History, Words of Advice by Glenn Krasner).
A new class is created using this syntax:
superclass-name subclass: new-class-name [
    | instance variables |
    pragmas
    message-pattern-1 [ statements ]
    message-pattern-2 [ statements ]
    ...
    class-variable-1 := expression.
    class-variable-2 := expression.
    ...
]
In short:
<comment: 'Class comment'>
<category: 'Examples-Intriguing'>
<import: SystemExceptions>
<shape: #pointer>
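To make the template concrete, here is a small sketch that follows it. The class Account, its instance variable balance, and its two methods are hypothetical names invented for this illustration; only the bracket syntax and the comment and category pragmas come from the text above.

```smalltalk
"Hypothetical Account class, for illustration only."
Object subclass: Account [
    | balance |
    <comment: 'A toy bank account'>
    <category: 'Examples-Intriguing'>

    balance [
        "Answer the current balance, treating an unset balance as zero."
        ^balance ifNil: [0]
    ]

    deposit: amount [
        "Add amount to the balance."
        balance := self balance + amount
    ]
]

Eval [
    | account |
    account := Account new.
    account deposit: 100.
    account balance printNl
]
```

The Eval block at the end exercises the class: it creates an instance, deposits 100, and prints the resulting balance.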
A similar syntax is used to define new methods in an existing class.
class-expression extend [
...
]
The class-expression is an expression that evaluates to a class
object, which is typically just the name of a class, although it can be
the name of a class followed by the word class, which causes the
method definitions that follow to apply to the named class itself,
rather than to its instances.
Number extend [
    radiusToArea [
        ^self squared * Float pi
    ]
    radiusToCircumference [
        ^self * 2 * Float pi
    ]
]
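The class-side form can be sketched in the same way. The method name unitCircleArea is hypothetical, and the snippet assumes the Number extend definition above has already been loaded.

```smalltalk
"Assumes the Number extend [...] definition above has been loaded."
Number class extend [
    unitCircleArea [
        "Hypothetical class-side method: area of a circle of radius 1."
        ^1 radiusToArea
    ]
]

Eval [
    "Instance-side extension, sent to a number."
    3 radiusToCircumference printNl.
    "Class-side extension, sent to the class itself."
    Number unitCircleArea printNl
]
```

Writing Number class rather than Number is what makes unitCircleArea a message understood by the class object instead of by individual numbers.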
A complete treatment of the Smalltalk syntax and of the class library can be found in the included tutorial and class reference (see Class Reference).
More information on the implementation of the language can be found in the Blue Book; the relevant parts are also available online as HTML documents, at http://users.ipa.net/~dwighth/smalltalk/bluebook/bluebook_imp_toc.html.
gnu Smalltalk comes with a set of files that provides a simple regression test suite.
To run the test suite, change to the top-level Smalltalk directory and type
make check
You should see the names of the test suite files as they are processed, but that's it. Any other output indicates some problem.
Different parts of GNU Smalltalk come under two licenses: the virtual machine and the development environment (compiler and browser) come under the GNU General Public License, while the system class libraries come under the Lesser General Public License.
The GPL licensing of the virtual machine means that all derivatives of the virtual machine must be put under the same license. In other words, it is strictly forbidden to distribute programs that include the gnu Smalltalk virtual machine under a license that is not the GPL. This also includes any bindings to external libraries. For example, the bindings to Gtk+ are released under the GPL.
In principle, the GPL would not extend to Smalltalk programs, since these are merely input data for the virtual machine. On the other hand, using bindings that are under the GPL via dynamic linking would constitute combining two parts (the Smalltalk program and the bindings) into one program. Therefore, we added a special exception to the GPL in order to avoid gray areas that could adversely hit both the project and its users:
In addition, as a special exception, the Free Software Foundation give you permission to combine GNU Smalltalk with free software programs or libraries that are released under the GNU LGPL and with independent programs running under the GNU Smalltalk virtual machine. You may copy and distribute such a system following the terms of the GNU GPL for GNU Smalltalk and the licenses of the other code concerned, provided that you include the source code of that other code when and as the GNU GPL requires distribution of source code.
Note that people who make modified versions of gnu Smalltalk are not obligated to grant this special exception for their modified versions; it is their choice whether to do so. The gnu General Public License gives permission to release a modified version without this exception; this exception also makes it possible to release a modified version which carries forward this exception.
Smalltalk programs that run under GNU Smalltalk are linked with the system classes in the GNU Smalltalk class library. Therefore, they must respect the terms of the Lesser General Public License.
The interpretation of this license for architectures different from that of the C language is often difficult; the accepted one for Smalltalk is as follows. The image file can be considered as an object file, falling under Subsection 6a of the license, as long as it allows a user to load an image, upgrade the library or otherwise apply modifications to it, and save a modified image: this is most conveniently obtained by allowing the user to use the read-eval-print loop that is embedded in the gnu Smalltalk virtual machine.
In other words, provided that you leave access to the loop in a documented way, or that you provide a way to file in arbitrary files in an image and save the result to a new image, you are obeying Subsection 6a of the Lesser General Public License, which is reported here:
a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.)
In the future, alternative mechanisms similar to shared libraries may be provided, so that it is possible to comply with the gnu LGPL in other ways.
In this section, the features which are specific to gnu Smalltalk are described. These features include support for calling C functions from within Smalltalk, accessing environment variables, and controlling various aspects of compilation and execution monitoring.
Note that, in general, GNU Smalltalk is much more powerful than the original
Smalltalk-80, as it contains many methods that are common in today's
Smalltalk implementations and are present in the ANSI Standard for
Smalltalk, but were absent in the Blue Book. Examples include
Collection's allSatisfy: and anySatisfy: methods and many
methods in SystemDictionary (the Smalltalk dictionary's class).
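As a brief sketch of the two Collection messages just named (the array contents are arbitrary sample data):

```smalltalk
Eval [
    | numbers |
    numbers := #(2 4 6 7).
    "Answers true only if the block answers true for every element."
    (numbers allSatisfy: [:each | each even]) printNl.
    "Answers true if the block answers true for at least one element."
    (numbers anySatisfy: [:each | each odd]) printNl
]
```

With this sample data the first test fails because of the 7, and the second succeeds for the same reason.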
The basic image in GNU Smalltalk includes powerful extensions to the Stream hierarchy found in ANSI Smalltalk (and Smalltalk-80). In particular:
Read streams support the iteration protocols of collections. For messages that do not return a new stream (such as
fold:, detect:, inject:into:) these
are completely identical. For messages that return a new stream, such
as select: and collect:, the blocks are evaluated lazily,
as elements are requested from the stream using next.
Streams can be concatenated using , like SequenceableCollections.
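The following sketch illustrates lazy evaluation and concatenation on streams. It assumes that select: and the , message are understood by read streams as this section describes; the variable names and sample data are illustrative only.

```smalltalk
Eval [
    | evens joined |
    "select: on a stream answers a new stream; the block should run
     only as elements are pulled with next, so the large interval
     is never filtered in full."
    evens := (1 to: 1000000) readStream select: [:each | each even].
    evens next printNl.
    evens next printNl.
    "Two streams concatenated with the , message."
    joined := 'abc' readStream , 'def' readStream.
    [joined atEnd] whileFalse: [joined next displayNl]
]
```

Because the filtering is lazy, asking for two elements should evaluate the block only for the first few integers of the interval.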
For example, here is an empty generator and two infinite generators:
"Returns an empty stream"
Generator on: [ :gen | ]
"Return an infinite stream of 1's"
Generator on: [ :gen | [ gen yield: 1 ] repeat ]
"Return an infinite stream of integers counting up from 1"
Generator inject: 1 into: [ :value | value + 1 ]
The block is put “on hold” and starts executing as soon as #next
or #atEnd is sent to the generator. When the block sends
#yield: to the generator, it is again put on hold and the argument
becomes the next object in the stream.
Generators use continuations, but they shield the users from their complexity by presenting the same simple interface as streams.
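A finite generator can be consumed with the usual stream protocol. This is a minimal sketch using only the Generator on:, yield:, atEnd and next messages described above; the variable names are illustrative.

```smalltalk
Eval [
    | squares |
    "The block yields the squares of 1, 2 and 3, then returns,
     which ends the stream."
    squares := Generator on: [:gen |
        1 to: 3 do: [:i | gen yield: i * i]].
    "Each next resumes the block until its following yield:."
    [squares atEnd] whileFalse: [squares next printNl]
]
```

The loop prints 1, 4 and 9; the block runs incrementally, one yield: per element requested.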
Regular expressions, or "regexes", are a sophisticated way to efficiently match patterns of text. If you are unfamiliar with regular expressions in general, see Syntax of Regular Expressions, for a guide for those who have never used regular expressions.
gnu Smalltalk supports regular expressions in the core image with methods
on String.
The GNU Smalltalk regular expression library is derived from GNU libc,
with modifications made originally for Ruby to support Perl-like syntax.
It will always use its included library, and never the ones installed on
your system; this may change in the future in backwards-compatible ways.
Regular expressions are currently 8-bit clean, meaning they can
work with any ordinary String, but do not support full Unicode, even
when package I18N is loaded.
Broadly speaking, these regexes support Perl 5 syntax; register groups `()' and repetition `{}' must not be given with backslashes, and their counterpart literal characters should. For example, `\{{1,3}' matches `{', `{{', `{{{'; correspondingly, `(a)(\()' matches `a(', with `a' and `(' as the first and second register groups respectively. gnu Smalltalk also supports the regex modifiers `imsx', as in Perl. You can't put regex modifiers like `im' after Smalltalk strings to specify them, because they aren't part of Smalltalk syntax. Instead, use the inline modifier syntax. For example, `(?is:abc.)' is equivalent to `[Aa][Bb][Cc](?:.|\n)'.
In most cases, you should specify regular expressions as ordinary
strings. gnu Smalltalk always caches compiled regexes, and uses a special
high-efficiency caching when looking up literal strings (i.e. most
regexes), to hide the compiled Regex objects from most code.
For special cases where this caching is not good enough, simply send
#asRegex to a string to retrieve a compiled form, which
works in all places in the public API where you would specify a regex
string. You should always rely on the cache until you have demonstrated
that using Regex objects makes a noticeable performance difference in
your code.
Smalltalk strings only have one escape, the `'' given by `''', so backslashes used in regular expression strings will be understood as backslashes, and a literal backslash can be given directly with `\\'.
The methods on the compiled Regex object are private to this interface. As a public interface, gnu Smalltalk provides methods on String, in the category `regex'. There are several methods for matching, replacing, pattern expansion, iterating over matches, and other useful things.
The fundamental operator is #searchRegex:, usually written as
#=~, reminiscent of Perl syntax. This method will always
return a RegexResults, which you can query for whether
the regex matched, the location Interval and contents of the match and
any register groups as a collection, and other features. For example,
here is a simple configuration file line parser:
| file config |
config := LookupTable new.
file := (File name: 'myapp.conf') readStream.
file linesDo: [:line |
    (line =~ '(\w+)\s*=\s*((?: ?\w+)+)') ifMatched: [:match |
        config at: (match at: 1) put: (match at: 2)]].
file close.
config printNl.
As with Perl, =~ will scan the entire string and answer the
leftmost match if any is to be found, consuming as many characters as
possible from that position. You can anchor the search with variant
messages like #matchRegex:, or of course ^ and
$ with their usual semantics if you prefer.
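A smaller sketch of the same operator, working on a literal string instead of a file. The sample text and pattern are illustrative; matched is assumed here to be the RegexResults query for whether the regex matched.

```smalltalk
Eval [
    | result |
    result := 'width = 640' =~ '(\w+)\s*=\s*(\d+)'.
    "Ask the RegexResults object whether the search succeeded."
    result matched printNl.
    "Register groups are numbered from 1, as in the parser above."
    result ifMatched: [:match |
        (match at: 1) displayNl.
        (match at: 2) displayNl]
]
```

If the pattern matches, the two register groups hold the key and the value of the sample line.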
You shouldn't modify the string while you want a particular RegexResults object matched on it to remain valid, because changes to the matched text may propagate to the RegexResults object.
Analogously to the Perl s operator, gnu Smalltalk provides
#replacingRegex:with:. Unlike Perl, gnu Smalltalk employs the pattern expansion
syntax of the #% message here. For example, 'The ratio is
16/9.' replacingRegex: '(\d+)/(\d+)' with: '$%1\over%2$' answers
'The ratio is $16\over9$.'. In place of the g
modifier, use the #replacingAllRegex:with: message instead.
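The difference between the two replacement messages can be sketched as follows; the sample sentence and the replacement pattern are illustrative only.

```smalltalk
Eval [
    | text |
    text := 'The ratios are 16/9 and 4/3.'.
    "Only the first match is rewritten."
    (text replacingRegex: '(\d+)/(\d+)' with: '%1:%2') displayNl.
    "Every match is rewritten, like the g modifier in Perl."
    (text replacingAllRegex: '(\d+)/(\d+)' with: '%1:%2') displayNl
]
```

In the replacement string, %1 and %2 stand for the first and second register groups of each match.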
One other interesting String message is #onOccurrencesOfRegex:do:, which
invokes its second argument, a block, on every successful match found in the
receiver. Internally, every search will start at the end of the previous
successful match. For example, this will print all the words in a stream:
stream contents onOccurrencesOfRegex: '\w+'
do: [:each | each match printNl]
[This section (and the implementation of namespaces in gnu Smalltalk) is based on the paper Structured Symbolic Name Spaces in Smalltalk, by Augustin Mrazik.]
The Smalltalk-80 programming environment, upon which gnu Smalltalk is
historically based, supports symbolic identification of objects in one
global namespace—in the Smalltalk system dictionary. This means
that each global variable in the system has its unique name which is
used for symbolic identification of the particular object in the source
code (e.g. in expressions or methods). The most important of these
global variables are classes defining the behavior of objects.
In development dealing with modelling of real systems, polymorphic
symbolic identification is often needed. By this, we mean that it
should be possible to use the same name for different classes or other
global variables. Selection of the proper variable binding should be
context-specific. By way of illustration, let us consider class
Statement as an example which would mean totally different things
in different domains:
This issue becomes inevitable if we start to work persistently, using
ObjectMemory snapshot to save after each session for later
resumption. For example, you might have the class Statement
already in your image with the “Bank” meaning above (e.g. in the
live bank support systems we all run in our images) and you might decide
to start developing YAC [Yet Another C]. Upon starting to
write parse nodes for the compiler, you would find that
#Statement is bound in the banking package. You could replace
it with your parse node class, and the bank's Statement could
remain in the system as an unbound class with full functionality;
however, it could not be accessed anymore at the symbolic level in the
source code. Whether this would be a problem or not would depend on
whether any of the bank's code refers to the class Statement, and
when these references occur.
Objects which have to be identified in source code by their names are
included in Smalltalk, the sole instance of
SystemDictionary. Such objects may be identified simply by
writing their names as you would any variable names. The code is
compiled in the default environment, and if the variable is found in
Smalltalk, without being shadowed by a class pool or local
variables, its value is retrieved and used as the value of the
expression. In this way Smalltalk represents the sole symbolic
namespace. In the following text the symbolic namespace, as a concept,
will be called simply environment to make the text more clear.
To support polymorphic symbolical identification several environments will be needed. The same name may exist concurrently in several environments as a key, pointing to diverse objects in each.
Symbolic navigation between these environments is needed. Before approaching the problem of the syntax and semantics to be implemented, we have to decide on structural relations to be established between environments.
Since the environment must first be symbolically identified to direct
access to its global variables, it must first itself be a global
variable in another environment. Smalltalk is a great choice for
the root environment, from which selection of other environments and
their variables begins. From Smalltalk some of the existing
sub-environments may be seen; from these other sub-environments may be
seen, etc. This means that environments represent nodes in a graph
where symbolic selections from one environment to another one represent
branches.
The symbolic identification should be unambiguous, although it will be polymorphic. This is why we should avoid cycles in the environment graph. Cycles in the graph could also cause other problems in the implementation, e.g. inability to use trivially recursive algorithms. Thus, in general, the environments must build a directed acyclic graph; GNU Smalltalk currently limits this to an n-ary tree, with the extra feature that environments can be used as pool dictionaries.
Let us call the partial ordering relation which occurs between environments inheritance. Sub-environments inherit from their super-environments. The feature of inheritance in the meaning of object-orientation is associated with this relation: all associations of the super-environment are valid also in its sub-environments, unless they are locally redefined in the sub-environment.
A super-environment includes all its sub-environments as
Associations under their names. The sub-environment includes its
super-environment under the symbol #Super. Most environments
inherit from Smalltalk, the standard root environment, but they
are not required to do so; this is similar to how most classes derive
from Object, yet one can derive a class directly from nil.
Since they all inherit Smalltalk's global variables, it is not
necessary to define Smalltalk as pointing to Smalltalk's
Smalltalk in each environment.
The inheritance links to the super-environments are used in the lookup
for a potentially inherited global variable. This includes lookups by a
compiler searching for a variable binding and lookups via methods such
as #at: and #includesKey:.
Global objects of an environment, be they local or inherited, may be referenced by their symbol variable names used in the source code, e.g.
John goHome
if the #John -> aMan association exists in the particular environment or
one of its super-environments, all along the way to the root environment.
If an object must be referenced from another environment (i.e. which
is not one of its sub-environments) it has to be referenced either
relatively to the position of the current environment, using the
Super symbol, or absolutely, using the “full pathname”
of the object, navigating from the tree root (usually Smalltalk)
through the tree of sub-environments.
For the identification of global objects in another environment, we use a “pathname” of symbols. The symbols are separated by periods; the “look” to appear is that of
Smalltalk.Tasks.MyTask
and of
Super.Super.Peter.
As is custom in Smalltalk, we are reminded by capitalization that we
are accessing global objects. Another syntax returns the variable
binding, the Association for a particular global. The first
example above is equivalently:
#{Smalltalk.Tasks.MyTask} value
The latter syntax, a variable binding, is also valid inside literal arrays.
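A short sketch of both syntaxes, using the standard class Object so that no extra namespaces are needed:

```smalltalk
Eval [
    "A dotted path and a plain global name reach the same object."
    (Smalltalk.Object == Object) printNl.
    "The #{...} syntax answers the variable binding (an association);
     value answers the global itself and key answers its name."
    (#{Smalltalk.Object} value == Object) printNl.
    #{Smalltalk.Object} key printNl
]
```

Holding on to the binding rather than the value can be useful when the global may later be rebound.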
A superclass of SystemDictionary called RootNamespace is
defined, and many of the features of the Smalltalk-80
SystemDictionary will be hosted by that class. Namespace
and RootNamespace are in turn subclasses of
AbstractNamespace.
To handle inheritance, the following methods have to be defined or redefined in Namespace (not in RootNamespace):
#at:ifAbsent: and #includesKey:
When Namespace, trying to read
a variable, finds an association in its own dictionary or a
super-environment dictionary, it uses that; for Dictionary's
writes and when a new association must be created, Namespace
creates it in its own dictionary. There are special methods like
#set:to: for cases in which you want to modify a binding in a
super-environment if that is the relevant variable's binding.
#do: and #keys
AbstractNamespace will also implement a new set of
methods that allow one to navigate through the namespace hierarchy;
these parallel those found in Behavior for the class hierarchy.
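The read and write rules can be sketched as follows. DemoNS and Answer are hypothetical names created only for this illustration, and the sketch assumes that addSubspace: and superspace are the messages for creating a sub-environment and for navigating to its super-environment.

```smalltalk
Eval [
    | ns |
    "DemoNS is a hypothetical namespace created just for this sketch."
    Smalltalk addSubspace: #DemoNS.
    ns := Smalltalk at: #DemoNS.
    ns at: #Answer put: 42.
    "Reads see both local and inherited bindings."
    (ns includesKey: #Answer) printNl.
    (ns includesKey: #Object) printNl.
    "Writes stay local: the super-environment does not gain the new key."
    (Smalltalk includesKey: #Answer) printNl.
    "Navigate upwards, much as superclass does for classes."
    ns superspace printNl
]
```

Under the rules above, Object should be visible from DemoNS through inheritance, while Answer should not be visible from Smalltalk.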
The most important task of the Namespace class is to provide
organization for the most important global objects in the Smalltalk
system—for the classes. This importance becomes even more crucial in
a structure of multiple environments intended to change the semantics of
code compiled for those classes.
In Smalltalk the classes have the instance variable name which
holds the name of the class. Each defined class is included in
Smalltalk, or another environment, under this name. In a
framework with several environments the class should know the
environment in which it has been created and compiled. This is a new
property of Class which must be defined and properly used in
relevant methods. In the mother environment the class shall be included
under its name.
Any class, as with any other object, may be included concurrently in several environments, even under different symbols in the same or in diverse environments. We can consider these “alias names” of the particular class or other value. A class may be referenced under the other names or in other environments than its mother environment, e.g. for the purpose of instance creation or messages to the class, but it should not compile code in these environments, even if this compilation is requested from another environment. If the syntax is not correct in the mother environment, a compilation error occurs. This follows from the existence of class “mother environments”, as a class is responsible for compiling its own methods.
An important issue is also the name of the class answered by the class for the purpose of its identification in diverse tools (e.g. in a browser). This must be changed to reflect the environment in which it is shown, i.e. the method `nameIn: environment' must be implemented and used in proper places.
Other changes must be made to the Smalltalk system to achieve the full
functionality of structured environments. In particular, changes have
to be made to the behavior classes, the user interface, the compiler,
and a few classes supporting persistence. One small detail of note is
that evaluation in the REPL or `Workspace', implemented
by compiling methods on UndefinedObject, makes more sense if
UndefinedObject's environment is the “current environment” as
reachable by Namespace current, even though its mother
environment by any other sensibility is Smalltalk.
Using namespaces is often merely a matter of adding a `namespace'
option to the gnu Smalltalk XML package description used by
PackageLoader, or wrapping your code like this:
Namespace current: NewNS [
...
]
Namespaces can be imported into classes like this:
Stream subclass: EncodedStream [
<import: Encoders>
]
Alternatively, paths to
classes (and other objects) in the namespaces will have to be specified
completely. Importing a namespace into a class is similar to C++'s
using namespace declaration within the class proper's definition.
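As a minimal sketch of the two forms above, the following defines a class inside a namespace and then reaches it first through its full path and then through an import. The names NewNS, Greeter and Client are hypothetical, and the explicit addSubspace: call is an assumption made here so that the namespace is known to exist before it is made current.

```smalltalk
"Create the namespace as a child of Smalltalk (assumed step, see above)."
Smalltalk addSubspace: #NewNS.

Namespace current: NewNS [
    Object subclass: Greeter [
        greeting [ ^'hello from NewNS' ]
    ]
]

"Outside the namespace, the class is reached through its full path."
NewNS.Greeter new greeting printNl.

"With an import, the short name can be used inside the importing class."
Object subclass: Client [
    <import: NewNS>
    run [ ^Greeter new greeting ]
]

Client new run printNl.
```

Both printNl expressions should show the same string; the import only changes how the name Greeter is looked up, not which class is found.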
Finally, be careful when working with fundamental system classes. Although you can use code like
Namespace current: NewNS [
Smalltalk.Set subclass: #Set [
<category: 'My application-Extensions'>
...
]
]
this approach won't work
when applied to core classes. For example, you might be successful with
a Set or WriteStream object, but subclassing
SmallInteger this way can bite you in strange ways: integer
literals will still belong to the Smalltalk dictionary's version
of the class (this holds for Arrays, Strings, etc. too),
primitive operations will still answer standard Smalltalk
SmallIntegers, and so on. Similarly,
word-shaped will recognize 32-bit Smalltalk.LargeInteger objects,
but not LargeIntegers belonging to your own namespace.
Unfortunately, this problem is not easy to solve since Smalltalk has to
know the OOPs of certain class objects for speed: it
would not be feasible to look up the environment to which the sender of a
message belongs every time the + message was sent to an Integer.
So, gnu Smalltalk namespaces cannot yet solve 100% of the problem of clashes between extensions to a class—for that you'll still have to rely on prefixes to method names. But they do solve the problem of clashes between class names, or between class names and pool dictionary names.
Namespaces are unrelated to packages; loading a package does not import the corresponding namespace.
Four classes (FileDescriptor, FileStream, File,
Directory) allow you to create files and access the file system
in a fully object-oriented way.
FileDescriptor and FileStream are much more powerful than the
corresponding C language facilities (the difference between the two is that,
like the C stdio library, FileStream does buffering). For one
thing, they allow you to write raw binary data in a portable endian-neutral
format. But, more importantly, these classes transparently implement
virtual filesystems and asynchronous I/O.
Asynchronous I/O means that an input/output operation blocks the
Smalltalk Process that is doing it, but not the others, which makes them
very useful in the context of network programming. Virtual file systems
mean that these objects can transparently extract files from archives
such as tar and gzip files, through a mechanism that can
be extended through either shell scripting or Smalltalk programming.
For more information on these classes, look in the class reference, under
the VFS namespace. URLs may be used as file names; though,
unless you have loaded the NetClients package (see Network support),
only file URLs will be accepted.
In addition, the three files stdin, stdout, and stderr
are declared as global instances of FileStream that are bound to the
proper values as passed to the C virtual machine. They can be accessed as
either stdout or FileStream stdout; the former is easier to
type, but the latter can be clearer.
Finally, Object defines four other methods: print and
printNl, store and storeNl. These do a printOn: or
storeOn: to the “Transcript” object; this object, which is the sole
instance of class TextCollector, normally delegates write
operations to stdout. If you load the Blox gui, instead,
the Transcript Window will be attached to the Transcript object (see Blox).
The fileIn: message sent to the FileStream class, with a file
name as a string argument, will cause that file to be loaded into
Smalltalk.
For example,
FileStream fileIn: 'foo.st' !
will cause foo.st to be loaded into gnu Smalltalk.
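The file classes described above can be exercised with a short round trip. This is only a sketch: the file name scratch.txt is hypothetical, and the open:mode: constructor together with the FileStream write and FileStream read mode selectors are taken from the class reference rather than from the text above.

```smalltalk
| output input |
"Write two lines to a scratch file (hypothetical name)."
output := FileStream open: 'scratch.txt' mode: FileStream write.
output nextPutAll: 'first line'; nl.
output nextPutAll: 'second line'; nl.
output close.

"Read the file back one line at a time and echo each line on stdout."
input := FileStream open: 'scratch.txt' mode: FileStream read.
[input atEnd] whileFalse: [stdout nextPutAll: input nextLine; nl].
input close.
```

Because FileStream buffers its output, closing the stream (or flushing it) before reopening the file is what makes the written lines visible to the reader.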
Another gnu Smalltalk-specific class, the ObjectDumper class, allows
you to dump objects in a portable, endian-neutral, binary format. Note that
you can use the ObjectDumper on ByteArrays too, thanks to another
gnu Smalltalk-specific class, ByteStream, which allows you to treat
ByteArrays the same way you would treat disk files.
For more information on the usage of the ObjectDumper, look in the
class reference.
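A minimal sketch of dumping an object and loading it back follows. It assumes the class-side convenience methods dump:to: and loadFrom: described in the class reference, and uses a hypothetical file name; a real program would likely add error handling around the streams.

```smalltalk
| original stream copy |
original := Array with: 42 with: 'a string' with: #aSymbol.

"Dump the object graph to a file (hypothetical name)."
stream := FileStream open: 'objects.dump' mode: FileStream write.
ObjectDumper dump: original to: stream.
stream close.

"Load it back into a fresh object and compare by value."
stream := FileStream open: 'objects.dump' mode: FileStream read.
copy := ObjectDumper loadFrom: stream.
stream close.

(copy = original) printNl.
(copy == original) printNl.
```

The loaded object is expected to be equal to the original but not identical to it, since loading rebuilds the structure from the binary representation.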
The DLD class enhances the C callout mechanism to automatically look
for unresolved functions in a series of program-specified libraries. To
add a library to the list, evaluate code like the following:
DLD addLibrary: 'libc'
The extension (.so, .sl, .a, .dll depending on your operating system) will be added automatically. You are advised not to specify it for portability reasons.
You will then be able to use the standard C call-out mechanisms to define all the functions in the C run-time library. Note that this is a potential security problem (especially if your program is SUID root under Unix), so you might want to disable dynamic loading when using gnu Smalltalk as an extension language. To disable dynamic loading, configure gnu Smalltalk passing the --disable-dld switch.
Note that a DLD class will be present even if dynamic loading is
disabled (either because your system is not supported, or by the
--disable-dld configure switch) but any attempt to perform
dynamic linking will result in an error.
gnu Smalltalk includes an automatic documentation generator invoked via the
gst-doc command. The code is actually part of the
ClassPublisher package, and gst-doc takes care
of reading the code to be documented and firing a ClassPublisher.
Currently, gst-doc can only generate output in Texinfo format, though this will change in future releases.
gst-doc can document code that is already in the image, or it can load external files and packages. Note that the latter approach will not work for files and packages that programmatically create code or file in other files/packages.
gst-doc is invoked as follows:
gst-doc [ flag ... ] class ...
The following options are supported:
class is either a class name, or a namespace name followed by
.*. Documentation will be written for classes that are specified
in the command line. class can be omitted if a -f or
-p option is given. In this case, documentation will be
written for all the classes in the package.
gnu Smalltalk provides methods to query its own internal data structures.
You may determine the real memory address of an object or the real
memory address of the OOP table that points to a given object, by
using messages to the Memory class, described below.
Returns the index of the OOP for anObject. This index is immune from garbage collection and is the same value used by default as a hash value for anObject (it is returned by Object's implementation of hash and identityHash).
Converts the given OOP index (not address) back to an object. Fails if no object is associated to the given index.
Converts the given OOP index (not address) back to an object. Returns nil if no object is associated to the given index.
Other methods in ByteArray and Memory allow reading various C types
(doubleAt:, ucharAt:, etc.). For examples of using
asOop and asObject, look at the Blox source code in
blox-tk/BloxBasic.st.
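The round trip between an object and its OOP index can be sketched as follows. It uses only the asOop and asObject messages named above; the variable names are illustrative.

```smalltalk
| anObject index |
anObject := 'some string' copy.

"Ask for the OOP index; it stays valid even if the object body moves."
index := anObject asOop.
index printNl.

"Convert the index back and check that the very same object is answered."
(index asObject == anObject) printNl.
```

This is the pattern that lets foreign code hold on to a plain integer as a handle and hand it back to Smalltalk later, provided the object is kept alive in the meantime.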
Another interesting class is ObjectMemory. This provides a few methods that enable one to tune the virtual machine's usage of memory; many methods that in the past were instance methods of Smalltalk or class methods of Memory are now class methods of ObjectMemory. In addition, and that's what the rest of this section is about, the virtual machine signals events to its dependents exactly through this class.
The events that can be received are
ObjectMemory quit was sent or because the specified files were
all filed in. Exiting from within this event might cause an infinite
loop, so be careful.
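Receiving these events relies on the ordinary dependency mechanism. The sketch below is an illustration under assumptions: QuitLogger is a hypothetical class, and the event symbol #aboutToQuit is taken from the class reference, since the event names are not legible in the list above.

```smalltalk
Object subclass: QuitLogger [
    update: aspect [
        "ObjectMemory passes the event name as the aspect symbol."
        aspect == #aboutToQuit
            ifTrue: [stdout nextPutAll: 'image is about to quit'; nl; flush]
    ]
]

"Register an instance as a dependent of the ObjectMemory class."
ObjectMemory addDependent: QuitLogger new.
```

The handler deliberately does nothing more than print: as noted above, exiting from within this event might cause an infinite loop.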
The gnu Smalltalk virtual machine is equipped with a garbage collector, a facility that reclaims the space occupied by objects that are no longer accessible from the system roots. The collector is composed of several parts, each of which can be invoked by the virtual machine using various tunable strategies, or invoked manually by the programmer.
These parts include a generation scavenger, a mark & sweep collector with an incremental sweep phase, and a compactor. Each of these facilities works on a different memory space and differs from the others in its scope, speed and disadvantages (which are hopefully balanced by the availability of different algorithms). What follows is a description of these algorithms and of the memory spaces they work in.
NewSpace is the memory space where young objects live. It is composed of three sub-spaces: an object-creation space (Eden) and two SurvivorSpaces. When an object is first created, it is placed in Eden. When Eden starts to fill up (i.e., when the number of used bytes in Eden exceeds the scavenge threshold), objects that are housed in Eden or in the occupied SurvivorSpace and that are still reachable from the system roots are copied to the unoccupied SurvivorSpace. As an object survives different scavenging passes, it will be shuffled by the scavenger from the occupied SurvivorSpace to the unoccupied one. When the number of used bytes in SurvivorSpace is high enough that the scavenge pause might be excessively long, the scavenger will move some of the older surviving objects from NewSpace to OldSpace. In the garbage collection jargon, we say that such objects are being tenured to OldSpace.
This garbage collection algorithm is designed to reclaim short-lived objects, that is those objects that expire while residing in NewSpace, and to decide when enough data is residing in NewSpace that it is useful to move some of it to OldSpace. A copying garbage collector is particularly efficient in an object population whose members are more likely to die than survive, because this kind of scavenger spends most of its time copying survivors, who will be few in number in such populations, rather than tracing corpses, who will be many in number. This fact makes copying collection especially well suited to NewSpace, where 90% or more of the objects often fail to survive a single scavenge.
The particular structure of NewSpace has many advantages. On one hand, having a large Eden and two small SurvivorSpaces has a smaller memory footprint than having two equally big semi-spaces and allocating new objects directly from the occupied one (by default, gnu Smalltalk uses 420=300+60*2 kilobytes of memory, while a simpler configuration would use 720=360*2 kilobytes). On the other hand, it makes tenuring decisions particularly simple: the copying order is such that short-lived objects tend to be copied last, while objects that are being referred from OldSpace tend to be copied first: this is because the tenuring strategy of the scavenger is simply to treat the destination SurvivorSpace as a circular buffer, tenuring objects with a First-In-First-Out policy.
An object might become part of the scavenger root set for several reasons: objects that have been tenured are roots if their data lives in an OldSpace page that has been written to since the last scavenge (more on this later), plus all objects can be roots if they are known to be referenced from C code or from the Smalltalk stacks.
In turn, some of the old objects can be made to live in a special
area, called FixedSpace. Objects that reside in FixedSpace are
special in that their body is guaranteed to remain at a fixed address
(in general, gnu Smalltalk only ensures that the header of the object remains
at a fixed address in the Object Table). Because the garbage
collector can and does move objects, passing objects to foreign code
which uses the object's address as a fixed key, or which uses a
ByteArray as a buffer, presents difficulties. One can use
CObject to manipulate C data on the malloc heap, which
indeed does not move, but this can be tedious and requires the same
attention to memory leaks as coding in C. FixedSpace provides
a much more convenient mechanism: once an object is deemed fixed, the
object's body will never move throughout its lifetime; the space it
occupies will however still be returned automatically to the
FixedSpace pool when the object is garbage collected. Note that
because objects in FixedSpace cannot move, FixedSpace cannot be
compacted and can therefore suffer from extensive fragmentation. For
this reason, FixedSpace should be used carefully. FixedSpace however
is rebuilt (of course) every time an image is brought up, so a kind of
compaction of FixedSpace can be achieved by saving a snapshot,
quitting, and then restarting the newly saved image.
Memory for OldSpace and FixedSpace is allocated using a variation of
the system allocator malloc: in fact, gnu Smalltalk uses the same
allocator for its own internal needs, for OldSpace and for FixedSpace,
but it ensures that a given memory page never hosts objects that
reside in separate spaces. New pages are mapped into the address
space as needed and devoted to OldSpace or FixedSpace segments;
similarly, when unused they may be subsequently unmapped, or they
might be left in place waiting to be reused by malloc or
by another Smalltalk data space.
Garbage that is created among old objects is taken care of by a mark & sweep collector which, unlike the scavenger, which only reclaims objects in NewSpace, can only reclaim objects in OldSpace. Note that as objects are allocated, they will not only use the space that was previously occupied in Eden by objects that have survived, but they will also reuse the entries in the global Object Table that have been freed by objects that the scavenger could reclaim. This quest for free object table entries can be combined with the sweep phase of the OldSpace collector, which can then be done incrementally, limiting the disruptive part of OldSpace garbage collection to the mark phase.
Several runs of the mark & sweep collector can lead to fragmentation (where objects are allocated from several pages, and then become garbage in an order such that a bunch of objects remain in each page and the system is not able to recycle them). For this reason, the system periodically tries to compact OldSpace. It does so simply by looping through every old object and copying it into a new OldSpace. Since the OldSpace allocator does not suffer from fragmentation until objects start to be freed nor after all objects are freed, at the end of the copy all the pages in the fragmented OldSpace will have been returned to the system (some of them might already have been used by the compacted OldSpace), and the new, compacted OldSpace is ready to be used as the system oldspace. Growing the object heap (which is done when it is found to be quite full even after a mark & sweep collection) automatically triggers a compaction.
You can run the compactor without marking live objects. Since the amount of garbage in OldSpace is usually quite limited, the overhead incurred by copying potentially dead objects is small enough that the compactor still runs considerably faster than a full garbage collection, and can still give the application some breathing room.
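The collector parts described above can also be invoked by hand. The following sketch assumes the class-side messages scavenge, globalGarbageCollect and compact of ObjectMemory as listed in the class reference; exactly which phases each one runs is not specified here and should be checked there.

```smalltalk
"Minor collection: scavenge the young objects in NewSpace."
ObjectMemory scavenge.

"Full collection, which also covers the old objects."
ObjectMemory globalGarbageCollect.

"Ask for a compaction of OldSpace to fight fragmentation."
ObjectMemory compact.
```

Manual invocation is mostly useful at known quiet points of an application, for instance before saving a snapshot or before entering a latency-sensitive phase.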
Keeping OldSpace and FixedSpace in the same heap would then make
compaction of OldSpace (whereby it is rebuilt from time to time in
order to limit fragmentation) much less effective. Also, the
malloc heap is not used for FixedSpace objects because gnu Smalltalk
needs to track writes to OldSpace and FixedSpace in order to support
efficient scavenging of young objects.
To do so, the grey page table contains one entry for each page in OldSpace or FixedSpace that is thought to contain at least one reference to an object housed in NewSpace. Every page in OldSpace is created as grey, and is considered grey until a scavenging pass finds out that it actually does not contain pointers to NewSpace. Then the page is recolored black, and will stay black until it is written to or another object is allocated in it (either a new fixed object, or a young object being tenured). The grey page table is expanded and shrunk as needed by the virtual machine.
Drawing a histogram of object sizes shows that there are only a few sources of large objects on average (i.e., objects greater than a page in size), but that enough of these objects are created dynamically that they must be handled specially. Such objects should not be allocated in NewSpace along with ordinary objects, since they would fill up NewSpace prematurely (or might not even fit in it), thus accelerating the scavenging rate, reducing performance and resulting in an increase in tenured garbage. Even though this is not an optimal solution because it effectively tenures these objects at the time they are created, a benefit can be obtained by allocating these objects directly in FixedSpace. FixedSpace is used because these objects are big enough that they don't result in fragmentation; and using FixedSpace instead of OldSpace keeps the compactor from copying them, which would not provide any benefit in terms of reduced fragmentation.
Smalltalk activation records are allocated from another special heap,
the context pool. This is because it is often the case that they
can be deallocated in a Last-In-First-Out (stack) fashion, thereby
saving the work needed to allocate entries in the object table for them,
and quickly reusing the memory that they use. When the activation record
is accessed by Smalltalk, however, the activation record must be turned
into a first-class OOP. Since even these objects are usually very
short-lived, the data is however not copied to the Eden: the eviction
of the object bodies from the context pool is delayed to the next
scavenging, which will also empty the context pool just like it
empties Eden. If few objects are allocated and the context pool turns
full before the Eden, a scavenging is also triggered; this is however
quite rare.
Optionally, gnu Smalltalk can avoid the overhead of interpretation by
executing a given Smalltalk method only after that method has been
compiled into the underlying microprocessor's machine code. This
machine-code generation is performed automatically, and the resulting
machine code is then placed in malloc-managed memory. Once
executed, a method's machine code is left there for subsequent
execution. However, since it would require way too much memory to
permanently house the machine-code version of every Smalltalk method,
methods might be compiled more than once: when a translation is not
used at the time that two garbage collection actions are taken
(scavenges and global garbage collections count equally), the
incremental sweeper discards it, so that it will be recomputed if and
when necessary.
A few methods in Object support the creation of particular objects. These include:
Marks the object so that it is considered weak in subsequent garbage collection passes. The garbage collector will consider dead an object which has references only inside weak objects, and will replace references to such an “almost-dead” object with nils, and then send the mourn message to the object.
Marks the object so that it is considered specially in subsequent garbage collection passes. Ephemeron objects are sent the message mourn when the first instance variable is not referenced or is referenced only through another instance variable in the ephemeron. Ephemerons provide a very versatile base on which complex interactions with the garbage collector can be programmed (for example, finalization, which is described below, is implemented with ephemerons).
Marks the object so that, as soon as it becomes unreferenced, its finalize method is called. Before finalize is called, the VM implicitly removes the object from the list of finalizable ones. If necessary, the finalize method can mark the object as finalizable again, but by default finalization will only occur once.
Note that a finalizable object is kept in memory even when it has no references, because tricky finalizers might “resuscitate” the object; automatic marking of the object as not to be finalized has the nice side effect that the VM can simply delay the releasing of the memory associated to the object, instead of being forced to waste memory even after finalization happens.
An object must be explicitly marked as to be finalized every time the image is loaded; that is, finalizability is not preserved by an image save. This was done because in most cases finalization is used together with CObjects that would be stale when the image is loaded again, causing a segmentation violation as soon as they are accessed by the finalization method.
Removes the to-be-finalized mark from the object. As noted above, the finalize code for the object does not have to do this explicitly.
This method is called by the VM when there are no more references to the object (or, of course, if it only has references inside weak objects).
This method answers whether the VM will refuse to make changes to the object when methods like become:, basicAt:put:, and possibly at:put: too (depending on the implementation of the method) are sent. Note that gnu Smalltalk won't try to intercept assignments to fixed instance variables, nor assignments via instVarAt:put:. Many objects (Characters, nil, true, false, method literals) are read-only by default.
Changes the read-only or read-write status of the receiver to that indicated by aBoolean.
Same as #basicNew, but the object won't move across garbage collections.
Same as #basicNew:, but the object won't move across garbage collections.
Ensure that the receiver won't move across garbage collections. This can be used either if you decide after its creation that an object must be fixed, or if a class does not support using #new or #new: to create an object.
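The facilities in this list can be combined as in the sketch below. The selectors addToBeFinalized, makeFixed, makeReadOnly: and isReadOnly are assumptions taken from the class reference (the method names are not legible in the list above), and TempResource is a hypothetical class. When the finalizer actually runs depends on the collector, so no particular output order is promised.

```smalltalk
Object subclass: TempResource [
    finalize [
        "Called by the VM once the object has become unreferenced."
        stdout nextPutAll: 'TempResource finalized'; nl
    ]
]

| buffer |
"Request finalization for a throwaway instance, then force a full collection."
TempResource new addToBeFinalized.
ObjectMemory globalGarbageCollect.

"Pin a buffer so that its body keeps a fixed address, then protect it."
buffer := ByteArray new: 16.
buffer makeFixed.
buffer makeReadOnly: true.
buffer isReadOnly printNl.
```

Fixing an object after creation, as done here, is the route to take when the class does not offer a dedicated fixed-space constructor.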
Note that, although particular applications will indeed have a need for
fixed, read-only or finalizable objects, the #makeWeak primitive
is seldom needed and weak objects are normally used only indirectly,
through the so-called weak collections. These are easier to use
because they provide additional functionality (for example, WeakArray
is able to determine whether an item has been garbage collected, and
WeakSet implements hash table functionality); they are:
WeakArray
WeakSet
WeakKeyDictionary
WeakValueLookupTable
WeakIdentitySet
WeakKeyIdentityDictionary
WeakValueIdentityDictionary
Versions of gnu Smalltalk preceding 2.1 included a WeakKeyLookupTable class
which has been replaced by WeakKeyDictionary; the usage is completely
identical, but the implementation was changed to use a more efficient
approach based on ephemeron objects.
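A weak collection can be observed losing an element with a sketch like the following. It assumes that WeakArray answers nil for a slot whose object has been reclaimed, and that a forced full collection is enough to reclaim the otherwise unreferenced object; neither the timing nor the result is guaranteed in every situation.

```smalltalk
| cache |
cache := WeakArray new: 1.

"The new object is referenced only from the weak array."
cache at: 1 put: Object new.
(cache at: 1) printNl.

"After a full collection the slot is expected to have been cleared."
ObjectMemory globalGarbageCollect.
(cache at: 1) printNl.
```

This is the behaviour that makes weak collections suitable for caches: holding an object in the cache does not by itself keep the object alive.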
gnu Smalltalk includes a packaging system which allows one to file in components (often called goodies in Smalltalk's very folkloristic terminology) without worrying about whether they need other goodies to be loaded first.
The packaging system is implemented by a Smalltalk class,
PackageLoader, which looks for information about packages in
the XML file named (guess what) packages.xml, in one of three
places:
There are two ways to load something using the packaging system. The
first way is to use the PackageLoader's fileInPackage: and
fileInPackages: methods. For example:
PackageLoader fileInPackages: #('Blox' 'Browser').
PackageLoader fileInPackage: 'Compiler'.
The second way is to use the gst-load script which is installed together with the virtual machine. For example, you can do:
gst-load Browser Blox Compiler
and gnu Smalltalk will automatically file in:
Then it will save the Smalltalk image, and finally exit.
gst-load supports several options:
gst-load
won't exit.
To provide support for this system, you have to distribute with your gnu Smalltalk goodies a small file (usually called package.xml) which looks like this:
<packages>
  <package>
    <name>BloxGTK</name>
    <namespace>BLOX</namespace>
    <directory>blox-gtk</directory>

    <!-- The prereq tag identifies packages that
         must be loaded before this one. -->
    <prereq>GTK</prereq>

    <!-- The provides tag identifies packages that
         need not be loaded once this one is. -->
    <provides>BLOX</provides>

    <!-- The filein tag identifies files that
         compose this package and that should be loaded in the
         image in this order. -->
    <filein>BloxBasic.st</filein>
    <filein>BloxWidgets.st</filein>
    <filein>BloxText.st</filein>
    <filein>BloxExtend.st</filein>
    <filein>Blox.st</filein>

    <!-- The file tag identifies files that
         compose this package's distribution. -->
    <file>Blox.st</file>
    <file>BloxBasic.st</file>
    <file>BloxWidgets.st</file>
    <file>BloxText.st</file>
    <file>BloxExtend.st</file>
  </package>
</packages>
Other tags exist:
module
gst_initModule function in it. Modules can register functions so that Smalltalk
code can call them, and can interact with or manipulate Smalltalk
objects. The TCP package uses a module to provide a bridge
to the socket functions.
library
GTK package registers the GTK+ library in this way, so that the
bindings can use them.
callout
sunit
SUnit among the prerequisites.
start
%1 is replaced
with either nil or a String literal.
stop
%1 is replaced
with either nil or a String literal.
test
file, filein and sunit) but not name.
The SUnit package is implicitly made a prerequisite
of the testing subpackage, and the default value of directory and
namespace is the one given for the outer package.
To install your package, you only have to do
gst-package path/to/package.xml
gst-package is a Smalltalk script which will create
a .star archive in the current image directory, with the
files specified in the file tags. By default the package is
placed in the system-wide package directory; you can use the option
--target-directory to create the .star file elsewhere.
Alternatively, gst-package can be used to create a skeleton
GNU style source tree. This includes a configure.ac that will
find the installation path of GNU Smalltalk, and a Makefile.am
to support all the standard Makefile targets (including make
install and make dist). To do so, go in the directory that
is to become the top of the source tree and type:
gst-package --prepare path1/package.xml path2/package.xml
In this case the generated configure script and Makefile will use more features of gst-package, which are yet to be documented. The gnu Smalltalk makefile similarly uses gst-package to install packages and to prepare the distribution tarballs.
The rest of this chapter discusses some of the packages provided with gnu Smalltalk.
Blox is a GUI building block tool kit. It is an abstraction on top of a platform's native GUI toolkit that is common across all platforms. Writing to the Blox interface means your GUI-based application will be portable to any platform where Blox is supported.
The Blox classes, which reside in the BLOX namespace and are
fully documented in Graphical users interfaces with BLOX, act as wrappers around other toolkits,
which constitute the required portability layer; currently the only one
supported is Tcl/Tk but alternative versions of Blox, for example based
on Gtk+ and GNOME, have been considered and might even replace Tcl/Tk
in the future. Instead of
having to rewrite widgets and support for each platform, Blox simply
asks the other toolkit to do so (currently, it hands valid Tcl code
to a standard Tcl 8.0 environment); the abstraction from the operating
system being used is then extracted out of gnu Smalltalk.
Together with the toolkit, there is a browsing system in the browser directory that will allow the programmer to view the source code for existing classes, to modify existing classes and methods, to get detailed information about the classes and methods, and to evaluate code within the browser. In addition, some simple debugging tools are provided. An Inspector window allows the programmer to graphically inspect and modify the representation of an object and a walkback inspector was designed which will display a backtrace when the program encounters an error.
The Transcript global object is redirected to print to the transcript window instead of printing to stdout, and the transcript window as well as the workspaces, unlike the console read-eval-print loop, support variables that live across multiple evaluations:
a := 2 "Do-it"
a + 2 "Print-it: 4 will be shown"
This browser evolved from an Xt-based version written by Brad Diller (bdiller@docent.com) around 1993. His initial implementation used parts of ParcPlace's Model-View-Controller (MVC) message interface, which raised legal concerns about possible copyright infringement; he and Richard Stallman therefore devised a new window update scheme which is more flexible and powerful than MVC's dependency mechanism, and which allowed him to purge all the MVC elements from the implementation.
The code was then further improved to employ a better class design (for example, Brad used Dictionaries for classes still to be fleshed out), to be aesthetically more appealing (taking advantage of the new Blox text widget, the code browsers were enhanced with syntax highlighting), and to be more complete (adding multiple “views” to the inspector, namespace support and a complete debugger).
To start the browser you can simply type:
gst-blox
This will load any requested packages, then, if all goes well, a worksheet window with a menu named Smalltalk will appear in the top-left corner of the screen.
The Smalltalk-in-Smalltalk library is a set of classes for looking at Smalltalk code, constructing models of Smalltalk classes that can later be created for real, analyzing and performing changes to the image, finding smelly code and automatically doing repetitive changes. This package greatly enhances the reflective capabilities of Smalltalk.
A fundamental part of the system is the recursive-descent parser which
creates parse nodes in the form of instances of subclasses of
RBProgramNode.
The parser's extreme flexibility can be exploited in three ways, all of which are demonstrated by source code available in the distribution:
RBParser that
can be overridden in different RBParser subclasses. This is done
by the compiler itself, in which a subclass of RBParser (class
STFileInParser) hands the parse trees to the STCompiler
class.
RBFormatter, by the syntax highlighting engine included
with the browser, and by the compiler.
In addition, two applications were created on top of this library which are specific to gnu Smalltalk. The first is a compiler for Smalltalk methods written in Smalltalk itself, whose source code provides good insights into the gnu Smalltalk virtual machine.
The second is the automatic documentation extractor, contained in two
files, packages/stinst/compiler/STLoader.st and
packages/stinst/compiler/STLoaderObjs.st. To be able to create
Texinfo files even if the library cannot be loaded (for example,
BLOX requires a running X server) Smalltalk source code is
interpreted and objects for the classes and methods being read in are
created; then, polymorphism allows one to treat these exactly like usual
classes which can be fed to gnu Smalltalk's ClassPublisher (found in
packages/stinst/doc/Publish.st).
gnu Smalltalk includes support for connecting to databases. Currently this support is limited to retrieving result sets from SQL selection queries and executing SQL data manipulation queries; in the future however a full object model will be available that hides the usage of SQL.
Classes that are independent of the database management system that is
in use reside in package DBI, while the drivers proper reside
in separate packages which have DBI as a prerequisite; currently,
drivers are supplied for MySQL and PostgreSQL, in packages
DBD-MySQL and DBD-PostgreSQL respectively.
Using the library is fairly simple. To execute a query you need to
create a connection to the database, create a statement on the connection,
and execute your query. For example, let's say I want to connect to the
test database on the localhost. My user name is doe and
my password is mypass.
| connection statement result |
connection := DBI.Connection
    connect: 'dbi:MySQL:dbname=test:host=localhost'
    user: 'doe'
    password: 'mypass'.
You can see that the DBMS-specific classes live in a sub-namespace
of DBI, while DBMS-independent classes live in DBI.
Here is how I execute a query.
statement := connection execute: 'insert into aTable (aField) values (123)'.
The result that is returned is a ResultSet. For write queries
the object returns the number of rows affected. For read queries (such
as selection queries) the result set supports the standard stream protocol
(next and atEnd, to read rows off the result stream) and
can also supply a collection of column information objects. These are
instances of ColumnInfo and describe the type, size, and
other characteristics of each returned column.
A common usage of a ResultSet would be:
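| resultSet values |
resultSet := connection execute: 'select columnName from aTable'.  "hypothetical query"
values := OrderedCollection new.
[resultSet atEnd] whileFalse: [values add: (resultSet next at: 'columnName')].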
Different countries and cultures have varying conventions for how to
communicate. These conventions range from very simple ones, such as the
format for representing dates and times, to very complex ones, such as
the language spoken. Provided the programs are written to obey the
choice of conventions, they will follow the conventions preferred by the
user. gnu Smalltalk provides two packages to help you do so.
The I18N package covers both internationalization and
multilingualization; the lighter-weight Iconv package
covers only the latter, as it is a prerequisite for correct
internationalization.
Multilingualizing software means programming it to be able to
support languages from every part of the world. In particular, it
includes understanding multi-byte character sets (such as UTF-8)
and Unicode characters whose code point (the equivalent of the
ASCII value) is above 127. To this end, gnu Smalltalk provides the
UnicodeString class that stores its data as 32-bit Unicode
values. In addition, Character will provide support for
all of the more than one million code points available in Unicode.
Loading the I18N package improves this support through
the EncodedStream class, which interprets and transcodes
non-ASCII Unicode characters. This support is mostly transparent,
because the base classes Character, UnicodeCharacter
and UnicodeString are enhanced to use it. Sending asString
or printString to an instance of Character or
UnicodeString will convert Unicode characters so that they
are printed correctly in the current locale. For example,
`$<279> printNl' will print a small Latin letter `e' with
a dot above, when the I18N package is loaded.
Dually, you can convert String or ByteArray objects to
Unicode with a single method call. If the current locale's encoding is
UTF-8, `#[196 151] asUnicodeString' will return a Unicode string
with the same character as above, the small Latin letter `e' with
a dot above.
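The two conversions described above can be tried together in a short session. This is only a sketch: it assumes that the I18N package is installed and that the current locale's encoding is UTF-8, and the variable name fromBytes is purely illustrative.

```smalltalk
"Sketch only: assumes the I18N package is installed and the current
 locale's encoding is UTF-8."
| fromBytes |
PackageLoader fileInPackage: 'I18N'.

"From Unicode to the locale's encoding: print the character whose
 code point is 279."
$<279> printNl.

"From the locale's encoding to Unicode: the two bytes are the UTF-8
 form of the same character."
fromBytes := #[196 151] asUnicodeString.
fromBytes printNl.
```

If the locale uses a different encoding, the byte sequence in the second half would have to be changed accordingly.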
The implementation of multilingualization support is not yet
complete. For example, methods such as asLowercase,
asUppercase, isLetter do not yet recognize Unicode
characters.
You need to exercise some care, or your program will be buggy when
Unicode characters are used. In particular, Characters must
not be compared with == and should
be printed on a Stream with display: rather than
nextPut:.
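A minimal sketch of these two precautions follows. It assumes the I18N package has been loaded; the variable names a, b and stream are illustrative only, and the exact text that ends up on the stream depends on the current locale.

```smalltalk
"Sketch only: assumes the I18N package has been loaded."
| a b stream |
a := Character codePoint: 279.
b := Character codePoint: 279.

"Compare characters by value with =, not by identity with ==."
(a = b) printNl.

"Put the character on a stream with display: rather than nextPut:,
 so that it can be converted for the current locale."
stream := WriteStream on: String new.
stream display: a.
stream contents displayNl.
```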
Also, Characters need to be created with
the class method codePoint: if you are referring to their
Unicode value; codePoint: is also the only method to create
characters that is accepted by the ANSI Standard for Smalltalk.
The method value:, instead, should be used if you are referring
to a byte in a particular encoding. This subtle difference means
that, for example, the last two of the following examples will fail:
"Correct. Use #value: with Strings, #codePoint: with UnicodeString."
String with: (Character value: 65)
String with: (Character value: 128)
UnicodeString with: (Character codePoint: 65)
UnicodeString with: (Character codePoint: 128)
"Correct. Only works for characters in the 0-127 range, which may
be considered as defensive programming."
String with: (Character codePoint: 65)
"Dubious, and only works for characters in the 0-127 range. With
UnicodeString, probably you always want #codePoint:."
UnicodeString with: (Character value: 65)
"Fails, we try to use a high character in a String"
String with: (Character codePoint: 128)
"Fails, we try to use an encoding in a Unicode string"
UnicodeString with: (Character value: 128)
Internationalizing software, instead, means programming it to be able to adapt to the user's favorite conventions. These conventions can get pretty complex; for example, the user might specify the locale `espana-castellano' for most purposes, but specify the locale `usa-english' for currency formatting: this might make sense if the user is a Spanish-speaking American, working in Spanish, but representing monetary amounts in US dollars. You can see that this system is simple but, at the same time, very complete. This manual, however, is not the right place for a thorough discussion of how a user would set up their system for these conventions; for more information, refer to your operating system's manual or to the gnu C library's manual.
gnu Smalltalk inherits from iso C the concept of a locale, that is, a
collection of conventions, one convention for each purpose. It maps each of
these purposes to a Smalltalk class defined by the I18N package;
these classes form a small hierarchy with class Locale as its root:
LcNumeric formats numbers; LcMonetary and LcMonetaryISO
format currency amounts.
LcTime formats dates and times.
LcMessages translates your program's output. Of course, the
package can't automatically translate your program's output messages
into other languages; the only way you can support output in the user's
favorite language is to translate these messages by hand. The package
does, though, provide methods to easily handle translations into
multiple languages.
Basic usage of the I18N package involves a single selector, the
question mark (?), which is a rarely used yet valid character for
a Smalltalk binary message. The meaning of the question mark selector
is “Hey, how do you say ... under your convention?”. You can send
? to either a specific instance of a subclass of Locale,
or to the class itself; in the latter case, rules for the default locale
(which is specified via environment variables) apply. You might say,
for example, LcTime ? Date today or
germanMonetaryLocale ? account balance. This syntax can be
confusing at first, but turns out to be convenient because of its
consistency and overall simplicity.
Here is how ? works for different classes:
Answer an LcMessagesDomain that retrieves translations from the specified file.
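As a hedged illustration of the ? selector, the following sketch formats today's date under the default locale. It assumes that the I18N package is installed and that its classes are reached through the I18N namespace, which this section does not state; the variable name formatted is illustrative.

```smalltalk
"Sketch only: assumes the I18N package is installed and that LcTime
 lives in the I18N namespace."
| formatted |
PackageLoader fileInPackage: 'I18N'.

"Ask the default time locale how today's date is written under its
 conventions."
formatted := I18N.LcTime ? Date today.
formatted displayNl.
```

Sending ? to a specific Locale instance instead of the class would apply that locale's conventions rather than the default ones.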
These two packages provide much more functionality, including more advanced formatting options, support for Unicode, and conversion to and from several character sets. For more information, refer to Multilingual and international support with Iconv and I18N.
As an aside, the representation of locales that the package uses is exactly the same as the C library's, which has many advantages: the burden of maintaining locale data is removed from gnu Smalltalk's maintainers; the need to keep two copies of the same data is removed from gnu Smalltalk's users; and finally, uniformity of the conventions assumed by different internationalized programs is guaranteed to the end user.
In addition, the representation of translated strings is the standard
mo file format adopted by the gnu gettext library.
Seaside is a framework to build highly interactive web applications quickly, reusably and maintainably. Features of Seaside include callback-based request handling, hierarchical (component-based) page design, and modal session management to easily implement complex workflows.
A simple Seaside component looks like this:
Seaside.WAComponent subclass: MyCounter [
    | count |
    MyCounter class >> canBeRoot [ ^true ]
    initialize [
        super initialize.
        count := 0.
    ]
    states [ ^{ self } ]
    renderContentOn: html [
        html heading: count.
        html anchor callback: [ count := count + 1 ]; with: '++'.
        html space.
        html anchor callback: [ count := count - 1 ]; with: '--'.
    ]
]
MyCounter registerAsApplication: 'mycounter'