
6.9.2 Playing with Arrays

Imagine that you need an array, but with one twist: when an index is out of bounds, it should return nil instead of signaling an error. You could modify the Smalltalk implementation of Array, but that might break some code in the image, so it is not practical. Why not add a subclass?

        "We could subclass from Array, but that class is specifically
         optimized by the VM (which assumes, among other things, that
         it does not have any instance variables).  So we use its
         abstract superclass instead.  The discussion below holds
         equally well."
        ArrayedCollection subclass: NiledArray [
            <shape: #pointer>
            boundsCheck: index [
                ^(index < 1) | (index > (self basicSize))
            ]
            at: index [
                ^(self boundsCheck: index)
                    ifTrue: [ nil ]
                    ifFalse: [ super at: index ]
            ]
            at: index put: val [
                ^(self boundsCheck: index)
                    ifTrue: [ val ]
                    ifFalse: [ super at: index put: val ]
            ]
        ]

Much of the machinery of adding a class should be familiar. We see another declaration like comment:, namely the shape: declaration. This sets up NiledArray to have the same underlying structure as an Array object; we'll delay discussing this until the chapter on the nuts and bolts of arrays. In any case, we inherit all of the actual knowledge of how to create arrays, reference them, and so forth. All that we do is intercept the at: and at:put: messages, call our common method to validate the array index, and do something special if the index is not valid. The way that we coded the bounds check bears a little examination.

Making a first cut at coding the bounds check, you might have written it twice in NiledArray's methods (once for at:, and again for at:put:). As always, it's preferable to code things once, and then re-use them. So we instead add a method for bounds checking, boundsCheck:, and use it for both cases. If we ever wanted to enhance the bounds checking (perhaps emit an error if the index is < 1 and answer nil only for indices greater than the array size?), we would only have to change it in one place.
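As a rough sketch of that enhancement, the variant below replaces boundsCheck: so that an index below 1 reports an error, while an index past the end still makes at: answer nil. It assumes the NiledArray class shown earlier has already been loaded; the error text is our own invention, and error: is the generic error-reporting message that every object understands.

```smalltalk
NiledArray extend [
    boundsCheck: index [
        "Indices below 1 are treated as a programming error."
        index < 1 ifTrue: [ ^self error: 'index must be at least 1' ].
        "Answer true only when the index is past the last slot."
        ^index > self basicSize
    ]
]
```

Because at: and at:put: both go through boundsCheck:, neither of them needs to change.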

The actual math for calculating whether the bounds have been violated is a little interesting. The first part of the expression returned by the method:

        (index < 1) | (index > (self basicSize))

is (index < 1), which is true if the index is less than 1 and false otherwise. This part of the expression thus becomes the boolean object true or false. The boolean object then receives the message |, with the argument (index > (self basicSize)). | means “or”; we want to OR together the two possible out-of-range checks. What is the second part of the expression? [1]

index is our argument, an integer; it receives the message >, and thus will compare itself to the value self basicSize returns. While we haven't covered the underlying structures Smalltalk uses to build arrays, we can briefly say that the #basicSize message returns the number of elements the Array object can contain. So the index is checked to see if it's less than 1 (the lowest legal Array index) or greater than the highest allocated slot in the Array. If it is either (the | operator!), the expression is true, otherwise false.
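To make the two comparisons concrete, here is a small sketch that evaluates the same expression by hand against a plain five-element Array. The variable name a and the sample indices 0, 3 and 6 are chosen only for illustration.

```smalltalk
| a |
a := Array new: 5.
a basicSize printNl.                      "5"
((0 < 1) | (0 > a basicSize)) printNl.    "true: below the lowest index"
((3 < 1) | (3 > a basicSize)) printNl.    "false: within bounds"
((6 < 1) | (6 > a basicSize)) printNl.    "true: past the last slot"
```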

From there it's downhill; our boolean object, returned by boundsCheck:, receives the ifTrue:ifFalse: message along with two code blocks, and evaluates the appropriate one. Why do we have at:put: return val? Well, because that's what it's supposed to do: look at every implementor of at:put: and you'll find that it returns its second parameter. In general, the result is discarded; but one could write a program which uses it, so we'll write it this way anyway.
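A short session along the following lines exercises each branch. It is only a sketch: it assumes the NiledArray class from this section has been loaded, and the size 3 and the stored values are arbitrary.

```smalltalk
| a |
a := NiledArray new: 3.
(a at: 2 put: 42) printNl.    "42: at:put: answers the stored value"
(a at: 2) printNl.            "42"
(a at: 0) printNl.            "nil: index below 1"
(a at: 4) printNl.            "nil: index past the end"
(a at: 4 put: 99) printNl.    "99: nothing is stored, but val is still answered"
```

Note that the last line leaves the array untouched; the out-of-bounds store is silently dropped.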


[1] Smalltalk also offers an or: message, which is different in a subtle way from |. or: takes a code block, and only invokes the code block if it's necessary to determine the value of the expression. This is analogous to the guaranteed C semantic that || evaluates left-to-right only as far as needed. We could have written the expression as ((index < 1) or: [index > (self basicSize)]). Since we expect both sides of or: to be false most of the time, there isn't much reason to delay evaluation of either side in this case.
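The difference between the two messages can be observed with a side effect. In this sketch the variable ran is a hypothetical flag, used only to record whether the right-hand side was evaluated.

```smalltalk
| ran |
ran := false.
(true or: [ ran := true. false ]) printNl.    "true"
ran printNl.                                  "false: the block was never evaluated"

ran := false.
(true | (ran := true)) printNl.               "true"
ran printNl.                                  "true: the argument of | is always evaluated"
```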