Previous: Matrix Tutorial, Up: Vector/Matrix Tutorial

2.3.3 Vectors as Lists

Although Calc has a number of features for manipulating vectors and matrices as mathematical objects, you can also treat vectors as simple lists of values. For example, we saw that the k f command returns a vector which is a list of the prime factors of a number.

You can pack and unpack stack entries into vectors:

     3:  10         1:  [10, 20, 30]     3:  10
     2:  20             .                2:  20
     1:  30                              1:  30
         .                                   .
     
                        M-3 v p              v u

You can also build vectors out of consecutive integers, or out of many copies of a given value:

     1:  [1, 2, 3, 4]    2:  [1, 2, 3, 4]    2:  [1, 2, 3, 4]
         .               1:  17              1:  [17, 17, 17, 17]
                             .                   .
     
         v x 4 <RET>           17                  v b 4 <RET>

You can apply an operator to every element of a vector using the map command.

     1:  [17, 34, 51, 68]   1:  [289, 1156, 2601, 4624]  1:  [17, 34, 51, 68]
         .                      .                            .
     
         V M *                  2 V M ^                      V M Q

In the first step, we multiply the vector of integers by the vector of 17's elementwise. In the second step, we raise each element to the power two. (The general rule is that both operands must be vectors of the same length, or else one must be a vector and the other a plain number.) In the final step, we take the square root of each element.

(•) Exercise 1. Compute a vector of powers of two from 2^-4’ to ‘2^4’. See 1. (•)

You can also reduce a binary operator across a vector. For example, reducing ‘*’ computes the product of all the elements in the vector:

     1:  123123     1:  [3, 7, 11, 13, 41]      1:  123123
         .              .                           .
     
         123123         k f                         V R *

In this example, we decompose 123123 into its prime factors, then multiply those factors together again to yield the original number.

We could compute a dot product “by hand” using mapping and reduction:

     2:  [1, 2, 3]     1:  [7, 12, 0]     1:  19
     1:  [7, 6, 0]         .                  .
         .
     
         r 1 r 2           V M *              V R +

Recalling two vectors from the previous section, we compute the sum of pairwise products of the elements to get the same answer for the dot product as before.

A slight variant of vector reduction is the accumulate operation, V U. This produces a vector of the intermediate results from a corresponding reduction. Here we compute a table of factorials:

     1:  [1, 2, 3, 4, 5, 6]    1:  [1, 2, 6, 24, 120, 720]
         .                         .
     
         v x 6 <RET>                 V U *

Calc allows vectors to grow as large as you like, although it gets rather slow if vectors have more than about a hundred elements. Actually, most of the time is spent formatting these large vectors for display, not calculating on them. Try the following experiment (if your computer is very fast you may need to substitute a larger vector size).

     1:  [1, 2, 3, 4, ...      1:  [2, 3, 4, 5, ...
         .                         .
     
         v x 500 <RET>               1 V M +

Now press v . (the letter v, then a period) and try the experiment again. In v . mode, long vectors are displayed “abbreviated” like this:

     1:  [1, 2, 3, ..., 500]   1:  [2, 3, 4, ..., 501]
         .                         .
     
         v x 500 <RET>               1 V M +

(where now the ‘...’ is actually part of the Calc display). You will find both operations are now much faster. But notice that even in v . mode, the full vectors are still shown in the Trail. Type t . to cause the trail to abbreviate as well, and try the experiment one more time. Operations on long vectors are now quite fast! (But of course if you use t . you will lose the ability to get old vectors back using the t y command.)

An easy way to view a full vector when v . mode is active is to press ` (back-quote) to edit the vector; editing always works with the full, unabbreviated value.

As a larger example, let's try to fit a straight line to some data, using the method of least squares. (Calc has a built-in command for least-squares curve fitting, but we'll do it by hand here just to practice working with vectors.) Suppose we have the following list of values in a file we have loaded into Emacs:

       x        y
      ---      ---
      1.34    0.234
      1.41    0.298
      1.49    0.402
      1.56    0.412
      1.64    0.466
      1.73    0.473
      1.82    0.601
      1.91    0.519
      2.01    0.603
      2.11    0.637
      2.22    0.645
      2.33    0.705
      2.45    0.917
      2.58    1.009
      2.71    0.971
      2.85    1.062
      3.00    1.148
      3.15    1.157
      3.32    1.354

If you are reading this tutorial in printed form, you will find it easiest to press C-x * i to enter the on-line Info version of the manual and find this table there. (Press g, then type List Tutorial, to jump straight to this section.)

Position the cursor at the upper-left corner of this table, just to the left of the ‘1.34’. Press C-@ to set the mark. (On your system this may be C-2, C-<SPC>, or NUL.) Now position the cursor to the lower-right, just after the ‘1.354’. You have now defined this region as an Emacs “rectangle.” Still in the Info buffer, type C-x * r. This command (calc-grab-rectangle) will pop you back into the Calculator, with the contents of the rectangle you specified in the form of a matrix.

     1:  [ [ 1.34, 0.234 ]
           [ 1.41, 0.298 ]
           ...

(You may wish to use v . mode to abbreviate the display of this large matrix.)

We want to treat this as a pair of lists. The first step is to transpose this matrix into a pair of rows. Remember, a matrix is just a vector of vectors. So we can unpack the matrix into a pair of row vectors on the stack.

     1:  [ [ 1.34,  1.41,  1.49,  ... ]     2:  [1.34, 1.41, 1.49, ... ]
           [ 0.234, 0.298, 0.402, ... ] ]   1:  [0.234, 0.298, 0.402, ... ]
         .                                      .
     
         v t                                    v u

Let's store these in quick variables 1 and 2, respectively.

     1:  [1.34, 1.41, 1.49, ... ]        .
         .
     
         t 2                             t 1

(Recall that t 2 is a variant of s 2 that removes the stored value from the stack.)

In a least squares fit, the slope ‘m’ is given by the formula

     m = (N sum(x y) - sum(x) sum(y)) / (N sum(x^2) - sum(x)^2)

where sum(x)’ represents the sum of all the values of ‘x’. While there is an actual sum function in Calc, it's easier to sum a vector using a simple reduction. First, let's compute the four different sums that this formula uses.

     1:  41.63                 1:  98.0003
         .                         .
     
      r 1 V R +   t 3           r 1 2 V M ^ V R +   t 4

     1:  13.613                1:  33.36554
         .                         .
     
      r 2 V R +   t 5           r 1 r 2 V M * V R +   t 6

These are ‘sum(x)’, ‘sum(x^2)’, ‘sum(y)’, and ‘sum(x y)’, respectively. (We could have used * to compute ‘sum(x^2)’ and ‘sum(x y)’.)

Finally, we also need ‘N’, the number of data points. This is just the length of either of our lists.

     1:  19
         .
     
      r 1 v l   t 7

(That's v followed by a lower-case l.)

Now we grind through the formula:

     1:  633.94526  2:  633.94526  1:  67.23607
         .          1:  566.70919      .
                        .
     
      r 7 r 6 *      r 3 r 5 *         -

     2:  67.23607   3:  67.23607   2:  67.23607   1:  0.52141679
     1:  1862.0057  2:  1862.0057  1:  128.9488       .
         .          1:  1733.0569      .
                        .
     
      r 7 r 4 *      r 3 2 ^           -              /   t 8

That gives us the slope ‘m’. The y-intercept ‘b’ can now be found with the simple formula,

     b = (sum(y) - m sum(x)) / N
     1:  13.613     2:  13.613     1:  -8.09358   1:  -0.425978
         .          1:  21.70658       .              .
                        .
     
        r 5            r 8 r 3 *       -              r 7 /   t 9

Let's “plot” this straight line approximation, m x + b’, and compare it with the original data.

     1:  [0.699, 0.735, ... ]    1:  [0.273, 0.309, ... ]
         .                           .
     
         r 1 r 8 *                   r 9 +    s 0

Notice that multiplying a vector by a constant, and adding a constant to a vector, can be done without mapping commands since these are common operations from vector algebra. As far as Calc is concerned, we've just been doing geometry in 19-dimensional space!

We can subtract this vector from our original ‘y’ vector to get a feel for the error of our fit. Let's find the maximum error:

     1:  [0.0387, 0.0112, ... ]   1:  [0.0387, 0.0112, ... ]   1:  0.0897
         .                            .                            .
     
         r 2 -                        V M A                        V R X

First we compute a vector of differences, then we take the absolute values of these differences, then we reduce the max function across the vector. (The max function is on the two-key sequence f x; because it is so common to use max in a vector operation, the letters X and N are also accepted for max and min in this context. In general, you answer the V M or V R prompt with the actual key sequence that invokes the function you want. You could have typed V R f x or even V R x max <RET> if you had preferred.)

If your system has the GNUPLOT program, you can see graphs of your data and your straight line to see how well they match. (If you have GNUPLOT 3.0 or higher, the following instructions will work regardless of the kind of display you have. Some GNUPLOT 2.0, non-X-windows systems may require additional steps to view the graphs.)

Let's start by plotting the original data. Recall the “x” and “y” vectors onto the stack and press g f. This “fast” graphing command does everything you need to do for simple, straightforward plotting of data.

     2:  [1.34, 1.41, 1.49, ... ]
     1:  [0.234, 0.298, 0.402, ... ]
         .
     
         r 1 r 2    g f

If all goes well, you will shortly get a new window containing a graph of the data. (If not, contact your GNUPLOT or Calc installer to find out what went wrong.) In the X window system, this will be a separate graphics window. For other kinds of displays, the default is to display the graph in Emacs itself using rough character graphics. Press q when you are done viewing the character graphics.

Next, let's add the line we got from our least-squares fit.

     2:  [1.34, 1.41, 1.49, ... ]
     1:  [0.273, 0.309, 0.351, ... ]
         .
     
         <DEL> r 0    g a  g p

It's not very useful to get symbols to mark the data points on this second curve; you can type g S g p to remove them. Type g q when you are done to remove the X graphics window and terminate GNUPLOT.

(•) Exercise 2. An earlier exercise showed how to do least squares fitting to a general system of equations. Our 19 data points are really 19 equations of the form ‘y_i = m x_i + b’ for different pairs of ‘(x_i,y_i)’. Use the matrix-transpose method to solve for ‘m’ and ‘b’, duplicating the above result. See 2. (•)

(•) Exercise 3. If the input data do not form a rectangle, you can use C-x * g (calc-grab-region) to grab the data the way Emacs normally works with regions—it reads left-to-right, top-to-bottom, treating line breaks the same as spaces. Use this command to find the geometric mean of the following numbers. (The geometric mean is the nth root of the product of n numbers.)

     2.3  6  22  15.1  7
       15  14  7.5
       2.5

The C-x * g command accepts numbers separated by spaces or commas, with or without surrounding vector brackets. See 3. (•)

As another example, a theorem about binomial coefficients tells us that the alternating sum of binomial coefficients n-choose-0 minus n-choose-1 plus n-choose-2, and so on up to n-choose-n, always comes out to zero. Let's verify this for ‘n=6’.

     1:  [1, 2, 3, 4, 5, 6, 7]     1:  [0, 1, 2, 3, 4, 5, 6]
         .                             .
     
         v x 7 <RET>                     1 -

     1:  [1, -6, 15, -20, 15, -6, 1]          1:  0
         .                                        .
     
         V M ' (-1)^$ choose(6,$) <RET>             V R +

The V M ' command prompts you to enter any algebraic expression to define the function to map over the vector. The symbol ‘$’ inside this expression represents the argument to the function. The Calculator applies this formula to each element of the vector, substituting each element's value for the ‘$’ sign(s) in turn.

To define a two-argument function, use ‘$$’ for the first argument and ‘$’ for the second: V M ' $$-$ <RET> is equivalent to V M -. This is analogous to regular algebraic entry, where ‘$$’ would refer to the next-to-top stack entry and ‘$’ would refer to the top stack entry, and ' $$-$ <RET> would act exactly like -.

Notice that the V M ' command has recorded two things in the trail: The result, as usual, and also a funny-looking thing marked ‘oper’ that represents the operator function you typed in. The function is enclosed in ‘< >’ brackets, and the argument is denoted by a ‘#’ sign. If there were several arguments, they would be shown as ‘#1’, ‘#2’, and so on. (For example, V M ' $$-$ will put the function ‘<#1 - #2>’ on the trail.) This object is a “nameless function”; you can use nameless ‘< > notation to answer the V M ' prompt if you like. Nameless function notation has the interesting, occasionally useful property that a nameless function is not actually evaluated until it is used. For example, V M ' $+random(2.0) evaluates ‘random(2.0)’ once and adds that random number to all elements of the vector, but V M ' <#+random(2.0)> evaluates the ‘random(2.0)’ separately for each vector element.

Another group of operators that are often useful with V M are the relational operators: a =, for example, compares two numbers and gives the result 1 if they are equal, or 0 if not. Similarly, a < checks for one number being less than another.

Other useful vector operations include v v, to reverse a vector end-for-end; V S, to sort the elements of a vector into increasing order; and v r and v c, to extract one row or column of a matrix, or (in both cases) to extract one element of a plain vector. With a negative argument, v r and v c instead delete one row, column, or vector element.

(•) Exercise 4. The ‘k’th divisor function is the sum of the ‘k’th powers of all the divisors of an integer ‘n’. Figure out a method for computing the divisor function for reasonably small values of ‘n’. As a test, the 0th and 1st divisor functions of 30 are 8 and 72, respectively. See 4. (•)

(•) Exercise 5. The k f command produces a list of prime factors for a number. Sometimes it is important to know that a number is square-free, i.e., that no prime occurs more than once in its list of prime factors. Find a sequence of keystrokes to tell if a number is square-free; your method should leave 1 on the stack if it is, or 0 if it isn't. See 5. (•)

(•) Exercise 6. Build a list of lists that looks like the following diagram. (You may wish to use the v / command to enable multi-line display of vectors.)

     1:  [ [1],
           [1, 2],
           [1, 2, 3],
           [1, 2, 3, 4],
           [1, 2, 3, 4, 5],
           [1, 2, 3, 4, 5, 6] ]

See 6. (•)

(•) Exercise 7. Build the following list of lists.

     1:  [ [0],
           [1, 2],
           [3, 4, 5],
           [6, 7, 8, 9],
           [10, 11, 12, 13, 14],
           [15, 16, 17, 18, 19, 20] ]

See 7. (•)

(•) Exercise 8. Compute a list of values of Bessel's J1’ function ‘besJ(1,x)’ for ‘x’ from 0 to 5 in steps of 0.25. Find the value of ‘x’ (from among the above set of values) for which ‘besJ(1,x)’ is a maximum. Use an “automatic” method, i.e., just reading along the list by hand to find the largest value is not allowed! (There is an a X command which does this kind of thing automatically; see Numerical Solutions.) See 8. (•)

(•) Exercise 9. You are given an integer in the range 0 <= N < 10^m’ for ‘m=12’ (i.e., an integer of less than twelve digits). Convert this integer into a vector of ‘m’ digits, each in the range from 0 to 9. In vector-of-digits notation, add one to this integer to produce a vector of ‘m+1’ digits (since there could be a carry out of the most significant digit). Convert this vector back into a regular integer. A good integer to try is 25129925999. See 9. (•)

(•) Exercise 10. Your friend Joe tried to use V R a = to test if all numbers in a list were equal. What happened? How would you do this test? See 10. (•)

(•) Exercise 11. The area of a circle of radius one is ‘pi’. The area of the 2x2 square that encloses that circle is 4. So if we throw n darts at random points in the square, about ‘pi/4’ of them will land inside the circle. This gives us an entertaining way to estimate the value of ‘pi’. The k r command picks a random number between zero and the value on the stack. We could get a random floating-point number between -1 and 1 by typing 2.0 k r 1 -. Build a vector of 100 random ‘(x,y)’ points in this square, then use vector mapping and reduction to count how many points lie inside the unit circle. Hint: Use the v b command. See 11. (•)

(•) Exercise 12. The matchstick problem provides another way to calculate ‘pi’. Say you have an infinite field of vertical lines with a spacing of one inch. Toss a one-inch matchstick onto the field. The probability that the matchstick will land crossing a line turns out to be 2/pi’. Toss 100 matchsticks to estimate ‘pi’. (If you want still more fun, the probability that the GCD (k g) of two large integers is one turns out to be 6/pi^2’. That provides yet another way to estimate ‘pi’.) See 12. (•)

(•) Exercise 13. An algebraic entry of a string in double-quote marks, ‘"hello"’, creates a vector of the numerical (ASCII) codes of the characters (here, ‘[104, 101, 108, 108, 111]’). Sometimes it is convenient to compute a hash code of a string, which is just an integer that represents the value of that string. Two equal strings have the same hash code; two different strings probably have different hash codes. (For example, Calc has over 400 function names, but Emacs can quickly find the definition for any given name because it has sorted the functions into “buckets” by their hash codes. Sometimes a few names will hash into the same bucket, but it is easier to search among a few names than among all the names.) One popular hash function is computed as follows: First set ‘h = 0’. Then, for each character from the string in turn, set ‘h = 3h + c_i’ where ‘c_i’ is the character's ASCII code. If we have 511 buckets, we then take the hash code modulo 511 to get the bucket number. Develop a simple command or commands for converting string vectors into hash codes. The hash code for ‘"Testing, 1, 2, 3"’ is 1960915098, which modulo 511 is 121. See 13. (•)

(•) Exercise 14. The H V R and H V U commands do nested function evaluations. H V U takes a starting value and a number of steps n from the stack; it then applies the function you give to the starting value 0, 1, 2, up to n times and returns a vector of the results. Use this command to create a “random walk” of 50 steps. Start with the two-dimensional point ‘(0,0)’; then take one step a random distance between -1 and 1 in both ‘x’ and ‘y’; then take another step, and so on. Use the g f command to display this random walk. Now modify your random walk to walk a unit distance, but in a random direction, at each step. (Hint: The sincos function returns a vector of the cosine and sine of an angle.) See 14. (•)