10.2.1 Converting Strings To Numbers

The strtonum() function (see String Functions) is a gawk extension. The following function provides an implementation for other versions of awk:

     
     # mystrtonum --- convert string to number

     function mystrtonum(str,        ret, chars, n, i, k, c)
     {
         if (str ~ /^0[0-7]*$/) {
             # octal
             n = length(str)
             ret = 0
             for (i = 1; i <= n; i++) {
                 c = substr(str, i, 1)
                 if ((k = index("01234567", c)) > 0)
                     k-- # adjust for 1-basing in awk
     
                 ret = ret * 8 + k
             }
         } else if (str ~ /^0[xX][[:xdigit:]]+$/) {
             # hexadecimal
             str = substr(str, 3)    # lop off leading 0x
             n = length(str)
             ret = 0
             for (i = 1; i <= n; i++) {
                 c = substr(str, i, 1)
                 c = tolower(c)
                 if ((k = index("0123456789", c)) > 0)
                     k-- # adjust for 1-basing in awk
                 else if ((k = index("abcdef", c)) > 0)
                     k += 9
     
                 ret = ret * 16 + k
             }
         } else if (str ~ \
       /^[-+]?([0-9]+([.][0-9]*)?|[.][0-9]+)([Ee][-+]?[0-9]+)?$/) {
             # decimal number, possibly floating point
             ret = str + 0
         } else
             ret = "NOT-A-NUMBER"
     
         return ret
     }
     
     # BEGIN {     # gawk test harness
     #     a[1] = "25"
     #     a[2] = ".31"
     #     a[3] = "0123"
     #     a[4] = "0xdeadBEEF"
     #     a[5] = "123.45"
     #     a[6] = "1.e3"
     #     a[7] = "1.32"
     #     a[8] = "1.32E2"
     #
     #     for (i = 1; i in a; i++)
     #         print a[i], strtonum(a[i]), mystrtonum(a[i])
     # }
     

The function first looks for C-style octal numbers (base 8). If the input string matches a regular expression describing octal numbers, then mystrtonum() loops through each character in the string. It sets k to the position of the current octal digit in "01234567". Because index() returns a one-based position, the ‘k--’ adjusts k to the digit's actual value before it is used in computing the return value.
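The octal loop can be traced on a single input. The following is a minimal sketch, run from the shell and assuming a POSIX-compatible awk is installed as awk; the shell variable trace is only an illustrative name and is not part of the original program:

```shell
# Trace the octal conversion loop for the string "0123".
trace=$(awk 'BEGIN {
    str = "0123"
    ret = 0
    n = length(str)
    for (i = 1; i <= n; i++) {
        # index() is one-based, so subtract 1 to get the digit value
        k = index("01234567", substr(str, i, 1)) - 1
        ret = ret * 8 + k
        printf "digit %d -> ret = %d\n", k, ret
    }
}')
printf '%s\n' "$trace"
```

The last line printed reports 83, which is the decimal value of octal 0123 (1 * 64 + 2 * 8 + 3).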

Similar logic applies to the code that checks for and converts a hexadecimal value, which starts with ‘0x’ or ‘0X’. Converting each character with tolower() means that only the lowercase letters ‘a’ through ‘f’ need to be looked up; their one-based positions in "abcdef" are offset by 9 to yield the values 10 through 15.
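The per-digit lookup used for hexadecimal values can be isolated for experimentation. This is a sketch only: hexdigit() is a hypothetical helper written for this illustration (the original program does the lookup inline), and a POSIX-compatible awk is assumed:

```shell
# Map single hexadecimal digits to their numeric values.
hexvals=$(awk '
function hexdigit(c,    k)    # hypothetical helper, for illustration only
{
    c = tolower(c)            # fold case so only a-f must be searched
    if ((k = index("0123456789", c)) > 0)
        return k - 1          # adjust for one-based index()
    if ((k = index("abcdef", c)) > 0)
        return k + 9          # a is position 1, so a maps to 10
    return -1                 # not a hexadecimal digit
}
BEGIN {
    print hexdigit("7"), hexdigit("a"), hexdigit("F"), hexdigit("g")
}')
printf '%s\n' "$hexvals"
```

Unlike the inline code, this helper returns -1 for a character that is not a hexadecimal digit; that is a design choice of the sketch, made so that bad input is visible rather than silently counted as zero.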

Finally, if the string matches the (rather complicated) regexp for a regular decimal integer or floating-point number, the computation ‘ret = str + 0’ lets awk convert the value to a number.
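The reason for checking the string against a regexp before adding zero can be seen by applying ‘+ 0’ to a few strings directly. The sketch below assumes a POSIX-compatible awk and the default output format; the sample strings and the shell variable converted are chosen purely for illustration:

```shell
# Show what awk's built-in string-to-number conversion does on its own.
converted=$(awk 'BEGIN {
    n = split("1.32E2 .31 12abc abc", s, " ")
    for (i = 1; i <= n; i++)
        print s[i], s[i] + 0    # original text, then its numeric value
}')
printf '%s\n' "$converted"
```

Here ‘12abc’ converts to 12 and ‘abc’ converts to 0, because awk uses only the leading numeric part of a string. Without the regexp test, mystrtonum() would return those values instead of "NOT-A-NUMBER".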

A commented-out test program is included, so that the function can be tested with gawk and the results compared to the built-in strtonum() function.