GNU Astronomy Utilities



12.3.3 Library data types (type.h)

Data in astronomy can have many types: numeric (numbers) and strings (names, identifiers). The former can be further divided into integers and floats; see Numeric data types for a thorough discussion of the different numeric data types and which one is useful in different contexts.

To deal with the very large diversity of types that are available (and used in different contexts), Gnuastro identifies each type with a global integer constant that has a fixed name. This constant is then passed to functions that can work on any type, or is stored in Gnuastro’s Generic data container (gal_data_t) as one piece of meta-data.

The actual values of these integer constants are irrelevant and you should never rely on them. When you need to check a type, explicitly use the named constants listed below. If you want to check against more than one type, you can use C’s switch statement.
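As a minimal sketch of the switch-based check mentioned above, the function below branches on a type identifier by name only, never by numeric value. The enum and the function name are hypothetical stand-ins so that the example compiles without Gnuastro; in real code you would include the library's type.h header and use the GAL_TYPE_* identifiers directly.

```c
/* Hypothetical stand-ins for a few of the GAL_TYPE_* identifiers. Their
   numeric values are arbitrary, which is exactly why the code below
   only ever refers to them by name. */
enum demo_type
{
  DEMO_TYPE_INVALID,
  DEMO_TYPE_UINT8,
  DEMO_TYPE_INT32,
  DEMO_TYPE_FLOAT32,
  DEMO_TYPE_FLOAT64,
  DEMO_TYPE_STRING
};

/* Return 1 if `type` is one of the floating point identifiers and 0
   otherwise. Several identifiers share one branch by stacking their
   case labels. */
int
demo_type_is_float(int type)
{
  switch(type)
    {
    case DEMO_TYPE_FLOAT32:
    case DEMO_TYPE_FLOAT64:
      return 1;

    default:
      return 0;
    }
}
```

Because only the names appear in the case labels, the function keeps working even if the numeric values behind the identifiers change.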

Since Gnuastro deals heavily with file input-output, the types it defines are fixed-width types. These types are portable to all systems and are defined in the standard C header stdint.h. You do not need to include this header: it is included by any Gnuastro header that deals with the different types. In contrast, the most commonly used types in a C (or C++) program (for example, int or long) are not defined by their exact width (storage size), but by their minimum storage. For example, on some systems int may be 2 bytes (16 bits, the minimum required by the standard) and on others it may be 4 bytes (32 bits, common in modern systems).
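The difference between exact-width and minimum-width types can be checked with standard C alone. The two helper functions below are hypothetical names used only for illustration; they rely on nothing beyond stdint.h and limits.h.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of int32_t: 32 on every system that provides the type. */
size_t
demo_bits_int32(void)
{
  return sizeof(int32_t) * CHAR_BIT;
}

/* Storage in bits of a plain int: at least 16 and commonly 32. The
   result depends on the system, so it should not be assumed when
   reading or writing files. */
size_t
demo_bits_int(void)
{
  return sizeof(int) * CHAR_BIT;
}
```

On a typical modern desktop system both functions return 32, but only the first result is guaranteed by the C standard.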

With every type, a unique “blank” value (or place-holder showing the absence of data) can be defined. Please see Library blank values (blank.h) for constants that Gnuastro recognizes as a blank value for each type. See Numeric data types for more explanation on the limits and particular aspects of each type.

Global integer: GAL_TYPE_INVALID

This is just a place-holder to specifically mark that no type has been set.

Global integer: GAL_TYPE_BIT

Identifier for a bit-stream. Currently no program in Gnuastro works directly on bits, but features will be added in the future.

Global integer: GAL_TYPE_UINT8

Identifier for an unsigned, 8-bit integer type: uint8_t (from stdint.h), or an unsigned char in most modern systems.

Global integer: GAL_TYPE_INT8

Identifier for a signed, 8-bit integer type: int8_t (from stdint.h), or a signed char in most modern systems.

Global integer: GAL_TYPE_UINT16

Identifier for an unsigned, 16-bit integer type: uint16_t (from stdint.h), or an unsigned short in most modern systems.

Global integer: GAL_TYPE_INT16

Identifier for a signed, 16-bit integer type: int16_t (from stdint.h), or a short in most modern systems.

Global integer: GAL_TYPE_UINT32

Identifier for an unsigned, 32-bit integer type: uint32_t (from stdint.h), or an unsigned int in most modern systems.

Global integer: GAL_TYPE_INT32

Identifier for a signed, 32-bit integer type: int32_t (from stdint.h), or an int in most modern systems.

Global integer: GAL_TYPE_UINT64

Identifier for an unsigned, 64-bit integer type: uint64_t (from stdint.h), or an unsigned long in most modern 64-bit systems.

Global integer: GAL_TYPE_INT64

Identifier for a signed, 64-bit integer type: int64_t (from stdint.h), or a long in most modern 64-bit systems.

Global integer: GAL_TYPE_INT

Identifier for an int type. This is just an alias to the int16 or int32 types, depending on the system.

Global integer: GAL_TYPE_UINT

Identifier for an unsigned int type. This is just an alias to the uint16 or uint32 types, depending on the system.

Global integer: GAL_TYPE_ULONG

Identifier for an unsigned long type. This is just an alias to the uint32 or uint64 types on 32-bit or 64-bit systems, respectively.

Global integer: GAL_TYPE_LONG

Identifier for a long type. This is just an alias to the int32 or int64 types on 32-bit or 64-bit systems, respectively.

Global integer: GAL_TYPE_SIZE_T

Identifier for a size_t type. This is just an alias to the uint32 or uint64 types on 32-bit or 64-bit systems, respectively.

Global integer: GAL_TYPE_FLOAT32

Identifier for a 32-bit single precision floating point type or float in C.

Global integer: GAL_TYPE_FLOAT64

Identifier for a 64-bit double precision floating point type or double in C.

Global integer: GAL_TYPE_COMPLEX32

Identifier for a complex number composed of two float types. Note that the complex type is not yet fully implemented in all Gnuastro’s programs.

Global integer: GAL_TYPE_COMPLEX64

Identifier for a complex number composed of two double types. Note that the complex type is not yet fully implemented in all Gnuastro’s programs.

Global integer: GAL_TYPE_STRING

Identifier for a string of characters (char *).

Global integer: GAL_TYPE_STRLL

Identifier for a linked list of strings of characters (gal_list_str_t, see List of strings).

The functions below are defined to make working with the integer constants above easier. In the functions below, the constants above can be used for the type input argument.

Function:
size_t
gal_type_sizeof (uint8_t type)

Return the number of bytes occupied by type. Internally, this function uses C’s sizeof operator to measure the size of each type. For strings, this function will return the size of char *.

Function:
char *
gal_type_name (uint8_t type, int long_name)

Return a string literal that contains the name of type. It can return both short and long formats of the type names (for example, f32 and float32). If long_name is non-zero, the long format will be returned, otherwise the short name will be returned. The output string is statically allocated, so it should not be freed. This function is the inverse of the gal_type_from_name function. For the full list of names/strings that this function will return, see Numeric data types.

Function:
uint8_t
gal_type_from_name (char *str)

Return the Gnuastro integer constant that corresponds to the string str. This function is the inverse of the gal_type_name function and accepts both the short and long formats of each type. For the full list of names/strings that this function accepts, see Numeric data types.

Function:
void
gal_type_min (uint8_t type, void *in)

Put the minimum possible value of type in the space pointed to by in. Since the value can have any type, this function does not return anything: it assumes that space for the given type is available at in and writes the value there. Here is one example:

int32_t min;
gal_type_min(GAL_TYPE_INT32, &min);

Note: Do not use the minimum value for a blank value of a general (initially unknown) type, please use the constants/functions provided in Library blank values (blank.h) for the definition and usage of blank values.

Function:
void
gal_type_max (uint8_t type, void *in)

Put the maximum possible value of type in the space pointed to by in. Since the value can have any type, this function does not return anything: it assumes that space for the given type is available at in and writes the value there. Here is one example:

uint16_t max;
gal_type_max(GAL_TYPE_UINT16, &max);

Note: Do not use the maximum value for a blank value of a general (initially unknown) type, please use the constants/functions provided in Library blank values (blank.h) for the definition and usage of blank values.
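The calling pattern shared by gal_type_min and gal_type_max (a type identifier plus a void pointer that the function writes through) can be sketched in plain C as below. This is not Gnuastro's implementation: the enum, the function name and the integer return value are hypothetical and only serve to make the sketch self-contained.

```c
#include <float.h>
#include <stdint.h>

/* Hypothetical stand-ins for three of the GAL_TYPE_* identifiers. */
enum demo_type { DEMO_TYPE_UINT16, DEMO_TYPE_INT32, DEMO_TYPE_FLOAT32 };

/* Write the largest finite value of `type` into the space that `in`
   points to. The caller must make sure that `in` points to a variable
   of the matching C type. Returns 0 on success and 1 for an
   unrecognized type (unlike the library functions, which return
   nothing). */
int
demo_type_max(int type, void *in)
{
  switch(type)
    {
    case DEMO_TYPE_UINT16:  *(uint16_t *)in = UINT16_MAX; return 0;
    case DEMO_TYPE_INT32:   *(int32_t  *)in = INT32_MAX;  return 0;
    case DEMO_TYPE_FLOAT32: *(float    *)in = FLT_MAX;    return 0;
    default:                return 1;
    }
}
```

Since the compiler cannot check what a void pointer really points to, the identifier and the C type of the variable must agree; a mismatch makes the function write through the wrong pointer type, which is undefined behavior in C.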

Function:
int
gal_type_is_int (uint8_t type)

Return 1 if the type is an integer (any width and any sign) and zero otherwise.

Function:
int
gal_type_is_list (uint8_t type)

Return 1 if the type is a linked list and zero otherwise.

Function:
int
gal_type_out (int first_type, int second_type)

Return the larger of the two given types which can be used for the type of the output of an operation involving the two input types.

Function:
char *
gal_type_bit_string (void *in, size_t size)

Return the bit-string in the size bytes that in points to. The string is dynamically allocated and must be freed afterwards. You can use it to inspect the bits within one region of memory. Here is one short example:

int32_t a=2017;
char *bitstr=gal_type_bit_string(&a, 4);
printf("%d: %s (%X)\n", a, bitstr, a);
free(bitstr);

which will produce:

2017: 11100001000001110000000000000000 (7E1)

As the example above shows, the bit-string is not the most efficient way to inspect bits. If you are familiar with hexadecimal notation, it is much more compact, see https://en.wikipedia.org/wiki/Hexadecimal. You can use printf’s %x or %X to print integers in hexadecimal format.
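To make the order of the bits in the sample output easier to follow, here is a rough sketch of how such a bit-string can be built. It is not Gnuastro's implementation and demo_bit_string is a hypothetical name. It visits the bytes in memory order and prints each byte from its most significant bit, which reproduces the output above under the assumption of a little-endian machine (where 2017, or 0x7E1, is stored as the bytes E1 07 00 00).

```c
#include <limits.h>
#include <stdlib.h>

/* Return a newly allocated string of '0' and '1' characters for the
   `size` bytes starting at `in`. Bytes are visited in memory order and
   each byte is written from its most significant bit. The caller frees
   the result; NULL is returned if the allocation fails. */
char *
demo_bit_string(const void *in, size_t size)
{
  size_t i, j, k=0;
  const unsigned char *bytes=in;
  char *out=malloc(size*CHAR_BIT+1);

  if(out==NULL) return NULL;
  for(i=0; i<size; ++i)
    for(j=CHAR_BIT; j-- > 0; )
      out[k++] = ((bytes[i] >> j) & 1u) ? '1' : '0';
  out[k]='\0';
  return out;
}
```

Passing an explicit byte array instead of an integer gives the same string on every machine, which is convenient when you want to see the byte order itself.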

Function:
char *
gal_type_to_string (void *ptr, uint8_t type, int quote_if_str_has_space)

Read the contents of the memory that ptr points to (assuming it has type type) and print it into an allocated string which is returned.

If the memory is a string of characters and quote_if_str_has_space is non-zero, the output string will have double-quotes around it if it contains space characters. Also, note that in this case, ptr must be a pointer to an array of characters (or char **), as in the example below (which will put "sample string" into out):

char *out, *string="sample string";
out = gal_type_to_string(&string, GAL_TYPE_STRING, 1);

Function:
int
gal_type_from_string (void **out, char *string, uint8_t type)

Read a string as a given data type and put a pointer to it in *out. When *out!=NULL, then it is assumed to be already allocated and the value will be simply put there. If *out==NULL, then space will be allocated for the given type and the string will be read into that type.

Note that when we are dealing with a string type, *out should be interpreted as char ** (one element in an array of pointers to different strings). In other words, out should be char ***.

This function can be used to fill in arrays of numbers from strings (in an already allocated data structure), or add nodes to a linked list (if the type is a list type). For an array, you have to pass the pointer to the ith element where you want the value to be stored, for example, &(array[i]).

If the string was successfully parsed to the requested type, this function will return a 0 (zero), otherwise it will return 1 (one). This output format will help you check the status of the conversion in a code like the example below where we will try reading a string as a single precision floating point number.

float out;
void *outptr=&out;
if( gal_type_from_string(&outptr, string, GAL_TYPE_FLOAT32) )
  {
    fprintf(stderr, "%s could not be read as float32\n", string);
    exit(EXIT_FAILURE);
  }

When you need to read many numbers into an array, out would be an array, and you can simply increment outptr=out+i (where you increment i).
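The zero-on-success status convention described above can be illustrated with the C library alone. The sketch below is a hypothetical helper built on strtof, not Gnuastro's code. Rejecting trailing characters and out-of-range values is a choice made for this sketch; the text above does not say how gal_type_from_string treats them.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse `string` as a single-precision float into `*out`. Returns 0 on
   success and 1 on failure (empty input, trailing characters, or a
   value out of range), so the call can be used directly as the
   condition of an `if` that handles the error. */
int
demo_read_float32(const char *string, float *out)
{
  char *tail;
  float value;

  errno=0;
  value=strtof(string, &tail);
  if(tail==string || *tail!='\0' || errno==ERANGE) return 1;
  *out=value;
  return 0;
}
```

With this convention, filling an array is a loop that calls the function with &array[i] and stops at the first non-zero return value.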

Function:
void *
gal_type_string_to_number (char *string, uint8_t *type)

Read string into the smallest type that can host the number. The allocated space for the number will be returned and the type of the number will be put into the memory that type points to. If string could not be read as a number, this function will return NULL.

This function first calls the C library’s strtod function to read string as a double-precision floating point number. When successful, it will check the value to put it in the smallest numerical data type that can handle it; for example, 120 and 50000 will be read as signed 8-bit and unsigned 16-bit integer types, respectively. When reading as an integer, the C library’s strtol function is used (in base-10) to parse the string again. This re-parsing as an integer is necessary because integers with many digits (for example, the Unix epoch seconds) will not be accurately stored as a floating point and we cannot use the result of strtod.
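The need for the second, integer-only parse can be demonstrated with a small sketch. The helper name is hypothetical, and strtoll is used here instead of strtol only to guarantee at least 64 bits on every system. The sketch assumes that double is the usual 64-bit IEEE 754 format, where 9007199254740993 (2^53+1) is the smallest positive integer that cannot be stored exactly.

```c
#include <stdlib.h>

/* Return 1 if reading `string` through a double changes its integer
   value, and 0 if the double holds the value exactly. The string is
   assumed to contain a base-10 integer that fits in a long long. */
int
demo_double_loses_digits(const char *string)
{
  long long exact  = strtoll(string, NULL, 10);
  long long via_fp = (long long)strtod(string, NULL);
  return exact != via_fp;
}
```

For short integers such as 50000 both parses agree, so the floating point result is only a problem for integers with many digits.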

When string is successfully parsed as a number and there is a . in string, it will force the number into floating point types. For example, "5" is read as an integer, while "5." or "5.0", or "5.00" will be read as a floating point (single-precision).

For floating point types, this function will count the number of significant digits and determine if the given string is single or double precision as described in Numeric data types.

For integers, negative numbers will always be placed in signed types (as expected). If a positive integer falls below the maximum of a signed type of a certain width, it will be signed (for example, 10 and 150 will be defined as a signed and unsigned 8-bit integer respectively). In other words, even though 10 can be unsigned, it will be read as a signed 8-bit integer. This is done to respect the C implicit type conversion in binary operators, where signed integers will be interpreted as unsigned, when the other operand is an unsigned integer of the same width.

For example, see the short program below. It will print -50 is larger than 100000 (which is wrong!). This happens because when a negative number is converted to an unsigned type, it wraps around: -50 becomes \(4294967296-50=4294967246\), which is indeed larger than 100000 (recall that \(4294967295\) is the largest unsigned 32-bit integer, see Numeric data types).

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int
main(void)
{
  int32_t  a=-50;
  uint32_t b=100000;
  /* For the comparison, a is implicitly converted to uint32_t. */
  printf("%d is %s than %u\n", a,
         a>b ? "larger" : "less or equal", b);
  return 0;
}

However, if we read 100000 as a signed 32-bit integer, there will not be any problem and the printed sentence will be logically correct (for someone who does not know anything about numeric data types: users of your programs). For the advantages of integers, see Integer benefits and pitfalls.
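The width and sign rules for integers described above can be sketched as follows. This is only an illustration of the rule "prefer the signed type at each width", not Gnuastro's implementation. The function name and its outputs (a width in bits plus a signedness flag) are hypothetical, and long long is assumed to be 64 bits wide.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Find the narrowest integer type for the base-10 integer in `string`.
   Returns the width in bits (8, 16, 32 or 64) and sets `*is_signed` to
   1 or 0. Returns 0 if the string is not an integer or does not fit in
   64 bits. */
int
demo_smallest_int(const char *string, int *is_signed)
{
  char *tail;
  long long v;

  errno=0;
  v=strtoll(string, &tail, 10);
  if(tail==string || *tail!='\0') return 0;

  if(errno==ERANGE)
    {
      /* Outside the signed 64-bit range: only a positive value can
         still fit, in the unsigned 64-bit type. */
      if(v<0) return 0;
      errno=0;
      (void)strtoull(string, NULL, 10);
      if(errno==ERANGE) return 0;
      *is_signed=0;
      return 64;
    }

  /* At each width, prefer the signed type; use the unsigned type only
     when a positive value is above the signed maximum. */
  *is_signed=1;
  if(v>=INT8_MIN  && v<=INT8_MAX)  return 8;
  if(v>=0 && v<=UINT8_MAX)  { *is_signed=0; return 8;  }
  if(v>=INT16_MIN && v<=INT16_MAX) return 16;
  if(v>=0 && v<=UINT16_MAX) { *is_signed=0; return 16; }
  if(v>=INT32_MIN && v<=INT32_MAX) return 32;
  if(v>=0 && v<=UINT32_MAX) { *is_signed=0; return 32; }
  return 64;
}
```

With these rules, 10 is a signed 8-bit integer, 150 an unsigned 8-bit integer, and 100000 a signed 32-bit integer, so comparing 100000 with -50 gives the logically expected result.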