GNU Astronomy Utilities



12.3.15 Tessellation library (tile.h)

In many contexts, it is desirable to slice a dataset into subsets or tiles (overlapping or not), so that you can work on each tile independently. One method would be to copy each region to a separately allocated space, but in many contexts this is not necessary and can in fact be a big burden on CPU/memory usage. The block pointer in Gnuastro’s Generic data container (gal_data_t) is defined for such situations, where allocation is not necessary: you just want to read the data or write to it independently of (or in coordination with) other regions of the dataset. Combined with parallel processing, this can greatly reduce time and memory consumption.

See the figure below for an example: assume the larger dataset is a contiguous block of memory that you are interpreting as a 2D array, but you only want to work on the smaller tile region.

                            larger
              ---------------------------------
              |                               |
              |              tile             |
              |           ----------          |
              |           |        |          |
              |           |_       |          |
              |           |*|      |          |
              |           ----------          |
              |       tile->block = larger    |
              |_                              |
              |*|                             |
              ---------------------------------

To use gal_data_t’s block concept, you allocate a gal_data_t *tile which is initialized with the pointer to the first element in the sub-array (as its array argument). Note that this is not necessarily the first element in the larger array. You can set the size of the tile along with the initialization as you please. Recall that, when given a non-NULL pointer as array, gal_data_initialize (and thus gal_data_alloc) does not allocate any space and just uses the given pointer for the new array element of the gal_data_t. So your tile data structure will not be pointing to a separately allocated space.
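The pointer arithmetic behind "the first element in the sub-array" can be sketched in plain C. This is only an illustration under stated assumptions: demo_data is a hypothetical stand-in holding a few fields named after those of gal_data_t, and demo_tile_define is a hypothetical helper, not a Gnuastro function. It assumes a 2D, row-major (C order) float dataset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few gal_data_t fields used here;
   this is NOT Gnuastro's actual structure. */
typedef struct demo_data
{
  float            *array;     /* First element of this dataset/tile. */
  size_t            ndim;      /* Number of dimensions (2 here).      */
  size_t            dsize[2];  /* Length along each dimension.        */
  struct demo_data *block;     /* Host dataset (NULL: none set).      */
} demo_data;

/* Describe a tile of 'th' rows by 'tw' columns whose first element
   is at (row, col) of 'larger'. No pixel is copied or allocated:
   the tile's array points inside the memory of 'larger'. */
demo_data
demo_tile_define(const demo_data *larger, size_t row, size_t col,
                 size_t th, size_t tw)
{
  demo_data tile;

  /* The tile must fit inside the larger dataset. */
  assert(row + th <= larger->dsize[0]);
  assert(col + tw <= larger->dsize[1]);

  /* Row-major offset of the tile's first element. */
  tile.array    = larger->array + row * larger->dsize[1] + col;
  tile.ndim     = 2;
  tile.dsize[0] = th;
  tile.dsize[1] = tw;
  tile.block    = NULL;   /* Left for the caller to point at a host. */
  return tile;
}
```

For a 3 by 4 dataset, a tile starting at row 1 and column 2 has its array pointing 1*4+2=6 elements after the start of the larger array.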

After the allocation is done, you just point tile->block to the larger dataset which hosts the full block of memory. Where relevant, Gnuastro’s library functions will check the block pointer of their input dataset to see how to deal with dimensions and increments so they can always remain within the tile. The tools introduced in this section are designed to help in defining and working with tiles that are created in this manner.
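To illustrate why a function needs the block pointer to "remain within the tile", here is a minimal sketch that sums the elements of a 2D tile. The structure demo_data and the function demo_tile_sum are hypothetical (they are not part of Gnuastro), and a row-major float dataset is assumed. The key point is that the step from one tile row to the next is the row width of the allocated host, not the width of the tile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few gal_data_t fields used here;
   this is NOT Gnuastro's actual structure. */
typedef struct demo_data
{
  float            *array;     /* First element of this dataset/tile. */
  size_t            ndim;      /* Number of dimensions (2 here).      */
  size_t            dsize[2];  /* Length along each dimension.        */
  struct demo_data *block;     /* Host dataset, NULL if allocated.    */
} demo_data;

/* Sum all elements of a 2D tile (or of an allocated dataset). */
double
demo_tile_sum(const demo_data *tile)
{
  const demo_data *host = tile;
  const float *rowstart;
  double sum = 0.0;
  size_t i, j, stride;

  assert(tile->ndim == 2);

  /* Follow the block pointers up to the allocated dataset: its row
     width is the increment between consecutive rows of the tile. */
  while(host->block) host = host->block;
  stride = host->dsize[1];

  for(i=0; i<tile->dsize[0]; ++i)
    {
      rowstart = tile->array + i * stride;   /* Start of tile row i. */
      for(j=0; j<tile->dsize[1]; ++j)
        sum += rowstart[j];                  /* Stay within the tile. */
    }
  return sum;
}
```

When the input has a NULL block, the stride equals its own width, so the same loop also works on the full allocated dataset.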

Since the block structure is defined as a pointer, arbitrary levels of tessellation/grid-ing are possible (tile->block may itself be a tile in an even larger allocated space). Therefore, just like a linked list (see Linked lists (list.h)), it is important to have the block pointer of the largest (allocated) dataset set to NULL. Normally, you will not have to worry about this, because gal_data_initialize (and thus gal_data_alloc) will set the block element to NULL by default; just remember not to change it. You should only change the block element of the tiles you define over the allocated space.
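The role of the terminating NULL can be sketched as follows: walking the block pointers stops at the allocated dataset, exactly like walking a linked list stops at its last node. The structure demo_data and both functions are hypothetical illustrations (not Gnuastro's implementation), again assuming a 2D, row-major float dataset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few gal_data_t fields used here;
   this is NOT Gnuastro's actual structure. */
typedef struct demo_data
{
  float            *array;     /* First element of this dataset/tile. */
  size_t            ndim;      /* Number of dimensions (2 here).      */
  size_t            dsize[2];  /* Length along each dimension.        */
  struct demo_data *block;     /* Host dataset, NULL if allocated.    */
} demo_data;

/* Return the allocated dataset at the end of the block chain. The
   loop only terminates because the allocated dataset has a NULL
   block. */
const demo_data *
demo_allocated_block(const demo_data *d)
{
  while(d->block) d = d->block;
  return d;
}

/* Find the (row, col) of the tile's first element within the
   allocated dataset, for any depth of nested tiles. */
void
demo_tile_start_coord(const demo_data *tile, size_t *row, size_t *col)
{
  const demo_data *host = demo_allocated_block(tile);
  size_t offset;

  /* The tile must point inside the host's memory. */
  assert(tile->array >= host->array);

  offset = (size_t)(tile->array - host->array);
  *row = offset / host->dsize[1];
  *col = offset % host->dsize[1];
}
```

If the block of the allocated dataset were accidentally set to something other than NULL, the loop in demo_allocated_block would continue into unrelated memory or never terminate, which is why that pointer must be left untouched.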

Below, we will first review constructs for Independent tiles and then define the current approach to fully tessellating a dataset (covering every pixel/data-element with a non-overlapping tile grid) in Tile grid. This approach to dealing with parts of a larger block was inspired by a similarly named concept in the GNU Scientific Library (GSL); see its “Vectors and Matrices” chapter for their implementation.