GNU Astronomy Utilities




11.3.21 Matching (match.h)

Matching is often necessary when the same points have been measured with different instruments (or hardware), different software, or different configurations of the same software. In other words, you have two catalogs or tables, and each has N columns containing the N-dimensional “positional” values of each point. Each can have other columns too: for example, one can have brightness measurements in one filter, and the other can have brightness measurements in another filter as well as morphology measurements.

The matching functions here use the positional columns to find the permutation that must be applied to each of the two tables. This lets you match by position, then apply the permutations to the brightness or morphology columns in the example above. The input and output data formats of the functions below are the same, and are described here before the functions themselves. Each function also has extra arguments due to the particular algorithm it uses for the matching.

The two inputs of the functions (coord1 and coord2) must be lists of gal_data_t (see List of gal_data_t). Each gal_data_t node in coord1 or coord2 should be a one-dimensional dataset (a column in a table), and all the nodes within one list must have the same number of elements (rows). In other words, each column holds the coordinates of every point along its respective dimension. The number of dimensions is determined by the number of gal_data_t nodes in each of the two input lists (which must be equal). The number of rows (the number of elements in each gal_data_t) can differ between coord1 and coord2. All these requirements will be satisfied if you use gal_table_read to read the two sets of coordinate columns, see Table input output (table.h).

The functions below return a singly-linked list of three 1D datasets (see List of gal_data_t); call the returned list ret. The first two nodes (ret and ret->next) are permutations. In other words, the array elements of both have a type of size_t, see Permutations (permutation.h). The third node (ret->next->next) holds the calculated distance of each match and its array has a type of double. The number of matches will be written in the space pointed to by the nummatched argument. If there is no match, these functions return NULL.

The two permutations can be applied to the rows of the two inputs: the first one (ret) should be applied to the rows of the table containing coord1 and the second one (ret->next) to the table containing coord2. After applying the returned permutations to the inputs, the top nummatched elements of both will match with each other. The ordering of the rest of the elements is undefined (it depends on the matching function used). The third node holds the distance between the two points of each match (which may be an elliptical distance, see the discussion of “aperture” below).
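As a rough illustration of what "applying a permutation to a column" means, the sketch below reorders one column of double values with a size_t permutation. The function name sketch_permute_doubles is hypothetical (it is not part of the library), and the convention used (new element i is taken from old element perm[i]) is an assumption made only for this example; the library's own routines and convention are described in Permutations (permutation.h).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT a library function: reorder 'col' (with
   'n' elements) so that the new col[i] is the old col[perm[i]].
   ASSUMPTION: this "gather" convention is chosen for illustration;
   see Permutations (permutation.h) for the library's convention. */
void
sketch_permute_doubles(double *col, const size_t *perm, size_t n)
{
  size_t i;
  double *tmp;

  /* Nothing to do for an empty column. */
  if(n==0) return;

  /* Work on a temporary copy so no element is overwritten early. */
  tmp=malloc(n * sizeof *tmp);
  assert(tmp!=NULL);
  for(i=0; i<n; ++i) tmp[i]=col[perm[i]];

  /* Copy the reordered values back into the caller's array. */
  memcpy(col, tmp, n * sizeof *col);
  free(tmp);
}
```

With such a helper, the first permutation would be applied to every non-positional column of the first table and the second permutation to every such column of the second table, after which the first nummatched rows of both tables refer to the same objects.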

The functions will not simply return the nearest neighbor as a match, because the nearest neighbor may be too far away to be meaningful. They will check the distance to the nearest neighbor of each point and only return a match for it if it is within an acceptable N-dimensional distance (or “aperture”). The matching aperture is defined by the aperture array that is an input argument to the functions. If several points of one catalog lie within this aperture of a point in the other, the nearest is defined as the match. In a 2D situation (where the input lists have two nodes), for the most generic case, the aperture array must have three elements: the major axis length, axis ratio and position angle (see Defining an ellipse and ellipsoid). If aperture[1]==1, the aperture will be a circle of radius aperture[0] and the third value won’t be used. When the aperture is an ellipse, distances between the points are also calculated as elliptical distances (\(r_{el}\) in Defining an ellipse and ellipsoid).

Function:
gal_data_t *
gal_match_coordinates (gal_data_t *coord1, gal_data_t *coord2, double *aperture, int sorted_by_first, int inplace, size_t minmapsize, int quietmmap, size_t *nummatched)

Use a basic sort-based match to find the matching points of two sets of input coordinates. See the descriptions above on the format of the inputs and outputs. To speed up the search, this function will sort the input coordinates by their first column (first axis). If both are already sorted by their first column, you can avoid the sorting step by giving a non-zero value to sorted_by_first.

When sorting is necessary and inplace is non-zero, the actual input columns will be sorted. Otherwise, an internal copy of the inputs will be made, used (sorted) and freed before returning. Therefore, when inplace==0, the inputs will remain untouched, but this function will take more time and memory. If internal allocation is necessary and the space is larger than minmapsize, the space will not be allocated in RAM, but in a file; see the description of --minmapsize and --quietmmap in Processing options.

Output permutations ignore internal sorting: the output permutations will correspond to the initial inputs. Therefore, even when inplace!=0 (and this function re-arranges the inputs in place), the output permutations will correspond to the original (possibly non-sorted) inputs.
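The idea of sorting for speed while still reporting indices in the original row order can be sketched in one dimension as below. This is a simplified illustration under stated assumptions, not the algorithm of gal_match_coordinates: all names are hypothetical, only a single coordinate column and a plain radius are used, the result is a list of matched index pairs rather than full permutations, and the case where two points of the first catalog claim the same point of the second is not resolved.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* A coordinate value together with its row in the ORIGINAL input. */
typedef struct { double v; size_t i; } sketch_pt;

static int
sketch_cmp(const void *a, const void *b)
{
  double x=((const sketch_pt *)a)->v, y=((const sketch_pt *)b)->v;
  return (x>y) - (x<y);
}

/* Hypothetical 1D matcher: for every point of c1, find the nearest
   point of c2 within 'radius'. The k-th match is written as the
   original row numbers idx1[k], idx2[k] and its distance dist[k];
   each output array needs room for n1 elements. Returns the number
   of matches. The inputs themselves are never re-ordered. */
size_t
sketch_match_1d(const double *c1, size_t n1, const double *c2, size_t n2,
                double radius, size_t *idx1, size_t *idx2, double *dist)
{
  size_t i, j, lo=0, nmatch=0;
  sketch_pt *s1, *s2;

  assert(radius>=0.0);
  if(n1==0 || n2==0) return 0;
  s1=malloc(n1 * sizeof *s1);
  s2=malloc(n2 * sizeof *s2);
  if(s1==NULL || s2==NULL) { free(s1); free(s2); return 0; }

  /* Sort copies that remember the original row of every value. */
  for(i=0; i<n1; ++i) { s1[i].v=c1[i]; s1[i].i=i; }
  for(j=0; j<n2; ++j) { s2[j].v=c2[j]; s2[j].i=j; }
  qsort(s1, n1, sizeof *s1, sketch_cmp);
  qsort(s2, n2, sizeof *s2, sketch_cmp);

  for(i=0; i<n1; ++i)
    {
      size_t best=n2;               /* n2 means "nothing found".  */
      double bestd=radius;

      /* Both lists are sorted, so the lower bound only advances. */
      while(lo<n2 && s2[lo].v < s1[i].v-radius) ++lo;

      /* Only candidates inside [v-radius, v+radius] are checked. */
      for(j=lo; j<n2 && s2[j].v <= s1[i].v+radius; ++j)
        {
          double d=fabs(s2[j].v - s1[i].v);
          if(d<=bestd) { bestd=d; best=j; }
        }

      /* Report the match with the ORIGINAL (unsorted) row numbers. */
      if(best!=n2)
        {
          idx1[nmatch]=s1[i].i;
          idx2[nmatch]=s2[best].i;
          dist[nmatch]=bestd;
          ++nmatch;
        }
    }

  free(s1);
  free(s2);
  return nmatch;
}
```

Because every sorted element carries its original row number, the caller never needs to know that any sorting took place, which mirrors the behavior described above for the output permutations.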

The reason for this is that you rarely want to permute the actual positional columns after the match. Usually, you also have other columns (for example, brightness and morphology) and you want to find how they differ between the objects that match. Once you have the permutations, they can be applied to those other columns (see Permutations (permutation.h)) and the higher-level processing can continue. So if you don’t need the coordinate columns for the rest of your analysis, it is better to set inplace=1.

