Next: Statistical operations (`statistics.h`), Previous: Permutations (`permutation.h`), Up: Gnuastro library [Contents][Index]

Matching is often necessary when two measurements of the same points have been done using different instruments (or hardware), different software or different configurations of the same software. In other words, you have two catalogs or tables, and each has N columns containing the N-dimensional “coordinate” values of each point. Each table can have other columns too, for example, one can have brightness measurements in one filter, and another can have morphology measurements.

The matching functions here use the coordinate columns of the two tables to find a permutation for each, along with the total number of matched rows (\(N_{match}\)). This lets you match by position. At a higher level, you can apply the permutations to the brightness or morphology columns to merge the catalogs over the \(N_{match}\) rows. The input and output data formats of the functions are the same and are described below, before the actual functions. Each function also has extra arguments that are specific to the algorithm it uses for the matching.

The two inputs of the functions (`coord1` and `coord2`) must each be a List of `gal_data_t`.
Each `gal_data_t` node in `coord1` or `coord2` should be a single-dimensional dataset (a column in a table), and all the nodes in each list must have the same number of elements (rows).
In other words, each column can be visualized as having the coordinates of each point in its respective dimension.
The number of dimensions of the coordinates is determined by the number of `gal_data_t` nodes in the two input lists (which must be equal).
The number of rows (the number of elements in each `gal_data_t`) in the columns of `coord1` and `coord2` can (and usually will!) be different.
In summary, these functions will be happy if you use `gal_table_read` to read the coordinate columns of each table from a file, see Table input output (`table.h`).

The functions below return a singly-linked list of three 1D datasets (see List of `gal_data_t`); let’s call the returned list `ret`.
The first two (`ret` and `ret->next`) are permutations.
In other words, the `array` elements of both have a type of `size_t`, see Permutations (`permutation.h`).
The third node (`ret->next->next`) holds the calculated distance of each match, and its array has a type of `double`.
The number of matches will be written in the space pointed to by the `nummatched` argument.
If there are no matches, the functions will return `NULL`.

The two permutations can be applied to the rows of the two inputs: the first one (`ret`) should be applied to the rows of the table containing `coord1` and the second one (`ret->next`) to the table containing `coord2`.
After applying the returned permutations to the inputs, the top `nummatched` elements of both will match with each other.
The ordering of the rest of the elements is undefined (it depends on the matching function used).
The third node contains the distance between the two points of each match (which may be an elliptical distance, see the discussion of “aperture” below).
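As an illustration of how the two permutations line up the matched rows, here is a minimal, self-contained sketch in plain C. It does not call Gnuastro: `permute_column` is a hypothetical helper, and it assumes the convention that output row `i` is taken from input row `perm[i]`. In real code, the functions in Permutations (`permutation.h`) would do this job.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Gnuastro): write in[perm[i]] into
   out[i] for all 'n' rows. Assumes 'perm' holds each index in
   0..n-1 exactly once, and that 'in' and 'out' do not overlap. */
void
permute_column(const double *in, const size_t *perm, size_t n,
               double *out)
{
  size_t i;
  for(i=0; i<n; ++i) out[i] = in[perm[i]];
}
```

For example, with a made-up brightness column `{10, 20, 30}` and the permutation `{2, 0, 1}`, the permuted column is `{30, 10, 20}`. If `nummatched` were 2, only its first two rows would pair with the first two rows of the (similarly permuted) second table.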

The functions will not simply return the nearest neighbor as a match.
This is because the nearest neighbor may be too far away to be a meaningful match!
They will check the distance to the nearest neighbor of each point and only return a match if it is within an acceptable N-dimensional distance (or “aperture”).
The matching aperture is defined by the `aperture` array that is an input argument to the functions.
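The "nearest neighbor, but only within the aperture" rule can be sketched with a brute-force search. This is only an illustration under simplifying assumptions (2D coordinates, a circular aperture with a non-negative radius); `nearest_within_radius` is a hypothetical name, and this is not the algorithm used by the functions below.

```c
#include <stddef.h>

/* Hypothetical brute-force helper (not Gnuastro's algorithm): return
   the index of the point in (x[], y[]) nearest to (px, py), or
   (size_t)(-1) when even the nearest one is farther than 'radius'. */
size_t
nearest_within_radius(const double *x, const double *y, size_t n,
                      double px, double py, double radius)
{
  size_t i, best=(size_t)(-1);
  double dx, dy, d2, best_d2=radius*radius;

  for(i=0; i<n; ++i)
    {
      dx = x[i]-px;
      dy = y[i]-py;
      d2 = dx*dx + dy*dy;      /* Squared distance: no sqrt needed. */
      if(d2 <= best_d2) { best_d2=d2; best=i; }
    }
  return best;
}
```

Starting `best_d2` at the squared radius means a point is only ever accepted when it is inside the aperture, so a far-away nearest neighbor yields "no match" instead of a meaningless pairing.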

If several points of one catalog lie within this aperture of a point in the other catalog, the nearest is defined as the match.
In a 2D situation (where the input lists have two nodes), for the most generic case, `aperture` must have three elements: the major axis length, axis ratio and position angle (see Defining an ellipse and ellipsoid).
If `aperture[1]==1`, the aperture will be a circle of radius `aperture[0]` and the third value will not be used.
When the aperture is an ellipse, distances between the points are also calculated as elliptical distances (\(r_{el}\) in Defining an ellipse and ellipsoid).

**Output permutations ignore internal sorting**: the output permutations will correspond to the initial inputs.
Therefore, even when `inplace!=0` (and this function re-arranges the inputs in place), the output permutations will correspond to the original (possibly non-sorted) inputs.
The reason for this is that you rarely want to permute the actual positional columns after the match.
Usually, you also have other columns (such as the brightness and morphology) and you want to find how they differ between the objects that match.
Once you have the permutations, they can be applied to those other columns (see Permutations (`permutation.h`)) and the higher-level processing can continue.
So if you do not need the coordinate columns for the rest of your analysis, it is better to set `inplace=1`.

- Function: `gal_data_t *` **gal_match_sort_based** (`gal_data_t *coord1`, `gal_data_t *coord2`, `double *aperture`, `int sorted_by_first`, `int inplace`, `size_t minmapsize`, `int quietmmap`, `size_t *nummatched`)

  Use a basic sort-based match to find the matching points of two input coordinates. See the descriptions above on the format of the inputs and outputs. To speed up the search, this function will sort the input coordinates by their first column (first axis). If *both* are already sorted by their first column, you can avoid the sorting step by giving a non-zero value to `sorted_by_first`.

  When sorting is necessary and `inplace` is non-zero, the actual input columns will be sorted. Otherwise, an internal copy of the inputs will be made, used (sorted) and later freed before returning. Therefore, when `inplace==0`, the inputs will remain untouched, but this function will take more time and memory. If internal allocation is necessary and the space is larger than `minmapsize`, the space will not be allocated in the RAM, but in a file, see the description of `--minmapsize` and `--quietmmap` in Processing options.

- Function: `gal_data_t *` **gal_match_kdtree** (`gal_data_t *coord1`, `gal_data_t *coord2`, `gal_data_t *coord1_kdtree`, `size_t kdtree_root`, `double *aperture`, `size_t numthreads`, `size_t minmapsize`, `int quietmmap`, `size_t *nummatched`)

  Use the k-d tree concept for finding matches between two catalogs, optionally in parallel (on `numthreads` threads). The k-d tree of the first input (`coord1_kdtree`) and its root index (`kdtree_root`) should be constructed and found before calling this function; to do this, you can use `gal_kdtree_create` from K-d tree (`kdtree.h`). The desired `aperture` array is the same as in `gal_match_sort_based` and is described at the top of this section. If `coord1_kdtree==NULL`, this function will return a `NULL` pointer and write a value of `0` in the space that `nummatched` points to.

  The final number of matches is returned in `nummatched` and the format of the returned dataset (three columns) is described above. If internal allocation is necessary and the space is larger than `minmapsize`, the space will not be allocated in the RAM, but in a file, see the description of `--minmapsize` and `--quietmmap` in Processing options.
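To see why sorting by the first axis can speed up the search in `gal_match_sort_based`, consider the sketch below. It is not Gnuastro's code: the manual only states that the inputs are sorted by their first column, and the way the sorted order is exploited here (a binary search for the start of a window of width twice the aperture, followed by a short scan) is an assumption made for illustration. Both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper (not Gnuastro's code): 'x' is sorted in
   increasing order. Return the index of the first element that is
   >= 'value' (or 'n' when there is none), by binary search. */
size_t
lower_bound(const double *x, size_t n, double value)
{
  size_t lo=0, hi=n, mid;
  while(lo<hi)
    {
      mid = lo + (hi-lo)/2;
      if(x[mid] < value) lo = mid+1;
      else               hi = mid;
    }
  return lo;
}

/* Count the candidates whose first coordinate is within 'r' of 'px'
   and write the index of the first one in '*first'. Only these
   candidates need a full N-dimensional distance check. */
size_t
candidates_in_window(const double *x, size_t n, double px, double r,
                     size_t *first)
{
  size_t i;
  *first = lower_bound(x, n, px-r);
  for(i=*first; i<n && x[i]<=px+r; ++i) {}
  return i - *first;
}
```

With a sorted first axis, each point of one catalog is compared against a small window of the other catalog instead of all its rows, which is the kind of saving that sorting makes possible.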


GNU Astronomy Utilities 0.19 manual, October 2022.