Table of Contents:
This program is designed to match up items in two different lists, which may have two different systems of coordinates. The program allows the two sets of coordinates to be related by a linear, quadratic, or cubic transformation. It is an implementation of the algorithm described in Valdes et al., Publications of the Astronomical Society of the Pacific, vol 107, page 1119 (1995).
The user must place the objects to be matched in ASCII text files, which have a fixed number of columns in each line. Each file must have contain two spatial coordinates, i.e. (row, col) or (x, y), and a "magnitude" for each item. This is a measure of brightness, which has the peculiar property that bright items have SMALLER magnitudes than faint ones. The user may optionally specify ID numbers for each star in a column. An example might look like this:
5 368.77 3.93 11.369 0.027 6 45.90 5.33 11.308 0.026 7 407.27 5.05 6.650 0.001 8 636.23 7.75 10.715 0.015 9 338.60 8.90 11.314 0.026
In the example above, the first column contains an "ID" number for each star, the second column contains an "x" coordinate, the third column a "y" coordinate, and the fourth column a "magnitude" coordinate. Note that the brightest item is number 7, and the faintest is number 5.
Other columns may contain data of any type, as long as each column contains no white space (tabs, spaces). The "x", "y" and "magnitude" (and, optionally, "ID") values must appear in the first ten columns, but there may be additional columns beyond the tenth.
The "match" program adopts the C-like convention that the leftmost column in a file is "number 0", the next column is "number 1", and so forth. Thus, in the example above,
column 0 has ID number column 1 has "x" coord column 2 has "y" coord column 3 has "magnitude" value column 4 has "magnitude error" value
The matching routines make a weak assumption that the units of the input coordinates are very roughly the same size as the uncertainty in position of each item. That is, the uncertainty in the position of an item at (200, 250) might be around +/- 1 unit in each coordinate. It is okay if the uncertainties are smaller than 1 unit, but if the uncertainties are much, much larger -- say, 50 or 100 units -- then the routines may fail. If your original coordinate values have absolute values larger than 5000, it might help to scale them all down by a factor of 10 or 100.
Usage is as follows:
match fileA xcolA ycolA magcolA fileB xcolB ycolB magcolB [id1=] [id2=] [max_iter=] [halt_sigma=] [outfile=] [trirad=] [nobj=] [matchrad=] [scale=] [min_scale=] [max_scale=] [transonly] [rotangle=] [rottol=] [recalc] [linear|quadratic|cubic] [medtf] [medsigclip=] [intrans=] [identity [xsh=] [ysh=]] [--version] [--help] [help]
The first 8 arguments are required, and must appear in the order given above. The remaining arguments are optional, and may appear in any order. The arguments are:
fileA name of ASCII file containing items in the first list xcolA column in fileA containing "x" value of items ycolA column in fileA containing "y" value of items magcolA column in fileA containing "mag" value of items fileB name of ASCII file containing items in the second list xcolB column in fileB containing "x" value of items ycolB column in fileB containing "y" value of items magcolB column in fileB containing "mag" value of items id1=N column in fileA containing "ID" value of items id2=N column in fileB containing "ID" value of items (the values must be unique within each list; the program will not check, and strange behavior will result if duplicate IDs occur) outfile=abc place output in files with base name "abc" abc.mtA items from list A which had matches in list B abc.mtB items from list B which had matches in list A abc.unA items from list A which had no matches in list B abc.unB items from list B which had no matches in list A trirad=0.001 critical value for counting triangles as a match (default value 0.002; see below) nobj=30 use the brightest 'nobj' values in each list during the matching process (default value 20) matchrad=4 after applying transformation to listA, so that both lists are in coordinate system of listB, items from the two lists must be within this distance to count as 'matched' (default value 5) scale=2.0 if present, only triangles with the given scale factor are counted as matches (in this example, only triangles in coord system A which are about 2.0 times larger than triangles in coord system B are counted as matches) min_scale=1.5 if present, only triangles with ratios greater than the given scale factor are counted as matches max_scale=2.5 if present, only triangles with ratios less than the given scale factor are counted as matches rotangle=30 if present, then only triangles which are rotated relative to each other by the given angle (in degrees) are counted as matches; requires the 'rottol' argument as well rottol=0.5 if present, then only triangles which are rotated relative to each other by 'rotangle' degrees plus or minus this amount (degrees) are counted as matches; requires the 'rotangle' argument as well max_iter=3 during the solution process, there is a loop in which the code matches stars in the two lists, finds the residuals for each pair, and discards pairs with big residuals. This argument sets a limit on the number of times this loop can be entered. Default value is given by AT_MATCH_MAXITER parameter in atpmatch.h, currently 3. halt_sigma=1.0e-12 during the solution process, if the typical residual between pairs of stars in the two lists drops to this level, stop iterating and declare success. The units are the square of the units in the second list of stars; often, radians squared. Default value is given by AT_MATCH_HALTSIGMA parameter in atpmatch.h, currently 1.0e-12. transonly if present, stop after having derived a transformation which best takes the 'nobj' objects from list A into objects in list B. Don't go on to find the matching objects explicitly. recalc if present, after having derived a transformation with the brightest 'nobj' objects, and having applied that transformation to ALL the objects in list A, derive a transformation again, this time using ALL the objects in list A and list B linear use a linear transformation (the default) quadratic use a quadratic transformation cubic use a cubic transformation medtf calculate the MEDTF statistics (see below); assumes that a simple translation connects the two lists medsigclip=N discard matched pairs more than N stdev from the median offset before calculating MEDTF statistics identity as an initial guess, use a linear TRANS in which there is no rotation, change in scale, or translation (unless xsh= and ysh= are specified) xsh= in initial guess, use linear TRANS with the given translation in the 'x' direction ysh= in initial guess, use linear TRANS with the given translation in the 'x' direction intrans=file as an initial guess, use the TRANS which is given in the ASCII text file 'file' (see below for the format) --version print the current version number and exit --help print list of command-line options help print list of command-line options
Example: given two files with the format shown above, called "starsA.dat" and "starsB.dat", one might call 'match' as follows:
match starsA.dat 1 2 3 starsB.dat 1 2 3 outfile=matched \ id1=0 id2=0 matchrad=3.0 trirad=0.001 nobj=40
The argument "trirad" may need some extra explanation. For a full desciption, see the paper by Valdes et al. In short, stars are grouped into triangles, the sides of which are labelled "a", "b" and "c" in order of decreasing size. The normalized ratios
b/a and c/a
are formed for each triangle. In order for a triangle, t1, from one list to be considered identical to a triangle, t2, in the other list, it must satisfy the equation
sqrt[ (t1.ba - t2.ba)^2 + (t1.ca - t2.ca)^2 ] <= trirad
So the value of "trirad" has units of neither list; it exists in a two-dimensional "triangle-space" formed by the coordinates b/a and c/a (each of which ranges from 0.0 to 1.0).
One way the user can control the matching of triangles is to specify a "scale factor". Only triangles with a ratio of size falling with a particular window will be counted as matches. The user can control the "scale factor window" in two ways:
scale=1.5The window is centered on the given factor, and extends a percentage (AT_MATCH_PERCENT) above it and below it.
min_scale=1.3 max_scale=1.7The window extends from the min_scale to max_scale, including both boundaries.
Another way the user can control the matching of triangles is to specify a "rotation angle". In order for a triangle in list A to count as a match with one in list B, the two must be rotated relative to each other by this amount (in degrees). The user must provide both rotangle and rottol arguments; if the user provides only one of these two arguments, the program will halt with an error message. All angles should be specified in the range (-180, 180) degrees.
If the two lists are known to have the same scale and rotation, the user may use the identity keyword to force an initial guess at the TRANS: a linear model with
a=0.0 b=1.0 c=0.0 d=0.0 e=0.0 f=1.0The program will use this initial TRANS structure to match up the items in the two lists, rather than trying to figure out the transformation itself. After looking for matched items with this fixed model, the program will use the matched pairs it finds to define an improved TRANS.
In a similar vein, if the user knows that the scale and rotation of the two sets of objects are identical, but there is a fixed and known translation of XX units in the x-direction and YY units in the y-direction, he can use the identity keyword with additional arguments
xsh=XX ysh=YYwhich will create an initial TRANS with
a=XX b=1.0 c=0.0 d=YY e=0.0 f=1.0
Finally, for ultimate control, the user may specify the exact values of the coefficients for the initial TRANS via the intrans=filename option. The program will look for an ASCII text file with the given filename, which must have the following format:
a=33.34 b=1.3e-4 c 99.01 d -5.323e1(note that the equals sign between the coefficient and the value is optional)
In any case, if the user specifies the initial TRANS, either via the identity or intrans method, that initial model is used to convert the coordinates of objects in list A into the system of list B. The program then looks for matches within the matchrad units. It then uses only the matched pairs to generate an improved TRANS model.
On the other hand, if the user does NOT provide an initial TRANS model, the program searches for one itself. After its first guess, the code matches up candidate pairs of stars from the two lists. It then enters a loop in which it calculates a plate solution, applies the solution to the stars in list A, then compares the predicted positions from list A to the actual positions of stars in list B. Based on those residuals, the code marks some of the pairs as "bad", discards them, and goes back to the top of the loop. The "max_iter" and "halt_sigma" arguments can be used to control the number of times the code enters this loop and discards possible matches. Note that, if one uses the project_coords routine to turn (RA, Dec) positions from a catalog into an input list, the units will of this list will be radians; hence, the residuals from a solution against this list will be in radians squared. A typical residual of 1 arcsec corresponds to "halt_sigma" of 2.4e-11 radians squared.
OUTPUT: THE TRANS AND MEDTF STRUCTURES
This "match" can send information directly to the screen (as discussed in this section), or to auxiliary output files (as discussed in the next section). If any errors occur during execution, the program will print messages to the stdout stream.
TRANS structure
The only ordinary information sent to stdout is a description of the TRANS structure, which converts coordinates in the file A to those in file B. A TRANS structure describes the transformation between coordinates (x, y) and (x', y'):
If the user has chosen the 'linear' option (which is the default),
x' = A + Bx + Cy y' = D + Ex + Fy
If the user has chosen the 'quadratic' option,
x' = A + Bx + Cy + Dxx + Exy + Fyy y' = G + Hx + Iy + Jxx + Kxy + Lyy
If the user has chosen the 'cubic' option,
x' = A + Bx + Cy + Dxx + Exy + Fyy + Gx(xx+yy) + Hy(xx+yy) y' = I + Jx + Ky + Lxx + Mxy + Nyy + Ox(xx+yy) + Py(xx+yy)
The coefficients A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P are the members of the TRANS structure. In the case of a linear solution, one can interpret these values as follows:
- a, d: describe the translation between the two systems - b,c,d,e: describe rotation and magnification; if there is a good transformation, they should satisfy b = +/-f c = -/+e magnification = sqrt(b*b + c*c); rotation angle = atan(c/b);
Thus, in the example given above, in order to match the coordinates of the items in the first list to those in the second list, one must
There are five additional members which describe the quality of the fit. The program prints out their values in a form like this (it prints out only the terms appropriate for the given order of the transformation):
TRANS: a=0.043180410 b=0.706908898 c=-0.706603954 d=0.053624825 e=0.706348660 f=0.708975359 sig=6.7655e-03 Nr=15 Nm=39 sx=5.8321e-01 sy=2.3939e-01
The final five elements always appear. They are
Note the difference between sig, on the one hand, and sx and sy on the other hand: the former refers to (relatively few) pairs used to DEFINE the transformation, whereas the latter refer to the (many) pairs which appear after items in file A are transformed into coord system of file B.
MEDTF structure
This information is produced only if the user requests it specifically, via the medtf and/or medsigclip options. The MEDTF structure contains information relevant to a particular case of matching: two lists which have the same scale and rotation, and are simply translated. It arises so frequently that there are special routines to measure how well one can match up these lists.
Suppose that two lists can be registered by a simply shift. After registration, how many items in the two lists match --- and how well do their positions agree? We measure the following quantities and then set the corresponding elements of the MEDTF structure:
The MEDTF information is printed to stdout (if requested) in the following format (it all appears on a single line, which I have broken into two pieces for clarity):
MEDTF: mdx=1.000000000 mdy=0.000000000 adx=0.000000000 ady=0.052631579 sdx=0.973328527 sdy=0.686231832 n=19The MEDTF line appears before the TRANS line in the output.
If the user specifies the medsigclip option, like so:
match filea.dat 1 2 3 fileb.dat linear medsigclip=3.0then an extra set of calculations are made after the two sets have been matched:
Strictly speaking, the medtf step requires re-calculating the positions of all objects in list A, after a valid TRANS has been found. Therefore, the user should always specify recalc together with medtf. However, the code will recognize the problem if medtf alone is given on the command-line, and set the recalc option itself. The program will print a warning message to stdout explaining its actions.
Likewise, the medsigclip option presupposes medtf (and, hence, recalc); once again, the program will recognize the situation and set the required flags itself, giving warnings to stdout.
OUTPUT: LISTS OF MATCHED AND UNMATCHED ITEMS
The second type of output is the creation of 4 output files. The files have names with the same base, which may be specified by the user via the "outfile=" option, or defaults to "match". There are 4 files, each of which contains 4 columns of numerical information:
column 1: an ID number, unique to each list column 2: "x" coordinate, in the system of list B column 3: "y" coordinate, in the system of list B column 4: "mag" coordinate
The are 4 files with this format. They are named (by default)
matched.mtA items from list A which had matches in list B matched.mtB items from list B which had matches in list A matched.unA items from list A which had no matches in list B matched.unB items from list B which had no matches in list A
The files "matched.mtA" and "matched.mtB" ought to have the same number of entries. The two files of unmatched items need not have the same number of entries.
If the user has specified the "id1" and "id2" arguments, then the ID numbers in these output files are taken from the specified columns in the input files. Otherwise, the program creates ID numbers which start at 0 in list A, and increase monotonically to N-1, where "N" is the number of stars in list A; the ID numbers then start at N for the first star in list B, and continue to increase to the end of list B.
If you know the relative scale of the two sets of objects, you should use the scale, or min_scale and max_scale, options to narrow down the possible matches.
As mentioned above, if the uncertainty in positions is larger than a few units of the input coordinates, divide all the input coordinates by some constant factor.
In the very common case that the two sets of objects have (nearly) identical scale and rotation, and little translation, use the identity option. This will fail if there is an offset of more than the matchrad units between the lists; if you know the offset even roughly, use the xsh= and ysh= options to improve the changes of finding a good match.
If the input are lists of stars from astronomical FITS images, one might use the FITS header keywords to determine the relative scale, rotation and/or translation, create a file with the appropriate TRANS coefficients, and then run the program with the intrans= option.
The algorithm creates a set of all the triangles which can be formed from the 'nobj' brightest members of each list: this set grows as 'nobj' to the sixth power (!), so that even a small increase in 'nobj' can lead to large increases in run time. The 'recalc' option can help: one may create two very large lists of candidates -- say, 200 items in each -- and then set 'nobj' to some small subset of the total. If all goes well, the 'nobj' candidates will provide a rough transformation without taking too long. That transformation may be applied to all 200 objects in each list, and then a better plate solution can be calculated, quickly, using all the objects which match.
For reference, given lists of 525 (detected in an image), and 1338 (from the Guide Star Catalog) stars, and asked to use 40 stars from each list in the matching process, 'match' took the following times to run: MM:SS
Compaq Aero Intel 486/25 4 Meg RAM 10:17 PC clone Intel Pentium 90 32 Meg RAM 0:23 PC clone Intel PentiumII 233 128 Meg RAM 0:02 Sun 4/50 4.2 MFlop ? 16 Meg RAM 0:09 SGI 150 MHz IP22/R4400 192 Meg RAM 0:01
It is possible to get a lot of information on the program's progress by setting the "#define DEBUG" flag in the source code. Setting the "#define DEBUG2" flag yields a LOT of information.
Be careful to ensure that the two lists DO include stars in common, within the 'nobj' brightest in each. If one list includes stars much brighter than the other, the 20 brightest (for example) might not appear at all in the other list ... in which case the matching algorithm will fail.
The program does the best it can -- it may report a TRANS even if a good match does not exist. To verify that the match is a valid one, make sure that the output ".mtA" and ".mtB" files contain more than 1 or 2 entries. If the TRANS coefficients don't satisfy b=+/-f and c=+/-e, it is likely that any reported matches are spurious.
Look at the file "atpmatch.h" for some constants that affect program operation. The #define'd constant AT_MATCH_REQUIRE, for example, sets the minimum number of points in each list which must exist.
It may be possible to improve the program's performance on some particular sort of data by tweaking some of the #define'd values in "atpmatch.h". In particular,
#define AT_MATCH_PERCENTILE 0.65 #define AT_MATCH_NSIGMA 2.5control an interative sigma-clipping loop which discards the worst stars in the current set of matches. A low value of AT_MATCH_NSIGMA will cause the loop to iterate many times, discarding lots of stars, even if they aren't much worse than the average. A low value of AT_MATCH_PERCENTILE will do the same, by lowering the percentile used as a stand-in for standard deviation and causing more pairs to qualify as "bad".
Another way to improve the performance on a particular dataset is to fiddle with the "max_iter" and "halt_sigma" arguments. It may be a good idea to start with very low values for "max_iter", and large values for "halt_sigma", and look at the solutions by #define'ing DEBUG in "atpmatch.c". One may then increase "max_iter", and decrease "halt_sigma", and see how the solutions improve.
The two lists of stars must have rectilinear coordinates for this program to find a transformation connecting them. Note that (RA, Dec) is, in general, NOT such a system. It is usually necessary to project (RA, Dec) onto a plane tangent to the sphere at the center of the field. However, for a field very close to the celestial equator, (RA, Dec) can be close enough to rectilinear for matching purposes.
The matching procedure works best if the magnitude of the two coordinate systems is not vastly different. That is, these two sets of objects, with a scale factor of 1.4, are easily matched:
list A list B ( 50, 100) ( 70, 140) ( 20, 60) ( 28, 84) (-30, 40) (-42, 56)
whereas these two sets, with a scale factor of 140000, might fail to be matched:
list A list B ( 0.00050, 0.00100) ( 70, 140) ( 0.00020, 0.00060) ( 28, 84) (-0.00030, 0.00040) (-42, 56)
It may help to multiply one set of values by a constant, so that they are approximately the same magnitude as the other set. It may also help to add a constant to one set so that it has roughly the same origin as the other set.
One often wants to match objects detected in an image with those listed in a catalog. The detected objects usually have coordinates from the detector, (row, col) or (x, y), which are really projections of the celestial position onto a plane tangent to the celestial sphere, centered near the middle of the image. The catalog objects, on the other hand, usually have coordinates in Right Ascension (RA) and Declination (Dec) which are in the spherical system. In order to match the two sets of coordinates, it helps to project the catalog positions onto a plane, tangent to the celestial sphere at the center of the image. The two sets of projected coordinates may then be fairly compared.
One can find the relationship between spherical coords (RA, Dec) and their projected equivalents (xi, eta) in any good book on spherical trigonometry, such as Smart's Textbook on Spherical Astronomy. One source for this information on the Web is a paper published a few years ago, now available from the Astrophysics Data Systems archive, Astrometry of Single-Chord Occultations: Application to the 1993 Triton Event by Olkin et al., Publications of the Astronomical Society of the Pacific, volume 108, page 202 (1996). See especially the material in section 3.2, on page 204 of the paper.
The project_coords program converts (RA, Dec) coordinates into the coordinates on a tangent plane, called (xi, eta) by convention. The input must be an ASCII list of objects, with one line per object. Each line must have columns of data, two of which must be the RA and Dec in decimal degrees (not sexigesimal HH:MM:SS notation!). The project_coords program will convert these columns of the input list, but leave the others as-is.
Usage is:
project_coords starfile racol deccol ra dec [asec] [outfile=]where
So, for example, given a catalog file gsc.cat, which looks like this:
GSC_0043_23388 33.87533 -15.22973 13.90 star GSC_0043_23390 33.88923 -15.12091 14.30 starlike_object GSC_0043_23395 33.90756 -15.33952 12.50 star
and given an image of the sky centered at (RA, Dec) = (33.89, -15.20), one might run
project_coords gsc.cat 1 2 33.89 -15.20 outfile=gsc.proj
Looking then at the file gsc.proj, we should find
GSC_0043_23388 -2.470478e-04 -5.188947e-04 13.90 star GSC_0043_23390 -1.297375e-05 1.380382e-03 14.30 starlike_object GSC_0043_23395 2.955624e-04 -2.435100e-03 12.50 star
The values in the RA and Dec columns have been replaced by their xi and eta equivalents projected onto the tangent plane. Note also that the other columns of information are now separated by exactly one space, which might be different from the input spacing.
The (xi, eta) values have units which might be interpreted as radians from the tangent point of the projection. Astronomers might recall that one radian is about 206,265 arcseconds, and likewise one arcsecond is about 4.85 x 10^(-6) radians. If the asec option is given, then these values are converted from radians to arcseconds before being printed.
After project_coords has converted the coords of a catalog list to (xi, eta), one might run the match program on this converted list, and the list of objects detected in the image.
As mentioned in the section on project_coords, above, one often projects catalog (RA, Dec) onto a tangent plane before matching against objects detected in an image. The match program can be used to find a good transformation which takes the (row, col) coordinates of detected objects into the (xi, eta) coordinates of catalog objects. Afterwards, one may wish to
and then
The apply_match program performs both these steps, transforming a list of detected objects with (row, col) coordinates into a list with (RA, Dec) coordinates. The input must be an ASCII list of objects, with one line per object. Each line must have columns of data, two of which must be the position in (x, y) coordinates of the detector. The apply_match program will convert these columns of the input list, but leave the others as-is.
Usage is:
apply_match starfile xcol ycol ra dec linear|quadratic|cubic a b c d e f [g h i j h k [l m n o]] [outfile=]where
So, for example, suppose we have an ASCII file of objects detected in an image called stars.cat:
1 2.30 585.68 2462 10.63 14.575 0.033 2 1.71 1864.68 2462 9.51 15.917 0.108 3 3.94 1400.51 2462 10.84 16.024 0.119where the second and third columns contain the (row, col) positions of each object. We use this as an input to the match program, comparing it to a subset of a catalog extracted at (RA, Dec) = (33.89, -15.20), and find the following linear TRANS structure:
TRANS: a=0.043661422 b=0.707484673 c=-0.707367056 d=0.077633855 e=0.708066513 f=0.706880146
We can then convert all the (row, col) coords of objects into (RA, Dec) by running
apply_match stars.cat 1 2 33.89 -15.20 linear \ 0.043661 0.707484 -0.70736 0.077633 0.708066 0.706880
which prints to stdout a list like this:
1 318.81225 43.19368 2462 10.63 14.575 0.033 2 318.63845 43.05146 2462 9.51 15.917 0.108 3 318.70546 43.15008 2462 10.84 16.024 0.119
Now the second and third columns contain the transformed and de-projected coords of the each objects in (RA, Dec).
Note that the spacing of the columns may be modified: the program leaves a single space between columns.