A pipeline for reducing TASS Mark IV data

Revised: Feb 17, 2003; Mar 15, 2003; Apr 15, 2003; Apr 24, 2003; May 16, 2003; Aug 02, 2003; Mar 27, 2005

Tom Droege has distributed a number of CD-Roms full of images taken by the TASS Mark IV prototype cameras. I have tried to write a pipeline to process the contents of an entire CD-Rom in an automatic fashion. The pipeline is designed for a computer running Linux (or some allied *nix-like OS). I provide source code for all the pieces, but no executables; you must build each piece on your system.


Auxiliary software packages

The pipeline uses several auxiliary pieces of software: in addition to the "pipeline" TCL scripts themselves, it needs the "match", "photom", "xvista", and "bait" packages, plus a TCL interpreter, the GNU Scientific Library (required by "photom"), and optionally Perl (used to speed up the refcat step).

You will also need one (or two) reference catalog(s) of stellar positions and magnitudes. I have created two subsets of the Tycho-2 catalog which might be used for the purpose, though they are far from ideal as photometric references. I'd be happy to replace them with a better reference catalog, if someone can point one out to me.

Big version: lots of stars, mediocre photometry
The big version contains almost 1.5 million stars, which improves the chances of getting enough stars in any random Mark IV field to yield a good astrometric solution. However, the Tycho magnitudes become very poor at the faint end, where most of the stars are.

Small version: fewer stars, better photometry
The small version contains about 361,000 stars, averaging about nine stars per square degree. The limits on acceptable photometry from the Tycho-2 catalog are much tighter, which makes this a much better photometric reference.

If you want to try the pipeline on some real astronomical data, you can use a small set of data collected by Tom Droege on April 14, 2001. I have used this subset of 12 pairs of images as a testbed in developing the pipeline, so the default parameter values work reasonably well with it. The testbed data volume is about 200 MB, so think carefully before you start to download!


Incredibly brief cookbook

Hardy souls may follow these steps. A step-by-step guide, step_by_step.html, shows the shell commands needed to download, unpack, and run the code.

  1. Create a top-level directory on a disk with at least 2500 MB of free space. We'll call this topdir for future reference.
  2. Download all 5 software packages (pipeline, match, photom, xvista, and bait) into 5 different sub-directories of topdir.
  3. Build each package. This requires you to go into each directory and type ./configure; make. If you wish, you can verify that the "photom" and "match" packages built properly by typing make check, which will run a small test of the executables. Remember that the "photom" package requires the GNU Scientific Library, so you'll have to install that first.
  4. Create 2 sub-directories under topdir: an input directory to hold the raw image data and the Tycho reference catalog, and an output directory for all the processed images and other products.
  5. Put all the "testbed" raw image data into the input directory. You can either copy files from the Mark IV Data Disk 24, if you have a copy, or download them from http://spiff.rit.edu/tass/small_input/ Remember, each image is 8 MB in size, and there may be quite a few in the input. This step will take a while...
  6. Put the Tycho-2 astrometric and photometric reference catalog (or catalogs, if you choose to use one file for astrometric and a different file for photometric reference) into the input directory.
  7. Go to the pipeline sub-directory. Check the values in the file setup.param -- you may need to change some directory names.
  8. Go to the pipeline sub-directory. Start up a TCL shell. Cross fingers. Type source markiv_driver.tcl
  9. If all goes well, within several hours, you should have a large set of cleaned images and data files in the output directory. The three most important files are the ones with names that start with "M" and end with extensions ".cal", ".list", and ".param".


How the pipeline works

The pipeline starts with a set of FITS images taken by a TASS Mark IV camera, and (if all goes well) produces a set of ASCII text files with the positions and magnitudes of all stars detected in the images. There are many steps between the start and end. I've tried to write the pipeline so that one can skip parts which aren't of interest; that is, one might run only the image reduction routines, or only the calibration routines.

Here are the major steps:

  1. set up the directories used on a system
  2. generate a list of images and image types
  3. create reference catalogs for the night
  4. create a master dark image
  5. create a master flatfield image (and mask)
  6. clean the target images
  7. deal with the sky
  8. detect and measure stars in each image
  9. calibrate the positions of stars
  10. collate the lists of raw magnitudes
  11. calibrate the magnitudes of stars


Overview of the TCL pipeline

A set of TCL (which stands for "Tool Command Language") scripts act to glue together the various software packages needed to reduce a set of astronomical images completely. For the most part, each TCL source code file contains routines which deal with one step of the reduction process. The TCL may call external Unix commands to do some of its work (for example, sort or cp), and it may call programs from some of the other packages to do some of its work (for example, photom or stars). Many of the TCL source code files have matching "parameter" files:

      setup.tcl         setup.param
      astrom.tcl        astrom.param
A few of the TCL source code files contain general-purpose library routines; they have no corresponding parameter files.

Each parameter file is a simple, ASCII text file which specifies key/value pairs. For example, the "setup.param" file looks something like this:

# Parameters used by the "setup" script

# directory containing the raw image files
input_dir       /data/testbed/input

# directory in which all intermediate and final output files go
output_dir      /data/testbed/output

Users can modify the pipeline's actions by editing these text files; in many cases, there will be no need to modify the source code to the TCL scripts, or to the external commands.
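
To make the format concrete, here is a minimal sketch of how one might read such a key/value file into a TCL array. The proc name read_param_file is hypothetical; the pipeline's actual parser may differ:

    proc read_param_file {filename arrname} {
        # read key/value pairs from a param file into the named array,
        # skipping blank lines and "#" comments
        upvar $arrname params
        set fp [open $filename r]
        while {[gets $fp line] >= 0} {
            set line [string trim $line]
            if {$line eq "" || [string index $line 0] eq "#"} {
                continue
            }
            # the first word is the key; the rest is the value
            set params([lindex $line 0]) [lrange $line 1 end]
        }
        close $fp
    }

    # example:  read_param_file setup.param setup
    #           puts $setup(input_dir)      => /data/testbed/input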

The pipeline is designed as a single "driver" routine and a set of sub-routines. One may control the actions of the driver by modifying the file markiv_driver.param, which looks in part like this:

# flags to execute (or not) specific pieces of the pipeline
#     1       means execute it
#     0       means skip it
do_setup           1
do_make_list       0
do_refcat          0
do_make_dark       0
do_make_flat       0
do_ccdproc         0
do_sky             0
do_stars           0
do_astrom          0
do_collate         0
do_photom          0

If a value in this list is set to "1", then its corresponding sub-routine is executed. If not, the sub-routine is skipped. In the example above, the driver is set to call do_setup, but nothing else.
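
In sketch form, the driver's dispatch logic amounts to the loop below. It assumes the flags have been read into an array called params (a hypothetical name) and that each sub-routine shares the name of its flag, as the example above suggests; the real markiv_driver.tcl also performs other setup:

    # a sketch of the driver's dispatch; assumes params() holds the
    # flags from markiv_driver.param
    foreach stage {do_setup do_make_list do_refcat do_make_dark \
            do_make_flat do_ccdproc do_sky do_stars do_astrom \
            do_collate do_photom} {
        if {$params($stage) == 1} {
            # call the sub-routine with the same name as its flag
            $stage
        }
    }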

One runs the main driver by

  1. starting a TCL shell in the pipeline directory
  2. typing source markiv_driver.tcl

The markiv_driver.tcl script does a bit more than just call the desired sub-routines. It also sets up some global variables which are used by several of the sub-routines. One may cut down on the amount of typing needed if running the pipeline repeatedly by executing the TCL command

           proc go {} { source markiv_driver.tcl }
For the remainder of the TCL session, one may execute the driver simply by typing "go".


Setting up the directories used on a system

The first time this software is installed on a new system, several parameters will have to be modified: the directory containing the raw input image files, the directory which holds reference catalogs, the directory into which all output should be placed, the directories holding external executable programs, and the latitude and longitude of the observations. Default values for all these parameters are supplied in the distribution kit, but they will very probably need to be changed.

  1. Edit the file setup.param to set the directory names therein.
  2. Set the do_setup value to "1" in the markiv_driver.param file.
  3. Within a TCL shell, execute
             source markiv_driver.tcl
    

The setup procedure checks to make sure that the proper versions of the match and photom programs have been installed. If the versions are too old, it will complain and tell you the versions it requires to run.

After having run the setup procedure once, the directory names in all the other parameter files should be set to their new values. One may then change do_setup to "0" in the markiv_driver.param file.

The parameter skip_bad in the setup.param file indicates whether frames with properties which fall outside user-defined limits should continue to be processed anyway, or should be marked as "bad" and ignored from then on.


Generating a list of images and image types

The script make_list.tcl contains routines which read information from the raw Mark IV CCD images on a CD-Rom and generate an ASCII text file with a list of the images.

The output of this routine is an ASCII text file, placed into the output directory. The first few lines contain "header" information; they begin with a comment character, "#". Any line which starts with this character will be ignored by most of the pipeline processing; you can add notes to the file, or temporarily remove images from processing by inserting a "#" at the start of their lines.

#  software {pipeline 0.3} {match 0.6} {photom 0.6} {xvista 0.1.2} {bait 0.1}  
#  run on Thu Apr 24 18:37:42 EDT 2003 
The first line shows the version of each package used to reduce the data. The second line provides the date and time when the pipeline was initially run.

After that, there is one line per image, with a format like this:

hvra2604797.fits V 2452605.79772 60.2 object 79.23 -1.71 1933 38.08 3.94 1160 -8.977 -0.024 1
hvra2604799.fits V 2452605.79905 60.3 object 79.25 -1.04 1904 33.55 3.87 1233 -9.051 -0.024 1
hvra2604800.fits V 2452605.80037 59.8 object 79.27 -0.47 1911 34.04 3.77 1320 -9.046 -0.024 1
where the columns are

  1. image file name
  2. filter
  3. Julian Date of exposure
  4. exposure time (seconds)
  5. image type: "dark", "object", or "flat" (taken from the IMAGETYP value in FITS header, and translated into a purely lowercase string)
  6. approx. Right Ascension (decimal degrees) (updated later)
  7. approx. Declination (decimal degrees) (updated later)
  8. sky value (initially zero, set later)
  9. skysig value (initially zero, set later)
  10. average Full Width at Half Maximum (FWHM) for frame, in pixels (initially zero, set later)
  11. number of stars in image (initially zero, set later)
  12. photometric zeropoint (initially zero, set later)
  13. photometric color term, which will be the same for all images in some passband for a run (initially zero, set later)
  14. "include" flag: if 1, keep processing this image; if 0, ignore it

If one is using a dataset which doesn't contain all the required information in the FITS header -- that is, which doesn't have RA, Dec, exposure time, etc. -- then one can simply create an ASCII text file with the proper information by some other means. One might type the information in by hand, for example, or use some other little program to put an observing log into this format.

Several of the procedures further down the pipeline will set or update some of these values. The do_astrom procedure, for example, will change the RA and Dec values based upon stars found in the image. The do_photom procedure will modify the values of the photometric zeropoint and color term for each image.

The include flag is initially set to "1" for all images, meaning "include this image in all processing." If the user wishes, he may set parameters in some of the other param files which place limits on certain properties. For example, in the sky.param file, the user may set limits on the background sky level in images in each filter. If an image exceeds the limit, its include field may be set to "0", indicating that it should be ignored in all subsequent processing. The include values of bad frames are changed to "0" only if the skip_bad parameter in the setup.param file is set to 1. That is, only if the user sets skip_bad to one, AND if a frame's properties fall outside the acceptable boundaries, will it be skipped.
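
As an illustration, the sketch below (a hypothetical proc, not the pipeline's actual code) reads the image list while honoring the "#" convention and the final include flag:

    proc read_image_list {filename} {
        # return the lines for images which are still marked "include"
        set images {}
        set fp [open $filename r]
        while {[gets $fp line] >= 0} {
            set line [string trim $line]
            if {$line eq "" || [string index $line 0] eq "#"} {
                continue
            }
            # the "include" flag is the 14th (last) column
            if {[lindex $line 13] == 1} {
                lappend images $line
            }
        }
        close $fp
        return $images
    }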


Creating reference catalogs for the night

The script refcat.tcl contains a routine which generates two small reference catalogs of stars which cover the area observed during a night. The idea here is to speed up program execution a bit: a full photometric/astrometric catalog covering the entire sky is likely to be very large. By selecting only the subset of the large catalog(s) which might be needed to reduce a single night's data, one can decrease the amount of time spent in the calibrations.

This routine assumes that there are two reference catalogs available, one for astrometric and the other for photometric purposes; the directories containing these references, and their names, are defined in refcat.param. The function scans the list of images taken during a night to figure out the area(s) needed for reductions. It then selects from each catalog only stars which fall in the area(s) and puts them into files for later use.

One may use the same reference catalog for both purposes. The pipeline will always create two separate subsets, even if it is given a single reference source.

Note that the refcat procedure may take a significant length of time to run. On my test machine (Intel 633 MHz CPU, 512 MB RAM), it takes about 20 minutes (!) to scan through a big reference catalog (the big subset of the Tycho-2 catalog mentioned above, with about 1.5 million stars) and pick out the several hundred stars needed to reduce a set of 4 target fields. Therefore, I have added a Perl script, refcat.pl, which duplicates the action of the refcat.tcl TCL script. The Perl script runs about four times faster on my machine. The pipeline will first attempt to run the Perl script; if it fails, then the pipeline tries the TCL script. This is the only use of Perl in the pipeline so far.


Creating a master dark image

The script makedark.tcl creates "master" dark frames for each camera, and for each exposure time. In a typical Mark IV observing program, this might mean 4 "master" darks: one for each combination of the two cameras and two exposure times.
The routine looks at the list of exposures created by the make_list procedure to determine the number of master darks needed, and which raw darks should be used to create each.

Note that Mark IV data disks created in late 2000/early 2001 by Tom Droege have a small problem with the exposure times recorded in the FITS headers. Exposures are supposed to have a few common, round exposure times -- for example, 100 seconds for all images during the night. However, the data acquisition program records the exposure time with a small random offset: say, 99.4 seconds for one image, 100.4 seconds for the next. As far as we can tell, the actual exposure times are probably exactly 100.0 seconds. Therefore, the current pipeline rounds the exposure times recorded in the FITS headers to the nearest 10 seconds, and uses those rounded times for all subsequent processing.
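
A minimal sketch of that rounding (hypothetical proc name):

    proc round_exptime {exptime} {
        # round to the nearest 10 seconds: 99.4 -> 100, 100.4 -> 100
        return [expr {10 * round($exptime / 10.0)}]
    }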

This procedure calls the XVista program median to combine a set of raw dark frames into a single master dark frame. The median program looks at the values in all the input images for a given pixel (say, pixel [50, 100]). It sorts these input values and selects the median: if there are an odd number, 2*N + 1, of input values, it chooses value (N + 1); if there are an even number, 2*N, of input values, it chooses value N. It then places this median value into the given pixel of the output, "master" dark frame.
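
In sketch form, the selection rule for one pixel position looks like this; the real median program, of course, operates on entire images at once:

    proc pick_median {values} {
        # sort the input values and select the median as defined above
        set sorted [lsort -real $values]
        set n [llength $sorted]
        if {$n % 2 == 1} {
            # odd count 2*N+1: choose value (N+1), which is index N
            return [lindex $sorted [expr {$n / 2}]]
        } else {
            # even count 2*N: choose value N, which is index N-1
            return [lindex $sorted [expr {$n / 2 - 1}]]
        }
    }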

The output "master" dark frames have names with three pieces, separated by underscores:

      prefix_V_15.fts
where

In the course of later reductions of target images, the pipeline will search for a "master" dark frame which matches a target image's camera and exposure time. It will then subtract that "master" dark from the target image as a first step in the reductions.


Creating a master flatfield image

The script makeflat.tcl contains a routine which creates a "master" flatfield image and a "master" mask of bad regions for each camera. There are two ways to generate a master flatfield, depending on the value of a parameter in the make_flat.param file.

twisky flats:
Combine (slightly processed) target frames to generate a "night sky flatfield." Unless there are many different fields, it is likely that the resulting image will contain residuals at the positions of stars in some of the frames. Set the twisky_flats parameter to "1" to select this option.

lightbox flats:
Combine special images of a grey card or uniformly illuminated screen. Set the twisky_flats parameter to "0" to select this option; of course, you'll also need to take a special set of flatfield images. The flatfield images must have
            IMAGETYP='flat'
in the FITS header to be recognized automatically.

Choosing the proper frames for night-sky flats can be tricky. The make_flat procedure uses the following rules:

Once a set of input images has been chosen, each image is preprocessed to remove the dark current, in the following manner (see the routines in sub_dark.tcl):

The XVista program sub is called to subtract the master dark frame from each target image.

The basic idea is that the temperature of the chip may have changed slightly between the time the dark frames were taken and the time the target frames were taken. We assume that any shift in temperature causes the dark current to change by an additive constant; in other words, we assume that the pattern of the dark current remains the same, and only the average value shifts up or down. Some of the pixels on a CCD chip are covered, so that light cannot strike them; one can use these pixels to monitor the dark current in the chip, in both dark frames and target frames. We use the values in a set of such edge columns to tell us if any shift has occurred.

The file ccdproc.param has a parameter, called dark_cols, which denotes the columns to use as these edge columns. This example will result in a single column being used to monitor dark current:

dark_cols     2 
whereas this example will use the average of three columns:
dark_cols     2  3  4

For images in Tom Droege's Disk Set 24 (written late 2002), it looks like the overscan columns between 2051 and 2055 are good choices.

If one does not wish to shift the master dark frame, but instead simply to subtract it from each target frame, one should set the dark_cols value to "-1", like so:

dark_cols    -1
This special value will cause the master dark to be subtracted as-is from the target frame, with no shifting.
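
Putting the pieces together, the logic might be sketched as below. The proc mean_of_cols is a hypothetical helper which averages the pixel values in the given columns of an image; in the pipeline, the actual subtraction is performed by the XVista sub program:

    proc dark_shift {target dark dark_cols} {
        # compute the additive shift ("delta") in dark current between
        # a target image and the master dark, using the edge columns
        if {$dark_cols == -1} {
            # special value: subtract the master dark as-is
            return 0.0
        }
        set t [mean_of_cols $target $dark_cols]
        set d [mean_of_cols $dark $dark_cols]
        return [expr {$t - $d}]
    }
    # clean pixel value = (target pixel) - (dark pixel + delta)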

After all the target frames have had the dark current subtracted, they are run through the XVista program median to produce a "master" flatfield image. Recent versions of the pipeline do not use the true median value of all pixels at a given position, but instead the interquartile mean (i.e. the mean of all pixel values between the 25th and 75th percentiles).
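
A minimal sketch of the interquartile mean for a single pixel position (again, the real program works on whole images, and several input values are assumed):

    proc interquartile_mean {values} {
        # average the sorted values lying between the 25th and 75th
        # percentiles; assumes at least a few input values
        set sorted [lsort -real $values]
        set n [llength $sorted]
        set lo [expr {$n / 4}]
        set hi [expr {(3 * $n) / 4}]
        set sum 0.0
        for {set i $lo} {$i < $hi} {incr i} {
            set sum [expr {$sum + [lindex $sorted $i]}]
        }
        return [expr {$sum / ($hi - $lo)}]
    }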

After the master flatfield has been created, a copy is made and trimmed to the same size that target images will be trimmed. The XVista program mask is used to find "bad regions" in the flatfield image. A "bad region" is a set of connected pixels which are all far from the local mean. One can control the mask program by modifying the parameters mask_gridsize, mask_minsig and mask_minpix in the make_flat.param file. Each master flatfield has a matching mask file, with the same name but extension .msk. The mask file has a list of bad regions with the following format:

    808    130     156   172    999  1016  
   1152   3586     232  2023   1711  1713  
   1295    190     272   287   1204  1239  
The columns are
  1. region ID. Because many regions are found, but fail to meet the criteria for acceptance, the surviving regions usually have big gaps in ID values.
  2. number of connected bad pixels in the region
  3. minimum row of all bad pixels in the region
  4. maximum row of all bad pixels in the region
  5. minimum col of all bad pixels in the region
  6. maximum col of all bad pixels in the region

In the detection step below, any stars which fall into the bounding box of a bad region are marked as suspect.


Cleaning the target images

The script ccdproc.tcl contains routines which "clean" each target image: they subtract the dark current, divide by the flatfield, and trim the edges.

See the notes above on subtracting dark current, which employs the XVista program sub.

In the flatfielding step, the division is carried out by the XVista program div in the following manner

                        (pixel value in target) * (average flatfield value)
   clean pixel value =  ---------------------------------------------------
                                (pixel value in flatfield)

This is equivalent to normalizing the flatfield value to have an average value of 1.0, and then dividing by the normalized flatfield image. Since XVista can't handle floating-point pixel values (only 16-bit integer values), it resorts to this slight rearrangement of the arithmetic.
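
In other words, for each pixel (a sketch of the arithmetic, not the actual div code):

    proc flat_pixel {target_pix flat_pix mean_flat} {
        # multiply before dividing, so that 16-bit integer pixel
        # values don't lose precision to truncation
        return [expr {round(double($target_pix) * $mean_flat / $flat_pix)}]
    }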

After each image has gone through the dark-subtracting and flat-fielding stages, it is trimmed to remove the rows and columns at the very edges, which are often full of garbage values. This will reduce the number of spurious stellar detections reported in the subsequent processing. The ccdproc.param file contains a set of keywords which specify the area on the chip which should be retained after trimming:

# these positions define a box within the raw image which contains
#   "good" sky data.  The values start at zero (first row is 0, not 1)
#   The "end" position is inclusive, so
#          start_row  10   end_row 20    
#   means a trimmed region with 11 (not 10!) rows
#          
trim_start_row      6
trim_end_row     2034
trim_start_col      6
trim_end_col     2037

After the ccdproc procedure is finished, the output directory should contain nice, clean versions of each target image. The FITS header of each clean image contains a set of COMMENT lines describing the processing:

COMMENT   subtracted master dark file /data/testbed/output/master_dark_V_15.fts 
COMMENT     with delta 3.28 on Sat Jan 20 18:19:20 EST 2001                     
COMMENT   do_flatfield: div by master flat /data/testbed/output/master_flat_V.ft
COMMENT   do_flatfield: finished Sat Jan 20 18:19:22 EST 2001                   
COMMENT   trimmed sr=6 nr=2029 sc=6 nc=2032                                     
COMMENT      by ccdproc on Sat Jan 20 18:19:23 EST 2001                         

The original, raw data files in the input directory are not modified in any way.


Measuring and (optionally) removing the background sky

The script sky.tcl and param file sky.param control this simple section of the pipeline. It does only two things.

First, it always measures the background sky level in each image. The XVista program "sky" is used to form a histogram of pixel values, and then to fit a gaussian to the peak of this histogram. The location of the gaussian's peak is then defined to be the "sky" value, and the width of the gaussian defines the "skysig" value. Both of these values are written into the make_list output file for later reference.

Second, the user may choose to subtract a model of the background sky by setting the subtract_sky parameter to "1". If he does so, he can use additional parameters to control the type of model (a constant value, or one with linear gradients, or one with quadratic terms), and, if desired, have a copy of the fitted model written to a disk file for debugging. Subtracting a model for the background may leave a nearly uniform level across the frame, which causes the star-finding algorithm to detect stars more evenly across the entire image.

In addition, there are parameters which can be used to mark a frame as "bad":

# Maximum acceptable values for the sky and skysig, in each filter
#    The values are counts, after dark has been subtracted.  So a value
#    of 0 would mean "absolutely no light from the sky," and a value
#    of 1000 would mean "1000 counts above the dark level".
#    Each filter has its own value.
#    Used by the "check_sky" routine
max_sky { V 2500 } { I  5000 }
max_skysig { V 200 } { I 200 }

Any frame which fails to fall within the acceptable range(s) is at risk for being discarded (depending on the skip_bad value).


Detecting and measuring stars in each image

The script do_stars.tcl contains routines which detect and measure the properties of stars in each target image. The TCL code calls external programs from the XVista package to do all the real work. There are lots of parameters to control the finding and measuring, and the user may need to modify some of the values in stars.param for each particular night's data. We can divide this portion of the pipeline into three pieces:

The first step is to calculate an appropriate sky level for each image.

Yes, this was done in the previous stage of the pipeline, but may need to be repeated if a model for the sky was subtracted. So, we do it again in every case.
The XVista program sky does this job. In most cases, the distribution of pixel values in the background sky is broad -- at least several tens of ADUs. Some CCDs (including those in some of the Mark IV units) have minor irregularities in the analog-to-digital (A/D) conversions which produce a pattern in the distribution of pixel values; for example, there may be an excess of pixels with values evenly divisible by 4 (pix mod 4 = 0), and a deficit of pixels with values one greater (pix mod 4 = 1). The skybin parameter in the sky.param file controls the binning factor used in calculating a sky value; a binning factor larger than the period of any A/D pattern will smooth the pattern out and yield sensible values for both the mean sky level and the width of the distribution.

The second step is to detect stars. The XVista program stars does this job, using the XVista variables sky and skysig determined by the first step. The stars program is designed to follow the algorithm for detecting stellar peaks described in Peter Stetson's 1987 paper on DAOPHOT. Each candidate peak in the image is measured in a few simple ways, and only candidates whose characteristics fall within ranges set by the user in the stars.param file are accepted. This step produces a file in the output directory with a name like

       hira2011799.fits.coo
where the first part of the name is that of the image, and the extension is .coo (short for "coordinates"). The file contains one line per detected star, with a format like this:
    1   17.07   1552.08    8845    2.589    -0.058    0.759 
    2   25.97   1588.19    4923    2.391    -0.151    0.743 
    3   35.81   1339.97    2924    2.287     0.029    0.810 
where the columns contain
  1. a star ID number
  2. the row position (pixels)
  3. the column position (pixels)
  4. the peak value (ADU above sky level)
  5. estimate of Full-Width at Half-Maximum (FWHM) (pixels)
  6. "roundness" parameter (see below)
  7. "sharpness" parameter (see below)

The "roundness" is a measure of the symmetry of the stellar image. It is defined as

                      xwidth - ywidth
   roundness =  2 * (-----------------)
                      xwidth + ywidth

The "sharpness" parameter is similar to, but not exactly the same as, the sharpness parameter in DAOPHOT. Here, it is defined as

  (image value at star centroid) - (mean of image values around centroid)
  -----------------------------------------------------------------------
                    (image value at star centroid)
So, a value close to 1 indicates a very sharp peak (possibly a cosmic-ray hit or chip defect), while a value close to 0 indicates a very soft peak.
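
Expressed as code (a sketch; the widths, the centroid value, and the surrounding mean are assumed to be measured elsewhere):

    proc roundness {xwidth ywidth} {
        # 0 for a circular image; non-zero for an elongated one
        return [expr {2.0 * ($xwidth - $ywidth) / ($xwidth + $ywidth)}]
    }

    proc sharpness {centroid_val mean_around} {
        # near 1 for a single sharp pixel, near 0 for a soft peak
        return [expr {($centroid_val - $mean_around) / double($centroid_val)}]
    }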

The third step is to measure the brightness of the detected stars. The XVista program phot does this job. It performs simple synthetic aperture photometry on the image, using a circular aperture centered on the star's position. In stars.param, the user may specify the radius of a single aperture in pixels:

apers      { V { 8 } }  { I { 6 } }
or several apertures
apers      { V { 8 10 12 } }  { I { 6 7 8 } }
Note that most of the values in the stars.param file come in keyed lists, with values associated with a filter name. The apers parameter is the only one which may contain a list within each of the keyed lists.

The user can also specify the inner and outer radii of an annulus used to measure a local sky value for each star, and properties of the CCD. Using these values, the program calculates the uncertainty in the estimated magnitude for each star. If the peak of the star (including sky) is greater than the saturate parameter, then the "quality flag" for the star is OR'ed with 1.

The phot program calculates the noise in a measurement by adding together the readout noise, noise due to the star's electrons, and noise due to the sky's electrons. By default, it uses the sky value and the gain to determine the number of electrons contributed by the sky. However, if the user has subtracted the sky from an image before measuring stars, then the program will find a very low sky value, and so underestimate the sky contribution to the overall noise. If the user specifies

# 1 means "yes, use the empscatter option"; 0 means "no, do not use it"
    empscatter         1
in the stars.param file, phot will take a different approach: it will use the scatter of values in the sky annulus to estimate empirically the noise in the background sky.

The phot program produces one output file per image, with a name like:

       hira2011799.fits.pht
where the first part of the name is that of the image, and the extension is .pht (short for "photometry"). The file contains one line per detected star, with a format like this:
    1   17.07 1552.08   7661  23.92   12.809  0.021  1
    2   25.97 1588.19   7671  19.21   13.263  0.031  0
    3   35.81 1339.97   7704  15.58   14.176  0.070  3
where the columns contain
  1. a star ID number
  2. the row position (pixels)
  3. the column position (pixels)
  4. the local sky value (ADU)
  5. uncertainty in sky value (ADU)
  6. instrumental magnitude in first aperture
  7. uncertainty in instrumental magnitude in first aperture
  8. (very last column) quality flag

If the user specifies more than one aperture, then there are two extra columns per line for each aperture, containing the magnitude and uncertainty thereof. The "quality flag" always comes in the very last column.

The "quality flag" is a bit-wise OR combination of several flags.

          0          no problems with star
          1          star may be saturated
          2          star is close to a bad region
          4          star is close to an edge of the image
          8          star was not detected in a passband
In this context, "close to" means "within an aperture radius of". In the example above, star 1 is possibly saturated, while star 3 is both possibly saturated and close to a bad region. The flag value meaning "star was not detected in this passband" will obviously not be set at this point; it may appear later, when data from several passbands are collected and merged.
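
For example, one might decode the flag like this (a hypothetical proc, for illustration only):

    proc describe_quality {flag} {
        # translate the bit-wise flag into human-readable notes
        set notes {}
        if {$flag & 1} { lappend notes "may be saturated" }
        if {$flag & 2} { lappend notes "close to a bad region" }
        if {$flag & 4} { lappend notes "close to an edge" }
        if {$flag & 8} { lappend notes "not detected in a passband" }
        if {[llength $notes] == 0} {
            return "no problems"
        }
        return [join $notes ", "]
    }
    # describe_quality 3  =>  "may be saturated, close to a bad region"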


Calibrating the positions of stars

The script do_astrom.tcl contains routines to calibrate the positions of stars detected in the images. The result of this step will be a set of star lists with (RA, Dec) positions.

We must perform the following steps for each image:

  1. pick out a subset of detected stars which we'll use for the calibration -- basically, the brightest stars in the field
  2. project the coordinates of these detected stars onto a plane, to facilitate matching. The plane is tangent to the center of the field, based on the image's (RA, Dec) coords. This converts the detected (row, col) into plate coords (xi, eta).
  3. create a subset of reference stars which ought to appear in this image, based on its (RA, Dec)
  4. project the reference stars' coordinates onto a plane, centered at the same (RA, Dec) position. This converts the reference (RA, Dec) positions into plate coords (xi, eta).
  5. attempt to match up the detected stars to the reference stars, via a transformation which includes translation, rotation and scaling
  6. if successful,
    1. apply the transformation to ALL the detected stars, turning their projected (xi, eta) values into new (xi', eta') ... and then de-project these back into (RA, Dec)
    2. write the calibrated (RA, Dec) positions for all detected stars into a disk file

The pipeline calls upon the match programs to do the bulk of the work. In steps 2 and 4, the pipeline invokes the project_coords program. In step 5, it calls the match program. If the matching is successful, the apply_trans program carries out step 6a.
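
The projection used in steps 2 and 4 is the standard gnomonic (tangent-plane) projection. Here is a sketch of the standard formulae (not the actual project_coords code), with all angles in radians:

    proc gnomonic_project {ra dec ra0 dec0} {
        # project (ra, dec) onto a plane tangent at (ra0, dec0);
        # returns the plate coordinates (xi, eta)
        set dra [expr {$ra - $ra0}]
        set d   [expr {sin($dec)*sin($dec0) + cos($dec)*cos($dec0)*cos($dra)}]
        set xi  [expr {cos($dec)*sin($dra) / $d}]
        set eta [expr {(sin($dec)*cos($dec0) - cos($dec)*sin($dec0)*cos($dra)) / $d}]
        return [list $xi $eta]
    }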

There are a lot of tunable parameters to control the actions of the match programs, and a lot of tunable parameters to control the selection of stars from the detected and reference lists. It is very likely that one will have to experiment with the values in astrom.param the first time one reduces data from a particular telescope.

For each image which matches the reference catalog successfully, the pipeline writes a file with a name like:

       hira2011799.fits.ast
where the first part of the name is that of the image, and the extension is .ast (short for "astrometry"). The file contains one line per detected star, with a format like this:
 50  230.26362    0.71553 7744 16.48 9.795  0.002   1
 93  229.61541   -0.46105 7807 24.10 9.835  0.002   0
 78  229.83316    1.76558 7759 15.17 9.857  0.002   3
where the columns contain
  1. the star ID number
  2. the RA position (decimal degrees)
  3. the Dec position (decimal degrees)
  4. the local sky value (ADU)
  5. uncertainty in sky value (ADU)
  6. instrumental magnitude in first aperture
  7. uncertainty in instrumental magnitude in first aperture
  8. quality flag

This file looks just like the .pht file, except that the "row" value has been replaced by RA, and the "col" value has been replaced by the Dec. The stars now appear sorted by instrumental magnitude, and the columns may have slightly different widths on different lines.

For every image which matches the reference catalog, the pipeline will write new, more accurate values for the RA and Dec of each field center into the make_list.out file. The pipeline will also modify the cleaned image's FITS header in the following way:

When matching fails: The raw Mark IV images Tom Droege placed on data disk 15 have (RA, Dec) values which are approximately correct, but differ from the actual field centers by up to 2 degrees. If one uses these (RA, Dec) values, the matching procedures will fail on some images. For the fields in which matching fails, no .ast files are written to disk; and, as a result, no photometric calibration will be done. If one looks at the successful matches among these images, one can figure out the correct (RA, Dec) values for the images which fail. In that case, one can edit the make_list output file by hand to fix these (RA, Dec) values. Re-running the astrom procedure will then yield good calibrations for all the fields.


Collating the lists of raw magnitudes

The script do_collate.tcl contains routines to combine information from the two (or more) simultaneous exposures of each field. The basic idea is to find all stars which appear in both exposures, and write down in one place the positions of those stars and their raw magnitudes in all passbands. This step makes it easier to perform the photometric calibration in the next stage of the pipeline.

By default, stars which are detected in only a single image will be discarded at this stage. However, if the user sets


# if set to 0, then the collate program will discard any star
#              which is only detected in a single passband
# if set to 1, then the collate program accepts stars detected
#              in a single passband and sends them to output
# If you set this to 1, be sure to set the corresponding
#    parameter to 1 in the "photom.param" file, too.

   collate_orphans       1

in the collate.param file, such "orphan" stars will be retained and sent to the output file.

The collate procedure has a few interesting parameters. One is the matching distance between detected objects in different passbands. In the collate.param file, this value is specified in arcseconds:

# this is the distance within which V and I detections must coincide
#   to count as a match.  Units are arcsec
collate_matchrad    3.0

The collate.param file also contains values which must be set to the latitude and longitude of the observatory used to make the observations.

# observatory latitude (decimal degrees, positive North)
#   and longitude (decimal degrees, positive West of Greenwich)
latitude           41.83
longitude          88.33

The collate procedure creates a new set of ASCII text files in the output directory. There is one file for each set of simultaneous exposures, with a name of the following form:

      Mhra2011799.clt

in which the initial letter, M, stands for "matched"; the remainder of the name, hra2011799, is taken from the names of the input files (minus their filter letters); and the extension, clt, indicates this is collated output.

In other words, for a Mark IV unit with two cameras which take simultaneous exposures of the same field, the collate procedure might turn

    hira2011799.fits.ast   (V-band star list with RA,Dec and raw V mag)
    hvra2011799.fits.ast   (I-band star list with RA,Dec and raw I mag)
into
    Mhra2011799.clt   (combined star list with RA,Dec and raw V,I mags)

The output file has one line per star, with a format like this (since the lines are more than 80 characters wide, I have split each line and indented the second half for clarity):

    167    229.77825     1.06227   2451659.84821   1.410   
                                 V 19.256  0.056    I 18.296  0.027    1

     75    229.80549     0.55199   2451659.84821   1.421   
                                 V 18.325  0.024    I 15.749  0.003  257
		      
    107    229.85660     0.20468   2451659.84821   1.429   
                                 V 18.669  0.033    I 17.471  0.013    0

where the columns contain
  1. an internal star ID number
  2. the Right Ascension of the primary (V) detection (decimal degrees)
  3. the Declination of the primary (V) detection (decimal degrees)
  4. Julian Date of exposure
  5. airmass (see below)
    (line is split in example above at this point ...)
  6. name of first passband
  7. raw magnitude in first passband
  8. uncertainty in raw magnitude in first passband
  9. name of second passband
  10. raw magnitude in second passband
  11. uncertainty in raw magnitude in second passband
  12. (very last column) combined quality flags for all passbands (see below)

If there are more than two passbands in each set of simultaneous exposures, there may be additional triplets of columns appended to each line. The quality flag always appears in the last column.

The "quality flag" in the collated output file is a simple combination of the individual flags for each passband. The individual flags are combined via bit-wise OR as follows:

       passband 0      flag is left alone
       passband 1      flag is shifted  8 bits left  (<<8)
       passband 2      flag is shifted 16 bits left  (<<16)
       passband 3      flag is shifted 24 bits left  (<<24)
The combined quality flag is a 32-bit quantity. It is printed as an unsigned integer. In the example above, star 167 is possibly saturated in V-band, star 75 is possibly saturated in both V-band and I-band, and star 107 has no known problems.
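
A sketch of this combination (hypothetical proc; the first flag in the list belongs to passband 0):

    proc combine_flags {flaglist} {
        # OR together the per-passband flags, shifting each by 8 bits
        set combined 0
        set shift 0
        foreach flag $flaglist {
            set combined [expr {$combined | ($flag << $shift)}]
            incr shift 8
        }
        return $combined
    }
    # combine_flags {1 1}  =>  257  (saturated in both V and I,
    #                                like star 75 above)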


Calibrating the magnitudes of stars

The script do_photom.tcl contains routines to turn the raw, instrumental magnitudes calculated for each star in each image into calibrated magnitudes on a standard scale.

The basic idea is to match up stars which were detected in all passbands against stars with known magnitudes in all passbands in a photometric reference catalog. The catalog which was created by the refcat procedure is assumed to have photometric as well as astrometric information. Values in the photom.param file specify which columns in this catalog contain the required information (the positions of the stars and their magnitudes in each passband).

In future versions of the code, there should be a more flexible set of parameters, which allow passbands other than V and I.

The code also uses the files created by the collate procedure, which contain the positions, Julian Dates, airmasses, and raw magnitudes of stars detected in all passbands.

The code matches up detected stars with catalog stars by comparing their RA and Dec values, and accepting as a match any pairs which fall within ref_matchrad arcseconds of each other.

The routines calculate a photometric solution using stars observed in all images during the night; they ignore stars which have non-zero quality flags in any passband. The solution converts instrumental magnitudes into calibrated magnitudes. It uses the difference between raw magnitudes in two passbands (specified by the filt_pairs parameter) to define a raw color for each star. The solution may

  1. ignore extinction completely, solving for a zero-point a(j) for each frame j, plus a single color term b:

         calibrated mag  =  raw mag  +  a(j)  +  b * (raw color)

  2. include differential extinction across each frame, with a user-supplied extinction coefficient k:

         calibrated mag  =  raw mag  +  a(j)  +  b * (raw color)  -  k * airmass

  3. (with lots of extra work by user) perform the usual all-sky solution

The pipeline is designed to operate in one of the first two modes, depending on the value of the fixk parameter in the photom.param file. If the values of this parameter are zero, then the solution does NOT include differential extinction across each image; if the values of fixk are non-zero, then the solution DOES include differential extinction.

If the user wishes to operate in the traditional all-sky mode, he must modify the photom.tcl source code so that the photom program is called with mode=extinct. This is not a good idea for most Mark IV data, but might make sense if the pipeline is used to reduce images taken by some other instrument at a good site.
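
A sketch of applying the solved coefficients to a single star (hypothetical proc name; a_j is the zero-point for the star's frame, and k is zero when fixk is zero):

    proc calibrate_mag {raw_mag raw_color a_j b k airmass} {
        # calibrated mag = raw mag + a(j) + b*(raw color) - k*airmass
        return [expr {$raw_mag + $a_j + $b * $raw_color - $k * $airmass}]
    }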

As described in the documentation for photom, the code typically produces 4 output files. Each file has a stem based on the name of the first collate input file during the night. In this example, the stem is Mhra2011797, because the earliest images of the sky were taken at (truncated) Julian Date 2011.797.

After all the other processing has finished, the do_photom routine makes a copy of the make_list.out file, which now has detailed information about every frame in the run. This copy is given a name with the same stem as the calibrated star list, so that one can easily keep the two files together. In the example shown above, the copy would have the name Mhra2011797.list.

Another summary file is produced at this point, too. There are many parameters used to control the actions of each step of the pipeline. If one wanted to reproduce the reduction procedures exactly, one would need not only the proper version of the software, but also the exact same values of each parameter. In order to enable this exact duplication, the pipeline creates a single big ASCII text file which is the concatenation of all the .param files. This collection of all parameters is given a name with the same stem as the calibrated star list and image information file, but extension .param. Thus, following the example above, this file might be Mhra2011797.param

When the pipeline has finished, one might save for future reference only these files: the calibrated star list (Mhra2011797.cal in the example above), the copy of the image information file (Mhra2011797.list), and the collected parameter file (Mhra2011797.param).


Last modified Mar 27, 2005 by MWR