Testing analysis on an exoplanet dataset

Michael Richmond
May 06, 2018

The TESS mission and the AAVSO are assembling a team of ground-based observers who can perform follow-up studies of exoplanet candidates found by TESS. People who want to join the team are asked to analyze a "practice" dataset, which can be found at the astrodennis.com website. I decided to give it a try.

Contents:

  Step 1: creating calibration frames
  Step 2: applying calibration frames to make science images clean
  Step 3: measuring the brightness of stars
  Step 4: ensemble photometry

Step 1: creating calibration frames

The first step is to create "master" calibration frames. The dataset contains bias frames, 45-second dark frames, 3-second flatfield frames, and a series of 45-second science images of the field of WASP-12.

All raw images are 16-bit integer FITS files, with 1375 x 1100 pixels. The camera was cooled to -20 C for all images.

I used the XVista image processing package to do the work. One of its limitations is that it can only handle 16-bit integer data, in the range -32768 to +32767 counts. In order to prevent some bright stars with pixels between 32768 and 65535 from overflowing, my first step was to divide all raw images (calibration and science) by 2. Therefore, all pixel values quoted below in darks and flats will have values which are half the value one would derive with other software. Since there are plenty of bits of noise due to the sky background, this won't affect the science results to any significant degree.
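
For those following along in Python rather than XVista, a minimal sketch of this halving step might look like the following (astropy and numpy are assumed, and the file names are placeholders):

  # Sketch only: halve a raw image so that values up to 65535 fit into
  # signed 16-bit integers.
  import numpy as np
  from astropy.io import fits

  with fits.open("raw.fits") as hdul:
      data = hdul[0].data.astype(np.float64)            # unsigned values, up to 65535
      halved = np.floor(data / 2.0).astype(np.int16)    # now at most 32767
      hdr = hdul[0].header.copy()
      for key in ("BZERO", "BSCALE"):                   # drop scaling keywords so the
          hdr.remove(key, ignore_missing=True)          #   halved values are written as-is
      fits.writeto("raw_half.fits", halved, header=hdr, overwrite=True)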

The next step was to combine all the bias images into a master bias frame. We'll need that frame, plus the master dark frame, to remove the dark current from flatfield images. Using an interquartile mean algorithm, I created a master bias image. The histogram of pixel values looks like this:

A nice, tight distribution. Fine.
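
For reference, here is a rough sketch of an interquartile-mean combination in Python; it is not the XVista routine actually used, and the "bias*.fits" file pattern is an assumption:

  # Sketch only: average the middle 50% of the values at each pixel
  # over the whole stack of bias frames.
  import glob
  import numpy as np
  from astropy.io import fits

  def interquartile_mean(stack):
      lo, hi = np.percentile(stack, [25, 75], axis=0)
      mask = (stack >= lo) & (stack <= hi)
      return np.sum(stack * mask, axis=0) / np.sum(mask, axis=0)

  bias_stack = np.array([fits.getdata(f).astype(float)
                         for f in sorted(glob.glob("bias*.fits"))])
  master_bias = interquartile_mean(bias_stack)
  fits.writeto("master_bias.fits", master_bias, overwrite=True)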

Next, we need to create a "master dark" frame. What we really need are two master darks: one matching the science frames (45 seconds long), and another matching the flatfield frames (3 seconds long). Since all we have is a set of 45-second darks, we must build a dark-current image which can be scaled to the exposure time of any frame. So, the plan is

  1. create a median of 45-second darks = "master_dark_45"
  2. subtract from this master 45-second dark the master bias, leaving a "45-second dark current" image = "master_current_45"
  3. later, we can scale this to provide the dark current created in a 3-second image (since the flats are 3 seconds long); a sketch of these steps appears after this list
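
Here is a minimal sketch of those three steps in Python (file names are placeholders; the real work was done with XVista):

  import glob
  import numpy as np
  from astropy.io import fits

  # step 1: median of the 45-second darks
  dark_stack = np.array([fits.getdata(f).astype(float)
                         for f in sorted(glob.glob("dark45_*.fits"))])
  master_dark_45 = np.median(dark_stack, axis=0)

  # step 2: remove the bias, leaving the dark current accumulated in 45 seconds
  master_bias = fits.getdata("master_bias.fits").astype(float)
  master_current_45 = master_dark_45 - master_bias

  # step 3: scale to the dark current expected in a 3-second flatfield exposure
  master_current_3 = master_current_45 * (3.0 / 45.0)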

The combined 45-second dark image (= "master_dark_45") has a histogram of pixel values which has a peak at a SLIGHTLY larger value than the master bias:

Subtracting the master bias from the combined 45-second dark image yields our "master_current_45" frame. Its histogram of pixel values has a peak of just a few counts, because this chip doesn't generate very much dark current per second. The figure below compares the histogram of this "master_current_45" to the histogram of the original combination of the 45-second images alone. Note the slightly broader peak in the "master_current_45", due to the random noise in the master bias frame which we subtracted from it.

Finally, we come to the flatfield frames. Since these have an exposure time of 3 seconds, in order to subtract the dark current properly we need to do a bit of algebra. For each pixel in every raw flatfield image, we compute


                               (                                        3    )
 cleaned_flat  =  raw_flat  -  ( master_bias  +  [ master_current_45 * --- ] )
                               (                                       45    )

Then, having cleaned our flats, we combine them with an interquartile mean to create the "master flatfield" image. It turns out to have a mean value of 15164 counts, and looks like this: North is up, East to the left. The greyscale runs from black = 14800 counts (and lower) to white = 15600 counts (and higher).
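
In Python, a sketch of the flat cleaning and combination might read as follows (file names are assumed, and interquartile_mean() is the same per-pixel routine sketched for the master bias):

  import glob
  import numpy as np
  from astropy.io import fits

  def interquartile_mean(stack):
      # average the middle 50% of the values at each pixel (as in the bias sketch)
      lo, hi = np.percentile(stack, [25, 75], axis=0)
      mask = (stack >= lo) & (stack <= hi)
      return np.sum(stack * mask, axis=0) / np.sum(mask, axis=0)

  master_bias = fits.getdata("master_bias.fits").astype(float)
  master_current_45 = fits.getdata("master_current_45.fits").astype(float)

  cleaned = []
  for fname in sorted(glob.glob("flat_*.fits")):
      raw_flat = fits.getdata(fname).astype(float)
      # subtract the bias plus the dark current expected in a 3-second exposure
      cleaned.append(raw_flat - (master_bias + master_current_45 * (3.0 / 45.0)))

  master_flat = interquartile_mean(np.array(cleaned))
  print("mean of master flat:", master_flat.mean())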

The master flat shows several features of note:

  1. a set of big "donuts" due to dust far from the focal plane, likely sitting on a filter (even though the FITS header claims there is no filter)
  2. a set of small "donuts" due to dust close to the focal plane, probably on the dewar window
  3. a large-scale gradient, from bright in the south-east corner to dark in the north-west corner
  4. very mild vignetting


Step 2: applying calibration frames to make science images clean

This is pretty simple. All the science frames are 45 seconds long, so we can remove both the bias and the dark current by subtracting the "master_dark_45" image (which still contains the bias). Then, we can perform the flatfield correction by dividing by a normalized version of the "master_flat" image. In other words, on a pixel-by-pixel basis


               ( raw - master_dark_45 ) 
 cleaned =     -------------------------- * (mean_of_master_flat)
                    ( master_flat )
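
A sketch of this step in Python (the file pattern for the raw science frames is an assumption):

  import glob
  from astropy.io import fits

  master_dark_45 = fits.getdata("master_dark_45.fits").astype(float)
  master_flat = fits.getdata("master_flat.fits").astype(float)
  flat_mean = master_flat.mean()

  for fname in sorted(glob.glob("wasp12_*.fits")):
      raw = fits.getdata(fname).astype(float)
      cleaned = (raw - master_dark_45) / master_flat * flat_mean
      fits.writeto(fname.replace(".fits", "_clean.fits"), cleaned, overwrite=True)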

Even the raw images look pretty nice to the eye, but the clean images look even nicer. Aside from a few hot pixels which aren't completely removed by the dark subtraction, the images are quite pretty:

Let's add a few labels to provide orientation and identify some stars for later reference.


Step 3: measuring the brightness of stars

We can measure the brightness of stars in these clean images using aperture photometry. How big should we make the aperture? I checked the FWHM of stars in a few images, and found that it changes noticeably from one side of the frame to the other.

The change in the size of the PSF across the frame would make absolute photometry difficult, but since we only care about changes in the brightness of each star throughout a set of contiguous images on one night, it doesn't matter to us. I chose three apertures, of radii 5, 7, and 9 pixels. In all cases, I used an annulus with inner and outer radii of 15 and 30 pixels to determine a local sky background.

I then measured the brightness of every star with a peak at least 15-sigma above the background in every frame. This typically yielded 70 to 90 stars.
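
The measurements themselves were made with XVista, but a roughly equivalent sketch using the photutils package might look like this (the file name and the FWHM value given to the star finder are assumptions, and the 7-pixel aperture is just one of the three radii):

  import numpy as np
  from astropy.io import fits
  from astropy.stats import sigma_clipped_stats
  from photutils.detection import DAOStarFinder
  from photutils.aperture import CircularAperture, CircularAnnulus, aperture_photometry

  data = fits.getdata("wasp12_0001_clean.fits").astype(float)
  mean, median, std = sigma_clipped_stats(data, sigma=3.0)

  # find stars whose peaks stand at least 15 sigma above the background
  finder = DAOStarFinder(threshold=15.0 * std, fwhm=4.0)
  sources = finder(data - median)
  positions = np.transpose((sources["xcentroid"], sources["ycentroid"]))

  # flux through a 7-pixel-radius aperture, sky from an annulus of radii 15 and 30 pixels
  aperture = CircularAperture(positions, r=7.0)
  annulus = CircularAnnulus(positions, r_in=15.0, r_out=30.0)
  sky_per_pixel = aperture_photometry(data, annulus)["aperture_sum"] / annulus.area
  flux = aperture_photometry(data, aperture)["aperture_sum"] - sky_per_pixel * aperture.area
  instrumental_mag = -2.5 * np.log10(flux)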

The science images consist of a set of 336 images, taken as the field of WASP-12 gradually rose in the eastern sky, starting at airmass 1.878 and ending at airmass 1.013, just before the meridian. This change in airmass can lead to small, gradual changes in apparent brightness due to second-order, color-dependent extinction. We'll see an example of that soon. It also causes the sky background to decrease throughout the run:
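
As a rough illustration of the size of that color-dependent term: the standard second-order extinction correction scales as k'' * (color difference) * X, where X is the airmass. The coefficient and color difference below are placeholder values, not measurements from this dataset:

  # Placeholder numbers, for illustration only
  k2 = -0.03           # assumed second-order extinction coefficient (mag per airmass per mag of color)
  delta_color = 0.5    # assumed color difference between target and comparison star
  for X in (1.878, 1.013):                # airmass at the start and end of the run
      print(X, k2 * delta_color * X)      # color-dependent extinction term, in magnitudes

With numbers like these, the term changes by roughly 0.01 mag between the start and end of the run, which is the sort of small, gradual trend described above.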


Step 4: ensemble photometry

So, we now have measurements of the instrumental magnitude of some 70 to 90 stars per image, for 336 images. What do we do with all this information?

I prefer to employ inhomogeneous ensemble photometry, which uses every star available in every image to create a photometric reference system. Each star is then compared to this overall photometric reference system, yielding differential magnitudes for all stars at once. You can read more about the procedure in Kent Honeycutt's article on inhomogeneous ensemble photometry, and at the home page for the software I wrote to implement it.
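
At its heart, the method is one large least-squares problem: every measurement is modeled as the sum of an image zero-point and a star's mean magnitude, and all of those unknowns are solved for simultaneously. A bare-bones sketch in Python (omitting the per-measurement weights of the full Honeycutt scheme) might look like this:

  import numpy as np

  def solve_ensemble(image_idx, star_idx, m_obs, n_images, n_stars):
      """Solve  m_obs = zeropoint(image) + meanmag(star)  for all images and stars at once.

      image_idx, star_idx, and m_obs are parallel 1-D arrays, one entry per measurement."""
      n_meas = len(m_obs)
      A = np.zeros((n_meas, n_images + n_stars))
      A[np.arange(n_meas), image_idx] = 1.0             # image zero-point term
      A[np.arange(n_meas), n_images + star_idx] = 1.0   # star mean-magnitude term
      # pin the zero-point of the first image at 0 to remove the overall degeneracy
      A = np.vstack([A, np.eye(1, n_images + n_stars)])
      b = np.append(m_obs, 0.0)
      params, *_ = np.linalg.lstsq(A, b, rcond=None)
      return params[:n_images], params[n_images:]       # per-image offsets, per-star mean mags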

One of the outputs of the ensemble photometry procedure is a measure of the magnitude zero-point offset of each image, relative to some arbitrary reference; in other words, how much did the magnitudes of ALL the stars in one image appear to shift brighter or fainter, compared to the magnitudes of all the stars in other images? A graph of this quantity over the course of the run can reveal the presence of clouds or other photometric variations. In our case, we see a general trend for smaller offsets (meaning "brighter stars") at later times; this is due to the decrease in atmospheric extinction as the field rose higher and higher in the sky. I see no evidence for clouds.

A second quantity generated by the ensemble photometry procedure is a measurement of the scatter around the mean magnitude of each star. If we plot this scatter against the mean (differential) magnitude of each star, we expect to see small values for bright stars, and large values for faint ones. The results for this dataset look as we expect, with a few exceptions.

The floor of this distribution is about 0.004 mag. But there are two small groups of outliers:

What we learn is that the brightest stars in the target images, labelled in red in the chart above, are all saturated, or at least reach into the non-linear portion of the CCD's response. I therefore modified the ensemble photometry code to mark these bright stars as "variable" objects; they will not be used to set the photometric reference.

I also marked WASP-12 as a "variable" source, preventing it from being used as a photometric reference.

Having made those changes, I re-ran the ensemble procedure to generate a fresh photometric solution. I then made a graph showing the light curve of WASP-12 as well as light curves of a few relatively bright stars. We can use these light curves as a check to make sure we haven't left some systematic errors in the results.

In the graph below, I've performed a very rough photometric calibration using star "B" = TYC-1891-326-1 to convert the differential, unfiltered magnitudes onto the V-band magnitude scale. I've also shifted the light curves of the comparison stars vertically in order to fit them all onto a nice scale.
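
That calibration amounts to nothing more than a constant shift, chosen so that star "B" lands on its catalog magnitude; a sketch, with the catalog value left as a symbol rather than the number actually used:

  def rough_v_calibration(diff_mags, diff_mag_B, catalog_V_B):
      # shift all differential magnitudes so that star "B" matches its catalog V magnitude
      return diff_mags + (catalog_V_B - diff_mag_B)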

What do we see?

If we dig a little deeper, we might notice that another relatively bright star has a somewhat elevated scatter. The star marked "P" in the chart above has a scatter of 0.007 mag, compared to 0.005 mag for stars of similar brightness. If we measure its light curve, we see something that may be intrinsic variation. Look at the bottom of the graph below, in which I've replaced the star "E" with this possibly varying star "P" (shifted vertically to fit on the graph).

Is this real variation? It's hard to say from this dataset alone, but perhaps additional measurements would provide an answer. Star "P" appears in catalogs as UCAC4 598-033251.