The curious case of Gliese 581 d

In 2005, radial velocity measurements by Bonfils et al., A&A 443, L15 (2005) indicated that the nearby M dwarf Gliese 581 was accompanied by a Neptune-mass planet, Gliese 581 b. The evidence was solid.

Figure 2 taken from Bonfils et al., A&A 443, L15 (2005)



  Q: What is the amplitude of the radial velocity variations?

Two years later, Udry et al., A&A 469, L43 (2007) showed that further observations of the system indicated the presence of two more planets. This brought the total to 3:

b (the Neptune-mass original): mass = 15.6 Earths, period = 5.37 days
c: mass = 5.1 Earths, period = 12.9 days
d: mass = 8.3 Earths, period = 83.4 days

Figure 3 taken from Udry et al. (2007)



  Q: What is the amplitude of the radial velocity variations of planet 'd' ?

In 2009, things got really exciting when Mayor et al., A&A 507, 487 (2009) announced not only evidence for a fourth planet (e), but also a revised analysis showing that the third planet (d) was actually inside the habitable zone.

According to this paper, the tally was now

b (the Neptune-mass original): mass = 15.6 Earths, period = 5.37 days
c: mass = 5.4 Earths, period = 12.9 days
d: mass = 7.1 Earths, period = 66.8 days (was 83.4 days)
e: mass = 1.9 Earths, period = 3.15 days

Figure 2 taken from Mayor et al. A&A 507, 487 (2009)

Figure S6 taken from Robertson et al., Science 345, 440 (2014)

Planet d does NOT exist!

Now, for several years, things were a bit unsettled. Some scientists claimed that some of these planets might not be real detections; others claimed to find two MORE planets, 'f' and 'g'. Eventually, Robertson et al., Science 345, 440 (2014) put forth the claim that some of the variations in radial velocity of Gliese 581 might not be due to planets, but due to activity in the star's atmosphere.

After all, this M star has quite an active chromosphere, as the emission line in H-alpha shows.

Figure 1 taken from Bonfils et al., A&A 443, L15 (2005)

Spectral measurements of Gliese 581 suggest that the star rotates with a period of 130 +/- 2 days. Robertson et al. argued that the signal for planet 'd' (period = 66 days, about half of the stellar rotation period) was correlated so strongly with stellar activity (as indicated by the strength of H-alpha emission) that there was no planet 'd'. They made a similar argument for the non-existence of planet 'g' (period = 33 - 36 days, about one-quarter of the stellar rotation period).
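As a quick check of the arithmetic in that argument, the short Python sketch below divides the quoted rotation period (130 +/- 2 days) by 2 and by 4. The variable names are made up for illustration, and scaling the uncertainty by the same divisor is a simple assumption, not something taken from Robertson et al.

```python
# Rotation period and its uncertainty, as quoted in the text (days).
rotation_period = 130.0
rotation_error = 2.0

# Harmonics of the rotation period; the error is simply scaled by the
# same divisor (an assumption made for this illustration).
harmonics = {}
for label, divisor in [("half", 2), ("one-quarter", 4)]:
    harmonics[label] = (rotation_period / divisor, rotation_error / divisor)

for label, (value, error) in harmonics.items():
    print(f"{label} of rotation period: {value:.1f} +/- {error:.1f} days")
```

This gives 65.0 +/- 1.0 days and 32.5 +/- 0.5 days, to be compared with the periods quoted for 'd' (66 days) and 'g' (33 - 36 days).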

Figure 1 taken from Robertson et al., Science 345, 440 (2014)

.... or does it?

Just a few days ago, Anglada-Escude and Tuomi, arXiv 1503.01976 pointed out that one facet of the analysis used by Robertson et al. (and LOTS of other astronomers) was done improperly; or, perhaps, less than optimally. They write:

Detecting a planet candidate consists of quantifying the improvement of a merit statistic when one signal is added to the model. Approximate methods are often used to speed up the analyses, such as computing periodograms on residual data. Even when models are linear, correlations exist between parameters. Similarly, statistics based on residual analyses are biased quantities and cannot be used for model comparison.
A golden rule in data-analysis is that the data should not be corrected, but it is our model which needs improvement.

The point here is that it's dangerous to search for periodic signals in a noisy dataset in the following manner:

Round 1:

compute power at different periods from the data
select period with the strongest power
fit a model to the data with that period
subtract the model from the data, leaving a residual

Round 2:

compute power at different periods from the RESIDUAL
select period with the strongest power
fit a model to the RESIDUAL with that period
subtract the model from the RESIDUAL, leaving a RESIDUAL_2

Round 3:

compute power at different periods from the RESIDUAL_2
select period with the strongest power

etc.

Anglada-Escude and Tuomi warn that operating on a series of successive residuals is poor practice, because each fitting step in the sequence is influenced by noise as well as by real effects (planets).
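To make the round-by-round procedure concrete, here is a minimal Python sketch of it. It is not the pipeline used by Robertson et al. or anyone else: the "power" is just the drop in the sum of squares when a sinusoid is fitted by least squares at each trial period, the function names are invented for this example, and the data are synthetic. The two periods are borrowed from the tables above, but the amplitudes, noise level and sampling are made up.

```python
import numpy as np

def sinusoid_columns(t, period):
    """Design-matrix columns (sin, cos) for one sinusoid of the given period."""
    phase = 2.0 * np.pi * t / period
    return np.column_stack([np.sin(phase), np.cos(phase)])

def fit_sinusoid(t, y, period):
    """Least-squares sinusoid at a fixed period; returns the model values."""
    A = sinusoid_columns(t, period)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coeffs

def strongest_period(t, y, trial_periods):
    """Trial period whose best-fit sinusoid removes the most sum of squares."""
    power = [np.sum(y**2) - np.sum((y - fit_sinusoid(t, y, p))**2)
             for p in trial_periods]
    return trial_periods[int(np.argmax(power))]

def sequential_search(t, y, trial_periods, n_rounds):
    """Each round: find the strongest period, fit it, subtract it, repeat."""
    residual = y - np.mean(y)
    periods = []
    for _ in range(n_rounds):
        p = strongest_period(t, residual, trial_periods)
        residual = residual - fit_sinusoid(t, residual, p)
        periods.append(p)
    return periods, residual

# Synthetic data: two sinusoids plus Gaussian noise (all values made up
# except the two periods, which are taken from the tables above).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 200.0, 150))          # observation times, days
y = (12.0 * np.sin(2.0 * np.pi * t / 5.37)
     + 3.0 * np.sin(2.0 * np.pi * t / 12.9 + 1.0)
     + rng.normal(0.0, 1.0, t.size))

trial_periods = np.arange(2.0, 50.0, 0.02)
found_periods, final_residual = sequential_search(t, y, trial_periods, n_rounds=2)
print("periods found, in order:", found_periods)
```

In an easy case like this one, with strong signals and simple noise, the sequential search behaves well. The warning above concerns what happens when signals are weak and the noise is poorly understood: every round inherits whatever the previous fits got wrong.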

They provide a simple example: suppose that your job is to fit a model to some measurements, where your models look like this.



   simplest model:    y  =  a*x


   next model:        y  =  a*x  +  b

The question is -- does the data justify only the simplest model, or does it justify a more complex one?

They illustrate their argument with this figure. On the left is the result of the sequence

fit one parameter -- subtract the model from data to form residuals -- then fit the RESIDUALS to find the second parameter

On the right is the result of the procedure

fit all parameters simultaneously
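The two procedures can be compared on made-up data with a short Python sketch. This is not the authors' code and does not reproduce their figure; the "true" slope and offset, the noise level, and the choice to fit the slope first in the sequential version are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = 1.0, 5.0              # hypothetical true slope and offset
x = np.linspace(0.0, 10.0, 50)
y = a_true * x + b_true + rng.normal(0.0, 0.5, x.size)

# Sequential: fit the one-parameter model y = a*x, subtract it,
# then fit the offset b to the residuals.
a_seq = np.sum(x * y) / np.sum(x * x)
residual = y - a_seq * x
b_seq = np.mean(residual)

# Simultaneous: fit y = a*x + b with both parameters free at once.
A = np.column_stack([x, np.ones_like(x)])
(a_sim, b_sim), *_ = np.linalg.lstsq(A, y, rcond=None)

# Residual sum of squares for each final model.
rss_seq = np.sum((y - (a_seq * x + b_seq))**2)
rss_sim = np.sum((y - (a_sim * x + b_sim))**2)

print(f"truth:        a = {a_true:.2f}, b = {b_true:.2f}")
print(f"sequential:   a = {a_seq:.2f}, b = {b_seq:.2f}, RSS = {rss_seq:.1f}")
print(f"simultaneous: a = {a_sim:.2f}, b = {b_sim:.2f}, RSS = {rss_sim:.1f}")
```

Because the x values are not centered on zero, slope and offset are correlated. The slope-only fit soaks up part of the offset, so the sequential slope comes out too steep, and the offset later fitted to the residuals comes out too small. The simultaneous fit recovers both values much more closely.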

The authors go on to write about the "false-alarm probabilities (FAP)" discussed so frequently in the analysis of radial velocity variations.

derived false alarm probabilities would be representative only if a model with one-sinusoid and one offset is a sufficient description of the data, measurements are uncorrelated, noise is normally distributed, and uncertainties are fully characterized.
Every single of these hypothesis breaks down when dealing with Doppler residuals: the number of signals in not known a priori, fits to data correlate residuals, and formal uncertainties are never realistic.