This was the final meeting of the exoplanets working group. There was further discussion of the mixture importance sampler, followed by a fairly lengthy discussion of the reversible jump algorithm as it might be applied to the selection of a model in the planets problem (how many planets do the data support?). In particular, a reversible jump algorithm was discussed in which the jumps from one model to another are independent of the present model and parameter values. No one has yet tried this approach on the planets problem.
Further discussion of marginal likelihood integration. Merlise shared this paper written by JingQuin (Rosy) Luo: luo_03may06.pdf. And Jim (later) wrote up this paper to replace his earlier "Crazy Idea Number 2", posted under the 1 March 2006 meeting minutes: berger_03may06.pdf
We continue to discuss the efficacy of different methods for integrating the likelihood over the full parameter space.
This meeting was mostly spent discussing choices of priors when doing model selection. In particular, Floyd expressed concern that when comparing a one-planet to a two-planet model in the case of HD88133 (see Miscellany section of this web page for priors and model statement), he estimated a Bayes factor of only around 5 favoring the one-planet model, even though it is obvious to the eye that the data support the presence of a planet much more strongly than that. And thus began the discussion of prior choices and how much influence they may have over model Bayes factors. Floyd inquired about eliciting priors from astronomers, and Bill said he knew of no case where it was done. Jim and Merlise shared some wisdom about data-driven priors, such as intrinsic priors (in which some part of the data are used with an uninformative prior to generate a posterior that is then used as a prior upon the remainder of the data) and priors that are equal to the likelihood function raised to some power. The following paper by Jim Berger and L.R. Pericchi introduces various default priors for model comparison: Objective Bayesian methods for model selection: introduction and comparison.
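For reference, the quantities under discussion can be written out as follows. The notation (m_i for a marginal likelihood, L_i for a likelihood, b for the power) is chosen here for illustration and is not taken from the priors statement or from the Berger and Pericchi paper.

```latex
% Bayes factor of model M_1 against model M_2 for data y
B_{12} = \frac{m_1(y)}{m_2(y)}, \qquad
m_i(y) = \int L_i(\theta_i; y)\, \pi_i(\theta_i)\, d\theta_i .

% If \pi_i is uniform over a region of volume V_i and the likelihood is
% negligible outside that region, the prior volume enters as a plain factor:
m_i(y) \approx \frac{1}{V_i} \int L_i(\theta_i; y)\, d\theta_i .

% A prior built from the likelihood raised to a power b, 0 < b < 1,
% starting from a default prior \pi_i^N (flat in the simplest case):
\pi_i^{b}(\theta_i) \propto \pi_i^{N}(\theta_i)\, L_i(\theta_i; y)^{b} .
```

The second line shows one way the prior can move a Bayes factor: doubling the prior range of a parameter that the data constrain well roughly halves that model's marginal likelihood.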
We discussed "Jim's Crazy Importance Sampler #3" (which Merlise urged us to refer to in the future as the ratio of estimators) and wondered about its variance. Jim thinks it may grow too rapidly with the number of dimensions to be useful, but Eric Ford thinks perhaps not, considering that the number of dimensions in a four-planet problem is "only" p=22.
We also discussed the relative merits of other marginal likelihood estimators, and will continue to do so next week.
At this short meeting (2:00-2:45) methods were discussed for improving marginal likelihood estimators.
Rather than continuing the discussion on experimental design issues this week, we returned to methods of estimating the marginal likelihood of a model for model selection. Jim shared a newer version of "Jim's Crazy Idea" (not so crazy, perhaps?), which is located here: berger_29mar06.pdf.
Tom also shared an estimator of marginal likelihood; a document describing it may appear on this page in the near future.
Eric Ford and Merlise Clyde have both committed to writing up and presenting some of the results from this working group's discussions. They would like to compare and contrast different estimators of marginal likelihood, so it is suggested that those who are implementing such an estimator on a one-planet star system do so with HD88133, whose data are given in the "data sets" section of this web page, and use the full Keplerian orbit model with the priors given in the document "priors statement", which is posted in the "miscellany" section of this web page. Furthermore, if you use an estimator and find a marginal likelihood, please send an email to Floyd (floyd@samsi.info) including a brief statement of what estimator you used, what your marginal likelihood estimate was, and (if possible) an estimate of its uncertainty. These will be posted right here.
Eric Ford led a discussion of experimental design issues, especially as they relate to planet detection using radial velocity measurements. The issues include:
Two papers of interest are "Adaptive Scheduling Algorithms for Planet Searches" (Ford) and "Bayesian Adaptive Exploration" (Loredo).
Merlise and Floyd share results of applying a variation of Skilling's nested sampler to HD88133. The pdf file posted here contains some unfortunate errors. When an updated version of this file is posted, it will be noted with an "updated" icon. (bullard_15mar06.pdf)
The group met briefly from 1:30 to 2:00 and discussed importance sampling. Floyd had some results from drawing from a mixture of three T4 distributions in the case of HD73526, the star with three supported periods (bullard_08mar06.pdf), and Bill shared his promising results from applying Jim's "crazy importance sampler" (see 1 March) to an artificial problem.
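A minimal sketch of importance sampling from a mixture of three t distributions with 4 degrees of freedom is given below. It is not Floyd's code and does not use the HD73526 data: the target is a made-up one-dimensional function with three Gaussian bumps (standing in for three supported periods), chosen because its integral is known exactly. The mode locations, widths, heights, and the `inflate` factor are all hypothetical.

```python
import math
import random

NU = 4  # degrees of freedom of each Student-t component
T_NORM = math.gamma((NU + 1) / 2) / (math.sqrt(NU * math.pi) * math.gamma(NU / 2))

# Hypothetical modes: (center, width, height) of three Gaussian bumps.
MODES = [(10.0, 0.3, 1.0), (25.0, 0.5, 0.6), (60.0, 1.0, 0.3)]

# Exact integral of the toy target, available only because the target is artificial.
EXACT = sum(a * s * math.sqrt(2.0 * math.pi) for _, s, a in MODES)


def unnormalized_target(x):
    """Toy stand-in for likelihood times prior, with three separated modes."""
    return sum(a * math.exp(-0.5 * ((x - m) / s) ** 2) for m, s, a in MODES)


def t_pdf(x, loc, scale):
    """Density of a location-scale Student-t with NU degrees of freedom."""
    z = (x - loc) / scale
    return T_NORM * (1.0 + z * z / NU) ** (-(NU + 1) / 2) / scale


def proposal_pdf(x, inflate):
    """Equal-weight mixture of one t component per mode."""
    return sum(t_pdf(x, m, inflate * s) for m, s, _ in MODES) / len(MODES)


def proposal_draw(rng, inflate):
    """Pick a component at random, then draw a t variate as normal / sqrt(chi2 / NU)."""
    m, s, _ = rng.choice(MODES)
    chi2 = rng.gammavariate(NU / 2, 2.0)  # chi-square with NU degrees of freedom
    return m + inflate * s * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / NU)


def importance_estimate(n_draws=20000, inflate=1.5, seed=0):
    """Average of target / proposal over proposal draws estimates the integral."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = proposal_draw(rng, inflate)
        total += unnormalized_target(x) / proposal_pdf(x, inflate)
    return total / n_draws
```

Calling `importance_estimate()` and comparing the result with `EXACT` gives a quick check of the sampler. The heavy tails of the t components keep the importance weights bounded in this toy case, which is the usual reason for preferring them to normal components.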
At 2:00 the meeting ended and the group remained to hear Giovanni Punzi's 2:00 talk, "Ordering rules for the Neyman construction with nuisance parameters".
Further discussion about model selection and estimating marginal likelihoods/posteriors for models. Jim Berger shares a "crazy idea" for an importance sampler.
Downloads:
Jim's "crazy ideas" about importance sampling: berger_01mar06.pdf
Merlise Clyde presents John Skilling's proposed method for estimating marginal likelihoods for models. There are mixed feelings among the group about its potential usefulness. Some are optimistic, some consider it utterly hopeless.
Downloads:
Presentation: clyde_22feb06.pdf (.pdf)
Code for simple example: nested.R
Recommended reading: Skilling, "Nested Sampling for General Bayesian Computation". (See Suggested papers page.)
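For readers who have not seen the method, here is a minimal Python sketch of nested sampling on a toy problem. It is not the nested.R example above and not Skilling's reference implementation. The toy model (a standard normal likelihood with a uniform prior on [-5, 5]) and all function names are assumptions, and the constrained draw is done by plain rejection from the prior, which is only practical in very small problems.

```python
import math
import random


def log_like(theta):
    """Toy log likelihood: a single observation y = 0 with unit normal noise."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * theta * theta


def sample_prior(rng):
    """Toy prior: uniform on [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


def nested_sampling(n_live=100, n_iter=600, seed=0):
    """Estimate the marginal likelihood Z = integral of L(theta) * prior(theta)."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_ll = [log_like(th) for th in live]
    z = 0.0
    x_prev = 1.0  # prior mass enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        # The live point with the lowest likelihood defines the next contour.
        worst = min(range(n_live), key=live_ll.__getitem__)
        ll_min = live_ll[worst]
        # Deterministic approximation to the shrinking prior mass.
        x_i = math.exp(-i / n_live)
        z += math.exp(ll_min) * (x_prev - x_i)
        x_prev = x_i
        # Replace the worst point by a prior draw with higher likelihood
        # (simple rejection sampling; an assumption made for this sketch).
        while True:
            th = sample_prior(rng)
            ll = log_like(th)
            if ll > ll_min:
                break
        live[worst] = th
        live_ll[worst] = ll
    # Add the contribution of the prior mass still held by the live points.
    z += x_prev * sum(math.exp(ll) for ll in live_ll) / n_live
    return z
```

For this toy model the exact answer is very close to 0.1 (the prior density 1/10 times a normal integral that is essentially 1), so `nested_sampling()` can be checked directly. The difficulty in the planets problem is the constrained draw, for which rejection from the prior would be hopelessly slow.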
Floyd Bullard shares his updated MCMC solution to HD88133 and HD46375 that were originally proposed on 26 January and discussed on 1 February. Briefly the group discusses further possible improvements, and the possibility of sampling from all parameters simultaneously.
Downloads:
Presentation: bullard_15feb06.pdf. I apologize for this being such a big file. I don't know how to shrink it.
Phil Gregory presents his parallel tempering sampler, including a means he devised for automating the search for good proposal distribution step sizes. His chains, however, show relatively poor convergence and a clear, strong negative correlation between two parameters (chi and omega). The take-away message seems to be that when one is using MCMC to explore a parameter space in which parameters are highly correlated, one should either reparameterize to lessen the correlation, or else use a proposal distribution in which the parameters are correlated in a way similar to their correlations in the posterior.
Downloads:
Phil's presentation: gregory_01feb06.pdf
Star data used in Phil's presentation: HD73526.txt (Tinney et al. 2003, ApJ 587, 423)
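The take-away message about proposal distributions can be illustrated with a small Python sketch. This is not Phil's sampler and involves no parallel tempering: it is a plain random-walk Metropolis sampler on a made-up two-dimensional normal target with correlation -0.95, run once with independent proposal steps and once with steps that share the target's correlation. The value of `RHO`, the step size, and the function names are assumptions.

```python
import math
import random

RHO = -0.95  # hypothetical posterior correlation between the two parameters


def log_target(x, y):
    """Log density (up to a constant) of a unit-variance bivariate normal."""
    return -(x * x - 2.0 * RHO * x * y + y * y) / (2.0 * (1.0 - RHO * RHO))


def acceptance_rate(correlated, step=1.7, n_iter=5000, seed=1):
    """Run random-walk Metropolis and return the fraction of accepted moves."""
    rng = random.Random(seed)
    x = y = 0.0
    lp = log_target(x, y)
    accepted = 0
    for _ in range(n_iter):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if correlated:
            # Cholesky factor of [[1, RHO], [RHO, 1]] applied to (z1, z2).
            dx = step * z1
            dy = step * (RHO * z1 + math.sqrt(1.0 - RHO * RHO) * z2)
        else:
            # Independent steps of the same marginal size.
            dx, dy = step * z1, step * z2
        lp_new = log_target(x + dx, y + dy)
        # Metropolis acceptance for a symmetric proposal.
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, y, lp = x + dx, y + dy, lp_new
            accepted += 1
    return accepted / n_iter
```

Comparing `acceptance_rate(True)` with `acceptance_rate(False)` shows how much more often a step of the same size is accepted when the proposal follows the ridge of the target. Shrinking the independent step would raise its acceptance rate, but at the cost of slow movement along the ridge.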
The first of the two star data sets that Eric shared last week was discussed. Four group participants shared their solutions. Floyd's solution mixes poorly due to independent sampling of highly correlated parameters. Phil Gregory's may suffer from the same problem, to a lesser degree. Barbara was not present to present her solution but she shared it and it was discussed. Eric Ford shared his own solution, which appeared to mix well and converge well within his 10^6 iterations.
The big message du jour was perhaps this: choose the parameterization of your problem carefully. In the case of estimating the parameters of a single orbiting planet, for example, eccentricity is highly correlated with angle of periastron, and a better parameterization is e*sin(omega) and e*cos(omega). See link at right for a fragment of Eric Ford's code for determining the radial velocity at a particular time given the orbital parameters.
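As a rough illustration of this parameterization, here is a minimal Python sketch of a Keplerian radial velocity computed from h = e*sin(omega) and k = e*cos(omega). It is not Eric Ford's C++ fragment. The function names, the argument order, and the use of the mean anomaly at t = 0 as the phase parameter are assumptions made for illustration.

```python
import math


def solve_kepler(mean_anomaly, ecc, tol=1e-12, max_iter=100):
    """Solve Kepler's equation E - ecc*sin(E) = M for E by Newton's method."""
    m = mean_anomaly % (2.0 * math.pi)  # reduce M to [0, 2*pi)
    e_anom = math.pi  # starting at pi gives monotone convergence for ecc < 1
    for _ in range(max_iter):
        step = (e_anom - ecc * math.sin(e_anom) - m) / (1.0 - ecc * math.cos(e_anom))
        e_anom -= step
        if abs(step) < tol:
            break
    return e_anom


def radial_velocity(t, period, semi_amplitude, h, k, mean_anomaly0, offset=0.0):
    """Radial velocity at time t, with h = e*sin(omega) and k = e*cos(omega)."""
    ecc = math.hypot(h, k)    # recover the eccentricity
    omega = math.atan2(h, k)  # recover the angle of periastron
    m = mean_anomaly0 + 2.0 * math.pi * t / period
    e_anom = solve_kepler(m, ecc)
    # True anomaly from the eccentric anomaly.
    nu = 2.0 * math.atan2(
        math.sqrt(1.0 + ecc) * math.sin(e_anom / 2.0),
        math.sqrt(1.0 - ecc) * math.cos(e_anom / 2.0),
    )
    return offset + semi_amplitude * (math.cos(nu + omega) + ecc * math.cos(omega))
```

A sampler would propose moves in (h, k) and call `radial_velocity` as is. For a nearly circular orbit omega is almost unconstrained, but h and k both stay close to zero and remain well behaved.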
Following the meeting at 3:15, most group members present remained to listen to Eric present a model selection issue he had studied: selecting between a circular orbit model and an eccentric orbit model. The motivation is that if a star has an obvious orbit but no compelling evidence that the orbit is non-circular, it should not (?) get penalized by a factor of (1-0)*(2pi) for the prior volume of the eccentricity and omega dimensions when being compared to other models (no planet, two planets, etc.).
Downloads:
Some graphics from Ford's solutions to HD88133 and HD46375: ford_01feb06.pdf
Barbara's solutions to HD88133 and HD46375: mcarthur_01feb06.pdf, mcarthur_01feb06.ppt.
Floyd's solutions to HD88133 and HD46375: bullard_01feb06.ppt (a cautionary tale of what not to do!)
Code fragment in C++ by Eric Ford for computing radial velocity: ford_radial_velocity_code_fragment.txt
Eric Ford shares thoughts on prior distributions for model parameters.
Phil Gregory describes the models that arise in the model selection process. Most agree that the models to consider are what he calls M0s, M1s, M2s, etc., where Mns is a model in which n planets are present and there is some noise whose variance must be estimated. The number of parameters in model Mns is 5n+1+j, where j is the number of different observatories that measured radial velocities.
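The parameter count can be written as a one-line helper. The function name is hypothetical, and the reading of the terms in the comments (five orbital parameters per planet, one noise variance, one quantity per observatory) is an interpretation of the formula rather than something stated in the minutes.

```python
def n_parameters(n_planets, n_observatories):
    """Number of parameters in model Mns: 5n + 1 + j."""
    if n_planets < 0 or n_observatories < 1:
        raise ValueError("need n_planets >= 0 and n_observatories >= 1")
    # 5 per planet, 1 for the noise variance, and j for the observatories
    # (interpretation of the formula, see the note above).
    return 5 * n_planets + 1 + n_observatories
```

With a single observatory, a four-planet model has `n_parameters(4, 1)` = 22 parameters, which matches the p=22 quoted in a later meeting.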
Merlise Clyde expounds upon the difficulties that arise when trying to estimate the marginal likelihood for a given model. In particular, when draws are made from the posterior distribution of the parameters, a weighted average of the likelihoods obtained puts large weights on very few draws, resulting in very high variance in the estimate of the marginal likelihood.
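One estimator that behaves this way is the harmonic mean estimator, which averages inverse likelihoods over posterior draws. Whether this is exactly the estimator Merlise had in mind is an assumption. The sketch below uses a made-up conjugate normal model (one observation, unit noise variance, normal prior with standard deviation `tau`) so that the exact marginal likelihood is available for comparison. All names and default values are hypothetical.

```python
import math
import random


def log_likelihood(theta, y):
    """Log of the normal likelihood N(y | theta, 1)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2


def harmonic_mean_demo(y=1.0, tau=10.0, n_draws=10000, seed=0):
    """Return (estimate, exact, largest weight share, effective sample size)."""
    rng = random.Random(seed)
    # Conjugate posterior for theta under the prior N(0, tau^2).
    post_var = tau ** 2 / (1.0 + tau ** 2)
    post_mean = y * post_var
    draws = [rng.gauss(post_mean, math.sqrt(post_var)) for _ in range(n_draws)]
    # Each posterior draw is weighted by the inverse of its likelihood.
    inv_like = [math.exp(-log_likelihood(th, y)) for th in draws]
    total = sum(inv_like)
    estimate = n_draws / total  # harmonic mean of the likelihoods
    # Exact marginal likelihood: y ~ N(0, 1 + tau^2).
    exact = math.exp(
        -0.5 * math.log(2.0 * math.pi * (1.0 + tau ** 2))
        - 0.5 * y ** 2 / (1.0 + tau ** 2)
    )
    max_share = max(inv_like) / total
    ess = total ** 2 / sum(w * w for w in inv_like)
    return estimate, exact, max_share, ess
```

The last two returned values measure how unevenly the weights are spread: a large `max_share` or a small effective sample size means that a handful of draws dominate the estimate. In this toy model the inverse-likelihood weights have infinite variance, so the estimate can shift noticeably from one seed to the next.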
First meeting of exoplanets group. Eric Ford describes how radial velocities are measured.
Barbara McArthur describes how astrometry is conducted with the Hubble Space Telescope (HST).
Eric Ford shares some information about the characteristics of the planets that have been discovered so far. Selection biases are discussed.
Eric also shares briefly results from a computer simulation he wrote of a planetary system evolving.
Downloads:
Radial velocity data for HD88133.txt (Fischer et al. 2005 ApJ 620: 481-486) and HD46375.txt (Marcy, Butler & Vogt 2000 ApJ 536, L43-46). Working group members will compare solutions (i.e., model parameter estimates) in future meetings.
Opening workshop, Exoplanets sessions.
Talks given by Alex Wolszczan ("Searches for Radio Pulsars and Planets around Them") and by Barbara McArthur ("Analysis of Radial Velocity and Astrometric Signals in the Detection of Multi-Planet Extrasolar Planetary Systems").