2006 Summer Program on Multiplicity and Reproducibility in Scientific Studies
July 10-28, 2006
Research Foci
Concerns over multiplicities in statistical analysis and over the reproducibility of scientific
experiments are becoming increasingly prominent in almost every scientific discipline, as
experimental and computational capabilities have vastly increased in recent years. This 2006 SAMSI
summer program will examine the following key issues.
Reproducibility: A scientist plans and executes an experiment. A clinical-trials
physician runs a trial, assigning patients to treatments at random and blinding who receives which
treatment. A survey statistician conducts a survey. In each case, statistical methods help judge
whether something has happened beyond chance, with the expectation that if others replicate the
work, a similar finding will emerge. To approve a drug, the FDA requires two studies, each
significant at the 0.05 level.
A recent paper by Ioannidis (JAMA 2005; 294:218-228) documented a startling and disconcerting lack
of reproducibility among influential statistical studies published in major medical journals: about
30% of randomized, double-blinded medical trials failed to replicate, and 5 out of 6 non-randomized
studies failed to replicate - roughly an 80% failure rate. We aim to explore and clarify the causes
of failures to reproduce in more detail, not only in the Ioannidis paper but also more broadly,
identifying commonalities that lead to these problems and attempting to estimate their prevalence.
Multiplicities (both obvious and hidden) will receive particular attention, along with selection
biases and regression to the mean. At the conclusion of the program, recommendations for scientific
reporting and publication will be made.
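The selection effect behind such replication failures can be illustrated with a small simulation. This is only a sketch with hypothetical numbers (10,000 tests, 90% true nulls, a standardized effect of 2.0 for the rest), not data from the Ioannidis paper: studies selected for significance are disproportionately chance findings, so most do not replicate.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical screen: 10,000 two-sided z-tests, 90% of them true nulls,
# the rest carrying a modest standardized effect of 2.0.
m = 10_000
true_effect = np.where(rng.random(m) < 0.9, 0.0, 2.0)
z1 = rng.normal(true_effect, 1.0)

# "Findings": studies significant at the two-sided 0.05 level.
found = np.abs(z1) > 1.96

# Independent replications, run under the same true effects.
z2 = rng.normal(true_effect[found], 1.0)
replicated = (np.abs(z2) > 1.96) & (np.sign(z2) == np.sign(z1[found]))

# Selection guarantees many chance findings; most of them fail to replicate.
print(f"findings: {found.sum()}, replication rate: {replicated.mean():.0%}")
```

Under these assumed proportions, well under half of the "findings" replicate, even though every replication is run honestly at full power for its true effect.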
Subgroup Analysis: Large, complex data sets are becoming commonplace, and researchers
want to know which subgroups respond differently from one another and why. The overall sample is
often quite large, but subgroups may be very small, and there are often many questions. Genetic
data are now collected in clinical trials: which patients will respond better to a drug, and which
will have more severe side effects? A disease, a drug response, or a side effect can arise from
different mechanisms, so identifying subgroups of patients who share a common mechanism is useful
for diagnosis and for prescribing treatment. Large educational surveys involve groups with
different demographics and different educational resources, subject to different educational
practices. Which groups differ, and how are the differences related to resources and practices?
What really works, and why? Is a finding the result of chance? There is a need for effective
statistical methods for finding subgroups that respond differently - methods that can identify
complex patterns of response without being fooled by false positives arising from multiple testing.
Our idea is to bring together statisticians and subject-matter experts to develop and explore
statistical strategies for the subgroup problem. The benefit will be credible statistical methods
that are likely to produce results that replicate in future studies.
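The multiplicity risk in subgroup hunting is easy to quantify. If k disjoint subgroups are each tested independently at level 0.05 and no effect exists anywhere, the chance of at least one false-positive "discovery" is 1 - (1 - 0.05)^k; the subgroup counts below are illustrative:

```python
# With k independent subgroup tests at level alpha and no true effects,
# the familywise error rate is 1 - (1 - alpha)^k.
alpha = 0.05
for k in (1, 5, 10, 20):
    fwe = 1 - (1 - alpha) ** k
    print(f"{k:2d} subgroups -> chance of >=1 false positive: {fwe:.0%}")
# 1 subgroup:   5%
# 20 subgroups: 64%
```

With twenty subgroups, an apparent effect somewhere is more likely than not, which is why unadjusted subgroup findings so often fail to replicate.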
Massive Multiple Testing: The routine use of massively multiple comparisons in inference
for large-scale genomic data has generated controversy and discussion about appropriate ways to
adjust for multiplicities. We will study different approaches to formally describing and addressing
the multiplicity problem, including control of various error rates, decision-theoretic approaches,
hierarchical modeling, probability models on the space of multiplicities, and model selection
techniques. Besides applications in inference for genomic data, we will consider similar problems
arising in clinical trial design and analysis, record matching, classification in spatial
inference, anomaly discovery, and syndromic surveillance. The goal is to identify the relative
merits and limitations of the competing approaches for diverse applications, and to understand
which features of reproducibility each addresses.
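One concrete contrast among these error-rate approaches is between familywise error control (Bonferroni) and false discovery rate control (Benjamini-Hochberg). A minimal sketch on simulated data - the test count, signal fraction, and effect size are hypothetical, not from any study discussed here:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Hypothetical genomic screen: 2,000 tests, about 5% with a real effect.
m, alpha = 2000, 0.05
signal = rng.random(m) < 0.05
z = rng.normal(np.where(signal, 3.5, 0.0), 1.0)

# Two-sided p-values: p = 2 * (1 - Phi(|z|)) = 1 - erf(|z| / sqrt(2)).
p = np.array([1 - erf(abs(zi) / sqrt(2)) for zi in z])

# Bonferroni: control P(any false positive) by testing each at alpha / m.
bonferroni = p < alpha / m

# Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
# the largest rank with p_(k) <= k * alpha / m.
order = np.argsort(p)
below = np.nonzero(p[order] <= alpha * np.arange(1, m + 1) / m)[0]
k = below[-1] + 1 if below.size else 0
bh = np.zeros(m, dtype=bool)
bh[order[:k]] = True

# FDR control rejects more hypotheses than the stricter familywise bound.
print(f"Bonferroni rejections: {bonferroni.sum()}, BH rejections: {bh.sum()}")
```

The trade-off mirrors the program's question about relative merits: Bonferroni guards against any false positive at the cost of power, while Benjamini-Hochberg tolerates a controlled fraction of false discoveries in exchange for detecting more true effects.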
Program Leaders: Peter Mueller (M.D. Anderson Cancer Center), Juliet Shaffer (U. Calif. Berkeley),
Peter Westfall (Texas Tech. Univ., Chair); Stan Young (NISS, Local Scientific Coordinator); James Berger
(SAMSI, Directorate Liaison), and Ray Carroll (Texas A&M, National Advisory Committee Liaison).
Description of Activities
Workshops
The Opening Workshop will be held July 10-12, 2006, and will focus on clear formulation of the
challenges in the area. This will set the stage for the subsequent Program research.
A Transition Workshop will be held July 27-28, 2006, summarizing the results of the Program
research and discussing the remaining challenges.
Working Groups: The working groups, one in each of the three research foci areas listed above,
will meet during the period July 13-26, 2006, to carry out the Program research.
Further Information
For additional information about the program or to apply to participate, write
[email protected]. If interested in participating
during the entire three week period, please send a letter describing your interest, along with a
vita, to the indicated e-mail address. Application forms for workshop participation will be available
later.