Group Leaders: Ashish Mahabal (Astro, Caltech), G. Jogesh Babu (Stat, PSU)
SAMSI Webmaster: David Jones
Weekly Meeting: Mondays 2:00-4:00pm ET
Description: Time Domain Astronomy (TDA) has been getting richer in terms of datasets that span several years and many bands, and that include dense and sparse light-curves for hundreds of millions of sources. The variety and volume of these data fall squarely in the Big Data regime, but the science questions that can be posed imply that standard, canned routines cannot be used except in trivial cases. The light-curves often have large gaps and are heteroscedastic, and the intrinsic variability, often poorly understood, adds an element of uncertainty when multi-band data are not obtained simultaneously. TDA can thus be viewed as the umbrella within which several large problems can be tackled. These range from Kepler-type planet search and characterization (also covered in other groups), to the characterization of specific classes, e.g. binary black-hole searches from the Catalina Real-Time Transient Survey (CRTS), to searching for flaring stars away from the plane of the Galaxy using light-curves as well as ancillary data from other sources like SDSS and WISE, to name just a few. Combining different datasets is a fertile field in itself, with the whole potentially being more than the sum of its parts, but this potential is not yet fully realized owing to a lack of good methodology. Besides yielding more sources of well-understood types, the clustering used in the search for characterization naturally leads to outliers: not just individual interesting sources, but also entire subclasses.
- What mathematical and statistical approaches can be used to best characterize and quantify salient features of irregular, heteroscedastic, gappy time series, and to identify specific feature sets, templates and models? How can we identify many weak features or a few strong ones in such high-dimensional time-series Big Data?
- What are the best methods for classifying time series incorporating auxiliary/covariate information? Are there specific domain-knowledge based features that can be identified to improve class discrimination?
- How can significant outliers/anomalies and subclasses thereof be detected? Investigate correlated Functional Data techniques to detect outliers, anomalies and subclasses.
- How can data sets from multiple surveys (with or without overlap in certain key parameters) be combined? Can we take a predictive model obtained in one survey and transform it into an accurate model for another survey?
- What techniques apply to specific categories of time series (non-stationary, stochastic/deterministic, etc.)? For example, can we assume ergodicity? Develop formal statistical tools/tests for assessing the viability of simpler data-structure assumptions.
News and Updates: Coming soon…
SAMSI Directorate Liaison: Sujit Ghosh