2012-13 Program on Statistical and Computational Methodology for Massive Datasets
This year-long SAMSI program focused on fundamental methodological questions of statistics, mathematics and computer science posed by massive datasets, with applications to astronomy, high energy physics, and the environment.
The most serious challenges posed by massive datasets concern scalability and data streaming. Techniques developed for small or moderate-sized datasets do not carry over to modern massive datasets. Data acquisition rates on the order of gigabytes per second call for innovative approaches to computing environments, analysis, and algorithms.
Research Working Groups
Working groups are at the heart of the scientific activities at SAMSI. They consist of SAMSI visitors, postdoctoral fellows, graduate students, local faculty, and other scientists. The working groups met weekly throughout the program year to pursue the following research topics, which were identified at the Planning Workshop and the Opening Workshop or subsequently chosen by the working group participants:
- Inference
- Online streaming and sketching
- Imaging
- Data mining and clustering
- Multi-scale modeling
- Graphical models and graphics processors
- Stochastic processes and astrophysical inference
- Discovery and classification in synoptic surveys
- High energy physics
- Environment and climate
Workshops
- Opening Workshop, 9-12 September 2012
- Astrostatistics Workshop, 19-21 September 2012
- SAMSI-FODAVA Workshop on Interactive Visualization and Analysis of Massive Data, 10-12 December 2012
- SAMSI-NCAR Workshop on Massive Datasets in Environment and Climate, 12-15 February 2013
- Transition Workshop, 20-22 May 2013
Graduate Course
The two-semester graduate course "Computational and Inferential Methods for High Dimensions and Massive Datasets" (Fall 2012 and Spring 2013) covered fundamental methodological questions of statistics, mathematics, and computer science posed by massive datasets, with applications to astronomy, high energy physics, and environment and climate.
Working Groups
- Bayesian Inference using INLA
- SAMSI SPRING 2013 Course: Computational and Inferential Methods for High Dimensions and Massive Datasets
- MD Inference for Multiple Testing
- MD Inference - Dimension Reduction Variable Selection and Sparsity
- MD Stochastic Processes and Astrophysical Inference
- MD Graphical Models & Graphics Processors
- MD Inference & Simulation in Complex Models
- MD Multiscale Modeling
- MD High Energy Physics (HEP)
- MD Inference
- MD Environment and Climate
- MD Datamining & Clustering
- MD Discovery & Classification in Synoptic Surveys
- MD Online Streaming & Sketching
- MD Imaging
- SAMSI FALL 2012 Course: Computational and Inferential Methods for High Dimensions and Massive Datasets