Opening Workshop, Massive Datasets Program - September 9-12, 2012
Workshop Information
The Opening Workshop was held Sunday-Wednesday, 9-12 September 2012, at the Radisson Hotel Research Triangle Park, NC. The hotel is close to SAMSI.
Schedule
Sunday, September 9, 2012
Radisson RTP
8:30-9:00 | Registration and Continental Breakfast |
8:50-9:00 | Welcome and Introduction Ilse Ipsen, N.C. State University/SAMSI |
Tutorials | |
9:00-10:00 | Tamas Budavari, Johns Hopkins University Statistical Methods in Astronomy VIDEO |
10:00-10:30 | Break |
10:30-11:30 | Petros Drineas, Rensselaer Polytechnic Institute Mining Massive Datasets: A (randomized) Linear Algebraic Perspective VIDEO |
11:30-1:00 | Lunch |
1:00-2:00 | Haesun Park, Georgia Institute of Technology Visual Analytics for Knowledge Discovery in High Dimensional Data VIDEO |
2:00-2:30 | Break |
2:30-3:30 | Stephen Wright, University of Wisconsin Optimization Techniques for Statistical Analysis on Large Datasets |
3:30-4:00 | Break |
4:00-5:00 | Michael Jordan, University of California, Berkeley Resampling Methods for Massive Data VIDEO |
Monday, September 10, 2012
Radisson RTP
8:30-8:55 | Registration and Continental Breakfast |
8:55-9:00 | Welcome |
Session: Inference | |
9:00-9:45 | Bin Yu, University of California, Berkeley Stability |
9:45-10:30 | Xiaotong Shen, University of Minnesota On Personalized Information Filtering |
10:30-11:00 | Break |
11:00-11:45 | Brian Caffo, Johns Hopkins University Resting State Brain Functional Connectivity Data: progress, future challenges and data |
11:45-12:15 | Panel Chair: Bill Eddy, Carnegie Mellon University Panelists: Alex Gray, Georgia Tech; Karen Kafadar, Indiana University; Bo Li, Purdue University |
12:15-1:30 | Lunch |
Session: Imaging | |
1:30-2:15 | Jim Nagy, Emory University Numerical Methods for Large Scale Inverse Problems in Image Reconstruction |
2:15-3:00 | Jianqing Fan, Princeton University Iterative Screening and Estimation |
3:00-3:30 | Break |
3:30-4:15 | Rollin Thomas, Lawrence Berkeley National Lab Supernova Discovery in the Era of Data-Intensive Science |
4:15-4:45 | Panel Co-Chairs: Daniela Ushizima, Lawrence Berkeley National Lab, and Jiayang Sun, Case Western Reserve University Panelists: Peihua Qiu, University of Minnesota; Erkki Somersalo, Case Western Reserve University |
4:45-5:15 | Poster blitz (2 minutes per poster) |
5:15-5:30 | Break |
5:30-7:30 | Poster Session and Reception. SAMSI will provide poster presentation boards and tape. The boards are 4 ft. wide by 3 ft. high and tri-fold, with each side panel 1 ft. wide and the center panel 2 ft. wide. Please make sure your poster fits the board. The boards can accommodate up to 16 pages of paper measuring 8.5 inches by 11 inches. |
Tuesday, September 11, 2012
Radisson RTP
8:30-9:00 | Registration and Continental Breakfast |
Session: Environment & Climate | |
9:00-9:45 | Anna Michalak, Stanford University A Bird’s Eye View of the Carbon Cycle: Spatiotemporal tools for constraining the CO2 budget from atmospheric observations |
9:45-10:30 | Dan Crichton, Jet Propulsion Lab Architecting Highly Scalable Scientific Data Management and Discovery Systems |
10:30-11:00 | Break |
11:00-11:45 | Noel Cressie, University of Wollongong and The Ohio State University Uncertainty Quantification for Regional-Climate-Model Output |
11:45-12:15 | Panel Chair: Jessica Matthews, CICS-NC Panelists: Amy Braverman, Jet Propulsion Lab; Steve Sain, NCAR; Richard Smith, SAMSI/UNC-CH |
12:15-1:30 | Lunch |
Session: High Energy Physics | |
1:30-2:15 | Steffen Bass, Duke University Recreating the Big Bang in the Laboratory: The Scientific, Computational and Data Challenges of High Energy Nuclear Physics |
2:15-3:00 | Kyle Cranmer, New York University Statistical Aspects of the Discovery of the Higgs Boson at the Large Hadron Collider |
3:00-3:30 | Break |
3:30-4:15 | Luc Demortier, Rockefeller University Searches and Measurements in High Energy Physics |
4:15-4:45 | Panel Chair: Robert Wolpert, Duke University Panelists: Mandeep Gill, SLAC; Cosma Shalizi, Carnegie Mellon University; Daniel Whiteson, University of California, Irvine |
4:45-6:00 | Open Mic and Refreshments |
Wednesday, September 12, 2012
Radisson RTP
8:30-9:00 | Registration and Continental Breakfast |
Session: Streaming, Sketching & Datamining | |
9:00-9:45 | Michael Mahoney, Stanford University Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments |
9:45-10:30 | Maryam Fazel, University of Washington Convex Relaxations for Recovery of Models with Simultaneous Structures |
10:30-11:00 | Break |
11:00-11:45 | Inderjit Dhillon, University of Texas, Austin Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation |
11:45-12:15 | Panel Chair: Piotr Indyk, MIT Panelists: Graham Cormode, AT&T Labs-Research; Ashish Goel, Stanford University; Michael Mahoney, Stanford University |
12:15-1:30 | Lunch |
Working Groups | |
1:30-3:00 | Working Group Formation and Initial Meeting |
3:00 | Adjourn |
Partial list of research topics:
* Data visualization and analytics:
High-speed visualization of high-dimensional datasets; data representation, extraction, integration and transformation; real-time visual interaction; spatio-temporal data mining
* Online streaming and sketching:
Algorithm paradigms for massive datasets (streaming, online, randomized); scalability; filtering; anomaly detection; data structures for fast computation of statistics; database-enabled machine learning tools; computing environments and programming models (GPUs, cloud computing, custom chips)
* Large-scale optimization:
Convex optimization (sparse modeling and compressed sensing, matrix completion); online optimization (streaming data, online learning, control theory); distributed optimization (parallel and GPU computation, data fusion); machine learning; high-dimensional models
* Inference:
Dimension reduction for high-dimensional data (feature selection, sub-sampling and screening, sparse PCA); predictive inference and multiple testing (false discovery rates, uncertainty in prediction); high-dimensional MCMC methods for posterior inference (particle filters, hybrids with optimization methods)
* Imaging:
Rapid registration and segmentation (GPUs, distributed computing); multiple testing and inference for large-scale imaging data (sky surveys, satellite images, false discovery rate with dependence); dynamic imaging (streaming data, spatio-temporal models)
* Systems and architectures:
Reliability; resilience; probabilistic computing, multiple precision; real-time methods; variable data flows; hardware platforms
* High-energy physics:
Reconstruction and analysis of particle collisions from the LHC; pattern recognition and parameter extraction; simulations to estimate error rates; parameter estimation for large numbers of parameters; maximum likelihood estimators
* Astronomy:
Statistics on remote resources; computations on special-purpose architectures and GPUs; communication-avoiding methods; randomized and online algorithms; detection and classification of transient events and outliers; Bayesian inference and machine learning; high-dimensional models with empirical priors; non-parametric models; visualization of large high-dimensional datasets
* Environment and climate:
Production, validation, processing, distribution and integration of data; data fusion and remote sensing; algorithms for large distributed datasets; spatial or spatio-temporal statistics