Application Deadline is August 5, 2016
Location
The workshop venue has changed. It will now be held at North Carolina State University:
http://analytics.ncsu.edu
Description
The workshop aims to bring academic researchers and industrial engineers together to explore and discuss recent challenges faced by practitioners, along with the related theory and proven best practices in distributed data analytics from both academia and industry.
Recent work in computational mathematics and machine learning has made great strides in distributed optimization and distributed learning. For example, by enforcing ‘consensus’ between local variables and a global variable, the Alternating Direction Method of Multipliers (ADMM) algorithm can be used to solve a distributed version of the LASSO problem. On the other hand, classical statistical methodology, theory, and computation are based on the assumption that the entire data set is available at a central location; this is a significant shortcoming in modern problem solving, because computation on a single machine can be thousands of times faster than data transmission between locations.
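The consensus idea can be illustrated with a minimal sketch. It assumes the standard global-consensus ADMM formulation of the LASSO, in which machine i holds a data block (A_i, b_i), keeps a local copy x_i of the coefficients, and all copies are tied to one global variable z. The function names, the parameters `rho` and `iters`, and the toy data are hypothetical choices made for illustration and do not come from the workshop materials. The sketch is pure Python to stay dependency-free; a real implementation would use a linear algebra library and run the local updates on separate machines.

```python
def soft_threshold(v, kappa):
    """Elementwise soft-thresholding: the proximal operator of kappa * ||.||_1."""
    return [max(0.0, x - kappa) - max(0.0, -x - kappa) for x in v]


def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def consensus_admm_lasso(blocks, lam, rho=1.0, iters=500):
    """Minimize sum_i 0.5*||A_i x - b_i||^2 + lam*||x||_1 by consensus ADMM.

    blocks: list of (A_i, b_i) pairs, one per (simulated) machine.
    Returns the global consensus variable z.
    """
    n_machines = len(blocks)
    p = len(blocks[0][0][0])
    # Local precomputation on each machine: A_i^T A_i + rho*I and A_i^T b_i.
    grams, atbs = [], []
    for A, b in blocks:
        grams.append([[sum(row[j] * row[k] for row in A) + (rho if j == k else 0.0)
                       for k in range(p)] for j in range(p)])
        atbs.append([sum(A[r][j] * b[r] for r in range(len(b))) for j in range(p)])

    z = [0.0] * p
    u = [[0.0] * p for _ in range(n_machines)]  # scaled dual variables
    for _ in range(iters):
        # Local x-updates: each uses only its own data block.
        x = [solve(grams[i], [atbs[i][j] + rho * (z[j] - u[i][j]) for j in range(p)])
             for i in range(n_machines)]
        # Global z-update: only p-dimensional vectors are communicated.
        avg = [sum(x[i][j] + u[i][j] for i in range(n_machines)) / n_machines
               for j in range(p)]
        z = soft_threshold(avg, lam / (rho * n_machines))
        # Dual updates penalize disagreement between local copies and z.
        u = [[u[i][j] + x[i][j] - z[j] for j in range(p)] for i in range(n_machines)]
    return z


# Toy data (hypothetical): two machines, one observation each.
# The stacked design matrix is the identity, so the exact LASSO solution
# is the soft-thresholding of (3.0, 0.5) at lam = 1.0.
blocks = [([[1.0, 0.0]], [3.0]), ([[0.0, 1.0]], [0.5])]
z_hat = consensus_admm_lasso(blocks, lam=1.0)
print(z_hat)
```

Only the local estimates and dual variables cross machine boundaries in each iteration, never the raw data, which is why this style of algorithm suits settings where transmission is much slower than local computation.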
Specific goals of the workshop include (i) exposing academic researchers to both the challenges in industrial applications and the computing tools currently used in industry, (ii) introducing industrial researchers to the frontiers of applied mathematical and statistical methods for distributed inference, and (iii) educating graduate students and early-career researchers about practical computing and theoretical studies in distributed analytics. The workshop will begin with a few tutorial-style lectures, followed by lectures and panels on state-of-the-art, research-based methods given by leading researchers and practitioners in this emerging field of mathematics.
The workshop will be limited to about 50 participants, and priority for funding support will be given to U.S.-based researchers.
Schedule and Supporting Media
Wednesday, September 21st
NCSU Centennial Campus
Time | Description | Speaker | Slides | Videos |
---|---|---|---|---|
8:50–9:00 | Welcome and Introductory Remarks | Sujit Ghosh, SAMSI | ||
9:00–9:35 | Scalable Probabilistic Inference from Big and Complex Data | David Dunson, Duke | ||
9:40–10:15 | Asynchronous Parallel Coordinate Update Algorithms | Wotao Yin, UCLA | ||
10:50–11:25 | Distributed Hyper-Parameter Optimization for Machine Learning | Yan Xu, SAS | ||
11:30–12:05 | Interaction Selection and Screening for High Dimensional Data | Helen Zhang, University of Arizona | ||
1:30–2:05 | Privacy-Preserving Methods for Handling Missing Data in Distributed Health Data Networks | Qi Long, Emory | ||
2:10–2:45 | DPDA Application in Predix Ecosystem for Real-time Monitoring and Diagnostics of Energy Assets | Xiaomo Jiang, GE Power | ||
2:50–3:25 | A Sequential Split-Conquer-Combine Approach for Gaussian Process Model in Analysis of Big Spatial Data | Min-ge Xie, Rutgers | ||
4:00–4:15 | Funding Opportunities at NSF | Yong Zeng, NSF | ||
4:15–5:00 | Discussion: Lightning Talks | Discussion Moderator: Sujit Ghosh, SAMSI | Alexander, Spencer, Liu & Mei, Zhang, Madar, Wang, Yuchen | |
Thursday, September 22nd
NCSU Centennial Campus
Time | Description | Speaker | Slides | Videos |
---|---|---|---|---|
9:00–9:35 | Distributed Estimation and Inference with Statistical Guarantees | Jianqing Fan, Princeton | ||
9:40–10:15 | HPDA Growth Constraints in Digital Marketing | Samuel Franklin, 360i | ||
10:50–11:25 | Bayesian Aggregation for Extraordinarily Large Dataset | Guang Cheng, Purdue | ||
11:30–12:05 | Blessing of Massive Scale | Han Liu, Princeton | ||
1:30–2:05 | Bayesian Neural Networks for High Dimensional Nonlinear Variable Selection | Faming Liang, University of Florida | ||
2:10–2:45 | Strategies & Principles for Distributed Machine Learning | Eric Xing, Carnegie Mellon | ||
2:50–3:25 | Challenges and Opportunities in Automated Driving and Connected Vehicles | Yilu Zhang and Wei Tong, GM | ||
4:00–4:35 | Parallel Local Graph Clustering | Kimon Fountoulakis, UC Berkeley | ||
Friday, September 23rd
NCSU Centennial Campus
Time | Description | Speaker | Slides | Videos |
---|---|---|---|---|
9:40–10:15 | Scalable and Robust Statistical Estimation: a tale of the geometric median | Stas Minsker, University of Southern California | ||
10:50–11:25 | Uncover Customer Insights with Apache Spark and ML | Bo Zhang, IBM | ||
11:30–12:05 | Some Recent Development in Spatial Statistics for Large Datasets | Raj Guhaniyogi, University of California, Santa Cruz | ||
Questions: email [email protected]