Professional Development Lunch
January 29, 2014, 12:00pm – 1:30pm
Scalable and Robust Bayesian Inference via the Median Posterior
February 19, 2014, 1:15pm – 2:15pm
Speaker: Sanvesh Srivastava
Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on a popular idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. The main novelty is the proposed aggregation step which is based on finding the geometric median of posterior distributions. We present both theoretical and numerical results illustrating the advantages of our approach.
***This is based on joint work with Lizhen Lin, Stanislav Minsker, and David B. Dunson.***
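The split-and-combine idea in the abstract can be illustrated with a deliberately simplified sketch. The approach described in the talk takes a geometric median of the subset posterior distributions themselves; the example below instead takes the geometric median of hypothetical subset posterior means in the plane, using Weiszfeld's iteration. The function name, the point summaries, and the numbers are all assumptions made for illustration and are not taken from the talk.

```python
import math

def geometric_median(points, max_iters=500, tol=1e-10):
    """Weiszfeld iteration: approximate the point minimizing the summed Euclidean distance."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    current = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iters):
        # Inverse-distance weights; the clamp avoids division by zero at a data point.
        weights = [1.0 / max(math.dist(current, p), 1e-12) for p in points]
        total = sum(weights)
        updated = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ]
        if math.dist(updated, current) < tol:
            return updated
        current = updated
    return current

# Hypothetical posterior means from five disjoint data subsets;
# the last subset is assumed to contain corrupted observations.
subset_means = [(1.0, 2.0), (1.1, 1.9), (0.9, 2.1), (1.0, 2.1), (50.0, -40.0)]

median_summary = geometric_median(subset_means)
mean_summary = [sum(p[k] for p in subset_means) / len(subset_means) for k in range(2)]
print("geometric median:", median_summary)
print("plain average:   ", mean_summary)
```

In this toy setting the plain average is dragged far toward the corrupted subset, while the geometric median stays with the four subsets that agree, which is the kind of robustness the abstract refers to.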
Professional Development Lunch
March 5, 2014, 1:00pm – 2:00pm
Speaker: Kenny Lopiano
Kenny Lopiano will talk about how he went about forming his own company.
MUSIC for line spectral estimation: stability and super-resolution
March 12, 2014, 1:15pm – 2:15pm
Speaker: Wenjing Liao
The problem of spectral estimation, namely recovering the frequency content of a signal, arises in various fields of science and engineering, including speech recognition, array imaging, and remote sensing. In this talk, I will introduce the MUltiple SIgnal Classification (MUSIC) algorithm for line spectral estimation and provide a stability analysis of it. Numerical comparisons of MUSIC with other algorithms, such as greedy algorithms and L1 minimization, show that MUSIC combines strong stability with low computational complexity for the detection of well-separated frequencies. Moreover, among the algorithms compared, MUSIC is the only one capable of resolving closely spaced frequencies.
***This is joint work with Albert Fannjiang.***
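For readers unfamiliar with MUSIC, the following is a minimal sketch of a common single-snapshot formulation: build a Hankel matrix from the samples, take the left singular vectors beyond the assumed number of frequencies as a noise subspace, and look for grid frequencies whose steering vectors are nearly orthogonal to it. The function name, the grid search, the peak-picking rule, and the test signal are illustrative assumptions, not the speaker's implementation. It relies on NumPy.

```python
import numpy as np

def music_line_spectrum(samples, num_freqs, grid_size=2000):
    """Estimate num_freqs frequencies in [0, 1) from uniformly spaced complex samples."""
    samples = np.asarray(samples, dtype=complex)
    rows = len(samples) // 2 + 1
    cols = len(samples) - rows + 1
    # Hankel matrix: entry (a, b) holds samples[a + b].
    hankel = np.array([samples[a:a + cols] for a in range(rows)])
    left_vectors, _, _ = np.linalg.svd(hankel)
    # Left singular vectors beyond the signal rank span the noise subspace.
    noise_basis = left_vectors[:, num_freqs:]
    grid = np.arange(grid_size) / grid_size
    steering = np.exp(2j * np.pi * np.outer(np.arange(rows), grid))
    residual = np.linalg.norm(noise_basis.conj().T @ steering, axis=0)
    # Imaging function: large where a steering vector is almost orthogonal to the noise subspace.
    imaging = np.linalg.norm(steering, axis=0) / (residual + 1e-12)
    is_peak = (imaging > np.roll(imaging, 1)) & (imaging >= np.roll(imaging, -1))
    peak_idx = np.flatnonzero(is_peak)
    strongest = peak_idx[np.argsort(imaging[peak_idx])[-num_freqs:]]
    return np.sort(grid[strongest])

# Hypothetical test signal: two well-separated frequencies plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(32)
noise = 0.01 * (rng.standard_normal(32) + 1j * rng.standard_normal(32))
signal = np.exp(2j * np.pi * 0.20 * t) + 0.7 * np.exp(2j * np.pi * 0.35 * t) + noise

estimated = music_line_spectrum(signal, num_freqs=2)
print(estimated)
```

The sketch assumes the number of frequencies is known in advance and does not attempt the closely spaced (super-resolution) regime discussed in the talk.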
Bayesian Analysis for finite population prediction with limited survey information
March 19, 2014, 1:15pm – 2:15pm
Speaker: Neung Soo Ha
We describe a Bayesian method for finite population prediction of the number of uninsured persons in each Florida county. We develop a predictive distribution for nonsampled units when the selection probabilities are available only for the sampled units. We also fine-tune our predictions using benchmarking techniques, so that, when aggregated, the estimates for a larger geographical area (or population domain) equal the corresponding direct estimates, which are usually believed to be quite reliable. We apply our methods to data from the Behavioral Risk Factor Surveillance System.
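The benchmarking constraint described above can be illustrated with the simplest possible adjustment, a ratio rescaling of small-area estimates so that they sum to a trusted aggregate. This is only a sketch of the constraint itself; the abstract does not specify which benchmarking technique is used, and the function name, county labels, and figures below are hypothetical.

```python
def ratio_benchmark(area_estimates, direct_total):
    """Rescale area-level estimates so that they sum to a trusted direct total."""
    model_total = sum(area_estimates.values())
    if model_total == 0:
        raise ValueError("cannot rescale estimates that sum to zero")
    factor = direct_total / model_total
    return {area: value * factor for area, value in area_estimates.items()}

# Hypothetical model-based county estimates and a hypothetical direct state-level estimate.
county_estimates = {"county_a": 12000.0, "county_b": 8000.0, "county_c": 5000.0}
state_direct_estimate = 26000.0

benchmarked = ratio_benchmark(county_estimates, state_direct_estimate)
print(benchmarked)
print(sum(benchmarked.values()))
```

Ratio rescaling preserves each county's share of the total while forcing agreement with the aggregate.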
Compressive Support Detection based on Multiple Hypothesis Testing and the Tube Method
March 26, 2014, 1:15pm – 2:15pm
Speaker: Yi Grace Wang
Compressive sensing is a technique for reducing the size of data at the collection stage. By sampling much less data, it reduces imaging time and cost, which makes it very useful in applications such as astronomy, medical imaging, and sensor networks, especially when dealing with massive data. In medical applications, for instance, less imaging can mean less radiation. Compressive sensing reconstruction takes advantage of the signal’s compressibility or sparsity in some transform domain to recover the underlying image of interest. This paper addresses how to make inferences about the support of the underlying signal from compressive sensing data. We develop a general and practical multiple comparison procedure (MCP), via the tube method, for compressive support detection. It is the first work able to make compressive inferences about an underlying signal defined on a continuous domain. Parameters are selected by generalized cross-validation, and the variance is also estimated. A comparison with a Bonferroni-based approach validates the advantages of the proposed method.
Complex contagion on noisy geometric networks
April 9, 2014, 1:15pm – 2:15pm
Speaker: Dane Taylor
The study of contagion on networks is central to our understanding of collective social processes and epidemiology. However, for networks arising from an underlying manifold such as the Earth’s surface, it remains unclear to what extent the dynamics reflect this inherent structure, especially when long-range, “noisy” edges are present. We study the Watts threshold model (WTM) for complex contagion on noisy geometric networks, a generalization of small-world networks in which nodes are embedded on a manifold. To study the extent to which contagion adheres to the manifold versus the network, which can disagree greatly on notions such as node-to-node distance, we present WTM-maps that embed the network nodes as a point cloud whose geometry, topology, and intrinsic dimensionality we study. This work bridges several research disciplines by aligning the pursuits of network science and epidemiology with those of manifold learning and dimension reduction.
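As background, in the Watts threshold model an inactive node becomes active once the fraction of its neighbors that are active reaches a threshold. The sketch below simulates that rule with synchronous updates on a small ring lattice, with and without one long-range edge. The ring lattice, the seed set, the threshold values, and the function names are assumptions chosen for illustration; the construction of WTM-maps from such cascades is not reproduced here.

```python
def ring_lattice(num_nodes, reach, extra_edges=()):
    """Ring where each node links to `reach` neighbors per side, plus optional long-range edges."""
    neighbors = {i: set() for i in range(num_nodes)}
    for i in range(num_nodes):
        for offset in range(1, reach + 1):
            j = (i + offset) % num_nodes
            neighbors[i].add(j)
            neighbors[j].add(i)
    for a, b in extra_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def watts_threshold_cascade(neighbors, seeds, threshold):
    """Return {node: activation step} under synchronous Watts threshold updates."""
    activation_time = {node: 0 for node in seeds}
    step = 0
    while True:
        step += 1
        # Evaluate every inactive node against the state at the end of the previous step.
        newly_active = [
            node for node, nbrs in neighbors.items()
            if node not in activation_time and nbrs
            and sum(n in activation_time for n in nbrs) / len(nbrs) >= threshold
        ]
        if not newly_active:
            return activation_time
        for node in newly_active:
            activation_time[node] = step

geometric = ring_lattice(10, reach=2)
noisy = ring_lattice(10, reach=2, extra_edges=[(0, 5)])

times_geometric = watts_threshold_cascade(geometric, seeds={0, 1}, threshold=0.3)
times_noisy = watts_threshold_cascade(noisy, seeds={0, 1}, threshold=0.3)
print(times_geometric[5], times_noisy[5])  # node 5 activates at step 4, then at step 3
```

In this toy example the contagion travels around the ring as a wavefront, and the single long-range edge lets node 5 activate one step earlier than the ring geometry alone would allow.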
April 16, 2014, 1:15pm – 2:15pm
Global and local connectivity analysis of galactic spectra
April 23, 2014, 1:15pm – 2:15pm
Speaker: David Lawlor
Over the past decade, the astronomical community has begun to utilize machine learning tools for understanding and interpreting the vast quantities of data collected by large-scale observational surveys such as the Sloan Digital Sky Survey (SDSS). We add to this literature by examining the connectivity of a large set of spectroscopic data through its embedding in certain diffusion spaces. We are able to interpret our embeddings in physical terms as well as to identify certain rare galaxy types and outliers due to errors in the preprocessing pipeline. If time permits we will also discuss a local analogue of these diffusion embeddings that allows one to focus on a particular region of interest in the data.
Regularized tensor regression
April 30, 2014, 1:15pm – 2:15pm
Speaker: Minh Pham
In this talk, I am going to present my ongoing work on regularized tensor regression. Tensor regression extends the classical regression framework and has many applications in neuroimaging. From the optimization standpoint, it is a very challenging problem in which the objective function has two or more regularization terms. I am going to discuss the application of standard methods such as Split Bregman and ADMM, as well as a different potential method for this type of problem.
Temporal latent space network models for dynamic grooming interactions in baboon troops
May 7, 2014, 1:15pm – 2:15pm
Speaker: Bailey Fosdick
Baboon troops are intriguing social populations as they have strict social hierarchies and about once every fifteen years a given troop will fission into two new troops. Often this occurs according to matrilineal or patrilineal lines, but once in a while, neither of these familial patterns is exhibited. In these cases, primatologists are greatly interested in understanding the severance process and determining whether temporal data on baboon grooming activities may foreshadow the fission event and eventual troop memberships. Current network models are inadequate for modeling such data since they cannot readily account for variable intensity of observation across time and baboons, and they lack a natural mechanism that allows for social distancing over time and prediction of future grooming. In this talk, we present a dynamic latent space network model that addresses these issues. We demonstrate our methodology on data from a baboon troop in the Amboseli National Reserve in Kenya. This is joint work with Yingbo Li, David Banks, and Susan Alberts.