# Postdoctoral Fellow Seminars

## Lecture: Advancements in Hybrid Iterative Methods for Inverse Problems

September 16, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Julianne Chung, Virginia Tech

### Abstract

In many physical systems, measurements can only be obtained on the exterior of an object (e.g., the human body or the earth’s crust), and the goal is to estimate the internal structures. In other systems, signals measured from machines (e.g., cameras) are distorted, and the aim is to recover the original input signal. These are natural examples of inverse problems that arise in fields such as medical imaging, astronomy, geophysics, and molecular biology.

Hybrid iterative methods are increasingly being used to solve large, ill-posed inverse problems, due to their desirable properties of (1) avoiding semi-convergence, so that later reconstructions are no longer dominated by noise, and (2) enabling adaptive and automatic regularization parameter selection. In this talk, we describe some recent advancements in hybrid iterative methods for computing solutions to large-scale inverse problems. First, we consider a hybrid approach based on the generalized Golub-Kahan bidiagonalization for computing Tikhonov regularized solutions to problems where explicit computation of the square root and inverse of the prior covariance matrix is not feasible. This is useful for large-scale problems where covariance kernels are defined on irregular grids or are only available via matrix-vector multiplication. Second, we describe flexible hybrid methods for solving $\ell_p$ regularized inverse problems, where we approximate the p-norm penalization term as a sequence of 2-norm penalization terms using adaptive regularization matrices, and we exploit flexible preconditioning techniques to efficiently incorporate the weight updates. We introduce a flexible Golub-Kahan approach within a Krylov-Tikhonov hybrid framework, such that our approaches extend to general (non-square) $\ell_p$ regularized problems. Numerical examples from dynamic photoacoustic tomography, space-time deblurring, and passive seismic tomography demonstrate the range of applicability and effectiveness of these approaches.
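As a rough illustration of how a p-norm penalty can be replaced by a sequence of 2-norm penalties, the standard iteratively reweighted idea can be written as follows. This is a generic sketch, not necessarily the exact formulation used in the talk: the symbols $A$, $b$, $\lambda$, the iterate $x^{(k)}$, the weight matrix $L_k$, and the small smoothing constant $\tau$ are notation assumed here, since the abstract does not define them.

$$
\|x\|_p^p = \sum_i |x_i|^{p-2}\, x_i^2 \;\approx\; \sum_i \bigl((x_i^{(k)})^2 + \tau^2\bigr)^{\frac{p-2}{2}} x_i^2 = \|L_k x\|_2^2,
\qquad
L_k = \mathrm{diag}\Bigl(\bigl((x_i^{(k)})^2 + \tau^2\bigr)^{\frac{p-2}{4}}\Bigr),
$$

$$
x^{(k+1)} = \arg\min_x \; \|Ax - b\|_2^2 + \lambda \|L_k x\|_2^2 .
$$

Each step is then a Tikhonov problem with an adaptive regularization matrix $L_k$ that is rebuilt from the current iterate; the first equality holds for entries $x_i \neq 0$.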

## Lecture: Sharp 2-norm Error Bounds for the Conjugate Gradient Method and LSQR

September 23, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Eric Hallman, North Carolina State University

### Abstract

When running any iterative algorithm, it is useful to know when to stop. Here we review the conjugate gradient method, an iterative method for solving $Ax=b$ where $A$ is symmetric positive definite, as well as estimates for the 2-norm error $\|x-x_*\|_2$, where $x_*$ is the solution to the linear system. We introduce a new method for computing an upper bound on the 2-norm error, and show that given certain mild assumptions our bounds are optimal. Experimental results are discussed as well as the implications of our work for solving the least-squares problem $\min_x \|Ax-b\|$ using the iterative algorithm LSQR.
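For readers unfamiliar with the method, a minimal sketch of the textbook conjugate gradient iteration is given below in plain Python. It stops on the relative residual norm, which is only a proxy for the error; it does not compute the 2-norm error bounds discussed in the talk. The function names and the `tol` and `max_iter` parameters are illustrative choices, not taken from the abstract.

```python
import math


def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A.

    Returns the approximate solution and the number of iterations taken.
    Stopping rule (an assumption of this sketch): ||r||_2 <= tol * ||b||_2.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)            # residual b - A x for the starting guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    iterations = 0
    while iterations < max_iter and math.sqrt(rs_old) > tol * b_norm:
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                       # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                            # direction update
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations
```

A small residual does not by itself guarantee a small error when $A$ is ill-conditioned, which is one reason error estimates of the kind described in the abstract are of interest.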

## Lecture: Convergence of the Parameters in Mixture Models with Repeated Measurements

September 30, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Yun Wei, SAMSI

### Abstract

Latent structure models with many observed variables are among the most powerful and widely used tools in statistics for learning about heterogeneity within data populations. An important canonical example of such models is the mixture of product distributions. We consider the finite mixture of product distributions with the special structure that the product distributions in each mixture component are also identically distributed. In this setup, each mixture component consists of samples from repeated measurements and thus such data are exchangeable sequences. Applications of the model include psychological studies and topic modeling.

We show that with sufficient repeated measurements, a model that is not originally identifiable becomes identifiable. The posterior contraction rate for the parameter estimation is also obtained and it shows that repeated measurements are beneficial for estimating parameters in each mixture component. Such results hold for general probability kernels including all regular exponential families and can be applied to hierarchical models.

Based on joint work with Xuanlong Nguyen.

## Lecture: Randomized Approaches to Accelerate MCMC Algorithms for Bayesian Inverse Problems

October 7, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Arvind Saibaba, North Carolina State University

### Abstract

Markov chain Monte Carlo (MCMC) approaches are traditionally used for uncertainty quantification in inverse problems where the physics of the underlying sensor modality is described by a partial differential equation (PDE). However, the use of MCMC algorithms is prohibitively expensive in applications where each log-likelihood evaluation may require hundreds to thousands of PDE solves corresponding to multiple sensors; i.e., spatially distributed sources and receivers perhaps operating at different frequencies or wavelengths depending on the precise application. In this talk, I will show how to mitigate the computational cost of each log-likelihood evaluation by using several randomized techniques and embed these randomized approximations within MCMC algorithms. These MCMC algorithms are computationally efficient methods for quantifying the uncertainty associated with the reconstructed parameters. We demonstrate the accuracy and computational benefits of our proposed algorithms on a model application from diffuse optical tomography where we invert for the spatial distribution of optical absorption.

## Lecture: Individual Level Always Survivor, Direct, Spillover Effects with Applications

October 14, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Jaffer Zaidi, SAMSI

### Abstract

We provide investigators with the ability to quantify individual level always survivor, direct, and spillover effects. The survivor average causal effect is commonly identified with more assumptions than those guaranteed by the design of a randomized clinical trial. This paper demonstrates that individual level causal effects in the 'always survivor' principal stratum can be identified with no stronger identification assumptions than randomization. We illustrate the practical utility of our methods using data from a clinical trial on patients with prostate cancer. We also provide another application on the spillover effects of randomized get-out-the-vote campaigns. Our methodology is the first and, as of yet, only proposed procedure that enables detecting individual level causal effects in the presence of truncation by death using only the assumptions that are guaranteed by the design of the clinical trial.

## Lecture: Scalable Bayesian Inference for Time Series via Divide-and-conquer

October 21, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Deborshee Sen, SAMSI

### Abstract

Bayesian computational algorithms tend to scale poorly as the size of data increases. This has led to the development of divide-and-conquer-based approaches for scalable inference. These divide the data into chunks, perform inference for each chunk in parallel, and then combine these inferences. While appealing theoretical properties and practical performance have been demonstrated for independent observations, scalable inference for dependent data remains challenging. In this work, we study the problem of Bayesian inference from very long time series. The literature in this area focuses mainly on approximate approaches that lack any theoretical guarantees and may provide arbitrarily poor accuracy in practice. We propose a simple and scalable divide-and-conquer algorithm, and provide accuracy guarantees. Numerical simulations and real data applications demonstrate the effectiveness of our approach.
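The divide, infer, and combine pattern described above can be sketched for the simplest independent-data case. This toy example is not the speaker's time series algorithm: it assumes independent Gaussian observations with known variance and a flat prior on the mean, so that each chunk posterior is Gaussian and the chunk posteriors can be combined exactly by precision weighting. All function names are hypothetical, and the chunks are processed sequentially here rather than in parallel.

```python
def chunk_posterior(chunk, sigma2):
    """Posterior (mean, variance) of a Gaussian mean for one chunk.

    Assumes a flat prior and known observation variance sigma2,
    so the posterior is N(sample mean, sigma2 / len(chunk)).
    """
    n = len(chunk)
    return sum(chunk) / n, sigma2 / n


def combine(posteriors):
    """Combine Gaussian chunk posteriors by multiplying their densities.

    The product of Gaussians has precision equal to the sum of the
    precisions and mean equal to the precision-weighted average.
    """
    precisions = [1.0 / var for _, var in posteriors]
    total = sum(precisions)
    mean = sum(w * m for w, (m, _) in zip(precisions, posteriors)) / total
    return mean, 1.0 / total


def divide_and_conquer(data, n_chunks, sigma2=1.0):
    """Split non-empty data into roughly equal chunks, infer, then combine."""
    size = -(-len(data) // n_chunks)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return combine([chunk_posterior(c, sigma2) for c in chunks])
```

In this special case the combined result coincides with the posterior computed from the full data set. For dependent data such as time series, the chunks are not independent and this simple product rule is no longer exact, which is the difficulty the abstract points to.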

## Lecture: Probabilistic Learning on Manifolds

October 28, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Ruda Zhang, SAMSI

### Abstract

Probabilistic models of data sets often exhibit salient geometric structure. Such a phenomenon is summed up in the manifold distribution hypothesis, and can be exploited in probabilistic learning tasks such as density estimation and generative modeling. In this talk I present a framework for probabilistic learning on manifolds (PLoM), which uses manifold learning to discover low-dimensional structures within high-dimensional data, and exploits topological properties of the learned manifold to efficiently build probabilistic models. A joint distribution is partitioned into a marginal distribution on the manifold and conditional distributions on normal spaces of the manifold. The marginal distribution can be estimated using Riemannian kernels, and the conditional distributions can be estimated discretely by normal-bundle bootstrap or continuously using Gaussian kernels. Combining the marginal and conditional models gives a joint generative model. I will also talk about related algorithms and software development, and potential applications.

## Lecture: Competition and Spreading of Low and High-Quality Information in Online Social Networks

November 4, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Diego Fregolente, SAMSI

### Abstract

The advent of online social networks as major communication platforms for the exchange of information and opinions is having a significant impact on our lives by facilitating the sharing of ideas. Through networks such as Twitter and Facebook, users are exposed daily to a large number of transmissible pieces of information that compete to attain success. Such information flows have increasingly consequential implications for politics and policy, making the questions of discrimination and diversity more important in today’s online information networks than ever before. However, while one would expect the best ideas to prevail, empirical evidence suggests that high-quality information has no competitive advantage. We investigate this puzzling lack of discriminative power through an agent-based model that incorporates behavioral limitations in managing a heavy flow of information and measures the relationship between the quality of an idea and its likelihood of becoming prevalent at the system level. We show that both information overload and limited attention contribute to a degradation in the system’s discriminative power. A good tradeoff between discriminative power and diversity of information is possible according to the model. However, calibration with empirical data characterizing information load and finite attention in real social media reveals a weak correlation between quality and popularity of information. In these realistic conditions, the model provides an interpretation for the high volume of viral misinformation we observe online.

## Lecture: Retrospective Causal Inference via Matrix Completion, with an Evaluation of the Effect of European Integration on Labour Market Outcomes

November 11, 2020, 1:15pm – 2:15pm
Virtual
Speaker: Jason Poulos, SAMSI

### Abstract

We propose a method of retrospective counterfactual prediction in panel data settings with units exposed to treatment after an initial time period (later-treated), and always-treated units, but no never-treated units. We invert the standard setting by using the observed post-treatment outcomes to predict the counterfactual pre-treatment potential outcomes under treatment for the later-treated units. We impute the missing outcomes via a matrix completion estimator with a propensity- and elapsed-time weighted objective function that corrects for differences in the covariate distributions and elapsed time since treatment between groups. Our methodology is motivated by evaluating the effect of two milestones of European integration on the share of cross-border workers in sending border regions. We provide evidence that opening the border increased the probability of working beyond the border in Eastern European regions.