Geometric and Topological Summaries of Data and Inference – Fall 2013


First class: Tuesday, August 27, 2013, 4:30-7:00 p.m. in Room 150 at SAMSI
No class: September 10, 2013
No class: October 8, 2013
No class: November 26, 2013
Last class: Tuesday, December 3, 2013

Course Description:  The course covered geometric and topological summaries computed from data that are routinely generated across science and engineering, with an emphasis on modeling objects that have geometric or topological structure. Examples included curves; surfaces such as bones or teeth; higher-dimensional objects such as positive definite matrices; subspaces that describe variation in phenotypic traits due to genetic variation; and the geometry of multivariate trajectories generated from cellular processes. Specific topics included the following.

(1) Geometry in statistical inference: Material covered included recent work in machine learning and statistics on manifold learning, subspace inference, factor models, and inference of covariance/positive definite matrices. Applications were used to highlight methodologies. The focus was on methods that use geometric ideas to reduce high-dimensional data to low-dimensional summaries.

(2) Topology in statistical inference: Material covered focused on probabilistic perspectives on topological summaries such as persistent homology, and on inference of topological summaries based on the Hodge operator and the Laplacian on forms. Again, applications were used to highlight methodologies.

(3) Random geometry and topology: Material covered the geometry and topology induced by random processes. Topics included the topology of random clique complexes, random geometric complexes, and limit theorems for Betti numbers of random simplicial complexes.

(4) Applications of the Laplacian operator in data analysis: Material covered the various uses of the Laplacian in data analysis, including manifold learning, spectral clustering, and Cheeger inequalities. More advanced topics included the Hodge operator, or combinatorial Laplacian, and its applications to data analysis, including decomposing ranked data into consistent and inconsistent components, inferring structure in social networks, and decomposing games into parts that have Nash equilibria and parts that cycle.

Prerequisites: Background in calculus and linear algebra, and a reasonable foundation in statistics and probability.

Course Format: The main instructor was Sayan Mukherjee, joined by several guest lecturers. The material and instructors paralleled several of the major themes of the 2013-2014 year-long SAMSI program on Low-Dimensional Structure in High-Dimensional Systems (LDHD).

Registration for this course was processed through each student's home university:

  • Duke: STA 790.01
  • NCSU: MA 810.001
  • UNC: STOR 892.1

Questions: email