2013-14 Course: Geometric and Topological Summaries of Data and Inference

The course covers geometric and topological summaries computed from data that are routinely generated across science and engineering, with an emphasis on modeling objects that have geometric or topological structure. Examples include curves; surfaces such as bones or teeth; higher-dimensional objects such as positive definite matrices; subspaces that describe variation in phenotypic traits due to genetic variation; and the geometry of multivariate trajectories generated by cellular processes. Specific topics will include the following.

(1) Geometry in statistical inference -- Material covered will include recent work in machine learning and statistics on the topics of manifold learning, subspace inference, factor models, and inferring covariance/positive definite matrices. Applications will be used to highlight methodologies. The focus will be on methods used to reduce high-dimensional data to low-dimensional summaries using geometric ideas.

(2) Topology in statistical inference -- Material covered will focus on probabilistic perspectives on topological summaries such as persistent homology, and on inference of topological summaries based on the Hodge operator and the Laplacian on forms. Again, applications will be used to highlight methodologies.

(3) Random geometry and topology -- Material will cover the geometry and topology induced by random processes. Topics include the topology of random clique complexes and random geometric complexes, and limit theorems for the Betti numbers of random simplicial complexes.

(4) Applications of the Laplacian operator in data analysis -- Material will cover the various uses of the Laplacian in data analysis, including manifold learning, spectral clustering, and Cheeger inequalities. More advanced topics will include the Hodge operator or combinatorial Laplacian and applications to data analysis including decomposing ranked data into consistent and inconsistent components, inference of structure in social networks, and decomposing games into parts that have Nash equilibria and parts that cycle.
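As a small illustration of the dimension reduction mentioned under topic (1), the following sketch finds the leading principal direction of a point cloud by power iteration on its sample covariance matrix. It is only one simple instance of a low-dimensional summary, not a method taken from the course material; the function name `top_principal_direction` and the toy data are assumptions made for this example.

```python
import math

def top_principal_direction(points, iters=100):
    """Leading principal direction and its share of the total variance."""
    n, d = len(points), len(points[0])
    # Center the data.
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0] * d  # arbitrary starting vector (assumption)
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    top_var = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    total_var = sum(cov[a][a] for a in range(d))
    return v, top_var / total_var

# Toy data (hypothetical): points close to the line spanned by (1, 2, 2) in R^3.
points = [(t + 0.01 * (-1) ** t, 2.0 * t, 2.0 * t) for t in range(-5, 6)]
direction, explained = top_principal_direction(points)
```

Here `direction` should be close to (1/3, 2/3, 2/3), and `explained` close to 1, since the toy data are nearly one-dimensional.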
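To give a flavor of the topological summaries in topic (2), here is a minimal sketch of degree-0 persistent homology for a point cloud: every point starts as its own connected component at scale 0, and a component dies at the length of the edge that merges it into another one. The sketch assumes a Vietoris-Rips style filtration indexed by edge length; the function name `h0_persistence` and the sample points are illustrative assumptions, and higher-degree persistence needs more machinery than is shown here.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """Death scales of connected components; one further component never dies."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this scale, so one of them dies.
            parent[ri] = rj
            deaths.append(length)
    return deaths

# Toy data (hypothetical): two well-separated clusters on a line.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
deaths = h0_persistence(points)
```

For these points the short bars end at scale 1 and a single long bar ends at scale 8, which reflects the two clusters.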
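For topic (3), the following sketch samples an Erdős-Rényi random graph, forms its clique complex up to triangles, and computes the Betti numbers b0 and b1 from the ranks of the boundary matrices. Ranks are taken over the field with two elements for simplicity, which is an assumption of this example and can differ from rational Betti numbers when torsion is present. The helper names `rank_gf2`, `clique_complex_betti`, and `random_graph` are hypothetical.

```python
import random
from itertools import combinations

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are stored as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def clique_complex_betti(n, edges):
    """Return (b0, b1) of the clique complex of a graph on vertices 0..n-1."""
    edges = sorted({tuple(sorted(e)) for e in edges})
    edge_index = {e: k for k, e in enumerate(edges)}
    # Boundary of an edge: its two endpoints.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of a triangle (3-clique): its three edges.
    d2 = [(1 << edge_index[(i, j)]) | (1 << edge_index[(i, k)]) | (1 << edge_index[(j, k)])
          for i, j, k in combinations(range(n), 3)
          if (i, j) in edge_index and (i, k) in edge_index and (j, k) in edge_index]
    r1, r2 = rank_gf2(d1), rank_gf2(d2)
    return n - r1, len(edges) - r1 - r2

def random_graph(n, p, seed):
    """Erdős-Rényi graph G(n, p): keep each possible edge independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

cycle_betti = clique_complex_betti(4, [(0, 1), (1, 2), (2, 3), (0, 3)])  # hollow 4-cycle
triangle_betti = clique_complex_betti(3, [(0, 1), (1, 2), (0, 2)])       # filled triangle
b0, b1 = clique_complex_betti(12, random_graph(12, 0.3, seed=0))
```

The 4-cycle has one component and one loop, while the triangle is filled in by its 2-simplex and has no loop. Repeating the random sample over many seeds and values of `p` gives an empirical view of how the Betti numbers behave.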
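The decomposition of ranked data mentioned under topic (4) can be sketched in its simplest setting: every pair of items is compared once with equal weight. In that case the least-squares scores have a closed form (column averages of the comparison matrix), and whatever the scores cannot explain is the inconsistent, cyclic part. The convention that `Y[i][j]` records how strongly item j is preferred over item i, the function name `hodge_rank_complete`, and the data are assumptions for this example; incomplete or weighted comparisons require solving a graph Laplacian system instead.

```python
def hodge_rank_complete(Y):
    """Split a skew-symmetric comparison matrix into scores and a cyclic residual.

    Assumes all pairs are compared with equal weight, so the least-squares
    scores (normalized to sum to zero) are the column averages of Y.
    """
    n = len(Y)
    scores = [sum(Y[i][k] for i in range(n)) / n for k in range(n)]
    # Consistent part of Y[i][j] is scores[j] - scores[i]; the rest is residual.
    residual = [[Y[i][j] - (scores[j] - scores[i]) for j in range(n)]
                for i in range(n)]
    return scores, residual

# Hypothetical data: scores (-1, 0, 1) plus a cyclic flow of 0.5 along 0 -> 1 -> 2 -> 0.
Y = [[0.0, 1.5, 1.5],
     [-1.5, 0.0, 1.5],
     [-1.5, -1.5, 0.0]]
scores, residual = hodge_rank_complete(Y)
```

The recovered scores give the globally consistent ranking, and the residual carries the same value around the cycle 0 -> 1 -> 2 -> 0, which no assignment of scores can reproduce.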

Prerequisites: A background in calculus and linear algebra, and a reasonable foundation in statistics and probability.

Course Format: The main instructor will be Sayan Mukherjee, but there will be several guest lecturers, with material and instructors paralleling some of the major themes in the 2013-2014 year-long SAMSI program on Low-Dimensional Structure in High-Dimensional Systems (LDHD).

All course updates including example projects, reading material, and lecture slides will be posted at http://www.stat.duke.edu/~sayan/SAMSI/.

Peter Kim's picture

Today's Lecture by Professor Eungchun Cho on Hodge Theory

Lectures for Jan 28

Revised Lecture Notes for Jan 21

Jan 21 Lecture Notes

Notes