Computational and Inferential Methods for High Dimensions and Massive Datasets (Fall 2012)


Course Day and Time: The course was held at SAMSI in RTP on Tuesdays, 4:30-7:00 p.m., in Room 150.
First class: Tuesday, September 4, 2012; last class: Tuesday, November 27, 2012.

Course Description: This course focused on fundamental methodological questions of statistics, mathematics and computer science posed by massive datasets, with applications to astronomy, high energy physics, and the environment.

Topics included:

  • Data: where it comes from and how massive datasets can be efficiently managed, including dealing with missing and noisy data, anomalies and transient events
  • Computing: how computational needs can be met by distributing computing over the available computational resources including cluster, cloud and GPU computing; efficient computational algorithms
  • Visualization: data visualization to enhance human understanding
  • Statistical Inference: problems and opportunities in high dimensional data; false discovery rates; regularization, Bayes and empirical Bayes; parametric, semi-parametric and non-parametric modeling; leveraging algorithms and computer resources

Registration for this course was processed through your respective university:

  • Duke: STA 790-03
  • NCSU: MA 810.001
  • UNC: STOR 940.1

Questions: email