Computational and Inferential Methods for High Dimensions and Massive Datasets (Fall 2012)

Course Information

Principal Instructors: Various

Course Day and Time: Course was held at SAMSI in RTP on Tuesdays, 4:30-7:00 p.m., in Room 150.

Schedule: First class Tuesday, September 4, 2012; last class Tuesday, November 27, 2012

Course Description:
This course focused on fundamental methodological questions of statistics, mathematics and computer science posed by massive datasets, with applications to astronomy, high energy physics, and the environment.

Topics included:

  • Data: where it comes from and how massive datasets can be efficiently managed, including dealing with missing and noisy data, anomalies and transient events
  • Computing: how computational needs can be met by distributing work across the available resources, including cluster, cloud and GPU computing; efficient computational algorithms
  • Visualization: data visualization to enhance human understanding
  • Statistical Inference: problems and opportunities in high-dimensional data; false discovery rates; regularization, Bayes and empirical Bayes; parametric, semi-parametric and non-parametric modeling; leveraging algorithms and computer resources

Registration for this course was processed through each student's home university:
Duke: STA 790-03
NCSU: MA 810.001
UNC: STOR 940.1

Questions about the course or the MD program should be emailed to [email protected]