Program on Data Science in the Social and Behavioral Sciences

Data sources in the social and behavioral sciences range from censuses and surveys covering thousands to millions of individuals, to aggregate data on cities, counties, and organizations, to various forms of passive data such as traces of internet behavior, social media activity, and personal technology usage (e.g., phone usage, wearables). Intensive time series data involve continuous monitoring of physiological conditions that are related to social and behavioral variables for groups of individuals. Concomitant with these newer data formats is a greater need for algorithms and analyses that capture associations among genetic, environmental, and epigenetic variables for prediction and understanding. Measurement error is an additional challenge with these data because, if left uncorrected, it can bias results. This program will address topics in computational social science, including social networks, machine learning, simulation methods, and other innovative data analysis procedures suited to the complexity of such data.

