Ian Foster, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter, and Julia Lane have released a free textbook on computational methods for the social sciences:

Big Data and Social Science


The goal of this book is to provide social scientists with an understanding of the key elements of this new science, its value, and the opportunities for doing better work.


The world has changed for empirical social scientists. The new types of “big data” have generated an entire new research field—that of data science. That world is dominated by computer scientists who have generated new ways of creating and collecting data, developed new analytical and statistical techniques, and provided new ways of visualizing and presenting information. These new sources of data and techniques have the potential to transform the way applied social science is done.

The way in which data are used has also changed for both government agencies and businesses. Chief data officers are becoming as common in federal and state governments as chief economists were decades ago, and in cities like New York and Chicago, mayoral offices of data analytics have the ability to provide rapid answers to important policy questions (Lee et al. 2012). But since federal, state, and local agencies lack the capacity to do such analysis themselves (Alawadhi et al. 2012), they must make these data available either to consultants or to the research community. Businesses are also learning that making effective use of their data assets can have an impact on their bottom line (Brynjolfsson, Hitt, and Kim 2011).


Table of Contents:

  1. Introduction
  2. Working with Data and APIs
  3. Record Linkage
  4. Databases
  5. Scaling up through Parallel and Distributed Computing
  6. Machine Learning
  7. Text Analysis
  8. Networks - The Basics
  9. Information Visualization
  10. Data Quality and Inference Errors
  11. Privacy and Confidentiality
  12. Workbooks


The class on which this book is based was created in response to a very real challenge: how to introduce new ideas and methodologies about economic and social measurement into a workplace focused on producing high-quality statistics.