Social Data Science

Past Year Sites

The material on this site draws on the websites of several past courses that I have taught.

  • UCLA SOC 114 (Winter 2025)
  • Cornell INFO 3370 (Spring 2024, Spring 2023)
  • Cornell INFO 3900 (Fall 2023)