Causal Inference with Big Data
In the 21st century, information is created and stored at unprecedented rates. Access to large, high-dimensional data sets – “Big Data” – has opened up new possibilities for business analytics and economic research. Massive data sets alone are, however, insufficient to answer fundamental questions in business and economics. Using the potential outcome framework, we explore various methods useful for causal inference in the Big Data era. We discuss the promise and pitfalls of large-scale experimentation and consider empirical applications relevant for business and policy analysis.
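To illustrate why a massive data set alone does not answer a causal question, the following is a minimal simulation sketch in the potential outcome framework. It is not course material: the function name `simulate`, the confounder `ability`, the selection rule, and the true effect of 1.0 are all assumptions chosen for illustration.

```python
import random


def simulate(n, randomized, seed=0):
    """Return (naive difference in means, true average treatment effect).

    Hypothetical data-generating process, for illustration only.
    """
    rng = random.Random(seed)
    treated, control, effects = [], [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)        # confounder (assumed)
        y0 = ability + rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + 1.0                        # potential outcome with treatment
        if randomized:
            d = rng.random() < 0.5           # coin-flip assignment (experiment)
        else:
            d = ability > 0                  # self-selection on the confounder
        # Only one of the two potential outcomes is ever observed per unit.
        (treated if d else control).append(y1 if d else y0)
        effects.append(y1 - y0)
    naive = sum(treated) / len(treated) - sum(control) / len(control)
    ate = sum(effects) / len(effects)
    return naive, ate


if __name__ == "__main__":
    for label, randomized in [("observational", False), ("randomized", True)]:
        naive, ate = simulate(100_000, randomized)
        print(f"{label}: naive difference = {naive:.3f}, true ATE = {ate:.3f}")
```

Under these assumptions, the naive comparison in the observational setting overstates the effect however large `n` becomes, because selection into treatment depends on `ability`. With random assignment, the same comparison approaches the true average treatment effect.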
The course covers the following topics:
- What is big data?
- The potential outcome framework
- Regression and matching
- Large-scale experimentation
- Treatment effect heterogeneity
- False positives and p-hacking
- Publication bias
- Regression discontinuity designs
- Supplementary analysis
- Data visualization
- Feature engineering and feature learning
- Introduction to image analysis
- Introduction to text analysis
Learning outcome knowledge
After completing this course, students should be familiar with the potential outcome framework and with microeconometric methods useful for answering “what if” questions using Big Data. Students should also understand the distinction between causal models and predictive models.
- Written assignment: 80%
- Presentation: 20%