Winter School in Empirical Research Methods

Longitudinal Data Analysis

Instructor: Christopher Zorn

PREREQUISITES (KNOWLEDGE OF TOPIC)

Comfortable familiarity with univariate differential and integral calculus, basic probability theory, and linear algebra is required. Students should have completed Ph.D.-level courses in introductory statistics and linear regression models, up to the level of Regression III. Familiarity with discrete and continuous univariate probability distributions will be helpful.

HARDWARE

Students will be required to provide their own laptop computers.

SOFTWARE

All analyses will be conducted using the R statistical software. R is free, open-source, and runs on all contemporary operating systems. The instructor will also offer limited support for Stata and SAS.

COURSE CONTENT

The subject matter of the course is regression models for data that vary both across cross-sectional units and over time. The course will begin with the relevant dimensions of variation in such data and some of the challenges and opportunities that such data provide. It will move on to models for one-way unit effects (fixed, between, and random), models for complex panel error structures, dynamic panel models, and nonlinear models for discrete dependent variables. The second part of the course will focus on models for time-to-event (“survival,” or “event history”) data. In every case, students will learn the statistical theory behind the various models, details about estimation and inference, and techniques for the substantive interpretation of statistical results. Students will also develop statistical software skills for fitting and interpreting the models in question, and will use the models in both simulated and real data applications. Students will leave the course with a thorough understanding of both the theoretical and practical aspects of conducting analyses of longitudinal data.

STRUCTURE

Day 1:

  • Morning: Overview of Panel/TSCS data + One-Way Unit Effects
  • Afternoon: GLS-ARMA and Dynamic Panel Data Models

Day 2:

  • Morning: Hierarchical / Multilevel Models for TSCS Data
  • Afternoon: Models for Binary and Event Count Dependent Variables

Day 3:

  • Morning: Generalized Estimating Equations
  • Afternoon: Introduction to Survival / Event History Data

Day 4:

  • Morning: Parametric and Semiparametric Models for Survival Data
  • Afternoon: Discrete-Time Models

Day 5:

  • Morning: Survival Model Extensions
  • Afternoon: Examination

LITERATURE

Mandatory:

The course has two required texts:

Box-Steffensmeier, Janet M., and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press.

Hsiao, Cheng. 2003. Analysis of Panel Data. New York: Cambridge University Press.

Additional readings will be assigned as necessary; all will be available on GitHub and/or through JSTOR.

Supplementary / voluntary:

None.

Mandatory readings before course start:

None.

EXAMINATION PART

Grading:

  • Two written homework assignments (20% each)
  • A final examination (50%)
  • Oral participation (10%)


Supplementary aids

The exam will be a “practical examination” (see below for content). Students will be allowed access to (and encouraged to reference) all course materials, notes, help files, and other documentation in completing their exam.

EXAMINATION CONTENT

The examination will involve the application of the techniques taught in the class to one or more “live” data example(s). These will typically take the form of either (a) a replication and extension of an existing published work, or (b) an original analysis of observational data with a survival / duration component. Students will be required to specify, estimate, and interpret various forms of survival models, to conduct and present diagnostics and robustness checks, and to give detailed justifications for their choices.

LITERATURE

Panel / Time-Series Cross-Sectional Models:

Beck, Nathaniel, and Jonathan N. Katz. 1995. “What To Do (And Not To Do) With Time-Series Cross-Section Data.” American Political Science Review 89(September):634-647.

Cameron, A. Colin, and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. New York: Cambridge University Press. Chapter 9.

Clark, Tom S. and Drew A. Linzer. 2015. “Should I Use Fixed Or Random Effects?” Political Science Research and Methods 3(2):399-408.

Keele, Luke, and Nathan J. Kelly. 2006. “Dynamic Models for Dynamic Theories: The Ins and Outs of Lagged Dependent Variables.” Political Analysis 14(2):186-205.

Hsiao, Cheng. 2003. Analysis of Panel Data. New York: Cambridge University Press.

Zorn, Christopher. 2001. “Estimating Between- and Within-Cluster Covariate Effects, with an Application to Models of International Disputes.” International Interactions 27(4):433-45.

Zorn, Christopher. 2001. “Generalized Estimating Equation Models for Correlated Data: A Review with Applications.” American Journal of Political Science 45(April):470-90.

Survival / Event History Models:

Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. “Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable.” American Journal of Political Science 42(October):1260-88 (and erratum).

Box-Steffensmeier, Janet M., and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press.

Box-Steffensmeier, Janet M., and Christopher Zorn. 2001. “Duration Models and Proportional Hazards in Political Science.” American Journal of Political Science 45(October):951-67.

Box-Steffensmeier, Janet M., and Christopher Zorn. 2002. “Duration Models for Repeated Events.” Journal of Politics 64(November):1069-94.

Pintilie, Melania. 2007. “Analyzing and Interpreting Competing Risk Data.” Statistics in Medicine 26:1360-67.

Carter, David B., and Curtis S. Signorino. 2010. “Back to the Future: Modeling Time Dependence in Binary Data.” Political Analysis 18(3):271-292. Also read response by Beck and rejoinder by Carter & Signorino.

Zorn, Christopher. 2000. “Modeling Duration Dependence.” Political Analysis 8(Autumn):367-380.

WORK LOAD

At least 24 units of 45 minutes each on 5 consecutive days. Please structure your course accordingly. Main course times: 9:30 a.m.-12:15 p.m. and 1:15-3:30 p.m.