Winter School in Empirical Research Methods

Basic and Advanced Multilevel Modeling with Stata

Instructor: Tenko Raykov

Prerequisites (Knowledge of topic)

An introductory-level graduate statistics course, including exposure to regression analysis.


Hardware

Provided by the host institution; however, bringing your own laptop is highly recommended.


Software

Provided by the host institution

  • Stata, v. 15 (experience with Stata is not required, but will be helpful)

Course Content

Day 1 (morning session): A brief introduction to Stata

  • What is Stata?
  • Resources for working with Stata
  • Why use Stata?
  • A data set to illustrate some data management capabilities of Stata
  • The Stata working windows
  • Exploring a data set
  • Examining variables
  • Putting order into a data file
  • Assigning labels and variable names
  • Dealing with missing values – a first essential step
  • Modifying existing and creating new variables
  • Transforming variables
  • A general approach to variable transformation
  • Getting help.

Day 1 (afternoon session): Fitting single-level regression models using Stata

  • Data set and research question
  • Preliminary analyses
  • Single-level regression analysis with Stata
  • Plotting residuals against predictors
  • Plotting residuals against fitted (predicted) values
  • Plotting standardized residuals.

Day 2 (morning session): Why do we need multilevel and mixed models?

  • What is multilevel modeling, why can’t we do without it, and how come aggregation and disaggregation do not do the job?
  • Examples of nested data and the hallmark of multilevel modeling
  • Another important instance of multilevel modeling
  • Aggregation and disaggregation of variable scores
  • Analytic benefits of multilevel modeling.
  • The beginnings of multilevel modeling – why what we already know about regression analysis will be so useful
  • A brief review of regression analysis
  • Multilevel models as sets of regression equations
  • An illustrative example of multilevel modeling.
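As a minimal sketch of the "sets of regression equations" idea listed above, a two-level model with a single level-1 predictor can be written as below. The notation is an assumption chosen for illustration (i indexes level-1 units, j indexes level-2 units, x is a hypothetical predictor) and may differ from the notation used in the lecture notes.

```latex
\begin{align*}
\text{Level 1:}\quad & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + u_{0j} \\
                     & \beta_{1j} = \gamma_{10} + u_{1j} \\
\text{Combined:}\quad & y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + u_{0j} + u_{1j} x_{ij} + e_{ij}
\end{align*}
```

The combined line results from substituting the two level-2 equations into the level-1 equation; the terms with gamma coefficients form the fixed part and the remaining terms the random part.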

Day 2 (afternoon session): The intraclass correlation coefficient and its estimation

  • The fully unconditional two-level model and definition of the intraclass correlation coefficient (ICC)
  • Point and interval estimation of the ICC using Stata
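For orientation, the fully unconditional two-level model and the resulting ICC are commonly written as follows. This is a sketch in assumed notation (i for level-1 units, j for level-2 units, tau and sigma for the two variance components), not necessarily the notation of the course materials.

```latex
\begin{align*}
y_{ij} &= \gamma_{00} + u_{0j} + e_{ij}, \qquad
\operatorname{Var}(u_{0j}) = \tau_{00}, \quad \operatorname{Var}(e_{ij}) = \sigma^2, \\
\rho_{\text{ICC}} &= \frac{\tau_{00}}{\tau_{00} + \sigma^2}
\end{align*}
```

Under this model the ICC is the share of the total outcome variance that lies between level-2 units.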

Day 3 (morning session): How many levels? – Proportion of third-level variance and its evaluation

  • Proportion of third-level variance
  • The fully unconditional three-level model
  • Point and interval estimation of the proportion of third-level variance using Stata.
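By analogy with the two-level case, the fully unconditional three-level model and the proportion of variance at the third level can be sketched as below. The symbols are assumptions for illustration (i, j, k index level-1, level-2, and level-3 units; the sigma terms are the three variance components).

```latex
\begin{align*}
y_{ijk} &= \gamma_{000} + v_{k} + u_{jk} + e_{ijk}, \\
\operatorname{Var}(v_{k}) &= \sigma_v^2, \quad
\operatorname{Var}(u_{jk}) = \sigma_u^2, \quad
\operatorname{Var}(e_{ijk}) = \sigma_e^2, \\
\pi_{3} &= \frac{\sigma_v^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}
\end{align*}
```

A proportion close to zero suggests that the third level contributes little variance, which bears on the "how many levels?" question.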

Day 3 (afternoon session): Robust modeling of lower-level variable relationships in the presence of clustering effects

  • What is robust modeling in the presence of nesting effects?
  • Robust modeling of hierarchical data using Stata.

Day 4 (morning session): Mixed effects models (mixed models)

  • What are mixed models, what are they made of, and why are they useful?
  • An illustration of the difference between fixed and random effects
  • Examples of mixed modeling frameworks
  • Mixed models with continuous response variables
  • Random intercept models
  • Fitting a random intercept model with Stata
  • Model adequacy evaluation
  • Between- and within-estimators and when to use which
  • Random regression models
  • An instructive example and the restricted maximum likelihood (REML) method
  • Random intercept and slope model
  • Multiple random slopes
  • Fixed effects, random effects, and total effects
  • Numerical issues
  • Nested levels – conditional three-level mixed models.

Day 4 (afternoon session): Mixed models with discrete responses

  • Why do we need these models?
  • A few important statistical facts
  • The generalized linear model (GLIM)
  • Random intercept models with discrete outcomes
  • Random regression models with discrete outcomes
  • Model choice
  • Appendix – Cross-classification and crossed effects multilevel models.

Day 5 (morning session): Longitudinal multilevel modeling

  • Introduction
  • Multilevel modeling of longitudinal data
  • Using Stata to fit unconditional and conditional growth curve models (cross-sectional time series).

Day 5 (afternoon session): Extensions, limitations, conclusion, and outlook

  • What we could not cover in this course – your next steps.
  • Extensions of multilevel models
  • Limitations of multilevel modeling
  • Conclusion and outlook.



Literature

Mandatory:

  • Snijders, T. A. B., & Bosker, R. J. (2013). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage.

Supplementary / voluntary:

  • Rabe-Hesketh, S., & Skrondal, A. (2012). Multilevel and longitudinal modeling with Stata. College Station, TX: Stata Press.

Mandatory readings before course start:

  • Raykov, T. (2019). A course in multilevel modeling. Lecture notes. Michigan State University, East Lansing, Michigan, USA.

Examination Part

Take-home assignment, to be submitted within 3 weeks of course completion.

Participants may use any literature they can find, including the lecture notes volume, which will be provided to them in PDF form before the course commences.

Supplementary Aids

Course participants may use any literature they can access, including the lecture notes.

Examination Content

Multilevel modeling with missing data, violations of missing at random, and accounting for clustering effects.