Original Articles
In social science research, it is common to confront data that are clustered or grouped into higher-level units. One of the most frequently encountered challenges when modeling these data arises when the dependent variable exhibits group-level variation beyond what can be explained by the independent variables alone. In these cases, fitting a standard linear regression or generalized linear model without accounting for the grouped nature of the observations can lead to poorly fitting models and to misleading estimates of both the effects of the independent variables of interest and the precision of those estimates (Beck and Katz 1995; Greene 2012).
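As a rough illustration of this problem, the following sketch simulates grouped data and compares a pooled least-squares slope, which ignores the grouping, with a within-group (demeaned) slope. The data-generating process is an assumption made purely for illustration and is not taken from the article: the function names, the true slope of 1.0, and the parameter `rho`, which makes the covariate depend on the group-level intercepts, are all hypothetical choices.

```python
import numpy as np


def simulate_grouped(n_groups=200, n_per_group=10, beta=1.0, rho=0.8, seed=0):
    """Simulate y = beta * x + alpha_g + noise for grouped observations.

    Assumption for illustration: x is partly driven by the group
    intercept alpha_g (strength rho), so x and the unit effects
    are correlated.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_groups)                  # group-level intercepts
    group = np.repeat(np.arange(n_groups), n_per_group)
    x = rho * alpha[group] + rng.normal(size=group.size)
    y = beta * x + alpha[group] + rng.normal(size=group.size)
    return group, x, y


def pooled_slope(x, y):
    """OLS slope from a single regression that ignores the grouping."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / (xc @ xc))


def within_slope(group, x, y):
    """OLS slope after subtracting each group's mean from x and y."""
    n_groups = int(group.max()) + 1
    counts = np.bincount(group, minlength=n_groups)
    x_means = np.bincount(group, weights=x, minlength=n_groups) / counts
    y_means = np.bincount(group, weights=y, minlength=n_groups) / counts
    xd = x - x_means[group]
    yd = y - y_means[group]
    return float(xd @ yd / (xd @ xd))


if __name__ == "__main__":
    group, x, y = simulate_grouped()
    print("pooled slope:", round(pooled_slope(x, y), 3))
    print("within slope:", round(within_slope(group, x, y), 3))
```

Under these assumed settings, the pooled slope absorbs part of the group-level variation and drifts away from the true value, while the within-group slope stays close to it. Setting `rho=0.0` removes the dependence between the covariate and the group intercepts, in which case the two slopes should roughly agree.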
The two dominant remedies for this problem are so-called fixed-effects and random-effects models.1 Although much has been written on the theoretical properties of both approaches (for example, Kreft and DeLeeuw 1998; Robinson 1998; Kennedy 2003; Frees 2004; Gelman 2005; Wilson and Butler 2007; Arceneaux and Nickerson 2009; Wooldridge 2010; Greene 2012), recommendations for applied researchers are often confusing, or even contradictory (Gelman and Hill 2007, 245). These recommendations are often made with reference to idealized datasets with very large sample sizes, or rely on divergent standards for assessing model quality. There remains little consistent guidance for researchers trying to decide how best to model the data they have on hand. They are left to wonder: "Should I use fixed or random effects?"
In this article, we offer practical guidance for researchers choosing between fixed- and random-effects models. As we describe below, both models entail a series of assumptions that might be violated in any given dataset. Under certain conditions, random-effects models can introduce bias but reduce the variance of the estimates of the coefficients of interest. Fixed-effects estimates will be unbiased, but may be subject to high sample dependence, that is, high variance from one sample to the next. We argue that researchers ought not to place undue weight on minimizing either bias or variance alone, but should rather consider the trade-off between the two under either model. While it is true that under a random-effects specification there may be bias in the coefficient estimates if the covariates are correlated with the unit effects, it does not follow that any correlation between the covariates and the unit effects implies that fixed effects should be...