Psychometrika, vol. 70, no. 2, 325-345
June 2005
DOI: 10.1007/s11336-003-1083-3

SELECTING THE NUMBER OF CLASSES UNDER LATENT CLASS REGRESSION: A FACTOR ANALYTIC ANALOGUE

Guan-Hua Huang
National Chiao Tung University

Recently, the regression extension of latent class analysis (RLCA) model has received much attention in the field of medical research. The basic RLCA model summarizes shared features of measured
multiple indicators as an underlying categorical variable and incorporates the covariate information in
modeling both latent class membership and multiple indicators themselves. To reduce complexity and
enhance interpretability, one usually fixes the number of classes in a given RLCA. Often, goodness of fit
methods comparing various estimated models are used as a criterion to select the number of classes. In this
paper, we propose a new method that is based on an analogous method used in factor analysis and does
not require repeated fitting. Two ideas with application to many settings other than ours are synthesized
in deriving the method: a connection between latent class models and factor analysis, and techniques of
covariate marginalization and elimination. A Monte Carlo simulation study is presented to evaluate the
behavior of the selection procedure and compare to alternative approaches. Data from a study of how
measured visual impairments affect older persons' functioning are used for illustration.

Key words: categorical data, factor analysis, finite mixture model, goodness of fit test, latent profile model,
marginalization, residuals in generalized linear models, Monte Carlo simulation.

1. Introduction

Latent class analysis (LCA), originally described by Green (1951) and systematically developed by Lazarsfeld and Henry (1968) and Goodman (1974), has been found useful for classifying
subjects based on their responses to a set of categorical items. The basic model postulates an
underlying categorical latent variable with, say, J categories, and measured items are assumed
independent of one another within any category of the latent variable. Observed relationships
among measured variables are thus assumed to result from the underlying classification of the
data produced by the categorical latent variable. Recently, several authors extended the LCA
model to describe the effects of measured covariates on the underlying mechanism (Dayton and
Macready, 1988; Van der Heijden, Dessens, and Böckenholt, 1996; Bandeen-Roche, Miglioretti,
Zeger, and Rathouz, 1997), or on measured item distributions within latent levels (Melton, Liang,
and Pulver, 1994). This paper studies the problem of determining the number of...