KEY WORDS: methods, statistics, log-linear models, logit models
ABSTRACT
Log-linear methods provide a powerful framework and the statistical apparatus for rigorously analyzing categorical data. These methods were introduced and developed by Leo Goodman and others in the early 1970s. In the late 1970s and the early 1980s, Goodman, Alan Agresti, Clifford Clogg, Otis Dudley Duncan, and others showed how these models could help us to estimate associations between discrete variables, including ordered and unordered polytomies. The last decade has witnessed a set of diverse extensions of these techniques. This paper reviews the basic log-linear strategy and illustrates key concepts. Citations are given to other articles on these topics, many of which are nontechnical and contain substantive sociological applications.
INTRODUCTION
Until the late 1960s, sociologists typically analyzed contingency tables, or two-way tables formed by cross-classifying categorical variables, by calculating chi-square values testing the hypothesis of independence. If independence did not hold, departures from it were typically described in terms of differences in percentages or proportions, or by some global measure of association. Where tables consisted of more than a pair of variables, elaboration procedures (Lazarsfeld 1955, Rosenberg 1969) were often followed, which involved computing chi-squares for two-way tables (like A x B), and then again for multiple subtables formed from them (i.e. A x B, within categories of C). Analysts were typically concerned with determining whether independence held after the subtables were formed but not before (indicating a spurious association), whether it held in some subtables but not others (indicating a partial or conditional association), or whether departures from independence appeared more pronounced in some subtables than others (indicating an interaction). Where departures from independence were exhibited in various subtables, they were frequently described by differences in the percentage differences, or differences in some summary measure of association calculated for the different subtables.1
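The elaboration procedure described above can be illustrated with a minimal sketch. The counts below are hypothetical, invented only to show the pattern, and the helper names (pearson_chi_square, collapse) are assumptions of this sketch rather than anything used in the literature cited. The Pearson chi-square for independence is computed first for the A x B table collapsed over C, and then again for each A x B subtable within categories of C.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total * column total) / N
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def collapse(subtables):
    """Sum a list of equally shaped two-way tables cell by cell."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*subtables)]


# Hypothetical A x B counts within each category of C (illustrative only).
subtables_by_c = {
    "C=1": [[40, 10], [20, 5]],
    "C=2": [[5, 20], [10, 40]],
}

# Step 1: test independence in the A x B table, ignoring C.
marginal_table = collapse(list(subtables_by_c.values()))
marginal_chi2, marginal_df = pearson_chi_square(marginal_table)
print("A x B:", marginal_table, "chi-square =", marginal_chi2, "df =", marginal_df)

# Step 2: repeat the test within each category of C.
conditional_chi2 = {}
for label, subtable in subtables_by_c.items():
    chi2, df = pearson_chi_square(subtable)
    conditional_chi2[label] = chi2
    print(label, subtable, "chi-square =", chi2, "df =", df)
```

With these invented counts, the collapsed table gives a chi-square of 6.0 on 1 degree of freedom, which exceeds the conventional 0.05 critical value of 3.84, while each subtable gives a chi-square of 0. Independence holds after the subtables are formed but not before, which is the pattern the elaboration tradition would read as a spurious association between A and B.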
The analysis of cross-classified data changed quite dramatically in the 1970s, with the publication of a series of papers on log-linear models by Goodman (see, for example, 1970, 1971a,b, 1972, 1973a,b), many of which were collected in 1978 into his book Analyzing Qualitative/Categorical Data. Other books appeared around that time, many borrowing from and building on Goodman's work; these include important books by Bishop, Fienberg & Holland (1975), Haberman (1974, 1978/1979), and Fienberg...