PSYCHOMETRIKA, VOL. 71, NO. 4, 713–732, DECEMBER 2006

DOI: 10.1007/s11336-005-1295-9

LIMITED INFORMATION GOODNESS-OF-FIT TESTING IN MULTIDIMENSIONAL CONTINGENCY TABLES

ALBERT MAYDEU-OLIVARES

UNIVERSITY OF BARCELONA AND INSTITUTO DE EMPRESA BUSINESS SCHOOL

HARRY JOE

UNIVERSITY OF BRITISH COLUMBIA

We introduce a family of goodness-of-fit statistics for testing composite null hypotheses in multidimensional contingency tables. These statistics are quadratic forms in marginal residuals up to order $r$. They are asymptotically chi-square under the null hypothesis when parameters are estimated using any asymptotically normal consistent estimator. For a widely used item response model, when $r$ is small and multidimensional tables are sparse, the proposed statistics have accurate empirical Type I errors, unlike Pearson's $X^2$. For this model in nonsparse situations, the proposed statistics are also more powerful than $X^2$. In addition, the proposed statistics are asymptotically chi-square when applied to subtables, and can be used for a piecewise goodness-of-fit assessment to determine the source of misfit in poorly fitting models.

Key words: multivariate discrete data, categorical data analysis, multivariate multinomial distribution, composite likelihood, item response theory, Lisrel.

1. Introduction

Consider the problem of modeling $N$ independent and identically distributed observations on $n$ discrete random variables consisting, respectively, of $K_1, \ldots, K_n$ categories. This type of data arises, for instance, in surveys, educational tests, or social science questionnaires when the number of choices is not constant over items. The observed data can be gathered in an $n$-dimensional contingency table with $C = \prod_{i=1}^{n} K_i$ cells.
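To make the table dimensions concrete, here is a minimal Python sketch that cross-classifies a few response patterns into an n-dimensional table and counts its cells. The function name `build_table` and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import prod


def build_table(observations, categories):
    """Cross-classify observations into an n-way contingency table.

    observations: list of tuples, one response pattern per respondent,
                  with the i-th entry coded 0 .. K_i - 1.
    categories:   list [K_1, ..., K_n] of category counts per variable.
    Returns a dict mapping each of the C = prod(K_i) cells to its count.
    """
    counts = Counter(observations)
    cells = product(*(range(k) for k in categories))
    return {cell: counts.get(cell, 0) for cell in cells}


# Hypothetical data: N = 6 respondents, n = 3 items with 2, 3 and 2 categories.
categories = [2, 3, 2]
observations = [(0, 0, 0), (1, 2, 1), (0, 1, 0), (1, 2, 1), (0, 0, 1), (1, 1, 0)]

table = build_table(observations, categories)
C = prod(categories)  # number of cells: 2 * 3 * 2
empty = sum(1 for count in table.values() if count == 0)  # cells with no observations
```

Because the number of cells is a product over the variables, it grows quickly as items are added, so many cells can stay empty for a fixed sample size.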

Now, consider a parametric model, $\pi(\theta)$, where $\pi$ is the $C$-dimensional vector of cell probabilities, which depends on a $q$-dimensional parameter vector $\theta$ that is typically estimated from the data. For assessing the fit of the model, consider a composite null hypothesis $H_0: \pi = \pi(\theta)$ for some $\theta$ versus $H_1: \pi \neq \pi(\theta)$ for any $\theta$. Researchers confronted with testing such a composite hypothesis face two problems: first, how to assess the overall goodness of fit of the hypothesized model and, second, how to determine the source of the misfit in poorly fitting models.

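The two most commonly used goodness-of-fit statistics for testing the overall goodness of fit of a parametric model in multivariate categorical data analysis are Pearson's $X^2 = N \sum_{c=1}^{C} (p_c - \pi_c)^2/\pi_c$, and the likelihood ratio statistic $G^2 = 2N \sum_{c=1}^{C} p_c \ln(p_c/\pi_c)$, where $p_c$ is the observed proportion in cell $c$ and $\pi_c$ is the corresponding model probability. When the model holds, the two...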
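As a small numerical illustration of these two full-information statistics, the sketch below evaluates the standard definitions $X^2 = N \sum_c (p_c - \pi_c)^2/\pi_c$ and $G^2 = 2N \sum_c p_c \ln(p_c/\pi_c)$ in Python. The function names and the three-cell example are assumptions made for illustration. The model probabilities are treated as given here, whereas in the setting of the paper they would come from a model with estimated parameters.

```python
from math import log


def pearson_x2(p, pi, n_obs):
    """Pearson's X^2 from observed proportions p and model probabilities pi."""
    return n_obs * sum((pc - pic) ** 2 / pic for pc, pic in zip(p, pi))


def likelihood_ratio_g2(p, pi, n_obs):
    """Likelihood ratio G^2; empty cells (p_c = 0) contribute zero to the sum."""
    return 2 * n_obs * sum(pc * log(pc / pic) for pc, pic in zip(p, pi) if pc > 0)


# Hypothetical table with C = 3 cells and N = 100 observations.
n_obs = 100
p = [0.30, 0.20, 0.50]   # observed cell proportions
pi = [0.25, 0.25, 0.50]  # model-implied cell probabilities (assumed known here)

x2 = pearson_x2(p, pi, n_obs)
g2 = likelihood_ratio_g2(p, pi, n_obs)
```

In this toy case the two statistics are close to each other (both near 2). Both divide by, or take logarithms involving, the model probability of every cell, so they depend on all $C$ cells of the table.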
