Contents
- Abstract
- The Centrality of Psychological Measurement
- A Theoretical Model for Scale Development
- Substantive Validity: Conceptualization and Development of an Initial Item Pool
  - Conceptualization
  - Literature Review
  - Creation of an Item Pool
    - Basic principles of item writing
    - Choice of format
- Structural Validity: Item Selection and Psychometric Evaluation
  - Test Construction Strategies
    - Criterion-based methods
    - Internal consistency methods
    - Item response theory (IRT)
  - Initial Data Collection
    - Inclusion of comparison (anchor) scales
    - Sample considerations
  - Psychometric Evaluation
    - Analysis of item distributions
    - Unidimensionality, internal consistency, and coefficient alpha
    - The “attenuation paradox”
    - Structural analyses in scale construction
    - Creating subscales
- External Validity: The Ongoing Process
Abstract
A primary goal of scale development is to create a valid measure of an underlying construct. We discuss theoretical principles, practical issues, and pragmatic decisions to help developers maximize the construct validity of scales and subscales. First, it is essential to begin with a clear conceptualization of the target construct. Moreover, the content of the initial item pool should be overinclusive and item wording needs careful attention. Next, the item pool should be tested, along with variables that assess closely related constructs, on a heterogeneous sample representing the entire range of the target population. Finally, in selecting scale items, the goal is unidimensionality rather than internal consistency; this means that virtually all interitem correlations should be moderate in magnitude. Factor analysis can play a crucial role in ensuring the unidimensionality and discriminant validity of scales.
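As a rough illustration of the item-selection point, the sketch below computes every interitem correlation and coefficient alpha for a small response matrix, then flags item pairs whose correlation falls outside a "moderate" band. The function names, the sample ratings, and the band limits of .15 and .50 are assumptions made for illustration; the passage above does not specify numeric cutoffs. The sketch also assumes every item shows some variance across respondents.

```python
from itertools import combinations
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def interitem_correlations(items):
    """Map each item pair (i, j) to its correlation; items[i] holds one item's scores."""
    return {(i, j): pearson(items[i], items[j])
            for i, j in combinations(range(len(items)), 2)}


def coefficient_alpha(items):
    """Coefficient alpha from the item variances and the total-score variance."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


def pairs_outside_band(correlations, low=0.15, high=0.50):
    """Return item pairs whose correlation is not 'moderate' (band limits are assumed)."""
    return sorted(pair for pair, r in correlations.items() if not low <= r <= high)


# Hypothetical data: 3 items rated 1-5 by 6 respondents.
items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 4, 3],
    [1, 3, 2, 4, 5, 4],
]
rs = interitem_correlations(items)
print("mean interitem r:", round(mean(rs.values()), 2))
print("coefficient alpha:", round(coefficient_alpha(items), 2))
print("pairs outside the assumed band:", pairs_outside_band(rs))
```

Inspecting the individual correlations rather than alpha alone reflects the abstract's emphasis: a high alpha can arise from a few very highly correlated items, so it does not by itself show that the items form a single dimension.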
Scale development remains a growth industry within psychology. A PsycLIT database survey of articles published in the 6-year period from 1989 through 1994 revealed 1,726 articles with the key words “test construction” or “scale development” published in English-language journals, 270 in other-language journals, and 552 doctoral dissertations. During this same period (i.e., beginning with its inception), 50 articles addressing scale development or test construction were published in Psychological Assessment alone. The majority of these articles reported the development of one or more new measures (82%); most of the rest presented new scales derived from an existing instrument (10%). We use these 41 scale-development articles as a reference set for our discussion. Clearly, despite the criticism leveled at psychological testing in recent years, assessment retains a central role within the field.
Given that test construction remains a thriving activity, it is worthwhile to...





