Abstract
This research applied two-stage testing (TST) to a licensure examination. Using actual data from an allied health profession's licensure examination (N = 4,611), various combinations of TST were compared to the fixed-length traditional multiple-choice (TMC) exam and various "best" shorter tests. The variables evaluated were: (a) Routing Test length ($n_t$ = 13 and 20), (b) Measurement Test length ($n_t$ = 24 and 47), (c) the number of Measurement Tests (2 and 3), (d) IRT model choice (a 3-parameter logistic model with all parameters estimated (3p-V) and a 3-parameter logistic model with a fixed c parameter (3p-F)), (e) the effects of intentional misrouting, and (f) the effects of shifting the cut score up or down one standard error (se) on the baseline exams while maintaining the correct cut score on the TST, also known as Boundary Theory analysis. To evaluate the relative efficiency of TSTs versus TMC examinations, three indices were used: (a) kappa (decision consistency) based on pass/fail status, (b) RMSE evaluated across ten levels of the theta continuum, and (c) BIAS evaluated across the same ten levels of the theta continuum as RMSE. Actual routing errors were also evaluated. The results of this study indicate that the 3p-F model was associated with decision consistency indices superior to those of the 3p-V model. Increasing the number of items in either the Routing Test or the Measurement Test led to higher kappa values. Although actual routing errors were high (13% to 31%), IRT appeared to rectify some of the negative effects of routing error in terms of decision consistency.
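
For readers unfamiliar with the quantities named above, the following is a minimal sketch of the 3-parameter logistic model and of the three evaluation indices in their standard textbook forms. It is not the study's actual code: the function names, the scaling constant D = 1.7, and the use of Cohen's kappa on two pass/fail vectors are assumptions made for illustration, and the abstract does not state which criterion theta values were used.

```python
import math


def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    D = 1.7 is a conventional scaling constant (assumed, not from the source).
    In a fixed-c variant such as 3p-F, c would be held at one value for all items.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def cohen_kappa(decisions_a, decisions_b):
    """Cohen's kappa for two equal-length lists of pass (1) / fail (0) decisions.

    Undefined (division by zero) when expected agreement equals 1,
    e.g. when both lists are all-pass or all-fail.
    """
    n = len(decisions_a)
    p_obs = sum(x == y for x, y in zip(decisions_a, decisions_b)) / n
    pass_a = sum(decisions_a) / n
    pass_b = sum(decisions_b) / n
    # Chance agreement: both pass or both fail, assuming independence.
    p_exp = pass_a * pass_b + (1.0 - pass_a) * (1.0 - pass_b)
    return (p_obs - p_exp) / (1.0 - p_exp)


def bias_and_rmse(estimates, criterion):
    """Mean signed error (bias) and root mean squared error of theta estimates."""
    errors = [est - crit for est, crit in zip(estimates, criterion)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse
```

To mirror the per-level evaluation described in the abstract, `bias_and_rmse` would be applied separately to the examinees falling within each of the ten theta levels; `cohen_kappa` would be applied to pass/fail decisions from a TST configuration and from the comparison exam.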





