Abstract
The aim of this study is to determine whether items from the mathematics section of the 2012 Level Determination Exam show item bias according to gender and school type. In particular, the item bias review was carried out using the Delphi technique and focus group interviews. A two-stage mixed-methods design was used. The first stage identified items that display differential item functioning (DIF) according to gender and school type; the second stage determined the sources of DIF using the Delphi technique and examined, through a focus group interview, which DIF sources lead to item bias. The Mantel-Haenszel and logistic regression methods were used for the DIF analysis. Two items with significant DIF were detected according to gender, and five items favoring private schools were detected according to school type. In the item bias review, the reasons why these items display DIF were determined using the Delphi technique, and 22 DIF sources were agreed upon. Finally, an expert panel was convened to examine whether these DIF sources constitute grounds for item bias. According to the panel, one item was judged to be biased according to gender and two items according to school type.
Keywords
Test bias; Differential item functioning; Delphi technique; Item bias expert panel
Measurement that is as accurate and error-free as possible is desirable, both for obtaining accurate information about the quantity of the characteristic being measured and for making sound decisions based on the results. However, measurement error is inevitable in social sciences such as education and psychology (Tan, 2013). Ideally, a measurement would yield the true value of the feature, but the true value cannot be obtained directly because of the various errors involved in measuring; it is instead estimated from observed values. According to classical test theory (CTT), the true value of the measured feature is the average of the scores that would be obtained from an infinite number of measurements of that feature (Crocker & Algina, 1986). According to CTT, the observed score of an individual selected without bias from the population is the sum of the true score and the error...
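The CTT decomposition described above can be written compactly. The sketch below uses the conventional symbols X (observed score), T (true score), and E (error), which are assumptions here since the text does not name them; the index i stands for a hypothetical i-th repeated, independent measurement of the same individual.

```latex
% Observed score on the i-th measurement = true score + error
X_i = T + E_i

% True score as the long-run average of repeated observed scores
T = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i

% Equivalently, the error averages out to zero over repeated measurements
\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} E_i = 0
```

Under these assumptions, any single observed score can differ from T, but the discrepancy is attributed entirely to the error term.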