The VALUE reliability study was developed to gather data on the usability and transferability of rubrics both within and across institutions. The study was also designed to gauge the degree of reliability and consensus in scoring across faculty from different disciplinary backgrounds. Reliability data were gathered and analyzed for three of the fifteen existing VALUE rubrics: critical thinking, integrative learning, and civic engagement.
The effectiveness of assessment instruments is commonly evaluated by the degree to which validity and reliability can be established. An instrument should both accurately capture the intended outcome (validity) and do so consistently (reliability). Because validity is often harder to establish than reliability, it is preferable for assessments to demonstrate multiple forms of validity. In important ways, the rubric development process itself provided the VALUE rubrics with substantial degrees of two types of validity. First, because the VALUE rubrics were created nationally by teams of faculty, the people closest to student learning and outcomes assessment on campuses, the rubrics hold a high degree of face validity. The face validity of the rubrics is apparent in the scale of interest in and circulation of the rubrics to date: approximately eleven thousand people from over three thousand institutions and organizations, international and domestic, have logged in on the AAC&U VALUE web page (http://www.aacu.org/value/index.cfm) to access the rubrics.
Second, the deliberate use of faculty experts in particular outcome areas to populate the development teams provides the rubrics with additional content validity. Experts are commonly used to establish content validity by verifying that "the measure covers the full range of the concept's meaning" (Chambliss and Schutt 2003, 69).
The objectives for establishing national reliability estimates for the VALUE rubrics were twofold. First, because the rubrics were created by national, interdisciplinary teams, we sought to emulate that procedure by establishing a cross-disciplinary reliability score for each rubric. Second, we sought to establish reliability scores within disciplines in order to examine the range of similarities and differences across faculty from different disciplinary backgrounds.
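The article does not specify which reliability statistic was used, but for rubric scores on an ordinal scale, percent exact agreement and percent adjacent agreement (scores within one performance level) are common inter-rater measures. A minimal sketch, using hypothetical scores from two raters:

```python
def agreement(rater_a, rater_b, tolerance=0):
    """Fraction of work samples on which two raters' rubric scores
    differ by at most `tolerance` performance levels."""
    assert len(rater_a) == len(rater_b)
    matches = sum(1 for a, b in zip(rater_a, rater_b) if abs(a - b) <= tolerance)
    return matches / len(rater_a)

# Hypothetical scores from two raters on ten student work samples,
# scored on a five-level (0-4) rubric scale.
rater_a = [3, 2, 4, 1, 3, 2, 0, 4, 3, 2]
rater_b = [3, 3, 4, 1, 2, 2, 1, 4, 3, 2]

exact = agreement(rater_a, rater_b)               # same level assigned
adjacent = agreement(rater_a, rater_b, tolerance=1)  # within one level
```

Chance-corrected statistics such as Cohen's kappa or Krippendorff's alpha are also commonly reported for this kind of study; the simple agreement rates above are illustrative only.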
METHODS
Forty-four faculty members were recruited to participate in the study. Faculty members were evenly distributed across four broad disciplinary areas: humanities, natural sciences, social sciences, and professional and applied sciences. Each faculty member scored...