An investigation of the consequences for students of using different procedures to equate tests as fit to the Rasch model degenerates
Many large-scale national and international testing programs use the Rasch model to govern the construction of measurement scales that can be used to monitor standards of performance and to monitor performance over time. A significant issue that arises in such programs is that once a decision has been made to use the model, it is not possible to reverse the decision if the data do not fit the model. Two levels of question result from such a situation. One involves the issue of misfit to the model: how robust is the model to violations of fit of the data to the model? A second question emerges from the premise that fit to the model is a relative matter; ultimately, it is the users' decision whether the data fit the model well enough to suit their purpose. Once this decision has been made, as in large-scale testing programs like the ones referred to above, the question reverts to one in which the focus is on the applications of the Rasch model. More specifically, in the case of this study, the intention is to examine the consequences of variability of fit to the Rasch model on the measures of student performance obtained from two different equating procedures.

Two related simulation studies have been conducted to compare the results obtained from using two different equating procedures (namely separate and concurrent equating) with the Rasch Simple Logistic model as data-model fit gets progressively worse. The results indicate that when data-model fit ranges from good fit to average fit (MNSQ ≤ 1.60), there is little or no difference between the results obtained from the different equating procedures.
However, when data-model fit ranges from relatively poor fit to poor fit (MNSQ > 1.60), the results from using different equating procedures prove less comparable.

When the results of these two simulation studies are translated to a situation in Australia, for example, where different states use different equating procedures to generate a single comparable score, and these scores are then used to compare performances among students and against predetermined standards or benchmarks, significant equity issues arise. In essence, it means that in the latter situation, some students are deemed to be either above or below the standards purely as a consequence of the equating procedure selected. For example, students could be deemed to be above a benchmark if separate equating was used to produce the scale; yet these same students could be deemed to fall below the benchmark if concurrent equating is used. The actual consequences of this decision will vary from situation to situation. For example, if the same equating procedure was used each year to equate the data to form a single scale, then it could be argued that it does not matter if the results vary from occasion to occasion, because the procedure is consistent for the cohort of students from year to year. However, if other states or countries use a different equating procedure and the results are compared, then there is an equity problem. The extent of the problem depends on the robustness of the model to varying degrees of misfit.
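To make the quantities in the abstract concrete, the following is a minimal illustrative sketch (not drawn from the thesis itself) of the Rasch Simple Logistic model and an outfit mean-square (MNSQ) fit statistic, the kind of index the MNSQ 1.60 cut-off above refers to. All names (`rasch_prob`, the sample sizes, the seed) are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def rasch_prob(theta, b):
    """Rasch simple logistic model: P(X = 1 | ability theta, difficulty b)."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Simulate a 200-person x 10-item response matrix from the model itself,
# so the fit statistics below should hover around their expected value of 1.
theta = rng.normal(0.0, 1.0, size=200)      # person abilities
b = np.linspace(-2.0, 2.0, 10)              # item difficulties
p = rasch_prob(theta[:, None], b[None, :])  # model probabilities
x = (rng.random(p.shape) < p).astype(int)   # simulated 0/1 responses

# Outfit MNSQ per item: the mean squared standardized residual.
# Values near 1.0 indicate good data-model fit; values above the
# study's cut-off of 1.60 indicate relatively poor to poor fit.
var = p * (1.0 - p)
outfit_mnsq = ((x - p) ** 2 / var).mean(axis=0)
print(np.round(outfit_mnsq, 2))
```

Because the responses here are generated by the model, the MNSQ values cluster near 1; the simulation studies described above instead degrade data-model fit progressively and compare separate versus concurrent equating at each level of misfit.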
Year of publication: 2006
Institutions: Sadeghi, Rassoul, Education, Faculty of Arts & Social Sciences, UNSW
Awarded by: University of New South Wales. School of Education
Subject: Psychology | Experimental | Research | Methodology
Similar items by subject
- Real world research : a resource for social scientists and practitioner-researchers. Robson, Colin, (1993)
- Handbook of mixed methods in social & behavioral research. Tashakkori, Abbas, (2003)
- Understanding and evaluating research in applied clinical settings. Morgan, George A., (2006)
- More ...