Date of Award

December 2016

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Educational Psychology

First Advisor

Bo Zhang

Committee Members

David Armstrong, Cindy Walker, Razia Azen, Stephen Wester

Keywords

Clarke Statistic, Model-Data Fit, Multidimensional Item Response Theory, Vuong Statistic

Abstract

The study was motivated primarily by the absence of any statistic that supports probabilistic statements about model-data fit between the non-nested compensatory and non-compensatory multidimensional item response theory (MIRT) models. Secondarily, the Vuong and Clarke statistics have been used widely in economics and political science and have great potential to contribute to educational measurement. Finally, applying the Vuong and Clarke statistics will not only reduce the damage caused by misspecified MIRT models but will also promote the use of lesser-known models, in this case the non-compensatory models.
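For orientation, the two model families can be sketched in their common textbook forms. This is an illustration only, not necessarily the parameterization used in the dissertation: the compensatory form shown is the multidimensional two-parameter logistic model and the non-compensatory form is a product of per-dimension logistic terms without a guessing parameter. The symbols (person $i$, item $j$, dimension $k$, traits $\theta_{ik}$, discriminations $a_{jk}$, intercept $d_j$, difficulties $b_{jk}$) are notation assumed here.

```latex
% Compensatory: traits combine additively inside a single logistic function,
% so a high level on one trait can offset a low level on another.
P(X_{ij}=1 \mid \boldsymbol{\theta}_i)
  = \frac{1}{1+\exp\!\left[-\left(\sum_{k=1}^{K} a_{jk}\theta_{ik} + d_j\right)\right]}

% Non-compensatory: one logistic term per dimension, multiplied together,
% so a low level on any single trait caps the success probability.
P(X_{ij}=1 \mid \boldsymbol{\theta}_i)
  = \prod_{k=1}^{K} \frac{1}{1+\exp\!\left[-a_{jk}\left(\theta_{ik}-b_{jk}\right)\right]}
```

Neither form can be obtained from the other by restricting parameters, which is why the two models are non-nested and a likelihood-ratio test does not apply directly.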

The purpose of the study was to investigate whether the Vuong and Clarke statistics can be used to detect model-data fit between compensatory and non-compensatory MIRT models. The effectiveness of the statistics was evaluated through simulated Type I error and power studies. The Type I error studies compared the true and estimated compensatory models. The power studies compared the true non-compensatory model with the estimated compensatory model. The controlled factors were test structure, sample size, test length, and correlation between person traits. Overall, when the assumed sampling distributions of the two fit statistics were used, the statistics produced very large values, which resulted in extremely high average rejection rates for all conditions. In other words, the nominal Type I error rates were not observed. Consequently, alternative procedures were employed to assess statistical power: receiver operating characteristic (ROC) curves were used to assess the discrimination ability of the statistics, and a Monte Carlo resampling technique was used to assess power rates.
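As a rough illustration of the two statistics named above, the following minimal Python sketch computes them from per-person log-likelihoods under two competing models. It shows the basic, uncorrected textbook forms with their assumed reference distributions (standard normal for Vuong, Binomial(n, 0.5) for Clarke) and is not the dissertation's implementation. The function names and the input lists `loglik_a` and `loglik_b` are hypothetical.

```python
import math


def vuong_statistic(loglik_a, loglik_b):
    """Basic Vuong z statistic from per-person log-likelihoods (no parameter-count correction)."""
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Variance of the individual log-likelihood differences (divisor n).
    var = sum((d - mean) ** 2 for d in diffs) / n
    z = math.sqrt(n) * mean / math.sqrt(var)
    # Two-sided p-value under the assumed standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def clarke_statistic(loglik_a, loglik_b):
    """Basic Clarke sign statistic: count of persons whose log-likelihood favors model A."""
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    b = sum(1 for d in diffs if d > 0)
    # Exact Binomial(n, 0.5) probabilities; integer division keeps large n stable.
    pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    lower = sum(pmf[: b + 1])  # P(X <= b)
    upper = sum(pmf[b:])       # P(X >= b)
    p_value = min(1.0, 2 * min(lower, upper))
    return b, p_value


if __name__ == "__main__":
    # Made-up log-likelihood values, for illustration only.
    ll_a = [-1.0, -2.0, -1.5, -0.5]
    ll_b = [-1.5, -2.5, -1.0, -1.5]
    print(vuong_statistic(ll_a, ll_b))
    print(clarke_statistic(ll_a, ll_b))
```

Positive values of the Vuong statistic, or a Clarke count above n/2, favor the first model. The abstract reports that these assumed reference distributions did not yield nominal Type I error rates for MIRT models, which is why the study turned to ROC curves and Monte Carlo resampling instead.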

Results of both analyses clearly indicated the value of the Vuong and Clarke statistics in detecting model-data fit for the MIRT models. The ROC curve results provided evidence of the discrimination ability of both statistics under most conditions. The power analyses provided strong evidence that, under most conditions, the Vuong statistic, the Clarke statistic, or both are able to detect misfit. All four condition factors had an observable impact on the performance of the fit statistics. Power increased with test length and sample size and decreased as the correlation between traits increased. The statistics were particularly effective in conditions with large samples, long tests, and low correlation between traits, and less effective when the correlation between traits was high. The performance of the statistics also differed across test structures.
