Study on Inter-raters Reliability under the Latent Trait Model through Derangement (潛在特質模型錯排分派量測下評分者間信度之探討)

Contributors: Chen, Hung (陳宏); Liu, Shan-Pang (劉繕榜)
Affiliation: National Taiwan University, Graduate Institute of Mathematics
Record dates: 2010-05-05; 2018-06-28
Year: 2009
Identifier: U0001-1108200915453900
URI: http://ntur.lib.ntu.edu.tw//handle/246246/180617

Abstract

This study investigates inter-rater reliability when a large group of examinees is assigned to raters through derangement and each rater classifies the examinees' latent trait into ordinal grades. Under the latent trait model (LTM), the polychoric correlation coefficient is used as the measure of inter-rater reliability.

With around three hundred thousand examinees and a few hundred raters, we assign examinees to raters through derangement so that two raters who grade examinees in common share at least a hundred of them. Under this setting, all raters are partitioned into several cycles. Analytic arguments and 100 simulation runs show that the number of cycles usually does not exceed ten, that the proportion of runs with at least one cycle of size 2 or 3 is 0.59, and that cycles containing more than 100 raters occur frequently. Kolmogorov-Smirnov tests indicate that the latent traits of the examinees assigned to each rater mostly follow the standard normal distribution; the few groups of examinees that differ from the others do so only in the mean, that is, up to a location shift.

Under the LTM, the discrimination parameter can be regarded as an index of a rater's rating accuracy. We also show that the correlation between a rater's grades and the examinees' latent trait depends on the thresholds and the discrimination parameter. The correlation coefficient between the latent perception variables of two raters is the product of their discrimination parameters, and it is estimated by the polychoric correlation coefficient with a two-stage method: each rater's thresholds are obtained from the proportions of grades that the rater gives, and the discrimination parameters are then derived from the polychoric correlation coefficients together with an appropriate derangement assignment. Finally, we summarize the results and offer some suggestions.

Table of contents

Chinese Abstract
English Abstract
Chapter 1 Introduction
  1.1 Research motivation and measurement models for writing
  1.2 Item response theory and rater rating models
  1.3 Latent trait theory, latent class models, and assumptions on raters' ratings
  1.4 Rating consistency among raters
  1.5 Research questions
Chapter 2 Assignment schemes and tests of normality for examinees' latent traits
  2.1 Assessment network
  2.2 Assignment schemes and the number of examinees rated in common by two raters
  2.3 Normality test under random equal assignment
  2.4 Normality test under a single random assignment with group exchange
Chapter 3 Measurement model assumptions for raters and estimation of thresholds
  3.1 Definition and appropriateness of the model assumptions
  3.2 Correlation between ratings and the latent trait variable
  3.3 Inter-rater reliability and the measurement model assumptions
  3.4 Step 1: Estimation of raters' thresholds
Chapter 4 Estimation of inter-rater reliability
  4.1 Estimation of the correlation coefficient rho between two raters
  4.2 Consistency of rho
  4.3 Estimation and testing
  4.4 Derangement and estimation of raters' discrimination parameter beta
Chapter 5 Summary and discussion
  5.1 Conclusions
  5.2 Limitations of the study
  5.3 Discussion and suggestions
References
Appendix
  Recurrence relation for the number of cycles that can form when n raters are deranged
  R program for exchanging examinee groups and testing the normality of latent traits

Format: application/pdf
Size: 1984056 bytes
Language: en-US
Keywords: reliability; polychoric correlation coefficient; latent trait model; thresholds; derangement
Title: Study on Inter-raters Reliability under the Latent Trait Model through Derangement
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/180617/1/ntu-98-R92221028-1.pdf
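The cycle structure described in the abstract can be explored with a small simulation. The sketch below is not the thesis's own code (its appendix mentions an R program); it is a minimal Python illustration under stated assumptions: the derangement is drawn uniformly at random, raters are labelled 0 to n-1, and `n_raters=300` stands in for "a few hundred raters". The function names `random_derangement`, `cycle_lengths`, and `simulate` are hypothetical.

```python
import random


def random_derangement(n, rng):
    """Draw a uniform random derangement of range(n) by rejection sampling."""
    while True:
        perm = list(range(n))
        rng.shuffle(perm)
        # Accept only permutations with no fixed point,
        # i.e. no rater is mapped to themselves (assumed meaning of the assignment).
        if all(perm[i] != i for i in range(n)):
            return perm


def cycle_lengths(perm):
    """Return the lengths of the cycles of a permutation given as a list."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def simulate(n_raters=300, n_trials=100, seed=0):
    """Summarise the cycle structure over repeated random derangements."""
    rng = random.Random(seed)
    short_cycle_runs = 0  # runs with at least one cycle of size 2 or 3
    big_cycle_runs = 0    # runs whose largest cycle has more than 100 raters
    total_cycles = 0
    for _ in range(n_trials):
        lengths = cycle_lengths(random_derangement(n_raters, rng))
        total_cycles += len(lengths)
        short_cycle_runs += any(k in (2, 3) for k in lengths)
        big_cycle_runs += max(lengths) > 100
    return {
        "mean_cycles": total_cycles / n_trials,
        "share_with_2_or_3_cycle": short_cycle_runs / n_trials,
        "share_with_cycle_over_100": big_cycle_runs / n_trials,
    }


if __name__ == "__main__":
    print(simulate())
```

Calling `simulate()` returns the three summary shares for 100 runs. If the derangement is indeed uniform, the usual Poisson approximation for cycle counts suggests a 2-or-3-cycle share near 1 - exp(-5/6), about 0.57 for large n, which is in line with the 0.59 reported from 100 simulations; the thesis's actual assignment procedure may differ from this assumption.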