Conceptual Issues in Observed-Score Equating

Wim J. van der Linden

CTB/McGraw-Hill

Outline

• Review of Lord (1980)

• Local equating

• Few examples

• Discussion

Review of Lord (1980)

• Notation
– X: old test form with observed score X
– Y: new test form with observed score Y
– θ: common ability measured by X and Y
– x=φ(y): equating transformation

Review of Lord (1980) Cont’d

• Case 1: Infallible measures
– X and Y order any population identically
– Equivalence of ranks establishes equating transformation

$$G(y) = p, \qquad F(\varphi(y)) = p, \qquad \varphi(y) = F^{-1}(G(y))$$
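A minimal sketch of the rank-equivalence idea in Case 1: the snippet computes φ(y) = F⁻¹(G(y)) from two samples of scores, with F and G taken as the empirical distribution functions of X and Y. The function name `equipercentile` and the toy data are illustrative assumptions, and discreteness is handled crudely by returning the smallest x whose cumulative proportion reaches G(y).

```python
from bisect import bisect_right

def equipercentile(y, x_scores, y_scores):
    """Return phi(y) = F^{-1}(G(y)) using empirical distribution functions."""
    xs, ys = sorted(x_scores), sorted(y_scores)
    rank = bisect_right(ys, y)              # number of Y scores <= y, so G(y) = rank / len(ys)
    # Index of the smallest order statistic of X whose cumulative proportion
    # is >= G(y); integer arithmetic avoids floating-point rounding.
    k = -(-rank * len(xs) // len(ys)) - 1   # ceil(rank * len(xs) / len(ys)) - 1
    return xs[max(k, 0)]

# Toy "infallible" case (hypothetical data): X is a strictly increasing
# function of Y, so both forms order the test takers identically.
y_scores = list(range(41))
x_scores = [2 * y + 3 for y in y_scores]
equated = [equipercentile(y, x_scores, y_scores) for y in y_scores]
```

In this toy case every equated score coincides with the score the same test taker obtained on X, which is the sense in which the equating error is zero for infallible measures.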

• Case 1: Infallible measures Cont’d
– Q-Q curve
– Issues related to discreteness, strict monotonicity, and sampling error will be ignored
– Equating is population invariant
– Equating error always equal to zero

$$e(x) \equiv \varphi(y) - x$$

• Case 2: Fallible measures
– For each test taker, observed scores are random variables
– Realizations of X and Y do not order populations of test takers identically
– Criterion of equity of equating

$$f_{\varphi(Y)\mid\theta} = f_{X\mid\theta} \quad \text{for all } \theta$$

• Case 2: Fallible measures Cont’d

– Lord’s theorem: Under realistic conditions, scores X and Y on two tests cannot be equated unless either (1) both scores are perfectly reliable or (2) the two tests are strictly parallel [in which case φ(y)=y]

• Case 2: Fallible measures Cont’d

– Equating no longer population invariant

$$f_X(x) = \int f_{X\mid\theta}(x)\, f(\theta)\, d\theta$$

$$f_Y(y) = \int f_{Y\mid\theta}(y)\, f(\theta)\, d\theta$$

• Two approximate methods
– IRT true-score equating
– Use ξ=ξ(η) to equate Y to X
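A minimal sketch of IRT true-score equating, reading ξ and η as the expected number-correct scores on X and Y at a given θ. The 2PL model, the item parameters, and all function names are assumptions made for illustration: the code solves η(θ) = y for θ by bisection and returns ξ(θ).

```python
import math

def p_correct(theta, a, b):
    """2PL response probability (an assumed model for illustration)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def true_score(theta, items):
    """Expected number-correct score at theta, i.e. xi(theta) or eta(theta)."""
    return sum(p_correct(theta, a, b) for a, b in items)

def true_score_equate(y, items_y, items_x, lo=-8.0, hi=8.0, tol=1e-10):
    """Solve eta(theta) = y by bisection, then return xi(theta)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if true_score(mid, items_y) < y:   # eta is increasing in theta when all a > 0
            lo = mid
        else:
            hi = mid
    return true_score(0.5 * (lo + hi), items_x)

items_y = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]   # hypothetical (a, b) pairs for form Y
items_x = [(1.0, -0.5), (1.2, 0.5), (0.8, 1.5)]   # a harder hypothetical form X
xi = true_score_equate(1.5, items_y, items_x)
```

The bisection only works for values of y strictly between η(lo) and η(hi); scores of 0 or of all items correct have no finite θ and need separate handling.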

• Two approximate methods Cont’d

– IRT observed-score equating, for a sample of test takers a=1,…,N

$$f_X(x) = N^{-1} \sum_{a=1}^{N} f_{X\mid\theta_a}(x)$$

$$f_Y(y) = N^{-1} \sum_{a=1}^{N} f_{Y\mid\theta_a}(y)$$
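The averaging step in IRT observed-score equating can be sketched as follows. The conditional score distributions are computed with the Lord-Wingersky recursion under an assumed 2PL model; the item parameters, the two samples of abilities, and the function names are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL response probability (an assumed model for illustration)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def score_dist(theta, items):
    """Lord-Wingersky recursion: distribution of the number-correct score given theta."""
    dist = [1.0]                          # with zero items, the score is 0 with probability 1
    for a, b in items:
        p = p_correct(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for s, prob in enumerate(dist):
            new[s] += prob * (1.0 - p)    # item answered incorrectly
            new[s + 1] += prob * p        # item answered correctly
        dist = new
    return dist

def marginal_dist(thetas, items):
    """Average the conditional distributions over the sample a = 1, ..., N."""
    total = [0.0] * (len(items) + 1)
    for theta in thetas:
        for s, prob in enumerate(score_dist(theta, items)):
            total[s] += prob / len(thetas)
    return total

items_x = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]      # hypothetical (a, b) pairs
f_x = marginal_dist([-1.0, 0.0, 1.0], items_x)       # one hypothetical sample
f_x_low = marginal_dist([-2.0, -1.0, 0.0], items_x)  # a less able hypothetical sample
```

The two samples give different marginal distributions for the same form, which is the population dependence noted for Case 2.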

• Lord’s forgotten question:

What is really needed is a criterion for evaluating such approximate procedures, so as to be able to choose from among them. If you can’t be fair (provide equity) to everyone, what is the next best thing? (p.207)

Local Equating

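• New definition of equating error

$$e_2(x;\theta) \equiv F_{\varphi(Y)\mid\theta}(x) - F_{X\mid\theta}(x)$$

• Equity=no equating error!

• Setting e2(x;θ) equal to zero and solving for φ(y) gives

$$x = \varphi^{*}(y;\theta) = F^{-1}_{X\mid\theta}\bigl(F_{Y\mid\theta}(y)\bigr), \quad \theta \in \mathbb{R}$$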
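The step from zero equating error to the local transformation can be spelled out as follows, with the error taken as the difference between the conditional distribution functions of φ(Y) and X given θ. This is a sketch that assumes φ is strictly increasing and that the conditional distribution functions are continuous and strictly increasing, in line with the discreteness issues set aside earlier.

```latex
\begin{align*}
F_{\varphi(Y)\mid\theta}(x)
  &= \Pr\{\varphi(Y) \le x \mid \theta\}
   = \Pr\{Y \le \varphi^{-1}(x) \mid \theta\}
   = F_{Y\mid\theta}\bigl(\varphi^{-1}(x)\bigr), \\
e_2(x;\theta) = 0
  &\iff F_{Y\mid\theta}\bigl(\varphi^{-1}(x)\bigr) = F_{X\mid\theta}(x)
   \quad \text{for all } x, \\
x = \varphi(y)
  &\implies F_{Y\mid\theta}(y) = F_{X\mid\theta}\bigl(\varphi(y)\bigr)
   \implies \varphi^{*}(y;\theta) = F^{-1}_{X\mid\theta}\bigl(F_{Y\mid\theta}(y)\bigr).
\end{align*}
```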

Local Equating Cont’d

• Because of monotonicity of x=φ(y), the result is the family of error-free (or true) equating transformations

$$\varphi^{*}(y;\theta) = F^{-1}_{X\mid\theta}\bigl(F_{Y\mid\theta}(y)\bigr), \quad \theta \in \mathbb{R}$$

• Lord’s theorem is based on the implicit assumption of a single transformation

• Theorem: For a population of test takers P for which X and Y measure the same θ, equating with the family of transformations φ*(y;θ) has the following properties:
(i) equity for each p ∈ P
(ii) symmetry in X and Y for each p ∈ P
(iii) population invariance within P

• Theorem defines population P
– No sampling of test takers required
– Includes future test takers

• Alternative definition of equating error:

$$e_3(y;\theta) \equiv \varphi(y) - F^{-1}_{X\mid\theta}\bigl(F_{Y\mid\theta}(y)\bigr)$$

• Definition of bias, MSE, etc., in equating now straightforward

• Lord’s criterion for finding the “next best thing”

• Alternative motivations of local equating
– Thought experiment
– History of standard error of measurement
– Comparison with
  • true-score equating
  • IRT observed-score equating
– Same score but different equated scores?

• Alternative motivations Cont’d

– One measurement instrument but different transformations?

Few Examples

• It may seem as if local equating replaces Lord’s set of impossible conditions for equating (perfect reliability; parallel test) by another impossible condition (known ability)

• However, post hoc improvement of reliability or parallelness is impossible but we can always approximate an unknown ability

Few Examples Cont’d

• Possible approximations
– Estimating ability
– Anchor scores as a proxy of ability
– Y=y as a proxy of ability
– Proxies based on collateral information

Discussion

• Criterion of equity involves a different equating transformation for each ability level

• Traditional equating uses a “one-size-fits-all” transformation, which compromises between the transformations for the ability levels. As a result, the equating is always (i) biased and (ii) population dependent

Discussion Cont’d

• Lord’s theorem on the impossibility or unnecessity of observed-score equating was too pessimistic because it assumed the use of a single equating transformation for a population of test takers

Equipercentile Method

[Figure: curves for Test Y and Test X against Test Score (axis 0 to 40), illustrating $x = F^{-1}(G(y))$]

Thought Experiment

[Figure sequence, continued over several slides: panels labeled Test Y and Test X for test taker p with the transformation Y → X; the same panels for test takers p and q with the transformations Y → X; and Test Y (Population 1) against Test X (Population 2) with the transformations Y → X]

Standard Error of Measurement

• Classical test theory involves one SEM for an entire population of test takers

$$\sigma_E = \sigma_X \sqrt{1 - \rho_{XX'}}$$

• Stronger models condition on ability measured; e.g., IRT

$$\sigma_{E\mid\theta} = \sqrt{\sum_i P_i(\theta)\bigl(1 - P_i(\theta)\bigr)}$$
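The contrast between one SEM for the whole population and an SEM that depends on ability can be sketched numerically. The 2PL model, the item parameters, and the function names below are assumptions for illustration; the conditional SEM uses the variance of a sum of independent item scores given θ.

```python
import math

def p_correct(theta, a, b):
    """2PL response probability (an assumed IRT model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def classical_sem(sd_x, reliability):
    """One SEM for the whole population: sigma_X * sqrt(1 - rho_XX')."""
    return sd_x * math.sqrt(1.0 - reliability)

def conditional_sem(theta, items):
    """SEM of the number-correct score given theta, assuming local independence."""
    probs = [p_correct(theta, a, b) for a, b in items]
    return math.sqrt(sum(p * (1.0 - p) for p in probs))

items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]   # hypothetical (a, b) pairs
sem_by_theta = {theta: conditional_sem(theta, items) for theta in (-3.0, 0.0, 3.0)}
```

For these hypothetical items the conditional SEM is largest near the middle of the difficulty range and shrinks toward the extremes, which a single population value cannot show.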

True-Score Equating

• True-score equating is a degenerate case of local equating

$$E(Y \mid \theta) \mapsto E(X \mid \theta), \quad \theta \in \mathbb{R}$$

Different Equated Scores?

• Why should two test takers, p and q, with the same score of 23 out of 30 items correct on a new test form need different equated scores on the same old form?
– Would this not even be unfair?
– Fallible scores

Different Equated Scores? Cont’d

[Figure: observed-score distribution of p; observed-score distribution of q]

Different Transformations?

• Example of measuring tape

• Number-correct scores are counts of responses, not fundamental measures

• Responses have person and item effects
– Test equating requires “some type of control for differential examinee ability” (von Davier, Holland & Thayer, 2004, p. 2)

Different Transformations? Cont’d

• An effective way to disentangle item and person effects is through IRT modeling

• Observed-score equating is an attempt to do the same through a transformation of total scores
– Only possible way is (i) to first condition on the abilities and (ii) then transform the score to adjust for the item effects

Estimating Ability

• Assumption: fitting response model

• Calculate family of true equating transformations (Lord-Wingersky’s recursive procedure)

• Use member of family at point estimate of θ

• Bias study for 40-item subtests of LSAT

• Application in adaptive testing
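The procedure listed above can be sketched as follows: the conditional score distributions are computed with the Lord-Wingersky recursion, and the member of the family at a given θ maps y to the smallest x whose conditional cumulative probability reaches that of y. The 2PL model, the item parameters, the θ values, and the function names are hypothetical, and a point estimate of θ is taken as given.

```python
import math

def p_correct(theta, a, b):
    """2PL response probability (an assumed, fitting response model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def score_cdf(theta, items):
    """Lord-Wingersky recursion, returned as the cumulative score distribution given theta."""
    dist = [1.0]
    for a, b in items:
        p = p_correct(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for s, prob in enumerate(dist):
            new[s] += prob * (1.0 - p)
            new[s + 1] += prob * p
        dist = new
    cdf, running = [], 0.0
    for prob in dist:
        running += prob
        cdf.append(running)
    return cdf

def local_equate(y, theta, items_y, items_x, eps=1e-12):
    """Discrete version of F^{-1}_{X|theta}(F_{Y|theta}(y))."""
    target = score_cdf(theta, items_y)[y]
    for x, cum in enumerate(score_cdf(theta, items_x)):
        if cum >= target - eps:            # eps guards against rounding error
            return x
    return len(items_x)

items_y = [(1.0, -1.0), (1.2, -0.5), (0.9, 0.0), (1.1, 0.5), (1.0, 1.0)]   # hypothetical form Y
items_x = [(1.0, -0.5), (1.2, 0.0), (0.9, 0.5), (1.1, 1.0), (1.0, 1.5)]    # harder hypothetical form X
family = {theta: [local_equate(y, theta, items_y, items_x)
                  for y in range(len(items_y) + 1)]
          for theta in (-2.0, 0.0, 2.0)}
```

In practice the θ passed to `local_equate` would be the test taker's point estimate. The integer-valued output is a crude way of handling discreteness, so a real application would likely interpolate between scores.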

Bias Study

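[Figure: bias plotted against score (0 to 40); panels: Traditional Equating, Local Equating at θ]

Family of True Transformations for LSAT Subtest

[Figure: family of true transformations for the LSAT subtest (axis 0 to 25); one curve labeled θ = −2.0]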

Anchor Score as Proxy

• Current methods
– Chain equating
– Poststratification equating
– Linear equating methods: Tucker, Levine, Braun-Holland, linear chain equating

• Use conditional distributions of X and Y given anchor score A=a

$$x = F^{-1}_{X\mid a}\bigl(F_{Y\mid a}(y)\bigr), \quad a \in A$$
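A minimal sketch of using the anchor score as a proxy: scores are grouped by anchor score a, and the equipercentile step is applied within each group. The data layout (pairs of total score and anchor score), the toy values, and the function names are assumptions for illustration.

```python
from bisect import bisect_right
from collections import defaultdict

def group_by_anchor(pairs):
    """Map each anchor score a to the sorted total scores observed with A = a."""
    groups = defaultdict(list)
    for score, a in pairs:
        groups[a].append(score)
    return {a: sorted(scores) for a, scores in groups.items()}

def local_equate_anchor(y, a, x_by_a, y_by_a):
    """Empirical F^{-1}_{X|a}(F_{Y|a}(y)) for a test taker with anchor score a."""
    xs, ys = x_by_a[a], y_by_a[a]           # raises KeyError if a was not observed
    rank = bisect_right(ys, y)              # number of Y scores <= y among A = a
    k = -(-rank * len(xs) // len(ys)) - 1   # ceil(rank * len(xs) / len(ys)) - 1
    return xs[max(k, 0)]

# Hypothetical (score, anchor) pairs from the two groups of an anchor-test design.
x_data = [(12, 2), (13, 2), (14, 2), (15, 2), (21, 5), (22, 5), (23, 5), (24, 5)]
y_data = [(10, 2), (11, 2), (12, 2), (13, 2), (20, 5), (21, 5), (22, 5), (23, 5)]
x_by_a, y_by_a = group_by_anchor(x_data), group_by_anchor(y_data)
equated_low = local_equate_anchor(11, 2, x_by_a, y_by_a)
equated_high = local_equate_anchor(21, 5, x_by_a, y_by_a)
```

With real data the conditional distributions at sparse anchor scores would be estimated from few test takers, so some smoothing would probably be needed before this step.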

Anchor Score as Proxy Cont’d

• Empirical bias study for same LSAT subtests

Bias Study—Anchor-Test Design

[Figure: bias (vertical axis, −8 to 8) against score (horizontal axis, 0 to 40) in three panels: Chain Equating, Poststratification Equating, Local Equating]

Y=y as Proxy of Ability

• Single-group design
– Estimate distributions of X given Y=y directly from bivariate distribution of X and Y
– Model-based estimate of Y given y

Y=y as Proxy of Ability

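• Linear local equating

$$x = \varphi(y) = \mu_{X\mid y} + \frac{\sigma_{X\mid y}}{\sigma_{Y\mid y}}\,(y - \mu_{Y\mid y}), \quad y = 0, 1, \ldots, n.$$

• Because μY|y=y (classical test theory),

$$x = \varphi(y) = \mu_{X\mid y}, \quad y = 0, 1, \ldots, n.$$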
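Under the simplification that the equated score is the conditional mean of X given Y=y, linear local equating in a single-group design reduces to estimating that mean from the bivariate data. The sketch below uses raw conditional means; the data layout, the toy values, and the function name are assumptions for illustration.

```python
from collections import defaultdict

def linear_local_equating(pairs):
    """Return {y: mean of X among test takers with Y = y} from (x, y) pairs."""
    sums = defaultdict(lambda: [0.0, 0])    # y -> [sum of x, count]
    for x, y in pairs:
        sums[y][0] += x
        sums[y][1] += 1
    return {y: total / count for y, (total, count) in sorted(sums.items())}

# Hypothetical single-group data: each test taker has a score on both forms.
pairs = [(12, 10), (14, 10), (15, 11), (17, 11), (16, 11)]
table = linear_local_equating(pairs)
```

Raw conditional means are noisy at scores with few test takers, so a smoothed or model-based estimate of the bivariate distribution would probably be preferred in practice.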

Collateral Information

• Any variables correlating substantially with θ
– Earlier tests
– Battery of subtests
– Response times

• Alternative sources give different equatings; just find the “next best thing”
