DC CAS 2012 Technical Report

Technical Report
Spring 2012 Test Administration

Washington, D.C.
Comprehensive Assessment System
(DC CAS)

December 31, 2012

CTB/McGraw-Hill
Monterey, California 93940

Copyright (C) 2012 by the District of Columbia Office of the State Superintendent of Education


Developed and published by CTB/McGraw-Hill LLC, 20 Ryan Ranch Road, Monterey, California 93940-5703. Copyright (C) 2012 by the District of Columbia Office of the State Superintendent of Education. All
rights reserved. Only authorized customers may copy, download and/or print the document, located online
at http://osse.dc.gov/seo/cwp/view. Any other use or reproduction of this document, in whole or in part,
requires written permission of the District of Columbia Office of the State Superintendent of Education.


Table of Contents
List of Tables
Section 1. Overview
Section 2. Item and Test Development
    Overview
    Content Standards
    Item Development
    Test Development
    Test Design
Section 3. Test Administration Guidelines and Requirements
    Overview
    Guidelines and Requirements for Administering DC CAS
    Materials Orders, Delivery, and Retrieval
    Secure Inventory
Section 4. Student Participation
    Tests Administered
    Participation in DC CAS
    Definition of Valid Test Administration
    Special Accommodations
Section 5. Scoring
    Selection of Scoring Raters
    Recruitment
    The Interview Process
    Training Material Development
    Preparation and Meeting Logistics for Rangefinding
    Training and Qualifying Procedures
    Breakdown of Scoring Teams
    Monitoring the Scoring Process
Section 6. Methods
    Classical Item Level Analyses
    Item Bias Analyses
    Calibration and Equating
    Goodness of Fit
    Year-to-Year Equating Procedures
    Establishing Upper and Lower Bounds for the Grade Level Scales
    Reliability Coefficients
    Standard Errors of Measurement
    Proficiency Level Analyses
    Classification Consistency
    Classification Accuracy
Section 7. Standard Setting
    Grades 3-10 Reading Cut Score Review
    Grade 2 Reading and Mathematics Standard Setting
    Grades 4, 7, and 10 Composition Standard Setting
    Final, Approved DC CAS Cut Scores
Section 8. Evidence for Reliability and Validity
    Reliability
    Validity
    Item Level Evidence
    Classical Item Statistics
    Inter-Rater Reliability
    Differential Item Functioning
    Test and Strand Level Evidence
    Operational Test Scores
    Strand Level Scores
    Standard Errors of Measurement
    Proficiency Level Evidence
    Correlational Evidence across Content Areas
References
Appendix A: Checklist for DC Educator Review of DC CAS Items
Appendix B: DC CAS Composition Scoring Rubrics
Appendix C: Operational and Field Test Item Adjusted P Values
Appendix D: Internal Consistency Reliability Coefficients for Examinee Subgroups
Appendix E: Classification Consistency and Accuracy Estimates for All Proficiency Levels for Examinee Subgroups


List of Tables
Table 1. DC CAS 2012 Operational Test Form Blueprints: Reading
Table 2. DC CAS 2012 Operational Test Form Blueprints: Mathematics
Table 3. DC CAS 2012 Operational Test Form Blueprints: Science/Biology
Table 4. DC CAS 2012 Operational Test Form Blueprints: Composition
Table 5. Number and Percent of Examinees with Valid Test Administrations on the 2012 DC CAS in Reading, Mathematics, Science/Biology, or Composition
Table 6. Number and Percent of Students in Special Programs with Test Scores on the 2012 DC CAS in Reading, Mathematics, Science/Biology, or Composition
Table 7. Number and Percent of Students Coded for ELL Access for Proficiency Levels 1-4 in Reading, Mathematics, Science/Biology, or Composition
Table 8. Number and Percent of Students Receiving One or More English Language Learner Test Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition
Table 9. Number and Percent of Students Receiving One or More Special Education Test Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition
Table 10. Number and Percent of Students Receiving One or More Selected Special Education Test Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition
Table 11. DC CAS 2012 Numbers of Operational Items Flagged for Poor Fit During Calibration
Table 12. Correlations Between the Item Parameters for the Reference Form and 2012 DC CAS Operational Test Form
Table 13. Scaling Constants Across Administrations, All Grades and Content Areas
Table 14. LOSS and HOSS for Relevant Grades in Reading, Mathematics, Science/Biology and Composition
Table 15. Final Cut Score Ranges
Table 16. DC CAS 2012 Classical Item Level Statistics
Table 17. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response Items: Reading
Table 18. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response Items: Mathematics
Table 19. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response Items: Science/Biology
Table 20. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response Items: Composition
Table 21. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response Items: Reading
Table 22. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response Items: Mathematics
Table 23. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response Items: Science/Biology
Table 24. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel Procedure: Reading
Table 25. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel Procedure: Mathematics
Table 26. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel Procedure: Science/Biology
Table 27. Numbers of Operational/Field Test Items Flagged for DIF Using the Mantel-Haenszel Procedure: Composition
Table 28. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel Procedure: Reading
Table 29. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel Procedure: Mathematics
Table 30. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel Procedure: Science/Biology
Table 31. Total Test Scale and Raw Score Means and Reliability Statistics
Table 32. Coefficient Alpha Reliability for Reading Strand Scores
Table 33. Coefficient Alpha Reliability for Mathematics Strand Scores
Table 34. Coefficient Alpha Reliability for Science/Biology Strand Scores
Table 35. Coefficient Alpha Reliability for Composition Strand Scores
Table 36. DC CAS 2012 Reading Strand Correlations by Grade
Table 37. DC CAS 2012 Mathematics Strand Correlations by Grade
Table 38. DC CAS 2012 Science/Biology Strand Correlations by Grade
Table 39. DC CAS 2012 Composition Rubric Score Correlations by Grade
Table 40. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Reading
Table 41. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Mathematics
Table 42. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Science/Biology
Table 43. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Composition
Table 44. DC CAS 2012 Percentages of Students at Each Performance Level
Table 45. Classification Consistency and Accuracy Rates by Grade and Cut Score: Reading
Table 46. Classification Consistency and Accuracy Rates by Grade and Cut Score: Mathematics
Table 47. Classification Consistency and Accuracy Rates by Grade and Cut Score: Science/Biology
Table 48. Classification Consistency and Accuracy Rates by Grade and Cut Score: Composition
Table 49. Correlations Between Reading, Mathematics, Science/Biology, and Composition Total Test Raw Scores, by Grade
Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading
Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics
Table C3. DC CAS 2012 Operational Form Item Adjusted P Values: Science/Biology
Table C4. DC CAS 2012 Operational Form Item Adjusted P Values: Composition
Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading
Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics
Table C7. DC CAS 2012 Field Test Form Item Adjusted P Values: Science/Biology
Table D1. Internal Consistency Reliability Coefficients for Examinee Subgroups: Reading
Table D2. Internal Consistency Reliability Coefficients for Examinee Subgroups: Mathematics
Table D3. Internal Consistency Reliability Coefficients for Examinee Subgroups: Science/Biology
Table D4. Internal Consistency Reliability Coefficients for Examinee Subgroups: Composition
Table E1. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee Subgroups: Reading
Table E2. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee Subgroups: Mathematics
Table E3. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee Subgroups: Science/Biology
Table E4. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee Subgroups: Composition


Section 1. Overview
The primary purpose of the DC CAS is to measure annually the mastery of the Reading,
Mathematics, Science, Biology, and Composition content standards by all District of Columbia
(DC) public school students. The assessments provide the foundation for an accountability system that
enables the District to determine whether students and schools are making adequate yearly
progress on DC content standards as required by the No Child Left Behind (NCLB) Act. In
addition, the assessments are used by district- and school-based instructional staff to focus their
lessons on content standards and evaluate whether students and schools are achieving those
standards. Parents use the results to monitor their children's educational progress and the
effectiveness of their school and school district.
This document describes the operational District of Columbia Comprehensive Assessment System
(DC CAS) that was administered to students in the spring of 2012 to assess students' skills in
Grades 2-10 Reading; Grades 2-8 and 10 Mathematics; Grades 5 and 8 Science and high school
Biology; and Grades 4, 7, and 10 Composition. The DC CAS consists of multiple choice (MC) and
constructed response (CR) items in Reading, Mathematics, and Science/Biology, and writing
prompts for Composition. All items are administered under standardized conditions, with
accommodations allowed for eligible students.
Technical reports for assessment programs are the primary means for test developers and
assessment program managers to communicate with test users (American Educational Research
Association, American Psychological Association, & National Council on Measurement in
Education, 1999, p. 67). The Standards require technical reports to document, for example,
rationales and recommended uses for tests (Standard 6.3) and technical characteristics, such as
score reliability and validity of score interpretations (Standard 6.5). Because of the technical
nature of developing, implementing, and validating achievement tests like the DC CAS,
technical reports target audiences with some level of technical training and understanding.
Furthermore, the evidence provided in this report is directly relevant to the Standards and
Assessments Peer Review Guidance, Critical Elements (January 12, 2009; see
http://www.ed.gov/policy/elsec/guid/saaprguidance.pdf).
This technical report is written to document procedures and results from developing, analyzing,
and validating the 2012 DC CAS. It provides information relevant to an evaluation of the validity
of intended interpretations and uses of results from the 2012 DC CAS tests. The design of the test
administration, content development and forms construction, classical item analysis, differential
item functioning (DIF), item response theory (IRT) analyses, and proficiency level data are
provided.


Section 2. Item and Test Development
This section contains information relevant to the Standards and Assessments Peer Review
Guidance, Critical Elements 4.1 and 5.1:
4.1
For each assessment, including all alternate assessments, has the State documented the issue of
validity (in addition to the alignment of the assessment with the content standards), as described in
the Standards for Educational and Psychological Testing (AERA/APA/NCME, 1999), with
respect to all of the following categories:
(d) Has the State ascertained that the scoring and reporting structures are consistent with the subdomain structures of its academic content standards (i.e., are item interrelationships consistent with
the framework from which the test arises)?
5.1
Has the State outlined a coherent approach to ensuring alignment between each of its
assessments...based on grade-level achievement standards, and the academic content standards
and academic achievement standards the assessment is designed to measure?

Overview
A key piece of validity evidence is provided by the procedures used to develop the test's content
and the alignment of items with the test blueprint and specifications. By setting forth a description
of the events that took place in a test's development, we establish evidence of validity for the DC
CAS based on test development procedures and test content.
Evidence of validity based on test content includes information about the item and test
specifications. Test development involves creating a design framework from the statement of the
achievement construct to be measured. Design elements include numbers and types of items and
score points allocated to each content strand in each content area test.

Content Standards
The DC CAS tests are aligned to either DC Content Standards, Common Core State Standards
(CCSS), or to both for content areas transitioning to the common core. The standards serve as
reporting categories. Reading content is fully aligned to the CCSS in Reading. Mathematics
content has been transitioning to the CCSS and remains partially aligned to the DC Content
Standards, which serve as the reporting categories. In 2013, the Mathematics content will be fully
aligned to the CCSS in Mathematics. In 2012, the total numbers of operational items included in
the Science/Biology test design remained the same, although they were distributed differently
under the new standard headings.

Item Development
Each year, newly developed items are field tested in DC CAS in all grades and content areas.
These items are developed by CTB and, prior to field testing, go through a rigorous content and
psychometric review and approval process. CTB content and style editors, supervisors, and
managers review all items and then provide items to participants in Content and Bias/Sensitivity
Review workshops conducted in DC. OSSE invited educators, members of the DC Public Schools
(DCPS) administration, and community representatives to participate in workshops to review the
items. A training session was provided by CTB, after which the participants reviewed all items for
content and grade appropriateness, as well as for alignment to the content standards, then rated
each item for acceptance, revision, or rejection. The reviewers used the criteria in the checklist in
Appendix A to guide their rating decisions.

Test Development
CTB's Research and Development teams, with the approval of the OSSE, developed test forms
designed to measure student performance through both multiple choice (MC) and constructed
response (CR) item types. The total number of items and score points included in each reporting
category serves as the test blueprint, details of which are provided in Tables 1-4.
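
The percentage columns in Tables 1-4 follow directly from these point counts: each strand's
multiple choice and constructed response points are expressed as percentages of the strand's
total points. A minimal sketch of the derivation, using the Grade 2 Reading values from Table 1
(the helper code itself is illustrative, not part of the DC CAS specifications):

```python
# Sketch: how the percentage columns of the blueprint tables are derived from
# point counts. The strand values below are the Grade 2 Reading numbers from
# Table 1; the helper code itself is illustrative, not a DC CAS specification.

strands = {
    "Vocabulary Acquisition & Use": {"mc_points": 7, "cr_points": 0},
    "Reading Informational Text":   {"mc_points": 13, "cr_points": 3},
    "Reading Literary Text":        {"mc_points": 13, "cr_points": 3},
}

for name, pts in strands.items():
    total = pts["mc_points"] + pts["cr_points"]
    # % of MC (CR) points = MC (CR) points / total points for the strand
    print(f"{name}: {total} points, "
          f"MC {pts['mc_points'] / total:.2%}, CR {pts['cr_points'] / total:.2%}")

# Test-level row: sum the strands, then recompute the percentages.
mc = sum(p["mc_points"] for p in strands.values())  # 33
cr = sum(p["cr_points"] for p in strands.values())  # 6
print(f"Total: {mc + cr} points, MC {mc / (mc + cr):.2%}")  # 39 points, MC 84.62%
```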
The items that are available for selection in the DC CAS 2012 assessments originated from a pool
of operational and formerly field tested items from the 2011 DC CAS administration, with a small
number of items selected from older pools (excluding 2009). The Grade 2 Reading and
Mathematics items and the Grade 9 Reading items originated from CTB-owned items in the
TerraNova(TM) item pool.
DC CAS assessments are equated each year so that, from one form and year to the next, student
scores remain comparable. The equating process requires a set of items to link or anchor one year
to the next. These anchor items are a small subset of items that, proportionally, reflect the overall
test blueprints. They are typically selected first, around which the remaining operational items are
selected for each form.
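
The calibration and equating procedures themselves are documented in Section 6. As an
illustration of the role the anchor items play, the sketch below applies a simple mean/sigma
linking, in which the anchor items' difficulty estimates from the new form determine a linear
transformation onto the reference scale. The item values and the choice of mean/sigma linking
are illustrative assumptions, not the operational procedure.

```python
# Illustrative mean/sigma linking on anchor item difficulties (hypothetical
# values; the operational calibration and equating procedures are documented
# in Section 6). b_ref holds the anchor items' difficulties on the reference
# scale; b_new holds the same items' difficulties estimated on the new form.
from statistics import mean, pstdev

b_ref = [-1.20, -0.45, 0.10, 0.62, 1.35]
b_new = [-1.05, -0.30, 0.21, 0.80, 1.52]

# Linear transformation theta_ref = A * theta_new + B, chosen so the anchor
# difficulties have the same mean and standard deviation on both scales.
A = pstdev(b_ref) / pstdev(b_new)
B = mean(b_ref) - A * mean(b_new)
print(f"A = {A:.4f}, B = {B:.4f}")

# New-form parameters and ability estimates are then rescaled with A and B,
# which is what keeps scores comparable across years (cf. the scaling
# constants reported in Table 13).
```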
The forms were assembled by CTB Development staff who attended to the blueprint requirements,
as well as various other content and psychometric requirements, such as test length, score points,
item types, statistical comparability, and quality. For example, all proposed selections for
operational forms were compared to previous DC CAS test forms to ensure they remained parallel
and comparable in terms of test difficulty and coverage of the DC CAS content standards, as
specified in the 2012 test blueprints.
Once the forms were assembled, they went through an iterative review and approval process where
they were reviewed and approved by CTB Research, and then by OSSE. The items were reviewed
for content standards alignment and appropriateness by CTB test developers and by OSSE.

Test Design
The DC CAS tests are designed as operational tests with embedded field test items. In this way,
newly developed items can be field tested in and amongst operational items. This is an advantage
over separate field test designs, which highlight the items that do not "count" toward students'
scores and can decrease students' motivation to respond with serious effort.
To maximize the number of items field tested while minimally impacting the testing time required,
two forms were developed for all grades and content areas except Composition. Each form
included the same core set of operational items (which comprised the equating anchor subset) and
a set of unique embedded field test items. The two forms were spiraled together and packaged to
ensure near equal distribution of the forms in classrooms and so that field test data were based on
randomly equivalent groups across all students in the District (i.e., no sample was drawn).
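
A minimal sketch of the spiraling logic, assuming a simple alternating assignment of the two
forms through each classroom roster (the rosters and the assignment rule are hypothetical
illustrations; the actual packaging was handled by CTB):

```python
# Sketch of form spiraling: Forms A and B alternate through each classroom's
# roster, so the two field test groups are approximately randomly equivalent.
# The rosters and the alternating rule are hypothetical illustrations.
from itertools import cycle

classrooms = {
    "Room 101": ["s01", "s02", "s03", "s04", "s05"],
    "Room 102": ["s06", "s07", "s08", "s09"],
}

assignments = {}
for room, roster in classrooms.items():
    forms = cycle(["Form A", "Form B"])  # restart the spiral in each room
    for student, form in zip(roster, forms):
        assignments[student] = form

print(assignments)
# Both forms share the same operational core (including the anchor items);
# only the embedded field test items differ, so operational scores are
# unaffected by which form a student receives.
```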


The Composition tests were designed as operational field tests. This design captures student
performance and scores on field tested items. Four writing prompts in each of Grades 4, 7, and 10
were administered in spiral fashion within the classroom. Therefore, each student only needed to
respond to one prompt. The prompts were developed to reflect Common Core State Standards and
scored three times based on the following traits/rubrics: Writing Topic Development, Writing
Language Conventions, and Understanding Literary Text or Understanding Informational Text
(depending upon the type of passage associated with each prompt: literary or informational). The
rubrics used to score these items can be found in Appendix B. Note that student reports reflected
the scores from all rubrics; however, only the two Writing traits contributed to the overall scale
score and proficiency level designations this year.
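
Concretely, the two Writing traits were worth 6 and 4 points, so the operational raw score was
out of 10 points, while the Understanding trait score (4 possible points) was reported only (see
Table 4). A small sketch, with hypothetical rubric scores:

```python
# Sketch of the 2012 Composition scoring rule: only the two Writing traits
# count toward the 10-point operational raw score; the Understanding Literary
# or Informational Text score is reported but treated as field test. The
# rubric scores below are hypothetical examples.
student = {
    "writing_topic_development": 4,     # of 6 possible points
    "writing_language_conventions": 3,  # of 4 possible points
    "understanding_text": 2,            # of 4 possible points; reported only
}

operational_raw = (student["writing_topic_development"]
                   + student["writing_language_conventions"])
print(f"Operational raw score: {operational_raw} / 10")
print(f"Reported field test trait: {student['understanding_text']} / 4")
```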

Test Administration Design
For Grades 4-8 and 10, both Reading and Mathematics items were included in the same test
books. Reading items were in a stand-alone test book at Grade 9 since Mathematics was not tested
at that grade level. For all grades, test books and answer booklets were color-coded. Students in
Grades 2 and 3 used scannable test books in which they recorded their answers.
For Grades 3-10, each Reading and Mathematics test was divided into four sessions, for a total of
eight sessions per grade level test. For Grade 2, each Reading and Mathematics test was divided
into three sessions, for a total of six sessions. For all grades, each session included both multiple
choice and constructed response items.
A similar configuration was used for the Science/Biology tests. Students responded to the test
items in one of two test books. They recorded their answers in scannable answer documents. No
manipulatives were provided. The Science/Biology tests were divided into three sessions, each
with both multiple choice and constructed response items.
Composition test books were provided for each of the four prompts. The test books were scannable
documents that included the following: directions to students, evaluation criteria, a writing prompt,
three lined pages, and a biogrid. The prompts were administered within the established two-week
testing window. Students were also issued two sheets of double-sided, lined draft paper, specially
developed for the Composition test, for planning their writing.
Additional information regarding administration manuals and procedures is provided in Section 3.


Table 1. DC CAS 2012 Operational Test Form Blueprints: Reading

Grade | Strand | Content Strand               | MC Items/Points | % of MC Points | CR Items | CR Points | % of CR Points | Total Points | Anchor Items | Anchor % of Points | Field Test Items
2     | 1      | Vocabulary Acquisition & Use | 7   | 100.00% | 0 | 0 | 0.00%  | 7  | -  | -      | 4
2     | 3      | Reading Informational Text   | 13  | 81.25%  | 1 | 3 | 18.75% | 16 | -  | -      | 19
2     | 4      | Reading Literary Text        | 13  | 81.25%  | 1 | 3 | 18.75% | 16 | -  | -      | 19
2     |        | Total                        | 33  | 84.62%  | 2 | 6 | 15.38% | 39 | -  | -      | 42
3     | 1      | Vocabulary Acquisition & Use | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 5  | 62.50% | 4
3     | 3      | Reading Informational Text   | 18  | 85.71%  | 1 | 3 | 14.29% | 21 | 9  | 42.86% | 20
3     | 4      | Reading Literary Text        | 19  | 76.00%  | 2 | 6 | 24.00% | 25 | 9  | 36.00% | 14
3     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 38
4     | 1      | Vocabulary Acquisition & Use | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 4  | 50.00% | 4
4     | 3      | Reading Informational Text   | 17  | 85.00%  | 1 | 3 | 15.00% | 20 | 8  | 40.00% | 20
4     | 4      | Reading Literary Text        | 20  | 76.92%  | 2 | 6 | 23.08% | 26 | 11 | 42.31% | 14
4     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 38
5     | 1      | Vocabulary Acquisition & Use | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 4  | 50.00% | 4
5     | 3      | Reading Informational Text   | 16  | 72.73%  | 2 | 6 | 27.27% | 22 | 8  | 36.36% | 14
5     | 4      | Reading Literary Text        | 21  | 87.50%  | 1 | 3 | 12.50% | 24 | 11 | 45.83% | 20
5     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 38
6     | 1      | Vocabulary Acquisition & Use | 9   | 100.00% | 0 | 0 | 0.00%  | 9  | 4  | 44.44% | 4
6     | 3      | Reading Informational Text   | 15  | 71.43%  | 2 | 6 | 28.57% | 21 | 8  | 38.10% | 14
6     | 4      | Reading Literary Text        | 21  | 87.50%  | 1 | 3 | 12.50% | 24 | 10 | 41.67% | 20
6     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 22 | 40.74% | 38


Table 1. DC CAS 2012 Operational Test Form Blueprints: Reading (continued)

Grade | Strand | Content Strand               | MC Items/Points | % of MC Points | CR Items | CR Points | % of CR Points | Total Points | Anchor Items | Anchor % of Points | Field Test Items
7     | 1      | Vocabulary Acquisition & Use | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 5  | 62.50% | 4
7     | 3      | Reading Informational Text   | 18  | 75.00%  | 2 | 6 | 25.00% | 24 | 9  | 37.50% | 14
7     | 4      | Reading Literary Text        | 19  | 86.36%  | 1 | 3 | 13.64% | 22 | 9  | 40.91% | 20
7     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 38
8     | 1      | Vocabulary Acquisition & Use | 7   | 100.00% | 0 | 0 | 0.00%  | 7  | 5  | 71.43% | 4
8     | 3      | Reading Informational Text   | 20  | 76.92%  | 2 | 6 | 23.08% | 26 | 10 | 38.46% | 18
8     | 4      | Reading Literary Text        | 18  | 85.71%  | 1 | 3 | 14.29% | 21 | 8  | 38.10% | 18
8     |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 40
9     | 1      | Vocabulary Acquisition & Use | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 4  | 50.00% | 4
9     | 3      | Reading Informational Text   | 21  | 80.77%  | 2 | 5 | 19.23% | 26 | 11 | 42.31% | 18
9     | 4      | Reading Literary Text        | 16  | 84.21%  | 1 | 3 | 15.79% | 19 | 8  | 42.11% | 18
9     |        | Total                        | 45  | 84.91%  | 3 | 8 | 15.09% | 53 | 23 | 43.40% | 40
10    | 1      | Vocabulary Acquisition & Use | 9   | 100.00% | 0 | 0 | 0.00%  | 9  | 5  | 55.56% | 4
10    | 3      | Reading Informational Text   | 18  | 75.00%  | 2 | 6 | 25.00% | 24 | 8  | 33.33% | 18
10    | 4      | Reading Literary Text        | 18  | 85.71%  | 1 | 3 | 14.29% | 21 | 10 | 47.62% | 18
10    |        | Total                        | 45  | 83.33%  | 3 | 9 | 16.67% | 54 | 23 | 42.59% | 40


Table 2. DC CAS 2012 Operational Test Form Blueprints: Mathematics

Grade | Strand | Content Strand                           | MC Items/Points | % of MC Points | CR Items | CR Points | % of CR Points | Total Points | Anchor Items | Anchor % of Points | Field Test Items
2     | 1      | Operations & Algebraic Thinking          | 8   | 80.00%  | 1 | 2 | 20.00% | 10 | -  | -      | 7
2     | 2      | Numbers & Operations Base Ten            | 11  | 100.00% | 0 | 0 | 0.00%  | 11 | -  | -      | 12
2     | 3      | Geometry                                 | 7   | 100.00% | 0 | 0 | 0.00%  | 7  | -  | -      | 3
2     | 4      | Measurement and Data                     | 12  | 85.71%  | 1 | 2 | 14.29% | 14 | -  | -      | 14
2     |        | Total                                    | 38  | 90.48%  | 2 | 4 | 9.52%  | 42 | -  | -      | 29
3     | 1      | Number Sense & Operations                | 16  | 84.21%  | 1 | 3 | 15.79% | 19 | 9  | 47.37% | 9
3     | 2      | Patterns, Relations & Algebra            | 9   | 100.00% | 0 | 0 | 0.00%  | 9  | 4  | 44.44% | 6
3     | 3      | Geometry                                 | 4   | 57.14%  | 1 | 3 | 42.86% | 7  | 2  | 28.57% | 7
3     | 4      | Measurement                              | 12  | 100.00% | 0 | 0 | 0.00%  | 12 | 4  | 33.33% | 6
3     | 5      | Data Analysis, Statistics & Probability  | 10  | 76.92%  | 1 | 3 | 23.08% | 13 | 6  | 46.15% | 4
3     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 25 | 41.67% | 32
4     | 1      | Number Sense & Operations                | 23  | 100.00% | 0 | 0 | 0.00%  | 23 | 11 | 47.83% | 10
4     | 2      | Patterns, Relations & Algebra            | 7   | 70.00%  | 1 | 3 | 30.00% | 10 | 6  | 60.00% | 6
4     | 3      | Geometry                                 | 4   | 57.14%  | 1 | 3 | 42.86% | 7  | 2  | 28.57% | 4
4     | 4      | Measurement                              | 7   | 100.00% | 0 | 0 | 0.00%  | 7  | 2  | 28.57% | 9
4     | 5      | Data Analysis, Statistics & Probability  | 10  | 76.92%  | 1 | 3 | 23.08% | 13 | 4  | 30.77% | 3
4     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 25 | 41.67% | 32
5     | 1      | Number Sense & Operations                | 20  | 100.00% | 0 | 0 | 0.00%  | 20 | 10 | 50.00% | 10
5     | 2      | Patterns, Relations & Algebra            | 10  | 76.92%  | 1 | 3 | 23.08% | 13 | 6  | 46.15% | 6
5     | 3      | Geometry                                 | 6   | 66.67%  | 1 | 3 | 33.33% | 9  | 3  | 33.33% | 4
5     | 4      | Measurement                              | 9   | 100.00% | 0 | 0 | 0.00%  | 9  | 2  | 22.22% | 8
5     | 5      | Data Analysis, Statistics & Probability  | 6   | 66.67%  | 1 | 3 | 33.33% | 9  | 3  | 33.33% | 4
5     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 24 | 40.00% | 32


Table 2. DC CAS 2012 Operational Test Form Blueprints: Mathematics (continued)

Grade | Strand | Content Strand                           | MC Items/Points | % of MC Points | CR Items | CR Points | % of CR Points | Total Points | Anchor Items | Anchor % of Points | Field Test Items
6     | 1      | Number Sense & Operations                | 15  | 83.33%  | 1 | 3 | 16.67% | 18 | 8  | 44.44% | 10
6     | 2      | Patterns, Relations & Algebra            | 13  | 81.25%  | 1 | 3 | 18.75% | 16 | 5  | 31.25% | 7
6     | 3      | Geometry                                 | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 5  | 62.50% | 6
6     | 4      | Measurement                              | 5   | 62.50%  | 1 | 3 | 37.50% | 8  | 2  | 25.00% | 5
6     | 5      | Data Analysis, Statistics & Probability  | 10  | 100.00% | 0 | 0 | 0.00%  | 10 | 4  | 40.00% | 4
6     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 24 | 40.00% | 32
7     | 1      | Number Sense & Operations                | 16  | 84.21%  | 1 | 3 | 15.79% | 19 | 8  | 42.11% | 10
7     | 2      | Patterns, Relations & Algebra            | 12  | 80.00%  | 1 | 3 | 20.00% | 15 | 6  | 40.00% | 7
7     | 3      | Geometry                                 | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 2  | 25.00% | 9
7     | 4      | Measurement                              | 7   | 70.00%  | 1 | 3 | 30.00% | 10 | 2  | 20.00% | 3
7     | 5      | Data Analysis, Statistics & Probability  | 8   | 100.00% | 0 | 0 | 0.00%  | 8  | 5  | 62.50% | 3
7     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 23 | 38.33% | 32
8     | 1      | Number Sense & Operations                | 16  | 100.00% | 0 | 0 | 0.00%  | 16 | 8  | 50.00% | 6
8     | 2      | Patterns, Relations & Algebra            | 18  | 85.71%  | 1 | 3 | 14.29% | 21 | 8  | 38.10% | 12
8     | 3      | Geometry                                 | 5   | 62.50%  | 1 | 3 | 37.50% | 8  | 4  | 50.00% | 4
8     | 4      | Measurement                              | 3   | 50.00%  | 1 | 3 | 50.00% | 6  | 1  | 16.67% | 5
8     | 5      | Data Analysis, Statistics & Probability  | 9   | 100.00% | 0 | 0 | 0.00%  | 9  | 4  | 44.44% | 5
8     |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 25 | 41.67% | 32
10    | 1      | Number Sense & Operations                | 11  | 100.00% | 0 | 0 | 0.00%  | 11 | 3  | 27.27% | 5
10    | 2      | Patterns, Relations & Algebra            | 18  | 85.71%  | 1 | 3 | 14.29% | 21 | 11 | 52.38% | 10
10    | 3      | Geometry                                 | 6   | 66.67%  | 1 | 3 | 33.33% | 9  | 2  | 22.22% | 9
10    | 4      | Measurement                              | 7   | 100.00% | 0 | 0 | 0.00%  | 7  | 4  | 57.14% | 4
10    | 5      | Data Analysis, Statistics & Probability  | 9   | 75.00%  | 1 | 3 | 25.00% | 12 | 5  | 41.67% | 3
10    |        | Total                                    | 51  | 85.00%  | 3 | 9 | 15.00% | 60 | 25 | 41.67% | 31


Table 3. DC CAS 2012 Operational Test Form Blueprints: Science/Biology

Grade       | Strand | Content Strand                  | MC Items/Points | % of MC Points | CR Items | CR Points | % of CR Points | Total Points
5           | 1      | Science and Technology          | 14  | 87.50%  | 1 | 2 | 12.50% | 16
5           | 2      | Earth and Space Science         | 12  | 85.71%  | 1 | 2 | 14.29% | 14
5           | 3      | Physical Science                | 10  | 100.00% | 0 | 0 | 0.00%  | 10
5           | 4      | Life Science                    | 11  | 84.62%  | 1 | 2 | 15.38% | 13
5           |        | Total                           | 47  | 88.68%  | 3 | 6 | 11.32% | 53
8           | 1      | Scientific Thinking and Inquiry | 6   | 75.00%  | 1 | 2 | 25.00% | 8
8           | 2      | Matter and Reactions            | 21  | 91.30%  | 1 | 2 | 8.70%  | 23
8           | 3      | Forces                          | 8   | 80.00%  | 1 | 2 | 20.00% | 10
8           | 4      | Energy and Waves                | 12  | 100.00% | 0 | 0 | 0.00%  | 12
8           |        | Total                           | 47  | 88.68%  | 3 | 6 | 11.32% | 53
High School | 1      | Cell Biology & Biochemistry     | 13  | 86.67%  | 1 | 2 | 13.33% | 15
High School | 2      | Genetics and Evolution          | 15  | 100.00% | 0 | 0 | 0.00%  | 15
High School | 3      | Multicellular Organisms         | 10  | 83.33%  | 1 | 2 | 16.67% | 12
High School | 4      | Ecosystems                      | 9   | 81.82%  | 1 | 2 | 18.18% | 11
High School |        | Total                           | 47  | 88.68%  | 3 | 6 | 11.32% | 53


Table 4. DC CAS 2012 Operational Test Form Blueprints: Composition

Grade    | Scoring Rubric                     | CR Items | CR Points | Contribution to Overall Scale Score (Points) | Contribution to Overall Scale Score (%)
4, 7, 10 | Writing Topic Development          | 4 | 6  | 6  | 60.00%
4, 7, 10 | Writing Language Conventions       | 4 | 4  | 4  | 40.00%
4, 7, 10 | Understanding Literary Text*       | 2 | 4  | -  | -
4, 7, 10 | Understanding Informational Text*  | 2 | 4  | -  | -
4, 7, 10 | Total Possible Points              | - | 14 | 10 | 100.00%

* Understanding Literary or Informational Text Rubric was considered as a field test rubric and did not contribute to students' overall scores.


Section 3. Test Administration Guidelines and Requirements
This section contains information relevant to the Standards and Assessments Peer Review
Guidance, Critical Elements 4.3, 4.5, and 6.2:
4.3
Has the State ensured that its assessment system is fair and accessible to all students, including
students with disabilities and students with limited English proficiency, with respect to each of
the following issues:
(a) Has the State ensured that the assessments provide an appropriate variety of accommodations
for students with disabilities? and
(b) Has the State ensured that the assessments provide an appropriate variety of linguistic
accommodations for students with limited English proficiency?
4.5
Has the State established clear criteria for the administration, scoring, analysis, and reporting
components of its assessment system, including all alternate assessments, and does the State
have a system for monitoring and improving the on-going quality of its assessment system?
6.2
1. What guidelines does the State have in place for including all students with disabilities in the
assessment system?
(a) Has the State developed, disseminated information on, and promoted use of appropriate
accommodations to increase the number of students with disabilities who are tested against
academic achievement standards for the grade in which they are enrolled?
(b) Has the State ensured that general and special education teachers and other appropriate staff
know how to administer assessments, including making use of accommodations, for students
with disabilities and students covered under Section 504?

Overview
Administration of the DC CAS assessments each spring is managed by the Office of the State
Superintendent of Education (OSSE), coordinated in each school by a Test Chairperson, and
conducted by classroom teachers. Assessment office staff trained school Test Chairpersons on
test administration guidelines and requirements using the 2012 Test Chairperson's Manual. Test
Chairpersons, in turn, trained all Test Administrators and proctors. Test Administrators
administered all DC CAS assessments according to requirements and steps in the Test
Directions.
The Test Chairperson's Manual directs Test Chairpersons to follow the procedures for training
Test Administrators and proctors on required procedures for administering each test and
maintaining test security before, during, and after test administrations. It also provides
information on available accommodations for students with disabilities and English language
learners.


The Test Directions covers similar topics and requirements. In addition, it provides instructions
on scheduling test administrations, preparing students for the test administration, using
standardized testing procedures, and verbatim instructions for administering each test to students.
It also provides information on available accommodations for students with disabilities and for
English language learners.

Guidelines and Requirements for Administering DC CAS
The Test Chairperson's Manual indicates that DC CAS administrations should be scheduled to
ensure that all students have adequate time to respond to all test items under unhurried
conditions. It also describes testing condition requirements to ensure that students can feel as
comfortable as possible and are not distracted during administration. The manual requires each
Test Chairperson to complete a Test Site Observation Report to ensure that adequate testing
conditions can be provided. It also contains instructions on distributing test materials to Test
Administrators, retrieving the materials, accounting for 100% of all secure materials, shipping
the materials to CTB for processing, and maintaining security of the materials at all times and
throughout the entire process.
The Test Chairperson's Manual and Test Directions provide information on available test
administration accommodations for students with disabilities and for English language learners.
They specify approved accommodations that maintain standard testing conditions (e.g., reading
only Mathematics, Science, or Health questions or Composition writing prompts to examinees)
and identify accommodations that are considered modifications to the test that will result in
invalidated test scores (e.g., assisted reading of Reading passages).
The Test Chairperson's Manual and Test Directions specify accommodations approved for
students with disabilities in the following areas: timing/scheduling (e.g., providing breaks
between prescribed timing sections of the tests), setting (e.g., individual and small group
administrations), presentation (e.g., reading of [only] Mathematics, Science, or Health test
questions or Composition writing prompts), and response accommodations (e.g., dictating
responses). The Test Chairperson's Manual and Test Directions specify accommodations
approved for English language learners; they are in the following areas: direct linguistic
support-oral, direct linguistic support-written, and indirect linguistic support. Both manuals
indicate that Test Administrators must record on the student's answer document all test
administration accommodations that are provided.
CTB provides test administration sessions for school Test Chairpersons in the month prior to test
administration. School Test Chairpersons are required to conduct training sessions, and all school
staff who will handle test materials must attend these sessions. School Test Chairpersons are
explicitly required in the Test Chairperson's Manual to oversee the test administrations in their
schools. They are required to ensure that test materials are available in adequate numbers and
that school staff adhere to test security requirements, track materials by using security checklists,
report breaches if they occur, document disruptions during testing, sign test materials in and out
each day, account for 100% of secure test materials, and report missing or damaged materials
immediately to CTB Customer Service and OSSE by completing the online Security Exceptions
Survey.


Materials Orders, Delivery, and Retrieval
Customer orders were managed in CTB's Online Enrollment System. Schools updated and
validated their enrollments or indicated non-participation. CTB used the results for order
fulfillment.
Prior to shipment of materials, barcodes were applied to the secure materials for the purpose of
secure inventory tracking (a description of the Secure Inventory process is provided next in this
section). Corresponding security checklists were also produced. Daily tracking reports were
provided to OSSE for the purpose of monitoring the deliveries.
The appropriate district and school staff were previously trained to maintain security and monitor
quantities of materials. Shortly after delivery, they unpacked and reviewed materials to ensure
readiness for administration, as described in the previous section of this report, "Guidelines and
Requirements for Administering DC CAS." In the event that the materials received were not
sufficient for administration, a short/add window allowed CTB Customer Service to process
requests for additional materials while maintaining a secure inventory.
After the test administration was complete, the materials were packaged for retrieval and picked
up according to a verified schedule. Daily tracking reports also served for OSSE to monitor
retrievals. When the materials were back in CTB's custody, all books with security barcodes
were accounted for as described in the following section of this report, "Secure Inventory."

Secure Inventory
To further support the full range of test security requirements for DC CAS, CTB has instituted a
comprehensive Test Security/Test Inventory System. This system was created using industry best
practices. Upon request, CTB further customized a security model to precisely match the needs
of DC CAS security requirements. This security model for the DC CAS assessment maintains its
own list of material deliverables and services, from assessment barcoding to inventory checking
and shipment tracking, as described in the steps below.
1. Secure materials are barcoded at the printer, vertically banded, and inventoried. Barcode
files are sent to CTB. Packing lists and test materials are sent to the schools.
2. Materials are distributed into the schools.
3. Following the test administration, school staff members separate secure and non-secure
materials and package them for return to CTB following Test Chairperson's Manual
instructions.
4. The dedicated/secure carrier contacts the schools to schedule retrieval of their materials
on a specified date.
5. Scorable secure documents are accounted for during answer document scanning, and
nonscorable secure documents are scanned into an inventory return system. Materials
sent to the wrong CTB facility are forwarded to the appropriate site, as needed.
6. Missing Materials Reports are sent to OSSE for resolution once scanning is completed.
Missing inventory is computed as the list of shipped security barcodes minus the
barcodes already scanned as received (see the sketch after this list).
7. OSSE contacts schools and reports back to CTB on findings, including additional books
that have been located, contaminated books that could not be returned to CTB, and
damaged or destroyed books where no barcode was available for scanning.


8. CTB processes additional, received inventory and approved exceptions, and produces a
final missing inventory report.
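
The reconciliation in step 6 is, in effect, a set difference between shipped and received
security barcodes; a minimal sketch (with hypothetical barcode values) follows:

```python
# Sketch of the step 6 reconciliation: missing inventory is the set of shipped
# security barcodes minus the barcodes scanned as received. Barcode values
# are hypothetical.
shipped = {"DC120001", "DC120002", "DC120003", "DC120004", "DC120005"}
received = {"DC120001", "DC120003", "DC120004"}

missing = sorted(shipped - received)  # set difference
print(missing)  # ['DC120002', 'DC120005'] -> reported to OSSE for resolution
```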
As of September 20, 2012, approximately 99.68% of secure materials were accounted for; 212
secure test books were missing for the 2012 administration, compared with 103 test books
missing in 2011.

Section 4. Student Participation
This section contains information relevant to Standards and Assessments Peer Review Guidance,
Critical Elements 6.1 and 6.2:
6.1
1. Do the State's participation data indicate that all students in the tested grade levels or grade
ranges are included in the assessment system (e.g., students with disabilities, students with
limited English proficiency, economically disadvantaged students, race/ethnicity, migrant
students, homeless students, etc.)?
2. Does the State report separately the number and percent of students with disabilities assessed
on the regular assessment without accommodations, on the regular assessment with
accommodations, on an alternate assessment against grade level standards, and, if applicable, on
an alternate assessment against alternate achievement standards and/or on an alternate
assessment against modified academic achievement standards?
6.2
1. What guidelines does the State have in place for including all students with disabilities in the
assessment system?
(a) Has the State developed, disseminated information on, and promoted use of appropriate
accommodations to increase the number of students with disabilities who are tested against
academic achievement standards for the grade in which they are enrolled?

Tests Administered
All public schools in the District of Columbia administered the DC CAS tests between April 17
and April 27, 2012.
The tests administered were:
Reading, Grades 2-10
Mathematics, Grades 2-8 and 10
Composition, Grades 4, 7, and 10
Science, Grades 5 and 8
Biology, for those students in Grades 8-12 who were enrolled in a high school Biology
course

Participation in DC CAS
The DC CAS Test Chairperson's Manual states that all students enrolled in all public schools in
the District of Columbia must participate in DC CAS grade level test administrations, with one
exception: A student with significant cognitive disabilities whose Individualized Education

Program (IEP) indicates that the student meets OSSE's established criteria may participate in the
DC CAS alternate assessment portfolio.
Approximately 4,500 students were assessed in Reading and Mathematics at each tested grade,
with slightly fewer in each tested grade in Composition and Science/Biology. Only students with
a valid test administration as required by the type of analysis, as defined below, are included in
the reports.

Definition of Valid Test Administration
In this technical report, two sets of rules are used to define a valid test administration. The first
set of rules is for psychometric analyses included in this report (e.g., reliability, DIF, item
parameter calibration, and equating). Answer documents are excluded when any of the following
conditions are observed:
Three or more of the first five items are invalidly marked or omitted.
The operational test total raw score equals zero and the sum of the operational and field
test item valid responses is less than 5.
All operational and field test items are omitted.
The second set of valid test administration rules, sketched in code below, is for analyses
summarizing test performance (e.g., overall numbers of examinees, descriptive statistics, and
correlations of test scores). All students who have a valid test score, as defined in the DC CAS
Spring 2012 Business Requirements, are included in these analyses. For the Reading,
Mathematics, Science, and Biology assessments, the requirements document defines a valid
attempt on the test as:
At least one item marked with a correct response OR
At least 5 items validly marked in the content area
And for Composition, a valid attempt is defined as:
A score of non-zero on both parts of the item
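For illustration only, both rule sets can be expressed as simple predicates. This sketch is not
CTB's production logic; the response codes and function names are assumptions.

```python
OMIT, INVALID = "O", "I"   # hypothetical codes for omitted/invalid marks

def excluded_from_psychometrics(op, ft, op_raw_score):
    """First rule set: True if an answer document is excluded from the
    reliability/DIF/calibration/equating analyses."""
    bad_first_five = sum(r in (OMIT, INVALID) for r in op[:5])
    n_valid = sum(r not in (OMIT, INVALID) for r in op + ft)
    all_omitted = all(r == OMIT for r in op + ft)
    return (bad_first_five >= 3
            or (op_raw_score == 0 and n_valid < 5)
            or all_omitted)

def valid_attempt(item_scores, responses):
    """Second rule set (Reading/Mathematics/Science/Biology): at least one
    correct response OR at least 5 validly marked items."""
    n_valid = sum(r not in (OMIT, INVALID) for r in responses)
    return any(s > 0 for s in item_scores) or n_valid >= 5
```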
Note: To maintain confidentiality of individual student results, this report does not show
subgroup results for fewer than 25 students. The race/ethnicity subgroups Native
Hawaiian/Pacific Islander and American Indian/Alaska Native contain fewer than 25 students
per grade and are not shown in the following tables.
The total number and percent of students with valid tests are provided overall and for the gender
and race/ethnicity subgroups in Table 5. Participation rates for students in special populations,
such as special education, Title I, English Language Learners, and students with Section 504
plans, are provided in Table 6. ELL students who participated in the DC CAS were classified by
their schools into one of four language proficiency levels. These levels are related to the levels of
language instruction services and participation in school instruction; the number and percent of
students at each level are provided in Table 7.

Special Accommodations
Students with disabilities and ELLs who participate in DC CAS grade level administrations may
be provided approved test administration accommodations that are specified by special education
IEP teams, Section 504 teams, or ELL teams. Test administration accommodations are
categorized into one or more of four categories: timing/scheduling, setting, presentation, and
response. For a student to receive an accommodation, the accommodation had to be in place
during the school year and specified in the student's IEP or 504 plan. Within prescribed
parameters, students in ELL programs received test administration accommodations in one or
more of three categories: direct linguistic support-oral, direct linguistic support-written, and
indirect linguistic support. The number and percent of the various accommodations documented
are provided in Tables 6-10. For more information on these accommodations, please refer to the
DC CAS Test Chairperson's Manual.

Table 5. Number and Percent of Examinees with Valid Test Administrations on the 2012 DC CAS in
Reading, Mathematics, Science/Biology, or Composition

             Students                                        African
             with Test  Males          Females              American        Asian        Hispanic       White
Grade        Scores     N      %       N      %             N      %        N    %       N     %        N    %

Reading
2            4,491      2,274  50.63%  2,194  48.85%        3,216  71.61%   95   2.12%   626   13.94%   521  11.60%
3            4,754      2,402  50.53%  2,334  49.10%        3,475  73.10%   94   1.98%   665   13.99%   479  10.08%
4            4,589      2,317  50.49%  2,253  49.10%        3,357  73.15%   102  2.22%   632   13.77%   461  10.05%
5            4,744      2,402  50.63%  2,326  49.03%        3,694  77.87%   78   1.64%   578   12.18%   366  7.72%
6            4,545      2,297  50.54%  2,222  48.89%        3,596  79.12%   68   1.50%   566   12.45%   268  5.90%
7            4,301      2,160  50.22%  2,126  49.43%        3,458  80.40%   55   1.28%   508   11.81%   240  5.58%
8            4,359      2,172  49.83%  2,163  49.62%        3,545  81.33%   55   1.26%   476   10.92%   236  5.41%
9            4,164      2,031  48.78%  2,061  49.50%        3,296  79.15%   77   1.85%   489   11.74%   197  4.73%
10           4,272      2,039  47.73%  2,186  51.17%        3,559  83.31%   64   1.50%   445   10.42%   153  3.58%

Mathematics
2            4,514      2,284  50.60%  2,205  48.85%        3,224  71.42%   100  2.22%   632   14.00%   523  11.59%
3            4,781      2,418  50.58%  2,344  49.03%        3,486  72.91%   97   2.03%   675   14.12%   482  10.08%
4            4,603      2,320  50.40%  2,264  49.19%        3,360  73.00%   104  2.26%   638   13.86%   464  10.08%
5            4,759      2,415  50.75%  2,328  48.92%        3,692  77.58%   81   1.70%   587   12.33%   368  7.73%
6            4,567      2,304  50.45%  2,236  48.96%        3,608  79.00%   69   1.51%   573   12.55%   269  5.89%
7            4,325      2,161  49.97%  2,148  49.66%        3,463  80.07%   55   1.27%   527   12.18%   240  5.55%
8            4,381      2,179  49.74%  2,178  49.71%        3,541  80.83%   58   1.32%   497   11.34%   236  5.39%
10           4,245      2,027  47.75%  2,173  51.19%        3,533  83.23%   64   1.51%   445   10.48%   152  3.58%

Science/Biology
5            4,707      2,381  50.58%  2,299  48.84%        3,641  77.35%   79   1.68%   588   12.49%   366  7.78%
8            4,263      2,096  49.17%  2,122  49.78%        3,426  80.37%   57   1.34%   493   11.56%   233  5.47%
High School  3,715      1,744  46.94%  1,890  50.87%        2,968  79.89%   69   1.86%   396   10.66%   197  5.30%

Composition
4            4,470      2,236  50.02%  2,206  49.35%        3,244  72.57%   103  2.30%   622   13.91%   456  10.20%
7            4,146      2,049  49.42%  2,062  49.73%        3,304  79.69%   55   1.33%   498   12.01%   230  5.55%
10           3,511      1,638  46.65%  1,830  52.12%        2,886  82.20%   58   1.65%   388   11.05%   140  3.99%

Table 6. Number and Percent of Students in Special Programs with Test Scores on the 2012 DC CAS in
Reading, Mathematics, Science/Biology, or Composition

             Students               English
             with Test  Special     Language     Title I      Home
             Scores     Education   Learner      Targeted     Schooling    Section 504
Grade                   N     %     N     %      N     %      N    %       N    %

Reading and/or Mathematics
2            4,518      304   7%    432   10%    309   7%     1    0%      27   1%
3            4,783      458   10%   392   8%     268   6%     1    0%      33   1%
4            4,603      514   11%   259   6%     265   6%     2    0%      32   1%
5            4,763      669   14%   233   5%     238   5%     0    0%      35   1%
6            4,572      675   15%   244   5%     51    1%     1    0%      36   1%
7            4,332      662   15%   237   5%     136   3%     2    0%      40   1%
8            4,394      636   14%   250   6%     141   3%     2    0%      27   1%
9            4,164      568   14%   242   6%     170   4%     0    0%      10   0%
10           4,282      699   16%   182   4%     444   10%    0    0%      7    0%

Science/Biology
5            4,707      570   12%   208   4%     237   5%     0    0%      33   1%
8            4,263      475   11%   241   6%     134   3%     2    0%      25   1%
High School  3,715      404   11%   140   4%     149   4%     0    0%      7    0%

Composition
4            4,470      422   9%    221   5%     248   6%     1    0%      29   1%
7            4,146      512   12%   186   4%     124   3%     2    0%      31   1%
10           3,511      459   13%   151   4%     189   5%     0    0%      7    0%

Note: Students who participated in more than one test administration are counted only once.
Student subgroups are indicated in the Program Participation section on the biogrid on each
student's answer document.

Table 7. Number and Percent of Students Coded for ELL Access for Proficiency Levels 1-4 in
Reading, Mathematics, Science/Biology, or Composition

             Students
             with Test  Level 1      Level 2      Level 3      Level 4
Grade        Scores     N     %      N     %      N     %      N     %

Reading and/or Mathematics
2            4,518      48    1%     93    2%     189   4%     153   3%
3            4,783      28    1%     49    1%     178   4%     202   4%
4            4,603      20    0%     23    0%     74    2%     183   4%
5            4,763      23    0%     30    1%     65    1%     130   3%
6            4,572      33    1%     48    1%     83    2%     99    2%
7            4,332      42    1%     32    1%     82    2%     75    2%
8            4,394      43    1%     47    1%     78    2%     82    2%
9            4,164      62    1%     67    2%     57    1%     40    1%
10           4,282      5     0%     21    0%     78    2%     83    2%

Science/Biology
5            4,707      19    0%     25    1%     59    1%     121   3%
8            4,263      39    1%     45    1%     77    2%     80    2%
High School  3,715      21    1%     50    1%     46    1%     38    1%

Composition
4            4,470      10    0%     20    0%     60    1%     149   3%
7            4,146      16    0%     20    0%     74    2%     76    2%
10           3,511      6     0%     24    1%     65    2%     67    2%

Table 8. Number and Percent of Students Receiving One or More English Language Learner Test
Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition

             Students   Direct        Direct
             with Test  Linguistic    Linguistic    Indirect
             Scores     Support--     Support--     Linguistic
                        Oral(1)       Written       Support       Other
Grade                   N     %       N     %       N     %       N     %

Reading and/or Mathematics
2            4,518      374   8%      200   4%      392   9%      11    0%
3            4,783      375   8%      192   4%      389   8%      3     0%
4            4,603      214   5%      120   3%      226   5%      2     0%
5            4,763      193   4%      144   3%      203   4%      1     0%
6            4,572      195   4%      135   3%      199   4%      1     0%
7            4,332      149   3%      127   3%      171   4%      3     0%
8            4,394      195   4%      146   3%      210   5%      1     0%
9            4,164      208   5%      94    2%      204   5%      1     0%
10           4,282      166   4%      141   3%      171   4%      10    0%

Science/Biology
5            4,707      181   4%      144   3%      187   4%      0     0%
8            4,263      189   4%      144   3%      204   5%      0     0%
High School  3,715      127   3%      130   3%      130   3%      10    0%

Composition
4            4,470      199   4%      105   2%      202   5%      1     0%
7            4,146      109   3%      35    1%      134   3%      0     0%
10           3,511      146   4%      50    1%      145   4%      7     0%

Note: Students who received more than one accommodation in a single content area test can be
counted in multiple columns. Students who received accommodations in more than one content area
test administration are counted only once. Students for whom the ELL bubble was not completed but
who did receive these ELL test administration accommodations are counted here.
(1) The "Oral Reading of Test in English" accommodation is typically not permitted for the
Reading test.

Table 9. Number and Percent of Students Receiving One or More Special Education Test
Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition

             Students   Timing/
             with Test  Scheduling    Setting       Presentation(1)  Response      Other
Grade        Scores     N     %       N     %       N     %          N     %       N     %

Reading/Mathematics
2            4,518      352   8%      371   8%      352   8%         191   4%      21    0%
3            4,783      506   11%     487   10%     476   10%        332   7%      10    0%
4            4,603      594   13%     598   13%     561   12%        393   9%      25    1%
5            4,763      718   15%     714   15%     669   14%        434   9%      21    0%
6            4,572      714   16%     717   16%     677   15%        508   11%     9     0%
7            4,332      701   16%     699   16%     658   15%        556   13%     13    0%
8            4,394      669   15%     673   15%     648   15%        591   13%     6     0%
9            4,164      539   13%     543   13%     420   10%        276   7%      11    0%
10           4,282      646   15%     660   15%     503   12%        603   14%     15    0%

Science/Biology
5            4,707      648   14%     649   14%     608   13%        354   8%      23    0%
8            4,263      613   14%     612   14%     577   14%        468   11%     5     0%
High School  3,715      388   10%     414   11%     297   8%         220   6%      5     0%

Composition
4            4,470      484   11%     497   11%     456   10%        256   6%      24    1%
7            4,146      557   13%     562   14%     529   13%        378   9%      13    0%
10           3,511      389   11%     430   12%     300   9%         243   7%      6     0%

Note: Students who received more than one accommodation in a single content area test can be
counted in multiple columns. Students who received accommodations in more than one content area
test administration are counted only once. Students for whom the Special Education bubble was not
completed and who did receive these Special Education test administration accommodations are
counted here.
(1) The "Presentation" column contains 10 accommodations, two of which ("Reading Test Questions"
and "Translation of Words or Phrases") are typically not permitted for Reading assessments and
are available for Mathematics, Science/Biology, and Composition only.

Table 10. Number and Percent of Students Receiving One or More Selected Special Education Test
Administration Accommodations in Reading, Mathematics, Science/Biology, or Composition

                        Small Group                     Read or Translate
             Students   and Individual                  Test Questions        Responses
             with Test  Administrations  Breaks         (MA, SC, WR only)(1)  Dictated
Grade        Scores     N     %          N     %        N     %               N     %

Reading and/or Mathematics
2            4,518      303   7%         361   8%       267   6%              70    2%
3            4,783      447   9%         478   10%      374   8%              103   2%
4            4,603      508   11%        584   13%      460   10%             130   3%
5            4,763      623   13%        702   15%      560   12%             104   2%
6            4,572      618   14%        704   15%      535   12%             70    2%
7            4,332      584   13%        690   16%      502   12%             77    2%
8            4,394      563   13%        663   15%      486   11%             47    1%
9            4,164      449   11%        518   12%      111   3%              36    1%
10           4,282      539   13%        637   15%      211   5%              46    1%

Science/Biology
5            4,707      569   12%        637   14%      500   11%             88    2%
8            4,263      503   12%        605   14%      439   10%             52    1%
High School  3,715      310   8%         392   11%      152   4%              29    1%

Composition
4            4,470      411   9%         484   11%      368   8%              97    2%
7            4,146      442   11%        553   13%      416   10%             58    1%
10           3,511      324   9%         417   12%      149   4%              31    1%

Note: Students who received more than one accommodation in a single content area test can be
counted in multiple columns. Students who received accommodations in more than one content area
test administration are counted only once. Accommodations are recorded by Test Administrators in
the Accommodations section on the biogrid on each student's answer document.
(1) The "Reading Test Questions" and "Translation of Words or Phrases" accommodations are
typically not permitted for the Reading test.

Section 5. Scoring
This section contains information relevant to Standards and Assessments Peer Review Guidance,
Critical Element 4.5:
Has the State established clear criteria for the administration, scoring, analysis, and reporting
components of its assessment system, including all alternate assessments, and does the State
have a system for monitoring and improving the on-going quality of its assessment system?
Multiple choice items were scored by CTB using electronic scanning equipment. Constructed
response items were scored by human raters who were trained by CTB. Evidence of validity is
provided by the procedures for hand-scoring described below.

Selection of Scoring Raters
CTB/McGraw-Hill and Kelly Services Inc. strive to develop a highly qualified, experienced core
of raters so that the integrity of all projects is appropriately maintained.

Recruitment
CTB requires that all team leaders and raters possess a bachelor's degree or higher. Kelly
Services Inc. carefully screened all new applicants and required them to produce either a
transcript or a copy of the degree. Kelly Services Inc. also required a one- to two-hour
interview/screening process. Individuals who did not present proper documentation or had less
than desirable work records were eliminated during this process. Kelly Services Inc. verified that
100% of all potential raters met the degree requirement. All experienced raters and team leaders
had already successfully completed the screening process.

The Interview Process
All potential raters completed a pre-interview activity. For some parts of the
pre-interview activity, applicants were shown examples of test responses and were supplied with
a scoring guide. In a brief introduction, they became acquainted with the application of a rubric.
After the introduction, applicants applied the scoring guide to score the sample responses.
Each applicant's scores were used for discussion during the interview process to determine the
applicant's trainability, as well as his or her ability to understand and implement the standards set
forth in the sample scoring guide.
Kelly Services Inc. interviewed each applicant and determined the applicant's suitability for a
specific content area and grade level. Applicants with strong leadership skills were questioned
further to determine whether they were qualified to be team leaders.
When Kelly Services Inc. felt applicants were qualified, the applicants were recommended for
employment. All assignments were made according to availability and suitability. Before being
hired, all employees were required to read, agree to, and sign a nondisclosure agreement
outlining the CTB/McGraw-Hill business ethics and security procedures.

Training Material Development
Scoring guides for the 2012 constructed response items were written by CTB's Development
teams in conjunction with OSSE and, for Reading Grade 9, DCPS. Composition's Understanding
Literary Text and Understanding Informational Text rubric was added this year to the scoring
guides, and also underwent a rangefinding process in DC to identify anchor papers, which
represent the exemplars at each score point.
Prior to actual scoring, CTB supervisors studied and internalized the scoring guides along with
existing materials that were then used in training raters to hand-score the constructed response
items for all content areas. This ensured consistency in scoring the same items across
administrations (such as field test to operational), with the same anchor papers and training
philosophy.

Preparation and Meeting Logistics for Rangefinding
Rangefinding is the process of reviewing student responses to newly tested (field tested) items to
identify anchor or exemplar papers at each score point. The anchor papers are concrete examples
of particular score points and are delineated in the scoring guides used during training and
scoring. All DC CAS constructed response items go through this process prior to operational
scoring. For example, for the newly field tested Composition prompts (four in each of Grades 4,
7, and 10), an extensive rangefinding workshop was held in DC, from June 18, 2012, through
June 22, 2012, with discussion groups of three or four DC teachers per grade. These groups of
teachers chose the anchor papers to be used during subsequent rubric training.
In preparation for rangefinding, CTB content supervisors reviewed hundreds of student
responses to identify a variety of papers for the reviews. These potential anchors were then
assembled for review at rangefinding. During rangefinding, participants were placed in groups of
three or more (plus the CTB content supervisor/facilitator) to discuss a particular grade and
content area, and were involved in discussion of all field test items for that grade. Rubrics were
passed out and discussed so that all participants became familiar with the items and the criteria
that raters would use to score the student responses after rangefinding. DC participants, along
with their CTB facilitator, then reviewed packets containing approximately 35 to 50 responses
per item and applied the rubrics and scoring criteria in order to choose appropriate anchor papers.
This process effectively set the range of each score point for each item. At least one anchor paper
for each score point was chosen for every item, and discussion within each group included
insights, suggestions, and summary statements for future training on the item. These were
recorded by the CTB facilitator. The chosen anchor papers and their final scores were also
recorded by the CTB representative, and a DC participant provided sign-off that consensus on
the scoring of the items was achieved.

Training and Qualifying Procedures
Hand-scoring involves training and qualifying team leaders and raters, monitoring scoring
accuracy and production, and ensuring the security of both the test materials and the scoring
facilities. An explanation of the training and qualification procedures follows.
All raters were trained and qualified in specific rater item blocks (RIBs), each of which consisted
of a single item to be scored. Raters and team leaders were trained in the following steps:
Reviewing the student answer booklet
Reviewing rubrics
Reviewing anchor papers and training papers and answering questions arising from
the established scores
Explaining scoring strategies, followed by a question-and-answer period
Administering Qualifying Round 1
Reviewing Qualifying Round 1 established scores, and answering questions arising
from the scores
Administering Qualifying Round 2 (if necessary)
Explaining condition codes and sensitive paper procedures
Explaining nonstandard response or computer-generated response (nsr/cgr)
procedures
Explaining unscannable image procedures
All raters were trained and qualified using the same procedures and criteria used for the team
leaders, who had been trained previously. The CTB content experts who supervised the training
of the team leaders also supervised the training of the raters.

Breakdown of Scoring Teams
Groups of CTB content experts oversaw the training and scoring of the constructed response
items for 2012 in Reading, Mathematics, Science/Biology, and Composition. Each of the content
experts was responsible for training and scoring all of the items in his or her content area. Teams
of raters (whose numbers depended on the content and grade) trained on and scored all
the operational items at their respective grades, and some cross-training was done across grades
to ensure on-time completion.
Training and scoring of the operational constructed response items occurred May 8-18, 2012, for
Reading, Mathematics, and Science/Biology, and July 9-20, 2012, for Composition. Training
and scoring of the field test constructed response items occurred July 11-18, 2012. Training
consisted of a review of the rubrics, followed by analysis of the anchor papers for each item.
Raters then participated in qualifying rounds, which consisted of ten books of sample papers for
the item in a given RIB. Raters were given two opportunities to achieve acceptable qualification
ratings; those not meeting the minimum were dismissed.

Monitoring the Scoring Process
After training was completed and live scoring began, a number of quality control measures were
put in place to ensure that books were scored accurately and that raters remained consistent in
scoring accuracy.
Throughout the course of hand-scoring, calibration sets of pre-scored papers (checksets/validity
sets) were administered daily to each rater to monitor scoring accuracy and to maintain a
consistent focus on the established rubrics and guidelines. Approximately 6% of the books that
raters received were "checkset" papers rather than live books, where the checksets were "blind,"
or unknown to the rater. Raters whose checkset accuracy repeatedly dipped below the quality
standards were flagged and retrained. In addition to the checkset process, CTB's hand-scoring
protocol included the use of read-behinds (spot checks during live scoring). The read-behind was
another valuable rater-reliability monitoring technique that allowed a team leader to review a
rater's scored documents, providing feedback and counseling as appropriate. The CTB Data
Monitoring staff also ran inter-rater reliability reports throughout live scoring to look for any
raters who were struggling and in need of retraining. Retraining involved a one-on-one
discussion between the supervisor (or a team leader) and the rater, who discussed the problem
item(s) as well as the scoring guides and, if necessary, training papers. If the rater's accuracy on
checkset scores did not meet the quality standards after this retraining, the rater was dismissed
from the project immediately.
Approximately 10% of all DC CAS tests were scored by a second rater to establish inter-rater
reliability statistics for all constructed response items, results of which are provided in Section 8.
This procedure is called a "double-blind read" because the second rater does not know the first
rater's score.

Scoring Security
All raters had to sign nondisclosure forms indicating that they were not to disclose the items they
were scoring. Security guards were on-site whenever employees were present in the building. All
employees were issued identification badges and were required to wear them in plain view at all
times. Visitors and employees who forgot their badges were issued visitors' badges and were
required to wear them in plain view. All employees and visitors were subject to inspection of
their personal effects.

Section 6. Methods
This section contains information relevant to Standards and Assessments Peer Review Guidance,
Critical Elements 4.4, 4.5, 4.6, and 5.6.
4.4
When different test forms or formats are used, the State must ensure that the meaning and
interpretation of results are consistent.
(a) Has the State taken steps to ensure consistency of test forms over time?
4.5
Has the State established clear criteria for the administration, scoring, analysis, and reporting
components of its assessment system, including all alternate assessments, and does the State
have a system for monitoring and improving the on-going quality of its assessment system?
5.6
Assessment results must be expressed in terms of the achievement standards, not just scale
scores or percentiles.
This section describes the methods used to analyze the item and test level data for the DC CAS.
Results of the item and test level analyses described here are provided as evidence for reliability
and validity in Section 8.

Classical Item Level Analyses
Each operational test item was first reviewed in terms of classical raw score statistics. Each
item's frequency distribution (number of students responding for each answer choice or score
level) as well as each item's overall p value (proportion of students choosing the correct answer)
and point biserial or item-test correlation (how correlated each individual item is with the test as
a whole based on the correct response) were reviewed. Typically, p values should range between
0.30 and 0.90. Items with a p value less than 0.30 are considered more difficult since less than
30% of the students are getting the correct answer. Values greater than 0.90 indicate a fairly easy
item, with more than 90% of students getting the correct answer. With newly tested content, the
p value may dip lower than 0.30, at which point the item should be evaluated in light of the
newness of content or students' opportunity to learn the content. Point biserials or item-test
correlations are usually in the range of 0.30 and above, although some items can be acceptable
when as low as 0.15. The point biserials of each item's distractors, or incorrect responses, were
also analyzed; any distractor with a positive point biserial was reviewed for the possibility of
an additional correct response or no correct response.
It is also important to track the rate at which students do not respond to, or omit, items. Omitted
items receive a zero score. The rate of omission often provides some information about testing
time, or speededness, particularly if there is a high rate of items omitted at the end of a test
session. It also provides an indication of items that may simply be unclear or illogically
presented. When more than 5% of students omit an item, the item is reviewed by CTB Research and
Development and shared with OSSE.
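For illustration, the p value, point biserial, and omission-rate screens described above can be
computed as in the sketch below. This is not CTB's production code; the 0/1 scoring matrix layout
and function name are assumptions.

```python
import numpy as np

def classical_item_review(scored, omitted):
    """scored: students x items 0/1 matrix; omitted: same-shape boolean
    matrix of omissions. Returns p values, item-test point biserials,
    and the item indices flagged by the p value and omission screens."""
    x = np.asarray(scored, float)
    total = x.sum(axis=1)                       # total raw score per student
    p = x.mean(axis=0)                          # proportion correct per item
    pbis = np.array([np.corrcoef(x[:, j], total)[0, 1]
                     for j in range(x.shape[1])])
    omit_rate = np.asarray(omitted, float).mean(axis=0)
    return (p, pbis,
            np.flatnonzero((p < 0.30) | (p > 0.90)),   # p value screen
            np.flatnonzero(omit_rate > 0.05))          # >5% omission screen
```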

Item Bias Analyses
Differential item functioning (DIF) statistics provide a measure of the systematic errors by
subgroups that are specifically attributed to some bias or systematic over- or
under-representation of subgroup performance when compared with total group performance. To
evaluate potential bias, items are first reviewed from a content perspective. All items are
screened in Content and Bias Review meetings composed of DC educators to ensure that no
obviously sensitive terms, phrases, scenarios, or illustrations that could influence examinee
performance appear in the DC CAS items prior to field testing and selection for operational test
forms.
For the DC CAS program, CTB uses Mantel-Haenszel statistics (Mantel & Haenszel, 1959) to
evaluate DIF for both operational and field test items. The subgroups compared in the DIF
analyses for the 2012 administration reflect conventional subgroupings, and were based on
gender (male as reference and female as focal) and race/ethnicity (African American as reference,
and Asian, Hispanic, and White as focal). As with all statistical tests, Mantel-Haenszel DIF
statistics are subject to Type I and II errors. An item flagged for DIF may or may not provide an
unfair advantage or disadvantage for one examinee subgroup compared with another. However,
the flag does show when an item is more difficult for a particular focal subgroup of students than
would be expected based on their total test scores, when compared with the difficulty of the item
for the comparison or reference subgroup with equivalent total test scores. OSSE and CTB
screen all items that are flagged for DIF after each administration to identify items that may
favor or disadvantage examinee subgroups.
The statistical procedures and flagging criteria used by CTB to identify items that exhibit DIF
are those used by the Educational Testing Service (ETS) for the National Assessment of
Educational Progress (NAEP). For multiple choice items, the Mantel-Haenszel chi-square
($\chi^2_{MH}$) statistic (Mantel & Haenszel, 1959) was used to evaluate potential DIF in items.
In this procedure, items with A, B, and C level DIF are flagged using the following criteria:

B level DIF, where a "B" indicates DIF with an absolute value of the Mantel-Haenszel delta
($\Delta_{MH}$) that is significantly greater than zero (at the 0.05 level) and
$-1.5 \le \Delta_{MH} \le -1$ or $1 \le \Delta_{MH} \le 1.5$.

C level DIF, where a "C" indicates DIF with an absolute value of the Mantel-Haenszel delta
($\Delta_{MH}$) that is significantly greater than zero (at the 0.05 level) and $|\Delta_{MH}|$
exceeding 1.5.

For constructed response items, an effect size (ES) statistic based on the Mantel $\chi^2$ is
used to flag items for potential DIF. ES is obtained by dividing the standardized mean difference
(SMD) statistic by the standard deviation of the item. Items are flagged using the same rules
that are used in NAEP:

BB level, where the Mantel statistic is significant ($p < 0.05$) and $|ES|$ is between 0.17 and
0.25.

CC level, where the Mantel statistic is significant ($p < 0.05$) and $|ES| \ge 0.25$.

C- and CC-level flags indicate moderate to severe DIF. B- and BB-level flags indicate moderate
DIF. A-level flags indicate negligible DIF. (A detailed description of these procedures can be
found in Zwick, Donoghue, & Grima, 1993.)
Positive DIF values indicate items that favor the focal group, while negative values indicate
items that disadvantage the focal group.
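For illustration, a minimal computation of the Mantel-Haenszel common odds ratio and the ETS
delta metric is sketched below. The stratified-counts input format is an assumption, and the
significance test that accompanies the A/B/C rule is assumed to be performed elsewhere.

```python
import math

def mh_delta(strata):
    """strata: iterable of (ref_correct, ref_incorrect, focal_correct,
    focal_incorrect) counts, one tuple per matched total-score level.
    Returns the Mantel-Haenszel delta on the ETS scale; negative values
    indicate the item disadvantages the focal group."""
    num = den = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        if t:
            num += a * d / t
            den += b * c / t
    alpha_mh = num / den                 # common odds ratio across strata
    return -2.35 * math.log(alpha_mh)    # ETS delta metric

def dif_level(delta, significant):
    """Simplified A/B/C categorization per the flagging criteria above."""
    if not significant or abs(delta) < 1.0:
        return "A"
    return "C" if abs(delta) > 1.5 else "B"
```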

Calibration and Equating
Scaling and linking were accomplished using the PARDUX and SAS computer programs to implement the
three-parameter logistic (3PL) and two-parameter partial credit (2PPC) IRT models for item
calibration and scaling; the Stocking and Lord (1983) procedure was used for equating. These
software programs were developed at CTB/McGraw-Hill to enable scaling and linking of complex
assessment data.
In PARDUX, a marginal maximum likelihood procedure was used to simultaneously estimate
the item parameters under the 3PL model (used for multiple choice items) and the 2PPC model
(used for constructed response items) (Bock & Aitkin, 1981; Thissen, 1982). These models were
implemented using the microcomputer program PARDUX (Burket, 1995). For setting the 2006
base scales for Reading and Mathematics, all scales were also calibrated in PARSCALE (Muraki
& Bock, 1991) as verification of the PARDUX results.
Under the 3PL model, the probability that a student with trait or scale score $\theta$ responds
correctly to multiple choice item $j$ is as follows:

$$P_j(\theta) = c_j + (1 - c_j) / [1 + \exp(-1.7\, a_j (\theta - b_j))]. \quad (1)$$

In equation (1), $a_j$ is the item discrimination, $b_j$ is the item difficulty, and $c_j$ is the
probability of a correct response by a very low-scoring student. The 2PPC model holds that the
probability that a student with trait or scale score $\theta$ will respond in category $k$ to
partial-credit item $j$ is given by

$$P_{jk}(\theta) = \exp(z_{jk}) \Big/ \sum_{i=1}^{m_j} \exp(z_{ji}), \quad (2)$$

where $z_{jk} = (k-1) f_j \theta + \sum_{i=0}^{k-1} g_{ji}$, and $g_{j0} = 0$ for all $j$.

The summary output of the above equations is in two different metrics corresponding to the two
item response models (3PL and 2PPC). The location and discrimination parameters for the multiple
choice items are in the traditional 3PL metric (labeled $b$ and $a$, respectively). In the 2PPC
model, $f$ (alpha) and $g$ (gamma) are analogous to $a$ and $b$, where alpha is the
discrimination parameter and gamma over alpha ($g/f$) is the location where adjacent trace lines
cross on the ability scale. Because of the different metrics used, the 3PL parameters $b$ and $a$
are not directly comparable to the 2PPC parameters $f$ and $g$; however, they can be converted to
a common metric. The two metrics are related by $b = g/f$ and $a = f/1.7$ (Burket, 1995).
Application of this procedure locates both the multiple choice and constructed response items on
the same scale. Note that for the 2PPC model there are $m_j - 1$ independent $g$'s (where $m_j$
is the number of score levels for item $j$) and one $f$, for a total of $m_j$ independent
parameters estimated for each item, while there is one $a$ and one $b$ per item in the 3PL model.
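Equations (1) and (2) can be illustrated directly. In the sketch below, the 2PPC category
parameters are assumed to be supplied as $g_1, \ldots, g_{m_j-1}$ with $g_0 = 0$ implicit; the
function names are illustrative.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """Equation (1): probability of a correct multiple choice response."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

def p_2ppc(theta, f, g_tail):
    """Equation (2): category probabilities P_k, k = 1..m_j, for one
    partial-credit item; g_tail = [g_1, ..., g_{m_j - 1}]."""
    g = np.concatenate(([0.0], np.asarray(g_tail, float)))  # g_0 = 0
    k_minus_1 = np.arange(g.size)
    z = k_minus_1 * f * theta + np.cumsum(g)  # z_k = (k-1) f theta + sum g_i
    ez = np.exp(z - z.max())                  # numerically stable softmax
    return ez / ez.sum()

# Metric conversion noted above: b = g/f and a = f/1.7.
```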

Goodness of Fit
Goodness-of-fit statistics were computed for each item to examine how closely the item's data
conform to the item response models. This provides a measure of validity. A procedure described
by Yen (1981) was used to measure fit. In this procedure, students are rank ordered on the basis
of their $\hat\theta$ values and sorted into ten cells, with 10% of the sample in each cell. Each
item $j$ in each decile $i$ has a response from $N_{ij}$ examinees. The fitted IRT models are
used to calculate an expected proportion $E_{ijk}$ of examinees who respond to item $j$ in
category $k$. The observed proportion $O_{ijk}$ is also tabulated for each decile, and the
approximate chi-square statistic is

$$Q_{1j} = \sum_{i=1}^{10} \sum_{k=1}^{m_j} N_{ij} (O_{ijk} - E_{ijk})^2 / E_{ijk}.$$

$Q_{1j}$ should be approximately chi-square distributed with degrees of freedom (DF) equal to the
number of "independent" cells, $10(m_j - 1)$, minus the number of estimated parameters. For the
3PL model, $m_j = 2$, so $DF = 10(2 - 1) - 3 = 7$. For the 2PPC model,
$DF = 10(m_j - 1) - m_j = 9m_j - 10$. Since DF differs between multiple choice and constructed
response items and among constructed response items with different numbers of score levels
$m_j$, $Q_{1j}$ is transformed, yielding the test statistic

$$Z_j = (Q_{1j} - DF) / \sqrt{2\,DF}.$$

This statistic is useful for flagging items that fit relatively poorly. $Z_j$ is sensitive to
sample size, and cut-off values for flagging an item based on $Z_j$ have been developed and were
used to identify items for the item review. The cut-off value is $(N/1500) \times 4$ for a given
test, where $N$ is the sample size.
Model-fit information is obtained from the Z-statistic. The Z-statistic is a transformation of
the chi-square ($Q_1$) statistic that takes into account differing numbers of score levels as
well as sample size:

$$Z_j = (Q_{1j} - DF) / \sqrt{2\,DF}, \text{ where } j = \text{item } j.$$

The Z-statistic is an index of the degree to which obtained proportions of students with each
item score are close to the proportions that would be predicted by the estimated thetas and item
parameters. These values are computed for ten intervals corresponding to deciles of the theta
distribution (Burket, 1995). The Z-statistic is used to characterize item fit. The critical value
of Z is different for each grade because it is dependent on sample size.
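A sketch of the Q1 computation, the Z transformation, and the sample-size-dependent cut-off
described above follows; the observed/expected input layout is an assumption.

```python
import numpy as np

def q1_statistic(n_ij, observed, expected):
    """Q1 for one item: n_ij holds the decile counts (length 10); observed
    and expected are 10 x m_j matrices of category proportions."""
    n = np.asarray(n_ij, float)[:, None]
    o, e = np.asarray(observed, float), np.asarray(expected, float)
    return float((n * (o - e) ** 2 / e).sum())

def z_statistic(q1, m_j, is_3pl):
    """Z transformation with DF = 7 for 3PL items and 9*m_j - 10 for 2PPC."""
    df = 7 if is_3pl else 9 * m_j - 10
    return (q1 - df) / np.sqrt(2 * df)

def z_cutoff(n_examinees):
    """Sample-size-dependent flagging cut-off, (N/1500) x 4."""
    return n_examinees / 1500 * 4
```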
Evidence of the validity of the scalings is provided by model fit. If the IRT model fits the
empirical item response distributions for the population we want to generalize to
(i.e., District of Columbia students), then the claim that the scores are valid indicators of an
underlying proficiency is strengthened. Fit statistics indicate the degree of difference between (a)
expected probabilities of correct responses at each proficiency level and (b) observed
probabilities examined when items are field tested and when they are used operationally. Table
11 indicates that only small numbers of operational items were flagged for poor fit to the IRT
model. No items were removed from operational scaling and scoring due to poor fit.

Year-to-Year Equating Procedures
Once the IRT scaling is accomplished, equating the scale across years enables comparability of
scores from one year to the next and across all test forms in the same content area and grade. In
2007 through 2012, anchor item sets that equate the current test forms to the previous year's
scale were used in a Stocking and Lord (1983) equating methodology.
The Stocking and Lord (1983) procedure, also called the test characteristic curve (TCC) method,
was used to place each grade on the vertical scale that had been developed for each content area.
It minimizes the mean squared difference between two test characteristic curves, one based on
estimates from the previous calibration and the other on transformed estimates from the current
calibration. Let $\hat\tau_j$ be the test characteristic curve based on estimates from the
previous calibration and $\hat\tau^*_j$ be the test characteristic curve based on transformed
estimates from the current calibration:

$$\hat\tau_j = \hat\tau(\theta_j) = \sum_{i=1}^{n} P_i(\theta_j;\, a_i, b_i, c_i),$$

$$\hat\tau^*_j = \hat\tau^*(\theta_j) = \sum_{i=1}^{n} P_i\!\left(\theta_j;\, \frac{a_i}{M_1},\, M_1 b_i + M_2,\, c_i\right).$$

The TCC method determines the scaling constants (multiplicative $M_1$ and additive $M_2$) by
minimizing the following quadratic loss function ($F$):

$$F = \frac{1}{N} \sum_{j=1}^{N} (\hat\tau_j - \hat\tau^*_j)^2,$$

where $N$ is the number of examinees in the arbitrary group.
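To illustrate the TCC method, the sketch below estimates $M_1$ and $M_2$ by numerically
minimizing $F$, with a grid of ability points standing in for the examinee group and multiple
choice anchors only. This is a schematic equivalent, not the PARDUX implementation.

```python
import numpy as np
from scipy.optimize import minimize

def p3pl(theta, a, b, c):
    """3PL probability, as in equation (1)."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

def stocking_lord(a_cur, b_cur, c_cur, a_ref, b_ref, c_ref, thetas):
    """Estimate the multiplicative (M1) and additive (M2) constants by
    minimizing the squared TCC difference over the supplied theta points."""
    a_cur, b_cur, c_cur = map(np.asarray, (a_cur, b_cur, c_cur))
    a_ref, b_ref, c_ref = map(np.asarray, (a_ref, b_ref, c_ref))
    th = np.asarray(thetas, float)[:, None]
    tcc_ref = p3pl(th, a_ref, b_ref, c_ref).sum(axis=1)

    def loss(m):
        m1, m2 = m
        # transformed current-form anchor parameters: a/M1, M1*b + M2
        tcc_cur = p3pl(th, a_cur / m1, m1 * b_cur + m2, c_cur).sum(axis=1)
        return np.mean((tcc_ref - tcc_cur) ** 2)

    return minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead").x  # M1, M2
```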
Anchor items consist of multiple choice and/or constructed response items. The 2012 Reading and
Science/Biology anchor item sets included multiple choice items and one constructed response item
each; in Mathematics, all of the anchor items were multiple choice items. Anchor items are
rotated in and out of use each year, to the degree possible, to minimize item over-exposure.
Anchor items are placed in approximately the same location, or the same third of the test, as in
the original administration. Anchor item a and b parameters are calibrated freely (i.e., not
fixed during calibration). The number and representativeness of the anchor items relative to the
overall test and blueprints are provided in Tables 1-4. The blueprint should be proportionally
represented in the anchor sets.
Because there are so few Composition prompts, the "items," or scores from each of the Writing
rubrics, were linked to the Reading scale by first matching students' Reading and Composition
item-level scored responses. The Reading operational items were treated as anchor items and the
Stocking and Lord common-item equating procedure was conducted.
Once calibrated, the anchor item set and equating results are carefully reviewed to ensure that
the set performs very similarly in both the current and reference (just prior) year. These
standard CTB Research team quality checks are followed during calibration and equating analyses
for all grades and content areas. Additional anchor item checks were conducted for items flagged
in any
of the following verifications, which were performed to ensure the quality and accuracy of the
equating:
1. Correlation coefficients for the reference and equated IRT item parameters should be
very high (0.90-1.00). Specifically, differential anchor item performance between the
2011 and 2012 administrations was evaluated by comparing the correlations between the
reference and new form item difficulty (b parameter), discrimination (a parameter), and
proportion correct (p value) values after equating. IRT guessing (c) parameters typically
fluctuate considerably, are held to fixed values during equating, and were not considered
in this evaluation. The correlations are shown in Table 12 for the discrimination (a) and
difficulty (b) parameters and are moderate to high, ranging from 0.85 to 0.97 for a
parameters (0.84-0.98 in 2011) and from 0.96 to 0.99 for b parameters (0.94-1.00 in
2011). These correlations indicate that the items performed similarly in the two
administrations and provide evidence that the equating results are reasonable and
accurate.
2. Reference and equated anchor item parameters and TCCs should be closely aligned. The
TCCs are reviewed after each equating cycle for each grade and content area. Further,
statistical differences were evaluated with four difference statistics: root mean squared
difference, mean absolute difference, maximum absolute difference, and the absolute
value of the mean signed difference.
3. The scaling constants, or Stocking-Lord linear transformation parameters, should be
fairly stable across administrations. There are two constants, a multiplicative constant
(M1) and an additive constant (M2). Because PARDUX calibrations center the IRT scale
close to the average proficiency of the test takers, the magnitude of the 2011-2012
differences in these scaling constants indicates the degree of differences in average
difficulty of the reference and new test form administrations. The scaling constants from
the 2012 administration along with constants across the 2007-current years of the
DC CAS administration and scales are provided in Table 13.
4. P values of the anchor items for the estimated new form and the reference form should be
similar and aligned on a regression line, showing the same direction and magnitude of
change as the scale scores. The anchor item p value correlations in Table 12 are
high, ranging from 0.96 to 0.99 for all grades and content areas. This is
an indication that the anchor items performed similarly in the examinee populations in
2011 and 2012.
Once the tests are equated, final parameter tables are developed into scoring tables, from which
each student's scale score is derived. Examinee scale scores are estimated for DC CAS using
number correct scoring.

Establishing Upper and Lower Bounds for the Grade Level Scales
Upper and lower bound scale scores are called the lowest obtainable scale score (LOSS) and
highest obtainable scale score (HOSS). A maximum likelihood procedure cannot produce
scale score estimates for students with perfect scores or scores below the level expected from
guessing. Also, while maximum likelihood estimates are available for students with extreme
scores other than zero or perfect scores, occasionally these estimates have standard errors of
measurement that are very large, and differences between these extreme values have very
little meaning. Therefore, scores are established for these students based on a rational but
necessarily non-maximum likelihood procedure.
For the DC CAS, the LOSS and HOSS were set to be the same at a given grade across content
areas. For example, the Grade 3 LOSS and HOSS are 300 and 399, respectively, and the
Grade 5 LOSS and HOSS are 500 and 599, respectively, for Reading, Mathematics, and
Science. These values were established on the 2006 base scale for Reading and Mathematics,
the 2008 base scale for Science/Biology, the 2011 base scale for Reading Grade 9, and the
2012 Reading scale for Composition. These values remain constant from year to year. The
LOSS and HOSS for all grades are provided in Table 14.

Reliability Coefficients
Total test reliability statistics (alpha and CSEMs) measure the level of consistency (reliability) of
performance over all test questions in a given form, the results of which imply how well the
questions measure the content domain and could continue to do so over repeated administrations.
Total test reliability coefficients (in this case measured by Cronbach's alpha [Cronbach, 1951])
may range from 0.00 to 1.00, where 1.00 refers to a perfectly reliable test. The DC CAS reliability
data are based on DC students in the calibration sample of approximately 4,500 students per
grade/content.
The total test reliabilities of the operational forms were evaluated first by Cronbach's $\alpha$
index of internal consistency. The specific calculation for Cronbach's $\alpha$ is

$$\hat\alpha = \frac{k}{k-1} \left( 1 - \frac{\sum \hat\sigma_i^2}{\hat\sigma_X^2} \right), \quad (8.1)$$

where $k$ is the number of items on the test form, $\hat\sigma_i^2$ is the variance of item $i$,
and $\hat\sigma_X^2$ is the total test variance.
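Equation (8.1) can be computed directly from an item score matrix, as in the sketch below (the
n - 1 variance denominator is a convention choice, not specified above).

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Coefficient alpha per equation (8.1); rows are students, columns
    are items (n - 1 denominators used throughout)."""
    x = np.asarray(item_scores, float)
    k = x.shape[1]
    return k / (k - 1) * (1 - x.var(axis=0, ddof=1).sum()
                          / x.sum(axis=1).var(ddof=1))
```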

The stratified coefficient alpha is an internal consistency score reliability index. It measures the
internal consistency of a test that contains both multiple choice and constructed response items.
The stratified alpha treats the multiple choice and constructed response sections as separate
subtests, estimates the reliability of the two subtests, and combines those estimates to estimate
total test score internal consistency.
The Feldt-Raju index is a third index of internal consistency, also designed for mixed-format
tests. Unlike the stratified alpha, which stratifies the items based on the number of score
points, the Feldt-Raju index corrects the underestimation of Cronbach's alpha, which assumes that
tests are parallel in classical test theory terms; mixed-format tests are more appropriately
assumed to be congeneric.
As a rule of thumb, reliability coefficients for test scores that are equal to or greater than 0.80 are
considered acceptable for tests of moderate lengths. All of the reliability indices calculated
provide evidence that these tests are performing as expected and that they support inferences
about what students know and can do in relation to the content knowledge and skills that the tests
target.

Standard Errors of Measurement
Whereas reliability coefficients indicate the degree of consistency in test scores, the standard
error of measurement (SEM) indicates the degree of unreliability in test scores. The standard
error is an estimate of the standard deviation of observed scores to expect if an examinee were
retested under unchanged conditions. Conditional standard deviations of observed scores can be
found for each score level. The conditional estimate of measurement error increases as the
number of items that coincide with examinees' levels of performance decreases. Generally, there
are few students with extreme scores; these score levels are measured less accurately than
moderate scores. If all of the items are very difficult or very easy for examinees, the error of
measurement will be larger than when the items' difficulties are distributed across the ability
levels of the students being tested.
In addition to classic internal consistency reliability coefficients, the SEM based on IRT is also
provided as reliability evidence for the DC CAS scores. The IRT SEM provides conditional
standard errors that are specific to each scale score. These standard errors were estimated as a
function of the scale scores using IRT. Accuracy of measurement is especially important when
applied to individual scores. The IRT-based SEM indicates the expected standard deviation of
observed scores if an examinee at a specific level of ability were tested repeatedly under
unchanged conditions.
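For illustration, a conditional SEM for a set of 3PL items can be computed from the test
information function using the standard IRT identity $SEM(\theta) = 1/\sqrt{I(\theta)}$, as
sketched below. This is not the operational DC CAS scoring-table computation, which is based on
number correct scoring.

```python
import numpy as np

def irt_sem_3pl(theta, a, b, c):
    """SEM(theta) = 1/sqrt(I(theta)) using the 3PL item information
    I_j = (1.7 a)^2 * ((1-p)/p) * ((p - c)/(1 - c))^2 summed over items."""
    a, b, c = map(np.asarray, (a, b, c))
    p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))
    info = (1.7 * a) ** 2 * ((1 - p) / p) * ((p - c) / (1 - c)) ** 2
    return 1.0 / np.sqrt(info.sum())
```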

Proficiency Level Analyses
One of the cornerstones of the NCLB Act (US DOE, 2002) is the measurement of Adequate
Yearly Progress (AYP) for states with respect to the percentage of students at or above the
academic performance standards established by states. Because of a heavy emphasis on moving
all students to or above the "Proficient" category by year 2014, the consistency and accuracy of
the classification of students into these proficiency categories is of particular interest.
The statistical quality of cut scores that define the proficiency levels in which students are placed
per their performance serves as additional validity evidence. Details about the standard setting
workshops and Bookmark Standard Setting Procedure used to set the cut scores are given in the
DC CAS Cut Score Setting Technical Report (CTB/McGraw-Hill, 2012). It may be useful to
note that the Bookmark procedure (Mitzel, Lewis, Patz, & Green, 2001) is a well-documented
and highly regarded procedure that has been demonstrated by independent research to produce
reasonable cut scores on tests across the country.
It is also important to review the specific scale score SEM at each cut score. Compared with the
SEMs associated with the other DC CAS scale scores on each test, the SEMs at the cut scores
should almost always be among the lowest, meaning that the DC CAS tests tend to measure most
accurately near the cut scores. This is a desirable quality when cut scores are used to classify
examinees.

Classification Consistency
Not only is it important that the amount of measurement error around the cut score be minimal;
also important is the expected consistency with which students would be classified into
performance levels if given the test over repeated occasions. Classification consistency, or
decision consistency, is defined as the extent to which the classifications of examinees agree on
the basis of two independent administrations of a test or administration of two parallel test forms.
However, it is practically infeasible to obtain data from repeated administrations of a test
because of cost, time, and students' recall of the first administration. Therefore, a common
practice is to estimate decision consistency from one administration of a test.
Classification Accuracy
Classification accuracy, or decision accuracy, is defined as the extent to which the actual
classifications of test-takers based on observed test scores agree with classifications that would
be made on the basis of their true scores (Livingston & Lewis, 1995). It is common practice to
estimate decision accuracy using a psychometric model to estimate true scores that correspond to
observed scores as the basis for estimating classification accuracy. In other words, classification
consistency refers to the agreement between two observed scores, while classification accuracy
refers to the agreement between the observed score and the estimated true score.
A straightforward classification consistency estimation can be expressed in terms of a
contingency table representing the probability of a particular classification outcome under
specific scenarios. For example, the table below is a contingency table of
$(H+1) \times (H+1)$ cells, where $H$ is the number of cut scores, such that two cut scores
yield a $3 \times 3$ contingency table.

Example of Contingency Table with Two Cut Scores

            Level 1   Level 2   Level 3   Sum
Level 1     P11       P21       P31       P.1
Level 2     P12       P22       P32       P.2
Level 3     P13       P23       P33       P.3
Sum         P1.       P2.       P3.       1.0

Hambleton and Novick (1973) proposed $P$ as a measure of classification consistency, where $P$ is
defined as the sum of the diagonal values of the contingency table above:

$$P = P_{11} + P_{22} + P_{33}.$$

To account for statistical chance agreement, Swaminathan, Hambleton, & Algina (1974) suggested
using Cohen's kappa (1960):

$$\text{kappa} = \frac{P - P_c}{1 - P_c},$$

where $P_c$ is the chance probability of a consistent classification under two completely random
assignments. This probability, $P_c$, is the sum of the probabilities obtained by multiplying the
marginal probability of the first administration and the corresponding marginal probability of
the second administration:

$$P_c = (P_{1.} \times P_{.1}) + (P_{2.} \times P_{.2}) + (P_{3.} \times P_{.3}).$$
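The agreement index $P$ and kappa can be computed from any joint classification table, as in the
sketch below; the example proportions are invented for illustration.

```python
import numpy as np

def agreement_and_kappa(table):
    """table: square matrix of joint classification proportions summing
    to 1. Returns (P, kappa) as defined above."""
    t = np.asarray(table, float)
    p = np.trace(t)                                     # P = sum of diagonal
    p_c = float((t.sum(axis=0) * t.sum(axis=1)).sum())  # marginal products
    return p, (p - p_c) / (1 - p_c)

# Example with two cut scores (a 3 x 3 table of invented proportions):
# P, kappa = agreement_and_kappa([[0.30, 0.05, 0.00],
#                                 [0.06, 0.35, 0.04],
#                                 [0.00, 0.05, 0.15]])
```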

Kolen and Kim (2005) suggested a method for estimating consistency and accuracy that involves
the generation of item responses using item parameters based on the IRT model (see also Kim,
Choi, Um, & Kim, 2006, as well as Kim, Barton, & Kim, 2008). Two sets of item responses are
generated using a set of item parameters and an examinee's ability distribution from a single test
administration.
CTB used the KKCLASS program (Kim, 2007) to calculate these statistics on the 2012 DC CAS
results. The KKCLASS program implements an IRT-based procedure that is consistent with DC
CAS IRT scaling and scoring. The procedure is described below.

Step 1: Obtain item parameters and the ability distribution weight $g(\hat\theta)$ at each
quadrature point from a single test.
Step 2: Compute two raw scores at each quadrature point. At a given quadrature point $\theta_j$,
generate two sets of item responses using the item parameters from a test form, assuming that the
same test form was administered twice to an examinee with the true ability $\theta_j$.
Step 3: Construct a classification matrix at each quadrature point. Determine the joint event for
the cells in the contingency table using the raw scores obtained from Step 2.
Step 4: Repeat Steps 2 and 3 $R$ times and take average values over the $R$ replications.
Step 5: Multiply the distribution weight $g(\hat\theta)$ by the average values from Step 4 at
each quadrature point, and sum across all quadrature points. From this final contingency table,
classification consistency indices, such as consistency agreement and kappa, can be computed.
Step 6: Because examinees' abilities are estimated at each quadrature point, the quadrature point
can be considered the true score. Therefore, classification accuracy is computed using both
examinees' estimated abilities (observed scores) and the quadrature point (true score).
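A simplified version of this quadrature-and-simulation procedure for decision consistency is
sketched below, using 3PL raw scores and raw-score cuts; the KKCLASS program's exact interface is
not described in this report, so all names here are illustrative.

```python
import numpy as np

def simulate_classification(thetas, weights, a, b, c, cuts, reps=100, seed=0):
    """Build the final contingency table by simulating two administrations
    at each quadrature point, weighting by the ability distribution."""
    rng = np.random.default_rng(seed)
    a, b, c = map(np.asarray, (a, b, c))
    n_levels = len(cuts) + 1
    table = np.zeros((n_levels, n_levels))
    for theta, w in zip(thetas, weights):
        p = c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))
        # two independent simulated raw scores per replication (Steps 2-4)
        x1 = (rng.random((reps, a.size)) < p).sum(axis=1)
        x2 = (rng.random((reps, a.size)) < p).sum(axis=1)
        for i, j in zip(np.digitize(x1, cuts), np.digitize(x2, cuts)):
            table[i, j] += w / reps        # Step 5: weight and accumulate
    return table / table.sum()

# Consistency agreement and kappa then follow from the table as shown above.
```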

Table 11. DC CAS 2012 Numbers of Operational Items Flagged for Poor Fit During
Calibration

Content            Grade          Flagged for Poor Fit
Reading            2              0
                   3              0
                   4              3
                   5              2
                   6              0
                   7              2
                   8              0
                   9              3
                   10             1
Mathematics        2              2
                   3              2
                   4              2
                   5              1
                   6              0
                   7              0
                   8              0
                   10             3
Science/Biology    5              0
                   8              3
                   High School    1
Composition        4              0
                   7              3
                   10             1

Table 12. Correlations Between the Item Parameters for the Reference Form and 2012 DC
CAS Operational Test Form

Grade            Discrimination (a)   Difficulty (b)   P Value Correlation
Reading
2                N/A                  N/A              N/A
3                0.95                 0.99             0.99
4                0.97                 0.99             0.99
5                0.97                 0.99             0.99
6                0.96                 0.98             0.98
7                0.95                 0.97             0.98
8                0.97                 0.99             0.99
9                0.85                 0.98             0.98
10               0.95                 0.97             0.97
Mathematics
2                N/A                  N/A              N/A
3                0.96                 0.98             0.98
4                0.93                 0.99             0.99
5                0.97                 0.97             0.98
6                0.93                 0.98             0.99
7                0.90                 0.98             0.98
8                0.94                 0.98             0.98
10               0.88                 0.97             0.97
Science/Biology
5                0.92                 0.99             0.99
8                0.89                 0.97             0.96
High School      0.96                 0.96             0.98


Table 13. Scaling Constants Across Administrations, All Grades and Content Areas

             2007            2008            2009            2010            2011            2012
Grade        Mult   Add      Mult   Add      Mult   Add      Mult   Add      Mult   Add      Mult   Add

Reading
2            N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A
3            10.40  352.60   10.70  354.00   10.70  353.10   14.30  349.60   13.60  350.40   13.09  349.66
4            11.80  451.20   11.70  453.30   12.40  453.40   13.40  451.60   13.50  451.10   12.22  453.49
5            11.40  552.20   11.30  554.90   11.40  553.70   12.40  553.20   12.20  554.20   12.23  555.64
6            10.80  652.10   10.40  652.90   10.40  653.00   11.40  651.60   11.20  652.70   11.39  652.04
7            10.40  751.30   10.40  752.70   10.20  754.70   11.50  754.30   11.60  754.30   11.55  755.52
8            11.10  851.80   10.40  853.80   11.10  853.50   12.30  854.60   12.00  856.90   11.75  855.29
9            N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      13.50  950.00   13.85  948.32
10           11.30  954.50   10.90  953.40   10.70  954.10   12.10  952.10   13.00  955.60   12.60  954.87

Mathematics
2            N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A
3            14.50  353.90   16.20  354.00   17.30  357.00   16.70  352.40   17.30  353.80   15.87  353.15
4            14.10  452.10   13.20  456.40   14.10  457.90   13.80  455.10   14.10  455.10   13.68  457.78
5            14.10  552.20   14.80  555.90   15.10  556.40   14.20  556.80   14.70  557.00   14.62  558.84
6            13.40  647.30   14.50  649.80   14.30  650.10   14.30  650.30   14.20  652.40   14.36  652.35
7            13.70  746.90   13.40  750.00   14.60  751.00   15.10  751.20   14.50  753.60   15.02  754.39
8            13.00  844.80   12.50  847.40   12.90  848.50   13.50  848.60   13.00  851.60   13.08  851.59
10           15.50  945.20   16.90  945.40   16.70  947.30   16.50  944.30   16.20  948.00   16.25  948.32

Science/Biology
5            N/A    N/A      8.00   550.00   8.70   549.90   9.10   549.20   9.20   549.40   8.91   550.22
8            N/A    N/A      8.00   850.00   8.90   851.00   9.40   851.90   9.40   852.90   9.52   852.86
High School  N/A    N/A      8.00   950.00   7.70   946.60   7.50   949.50   7.80   949.90   8.36   950.88

Composition
4            N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      12.78  452.38
7            N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      11.34  754.68
10           N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      N/A    N/A      13.11  952.72


Table 14. LOSS and HOSS for Relevant Grades in Reading, Mathematics, Science/Biology,
and Composition

Grade   LOSS   HOSS
2       200    300
3       300    399
4       400    499
5       500    599
6       600    699
7       700    799
8       800    899
9       900    999
10      900    999
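As a hedged illustration of how the constants in Tables 13 and 14 are conventionally applied
(assuming the usual linear transformation from the IRT theta metric to the reporting scale,
truncated at the grade's LOSS and HOSS), consider the sketch below; it is illustrative only,
not the operational scoring routine.

    def scale_score(theta, mult, add, loss, hoss):
        # Linear transformation from the theta metric to the reporting scale,
        # truncated at the lowest/highest obtainable scale scores.
        ss = round(mult * theta + add)
        return max(loss, min(hoss, ss))

    # Example: Grade 4 Reading, 2012 constants (Mult = 12.22, Add = 453.49),
    # theta = 0.5 -> round(12.22 * 0.5 + 453.49) = 460, within 400-499.
    print(scale_score(0.5, 12.22, 453.49, 400, 499))  # 460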


Section 7. Standard Setting
This section contains information relevant to the Standards and Assessments Peer Review
Guidance, Critical Elements 2.1, 2.2, and 2.3:
2.1
Has the State formally approved/adopted challenging academic achievement standards in
Reading/Language Arts and Mathematics for each of Grades 3 through 8 and for the 10-12 grade
range? These standards were to be completed by school year 2005-2006.
2.2
Has the State formally approved/adopted academic achievement descriptors in Science for each of
the grade spans 3-5, 6-9, and 10-12 as required by school year 2005-06?
2.3
1. Do these academic achievement standards (including modified and alternate academic
achievement standards, if applicable) include for each content area:
(a) At least three levels of achievement, including two levels of high achievement (proficient and
advanced) that determine how well students are mastering a State's academic content standards
and a third level of achievement (basic) to provide information about the progress of
lower-achieving students toward mastering the proficient and advanced levels of achievement; and
(b) Descriptions of the competencies associated with each achievement level; and
(c) Assessment scores ("cut scores") that differentiate among the achievement levels and a
rationale and procedure used to determine each achievement level?
The DC CAS cut scores associated with each of the four proficiency levels (Below Basic, Basic,
Proficient, Advanced) for each grade and content area were all set through a content, statistical,
and policy-based process. The content and related statistics were reviewed through standard setting
workshops conducted with DC teachers in Washington, DC, and the resulting cut score
recommendations were provided to the DC Technical Advisory Council and the OSSE for approval.
Prior to setting performance standards for the DC CAS Reading, Mathematics, Science/Biology,
and Composition tests, CTB test development staff drafted performance level descriptions for each
grade and content area. DC staff reviewed, refined, and approved the descriptions prior to each
workshop.
In 2012, standard setting workshops were conducted to review and recommend cut scores for
Reading Grades 2-10, Mathematics Grade 2, and Composition Grades 4, 7, and 10. Previous
standard settings were conducted to determine cut scores in 2006 for Mathematics Grades 3-8 and
10, in 2008 for Science Grades 5 and 8 and Biology High School, and in 2011 for Reading Grade
9.
The Bookmark Standard Setting Procedure (BSSP; Lewis, Mitzel, & Green, 1996; Lewis, Mitzel,
Mercado, & Schulz, 2012) was implemented to establish performance standards for the
assessments of Reading, Mathematics, and Science/Biology. Cut scores for Grade 2 Reading and
Mathematics were established in July 2012. Cut scores for Grades 3-10 Reading were reviewed in
July 2012, extending work from the original standard setting committee in July 2006 and July
2011. The Grades 3-8 and 10 Mathematics and Science/Biology cut scores were established in
July 2008. The Judgmental Policy Capturing procedure (Jaeger, 1995) was used to set standards
for the Composition assessments in July 2012. District of Columbia educators who participated in
standard setting workshops recommended cut scores for each test and grade level.
The July 2006 standard setting workshops for Mathematics and Science/Biology lasted
four-and-a-half days, with the morning of the first day devoted to orientation and bookmark
training, two-and-a-half days to standard setting, and one-and-a-half days to description
writing. Participants
recommended three cut scores at the Basic, Proficient, and Advanced levels, which would separate
students into four performance levels: Below Basic, Basic, Proficient, and Advanced. Participants
engaged in training, discussion, and three rounds of bookmark placements. The table leaders
reviewed the participant-recommended cut scores and associated impact data and suggested
changes to promote cross-grade articulation. Impact data are the percentages of students who are
classified in each performance level based on the recommended cut scores.
The Judgmental Policy Capturing method in 2012 was implemented to set standards for the
Composition test in Grades 4, 7, and 10. Judgmental Policy Capturing is a rubric-centered,
content-based method that has been used in recent years to establish performance standards on
unscaled assessments (see Jaeger, 1995; Perie, 2007; Roeber, 2002). During the
one-and-a-half-day procedure, DC educators were trained to examine the DC CAS scoring rubrics
and to consider
the knowledge and skills associated with the attainment of each successive score level. Two
separate rubrics were used to score the Composition tests: students received 0-6 points for Topic
Development, and 0-4 points for Standard English Conventions. A third rubric for Composition,
Literary Analysis (0-4 points), was also used to score students' responses; however, scores from
this rubric did not contribute to students' total scores in 2012. Participants studied these scoring
rubrics, the DC CAS content standards, and performance level descriptions and discussed their
expectations of the knowledge and skills students must have in order to associate a score level with
a performance level for each writing prompt.
The cut score recommendations from the committees for all content areas and grades were
reviewed by the DC CAS Technical Advisory Committee and DCPS in 2006 and 2012 and the
OSSE in 2008 and 2012. Certain cut scores were adjusted each time to achieve articulated
standards and impact data. The DC Board of Education approved these cut scores in 2006 and
2008, and the OSSE approved the cut scores for Composition, Grade 2 Reading and Mathematics,
and Grades 3-10 Reading in 2012.

Grades 3-10 Reading Cut Score Review
In recognition that there may have been subtle shifts in the DC CAS Reading tests and in
expectations of student performance since the original standard setting in 2006 (except for
Grade 9, where cut scores were set in 2011), the OSSE decided to conduct a review of the cut
scores for Grades 3-10 Reading in 2012.
The Bookmark Standard Setting Procedure (BSSP; Lewis, Mitzel, & Green, 1996; Lewis, Mitzel,
Mercado, & Schulz, 2012) was implemented to review the cut scores for Grades 3-10 Reading.
The workshop lasted one-and-a-half days. Participants reviewed cut scores at the Basic, Proficient,
and Advanced levels, which separate students into four performance levels: Below Basic, Basic,
Proficient, and Advanced.


Participants engaged in training, discussion, and two rounds of bookmark placements. Participants
reviewed the existing cut scores and recommended adjustments for content-based reasons.
Participants also adjusted the performance level descriptions (PLDs) to improve their clarity and
alignment with the tested content.
NOTE: The Reading cut scores reviewed and approved during the 2012 standard setting for
Grades 3-10 were NOT used for scoring, reporting, or accountability purposes in 2012. These will
be applied in 2013.

Grade 2 Reading and Mathematics Standard Setting
The standard setting for the Grade 2 Reading and Mathematics assessments was based on the
guidance and procedures used for DC CAS Reading and Mathematics in Grades 3-8 and 10.
Prior to setting performance standards for the Grade 2 assessments, CTB Test
Development staff drafted performance level descriptors, which summarize the knowledge, skills,
and abilities expected of students in each performance level on the tests.
The Bookmark Standard Setting Procedure was implemented to set standards for the Grade 2
Reading and Mathematics assessments. The standard setting workshop lasted one-and-a-half days.
Participants recommended cut scores at the Basic, Proficient, and Advanced levels, which would
separate students into four performance levels: Below Basic, Basic, Proficient, and Advanced.
Participants engaged in training, discussion, and two rounds of bookmark placements. To help
participants recommend cut scores that were well articulated with Grades 3-10 Reading and
Grades 3-8 and 10 Mathematics, participants were shown target cut scores that were calculated
statistically from the Grades 3-10 cut scores. Participants were free to recommend any set of cut
scores that were consistent with the tested content and the expectations of students in each
performance level.

Grades 4, 7, and 10 Composition Standard Setting
The pool of writing prompts for the Composition assessment was refreshed in 2012 with new
prompts. Simultaneously, a test scale (based on DC CAS Reading) was implemented on the
Composition test for the first time. In recognition of these changes in the Composition assessment,
the OSSE decided to hold a standard setting for Grades 4, 7, and 10 Composition.
The Judgmental Policy Capturing procedure (Jaeger, 1995) was implemented to set standards for
the Composition assessments. The standard setting workshop lasted one-and-a-half days.
Participants recommended cut scores at the Basic, Proficient, and Advanced levels, which separate
students into four performance levels: Below Basic, Basic, Proficient, and Advanced.
Participants engaged in training and in three rounds of discussion and decisions. For each writing
prompt, participants recommended cut scores in terms of raw score. These raw scores were later
transformed onto the test scale. Participants considered students' performance on the two scored
rubrics, Topic Development and Standard English Conventions, and on one unscored rubric,
Literary Analysis. Only scores from the first two rubrics contributed to students' scores in 2012.

Final, Approved DC CAS Cut Scores
The cut score recommendations from the 2012 committees were reviewed by staff from the OSSE.
In addition, the OSSE reviewed the impact data associated with the recommended cut scores:
impact data are the percentages of students classified in each performance level, based on the cut
scores. The cut scores, as recommended by District of Columbia educators, were reviewed by the
Technical Advisory Committee and approved by the OSSE in August 2012. The standard setting
technical report summarizes procedures and results of the 2012 standard settings and cut score
review.
The report includes round-by-round synopses, agendas, training materials, and the recommended
cut scores. See District of Columbia Comprehensive Assessment System (DC CAS) Standard
Setting Technical Report 2012 (CTB/McGraw-Hill, 2012).
Table 15 shows the final, approved cut scores. Note that the 2012 Reading cut scores for all grades
except Grade 2 were applied to final scores reported in 2012. The new Reading cut scores will be
applied to all grades in 2013.


Table 15. Final Cut Score Ranges

Reading: Applied in 2012
Grade   Below Basic   Basic       Proficient   Advanced
2       200-231       232-245     246-263      264-299
3       300-338       339-353     354-372      373-399
4       400-438       439-454     455-471      472-499
5       500-539       540-555     556-572      573-599
6       600-639       640-654     655-671      672-699
7       700-738       739-755     756-767      768-799
8       800-839       840-855     856-869      870-899
9       900-930       931-949     950-959      960-999
10      900-939       940-955     956-969      970-999

Reading: To Be Applied in 2013
Grade   Below Basic   Basic       Proficient   Advanced
2       200-231       232-245     246-263      264-299
3       300-339       340-351     352-366      367-399
4       400-443       444-455     456-469      470-499
5       500-544       545-554     555-568      569-599
6       600-639       640-651     652-665      666-699
7       700-743       744-755     756-766      767-799
8       800-841       842-855     856-867      868-899
9       900-938       939-950     951-964      965-999
10      900-942       943-954     955-966      967-999

Mathematics
Grade   Below Basic   Basic       Proficient   Advanced
2       200-243       244-254     255-267      268-299
3       300-339       340-359     360-375      376-399
4       400-442       443-457     458-473      474-499
5       500-542       543-559     560-574      575-599
6       600-635       636-653     654-667      668-699
7       700-735       736-751     752-769      770-799
8       800-835       836-849     850-867      868-899
10      900-932       933-950     951-970      971-999

Science/Biology
Grade         Below Basic   Basic       Proficient   Advanced
5             500-540       541-552     553-563      564-599
8             800-848       849-855     856-867      868-899
High School   900-945       946-951     952-965      966-999

Composition
Grade   Below Basic   Basic       Proficient   Advanced
4       400-443       444-455     456-469      470-499
7       700-743       744-755     756-766      767-799
10      900-942       943-954     955-966      967-999


Section 8. Evidence for Reliability and Validity
This section contains information relevant to Standards and Assessments Peer Review Guidance,
Critical Elements 4.1, 4.2, and 4.3:
4.1
For each assessment, including all alternate assessments, has the State documented the issue of
validity (in addition to the alignment of the assessment with the content standards), as described in
the Standards for Educational and Psychological Testing (AERA/APA/NCME, 1999), with
respect to all of the following categories:
(a) Has the State specified the purposes of the assessments, delineating the types of uses and
decisions most appropriate to each?
(c) Has the State ascertained that the scoring and reporting structures are consistent with the subdomain structures of its academic content standards (i.e., are item interrelationships consistent with
the framework from which the test arises)?
(e) Has the State ascertained that test and item scores are related to outside variables as intended
(e.g., scores are correlated strongly with relevant measures of academic achievement and are
weakly correlated, if at all, with irrelevant characteristics, such as demographics)?
4.2
For each assessment, including all alternate assessments, has the State considered the issue of
reliability, as described in the Standards for Educational and Psychological Testing
(AERA/APA/NCME, 1999), with respect to all of the following categories:
(a) Has the State determined the reliability of the scores it reports, based on data for its own
student population and each reported subpopulation? and
(b) Has the State quantified and reported within the technical documentation for its assessments
the conditional standard error of measurement and student classification that are consistent at each
cut score specified in its academic achievement standards? and
(c) Has the State reported evidence of generalizability for all relevant sources, such as variability
of groups, internal consistency of item responses, variability among schools, consistency from
form to form of the test, and inter-rater consistency in scoring?
4.3
Has the State ensured that its assessment system is fair and accessible to all students, including
students with disabilities and students with limited English proficiency, with respect to each of the
following issues:
(c) Has the State taken steps to ensure fairness in the development of the assessments?

Reliability
Reliability refers to the degree to which students' scores are free from the effects of
measurement error and provides a measure of consistency. In other words, reliability helps to
describe how consistent students' performances would be if the assessment were given on
multiple occasions. The degree of score reliability that is required for an interpretation of an
individual student's test score must be carefully considered. Individual score reliability is
estimated using internal consistency
coefficients that are computed on all student responses in each grade and content area of the
DC CAS. They are computed using the operational items administered to all students in a grade
and content area.

Validity
The collection of reliability evidence is a necessary precursor to establishing evidence of validity.
How the scores are ultimately used is a key component of validity evidence, ensuring that the
trustworthiness of the scores is well established. As noted in the introduction, test validation is an
ongoing process of gathering evidence from many sources to evaluate the trustworthiness of the
desired score interpretation or use. This evidence is provided throughout this technical report
specific to procedures and processes that support the integrity of the content of the test, test
development, blueprints, alignment, scoring and rater reliability, psychometric analyses (item
analyses, scaling, equating, and comparative analyses across administrations), and student-level
performance results.

Item Level Evidence
Classical Item Statistics
DC CAS operational and field test items are all reviewed for statistical accuracy and quality.
Table 16 summarizes item level classical statistics for operational and field test items. For multiple
choice items, percent correct (p values) is reported. For constructed response items, the p value is
calculated as the mean score across all students divided by the maximum number of score points
possible. On average, the collection of operational items on a test ranged from moderately difficult
(mean p value of 0.41 for Science Grade 8 and Biology) to moderately easy (mean p value of 0.74
for Grade 2 Mathematics). Tables in Appendix C display the item difficulty for each item at each
grade. With respect to field test items, a test ranged from moderately difficult (mean p value of
0.35 for Science Grade 8 and Biology) to moderately easy (mean p value of 0.69 for Grade 2
Mathematics).
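For example, a constructed response item worth 3 points with a mean score of 1.2 across all
students has a p value of 1.2/3 = 0.40, placing it on the same scale as the percent correct of a
multiple choice item.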
The point biserial, or item-test correlation, a type of internal consistency measure, quantifies
the relationship between each item and the overall test. The item-test correlations for each
content area and grade for operational and field test items are shown in Table 16. The operational
test form correlations range from 0.38 to 0.45 (Reading); from 0.38 to 0.45 (Mathematics); from
0.29 to 0.35 (Science/Biology); and from 0.60 to 0.68 (Composition). Field test form correlations
range from 0.27 to 0.37 (Reading); from 0.33 to 0.45 (Mathematics); and from 0.25 to 0.34
(Science/Biology).
Table 16 also displays the mean item omit rates calculated across students for each grade and
content area. CTB flags items when more than 5% of students omit an item. Flagged items are
reviewed to ensure that they are appropriate for examinees in the tested grade. In addition,
omitted items near the end of the test are reviewed as "not reached" items to confirm that
administration conditions, such as testing time and accurate printing and scanning, were
adequate. Overall, the omit rates are low. The largest mean percentage omit rate is 5.80%, in
Composition Grade 10. All of the not reached rates are less than 1% except in Reading Grade 9
(1.92%), Reading Grade 10 (1.15%), Composition Grade 4 (1.18%), Composition Grade 7
(1.75%), and Composition Grade 10 (5.80%), indicating that the students were generally
provided with ample time to complete the DC CAS
tests.
Copyright (C) 2012 by the District of Columbia Office of the State Superintendent of Education

Technical Report for Spring 2012 Test Administration of DC CAS

54

Inter-Rater Reliability
The inter-rater reliabilities of constructed response items rely heavily on the solid and consistent
training of the hand-scorers, as was described in Section 4. The DC CAS constructed response
questions require a response composed by the examinee, usually in the form of one or more
sentences, where the ideas expressed are scored as correct, partially correct, or incorrect. Since the
ideas rather than the specific written expressions are scored, the response cannot be scored by
applying a clerical key. Raters use judgment to determine whether the ideas expressed match those
described in a scoring guide. In other words, raters interpret what the student has written. In order
to minimize the difference in interpretations that raters make, raters are required to have certain
hiring qualifications and on-site training using examples of responses that match and do not match
the desired answers. Even so, the match between a student's response and the scoring guide
description of a correct response is a matter of degree.
As a result, perfect agreement between different raters of the same student response is neither
expected nor required for the test to be valid. High perfect agreement between raters (70%-80% agreement and
above) can be obtained when the ideas being expressed and scored are rather narrowly defined
instances of principles or algorithms within a content area composed of discrete knowledge. This
rate of perfect agreement drops rapidly, however, for a content area such as Reading, where the
ideas being expressed are not highly constrained by content; instead, the form and coherence of the
expression of the ideas is the target of the testing and scoring.
Nevertheless, relatively high adjacent agreement (scores differing by only one point) can be
obtained. This adjacent agreement still varies with known characteristics of the question and
scoring guides. Adjacent agreement of 95% or more is desirable when analytic rubrics are used.
When holistic rubrics are used and scoring is deliberately impressionistic, adjacent agreement may
drop below 90%.
Statistical agreement data are presented in terms of the percentage of perfect, adjacent, and
discrepant agreement. Adjacent agreement occurs when two raters differ by one point, and
discrepant agreement is when two raters differ by more than one point. Tables 17-20 provide the
inter-rater agreement statistics for operational constructed response items. In general, the values
are within acceptable limits. For operational items, in Reading, the average perfect agreement was
72%, with a high of 86% and a low of 56%. For perfect and adjacent agreement, the average was
96%, with a high of 99% and a low of 91%. In Mathematics, the average perfect agreement was
90%, with a high of 97% and a low of 79%. For perfect and adjacent agreement, the average was
99%, with a high of 100% and a low of 98%. In Science/Biology, the average perfect agreement
rate was 87%, with a high of 94% and a low of 80%. For perfect and adjacent agreement, the
average was 99%, with a high of 100% and a low of 97%. In Composition, the average perfect
agreement was 59%, with a high of 87% and a low of 42%. For perfect and adjacent agreement,
the average was 94%, with a high of 100% and a low of 83%.
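As a minimal sketch (illustrative only), the three agreement rates described above can be
computed from two raters' scores on the same set of responses:

    import numpy as np

    def agreement_rates(rater1, rater2):
        diff = np.abs(np.asarray(rater1) - np.asarray(rater2))
        perfect = 100 * np.mean(diff == 0)    # identical scores
        adjacent = 100 * np.mean(diff == 1)   # scores differ by one point
        discrepant = 100 * np.mean(diff > 1)  # scores differ by more than one
        return perfect, adjacent, discrepant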
Field test items with perfect plus adjacent inter-rater agreement rates below 90%, or with low
checkset agreement rates, will be avoided as much as possible during the process of selecting
items for operational use. These items and their rubrics can be investigated to determine whether
the rubric may be difficult to apply in live scoring, and such items can be revised and re-field
tested. Tables
21-23 provide the agreement rates for field test constructed response items. For field test items, in
Reading, the average perfect agreement was 68%, with a high of 84% and a low of 57%. For
perfect and adjacent agreement, the average was 96%, with a high of 99% and a low of 90%. In
Mathematics, the average perfect agreement was 89%, with a high of 97% and a low of 77%. For
perfect and adjacent agreement, the average was 99%, with a high of 100% and a low of 95%. In
Science/Biology, the average perfect agreement rate was 87%, with a high of 97% and a low of
73%. For perfect and adjacent agreement, the average was 99%, with a high of 100% and a low of
97%.

Differential Item Function
Differential item function (DIF) analyses were conducted for all grades and content areas for
gender and race/ethnicity. DIF analyses were conducted with at least 400 cases for reference
groups and 200 cases for focal groups to provide data adequate for Mantel-Haenszel DIF analysis
procedures, which require subdividing each comparison group based on total test raw scores.
Tables 24-27 summarize the 2012 DIF analysis results for operational items, and Tables 28-30 for
pilot items. Positive flags indicate DIF that favors the focal group. Statistics are not calculated
when there are fewer than 200 focal group examinees or 400 reference group examinees, so that
subgroup comparisons remain appropriate. Recall that A corresponds to no DIF, B to moderate
DIF, and C to considerable DIF. Modest numbers of multiple choice and constructed response
items were flagged for DIF at levels B and C. The majority of items flagged for DIF were in
race/ethnicity comparisons; many of those were positive values that indicated DIF that favored the
focal group (e.g., Hispanic and White students).
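The core of the Mantel-Haenszel statistic can be sketched as follows. This is a simplified
illustration for a dichotomous item with hypothetical inputs; the operational procedure also
applies significance tests and treats constructed response items differently, and the A/B/C rule
shown here is the simplified ETS-style convention based only on the delta magnitude.

    import numpy as np

    def mantel_haenszel_dif(item, total, group):
        """item: 0/1 item scores; total: raw scores used as matching strata;
        group: 0 = reference, 1 = focal. Returns the ETS delta and category."""
        num = den = 0.0
        for s in np.unique(total):
            m = total == s
            a = np.sum((group[m] == 0) & (item[m] == 1))  # reference correct
            b = np.sum((group[m] == 0) & (item[m] == 0))  # reference incorrect
            c = np.sum((group[m] == 1) & (item[m] == 1))  # focal correct
            d = np.sum((group[m] == 1) & (item[m] == 0))  # focal incorrect
            n = a + b + c + d
            if n > 0:
                num += a * d / n
                den += b * c / n
        alpha_mh = num / den                 # common odds ratio across strata
        delta_mh = -2.35 * np.log(alpha_mh)  # ETS delta; positive favors the focal group
        if abs(delta_mh) < 1.0:
            return delta_mh, "A"
        return delta_mh, "B" if abs(delta_mh) < 1.5 else "C"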
Overall, the number of operational items flagged for DIF was moderate. For example, the total of
126 Reading item flags for DIF represents 13.2% of the 957 flagging opportunities in Reading; the
total of 108 item flags in Mathematics for DIF represents 10% of the 1,080 flagging opportunities
in Mathematics; and the total of 18 item flags in Science/Biology for DIF represent 5.1% of the
356 flagging opportunities.
The number of field test items flagged for DIF was also moderate; the counts of flagged field
test items for each grade, content area, and subgroup comparison appear in Tables 28-30.

Test and Strand Level Evidence
Operational Test Scores
Operational test level raw score and scale score means and standard deviations for the District are
provided in Table 31, along with the test level reliability coefficients, including Cronbach alpha,
stratified coefficient alpha, and Feldt-Raju. The scale score and raw score means and standard
deviations are consistent across grades within content area. The reliabilities all show high levels of
internal consistency, with reliabilities all greater than 0.85. Subgroup performance and total test
reliabilities are provided in Appendix D.
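For reference, coefficient (Cronbach's) alpha, one of the reliability statistics reported in
Table 31, can be computed from an examinee-by-item score matrix. A minimal sketch, not the
operational implementation:

    import numpy as np

    def cronbach_alpha(scores):
        """scores: (n_examinees, n_items) array of item scores."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        item_variances = scores.var(axis=0, ddof=1).sum()
        total_variance = scores.sum(axis=1).var(ddof=1)
        return (k / (k - 1.0)) * (1.0 - item_variances / total_variance)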
Similarly, the content strand means, standard deviations, average p values, and reliabilities are
provided for each grade and content area in Tables 32-35. Teachers and educational decision
makers frequently want diagnostic information that can be used to inform instructional strategies
within a content area and to help identify student strengths and weaknesses. This information can
be derived from student scores on subsets of test questions called content strands (e.g.,
Informational Text, Number Sense).

Strand Level Scores
The raw score means and standard deviations highlight strands in which students show stronger
or weaker mean performance, and the variability of that performance given the spread
represented by the standard deviations. The average p values are a better indicator of strand
level difficulty, however, because they are not affected by the number of items in a given strand,
as the mean raw score is. Therefore, a review of the average p values in each strand highlights
the strands that tend to be more or less difficult for students. Specifically, the strands that tend to be the most
difficult in each content area are Reading Informational Text in Reading, Measurement in
Mathematics, and in Science-Science and Technology (Grade 5), Energy and Waves (Grade 8),
and Cell Biology and Biochemistry (HS). In Composition, we look to the mean raw scores, noting
that each strand represents a single rubric of 4 to 6 points. The mean raw scores are very similar
across strands, where the Writing Language Conventions rubric or strand was slightly more
difficult in Grades 4 and 7, while the Writing Topic Development was slightly more difficult in
Grade 10.
In strands where there are very few items, reliabilities are lower, as would be expected. The degree
of reliability that is required to interpret these strand scores, as for any test score, must therefore be
carefully considered. These coefficients are computed on all valid student responses in each grade
and content area for each content strand. The internal reliability estimates for these strand scores,
which include as few as 4 items and as many as 23, range between 0.40 and 0.88.
As an additional measure of internal consistency, correlations have been produced between strands
within each grade and content area. These are provided in Tables 36-39. A review of the
correlations shows fairly strong relationships amongst strands within content area. Specifically, in
Table 36, the DC CAS 2012 Reading strand and total test correlations for all grades are presented.
The Reading strand correlations are moderate to high for all grades.
Table 37 displays the correlations for the DC CAS 2012 Mathematics strand and total test
correlations by grade. The correlations are mostly moderate to high. The correlations between
Geometry and the other Mathematics strands tend to be lower than for the other strands. Geometry
and Measurement also tend to have the lowest correlations with the Mathematics total raw score at
each grade. This is due in part to the smaller number of items used to measure Geometry and
Measurement in relation to the rest of the content strands.
In Table 38, the DC CAS 2012 Science/Biology strand and total test correlations for all grades are
presented. The correlations are moderate to high, although somewhat lower in general than the
correlations in Reading and Mathematics.
The DC CAS 2012 rubric score and total Composition test correlations for all grades are presented
in Table 39. The correlations between the Topic Development and Language Conventions scores
are moderate, suggesting that each rubric assesses somewhat different composing skills, as
intended. The correlations between the rubric scores and total Composition scores are high, as
expected.

Standard Errors of Measurement
Standard errors of measurement (SEMs) indicate the degree of unreliability in the test scores, and
conditional SEMs specific to each scale score provide further evidence. Tables 40-43 list the
number correct to scale score values, along with their associated IRT SEM values. It is most
important to review these at the cut scores that differentiate students by proficiency level. The cut
score SEMs range from 3 to 8 (Reading), 3 to 9 (Mathematics), 2 to 6 (Science/Biology), and 5 to
13 (Composition). The lowest SEMs are typically at the Proficient cut in all grades and content
areas.
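For context, under an IRT model the conditional SEM at ability theta is conventionally the
inverse square root of the test information function, SEM(theta) = 1 / sqrt(I(theta)); on the
reporting scale this is approximately Mult / sqrt(I(theta)), where Mult is the grade's
multiplicative scaling constant from Table 13. This standard relationship is offered only as
background; the tabled values are the operational SEMs.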

Proficiency Level Evidence
Student performance is classified, based on students' scores, into one of four proficiency levels:
Below Basic, Basic, Proficient, and Advanced. The categorizations are important for
accountability purposes, as well as for teachers, students, and parents to understand the content
meaning of the associated scale scores. The percentage of students in each category, referred to
as "impact data," is provided in Table 44. The "overall pass rate" represents the combined
impact data of the two upper levels, Proficient and Advanced, and is often the sum percentage
referenced in accountability measures.
Tables 45-48 display the classification consistency and accuracy results for each cut score and
across all cut scores for the 2012 DC CAS in Reading, Mathematics, Science/Biology, and
Composition. (The same information is provided for each subgroup in tables in Appendix D.)
These statistics provide indication of the reliability of the proficiency cut scores, which designate
the categories within which student performance would be classified over multiple administrations
of the same assessment. The classification consistency statistics can be interpreted like
correlations: the closer the statistic is to 1.00, the stronger the reliability. As with other
measures of reliability, the statistics are affected by the number of data points or, in this case,
items and score points. Step 2 of the classification consistency calculations rests on the total raw
scores. For that reason, the reliabilities for Composition are likely to be lower than those for the
assessments in other content areas, which have higher possible total raw scores. What can be
seen from the results described, however, is that Composition remains comparable to the other
content areas, even with fewer points.
The classification consistency values in all grades in Reading, Mathematics, and Science/Biology
range from 0.65 to 0.82 and are comparable to those in 2011, which ranged between 0.66 and
0.78. The classification consistency ranged from 0.52 to 0.86 in Composition. The kappa values,
which indicate classification consistency beyond chance consistency, represent moderate to
substantial consistency levels (Landis & Koch, 1977). The kappa coefficients in Reading,
Mathematics, and Science/Biology range between 0.48 and 0.77, which is comparable with the
2011 results (0.48 to 0.68). Kappa coefficients in Composition this year range from 0.34 to 0.62.
The classification accuracy results range from 0.73 to 0.85 in Reading, Mathematics, and
Science/Biology. The results are comparable with those in 2011, which also ranged between 0.73
and 0.84. In Composition, classification accuracies range from 0.62 to 0.91. These results suggest
that the 2012 DC CAS assessments in all content areas classify examinees into DC CAS
proficiency levels based on observed test scores with reasonably strong accuracy.
The false positive rates are estimates of the percentages of examinees that are classified into a
proficiency level higher than their true proficiency level. The false negative rates are estimates of
the percentages of examinees that are classified into a proficiency level lower than their true
proficiency level. These false positive and false negative rates are reasonably low in absolute
terms. It is a policy question how much higher or lower false positive rates should be relative to
false negative rates. A review of the tables, though, shows these rates to be quite low, ranging
from 0.03 to 0.30 in Composition and from 0.00 to 0.17 in Reading, Mathematics, and
Science/Biology.
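In terms of the accuracy table sketched earlier, the false positive and false negative rates are the
two off-diagonal halves of a K x K table whose rows index true levels and whose columns index
observed levels. A simplified, self-contained illustration:

    import numpy as np

    def false_rates(accuracy_table):
        t = np.asarray(accuracy_table)
        false_positive = np.triu(t, k=1).sum()   # observed level above true level
        false_negative = np.tril(t, k=-1).sum()  # observed level below true level
        return false_positive, false_negative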

The magnitude of classification consistency and accuracy measures is influenced by key features
of the test design, including the number of items and number of cut scores, score reliability and
associated standard errors of measurement, and the locations of the cut scores in relation to the
examinee proficiency frequency distributions. The classification consistency and accuracy results
observed for 2012 suggest that consistent and accurate performance level classifications are being
made for students based on the DC CAS assessments.

Correlational Evidence across Content Areas
Using all scored data, the correlations across the Reading, Mathematics, Science/Biology, and
Composition raw scores were calculated as a way of examining evidence of the validity of
inferences about student achievement based on relationships between content area tests. This
evidence is referred to as evidence of convergent and discriminant validity. The correlations
between Reading, Mathematics, Science/Biology, and Composition total raw scores appear in
Table 49.
Correlations are somewhat higher in the elementary grades than in the middle and high school
grades. Correlations between Reading and Mathematics are 0.72 and higher; correlations of
Reading and Mathematics scores with Science/Biology scores are 0.56 and higher; correlations
with the Composition total scores are in the range of 0.46 to 0.64. Composition correlations are
relatively lower because Composition scores range from 2 to 10, which restricts variability and
covariance. These results are consistent with typical content area correlations for educational
achievement tests in these content areas.
These correlations are moderately high. They indicate that approximately 25%-50% of the
variability in performance on these separate content area tests can be accounted for by skills and
proficiency shared across the content areas (i.e., disregarding measurement error). This
observation suggests that approximately one half to three quarters of the performance on each
content area assessment can be explained by knowledge, skills, and proficiency that are unique to
each content area (i.e., disregarding measurement error).
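For example, disregarding measurement error, a Reading-Mathematics correlation of 0.72
implies a shared variance of about 0.72^2 = 0.52, so roughly half of the score variability is
shared and the remaining half is unique to each content area.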


Table 16. DC CAS 2012 Classical Item Level Statistics

Reading, Operational
Grade   Number     Mean      Mean Item-Total   Mean Omit   Mean Not
        of Items   p value   Correlation       Rate        Reached Rate
2       35         0.64      0.40              2.33        0.72
3       48         0.65      0.45              1.07        0.26
4       48         0.62      0.42              0.55        0.14
5       48         0.65      0.42              0.66        0.35
6       48         0.64      0.41              0.57        0.23
7       48         0.63      0.38              0.64        0.23
8       48         0.59      0.39              0.70        0.33
9       48         0.58      0.42              3.18        1.92
10      48         0.61      0.41              1.97        1.15

Reading, Field Test
2       42         0.45      0.36              1.59        0.17
3       38         0.50      0.37              1.05        0.13
4       38         0.45      0.34              0.59        0.08
5       38         0.48      0.37              0.66        0.13
6       38         0.52      0.35              0.82        0.28
7       38         0.43      0.27              0.82        0.33
8       40         0.48      0.34              1.41        0.52
9       40         0.48      0.34              3.60        2.55
10      40         0.50      0.36              2.66        1.50

Mathematics, Operational
2       32         0.74      0.43              0.62        0.14
3       53         0.66      0.45              0.92        0.09
4       54         0.63      0.43              0.51        0.12
5       53         0.67      0.43              0.42        0.13
6       54         0.58      0.43              0.46        0.10
7       52         0.58      0.41              0.70        0.24
8       54         0.51      0.39              0.81        0.34
10      54         0.48      0.38              2.10        0.93

Mathematics, Field Test
2       36         0.69      0.43              0.74        0.17
3       32         0.63      0.45              1.17        0.20
4       32         0.53      0.43              1.27        0.23
5       32         0.44      0.37              0.86        0.18
6       32         0.45      0.37              0.73        0.14
7       32         0.41      0.33              1.26        0.29
8       32         0.38      0.33              1.35        0.37
10      32         0.37      0.33              3.50        1.06

Science/Biology, Operational
5             50   0.47      0.35              0.74        0.38
8             50   0.41      0.33              1.27        0.43
High School   50   0.41      0.29              1.76        0.92

Science/Biology, Field Test
5             28   0.47      0.34              0.90        0.27
8             28   0.35      0.26              2.05        0.29
High School   28   0.35      0.25              2.70        0.62

Composition, Operational (no Composition field test statistics: N/A)
4       8          0.48      0.60*             1.18        1.18
7       8          0.56      0.68*             1.75        1.75
10      8          0.53      0.61*             5.80        5.80

*Item-total correlations for Composition include the Reading items along with which the Composition prompts were scaled.


Table 17. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response
Items: Reading

                                    % of Agreement                Checkset Average
Grade   Form   Item   Score        Perfect   Adjacent   Perfect   Agreement
               No.    Points                            +Adjacent Percentages
2       1-2    8      0-3          68        27         94        93
2       1-2    33     0-3          86        5          91        98
3       1-2    12     0-3          70        25         95        91
3       1-2    18     0-3          69        28         97        93
3       1-2    38     0-3          73        23         95        95
4       1-2    5      0-3          78        20         98        91
4       1-2    18     0-3          77        21         98        84
4       1-2    67     0-3          68        27         95        93
5       1-2    19     0-3          73        25         98        80
5       1-2    23     0-3          56        37         93        66
5       1-2    67     0-3          69        22         91        64
6       1-2    14     0-3          63        34         97        66
6       1-2    28     0-3          64        32         96        57
6       1-2    66     0-3          75        23         98        77
7       1-2    13     0-3          68        28         97        80
7       1-2    17     0-3          77        20         97        85
7       1-2    44     0-3          73        20         93        83
8       1-2    17     0-3          74        20         94        84
8       1-2    28     0-3          64        32         96        87
8       1-2    68     0-3          80        19         98        81
9       1-2    9      0-2          75        24         99        89
9       1-2    18     0-3          72        24         96        79
9       1-2    54     0-3          77        22         98        77
10      1-2    6      0-3          69        26         95        75
10      1-2    16     0-3          74        23         97        85
10      1-2    38     0-3          78        21         99        92

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 18. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response
Items: Mathematics

                                    % of Agreement                Checkset Average
Grade   Form   Item   Score        Perfect   Adjacent   Perfect   Agreement
               No.    Points                            +Adjacent Percentages
2       1-2    6      0-2          88        12         100       96
2       1-2    26     0-2          97        3          100       97
3       1-2    6      0-3          81        18         99        93
3       1-2    25     0-3          91        7          98        98
3       1-2    60     0-3          89        11         100       96
4       1-2    6      0-3          94        6          100       96
4       1-2    25     0-3          94        6          100       92
4       1-2    60     0-3          97        3          100       98
5       1-2    6      0-3          89        9          98        97
5       1-2    25     0-3          97        3          100       100
5       1-2    60     0-3          89        10         99        98
6       1-2    6      0-3          90        9          100       99
6       1-2    25     0-3          95        5          100       98
6       1-2    60     0-3          95        5          99        95
7       1-2    6      0-3          87        12         99        90
7       1-2    25     0-3          79        20         99        94
7       1-2    60     0-3          94        4          99        95
8       1-2    6      0-3          90        10         99        100
8       1-2    25     0-3          94        5          99        96
8       1-2    60     0-3          83        15         98        88
10      1-2    6      0-3          79        19         98        84
10      1-2    25     0-3          95        4          99        96
10      1-2    60     0-3          87        12         99        93

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 19. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response
Items: Science/Biology

                                         % of Agreement                Checkset Average
Grade         Form   Item   Score       Perfect   Adjacent   Perfect   Agreement
                     No.    Points                           +Adjacent Percentages
5             1-2    13     0-2         92        8          100       95
5             1-2    27     0-2         80        19         99        89
5             1-2    51     0-2         94        4          99        94
8             1-2    13     0-2         91        6          97        85
8             1-2    27     0-2         89        10         99        94
8             1-2    51     0-2         81        17         98        83
High School   1-2    13     0-2         85        14         98        86
High School   1-2    27     0-2         86        13         99        93
High School   1-2    51     0-2         81        18         100       86

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 20. DC CAS 2012 Operational Inter-Rater Agreement for Constructed Response
Items: Composition

                                    % of Agreement                Checkset Average
Grade   Form   Item   Score        Perfect   Adjacent   Perfect   Agreement
               No.    Points                            +Adjacent Percentages
4       1      1A     1-6          50        47         97        60
4       1      1B     1-4          58        41         99        77
4       1      1C     1-4          57        37         94        72
4       2      1A     1-6          48        38         86        73
4       2      1B     1-4          58        38         96        76
4       2      1C     1-4          58        33         92        76
4       3      1A     1-6          57        38         95        65
4       3      1B     1-4          67        31         98        70
4       3      1C     1-4          68        30         97        71
4       4      1A     1-6          55        40         95        78
4       4      1B     1-4          61        39         100       82
4       4      1C     1-4          58        40         98        76
7       1      1A     1-6          57        39         96        74
7       1      1B     1-4          62        38         100       79
7       1      1C     1-4          68        32         100       75
7       2      1A     1-6          51        39         90        67
7       2      1B     1-4          64        33         97        76
7       2      1C     1-4          61        34         95        78
7       3      1A     1-6          55        42         97        69
7       3      1B     1-4          59        41         99        77
7       3      1C     1-4          57        37         94        74
7       4      1A     1-6          62        36         98        79
7       4      1B     1-4          61        38         99        79
7       4      1C     1-4          64        33         97        83
10      1      1A     1-6          49        36         85        76
10      1      1B     1-4          42        53         95        79
10      1      1C     1-4          54        29         83        79
10      2      1A     1-6          51        38         90        75
10      2      1B     1-4          65        31         95        79
10      2      1C     1-4          62        26         88        79
10      3      1A     1-6          52        35         87        78
10      3      1B     1-4          74        25         98        86
10      3      1C     1-4          87        8          95        93
10      4      1A     1-6          49        40         89        82
10      4      1B     1-4          61        34         95        84
10      4      1C     1-4          52        38         90        86

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 21. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response
Items: Reading

                                    % of Agreement                Checkset Average
Grade   Form   Item   Score        Perfect   Adjacent   Perfect   Agreement
               No.    Points                            +Adjacent Percentages
2       1      49     0-3          70        25         95        85
2       2      49     0-3          82        14         97        85
3       1      33     0-3          66        31         97        95
3       1      59     0-3          74        22         95        82
3       2      33     0-3          73        25         98        77
3       2      59     0-3          76        23         98        86
4       1      39     0-3          74        25         99        86
4       1      62     0-3          65        31         96        70
4       2      39     0-3          72        24         96        69
4       2      62     0-3          71        26         97        89
5       1      36     0-3          68        29         97        64
5       1      60     0-3          57        36         94        72
5       2      36     0-3          77        20         98        71
5       2      60     0-3          69        24         93        64
6       1      36     0-3          65        31         96        75
6       1      61     0-3          74        21         95        77
6       2      36     0-3          67        30         97        78
6       2      61     0-3          81        15         97        83
7       1      33     0-3          60        34         94        76
7       1      60     0-3          71        25         96        72
7       2      33     0-3          57        33         90        72
7       2      60     0-3          60        34         95        72
8       1      38     0-3          58        39         97        82
8       1      62     0-3          57        38         95        83
8       2      38     0-3          63        33         96        77
8       2      62     0-3          84        14         98        89
9       1      37     0-3          71        28         99        89
9       2      64     0-3          78        19         97        80
10      1      32     0-3          59        36         95        82
10      1      60     0-3          72        20         92        87
10      2      32     0-3          60        36         96        86
10      2      60     0-3          57        38         95        82

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 22. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response
Items: Mathematics

                                    % of Agreement                Checkset Average
Grade   Form   Item   Score        Perfect   Adjacent   Perfect   Agreement
               No.    Points                            +Adjacent Percentages
2       1      53     0-3          89        10         99        91
2       2      53     0-3          87        12         99        96
3       1      32     0-3          91        8          99        96
3       1      49     0-3          84        15         99        93
3       2      32     0-3          92        6          98        98
3       2      49     0-3          91        7          98        98
4       1      32     0-3          92        7          99        92
4       1      49     0-3          93        7          99        96
4       2      32     0-3          91        9          100       96
4       2      49     0-3          96        4          100       97
5       1      32     0-3          88        11         99        89
5       1      49     0-3          97        2          99        96
5       2      32     0-3          82        17         99        95
5       2      49     0-3          93        8          100       98
6       1      32     0-3          91        9          100       91
6       1      49     0-3          77        19         96        88
6       2      32     0-3          90        8          98        93
6       2      49     0-3          89        11         100       84
7       1      32     0-3          96        2          98        95
7       1      49     0-3          83        15         98        87
7       2      32     0-3          95        5          99        93
7       2      49     0-3          79        18         98        94
8       1      32     0-3          88        12         100       97
8       1      49     0-3          86        12         98        91
8       2      32     0-3          91        8          98        92
8       2      49     0-3          83        16         99        92
10      1      32     0-3          89        6          95        93
10      1      49     0-3          93        4          97        90
10      2      32     0-3          92        6          98        85
10      2      49     0-3          95        3          98        98

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 23. DC CAS 2012 Field Test Inter-Rater Agreement for Constructed Response Items:
Science/Biology

                                         % of Agreement                Checkset Average
Grade         Form   Item   Score       Perfect   Adjacent   Perfect   Agreement
                     No.    Points                           +Adjacent Percentages
5             1      17     0-2         93        6          99        98
5             1      41     0-2         94        5          99        92
5             2      17     0-2         73        24         97        82
5             2      41     0-2         83        16         100       87
8             1      17     0-2         92        6          98        85
8             1      41     0-2         87        12         99        92
8             2      17     0-2         95        4          99        99
8             2      41     0-2         97        2          99        99
High School   1      17     0-2         93        6          99        87
High School   1      41     0-2         76        22         98        78
High School   2      17     0-2         81        18         99        84
High School   2      41     0-2         83        17         99        88

Note: Perfect + Adjacent agreement percentages may not equal the sum of Perfect and Adjacent percentages
due to rounding. Checkset average agreement percentages are calculated across all checksets and raters.


Table 24. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel
Procedure: Reading

Reference Group   Focal Group        A     B     B-    C     C-
Grade 2 (total 35 items)
Male              Female             35    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   35    0     0     0     0
White             Hispanic           23    8     1     3     0
Grade 3 (total 48 items)
Male              Female             48    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   47    0     1     0     0
White             Hispanic           35    7     1     5     0
Grade 4 (total 48 items)
Male              Female             47    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   46    2     0     0     0
White             Hispanic           32    9     1     6     0
Grade 5 (total 48 items)
Male              Female             46    2     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   47    1     0     0     0
White             Hispanic           37    5     0     6     0
Grade 6 (total 48 items)
Male              Female             44    2     2     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   47    1     0     0     0
White             Hispanic           38    1     1     8     0
Grade 7 (total 48 items)
Male              Female             44    2     1     0     1
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   45    0     2     1     0
White             Hispanic           29    9     2     8     0
Grade 8 (total 48 items)
Male              Female             46    2     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   45    1     2     0     0
White             Hispanic           31    6     0     10    0
Grade 9 (total 48 items)
Male              Female             45    1     0     1     1
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   45    0     3     0     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A
Grade 10 (total 48 items)
Male              Female             47    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   45    2     1     0     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 25. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel
Procedure: Mathematics

Reference Group   Focal Group        A     B     B-    C     C-
Grade 2 (total 32 items)
Male              Female             31    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   30    1     1     0     0
White             Hispanic           23    4     0     5     0
Grade 3 (total 53 items)
Male              Female             50    1     2     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   53    0     0     0     0
White             Hispanic           37    6     5     5     0
Grade 4 (total 54 items)
Male              Female             51    2     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   53    0     0     1     0
White             Hispanic           39    3     2     10    0
Grade 5 (total 53 items)
Male              Female             49    4     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   52    1     0     0     0
White             Hispanic           41    2     3     6     1
Grade 6 (total 54 items)
Male              Female             53    0     0     0     1
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   51    2     0     1     0
White             Hispanic           37    3     3     8     3
Grade 7 (total 52 items)
Male              Female             50    1     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   51    0     1     0     0
White             Hispanic           41    5     0     4     2
Grade 8 (total 54 items)
Male              Female             53    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   50    2     2     0     0
White             Hispanic           35    7     4     5     3
Grade 10 (total 54 items)
Male              Female             53    1     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   52    2     0     0     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 26. Numbers of Operational Items Flagged for DIF Using the Mantel-Haenszel
Procedure: Science/Biology

Reference Group   Focal Group        A     B     B-    C     C-
Grade 5 (total 50 items)
Male              Female             50    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   50    0     0     0     0
White             Hispanic           40    8     0     2     0
Grade 8 (total 50 items)
Male              Female             49    1     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   49    1     0     0     0
White             Hispanic           39    6     2     3     0
High School (total 50 items)
Male              Female             49    1     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   48    0     2     0     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 27. Numbers of Operational/Field Test Items Flagged for DIF Using the
Mantel-Haenszel Procedure: Composition

Reference Group   Focal Group        A     B     B-    C     C-
Grade 4 (total 8 items)
Male              Female             4     3     0     1     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   6     0     0     2     0
White             Hispanic           4     0     1     0     0
Grade 7 (total 8 items)
Male              Female             2     5     0     1     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   3     2     0     3     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A
Grade 10 (total 8 items)
Male              Female             1     0     0     1     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   N/A   N/A   N/A   N/A   N/A
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 28. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel
Procedure: Reading

Reference Group   Focal Group        A     B     B-    C     C-
Grade 2 (total 42 items)
Male              Female             42    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   40    1     0     0     1
White             Hispanic           17    9     0     16    0
Grade 3 (total 38 items)
Male              Female             37    1     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   35    1     2     0     0
White             Hispanic           31    4     0     2     1
Grade 4 (total 38 items)
Male              Female             37    0     0     1     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   37    0     1     0     0
White             Hispanic           27    4     0     7     0
Grade 5 (total 38 items)
Male              Female             36    2     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   38    0     0     0     0
White             Hispanic           29    6     0     3     0
Grade 6 (total 38 items)
Male              Female             38    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   37    0     1     0     0
White             Hispanic           26    4     0     7     0
Grade 7 (total 38 items)
Male              Female             38    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   38    0     0     0     0
White             Hispanic           26    6     0     6     0
Grade 8 (total 40 items)
Male              Female             36    3     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   38    0     1     1     0
White             Hispanic           16    1     1     1     1
Grade 9 (total 40 items)
Male              Female             38    2     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   35    4     0     0     1
White             Hispanic           N/A   N/A   N/A   N/A   N/A
Grade 10 (total 40 items)
Male              Female             35    4     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   36    2     1     0     1
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 29. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel
Procedure: Mathematics

Reference Group   Focal Group        A     B     B-    C     C-
Grade 2 (total 36 items)
Male              Female             35    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   35    1     0     0     0
White             Hispanic           25    3     2     5     1
Grade 3 (total 32 items)
Male              Female             31    0     0     1     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   32    0     0     0     0
White             Hispanic           25    1     0     3     2
Grade 4 (total 32 items)
Male              Female             30    0     2     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   32    0     0     0     0
White             Hispanic           19    3     2     7     1
Grade 5 (total 32 items)
Male              Female             31    0     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   28    3     1     0     0
White             Hispanic           15    2     0     15    0
Grade 6 (total 32 items)
Male              Female             29    1     1     0     1
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   28    2     2     0     0
White             Hispanic           26    0     2     3     1
Grade 7 (total 32 items)
Male              Female             32    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   27    5     0     0     0
White             Hispanic           23    3     0     6     0
Grade 8 (total 32 items)
Male              Female             32    0     0     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   32    0     0     0     0
White             Hispanic           21    5     1     4     1
Grade 10 (total 32 items)
Male              Female             30    1     1     0     0
White             Asian              N/A   N/A   N/A   N/A   N/A
White             African American   30    1     1     0     0
White             Hispanic           N/A   N/A   N/A   N/A   N/A

N/A = not applicable because case count requirements for the reference (400) and focal (200) groups were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 30. Numbers of Field Test Items Flagged for DIF Using the Mantel-Haenszel Procedure: Science/Biology

Reference Group/Focal Group    A    B   B-    C   C-

Grade 5 (total 28 items)
Male/Female                   28    0    0    0    0
African American/Asian       N/A  N/A  N/A  N/A  N/A
African American/Hispanic     27    1    0    0    0
African American/White        20    2    0    6    0

Grade 8 (total 28 items)
Male/Female                   26    0    2    0    0
African American/Asian       N/A  N/A  N/A  N/A  N/A
African American/Hispanic     28    0    0    0    0
African American/White        22    3    0    3    0

High School (total 28 items)
Male/Female                   28    0    0    0    0
African American/Asian       N/A  N/A  N/A  N/A  N/A
African American/Hispanic     27    1    0    0    0
African American/White       N/A  N/A  N/A  N/A  N/A

N/A = not applicable: case count requirements for the reference group (400) and focal group (200) were not met.
See Table 5 for the numbers of examinees in each grade and subgroup.


Table 31. Total Test Scale and Raw Score Means and Reliability Statistics

Grade  Students with  Number    Alpha  Stratified  Feldt-  Scale Score      Raw Score
       Test Scores    of Items         Alpha       Raju    Mean     SD      Mean   SD
Reading
2      4,469          35        0.88   0.88        0.88    241.97   15.78   23.25   7.82
3      4,737          48        0.93   0.94        0.94    348.65   15.37   33.48  11.60
4      4,559          48        0.92   0.92        0.92    452.42   15.09   31.82  11.15
5      4,734          48        0.92   0.92        0.92    553.75   15.09   32.95  10.73
6      4,539          48        0.91   0.92        0.92    650.16   14.20   33.13  10.96
7      4,283          48        0.90   0.91        0.90    754.13   14.25   33.03  10.33
8      4,337          48        0.90   0.91        0.91    853.86   14.32   30.26  10.42
9      3,534          48        0.92   0.93        0.93    947.17   16.94   28.27  11.34
10     4,230          48        0.92   0.92        0.92    951.32   15.45   30.94  11.23
Mathematics
2      4,499          32        0.89   0.89        0.90    253.94   15.27   37.03  10.68
3      4,771          53        0.93   0.94        0.94    352.25   17.70   37.04  12.53
4      4,590          54        0.93   0.93        0.93    456.65   15.75   36.73  12.29
5      4,747          53        0.93   0.93        0.93    557.66   16.67   39.20  12.30
6      4,551          54        0.93   0.94        0.94    651.21   17.11   33.02  13.01
7      4,297          52        0.92   0.92        0.93    753.33   17.49   31.49  11.84
8      4,341          54        0.92   0.92        0.92    850.23   16.59   29.39  11.75
10     3,466          54        0.91   0.92        0.92    946.80   18.80   27.28  11.83
Science/Biology
5      4,697          50        0.89   0.89        0.89    548.40   13.28   24.55   9.69
8      4,253          50        0.88   0.88        0.88    848.66   17.76   21.00   9.49
HS     3,693          50        0.85   0.86        0.86    947.91   14.76   21.34   8.62
Composition*
4      4,508          8         0.92   0.92        0.93    451.73   18.87    4.51   1.91
7      4,176          8         0.91   0.90        0.92    754.33   15.76    5.27   1.98
10     3,429          8         0.92   0.92        0.93    952.18   20.11    4.76   2.17

*8 items = 4 prompts scored twice with two Writing rubrics
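Table 31 reports three internal-consistency estimates. For readers who want to verify the arithmetic, the minimal sketch below computes coefficient alpha from an examinee-by-item score matrix, and stratified alpha from strand-level raw-score variances and strand alphas. Both formulas are standard; the function names and input conventions are ours, not the report's.

    from statistics import pvariance

    def coefficient_alpha(item_scores):
        # item_scores: one row per examinee, one column per item (k > 1).
        k = len(item_scores[0])
        totals = [sum(row) for row in item_scores]
        item_var_sum = sum(pvariance([row[i] for row in item_scores])
                           for i in range(k))
        return k / (k - 1) * (1 - item_var_sum / pvariance(totals))

    def stratified_alpha(strand_variances, strand_alphas, total_variance):
        # 1 minus the sum over strands of var_i * (1 - alpha_i),
        # divided by the total raw-score variance.
        error = sum(v * (1 - a)
                    for v, a in zip(strand_variances, strand_alphas))
        return 1 - error / total_variance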


Table 32. Coefficient Alpha Reliability for Reading Strand Scores

Grade  Content Strand                  Number    Mean     Standard   Reliability
                                       of Items  p Value  Deviation
2      1 Vocabulary Acquisition & Use   7        0.70     0.15       0.58
       3 Reading Informational Text    14        0.54     0.15       0.76
       4 Reading Literary Text         14        0.71     0.18       0.74
       Total items on DC CAS           35
3      1 Vocabulary Acquisition & Use   8        0.72     0.13       0.71
       3 Reading Informational Text    19        0.56     0.14       0.83
       4 Reading Literary Text         21        0.70     0.14       0.87
       Total items on DC CAS           48
4      1 Vocabulary Acquisition & Use   8        0.68     0.13       0.64
       3 Reading Informational Text    18        0.58     0.13       0.82
       4 Reading Literary Text         22        0.62     0.12       0.83
       Total items on DC CAS           48
5      1 Vocabulary Acquisition & Use   8        0.67     0.12       0.63
       3 Reading Informational Text    18        0.59     0.17       0.78
       4 Reading Literary Text         22        0.70     0.12       0.86
       Total items on DC CAS           48
6      1 Vocabulary Acquisition & Use   9        0.63     0.15       0.69
       3 Reading Informational Text    17        0.60     0.15       0.80
       4 Reading Literary Text         22        0.68     0.14       0.82
       Total items on DC CAS           48
7      1 Vocabulary Acquisition & Use   8        0.62     0.12       0.67
       3 Reading Informational Text    20        0.62     0.14       0.78
       4 Reading Literary Text         20        0.64     0.10       0.79
       Total items on DC CAS           48
8      1 Vocabulary Acquisition & Use   7        0.65     0.09       0.58
       3 Reading Informational Text    22        0.58     0.11       0.83
       4 Reading Literary Text         19        0.58     0.22       0.76
       Total items on DC CAS           48
9      1 Vocabulary Acquisition & Use   8        0.58     0.20       0.58
       3 Reading Informational Text    23        0.58     0.14       0.87
       4 Reading Literary Text         17        0.58     0.14       0.82
       Total items on DC CAS           48
10     1 Vocabulary Acquisition & Use   9        0.69     0.09       0.69
       3 Reading Informational Text    20        0.61     0.11       0.84
       4 Reading Literary Text         19        0.58     0.15       0.80
       Total items on DC CAS           48


Table 33. Coefficient Alpha Reliability for Mathematics Strand Scores

Grade  Content Strand                             Number    Mean     Standard   Reliability
                                                  of Items  p Value  Deviation
2      1 Operations & Algebraic Thinking           8        0.67     0.16       0.74
       2 Numbers & Operations Base Ten             9        0.80     0.10       0.70
       3 Geometry                                  4        0.83     0.03       0.41
       4 Measurement and Data                     11        0.70     0.14       0.76
       Total items on DC CAS                      32
3      1 Number Sense & Operations                17        0.68     0.17       0.82
       2 Patterns, Relations & Algebra             9        0.72     0.16       0.75
       3 Geometry                                  5        0.65     0.25       0.46
       4 Measurement                              11        0.51     0.13       0.80
       5 Data Analysis, Statistics & Probability  11        0.71     0.12       0.76
       Total items on DC CAS                      53
4      1 Number Sense & Operations                23        0.68     0.14       0.88
       2 Patterns, Relations & Algebra             8        0.65     0.16       0.66
       3 Geometry                                  5        0.64     0.27       0.49
       4 Measurement                               7        0.45     0.12       0.54
       5 Data Analysis, Statistics & Probability  11        0.65     0.17       0.71
       Total items on DC CAS                      54
5      1 Number Sense & Operations                20        0.70     0.14       0.82
       2 Patterns, Relations & Algebra            11        0.72     0.12       0.75
       3 Geometry                                  6        0.64     0.20       0.53
       4 Measurement                               9        0.57     0.11       0.69
       5 Data Analysis, Statistics & Probability   7        0.68     0.11       0.67
       Total items on DC CAS                      53
6      1 Number Sense & Operations                16        0.65     0.14       0.80
       2 Patterns, Relations & Algebra            14        0.49     0.13       0.79
       3 Geometry                                  8        0.63     0.11       0.53
       4 Measurement                               6        0.53     0.15       0.66
       5 Data Analysis, Statistics & Probability  10        0.57     0.12       0.74
       Total items on DC CAS                      54
7      1 Number Sense & Operations                17        0.57     0.14       0.82
       2 Patterns, Relations & Algebra            13        0.59     0.12       0.74
       3 Geometry                                  7        0.55     0.20       0.50
       4 Measurement                               7        0.53     0.11       0.70
       5 Data Analysis, Statistics & Probability   8        0.67     0.14       0.62
       Total items on DC CAS                      52
8      1 Number Sense & Operations                16        0.51     0.16       0.76
       2 Patterns, Relations & Algebra            19        0.53     0.08       0.81
       3 Geometry                                  6        0.48     0.09       0.48
       4 Measurement                               4        0.36     0.15       0.41
       5 Data Analysis, Statistics & Probability   9        0.55     0.16       0.67
       Total items on DC CAS                      54
10     1 Number Sense & Operations                11        0.55     0.18       0.69
       2 Patterns, Relations & Algebra            19        0.45     0.11       0.84
       3 Geometry                                  7        0.53     0.10       0.58
       4 Measurement                               7        0.40     0.11       0.40
       5 Data Analysis, Statistics & Probability  10        0.50     0.17       0.62
       Total items on DC CAS                      54


Table 34. Coefficient Alpha Reliability for Science/Biology Strand Scores

Grade  Content Strand                  Number    Mean     Standard   Reliability
                                       of Items  p Value  Deviation
5      1 Science and Technology        15        0.43     0.15       0.71
       2 Earth and Space Science       13        0.51     0.19       0.66
       3 Physical Science              10        0.49     0.10       0.66
       4 Life Science                  12        0.47     0.13       0.65
       Total items on DC CAS           50
8      1 Scientific Thinking and Inquiry  7      0.42     0.11       0.60
       2 Matter and Reactions          22        0.40     0.11       0.75
       3 Forces                         9        0.45     0.09       0.63
       4 Energy and Waves              12        0.39     0.10       0.56
       Total items on DC CAS           50
High   1 Cell Biology & Biochemistry   14        0.38     0.13       0.58
School 2 Genetics and Evolution        15        0.40     0.10       0.65
       3 Multicellular Organisms       11        0.47     0.13       0.61
       4 Ecosystems                    10        0.40     0.12       0.54
       Total items on DC CAS           50


Table 35. Coefficient Alpha Reliability for Composition Strand Scores

Grade  Content Strand                Items Across  Students with Test   Mean Raw  SD of      Correlation Between
                                     Four Forms    Scores (Four Forms)  Score     Raw Score  the Two Strands
4      Writing Topic Development     4             4,508                2.33      1.16       0.80
       Writing Language Conventions  4             4,508                2.18      0.85
7      Writing Topic Development     4             4,176                2.75      1.18       0.84
       Writing Language Conventions  4             4,176                2.52      0.89
10     Writing Topic Development     4             3,429                2.34      1.29       0.77
       Writing Language Conventions  4             3,429                2.42      1.02


Table 36. DC CAS 2012 Reading Strand Correlations by Grade

Strands: 1 = Vocabulary Acquisition & Use; 3 = Reading Informational Text; 4 = Reading Literary Text. Entries are correlations between strand raw scores and between each strand and the total Reading raw score; the matrix is symmetric.

Grade  1-3   1-4   3-4   1-Total  3-Total  4-Total
2      0.61  0.59  0.73  0.77     0.93     0.90
3      0.75  0.78  0.81  0.86     0.94     0.96
4      0.71  0.73  0.82  0.82     0.94     0.96
5      0.69  0.71  0.79  0.82     0.93     0.95
6      0.73  0.72  0.80  0.84     0.94     0.94
7      0.68  0.72  0.77  0.83     0.93     0.93
8      0.68  0.68  0.77  0.79     0.95     0.92
9      0.69  0.69  0.82  0.80     0.96     0.93
10     0.76  0.73  0.79  0.86     0.95     0.93


Table 37. DC CAS 2012 Mathematics Strand Correlations by Grade

Grade 2 strands: OAT = Operations & Algebraic Thinking; NBT = Numbers & Operations Base Ten; Geo = Geometry; M&D = Measurement & Data.

Grade 2          OAT    NBT    Geo   M&D    Total
OAT              --     0.70   N/A   0.72   0.90
NBT              0.70   --     N/A   0.68   0.86
Geo              N/A    N/A    --    N/A    N/A
M&D              0.72   0.68   N/A   --     0.90
Total Raw Score  0.90   0.86   N/A   0.90   --

Grades 3-10 strands: 1 = Number Sense & Operations; 2 = Patterns, Relations & Algebra; 3 = Geometry; 4 = Measurement; 5 = Data Analysis, Statistics & Probability.

Grade 3          1      2      3      4      5      Total
1                --     0.79   0.62   0.78   0.75   0.94
2                0.79   --     0.54   0.69   0.71   0.86
3                0.62   0.54   --     0.58   0.58   0.72
4                0.78   0.69   0.58   --     0.68   0.88
5                0.75   0.71   0.58   0.68   --     0.87
Total Raw Score  0.94   0.86   0.72   0.88   0.87   --

Grade 4          1      2      3      4      5      Total
1                --     0.79   0.65   0.61   0.75   0.95
2                0.79   --     0.60   0.56   0.69   0.87
3                0.65   0.60   --     0.50   0.61   0.75
4                0.61   0.56   0.50   --     0.53   0.72
5                0.75   0.69   0.61   0.53   --     0.86
Total Raw Score  0.95   0.87   0.75   0.72   0.86   --

Grade 5          1      2      3      4      5      Total
1                --     0.79   0.67   0.74   0.74   0.93
2                0.79   --     0.66   0.70   0.73   0.90
3                0.67   0.66   --     0.64   0.63   0.80
4                0.74   0.70   0.64   --     0.67   0.85
5                0.74   0.73   0.63   0.67   --     0.86
Total Raw Score  0.93   0.90   0.80   0.85   0.86   --


Table 37. DC CAS 2012 Mathematics Strand Correlations by Grade (continued)

Strands: 1 = Number Sense & Operations; 2 = Patterns, Relations & Algebra; 3 = Geometry; 4 = Measurement; 5 = Data Analysis, Statistics & Probability.

Grade 6          1      2      3      4      5      Total
1                --     0.79   0.64   0.74   0.77   0.93
2                0.79   --     0.60   0.73   0.74   0.91
3                0.64   0.60   --     0.57   0.58   0.74
4                0.74   0.73   0.57   --     0.69   0.85
5                0.77   0.74   0.58   0.69   --     0.87
Total Raw Score  0.93   0.91   0.74   0.85   0.87   --

Grade 7          1      2      3      4      5      Total
1                --     0.80   0.62   0.74   0.70   0.94
2                0.80   --     0.60   0.72   0.67   0.92
3                0.62   0.60   --     0.54   0.52   0.73
4                0.74   0.72   0.54   --     0.62   0.84
5                0.70   0.67   0.52   0.62   --     0.80
Total Raw Score  0.94   0.92   0.73   0.84   0.80   --

Grade 8          1      2      3      4      5      Total
1                --     0.77   0.58   0.56   0.69   0.90
2                0.77   --     0.62   0.57   0.72   0.94
3                0.58   0.62   --     0.46   0.57   0.75
4                0.56   0.57   0.46   --     0.51   0.68
5                0.69   0.72   0.57   0.51   --     0.84
Total Raw Score  0.90   0.94   0.75   0.68   0.84   --

Grade 10         1      2      3      4      5      Total
1                --     0.73   0.66   0.52   0.65   0.86
2                0.73   --     0.71   0.57   0.69   0.93
3                0.66   0.71   --     0.51   0.61   0.83
4                0.52   0.57   0.51   --     0.51   0.69
5                0.65   0.69   0.61   0.51   --     0.82
Total Raw Score  0.86   0.93   0.83   0.69   0.82   --


Table 38. DC CAS 2012 Science/Biology Strand Correlations by Grade

Grade 5 strands: ST = Science and Technology; ESS = Earth and Space Science; PS = Physical Science; LS = Life Science.

Grade 5          ST     ESS    PS     LS     Total
ST               --     0.66   0.68   0.68   0.89
ESS              0.66   --     0.65   0.65   0.86
PS               0.68   0.65   --     0.65   0.85
LS               0.68   0.65   0.65   --     0.86
Total Raw Score  0.89   0.86   0.85   0.86   --

Grade 8 strands: STI = Scientific Thinking and Inquiry; MR = Matter and Reactions; F = Forces; EW = Energy and Waves.

Grade 8          STI    MR     F      EW     Total
STI              --     0.64   0.63   0.54   0.80
MR               0.64   --     0.67   0.61   0.92
F                0.63   0.67   --     0.59   0.84
EW               0.54   0.61   0.59   --     0.79
Total Raw Score  0.80   0.92   0.84   0.79   --

High School strands: CBB = Cell Biology and Biochemistry; GE = Genetics and Evolution; MO = Multicellular Organisms; Eco = Ecosystems.

High School      CBB    GE     MO     Eco    Total
CBB              --     0.62   0.57   0.55   0.83
GE               0.62   --     0.59   0.59   0.86
MO               0.57   0.59   --     0.58   0.82
Eco              0.55   0.59   0.58   --     0.80
Total Raw Score  0.83   0.86   0.82   0.80   --


Table 39. DC CAS 2012 Composition Rubric Score Correlations by Grade

Grade  Topic Development-      Topic Development-  Language Conventions-
       Language Conventions    Total Composition   Total Composition
4      0.78                    0.96                0.92
7      0.81                    0.97                0.94
10     0.69                    0.95                0.89


Table 40. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Reading

Entries are scale score/SEM; * marks a proficiency-level scale score cut.

Raw  Gr 2    Gr 3    Gr 4    Gr 5    Gr 6    Gr 7    Gr 8    Gr 9    Gr 10
0    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
1    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
2    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
3    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
4    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
5    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
6    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
7    200/25  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
8    206/19  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
9    213/12  300/31  400/37  500/38  600/35  700/38  800/39  900/38  900/37
10   217/9   302/29  412/25  500/38  600/35  700/38  816/23  900/38  909/28
11   220/7   313/18  421/16  518/20  616/19  714/24  824/15  910/28  919/18
12   222/6   319/12  425/12  524/14  622/13  721/17  829/10  920/18  924/13
13   225/6   322/9   429/9   528/10  626/9   726/12  832/8   925/13  928/10
14   227/5   325/8   431/7   531/8   628/7   729/10  834/7   929/10  931/8
15   228/5   327/6   433/6   533/6   630/6   732/8   836/6   932*/8  933/7
16   230/5   329/6   435/6   535/6   632/5   734/7   838/6   934/7   935/6
17   232*/5  331/5   437/5   537/5   634/5   736/6   839/5   936/6   937/5
18   233/5   332/5   438/5   538/5   635/4   738/5   841*/5  938/5   938/5
19   235/4   334/4   439*/4  539/4   636/4   739*/5  842/5   939/5   940*/5
20   236/4   335/4   441/4   541*/4  638/4   741/5   844/4   941/4   941/4
21   238/4   336/4   442/4   542/4   639/4   742/4   845/4   942/4   942/4
22   239/4   337/4   443/4   543/4   640*/4  743/4   846/4   943/4   943/4
23   241/4   339*/4  444/4   544/4   641/3   744/4   847/4   944/4   944/4
24   243/4   340/3   445/4   545/3   642/3   745/4   848/4   945/4   945/4
25   244/4   341/3   446/3   546/3   643/3   747/4   849/4   946/3   946/4
26   246*/5  342/3   447/3   547/3   644/3   748/4   851/4   947/3   947/3
27   248/5   343/3   448/3   548/3   645/3   749/4   852/3   948/3   948/3

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)


Table 40. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Reading (continued)

Entries are scale score/SEM; * marks a proficiency-level scale score cut; a dot (.) indicates a raw score not attainable on that grade's test.

Raw  Gr 2     Gr 3    Gr 4    Gr 5    Gr 6    Gr 7    Gr 8    Gr 9    Gr 10
28   249/5    343/3   449/3   549/3   646/3   750/3   853/3   949/3   949/3
29   251/5    344/3   450/3   550/3   646/3   751/3   854/3   950*/3  950/3
30   253/5    345/3   451/3   551/3   647/3   752/3   854/3   951/3   951/3
31   255/5    346/3   452/3   552/3   648/3   753/3   855/3   952/3   952/3
32   258/6    347/3   453/3   552/3   649/3   753/3   856*/3  953/3   953/3
33   261/6    348/3   454/3   553/3   650/3   754/3   857/3   954/3   954/3
34   264*/7   349/3   455*/3  554/3   651/3   755/3   858/3   955/3   955/3
35   268/8    350/3   456/3   555/3   652/3   756*/3  859/3   956/3   956*/3
36   273/9    351/3   457/3   556*/3  653/3   757/3   860/3   957/3   957/3
37   280/10   352/3   458/3   558/3   654/3   758/3   862/3   958/3   958/3
38   290/16   353/3   459/3   559/3   655*/3  759/3   863/3   959/3   959/3
39   299/23   354*/3  460/3   560/4   656/3   761/3   864/4   960*/3  960/3
40   .        355/3   461/3   561/4   657/3   762/3   865/4   961/3   961/3
41   .        356/3   462/3   562/4   658/3   763/4   866/4   962/3   963/4
42   .        357/3   463/4   564/4   659/3   764/4   867/4   964/3   964/4
43   .        358/3   464/4   565/4   660/3   765/4   869/4   965/3   965/4
44   .        360/4   466/4   567/4   662/4   767/4   870*/4  966/4   967/4
45   .        361/4   467/4   569/5   663/4   768*/4  872/4   968/4   968/4
46   .        363/4   469/5   571/5   665/4   770/4   873/4   969/4   970*/5
47   .        365/4   471/5   573*/5  666/4   772/5   875/5   971/4   972/5
48   .        367/4   474*/6  576/6   668/4   774/5   877/5   974/5   974/5
49   .        369/4   477/7   579/6   670/4   776/6   880/5   976/5   977/6
50   .        371/5   481/8   582/7   672*/5  779/6   882/6   980/6   981/7
51   .        375*/6  486/10  587/8   676/6   783/8   886/7   985/8   985/8
52   .        379/8   494/13  594/11  680/8   789/10  892/10  993/12  992/11
53   .        388/12  499/15  599/13  689/13  799/16  899/14  999/16  999/13
54   .        399/19  499/15  599/13  699/21  799/16  899/14  .       999/13

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)
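To illustrate how the conversion tables are read, the sketch below looks up a few Grade 2 rows transcribed from Table 40 and reports the scale score with a one-SEM band relative to the Advanced cut (the third asterisked entry in that column, at raw score 34). The dictionary and names are illustrative conveniences, not part of the reporting system.

    # Rows transcribed from the Grade 2 column of Table 40:
    # raw score -> (scale score, SEM); 264 is the Advanced cut.
    GRADE2_READING = {33: (261, 6), 34: (264, 7), 35: (268, 8)}
    ADVANCED_CUT = 264

    def describe(raw_score):
        scale, sem = GRADE2_READING[raw_score]
        low, high = scale - sem, scale + sem
        side = "at or above" if scale >= ADVANCED_CUT else "below"
        return (f"raw {raw_score} -> scale {scale} (SEM {sem}, "
                f"band {low}-{high}), {side} the Advanced cut")

    print(describe(34))  # raw 34 -> scale 264 (SEM 7, band 257-271), ...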


Table 41. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Mathematics

Entries are scale score/SEM; * marks a proficiency-level scale score cut.

Raw  Gr 2    Gr 3    Gr 4    Gr 5    Gr 6    Gr 7    Gr 8    Gr 10
0    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
1    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
2    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
3    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
4    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
5    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
6    200/35  300/26  400/38  500/32  600/37  700/39  800/43  900/35
7    209/26  300/26  400/38  500/32  600/37  700/39  800/43  900/35
8    220/15  300/26  400/38  500/32  600/37  700/39  800/43  900/35
9    225/10  300/26  400/38  500/32  600/37  700/39  800/43  900/35
10   228/8   300/26  400/38  500/32  600/37  700/39  800/43  900/35
11   231/6   302/24  400/38  500/32  600/37  706/33  800/43  907/28
12   233/6   309/17  409/28  507/25  614/22  717/22  806/37  915/20
13   236/5   314/13  419/19  514/18  621/16  723/15  820/23  921/15
14   237/5   317/11  424/14  519/13  625/12  728/11  827/16  925/12
15   239/4   320/9   428/10  523/11  628/9   731/9   831/12  929/10
16   241/4   323/8   431/9   526/9   631/8   734/8   834/9   931/9
17   242/4   325/7   433/8   529/8   633/7   736*/7  837*/8  934*/8
18   244*/4  327/7   435/7   531/7   635/6   738/6   839/7   936/7
19   245/4   329/6   437/6   533/7   637*/6  740/6   841/6   938/7
20   247/3   331/6   439/6   535/6   639/5   741/5   842/5   940/6
21   248/3   333/6   440/5   537/6   640/5   743/5   844/5   942/6
22   249/3   334/5   442/5   538/5   641/5   744/5   845/5   943/5
23   251/3   336/5   443*/5  540/5   643/4   745/4   846/4   945/5
24   252/3   337/5   444/4   541/5   644/4   747/4   848/4   946/5
25   253/3   339/5   445/4   542/5   645/4   748/4   849/4   948/5
26   255*/3  340*/4  446/4   544*/4  646/4   749/4   850*/4  949/4
27   257/4   341/4   448/4   545/4   647/4   750/4   851/4   950/4

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)


Table 41. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Mathematics (continued)

Entries are scale score/SEM; * marks a proficiency-level scale score cut; a dot (.) indicates a raw score not attainable on that grade's test.

Raw  Gr 2     Gr 3    Gr 4    Gr 5    Gr 6    Gr 7    Gr 8    Gr 10
28   258/4    342/4   449/4   546/4   648/4   751/4   852/4   951*/4
29   261/4    344/4   450/4   547/4   649/3   752*/4  853/3   952/4
30   263/5    345/4   451/3   548/4   650/3   753/4   854/3   954/4
31   267/6    346/4   452/3   549/4   651/3   754/4   855/3   955/4
32   273*/9   347/4   453/3   550/4   652/3   755/3   855/3   956/4
33   299/35   348/4   454/3   551/4   653/3   756/3   856/3   957/4
34   .        349/4   454/3   552/4   654*/3  757/3   857/3   958/4
35   .        350/3   455/3   553/3   655/3   758/3   858/3   959/4
36   .        351/3   456/3   554/3   656/3   759/3   859/3   960/3
37   .        352/3   457/3   555/3   657/3   760/3   860/3   961/3
38   .        353/3   458*/3  556/3   657/3   761/3   861/3   962/3
39   .        354/3   459/3   557/3   658/3   762/3   861/3   963/3
40   .        355/3   460/3   558/3   659/3   763/3   862/3   964/3
41   .        356/3   461/3   559/3   660/3   764/3   863/3   965/3
42   .        357/3   462/3   560*/3  661/3   765/3   864/3   966/3
43   .        359/3   463/3   561/3   662/3   767/4   865/3   967/4
44   .        360*/3  464/3   562/3   663/3   768/4   866/3   968/4
45   .        361/4   465/3   563/3   664/3   769/4   867/3   969/4
46   .        362/4   466/3   564/3   665/3   771*/4  868*/3  970/4
47   .        363/4   467/3   565/3   666/3   772/4   869/3   972*/4
48   .        365/4   468/3   567/4   667/3   774/4   870/3   973/4
49   .        366/4   470/4   568/4   668*/3  776/5   871/3   974/4
50   .        368/4   471/4   569/4   669/3   778/5   873/4   976/4
51   .        369/4   472/4   571/4   671/4   781/6   874/4   978/5
52   .        371/5   474*/4  572/4   672/4   785/7   876/4   980/5
53   .        373/5   476/4   574/5   673/4   790/9   878/4   982/5
54   .        376*/5  477/4   576*/5  675/4   798/12  880/5   985/6
55   .        378/6   480/5   579/6   677/5   799/13  883/5   988/7

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)


Table 41. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Mathematics (continued)

Entries are scale score/SEM; a dot (.) indicates a raw score not attainable on that grade's test.

Raw  Gr 2  Gr 3    Gr 4    Gr 5    Gr 6    Gr 7    Gr 8    Gr 10
56   .     382/7   482/5   583/7   679/5   799/13  886/6   991/8
57   .     387/9   485/6   587/9   682/6   .       890/7   996/9
58   .     397/14  490/8   596/14  687/8   .       894/8   999/10
59   .     399/15  498/11  599/15  695/13  .       899/10  999/10
60   .     .       499/12  .       699/16  .       899/10  999/10

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)


Table 42. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Science/Biology

Entries are scale score/SEM; * marks a proficiency-level scale score cut.

Raw  Grade 5  Grade 8  High School
0    500/45   800/53   900/50
1    500/45   800/53   900/50
2    500/45   800/53   900/50
3    500/45   800/53   900/50
4    500/45   800/53   900/50
5    500/45   800/53   900/50
6    500/45   800/53   900/50
7    500/45   800/53   900/50
8    500/45   800/53   900/50
9    500/45   800/53   900/50
10   508/36   800/53   900/50
11   526/18   830/23   925/25
12   532/13   839/14   934/16
13   536/9    843/10   938/12
14   538/7    846/7    942/9
15   540/6    848/6    944/7
16   542*/5   849*/5   946*/6
17   543/4    851/4    948/5
18   545/4    852/4    949/4
19   546/4    853/4    950/4
20   547/4    854/3    951/4
21   548/3    855/3    952*/3
22   549/3    856*/3   953/3
23   550/3    857/3    954/3
24   551/3    857/3    955/3
25   552/3    858/2    956/3
26   552/3    859/2    956/2
27   553*/3   859/2    957/2
28   554/3    860/2    958/2
29   555/2    861/2    959/2
30   556/2    861/2    959/2
31   556/2    862/2    960/2
32   557/2    863/2    960/2
33   558/2    863/2    961/2
34   559/2    864/2    962/2
35   559/2    864/2    962/2
36   560/2    865/2    963/2
37   561/2    866/2    964/2
38   561/2    866/2    964/2
39   562/2    867/2    965/2
40   563/2    868*/2   965/2
41   564*/2   869/2    966*/2
42   565/2    869/2    967/2
43   566/2    870/3    968/2
44   567/3    871/3    969/2

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)

Table 42. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Science/Biology (continued)

Raw  Grade 5  Grade 8  High School
45   568/3    872/3    970/2
46   569/3    874/3    971/3
47   570/3    875/3    972/3
48   572/3    877/4    973/3
49   574/4    879/4    975/4
50   576/4    881/5    977/4
51   579/6    885/6    981/6
52   585/8    891/9    987/9
53   599/21   899/15   999/20


Table 43. DC CAS 2012 Number Correct to Scale Score Conversions with Associated Standard Errors of Measurement (SEM): Composition

Entries are scale score/SEM; * marks a proficiency-level scale score cut.

Prompt 1
Raw  Grade 4  Grade 7  Grade 10
0    400/17   700/30   900/29
1    414/12   722/8    919/12
2    428/11   730/7    928/10
3    442/10   736/7    934/9
4    454*/10  744*/7   941/9
5    464*/9   752/7    948*/10
6    472*/7   760*/7   956*/10
7    477/7    768*/7   964/11
8    482/7    775/7    974*/12
9    489/9    784/9    989/16
10   499/15   799/16   999/20

Prompt 2
Raw  Grade 4  Grade 7  Grade 10
0    400/15   700/29   900/29
1    407/13   723/6    922/11
2    423/12   730/5    931/9
3    437/11   737/5    939/9
4    449*/11  743/6    949*/9
5    459*/10  750*/5   957*/9
6    467/9    757*/6   964/8
7    474*/9   763/5    971*/9
8    482/9    769*/5   982/11
9    491/11   777/7    995/13
10   499/15   799/28   999/14

Prompt 3
Raw  Grade 4  Grade 7  Grade 10
0    400/18   700/32   900/26
1    402/18   724/8    920/11
2    419/15   732/7    930/9
3    433/13   738/7    941/11
4    444*/13  746*/7   953*/10
5    455/12   753/7    962*/9
6    464*/12  760*/7   968*/8
7    474*/12  767*/7   975/9
8    484/13   775/7    984/10
9    497/16   783/9    994/11
10   499/17   799/18   999/13

Prompt 4
Raw  Grade 4  Grade 7  Grade 10
0    400/16   700/19   900/27
1    410/12   716/11   920/12
2    425/11   729/10   930/10
3    438/10   740/9    938/10
4    447*/9   749*/8   946*/9
5    456*/9   757*/8   954/9
6    464/9    764/8    962*/10
7    472*/9   771*/8   971*/11
8    479/8    778/8    983/13
9    487/10   786/9    999/15
10   499/18   799/18   999/15

*Proficiency Level Scale Score cuts (Basic, Proficient, Advanced)


Table 44. DC CAS 2012 Percentages of Students at Each Performance Level

Spring 2011 Impact Data (percent of students at each performance level)
Content          Grade  N      Below Basic  Basic    Proficient  Advanced
Reading          2      --     --           --       --          --
                 3      4,796  22.29%       36.74%   37.82%      3.15%
                 4      4,841  18.65%       37.57%   35.98%      7.79%
                 5      4,797  15.13%       38.65%   39.13%      7.09%
                 6      4,403  15.47%       42.31%   37.52%      4.70%
                 7      4,456  11.11%       40.82%   35.19%      12.88%
                 8      4,327  13.73%       37.60%   36.54%      12.13%
                 9      2,891  18.13%       35.56%   22.73%      23.59%
                 10     4,491  18.77%       37.16%   33.22%      10.84%
Mathematics      2      --     --           --       --          --
                 3      4,823  22.79%       41.84%   24.26%      11.11%
                 4      4,873  18.67%       35.50%   34.95%      10.88%
                 5      4,817  18.37%       37.22%   32.61%      11.79%
                 6      4,433  15.11%       39.66%   31.81%      13.42%
                 7      4,485  14.47%       29.39%   43.41%      12.73%
                 8      4,370  13.57%       28.60%   46.59%      11.24%
                 10     4,464  23.97%       35.44%   34.68%      5.91%
Science/Biology1 5      4,765  20.29%       42.25%   31.21%      6.25%
                 8      4,223  34.48%       29.24%   32.02%      4.26%
                 10     3,790  32.22%       23.17%   42.14%      2.48%
Composition      4      4,755  10.91%       54.97%   25.95%      8.16%
                 7      4,301  5.98%        60.71%   27.44%      5.88%
                 10     3,761  12.28%       56.71%   22.63%      8.38%

Spring 2012 Impact Data (percent of students at each performance level)
Content          Grade  N      Below Basic  Basic    Proficient  Advanced
Reading          2      4,491  22.24%       33.44%   36.12%      8.19%
                 3      4,754  21.64%       38.16%   36.52%      3.68%
                 4      4,589  15.86%       35.93%   41.73%      6.47%
                 5      4,744  14.42%       38.26%   38.85%      8.47%
                 6      4,545  17.43%       42.22%   36.13%      4.22%
                 7      4,301  11.04%       39.64%   35.97%      13.35%
                 8      4,359  14.18%       38.15%   37.62%      10.05%
                 9      4,164  16.55%       41.50%   22.79%      19.16%
                 10     4,272  18.38%       39.79%   32.44%      9.39%
Mathematics      2      4,514  20.36%       32.19%   36.89%      10.57%
                 3      4,781  21.31%       42.25%   27.25%      9.18%
                 4      4,603  15.75%       33.67%   37.98%      12.60%
                 5      4,759  17.08%       34.33%   36.25%      12.33%
                 6      4,567  16.33%       35.84%   32.95%      14.87%
                 7      4,325  12.79%       29.36%   43.56%      14.29%
                 8      4,381  14.13%       29.35%   45.04%      11.48%
                 10     4,245  22.00%       36.35%   34.65%      7.00%
Science/Biology1 5      4,707  18.89%       42.91%   30.74%      7.46%
                 8      4,263  35.05%       25.17%   34.37%      5.42%
                 10     3,715  28.34%       26.59%   41.48%      3.58%
Composition      4      4,470  26.60%       31.99%   23.76%      17.65%
                 7      4,146  18.07%       28.87%   32.34%      20.72%
                 10     3,511  28.60%       25.43%   23.84%      22.13%

Note: Total percentages for a grade may not sum to 100 due to rounding.
1 Biology is administered to students in Grades 8-12, the grade in which they elect to take the Biology course.


Table 45. Classification Consistency and Accuracy Rates by Grade and Cut Score: Reading

Grade 2                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.90   0.86        0.92      0.69
Consistency Kappa              0.72   0.72        0.56      0.56
Classification Accuracy        0.93   0.90        0.94      0.78
False Positive Errors          0.03   0.04        0.01      0.08
False Negative Errors          0.03   0.06        0.05      0.14

Grade 3                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.90        0.96      0.79
Consistency Kappa              0.78   0.79        0.58      0.68
Classification Accuracy        0.95   0.92        0.97      0.85
False Positive Errors          0.02   0.02        0.01      0.05
False Negative Errors          0.03   0.05        0.02      0.10

Grade 4                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.89        0.94      0.77
Consistency Kappa              0.74   0.78        0.60      0.65
Classification Accuracy        0.95   0.92        0.96      0.83
False Positive Errors          0.02   0.02        0.01      0.06
False Negative Errors          0.03   0.06        0.03      0.11

Grade 5                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.94   0.88        0.93      0.75
Consistency Kappa              0.75   0.77        0.61      0.64
Classification Accuracy        0.96   0.92        0.95      0.82
False Positive Errors          0.02   0.04        0.01      0.07
False Negative Errors          0.02   0.05        0.04      0.11

Grade 6                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.89        0.96      0.78
Consistency Kappa              0.76   0.77        0.61      0.67
Classification Accuracy        0.95   0.92        0.97      0.83
False Positive Errors          0.02   0.03        0.01      0.06
False Negative Errors          0.03   0.05        0.02      0.11

Grade 7                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.94   0.86        0.91      0.71
Consistency Kappa              0.70   0.72        0.65      0.59
Classification Accuracy        0.96   0.90        0.94      0.80
False Positive Errors          0.02   0.05        0.02      0.09
False Negative Errors          0.02   0.05        0.04      0.11

Grade 8                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.87        0.93      0.73
Consistency Kappa              0.70   0.74        0.67      0.61
Classification Accuracy        0.95   0.91        0.95      0.81
False Positive Errors          0.03   0.05        0.02      0.10
False Negative Errors          0.02   0.05        0.04      0.10

Grade 9                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.91   0.90        0.92      0.73
Consistency Kappa              0.64   0.80        0.77      0.63
Classification Accuracy        0.94   0.93        0.94      0.81
False Positive Errors          0.03   0.02        0.02      0.07
False Negative Errors          0.03   0.04        0.04      0.11

Grade 10                       Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.92   0.90        0.93      0.76
Consistency Kappa              0.74   0.79        0.66      0.65
Classification Accuracy        0.94   0.93        0.95      0.82
False Positive Errors          0.02   0.03        0.01      0.06
False Negative Errors          0.04   0.04        0.04      0.12


Table 46. Classification Consistency and Accuracy Rates by Grade and Cut Score: Mathematics

Grade 2                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.91   0.87        0.93      0.72
Consistency Kappa              0.73   0.75        0.64      0.60
Classification Accuracy        0.94   0.91        0.93      0.78
False Positive Errors          0.03   0.04        0.04      0.10
False Negative Errors          0.03   0.05        0.03      0.12

Grade 3                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.91        0.94      0.78
Consistency Kappa              0.79   0.81        0.68      0.69
Classification Accuracy        0.95   0.93        0.95      0.83
False Positive Errors          0.03   0.02        0.01      0.05
False Negative Errors          0.03   0.05        0.04      0.11

Grade 4                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.93   0.91        0.94      0.77
Consistency Kappa              0.73   0.81        0.73      0.67
Classification Accuracy        0.95   0.93        0.95      0.83
False Positive Errors          0.02   0.03        0.02      0.06
False Negative Errors          0.03   0.05        0.03      0.11

Grade 5                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.94   0.91        0.92      0.77
Consistency Kappa              0.78   0.82        0.68      0.68
Classification Accuracy        0.95   0.93        0.95      0.83
False Positive Errors          0.02   0.03        0.02      0.07
False Negative Errors          0.03   0.03        0.04      0.10

Grade 6                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.91   0.91        0.94      0.76
Consistency Kappa              0.69   0.83        0.75      0.67
Classification Accuracy        0.94   0.93        0.95      0.82
False Positive Errors          0.03   0.02        0.02      0.07
False Negative Errors          0.03   0.05        0.03      0.10

Grade 7                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.92   0.95        0.96      0.82
Consistency Kappa              0.76   0.90        0.89      0.77
Classification Accuracy        0.90   0.96        0.89      0.75
False Positive Errors          0.10   0.02        0.00      0.12
False Negative Errors          0.01   0.01        0.11      0.13

Grade 8                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.89   0.88        0.95      0.73
Consistency Kappa              0.58   0.76        0.77      0.60
Classification Accuracy        0.92   0.91        0.96      0.80
False Positive Errors          0.04   0.04        0.01      0.09
False Negative Errors          0.04   0.05        0.03      0.11

Grade 10                       Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.88   0.89        0.96      0.74
Consistency Kappa              0.65   0.78        0.75      0.62
Classification Accuracy        0.91   0.92        0.97      0.81
False Positive Errors          0.04   0.03        0.01      0.08
False Negative Errors          0.05   0.05        0.02      0.12


Table 47. Classification Consistency and Accuracy Rates by Grade and Cut Score: Science/Biology

Grade 5                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.87   0.88        0.96      0.71
Consistency Kappa              0.60   0.76        0.70      0.58
Classification Accuracy        0.91   0.92        0.96      0.79
False Positive Errors          0.06   0.03        0.01      0.09
False Negative Errors          0.04   0.05        0.03      0.12

Grade 8                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.82   0.87        0.97      0.68
Consistency Kappa              0.60   0.74        0.73      0.54
Classification Accuracy        0.87   0.91        0.98      0.76
False Positive Errors          0.06   0.03        0.01      0.09
False Negative Errors          0.07   0.06        0.01      0.15

High School                    Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.81   0.83        0.98      0.65
Consistency Kappa              0.54   0.65        0.68      0.48
Classification Accuracy        0.86   0.88        0.98      0.73
False Positive Errors          0.06   0.04        0.00      0.10
False Negative Errors          0.08   0.08        0.02      0.17

Table 48. Classification Consistency and Accuracy Rates by Grade and Cut Score: Composition

Grade 4                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.81   0.78        0.85      0.53
Consistency Kappa              0.55   0.53        0.51      0.36
Classification Accuracy        0.87   0.84        0.89      0.63
False Positive Errors          0.09   0.07        0.05      0.19
False Negative Errors          0.04   0.09        0.06      0.18

Grade 7                        Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.86   0.81        0.84      0.58
Consistency Kappa              0.57   0.62        0.56      0.42
Classification Accuracy        0.91   0.87        0.89      0.68
False Positive Errors          0.06   0.08        0.04      0.16
False Negative Errors          0.03   0.05        0.08      0.15

Grade 10                       Basic  Proficient  Advanced  All Cuts
Classification Consistency     0.78   0.78        0.84      0.52
Consistency Kappa              0.46   0.55        0.57      0.34
Classification Accuracy        0.84   0.85        0.88      0.62
False Positive Errors          0.09   0.07        0.04      0.18
False Negative Errors          0.07   0.08        0.07      0.20
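The consistency and kappa values in Tables 45-48 are model-based estimates (Livingston & Lewis, 1995; computed with KKCLASS). The minimal sketch below shows only the agreement arithmetic these indices summarize, applied to a hypothetical joint distribution of performance-level classifications on two parallel forms; the input convention is ours.

    def consistency_and_kappa(joint):
        # joint[i][j]: proportion of examinees classified in level i on
        # one form and level j on a parallel form; entries sum to 1.
        n = len(joint)
        p_obs = sum(joint[i][i] for i in range(n))     # consistency
        rows = [sum(joint[i]) for i in range(n)]
        cols = [sum(joint[i][j] for i in range(n)) for j in range(n)]
        p_chance = sum(rows[i] * cols[i] for i in range(n))
        kappa = (p_obs - p_chance) / (1 - p_chance)    # chance-corrected
        return p_obs, kappa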


Table 49. Correlations Between Reading, Mathematics, Science/Biology, and Composition Total Test Raw Scores, by Grade

                  Mathematics  Science/Biology*  Composition
Reading
  Grade 2         0.72         --                --
  Grade 3         0.78         --                --
  Grade 4         0.80         --                0.57
  Grade 5         0.76         0.78              --
  Grade 6         0.78         --                --
  Grade 7         0.78         --                0.64
  Grade 8         0.77         0.74              --
  Grade 9         --           0.56              --
  Grade 10        0.74         0.65              0.58
Mathematics
  Grade 4         --           --                0.55
  Grade 5         --           0.72              --
  Grade 7         --           --                0.58
  Grade 8         --           0.79              --
  Grade 10        --           0.63              0.56
Science/Biology
  Grade 10        --           --                0.46

Note: -- = not applicable.
*In Biology all grades were used in the analyses, but only Grades 9 and 10 can be used for the correlations since the other grades are not in common with other content areas.


References
American Educational Research Association, American Psychological Association, & National
Council on Measurement in Education. (2009). Standards for educational and
psychological testing. Washington, DC: American Educational Research Association.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters:
An application of an EM algorithm. Psychometrika, 46, 443-459.
Burket, G. R. (1995). PARDUX (Version 1.7) [Computer program]. Unpublished.
Burket, G. R. (2000). ITEMWIN [Computer program]. Unpublished.
CTB/McGraw-Hill. (2011). District of Columbia Comprehensive Assessment System (DC CAS)
grade 9 reading bookmark standard setting technical report 2011. Monterey, CA:
Author.
CTB/McGraw-Hill. (2011). District of Columbia Public Schools (DCPS) grade 9 reading
technical report 2011. Monterey, CA: Author.
CTB/McGraw-Hill. (2011). District of Columbia Comprehensive Assessment System (DC CAS)
test chairperson's manual: Reading and mathematics, composition, science, and biology.
Monterey, CA: Author.
CTB/McGraw-Hill. (2011). District of Columbia Comprehensive Assessment System (DC CAS)
test directions: Reading and mathematics (grades 4--8 and 10), composition (grades 4,
7, and 10), science (grades 5 and 8), and biology. Monterey, CA: Author.
CTB/McGraw-Hill. (2012). District of Columbia Comprehensive Assessment System (DC CAS)
Standard Setting Technical Report for Grades 3--10 Reading, Grade 2 Reading and
Mathematics, and Grades 4, 7, and 10 Composition. Monterey, CA: Author.
Hambleton, R. K., & Novick, M. R. (1973). Toward an integration of theory and method for
criterion-referenced tests. Journal of Educational Measurement, 10, 159-170.
Jaeger, R. M. (1995). Setting standards for complex performances: An iterative, judgmental
policy-capturing strategy. Educational Measurement: Issues and Practice, 14(4): 16-20.
Kim, D. (2007). KKCLASS [Computer program]. Unpublished.
Kim, D., Barton, K., & Kim, X. (2008). Estimating Classification Consistency and Classification
Accuracy With Pattern Scoring. Paper presented at the annual meeting of the American
Educational Research Association, Chicago, IL.
Kim, D., Choi, S., Um, K., & Kim, J. (2006). A comparison of methods for estimating
classification consistency. Paper presented at the annual meeting of the National Council
on Measurement in Education, Montreal, Canada.
Kolen, M. J., & Kim, D. (2005). Personal correspondence.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical
data. Biometrics, 33, 159-174.
Lewis, D. M., Mitzel, H. C., & Green, D. R. (June 1996). Standard setting: A bookmark
approach. In D. R. Green (Chair), IRT-based standard setting procedures utilizing
behavioral anchoring. Symposium presented at the Council of Chief State School
Officers National Conference on Large-Scale Assessment. Phoenix, AZ.

Copyright (C) 2012 by the District of Columbia Office of the State Superintendent of Education

Technical Report for Spring 2012 Test Administration of DC CAS

100

Linn, R. L., & Harnisch, D. L. (1981). Interactions between item content and group membership
on achievement test items. Journal of Educational Measurement, 18(2), 109-118.
Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of
classifications based on test scores. Journal of Educational Measurement, 32, 179-197.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective
studies of disease. Journal of the National Cancer Institute, 22, 719-748.
Muraki, E., & Bock, R. D. (1991). PARSCALE: Parameter Scaling of Rating Data [Computer
program]. Chicago, IL: Scientific Software, Inc.
No Child Left Behind Act of 2001, Pub. L. No. 107--110, 115 Stat. 1425 (2002).
Perie, M. (2007, June). Setting alternate achievement standards. Dover, NH: National Center for
the Improvement of Educational Assessment. Retrieved January 11, 2008 from
http://www.nciea.org/publications/CCSSO_MAP07.pdf.
Roeber, E. (2002). Setting standards on alternate assessments (Synthesis Report 42).
Minneapolis, MN: National Center on Educational Outcomes. Retrieved January 11,
2008 from http://cehd.umn.edu/NCEO/OnlinePubs/Synthesis42.html.
Standards and Assessments Peer Review Guidance. (January 12, 2009). Retrieved December 7,
2010 from http://www.ed.gov/policy/elsec/guid/saaprguidance.pdf.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory.
Applied Psychological Measurement, 7, 201-210.
Swaminathan, H., Hambleton, R. K., & Algina, J. (1974). Reliability of criterion-referenced
tests: A decision-theoretic formulation. Journal of Educational Measurement, 11(4),
263-267.
Thissen, D. (1982). Marginal maximum-likelihood estimation for the one-parameter logistic
model. Psychometrika, 47, 175-186.
U.S. Department of Education. (2009, January). Standards and assessments peer review
guidance: Information and examples for meeting requirements of the No Child Left
Behind Act of 2001. Retrieved December 7, 2010, from
http://www.ed.gov/policy/elsec/guid/saaprguidance.pdf.
Yen, W. M. (1981). Using simulation results to choose a latent trait model. Applied Psychological
Measurement, 5, 245-262.
Zwick, R., Donoghue, J. R., & Grima, A. (1993). Assessment of differential item functioning for
performance tasks. Journal of Educational Measurement, 30, 233-251.


Appendix A: Checklist for DC Educator Review of DC CAS Items

A. Checklist for the Content Reviewer

For All Items:
Check to ensure that the content of each item:
• is targeted to assess only one strand or skill
• deals with material that is important in testing the targeted strand or skill
• uses grade-appropriate content and thinking skills
• is presented at a reading level suitable for the grade level being tested
• is accurate and documented against reliable, up-to-date sources

For Multiple Choice Items:
Check to ensure that the content of each item:
• has a stem that facilitates answering the question or completing the statement without
looking at the answer choices
• has a stem that does not present clues to the correct answer choice
• has answer choices that are plausible and attractive to the student who has not mastered the
strand or skill
• is conceptually, grammatically, and syntactically consistent--between the stem and answer
choices, and among the answer choices
• has mutually exclusive distractors
• has one and only one correct answer choice

For Constructed Response Items:
Check to ensure that the content of each item:
• is written so that a student possessing the knowledge or skill being tested can construct a
response that is scorable with the specified rubric or scoring tool; that is, the range of
possible correct responses must be wide enough to allow for diversity of responses, but
narrow enough so that students who do not clearly show their grasp of the strand or skill
being assessed cannot obtain the maximum score
• is presented without clues to the correct response
• has precise and unambiguous directions for the desired response
• is free of extraneous words or expressions
• is appropriate for the question being asked and the intended response (For example, the item
does not ask students to draw pictures of abstract ideas.)
• is conceptually, grammatically, and syntactically consistent


B. Checklist for the Sensitivity Reviewer

To have confidence in test results, it is important to ensure that students are given a reasonable
chance to do their best on the test. Test items must be accessible to a diverse student population
with respect to gender, race, ethnicity, geographic region, socioeconomic status, and other
factors.

Check to ensure that the content of each item is free of explicit references to or descriptions of:
• events involving extreme sadness or adversity
• acts of physical or psychological violence
• alcohol or drug abuse
• vulgar language
• sex

Check to ensure that if any religious, political, social, or philosophical issues are addressed:
• more than one point of view is expressed
• beliefs or biases do not interfere with factual accuracy
• contemporary issues that have already been proven to be controversial are absent
• stereotypic descriptions of beliefs or customs are absent

Test items must:
• be free of offensive, disturbing, or inappropriate language or content
• be free of stereotyping based on:
  gender, race, ethnicity, religion, socioeconomic status, age,
  regional or geographic area, disability, occupation
• demonstrate sensitivity to historical representation of groups
• be free of differential familiarity for any group based on:
  language, socioeconomic status, regional or geographic area,
  prior knowledge or experiences unrelated to the subject matter being tested


Appendix B: DC CAS Composition Scoring Rubrics

Topic/Idea Development

Score  Description
6      o Rich topic/idea development
       o Careful and/or subtle organization
       o Effective/rich use of language
5      o Full topic/idea development
       o Logical organization
       o Strong details
       o Appropriate use of language
4      o Moderate topic/idea development and organization
       o Adequate, relevant details
       o Some variety in language
3      o Rudimentary topic/idea development and/or organization
       o Basic supporting ideas
       o Simplistic language
2      o Limited or weak topic/idea development, organization, and/or details
       o Limited awareness of audience and/or task
1      o Limited topic/idea development, organization, and/or details
       o Little or no awareness of audience and/or task

Standard English Conventions

Score  Description
4      o Control of sentence structure, grammar and usage, and mechanics (length
         and complexity of essay provide opportunity for student to show control
         of standard English conventions)
3      o Errors do not interfere with communication and/or
       o Few errors relative to length of essay or complexity of sentence
         structure, grammar and usage, and mechanics
2      o Errors interfere somewhat with communication and/or
       o Too many errors relative to length of the essay or complexity of sentence
         structure, grammar and usage, and mechanics
1      o Errors seriously interfere with communication AND
       o Little control of sentence structure, grammar and usage, and mechanics


Understanding Literary or Informational Text

Score  Description
4      The response demonstrates an understanding of the complexities of the text.
       o Fully addresses the demands of the question or prompt
       o Effectively uses explicitly stated text as well as inferences drawn from
         the text to support an answer or claim
3      The response demonstrates an understanding of the text.
       o Addresses the demands of the question or prompt
       o Uses some explicitly stated text and/or some inferences drawn
         from the text to support an answer or claim
2      The response is incomplete or oversimplified and demonstrates a partial or
       literal understanding of the text.
       o Attempts to answer the question or address the prompt
       o Uses explicitly stated text that demonstrates some understanding
1      The response shows evidence of a minimal understanding of the text.
       o Shows evidence that some meaning has been derived from the text to answer
         the question
       o Has minimal textual evidence

Note: The Composition prompt will also be aligned to a Common Core Reading standard. Responses will
demonstrate degrees of mastery of that reading standard. Reading standards that the composition prompts will align
to may include:
o Grade 4: CC.4.R.I.1, CC.4.R.L.2, and CC.4.R.L.4 (see Reading tested standards)
o Grade 7: CC.7.R.I.1, CC.7.R.I.8, CC.7.R.L.1, and CC.7.R.L.2 (see Reading tested standards)
o Grade 10: CC.9-10.R.I.1, CC.9-10.R.I.2, CC.10.R.I.3, CC.9-10.R.L.2, and CC.9-10.R.L.6 (see Reading tested
standards)


Appendix C: Operational and Field Test Item Adjusted P Values

Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading

Reading Grade 2
Item  N      Max  Adj. P    Item  N      Max  Adj. P
1     4,457  1    0.93      25    4,320  1    0.50
2     4,393  1    0.88      26    4,411  1    0.90
3     4,416  1    0.85      27    4,403  1    0.86
4     4,399  1    0.61      28    3,812  1    0.71
5     4,448  1    0.71      29    4,257  1    0.78
6     4,432  1    0.74      30    4,252  1    0.29
7     4,397  1    0.53      31    4,246  1    0.53
8     4,319  3    0.31      32    4,207  1    0.34
9     4,438  1    0.40      33    4,272  1    0.72
10    4,419  1    0.44      34    4,285  1    0.78
11    4,412  1    0.51      35    4,280  1    0.79
12    4,404  1    0.75
13    4,444  1    0.87
14    4,418  1    0.66
15    4,345  1    0.56
16    4,080  1    0.68
17    4,397  1    0.41
18    4,430  1    0.50
19    4,419  1    0.62
20    4,373  1    0.64
21    4,420  1    0.61
22    4,373  1    0.71
23    4,310  3    0.45
24    4,387  1    0.82

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.
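A minimal sketch of the adjusted p value computation described in the note: the mean earned score divided by the maximum possible score, taken over examinees with valid responses only. The use of None to mark an examinee without a valid response is our convention for illustration.

    def adjusted_p(scores, max_points):
        # Keep only examinees with a valid response to this item.
        valid = [s for s in scores if s is not None]
        return sum(valid) / (len(valid) * max_points)

    # Example: a 3-point item where 4 of 5 examinees responded.
    print(round(adjusted_p([3, 1, None, 2, 0], 3), 2))  # 0.5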


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)

Reading Grade 3
Item  N      Max  Adj. P    Item  N      Max  Adj. P
1     4,726  1    0.88      25    4,708  1    0.74
2     4,720  1    0.79      26    4,691  1    0.70
3     4,727  1    0.71      27    4,598  3    0.44
4     4,733  1    0.85      28    4,711  1    0.48
5     4,718  1    0.74      29    4,696  1    0.34
6     4,732  1    0.78      30    4,690  1    0.45
7     4,724  1    0.79      31    4,691  1    0.25
8     4,703  1    0.46      32    4,703  1    0.54
9     4,707  1    0.63      33    4,704  1    0.91
10    4,714  1    0.60      34    4,701  1    0.76
11    4,702  1    0.53      35    4,696  1    0.65
12    4,554  3    0.43      36    4,693  1    0.60
13    4,702  1    0.50      37    4,686  1    0.62
14    4,625  1    0.70      38    4,707  1    0.72
15    4,691  1    0.58      39    4,696  1    0.54
16    4,688  1    0.73      40    4,698  1    0.71
17    4,688  1    0.72      41    4,670  1    0.76
18    4,537  3    0.46      42    4,662  1    0.52
19    4,717  1    0.79      43    4,609  1    0.73
20    4,714  1    0.81      44    4,580  1    0.79
21    4,706  1    0.74      45    4,673  1    0.58
22    4,704  1    0.90      46    4,564  1    0.58
23    4,715  1    0.80      47    4,664  1    0.52
24    4,718  1    0.63      48    4,662  1    0.77

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)

Reading Grade 4
Item  N      Max  Adj. P    Item  N      Max  Adj. P
1     4,556  1    0.67      25    4,551  1    0.73
2     4,556  1    0.49      26    4,545  1    0.66
3     4,558  1    0.65      27    4,542  1    0.54
4     4,533  1    0.57      28    4,550  1    0.62
5     4,421  3    0.42      29    4,554  1    0.47
6     4,549  1    0.52      30    4,552  1    0.73
7     4,544  1    0.71      31    4,553  1    0.66
8     4,542  1    0.40      32    4,549  1    0.51
9     4,539  1    0.78      33    4,553  1    0.51
10    4,538  1    0.79      34    4,550  1    0.77
11    4,535  1    0.77      35    4,552  1    0.72
12    4,535  1    0.69      36    4,549  1    0.58
13    4,529  1    0.76      37    4,548  1    0.59
14    4,526  1    0.67      38    4,537  1    0.65
15    4,516  1    0.72      39    4,548  1    0.75
16    4,503  1    0.52      40    4,544  1    0.52
17    4,479  1    0.90      41    4,540  1    0.33
18    4,376  3    0.45      42    4,539  1    0.55
19    4,555  1    0.66      43    4,535  1    0.51
20    4,554  1    0.66      44    4,510  1    0.60
21    4,553  1    0.54      45    4,516  1    0.79
22    4,554  1    0.59      46    4,510  1    0.58
23    4,547  1    0.48      47    4,493  1    0.77
24    4,551  1    0.61      48    4,456  3    0.39

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)

Reading Grade 5
Item  N      Max  Adj. P    Item  N      Max  Adj. P
1     4,730  1    0.82      25    4,724  1    0.73
2     4,733  1    0.73      26    4,723  1    0.70
3     4,733  1    0.83      27    4,720  1    0.65
4     4,731  1    0.75      28    4,722  1    0.54
5     4,727  1    0.73      29    4,718  1    0.62
6     4,733  1    0.87      30    4,720  1    0.64
7     4,731  1    0.53      31    4,721  1    0.86
8     4,729  1    0.77      32    4,718  1    0.61
9     4,720  1    0.50      33    4,720  1    0.84
10    4,723  1    0.68      34    4,721  1    0.70
11    4,723  1    0.38      35    4,718  1    0.56
12    4,727  1    0.52      36    4,720  1    0.62
13    4,724  1    0.78      37    4,718  1    0.75
14    4,723  1    0.72      38    4,718  1    0.57
15    4,722  1    0.48      39    4,718  1    0.87
16    4,718  1    0.70      40    4,717  1    0.71
17    4,707  1    0.74      41    4,711  1    0.61
18    4,681  1    0.70      42    4,635  1    0.42
19    4,543  3    0.32      43    4,636  1    0.57
20    4,722  1    0.71      44    4,632  1    0.76
21    4,725  1    0.90      45    4,637  1    0.56
22    4,717  1    0.65      46    4,633  1    0.79
23    4,643  3    0.36      47    4,622  1    0.58
24    4,725  1    0.66      48    4,547  3    0.24

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)

Reading Grade 6
Item  N      Max  Adj. P    Item  N      Max  Adj. P
1     4,532  1    0.57      25    4,529  1    0.76
2     4,539  1    0.69      26    4,523  1    0.52
3     4,535  1    0.54      27    4,490  1    0.48
4     4,535  1    0.70      28    4,449  3    0.46
5     4,529  1    0.69      29    4,520  1    0.93
6     4,538  1    0.76      30    4,517  1    0.90
7     4,531  1    0.52      31    4,517  1    0.60
8     4,532  1    0.57      32    4,519  1    0.57
9     4,529  1    0.39      33    4,513  1    0.39
10    4,532  1    0.77      34    4,516  1    0.61
11    4,533  1    0.70      35    4,520  1    0.87
12    4,530  1    0.56      36    4,519  1    0.81
13    4,510  1    0.58      37    4,518  1    0.41
14    4,452  3    0.37      38    4,517  1    0.86
15    4,520  1    0.67      39    4,517  1    0.76
16    4,518  1    0.44      40    4,511  1    0.62
17    4,517  1    0.46      41    4,515  1    0.66
18    4,514  1    0.52      42    4,509  1    0.67
19    4,510  1    0.75      43    4,495  1    0.63
20    4,533  1    0.79      44    4,494  1    0.80
21    4,533  1    0.69      45    4,492  1    0.52
22    4,529  1    0.65      46    4,486  1    0.81
23    4,531  1    0.73      47    4,414  3    0.48
24    4,531  1    0.69      48    4,382  1    0.85

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)
Reading Grade 7
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,273 | 1 | 0.56 || 25 | 4,265 | 1 | 0.45
2 | 4,278 | 1 | 0.62 || 26 | 4,259 | 1 | 0.68
3 | 4,278 | 1 | 0.60 || 27 | 4,257 | 1 | 0.61
4 | 4,268 | 1 | 0.58 || 28 | 4,258 | 1 | 0.65
5 | 4,280 | 1 | 0.62 || 29 | 4,258 | 1 | 0.73
6 | 4,273 | 1 | 0.48 || 30 | 4,260 | 1 | 0.82
7 | 4,273 | 1 | 0.51 || 31 | 4,256 | 1 | 0.76
8 | 4,268 | 1 | 0.60 || 32 | 4,255 | 1 | 0.50
9 | 4,269 | 1 | 0.54 || 33 | 4,256 | 1 | 0.63
10 | 4,265 | 1 | 0.49 || 34 | 4,257 | 1 | 0.56
11 | 4,262 | 1 | 0.76 || 35 | 4,243 | 1 | 0.58
12 | 4,238 | 1 | 0.68 || 36 | 4,192 | 3 | 0.53
13 | 4,212 | 3 | 0.55 || 37 | 4,253 | 1 | 0.73
14 | 4,245 | 1 | 0.87 || 38 | 4,253 | 1 | 0.68
15 | 4,245 | 1 | 0.89 || 39 | 4,252 | 1 | 0.51
16 | 4,230 | 1 | 0.59 || 40 | 4,254 | 1 | 0.39
17 | 4,165 | 3 | 0.51 || 41 | 4,247 | 1 | 0.46
18 | 4,271 | 1 | 0.82 || 42 | 4,251 | 1 | 0.64
19 | 4,273 | 1 | 0.84 || 43 | 4,250 | 1 | 0.58
20 | 4,273 | 1 | 0.63 || 44 | 4,248 | 1 | 0.77
21 | 4,272 | 1 | 0.59 || 45 | 4,247 | 1 | 0.71
22 | 4,270 | 1 | 0.59 || 46 | 4,244 | 1 | 0.61
23 | 4,270 | 1 | 0.67 || 47 | 4,244 | 1 | 0.59
24 | 4,270 | 1 | 0.59 || 48 | 4,246 | 1 | 0.78

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)
Reading Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,326 | 1 | 0.60 || 25 | 4,317 | 1 | 0.53
2 | 4,335 | 1 | 0.90 || 26 | 4,319 | 1 | 0.59
3 | 4,336 | 1 | 0.93 || 27 | 4,304 | 1 | 0.56
4 | 4,333 | 1 | 0.49 || 28 | 4,213 | 3 | 0.42
5 | 4,336 | 1 | 0.88 || 29 | 4,316 | 1 | 0.57
6 | 4,326 | 1 | 0.86 || 30 | 4,314 | 1 | 0.73
7 | 4,330 | 1 | 0.72 || 31 | 4,314 | 1 | 0.51
8 | 4,331 | 1 | 0.63 || 32 | 4,309 | 1 | 0.51
9 | 4,329 | 1 | 0.65 || 33 | 4,311 | 1 | 0.39
10 | 4,330 | 1 | 0.77 || 34 | 4,313 | 1 | 0.63
11 | 4,323 | 1 | 0.68 || 35 | 4,312 | 1 | 0.60
12 | 4,319 | 1 | 0.38 || 36 | 4,305 | 1 | 0.27
13 | 4,320 | 1 | 0.49 || 37 | 4,308 | 1 | 0.30
14 | 4,313 | 1 | 0.63 || 38 | 4,310 | 1 | 0.65
15 | 4,313 | 1 | 0.46 || 39 | 4,309 | 1 | 0.25
16 | 4,299 | 1 | 0.78 || 40 | 4,314 | 1 | 0.76
17 | 4,174 | 3 | 0.34 || 41 | 4,306 | 1 | 0.58
18 | 4,325 | 1 | 0.66 || 42 | 4,306 | 1 | 0.44
19 | 4,320 | 1 | 0.48 || 43 | 4,275 | 1 | 0.67
20 | 4,321 | 1 | 0.70 || 44 | 4,277 | 1 | 0.66
21 | 4,324 | 1 | 0.66 || 45 | 4,271 | 1 | 0.49
22 | 4,323 | 1 | 0.57 || 46 | 4,275 | 1 | 0.56
23 | 4,324 | 1 | 0.71 || 47 | 4,268 | 1 | 0.64
24 | 4,320 | 1 | 0.75 || 48 | 4,161 | 3 | 0.33

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)
Reading Grade 9
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 3,532 | 1 | 0.77 || 25 | 3,480 | 1 | 0.48
2 | 3,522 | 1 | 0.59 || 26 | 3,489 | 1 | 0.78
3 | 3,526 | 1 | 0.71 || 27 | 3,488 | 1 | 0.60
4 | 3,523 | 1 | 0.79 || 28 | 3,384 | 1 | 0.43
5 | 3,525 | 1 | 0.58 || 29 | 3,384 | 1 | 0.65
6 | 3,525 | 1 | 0.83 || 30 | 3,385 | 1 | 0.63
7 | 3,512 | 1 | 0.78 || 31 | 3,378 | 1 | 0.32
8 | 3,512 | 1 | 0.62 || 32 | 3,378 | 1 | 0.43
9 | 3,324 | 2 | 0.42 || 33 | 3,380 | 1 | 0.47
10 | 3,510 | 1 | 0.43 || 34 | 3,381 | 1 | 0.64
11 | 3,511 | 1 | 0.77 || 35 | 3,376 | 1 | 0.49
12 | 3,508 | 1 | 0.83 || 36 | 3,380 | 1 | 0.68
13 | 3,481 | 1 | 0.46 || 37 | 3,376 | 1 | 0.72
14 | 3,485 | 1 | 0.66 || 38 | 3,376 | 1 | 0.70
15 | 3,480 | 1 | 0.39 || 39 | 3,360 | 1 | 0.42
16 | 3,476 | 1 | 0.54 || 40 | 3,368 | 1 | 0.54
17 | 3,453 | 1 | 0.50 || 41 | 3,371 | 1 | 0.37
18 | 3,027 | 3 | 0.22 || 42 | 3,369 | 1 | 0.48
19 | 3,495 | 1 | 0.67 || 43 | 3,359 | 1 | 0.47
20 | 3,494 | 1 | 0.66 || 44 | 2,942 | 3 | 0.30
21 | 3,493 | 1 | 0.59 || 45 | 3,348 | 1 | 0.64
22 | 3,494 | 1 | 0.66 || 46 | 3,349 | 1 | 0.69
23 | 3,492 | 1 | 0.65 || 47 | 3,350 | 1 | 0.65
24 | 3,490 | 1 | 0.40 || 48 | 3,341 | 1 | 0.64

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C1. DC CAS 2012 Operational Form Item Adjusted P Values: Reading (continued)
Reading Grade 10
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,225 | 1 | 0.55 || 25 | 4,142 | 1 | 0.60
2 | 4,224 | 1 | 0.69 || 26 | 4,147 | 1 | 0.65
3 | 4,217 | 1 | 0.68 || 27 | 4,139 | 1 | 0.75
4 | 4,210 | 1 | 0.64 || 28 | 3,897 | 3 | 0.34
5 | 4,208 | 1 | 0.67 || 29 | 4,139 | 1 | 0.42
6 | 3,835 | 3 | 0.31 || 30 | 4,142 | 1 | 0.72
7 | 4,218 | 1 | 0.71 || 31 | 4,140 | 1 | 0.59
8 | 4,220 | 1 | 0.82 || 32 | 4,136 | 1 | 0.67
9 | 4,218 | 1 | 0.78 || 33 | 4,143 | 1 | 0.55
10 | 4,217 | 1 | 0.58 || 34 | 4,132 | 1 | 0.46
11 | 4,217 | 1 | 0.72 || 35 | 4,133 | 1 | 0.55
12 | 4,207 | 1 | 0.66 || 36 | 4,130 | 1 | 0.24
13 | 4,212 | 1 | 0.70 || 37 | 4,135 | 1 | 0.48
14 | 4,206 | 1 | 0.80 || 38 | 4,132 | 1 | 0.51
15 | 4,206 | 1 | 0.75 || 39 | 4,128 | 1 | 0.69
16 | 3,947 | 3 | 0.54 || 40 | 4,130 | 1 | 0.53
17 | 4,202 | 1 | 0.72 || 41 | 4,100 | 1 | 0.63
18 | 4,198 | 1 | 0.67 || 42 | 4,102 | 1 | 0.65
19 | 4,196 | 1 | 0.46 || 43 | 4,101 | 1 | 0.67
20 | 4,200 | 1 | 0.60 || 44 | 4,100 | 1 | 0.59
21 | 4,201 | 1 | 0.74 || 45 | 4,101 | 1 | 0.54
22 | 4,195 | 1 | 0.49 || 46 | 4,093 | 1 | 0.49
23 | 4,143 | 1 | 0.57 || 47 | 4,101 | 1 | 0.53
24 | 4,149 | 1 | 0.75 || 48 | 4,101 | 1 | 0.81

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics
Mathematics Grade 2
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | suppressed* | 1 | n/a || 28 | 4,470 | 1 | 0.64
2 | 4,485 | 1 | 0.88 || 29 | 4,463 | 1 | 0.80
3 | 4,490 | 1 | 0.83 || 30 | 4,455 | 1 | 0.76
4 | 4,483 | 1 | 0.85 || 31 | 4,467 | 1 | 0.80
5 | 4,473 | 1 | 0.92 || 32 | 4,436 | 1 | 0.53
6 | 4,476 | 2 | 0.38 || 33 | 4,473 | 1 | 0.90
7 | 4,470 | 1 | 0.76 || 34 | suppressed* | 1 | n/a
8 | 4,476 | 1 | 0.74 || 35 | 4,475 | 1 | 0.46
9 | suppressed* | 1 | n/a || 36 | 4,461 | 1 | 0.87
10 | 4,467 | 1 | 0.74 || 37 | 4,474 | 1 | 0.91
11 | 4,474 | 1 | 0.50 || 38 | 4,447 | 1 | 0.68
12 | suppressed* | 1 | n/a || 39 | 4,458 | 1 | 0.88
13 | 4,439 | 1 | 0.62 || 40 | 4,473 | 1 | 0.80
14 | 4,474 | 1 | 0.65
15 | 4,464 | 1 | 0.66
16 | 4,472 | 1 | 0.64
17 | suppressed* | 1 | n/a
18 | 4,472 | 1 | 0.72
19 | 4,471 | 1 | 0.89
20 | suppressed* | 2 | n/a
21 | 4,470 | 1 | 0.74
22 | suppressed* | 1 | n/a
23 | suppressed* | 1 | n/a
24 | 4,426 | 1 | 0.67
25 | 4,387 | 1 | 0.84
26 | 4,472 | 1 | 0.72
27 | 4,461 | 1 | 0.77

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.
*Items deemed statistically unacceptable were suppressed.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 3
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,753 | 1 | 0.85 || 28 | 4,755 | 1 | 0.54
2 | 4,753 | 1 | 0.58 || 29 | 4,747 | 1 | 0.78
3 | 4,751 | 1 | 0.51 || 30 | 4,714 | 1 | 0.85
4 | 4,712 | 1 | 0.72 || 31 | 4,668 | 1 | 0.84
5 | 4,755 | 1 | 0.31 || 32 | 4,740 | 1 | 0.80
6 | 4,706 | 3 | 0.21 || 33 | 4,745 | 1 | 0.66
7 | 4,744 | 1 | 0.57 || 34 | 4,737 | 1 | 0.59
8 | 4,746 | 1 | 0.95 || 35 | 4,729 | 1 | 0.62
9 | 4,738 | 1 | 0.73 || 36 | 4,680 | 1 | 0.55
10 | 4,725 | 1 | 0.51 || 37 | 4,713 | 1 | 0.51
11 | 4,740 | 1 | 0.80 || 38 | 4,735 | 1 | 0.55
12 | 4,753 | 1 | 0.61 || 39 | 4,746 | 1 | 0.67
13 | 4,724 | 1 | 0.52 || 40 | 4,747 | 1 | 0.83
14 | 4,703 | 1 | 0.79 || 41 | 4,746 | 1 | 0.87
15 | 4,759 | 1 | 0.53 || 42 | 4,748 | 1 | 0.33
16 | 4,758 | 1 | 0.63 || 43 | 4,743 | 1 | 0.88
17 | suppressed* | 1 | n/a || 44 | 4,737 | 1 | 0.70
18 | 4,754 | 1 | 0.78 || 45 | 4,730 | 1 | 0.69
19 | 4,748 | 1 | 0.65 || 46 | 4,724 | 1 | 0.42
20 | 4,745 | 1 | 0.83 || 47 | 4,728 | 1 | 0.73
21 | 4,574 | 3 | 0.80 || 48 | 4,711 | 3 | 0.32
22 | 4,726 | 1 | 0.75 || 49 | 4,727 | 1 | 0.86
23 | 4,739 | 1 | 0.87 || 50 | 4,716 | 1 | 0.52
24 | 4,748 | 1 | 0.77 || 51 | 4,522 | 1 | 0.67
25 | 4,648 | 1 | 0.33 || 52 | 4,698 | 1 | 0.64
26 | 4,755 | 1 | 0.84 || 53 | 4,726 | 1 | 0.61
27 | 4,726 | 1 | 0.55 || 54 | 4,736 | 1 | 0.83

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.
*Items deemed statistically unacceptable were suppressed.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 4
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,585 | 1 | 0.69 || 28 | 4,578 | 1 | 0.90
2 | 4,584 | 1 | 0.71 || 29 | 4,576 | 1 | 0.60
3 | 4,590 | 1 | 0.73 || 30 | 4,573 | 1 | 0.91
4 | 4,576 | 1 | 0.65 || 31 | 4,569 | 1 | 0.43
5 | 4,549 | 1 | 0.72 || 32 | 4,570 | 1 | 0.62
6 | 4,545 | 3 | 0.43 || 33 | 4,566 | 1 | 0.49
7 | 4,578 | 1 | 0.55 || 34 | 4,571 | 1 | 0.70
8 | 4,580 | 1 | 0.69 || 35 | 4,569 | 1 | 0.92
9 | 4,576 | 1 | 0.79 || 36 | 4,568 | 1 | 0.82
10 | 4,578 | 1 | 0.58 || 37 | 4,566 | 1 | 0.73
11 | 4,574 | 1 | 0.66 || 38 | 4,558 | 1 | 0.75
12 | 4,574 | 1 | 0.74 || 39 | 4,554 | 1 | 0.84
13 | 4,573 | 1 | 0.52 || 40 | 4,539 | 1 | 0.58
14 | 4,572 | 1 | 0.70 || 41 | 4,570 | 1 | 0.39
15 | 4,576 | 1 | 0.59 || 42 | 4,572 | 1 | 0.41
16 | 4,575 | 1 | 0.81 || 43 | 4,567 | 1 | 0.41
17 | 4,577 | 1 | 0.71 || 44 | 4,567 | 1 | 0.91
18 | 4,570 | 1 | 0.72 || 45 | 4,566 | 1 | 0.28
19 | 4,570 | 1 | 0.48 || 46 | 4,559 | 1 | 0.75
20 | 4,539 | 1 | 0.52 || 47 | 4,497 | 1 | 0.63
21 | 4,534 | 3 | 0.20 || 48 | 4,550 | 3 | 0.72
22 | 4,571 | 1 | 0.51 || 49 | 4,568 | 1 | 0.73
23 | 4,568 | 1 | 0.51 || 50 | 4,568 | 1 | 0.32
24 | 4,565 | 1 | 0.55 || 51 | 4,567 | 1 | 0.81
25 | 4,571 | 1 | 0.85 || 52 | 4,563 | 1 | 0.64
26 | 4,571 | 1 | 0.86 || 53 | 4,567 | 1 | 0.36
27 | 4,548 | 1 | 0.49 || 54 | 4,553 | 1 | 0.61

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 5
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,743 | 1 | 0.39 || 28 | 4,732 | 1 | 0.54
2 | 4,746 | 1 | 0.84 || 29 | 4,724 | 1 | 0.63
3 | 4,738 | 1 | 0.70 || 30 | 4,728 | 1 | 0.36
4 | 4,739 | 1 | 0.67 || 31 | 4,730 | 1 | 0.82
5 | 4,703 | 1 | 0.35 || 32 | 4,726 | 1 | 0.57
6 | 4,696 | 3 | 0.71 || 33 | 4,728 | 1 | 0.82
7 | 4,745 | 1 | 0.68 || 34 | 4,719 | 1 | 0.58
8 | 4,743 | 1 | 0.68 || 35 | 4,730 | 1 | 0.64
9 | 4,739 | 1 | 0.62 || 36 | 4,725 | 1 | 0.64
10 | 4,742 | 1 | 0.67 || 37 | 4,730 | 1 | 0.70
11 | 4,734 | 1 | 0.48 || 38 | 4,713 | 1 | 0.57
12 | 4,740 | 1 | 0.67 || 39 | 4,713 | 1 | 0.76
13 | 4,738 | 1 | 0.62 || 40 | 4,710 | 1 | 0.77
14 | 4,741 | 1 | 0.90 || 41 | 4,730 | 1 | 0.92
15 | 4,737 | 1 | 0.66 || 42 | 4,728 | 1 | 0.73
16 | 4,735 | 1 | 0.85 || 43 | 4,726 | 1 | 0.61
17 | 4,737 | 1 | 0.94 || 44 | 4,722 | 1 | 0.67
18 | suppressed* | 1 | n/a || 45 | 4,721 | 1 | 0.62
19 | 4,733 | 1 | 0.73 || 46 | 4,722 | 1 | 0.89
20 | 4,708 | 1 | 0.62 || 47 | 4,694 | 1 | 0.65
21 | 4,711 | 3 | 0.66 || 48 | 4,685 | 3 | 0.48
22 | 4,729 | 1 | 0.50 || 49 | 4,728 | 1 | 0.77
23 | 4,733 | 1 | 0.53 || 50 | 4,727 | 1 | 0.90
24 | 4,730 | 1 | 0.64 || 51 | 4,721 | 1 | 0.55
25 | 4,733 | 1 | 0.88 || 52 | 4,726 | 1 | 0.75
26 | 4,726 | 1 | 0.49 || 53 | 4,725 | 1 | 0.67
27 | 4,710 | 1 | 0.76 || 54 | 4,729 | 1 | 0.84

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.
*Items deemed statistically unacceptable were suppressed.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 6
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,545 | 1 | 0.56 || 28 | 4,535 | 1 | 0.64
2 | 4,547 | 1 | 0.44 || 29 | 4,534 | 1 | 0.53
3 | 4,546 | 1 | 0.80 || 30 | 4,541 | 1 | 0.63
4 | 4,541 | 1 | 0.66 || 31 | 4,535 | 1 | 0.70
5 | 4,533 | 1 | 0.73 || 32 | 4,532 | 1 | 0.51
6 | 4,495 | 3 | 0.46 || 33 | 4,530 | 1 | 0.36
7 | 4,546 | 1 | 0.71 || 34 | 4,530 | 1 | 0.55
8 | 4,545 | 1 | 0.81 || 35 | 4,533 | 1 | 0.63
9 | 4,541 | 1 | 0.86 || 36 | 4,536 | 1 | 0.56
10 | 4,536 | 1 | 0.37 || 37 | 4,531 | 1 | 0.66
11 | 4,542 | 1 | 0.62 || 38 | 4,533 | 1 | 0.74
12 | 4,536 | 1 | 0.36 || 39 | 4,536 | 1 | 0.59
13 | 4,538 | 1 | 0.36 || 40 | 4,526 | 1 | 0.52
14 | 4,537 | 1 | 0.52 || 41 | 4,535 | 1 | 0.62
15 | 4,517 | 1 | 0.53 || 42 | 4,534 | 1 | 0.71
16 | 4,529 | 1 | 0.59 || 43 | 4,537 | 1 | 0.61
17 | 4,538 | 1 | 0.71 || 44 | 4,527 | 1 | 0.54
18 | 4,525 | 1 | 0.64 || 45 | 4,529 | 1 | 0.33
19 | 4,521 | 1 | 0.60 || 46 | 4,527 | 1 | 0.66
20 | 4,479 | 1 | 0.66 || 47 | 4,499 | 1 | 0.54
21 | 4,493 | 3 | 0.33 || 48 | 4,456 | 3 | 0.21
22 | 4,540 | 1 | 0.66 || 49 | 4,533 | 1 | 0.28
23 | 4,535 | 1 | 0.55 || 50 | 4,534 | 1 | 0.72
24 | 4,531 | 1 | 0.47 || 51 | 4,534 | 1 | 0.74
25 | 4,536 | 1 | 0.70 || 52 | 4,534 | 1 | 0.66
26 | 4,528 | 1 | 0.74 || 53 | 4,530 | 1 | 0.45
27 | 4,529 | 1 | 0.47 || 54 | 4,528 | 1 | 0.62

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 7
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,274 | 1 | 0.54 || 28 | 4,273 | 1 | 0.64
2 | 4,288 | 1 | 0.63 || 29 | 4,269 | 1 | 0.71
3 | 4,294 | 1 | 0.86 || 30 | 4,270 | 1 | 0.64
4 | 4,232 | 1 | 0.43 || 31 | 4,278 | 1 | 0.82
5 | 4,286 | 1 | 0.48 || 32 | 4,270 | 1 | 0.56
6 | 4,190 | 3 | 0.20 || 33 | 4,275 | 1 | 0.69
7 | 4,287 | 1 | 0.23 || 34 | 4,279 | 1 | 0.71
8 | 4,287 | 1 | 0.51 || 35 | 4,274 | 1 | 0.55
9 | 4,291 | 1 | 0.48 || 36 | 4,267 | 1 | 0.46
10 | 4,284 | 1 | 0.58 || 37 | 4,277 | 1 | 0.65
11 | 4,290 | 1 | 0.51 || 38 | 4,273 | 1 | 0.63
12 | 4,288 | 1 | 0.73 || 39 | 4,264 | 1 | 0.61
13 | suppressed* | 1 | n/a || 40 | 4,248 | 1 | 0.58
14 | 4,289 | 1 | 0.35 || 41 | 4,267 | 1 | 0.57
15 | 4,277 | 1 | 0.52 || 42 | 4,256 | 1 | 0.43
16 | 4,273 | 1 | 0.82 || 43 | 4,261 | 1 | 0.65
17 | 4,281 | 1 | 0.79 || 44 | 4,265 | 1 | 0.67
18 | 4,280 | 1 | 0.77 || 45 | 4,268 | 1 | 0.69
19 | 4,270 | 1 | 0.61 || 46 | 4,261 | 1 | 0.62
20 | 4,231 | 1 | 0.49 || 47 | 4,250 | 1 | 0.36
21 | 4,171 | 3 | 0.50 || 48 | n/a | 3 | n/a
22 | 4,271 | 1 | 0.50 || 49 | 4,268 | 1 | 0.41
23 | 4,266 | 1 | 0.49 || 50 | 4,260 | 1 | 0.60
24 | 4,269 | 1 | 0.79 || 51 | 4,264 | 1 | 0.62
25 | 4,258 | 1 | 0.45 || 52 | 4,267 | 1 | 0.70
26 | 4,260 | 1 | 0.76 || 53 | 4,260 | 1 | 0.61
27 | 4,243 | 1 | 0.65 || 54 | 4,259 | 1 | 0.52

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.
*Items deemed statistically unacceptable were suppressed.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,323 | 1 | 0.30 || 28 | 4,304 | 1 | 0.54
2 | 4,314 | 1 | 0.38 || 29 | 4,315 | 1 | 0.35
3 | 4,326 | 1 | 0.28 || 30 | 4,315 | 1 | 0.51
4 | 4,335 | 1 | 0.51 || 31 | 4,303 | 1 | 0.32
5 | 4,258 | 1 | 0.40 || 32 | 4,317 | 1 | 0.59
6 | 4,316 | 3 | 0.59 || 33 | 4,315 | 1 | 0.39
7 | 4,323 | 1 | 0.46 || 34 | 4,310 | 1 | 0.48
8 | 4,337 | 1 | 0.49 || 35 | 4,311 | 1 | 0.60
9 | 4,330 | 1 | 0.37 || 36 | 4,318 | 1 | 0.64
10 | 4,339 | 1 | 0.59 || 37 | 4,318 | 1 | 0.48
11 | 4,330 | 1 | 0.53 || 38 | 4,317 | 1 | 0.36
12 | 4,336 | 1 | 0.39 || 39 | 4,314 | 1 | 0.57
13 | 4,332 | 1 | 0.46 || 40 | 4,310 | 1 | 0.44
14 | 4,337 | 1 | 0.53 || 41 | 4,297 | 1 | 0.37
15 | 4,311 | 1 | 0.51 || 42 | 4,293 | 1 | 0.51
16 | 4,324 | 1 | 0.81 || 43 | 4,297 | 1 | 0.55
17 | 4,318 | 1 | 0.57 || 44 | 4,298 | 1 | 0.49
18 | 4,301 | 1 | 0.41 || 45 | 4,291 | 1 | 0.61
19 | 4,315 | 1 | 0.73 || 46 | 4,294 | 1 | 0.76
20 | 4,278 | 1 | 0.52 || 47 | 4,254 | 1 | 0.53
21 | 4,144 | 3 | 0.14 || 48 | 4,207 | 3 | 0.36
22 | 4,325 | 1 | 0.55 || 49 | 4,280 | 1 | 0.45
23 | 4,322 | 1 | 0.64 || 50 | 4,298 | 1 | 0.56
24 | 4,314 | 1 | 0.57 || 51 | 4,299 | 1 | 0.48
25 | 4,319 | 1 | 0.73 || 52 | 4,297 | 1 | 0.56
26 | 4,322 | 1 | 0.86 || 53 | 4,298 | 1 | 0.38
27 | 4,311 | 1 | 0.69 || 54 | 4,296 | 1 | 0.55

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C2. DC CAS 2012 Operational Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 10
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 3,436 | 1 | 0.39 || 28 | 3,370 | 1 | 0.44
2 | 3,461 | 1 | 0.62 || 29 | 3,385 | 1 | 0.33
3 | 3,454 | 1 | 0.64 || 30 | 3,407 | 1 | 0.51
4 | 3,431 | 1 | 0.48 || 31 | 3,405 | 1 | 0.58
5 | 3,418 | 1 | 0.38 || 32 | 3,400 | 1 | 0.40
6 | 3,136 | 3 | 0.52 || 33 | 3,402 | 1 | 0.57
7 | 3,431 | 1 | 0.40 || 34 | 3,403 | 1 | 0.38
8 | 3,445 | 1 | 0.40 || 35 | 3,365 | 1 | 0.36
9 | 3,457 | 1 | 0.83 || 36 | 3,401 | 1 | 0.58
10 | 3,440 | 1 | 0.49 || 37 | 3,400 | 1 | 0.47
11 | 3,442 | 1 | 0.51 || 38 | 3,395 | 1 | 0.46
12 | 3,404 | 1 | 0.24 || 39 | 3,397 | 1 | 0.78
13 | 3,455 | 1 | 0.72 || 40 | 3,396 | 1 | 0.53
14 | 3,447 | 1 | 0.77 || 41 | 3,363 | 1 | 0.53
15 | 3,437 | 1 | 0.72 || 42 | 3,378 | 1 | 0.28
16 | 3,435 | 1 | 0.63 || 43 | 3,395 | 1 | 0.68
17 | 3,425 | 1 | 0.47 || 44 | 3,374 | 1 | 0.48
18 | 3,423 | 1 | 0.49 || 45 | 3,387 | 1 | 0.40
19 | 3,419 | 1 | 0.42 || 46 | 3,364 | 1 | 0.35
20 | 3,383 | 1 | 0.44 || 47 | 3,350 | 1 | 0.46
21 | 3,218 | 3 | 0.20 || 48 | 3,040 | 3 | 0.20
22 | 3,428 | 1 | 0.53 || 49 | 3,369 | 1 | 0.30
23 | 3,431 | 1 | 0.45 || 50 | 3,391 | 1 | 0.43
24 | 3,424 | 1 | 0.63 || 51 | 3,393 | 1 | 0.56
25 | 3,424 | 1 | 0.45 || 52 | 3,380 | 1 | 0.33
26 | 3,418 | 1 | 0.38 || 53 | 3,381 | 1 | 0.56
27 | 3,418 | 1 | 0.62 || 54 | 3,382 | 1 | 0.30

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C3. DC CAS 2012 Operational Form Item Adjusted P Values: Science/Biology
Science Grade 5
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,694 | 1 | 0.45 || 26 | 4,673 | 1 | 0.44
2 | 4,696 | 1 | 0.56 || 27 | 4,659 | 1 | 0.68
3 | 4,697 | 1 | 0.81 || 28 | 4,670 | 1 | 0.43
4 | 4,696 | 1 | 0.46 || 29 | 4,672 | 1 | 0.66
5 | 4,689 | 1 | 0.36 || 30 | 4,661 | 1 | 0.34
6 | 4,694 | 1 | 0.70 || 31 | 4,659 | 1 | 0.47
7 | 4,692 | 1 | 0.63 || 32 | 4,655 | 1 | 0.42
8 | 4,686 | 1 | 0.46 || 33 | 4,656 | 1 | 0.60
9 | 4,667 | 1 | 0.34 || 34 | 4,652 | 1 | 0.32
10 | 4,565 | 2 | 0.20 || 35 | 4,656 | 1 | 0.45
11 | 4,684 | 1 | 0.62 || 36 | 4,648 | 1 | 0.44
12 | 4,674 | 1 | 0.39 || 37 | 4,648 | 1 | 0.46
13 | 4,660 | 1 | 0.26 || 38 | 4,641 | 1 | 0.50
14 | 4,666 | 1 | 0.36 || 39 | 4,567 | 2 | 0.23
15 | 4,661 | 1 | 0.53 || 40 | 4,660 | 1 | 0.53
16 | 4,657 | 1 | 0.30 || 41 | 4,658 | 1 | 0.77
17 | 4,654 | 1 | 0.26 || 42 | 4,652 | 1 | 0.42
18 | 4,684 | 1 | 0.57 || 43 | 4,654 | 1 | 0.29
19 | 4,680 | 1 | 0.37 || 44 | 4,655 | 1 | 0.54
20 | 4,666 | 1 | 0.36 || 45 | 4,652 | 1 | 0.68
21 | 4,594 | 2 | 0.71 || 46 | 4,657 | 1 | 0.37
22 | 4,678 | 1 | 0.50 || 47 | 4,655 | 1 | 0.65
23 | 4,679 | 1 | 0.62 || 48 | 4,651 | 1 | 0.37
24 | 4,677 | 1 | 0.27 || 49 | 4,651 | 1 | 0.62
25 | 4,674 | 1 | 0.42 || 50 | 4,642 | 1 | 0.40

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C3. DC CAS 2012 Operational Form Item Adjusted P Values: Science/Biology (continued)
Science Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 4,248 | 1 | 0.57 || 26 | 4,222 | 1 | 0.29
2 | 4,247 | 1 | 0.38 || 27 | 4,227 | 1 | 0.44
3 | 4,229 | 1 | 0.32 || 28 | 4,230 | 1 | 0.28
4 | 4,241 | 1 | 0.42 || 29 | 4,228 | 1 | 0.38
5 | 4,243 | 1 | 0.59 || 30 | 4,218 | 1 | 0.37
6 | 4,241 | 1 | 0.31 || 31 | 4,220 | 1 | 0.34
7 | 4,232 | 1 | 0.41 || 32 | 4,206 | 1 | 0.47
8 | 4,236 | 1 | 0.50 || 33 | 4,212 | 1 | 0.44
9 | 4,228 | 1 | 0.36 || 34 | 4,204 | 1 | 0.44
10 | 3,532 | 2 | 0.15 || 35 | 4,208 | 1 | 0.39
11 | 4,233 | 1 | 0.44 || 36 | 4,201 | 1 | 0.46
12 | 4,228 | 1 | 0.49 || 37 | 4,204 | 1 | 0.34
13 | 4,232 | 1 | 0.55 || 38 | 4,182 | 1 | 0.37
14 | 4,223 | 1 | 0.33 || 39 | 3,936 | 2 | 0.29
15 | 4,229 | 1 | 0.46 || 40 | 4,203 | 1 | 0.37
16 | 4,225 | 1 | 0.36 || 41 | 4,206 | 1 | 0.51
17 | 4,218 | 1 | 0.47 || 42 | 4,201 | 1 | 0.25
18 | 4,227 | 1 | 0.42 || 43 | 4,209 | 1 | 0.44
19 | 4,215 | 1 | 0.28 || 44 | 4,205 | 1 | 0.51
20 | 4,211 | 1 | 0.35 || 45 | 4,201 | 1 | 0.42
21 | 4,064 | 2 | 0.51 || 46 | 4,207 | 1 | 0.31
22 | 4,234 | 1 | 0.20 || 47 | 4,208 | 1 | 0.47
23 | 4,226 | 1 | 0.26 || 48 | 4,206 | 1 | 0.54
24 | 4,231 | 1 | 0.66 || 49 | 4,201 | 1 | 0.35
25 | 4,231 | 1 | 0.56 || 50 | 4,208 | 1 | 0.53

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C3. DC CAS 2012 Operational Form Item Adjusted P Values: Science/Biology (continued)
High School Biology
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 3,670 | 1 | 0.23 || 26 | 3,660 | 1 | 0.33
2 | 3,684 | 1 | 0.37 || 27 | 3,648 | 1 | 0.39
3 | 3,689 | 1 | 0.35 || 28 | 3,661 | 1 | 0.62
4 | 3,689 | 1 | 0.49 || 29 | 3,658 | 1 | 0.34
5 | 3,680 | 1 | 0.30 || 30 | 3,640 | 1 | 0.44
6 | 3,689 | 1 | 0.52 || 31 | 3,637 | 1 | 0.39
7 | 3,686 | 1 | 0.41 || 32 | 3,634 | 1 | 0.43
8 | 3,682 | 1 | 0.57 || 33 | 3,633 | 1 | 0.56
9 | 3,663 | 1 | 0.28 || 34 | 3,608 | 1 | 0.48
10 | 3,480 | 2 | 0.74 || 35 | 3,608 | 1 | 0.47
11 | 3,676 | 1 | 0.25 || 36 | 3,611 | 1 | 0.36
12 | 3,672 | 1 | 0.27 || 37 | 3,602 | 1 | 0.28
13 | 3,672 | 1 | 0.52 || 38 | 3,570 | 1 | 0.41
14 | 3,657 | 1 | 0.70 || 39 | 3,279 | 2 | 0.28
15 | 3,658 | 1 | 0.57 || 40 | 3,602 | 1 | 0.27
16 | 3,659 | 1 | 0.27 || 41 | 3,613 | 1 | 0.29
17 | 3,653 | 1 | 0.60 || 42 | 3,614 | 1 | 0.45
18 | 3,657 | 1 | 0.34 || 43 | 3,612 | 1 | 0.38
19 | 3,662 | 1 | 0.26 || 44 | 3,605 | 1 | 0.39
20 | 3,666 | 1 | 0.42 || 45 | 3,609 | 1 | 0.54
21 | 3,334 | 2 | 0.27 || 46 | 3,605 | 1 | 0.33
22 | 3,666 | 1 | 0.38 || 47 | 3,605 | 1 | 0.39
23 | 3,668 | 1 | 0.38 || 48 | 3,608 | 1 | 0.40
24 | 3,667 | 1 | 0.49 || 49 | 3,608 | 1 | 0.48
25 | 3,662 | 1 | 0.51 || 50 | 3,607 | 1 | 0.33

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C4. DC CAS 2012 Operational Form Item Adjusted P Values: Composition
Composition Grade 4
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 1,142 | 6 | 0.33 || 5 | 1,104 | 6 | 0.44
2 | 1,142 | 4 | 0.50 || 6 | 1,104 | 4 | 0.60
3 | 1,111 | 6 | 0.40 || 7 | 1,063 | 6 | 0.42
4 | 1,111 | 4 | 0.55 || 8 | 1,063 | 4 | 0.59

Composition Grade 7
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 1,037 | 6 | 0.46 || 5 | 1,030 | 6 | 0.48
2 | 1,037 | 4 | 0.64 || 6 | 1,030 | 4 | 0.64
3 | 1,036 | 6 | 0.51 || 7 | 991 | 6 | 0.42
4 | 1,036 | 4 | 0.68 || 8 | 991 | 4 | 0.61

Composition Grade 10
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 814 | 6 | 0.51 || 5 | 795 | 6 | 0.33
2 | 814 | 4 | 0.70 || 6 | 795 | 4 | 0.61
3 | 793 | 6 | 0.38 || 7 | 822 | 6 | 0.43
4 | 793 | 4 | 0.63 || 8 | 822 | 4 | 0.64

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading
Reading Grade 2
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,342 | 1 | 0.51 || 25 | 2,032 | 1 | 0.49
2 | 2,341 | 1 | 0.42 || 26 | 2,076 | 1 | 0.63
3 | 2,340 | 1 | 0.45 || 27 | 2,071 | 1 | 0.43
4 | 2,324 | 1 | 0.53 || 28 | 2,061 | 1 | 0.31
5 | 2,343 | 1 | 0.64 || 29 | 2,058 | 1 | 0.35
6 | 2,332 | 1 | 0.50 || 30 | 2,046 | 1 | 0.49
7 | 2,313 | 1 | 0.60 || 31 | 2,053 | 1 | 0.45
8 | 2,332 | 1 | 0.53 || 32 | 2,094 | 1 | 0.56
9 | 2,327 | 1 | 0.46 || 33 | 2,089 | 1 | 0.23
10 | 2,328 | 1 | 0.64 || 34 | 2,085 | 1 | 0.47
11 | 2,347 | 1 | 0.24 || 35 | 2,087 | 1 | 0.31
12 | 2,347 | 1 | 0.45 || 36 | 2,072 | 1 | 0.46
13 | 2,349 | 1 | 0.25 || 37 | 2,070 | 1 | 0.65
14 | 2,337 | 1 | 0.59 || 38 | 2,043 | 1 | 0.50
15 | 2,320 | 1 | 0.22 || 39 | 2,080 | 1 | 0.50
16 | 2,349 | 1 | 0.48 || 40 | 2,068 | 1 | 0.51
17 | 2,341 | 1 | 0.40 || 41 | 2,064 | 1 | 0.25
18 | 2,331 | 1 | 0.42 || 42 | 1,951 | 3 | 0.17
19 | 2,349 | 1 | 0.61
20 | 2,339 | 1 | 0.49
21 | 2,236 | 3 | 0.41
22 | 2,077 | 1 | 0.67
23 | 2,054 | 1 | 0.24
24 | 2,062 | 1 | 0.32

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 3
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,392 | 1 | 0.73 || 25 | 2,323 | 1 | 0.56
2 | 2,393 | 1 | 0.30 || 26 | 2,311 | 1 | 0.54
3 | 2,389 | 1 | 0.56 || 27 | 2,320 | 1 | 0.31
4 | 2,384 | 1 | 0.41 || 28 | 2,323 | 1 | 0.28
5 | 2,362 | 1 | 0.23 || 29 | 2,320 | 1 | 0.50
6 | 2,367 | 1 | 0.30 || 30 | 2,260 | 3 | 0.30
7 | 2,367 | 1 | 0.27 || 31 | 2,323 | 1 | 0.85
8 | 2,377 | 1 | 0.30 || 32 | 2,321 | 1 | 0.76
9 | 2,383 | 1 | 0.37 || 33 | 2,324 | 1 | 0.54
10 | 2,382 | 1 | 0.55 || 34 | 2,324 | 1 | 0.72
11 | 2,287 | 3 | 0.33 || 35 | 2,302 | 1 | 0.71
12 | 2,394 | 1 | 0.83 || 36 | 2,267 | 1 | 0.39
13 | 2,354 | 1 | 0.86 || 37 | 2,316 | 1 | 0.72
14 | 2,392 | 1 | 0.74 || 38 | 2,266 | 3 | 0.31
15 | 2,374 | 1 | 0.47
16 | 2,380 | 1 | 0.32
17 | 2,364 | 1 | 0.64
18 | 2,372 | 1 | 0.70
19 | 2,318 | 3 | 0.36
20 | 2,326 | 1 | 0.19
21 | 2,324 | 1 | 0.62
22 | 2,320 | 1 | 0.69
23 | 2,317 | 1 | 0.48
24 | 2,313 | 1 | 0.40

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 4
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,306 | 1 | 0.39 || 25 | 2,244 | 1 | 0.60
2 | 2,303 | 1 | 0.23 || 26 | 2,243 | 1 | 0.34
3 | 2,306 | 1 | 0.83 || 27 | 2,238 | 1 | 0.69
4 | 2,302 | 1 | 0.42 || 28 | 2,234 | 1 | 0.47
5 | 2,300 | 1 | 0.34 || 29 | 2,220 | 1 | 0.49
6 | 2,304 | 1 | 0.41 || 30 | 2,165 | 3 | 0.34
7 | 2,302 | 1 | 0.59 || 31 | 2,240 | 1 | 0.19
8 | 2,298 | 1 | 0.74 || 32 | 2,240 | 1 | 0.48
9 | 2,299 | 1 | 0.46 || 33 | 2,240 | 1 | 0.36
10 | 2,282 | 1 | 0.44 || 34 | 2,236 | 1 | 0.55
11 | 2,231 | 3 | 0.40 || 35 | 2,237 | 1 | 0.48
12 | 2,309 | 1 | 0.66 || 36 | 2,232 | 1 | 0.58
13 | 2,308 | 1 | 0.41 || 37 | 2,221 | 1 | 0.44
14 | 2,306 | 1 | 0.43 || 38 | 2,191 | 3 | 0.37
15 | 2,304 | 1 | 0.61
16 | 2,305 | 1 | 0.37
17 | 2,299 | 1 | 0.25
18 | 2,291 | 1 | 0.45
19 | 2,258 | 3 | 0.34
20 | 2,242 | 1 | 0.45
21 | 2,244 | 1 | 0.49
22 | 2,244 | 1 | 0.36
23 | 2,244 | 1 | 0.42
24 | 2,243 | 1 | 0.37

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 5
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,383 | 1 | 0.46 || 25 | 2,333 | 1 | 0.49
2 | 2,382 | 1 | 0.58 || 26 | 2,321 | 1 | 0.32
3 | 2,382 | 1 | 0.42 || 27 | 2,202 | 3 | 0.12
4 | 2,383 | 1 | 0.45 || 28 | 2,334 | 1 | 0.67
5 | 2,381 | 1 | 0.49 || 29 | 2,335 | 1 | 0.64
6 | 2,378 | 1 | 0.46 || 30 | 2,334 | 1 | 0.70
7 | 2,370 | 1 | 0.55 || 31 | 2,334 | 1 | 0.56
8 | 2,278 | 3 | 0.21 || 32 | 2,333 | 1 | 0.46
9 | 2,388 | 1 | 0.66 || 33 | 2,333 | 1 | 0.56
10 | 2,388 | 1 | 0.50 || 34 | 2,335 | 1 | 0.62
11 | 2,388 | 1 | 0.43 || 35 | 2,332 | 1 | 0.58
12 | 2,385 | 1 | 0.45 || 36 | 2,329 | 1 | 0.49
13 | 2,388 | 1 | 0.48 || 37 | 2,324 | 1 | 0.78
14 | 2,388 | 1 | 0.64 || 38 | 2,243 | 3 | 0.25
15 | 2,386 | 1 | 0.65
16 | 2,387 | 1 | 0.35
17 | 2,383 | 1 | 0.52
18 | 2,369 | 1 | 0.47
19 | 2,324 | 3 | 0.36
20 | 2,331 | 1 | 0.26
21 | 2,335 | 1 | 0.32
22 | 2,333 | 1 | 0.30
23 | 2,332 | 1 | 0.37
24 | 2,332 | 1 | 0.44

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 6
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,261 | 1 | 0.67 || 25 | 2,257 | 1 | 0.57
2 | 2,261 | 1 | 0.83 || 26 | 2,247 | 1 | 0.67
3 | 2,258 | 1 | 0.60 || 27 | 2,195 | 3 | 0.37
4 | 2,259 | 1 | 0.60 || 28 | 2,254 | 1 | 0.50
5 | 2,261 | 1 | 0.74 || 29 | 2,257 | 1 | 0.63
6 | 2,258 | 1 | 0.24 || 30 | 2,256 | 1 | 0.52
7 | 2,248 | 1 | 0.78 || 31 | 2,255 | 1 | 0.74
8 | 2,191 | 3 | 0.22 || 32 | 2,258 | 1 | 0.76
9 | 2,260 | 1 | 0.44 || 33 | 2,254 | 1 | 0.56
10 | 2,258 | 1 | 0.43 || 34 | 2,254 | 1 | 0.64
11 | 2,260 | 1 | 0.54 || 35 | 2,255 | 1 | 0.39
12 | 2,258 | 1 | 0.47 || 36 | 2,250 | 1 | 0.43
13 | 2,259 | 1 | 0.48 || 37 | 2,251 | 1 | 0.51
14 | 2,255 | 1 | 0.33 || 38 | 2,218 | 3 | 0.15
15 | 2,257 | 1 | 0.47
16 | 2,251 | 1 | 0.54
17 | 2,249 | 1 | 0.39
18 | 2,238 | 1 | 0.50
19 | 2,180 | 3 | 0.18
20 | 2,259 | 1 | 0.68
21 | 2,257 | 1 | 0.51
22 | 2,255 | 1 | 0.52
23 | 2,255 | 1 | 0.57
24 | 2,257 | 1 | 0.64

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 7
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,169 | 1 | 0.54 || 25 | 2,098 | 1 | 0.46
2 | 2,171 | 1 | 0.67 || 26 | 2,092 | 1 | 0.49
3 | 2,166 | 1 | 0.46 || 27 | 2,015 | 3 | 0.23
4 | 2,167 | 1 | 0.53 || 28 | 2,098 | 1 | 0.61
5 | 2,166 | 1 | 0.69 || 29 | 2,100 | 1 | 0.50
6 | 2,166 | 1 | 0.43 || 30 | 2,096 | 1 | 0.50
7 | 2,156 | 1 | 0.36 || 31 | 2,099 | 1 | 0.79
8 | 2,119 | 3 | 0.31 || 32 | 2,095 | 1 | 0.50
9 | 2,158 | 1 | 0.49 || 33 | 2,096 | 1 | 0.47
10 | 2,158 | 1 | 0.51 || 34 | 2,097 | 1 | 0.46
11 | 2,154 | 1 | 0.33 || 35 | 2,099 | 1 | 0.41
12 | 2,157 | 1 | 0.36 || 36 | 2,097 | 1 | 0.56
13 | 2,159 | 1 | 0.38 || 37 | 2,085 | 1 | 0.67
14 | 2,158 | 1 | 0.57 || 38 | 2,068 | 3 | 0.41
15 | 2,159 | 1 | 0.32
16 | 2,158 | 1 | 0.17
17 | 2,158 | 1 | 0.32
18 | 2,156 | 1 | 0.21
19 | 2,071 | 3 | 0.14
20 | 2,098 | 1 | 0.31
21 | 2,099 | 1 | 0.26
22 | 2,097 | 1 | 0.32
23 | 2,100 | 1 | 0.39
24 | 2,102 | 1 | 0.13

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,178 | 1 | 0.50 || 25 | 2,106 | 1 | 0.40
2 | 2,180 | 1 | 0.58 || 26 | 2,096 | 1 | 0.49
3 | 2,177 | 1 | 0.46 || 27 | 2,108 | 1 | 0.31
4 | 2,174 | 1 | 0.58 || 28 | 2,105 | 1 | 0.50
5 | 2,178 | 1 | 0.42 || 29 | 2,099 | 1 | 0.65
6 | 2,177 | 1 | 0.37 || 30 | 2,050 | 3 | 0.37
7 | 2,176 | 1 | 0.35 || 31 | 2,117 | 1 | 0.80
8 | 2,175 | 1 | 0.45 || 32 | 2,117 | 1 | 0.62
9 | 2,170 | 1 | 0.58 || 33 | 2,112 | 1 | 0.47
10 | 2,093 | 3 | 0.39 || 34 | 2,115 | 1 | 0.67
11 | 2,178 | 1 | 0.35 || 35 | 2,116 | 1 | 0.61
12 | 2,175 | 1 | 0.30 || 36 | 2,116 | 1 | 0.73
13 | 2,175 | 1 | 0.58 || 37 | 2,109 | 1 | 0.53
14 | 2,174 | 1 | 0.66 || 38 | 2,115 | 1 | 0.68
15 | 2,177 | 1 | 0.52 || 39 | 2,107 | 1 | 0.51
16 | 2,174 | 1 | 0.54 || 40 | 2,075 | 3 | 0.17
17 | 2,176 | 1 | 0.24
18 | 2,175 | 1 | 0.46
19 | 2,162 | 1 | 0.48
20 | 2,108 | 3 | 0.34
21 | 2,115 | 1 | 0.29
22 | 2,111 | 1 | 0.43
23 | 2,114 | 1 | 0.40
24 | 2,106 | 1 | 0.36

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 9
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 1,748 | 1 | 0.67 || 25 | 1,717 | 1 | 0.46
2 | 1,762 | 1 | 0.54 || 26 | 1,717 | 1 | 0.46
3 | 1,761 | 1 | 0.53 || 27 | 1,719 | 1 | 0.58
4 | 1,763 | 1 | 0.41 || 28 | 1,716 | 1 | 0.49
5 | 1,761 | 1 | 0.46 || 29 | 1,715 | 1 | 0.55
6 | 1,761 | 1 | 0.28 || 30 | 1,709 | 1 | 0.49
7 | 1,760 | 1 | 0.35 || 31 | 1,664 | 1 | 0.34
8 | 1,760 | 1 | 0.43 || 32 | 1,666 | 1 | 0.81
9 | 1,756 | 1 | 0.60 || 33 | 1,662 | 1 | 0.58
10 | 1,586 | 3 | 0.23 || 34 | 1,662 | 1 | 0.69
11 | 1,714 | 1 | 0.73 || 35 | 1,661 | 1 | 0.60
12 | 1,712 | 1 | 0.48 || 36 | 1,661 | 1 | 0.31
13 | 1,711 | 1 | 0.41 || 37 | 1,662 | 1 | 0.28
14 | 1,713 | 1 | 0.65 || 38 | 1,660 | 1 | 0.62
15 | 1,712 | 1 | 0.67 || 39 | 1,642 | 1 | 0.18
16 | 1,706 | 1 | 0.36 || 40 | 1,489 | 3 | 0.20
17 | 1,713 | 1 | 0.63
18 | 1,712 | 1 | 0.28
19 | 1,712 | 1 | 0.58
20 | 1,710 | 1 | 0.50
21 | 1,703 | 1 | 0.50
22 | 1,716 | 1 | 0.18
23 | 1,716 | 1 | 0.42
24 | 1,720 | 1 | 0.59

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C5. DC CAS 2012 Field Test Form Item Adjusted P Values: Reading (continued)
Reading Grade 10
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,114 | 1 | 0.31 || 25 | 2,084 | 1 | 0.68
2 | 2,109 | 1 | 0.54 || 26 | 2,084 | 1 | 0.64
3 | 2,112 | 1 | 0.60 || 27 | 2,080 | 1 | 0.56
4 | 2,109 | 1 | 0.49 || 28 | 2,082 | 1 | 0.49
5 | 2,108 | 1 | 0.57 || 29 | 2,078 | 1 | 0.27
6 | 2,106 | 1 | 0.57 || 30 | 1,941 | 3 | 0.47
7 | 2,106 | 1 | 0.31 || 31 | 2,049 | 1 | 0.65
8 | 2,105 | 1 | 0.57 || 32 | 2,044 | 1 | 0.36
9 | 2,099 | 1 | 0.63 || 33 | 2,047 | 1 | 0.52
10 | 1,927 | 3 | 0.31 || 34 | 2,040 | 1 | 0.71
11 | 2,072 | 1 | 0.63 || 35 | 2,047 | 1 | 0.46
12 | 2,069 | 1 | 0.41 || 36 | 2,042 | 1 | 0.56
13 | 2,072 | 1 | 0.47 || 37 | 2,047 | 1 | 0.42
14 | 2,068 | 1 | 0.21 || 38 | 2,045 | 1 | 0.44
15 | 2,073 | 1 | 0.65 || 39 | 2,036 | 1 | 0.55
16 | 2,063 | 1 | 0.46 || 40 | 1,893 | 3 | 0.42
17 | 2,063 | 1 | 0.58
18 | 2,062 | 1 | 0.55
19 | 2,057 | 1 | 0.42
20 | 1,857 | 3 | 0.33
21 | 2,082 | 1 | 0.60
22 | 2,081 | 1 | 0.70
23 | 2,080 | 1 | 0.44
24 | 2,079 | 1 | 0.58

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics
Mathematics Grade 2
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,364 | 1 | 0.73 || 19 | 2,071 | 1 | 0.77
2 | 2,366 | 1 | 0.62 || 20 | 2,101 | 1 | 0.87
3 | 2,361 | 1 | 0.86 || 21 | 2,109 | 1 | 0.80
4 | 2,370 | 1 | 0.57 || 22 | 2,109 | 1 | 0.69
5 | 2,354 | 1 | 0.78 || 23 | 2,106 | 1 | 0.84
6 | 2,367 | 1 | 0.97 || 24 | 2,100 | 1 | 0.63
7 | 2,365 | 1 | 0.78 || 25 | 2,096 | 1 | 0.75
8 | 2,368 | 1 | 0.55 || 26 | 2,093 | 1 | 0.68
9 | 2,370 | 1 | 0.82 || 27 | 2,093 | 1 | 0.63
10 | 2,362 | 1 | 0.52 || 28 | 2,099 | 1 | 0.84
11 | 2,360 | 1 | 0.53 || 29 | 2,095 | 1 | 0.67
12 | 2,367 | 1 | 0.69 || 30 | 2,093 | 1 | 0.55
13 | 2,351 | 3 | 0.43 || 31 | 2,089 | 3 | 0.60
14 | 2,360 | 1 | 0.81 || 32 | 2,104 | 1 | 0.45
15 | 2,369 | 1 | 0.91 || 33 | 2,100 | 1 | 0.93
16 | 2,371 | 1 | 0.66 || 34 | 2,097 | 1 | 0.48
17 | 2,353 | 1 | 0.50 || 35 | 2,105 | 1 | 0.52
18 | 2,361 | 1 | 0.60 || 36 | 2,101 | 1 | 0.79

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 3
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,407 | 1 | 0.73 || 17 | 2,339 | 1 | 0.44
2 | 2,406 | 1 | 0.61 || 18 | 2,281 | 1 | 0.95
3 | 2,398 | 1 | 0.78 || 19 | 2,333 | 1 | 0.77
4 | 2,398 | 1 | 0.60 || 20 | 2,320 | 1 | 0.55
5 | 2,335 | 3 | 0.82 || 21 | 2,321 | 3 | 0.37
6 | 2,403 | 1 | 0.42 || 22 | 2,340 | 1 | 0.62
7 | 2,385 | 1 | 0.58 || 23 | 2,321 | 1 | 0.64
8 | 2,401 | 1 | 0.87 || 24 | 2,325 | 1 | 0.82
9 | 2,343 | 3 | 0.58 || 25 | 2,276 | 3 | 0.42
10 | 2,395 | 1 | 0.50 || 26 | 2,340 | 1 | 0.71
11 | 2,401 | 1 | 0.55 || 27 | 2,333 | 1 | 0.83
12 | 2,390 | 1 | 0.73 || 28 | 2,329 | 1 | 0.79
13 | 2,390 | 1 | 0.20 || 29 | 2,315 | 1 | 0.56
14 | 2,395 | 1 | 0.58 || 30 | 2,334 | 1 | 0.42
15 | 2,394 | 1 | 0.68 || 31 | 2,341 | 1 | 0.96
16 | 2,366 | 1 | 0.59 || 32 | 2,324 | 1 | 0.35

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 4
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,327 | 1 | 0.62 || 17 | 2,237 | 1 | 0.34
2 | 2,327 | 1 | 0.74 || 18 | 2,239 | 1 | 0.52
3 | 2,321 | 1 | 0.73 || 19 | 2,232 | 1 | 0.46
4 | 2,308 | 1 | 0.74 || 20 | 2,220 | 1 | 0.47
5 | 2,290 | 3 | 0.33 || 21 | 2,177 | 3 | 0.46
6 | 2,295 | 1 | 0.32 || 22 | 2,209 | 1 | 0.44
7 | 2,294 | 1 | 0.45 || 23 | 2,211 | 1 | 0.79
8 | 2,277 | 1 | 0.45 || 24 | 2,206 | 1 | 0.25
9 | 2,287 | 3 | 0.32 || 25 | 2,205 | 3 | 0.30
10 | 2,318 | 1 | 0.45 || 26 | 2,232 | 1 | 0.79
11 | 2,315 | 1 | 0.68 || 27 | 2,231 | 1 | 0.45
12 | 2,302 | 1 | 0.44 || 28 | 2,218 | 1 | 0.62
13 | 2,319 | 1 | 0.69 || 29 | 2,246 | 1 | 0.66
14 | 2,317 | 1 | 0.47 || 30 | 2,240 | 1 | 0.76
15 | 2,319 | 1 | 0.68 || 31 | 2,240 | 1 | 0.86
16 | 2,302 | 1 | 0.47 || 32 | 2,217 | 1 | 0.28

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 5
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,390 | 1 | 0.42 || 17 | 2,345 | 1 | 0.27
2 | 2,395 | 1 | 0.35 || 18 | 2,346 | 1 | 0.27
3 | 2,393 | 1 | 0.46 || 19 | 2,345 | 1 | 0.72
4 | 2,374 | 1 | 0.33 || 20 | 2,334 | 1 | 0.31
5 | 2,304 | 3 | 0.40 || 21 | 2,302 | 3 | 0.68
6 | 2,372 | 1 | 0.49 || 22 | 2,328 | 1 | 0.31
7 | 2,369 | 1 | 0.22 || 23 | 2,319 | 1 | 0.32
8 | 2,362 | 1 | 0.43 || 24 | 2,310 | 1 | 0.69
9 | 2,351 | 3 | 0.10 || 25 | 2,307 | 3 | 0.22
10 | 2,379 | 1 | 0.38 || 26 | 2,332 | 1 | 0.76
11 | 2,382 | 1 | 0.55 || 27 | 2,329 | 1 | 0.56
12 | 2,372 | 1 | 0.80 || 28 | 2,319 | 1 | 0.46
13 | 2,387 | 1 | 0.50 || 29 | 2,333 | 1 | 0.61
14 | 2,388 | 1 | 0.21 || 30 | 2,335 | 1 | 0.68
15 | 2,384 | 1 | 0.34 || 31 | 2,332 | 1 | 0.35
16 | 2,380 | 1 | 0.47 || 32 | 2,325 | 1 | 0.45

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 6
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,282 | 1 | 0.80 || 17 | 2,256 | 1 | 0.68
2 | 2,282 | 1 | 0.41 || 18 | 2,254 | 1 | 0.49
3 | 2,279 | 1 | 0.30 || 19 | 2,253 | 1 | 0.40
4 | 2,280 | 1 | 0.81 || 20 | 2,251 | 1 | 0.74
5 | 2,206 | 3 | 0.13 || 21 | 2,211 | 3 | 0.23
6 | 2,276 | 1 | 0.43 || 22 | 2,251 | 1 | 0.79
7 | 2,273 | 1 | 0.39 || 23 | 2,249 | 1 | 0.34
8 | 2,267 | 1 | 0.47 || 24 | 2,251 | 1 | 0.70
9 | 2,242 | 3 | 0.36 || 25 | 2,228 | 3 | 0.15
10 | 2,276 | 1 | 0.61 || 26 | 2,243 | 1 | 0.40
11 | 2,271 | 1 | 0.53 || 27 | 2,248 | 1 | 0.72
12 | 2,265 | 1 | 0.59 || 28 | 2,243 | 1 | 0.42
13 | 2,276 | 1 | 0.44 || 29 | 2,254 | 1 | 0.43
14 | 2,276 | 1 | 0.20 || 30 | 2,254 | 1 | 0.45
15 | 2,278 | 1 | 0.19 || 31 | 2,256 | 1 | 0.30
16 | 2,274 | 1 | 0.17 || 32 | 2,251 | 1 | 0.25

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 7
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,182 | 1 | 0.39 || 17 | 2,106 | 1 | 0.43
2 | 2,179 | 1 | 0.36 || 18 | 2,107 | 1 | 0.36
3 | 2,174 | 1 | 0.33 || 19 | 2,102 | 1 | 0.35
4 | 2,171 | 1 | 0.41 || 20 | 2,105 | 1 | 0.32
5 | 2,096 | 3 | 0.14 || 21 | 2,053 | 3 | 0.28
6 | 2,151 | 1 | 0.57 || 22 | 2,064 | 1 | 0.54
7 | 2,154 | 1 | 0.59 || 23 | 2,064 | 1 | 0.62
8 | 2,152 | 1 | 0.50 || 24 | 2,062 | 1 | 0.56
9 | 2,136 | 3 | 0.32 || 25 | 2,046 | 3 | 0.37
10 | 2,170 | 1 | 0.51 || 26 | 2,073 | 1 | 0.27
11 | 2,169 | 1 | 0.42 || 27 | 2,071 | 1 | 0.44
12 | 2,171 | 1 | 0.47 || 28 | 2,073 | 1 | 0.47
13 | 2,165 | 1 | 0.20 || 29 | 2,097 | 1 | 0.34
14 | 2,166 | 1 | 0.46 || 30 | 2,096 | 1 | 0.52
15 | 2,165 | 1 | 0.33 || 31 | 2,096 | 1 | 0.23
16 | 2,165 | 1 | 0.49 || 32 | 2,091 | 1 | 0.52

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,196 | 1 | 0.51 || 17 | 2,134 | 1 | 0.71
2 | 2,196 | 1 | 0.37 || 18 | 2,133 | 1 | 0.53
3 | 2,199 | 1 | 0.17 || 19 | 2,127 | 1 | 0.29
4 | 2,199 | 1 | 0.41 || 20 | 2,131 | 1 | 0.62
5 | 2,072 | 3 | 0.25 || 21 | 1,935 | 3 | 0.10
6 | 2,183 | 1 | 0.38 || 22 | 2,121 | 1 | 0.50
7 | 2,186 | 1 | 0.66 || 23 | 2,115 | 1 | 0.41
8 | 2,188 | 1 | 0.37 || 24 | 2,120 | 1 | 0.35
9 | 2,077 | 3 | 0.10 || 25 | 2,061 | 3 | 0.08
10 | 2,188 | 1 | 0.61 || 26 | 2,123 | 1 | 0.34
11 | 2,191 | 1 | 0.44 || 27 | 2,122 | 1 | 0.33
12 | 2,188 | 1 | 0.45 || 28 | 2,122 | 1 | 0.32
13 | 2,180 | 1 | 0.46 || 29 | 2,116 | 1 | 0.36
14 | 2,178 | 1 | 0.27 || 30 | 2,117 | 1 | 0.30
15 | 2,182 | 1 | 0.34 || 31 | 2,116 | 1 | 0.28
16 | 2,173 | 1 | 0.37 || 32 | 2,115 | 1 | 0.46

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C6. DC CAS 2012 Field Test Form Item Adjusted P Values: Mathematics (continued)
Mathematics Grade 10
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 1,741 | 1 | 0.58 || 17 | 1,702 | 1 | 0.37
2 | 1,739 | 1 | 0.28 || 18 | 1,701 | 1 | 0.38
3 | 1,733 | 1 | 0.31 || 19 | 1,697 | 1 | 0.42
4 | 1,737 | 1 | 0.35 || 20 | 1,704 | 1 | 0.50
5 | 1,317 | 3 | 0.20 || 21 | 1,554 | 3 | 0.12
6 | 1,725 | 1 | 0.30 || 22 | 1,689 | 1 | 0.21
7 | 1,715 | 1 | 0.30 || 23 | 1,689 | 1 | 0.14
8 | 1,724 | 1 | 0.66 || 24 | 1,691 | 1 | 0.37
9 | 1,485 | 3 | 0.07 || 25 | 1,432 | 3 | 0.18
10 | 1,713 | 1 | 0.23 || 26 | 1,676 | 1 | 0.49
11 | 1,712 | 1 | 0.55 || 27 | 1,680 | 1 | 0.51
12 | 1,711 | 1 | 0.49 || 28 | 1,680 | 1 | 0.41
13 | 1,703 | 1 | 0.41 || 29 | 1,679 | 1 | 0.50
14 | 1,706 | 1 | 0.31 || 30 | 1,670 | 1 | 0.35
15 | 1,706 | 1 | 0.57 || 31 | 1,677 | 1 | 0.40
16 | 1,707 | 1 | 0.48 || 32 | 1,675 | 1 | 0.33

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C7. DC CAS 2012 Field Test Form Item Adjusted P Values: Science/Biology
Science Grade 5
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,395 | 1 | 0.71 || 15 | 2,301 | 1 | 0.59
2 | 2,394 | 1 | 0.42 || 16 | 2,300 | 1 | 0.43
3 | 2,393 | 1 | 0.48 || 17 | 2,299 | 1 | 0.54
4 | 2,297 | 2 | 0.39 || 18 | 2,207 | 2 | 0.45
5 | 2,380 | 1 | 0.24 || 19 | 2,292 | 1 | 0.43
6 | 2,379 | 1 | 0.41 || 20 | 2,291 | 1 | 0.48
7 | 2,383 | 1 | 0.23 || 21 | 2,295 | 1 | 0.51
8 | 2,382 | 1 | 0.31 || 22 | 2,295 | 1 | 0.62
9 | 2,376 | 1 | 0.45 || 23 | 2,292 | 1 | 0.40
10 | 2,376 | 1 | 0.61 || 24 | 2,287 | 1 | 0.36
11 | 2,357 | 1 | 0.54 || 25 | 2,280 | 1 | 0.70
12 | 2,288 | 2 | 0.18 || 26 | 2,224 | 2 | 0.34
13 | 2,365 | 1 | 0.69 || 27 | 2,280 | 1 | 0.48
14 | 2,371 | 1 | 0.67 || 28 | 2,284 | 1 | 0.40

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C7. DC CAS 2012 Field Test Form Item Adjusted P Values: Science/Biology (continued)
Science Grade 8
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 2,148 | 1 | 0.31 || 15 | 2,092 | 1 | 0.57
2 | 2,153 | 1 | 0.43 || 16 | 2,095 | 1 | 0.50
3 | 2,152 | 1 | 0.43 || 17 | 2,093 | 1 | 0.54
4 | 1,800 | 2 | 0.12 || 18 | 1,926 | 2 | 0.03
5 | 2,143 | 1 | 0.38 || 19 | 2,085 | 1 | 0.33
6 | 2,145 | 1 | 0.41 || 20 | 2,090 | 1 | 0.30
7 | 2,142 | 1 | 0.55 || 21 | 2,091 | 1 | 0.50
8 | 2,141 | 1 | 0.28 || 22 | 2,092 | 1 | 0.43
9 | 2,142 | 1 | 0.27 || 23 | 2,090 | 1 | 0.33
10 | 2,134 | 1 | 0.27 || 24 | 2,086 | 1 | 0.33
11 | 2,120 | 1 | 0.13 || 25 | 2,077 | 1 | 0.31
12 | 1,965 | 2 | 0.35 || 26 | 1,874 | 2 | 0.12
13 | 2,124 | 1 | 0.39 || 27 | 2,081 | 1 | 0.38
14 | 2,127 | 1 | 0.34 || 28 | 2,078 | 1 | 0.34

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Table C7. DC CAS 2012 Field Test Form Item Adjusted P Values: Science/Biology (continued)
High School Biology
Item | N | Max Points | Adjusted P Value || Item | N | Max Points | Adjusted P Value
1 | 1,860 | 1 | 0.32 || 15 | 1,818 | 1 | 0.54
2 | 1,864 | 1 | 0.35 || 16 | 1,817 | 1 | 0.30
3 | 1,867 | 1 | 0.50 || 17 | 1,818 | 1 | 0.48
4 | 1,497 | 2 | 0.08 || 18 | 1,653 | 2 | 0.20
5 | 1,859 | 1 | 0.42 || 19 | 1,809 | 1 | 0.40
6 | 1,856 | 1 | 0.36 || 20 | 1,803 | 1 | 0.26
7 | 1,857 | 1 | 0.30 || 21 | 1,807 | 1 | 0.32
8 | 1,854 | 1 | 0.28 || 22 | 1,804 | 1 | 0.34
9 | 1,857 | 1 | 0.37 || 23 | 1,808 | 1 | 0.44
10 | 1,851 | 1 | 0.45 || 24 | 1,805 | 1 | 0.57
11 | 1,848 | 1 | 0.29 || 25 | 1,799 | 1 | 0.41
12 | 1,724 | 2 | 0.49 || 26 | 1,546 | 2 | 0.34
13 | 1,822 | 1 | 0.35 || 27 | 1,775 | 1 | 0.24
14 | 1,823 | 1 | 0.17 || 28 | 1,781 | 1 | 0.15

Note: The adjusted p value for an item includes responses only for examinees with valid responses to that item.


Appendix D: Internal Consistency Reliability Coefficients for Examinee Subgroups
(See Section 8. Evidence for Reliability and Validity, Internal Consistency Reliability, Table 31)
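Of the three coefficients reported in Tables D1-D4, Cronbach's alpha is the most widely known; stratified alpha and Feldt-Raju are related estimators that take account of the mix of multiple-choice and constructed-response items. As a point of reference, a minimal sketch of the plain alpha computation (illustrative only; the function name and data layout are assumptions, not the report's operational analysis):

    import numpy as np

    def cronbach_alpha(scores):
        """scores: examinees x items matrix with no missing entries.
        alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]                           # number of items
        item_vars = scores.var(axis=0, ddof=1).sum()  # summed item variances
        total_var = scores.sum(axis=1).var(ddof=1)    # variance of total score
        return (k / (k - 1)) * (1 - item_vars / total_var)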

Table D1. Internal Consistency Reliability Coefficients for Examinee Subgroups: Reading
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
2 | All Examinees | 4,469 | 0.88 | 0.88 | 0.88 | 241.97 | 15.78
2 | Male | 2,260 | 0.88 | 0.89 | 0.89 | 240.05 | 16.11
2 | Female | 2,186 | 0.87 | 0.87 | 0.87 | 244.00 | 15.15
2 | Asian | 95 | 0.85 | 0.86 | 0.86 | 252.17 | 13.76
2 | African American | 3,195 | 0.86 | 0.87 | 0.87 | 239.24 | 14.70
2 | Hispanic | 625 | 0.86 | 0.87 | 0.87 | 240.84 | 14.47
2 | White | 521 | 0.80 | 0.81 | 0.82 | 258.07 | 13.44
3 | All Examinees | 4,737 | 0.93 | 0.94 | 0.94 | 348.65 | 15.37
3 | Male | 2,390 | 0.93 | 0.94 | 0.94 | 346.78 | 15.61
3 | Female | 2,329 | 0.93 | 0.93 | 0.93 | 350.58 | 14.87
3 | Asian | 94 | 0.90 | 0.90 | 0.91 | 359.27 | 11.91
3 | African American | 3,459 | 0.92 | 0.93 | 0.93 | 346.05 | 14.62
3 | Hispanic | 664 | 0.92 | 0.92 | 0.92 | 349.03 | 14.04
3 | White | 479 | 0.90 | 0.90 | 0.90 | 364.76 | 11.98
4 | All Examinees | 4,559 | 0.92 | 0.92 | 0.92 | 452.42 | 15.09
4 | Male | 2,299 | 0.92 | 0.93 | 0.93 | 450.83 | 15.85
4 | Female | 2,241 | 0.91 | 0.91 | 0.91 | 454.13 | 14.03
4 | Asian | 102 | 0.92 | 0.93 | 0.93 | 460.84 | 16.16
4 | African American | 3,330 | 0.91 | 0.91 | 0.91 | 449.79 | 14.12
4 | Hispanic | 629 | 0.90 | 0.90 | 0.90 | 452.58 | 13.59
4 | White | 461 | 0.87 | 0.87 | 0.87 | 469.20 | 12.02
5 | All Examinees | 4,734 | 0.92 | 0.92 | 0.92 | 553.75 | 15.09
5 | Male | 2,395 | 0.92 | 0.92 | 0.92 | 551.45 | 15.65
5 | Female | 2,324 | 0.91 | 0.91 | 0.91 | 556.22 | 13.99
5 | Asian | 78 | 0.88 | 0.89 | 0.89 | 565.37 | 12.41
5 | African American | 3,686 | 0.91 | 0.91 | 0.91 | 551.66 | 14.44
5 | Hispanic | 578 | 0.91 | 0.91 | 0.91 | 554.59 | 14.31
5 | White | 365 | 0.81 | 0.82 | 0.82 | 571.01 | 9.96


Table D1. Internal Consistency Reliability Coefficients for Examinee Subgroups: Reading (continued)
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
6 | All Examinees | 4,539 | 0.91 | 0.92 | 0.92 | 650.16 | 14.20
6 | Male | 2,293 | 0.92 | 0.92 | 0.92 | 648.28 | 14.91
6 | Female | 2,220 | 0.91 | 0.91 | 0.91 | 652.15 | 13.08
6 | Asian | 68 | 0.91 | 0.92 | 0.92 | 659.51 | 13.95
6 | African American | 3,590 | 0.91 | 0.91 | 0.91 | 648.70 | 13.46
6 | Hispanic | 566 | 0.91 | 0.91 | 0.91 | 650.62 | 14.06
6 | White | 268 | 0.91 | 0.92 | 0.92 | 666.40 | 13.03
7 | All Examinees | 4,283 | 0.90 | 0.91 | 0.90 | 754.13 | 14.25
7 | Male | 2,150 | 0.91 | 0.91 | 0.91 | 751.95 | 14.93
7 | Female | 2,118 | 0.89 | 0.89 | 0.89 | 756.37 | 13.10
7 | Asian | 55 | 0.88 | 0.89 | 0.89 | 762.96 | 11.12
7 | African American | 3,442 | 0.89 | 0.90 | 0.90 | 752.69 | 13.97
7 | Hispanic | 507 | 0.89 | 0.90 | 0.90 | 755.59 | 12.67
7 | White | 240 | 0.90 | 0.91 | 0.91 | 768.72 | 12.97
8 | All Examinees | 4,337 | 0.90 | 0.91 | 0.91 | 853.86 | 14.32
8 | Male | 2,154 | 0.90 | 0.91 | 0.91 | 851.40 | 14.96
8 | Female | 2,159 | 0.89 | 0.90 | 0.90 | 856.42 | 13.12
8 | Asian | 54 | 0.90 | 0.91 | 0.91 | 862.48 | 12.81
8 | African American | 3,526 | 0.89 | 0.90 | 0.90 | 852.64 | 13.82
8 | Hispanic | 475 | 0.89 | 0.89 | 0.89 | 854.04 | 13.65
8 | White | 235 | 0.89 | 0.90 | 0.90 | 869.98 | 13.03
9 | All Examinees | 3,534 | 0.92 | 0.93 | 0.93 | 947.17 | 16.94
9 | Male | 1,701 | 0.92 | 0.92 | 0.92 | 944.95 | 16.72
9 | Female | 1,778 | 0.93 | 0.93 | 0.93 | 949.64 | 16.72
9 | Asian | 28 | 0.94 | 0.94 | 0.95 | 958.18 | 16.48
9 | African American | 2,940 | 0.92 | 0.92 | 0.92 | 946.91 | 16.14
9 | Hispanic | 388 | 0.94 | 0.94 | 0.94 | 944.50 | 19.69
9 | White | 96 | 0.92 | 0.92 | 0.93 | 969.07 | 11.92
10 | All Examinees | 4,230 | 0.92 | 0.92 | 0.92 | 951.32 | 15.45
10 | Male | 2,014 | 0.92 | 0.92 | 0.92 | 949.08 | 15.95
10 | Female | 2,170 | 0.91 | 0.92 | 0.92 | 953.56 | 14.57
10 | Asian | 64 | 0.91 | 0.91 | 0.91 | 961.02 | 12.91
10 | African American | 3,518 | 0.91 | 0.92 | 0.92 | 950.26 | 15.12
10 | Hispanic | 444 | 0.91 | 0.91 | 0.91 | 952.63 | 14.11
10 | White | 153 | 0.92 | 0.93 | 0.93 | 969.75 | 13.18


Table D2. Internal Consistency Reliability Coefficients for Examinee Subgroups: Mathematics
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
2 | All Examinees | 4,499 | 0.89 | 0.89 | 0.90 | 253.94 | 15.27
2 | Male | 2,276 | 0.90 | 0.90 | 0.90 | 253.85 | 15.82
2 | Female | 2,198 | 0.88 | 0.88 | 0.89 | 254.05 | 14.67
2 | Asian | 100 | 0.83 | 0.82 | 0.85 | 265.78 | 15.62
2 | African American | 3,209 | 0.88 | 0.88 | 0.88 | 251.10 | 13.73
2 | Hispanic | 632 | 0.88 | 0.88 | 0.89 | 253.41 | 14.10
2 | White | 523 | 0.80 | 0.81 | 0.81 | 269.50 | 14.85
3 | All Examinees | 4,771 | 0.93 | 0.94 | 0.94 | 352.29 | 17.70
3 | Male | 2,413 | 0.93 | 0.94 | 0.94 | 351.76 | 17.78
3 | Female | 2,339 | 0.93 | 0.94 | 0.94 | 352.92 | 17.55
3 | Asian | 97 | 0.89 | 0.89 | 0.90 | 368.14 | 13.88
3 | African American | 3,477 | 0.93 | 0.93 | 0.93 | 348.98 | 16.98
3 | Hispanic | 674 | 0.92 | 0.92 | 0.92 | 354.33 | 15.13
3 | White | 482 | 0.89 | 0.90 | 0.90 | 370.36 | 13.38
4 | All Examinees | 4,590 | 0.93 | 0.93 | 0.93 | 456.65 | 15.75
4 | Male | 2,314 | 0.93 | 0.93 | 0.94 | 455.55 | 16.66
4 | Female | 2,258 | 0.92 | 0.93 | 0.93 | 457.87 | 14.62
4 | Asian | 103 | 0.92 | 0.92 | 0.93 | 469.13 | 13.11
4 | African American | 3,349 | 0.92 | 0.92 | 0.92 | 453.72 | 15.04
4 | Hispanic | 638 | 0.92 | 0.92 | 0.92 | 458.47 | 14.00
4 | White | 463 | 0.89 | 0.89 | 0.89 | 472.59 | 11.67
5 | All Examinees | 4,747 | 0.93 | 0.93 | 0.93 | 557.66 | 16.67
5 | Male | 2,409 | 0.93 | 0.93 | 0.93 | 556.10 | 16.94
5 | Female | 2,322 | 0.93 | 0.93 | 0.93 | 559.41 | 16.12
5 | Asian | 81 | 0.93 | 0.93 | 0.93 | 572.89 | 15.89
5 | African American | 3,681 | 0.92 | 0.93 | 0.93 | 555.72 | 16.35
5 | Hispanic | 586 | 0.92 | 0.93 | 0.93 | 558.93 | 15.55
5 | White | 368 | 0.87 | 0.87 | 0.87 | 572.14 | 11.77


Table D2. Internal Consistency Reliability Coefficients for Examinee Subgroups: Mathematics (continued)
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
6 | All Examinees | 4,551 | 0.93 | 0.94 | 0.94 | 651.21 | 17.11
6 | Male | 2,295 | 0.93 | 0.94 | 0.94 | 650.10 | 17.44
6 | Female | 2,229 | 0.93 | 0.93 | 0.93 | 652.43 | 16.63
6 | Asian | 69 | 0.94 | 0.95 | 0.95 | 669.06 | 17.84
6 | African American | 3,594 | 0.92 | 0.93 | 0.93 | 649.04 | 16.26
6 | Hispanic | 572 | 0.93 | 0.93 | 0.93 | 654.13 | 15.41
6 | White | 269 | 0.93 | 0.94 | 0.94 | 669.81 | 16.14
7 | All Examinees | 4,297 | 0.92 | 0.92 | 0.93 | 753.33 | 17.49
7 | Male | 2,150 | 0.93 | 0.93 | 0.93 | 751.91 | 18.36
7 | Female | 2,131 | 0.92 | 0.92 | 0.92 | 754.82 | 16.37
7 | Asian | 55 | 0.93 | 0.93 | 0.93 | 771.02 | 16.60
7 | African American | 3,435 | 0.91 | 0.91 | 0.92 | 751.38 | 16.56
7 | Hispanic | 527 | 0.92 | 0.92 | 0.92 | 754.97 | 16.70
7 | White | 240 | 0.93 | 0.94 | 0.94 | 773.05 | 18.03
8 | All Examinees | 4,341 | 0.92 | 0.92 | 0.92 | 850.23 | 16.59
8 | Male | 2,161 | 0.91 | 0.91 | 0.92 | 848.52 | 17.18
8 | Female | 2,156 | 0.92 | 0.92 | 0.92 | 852.05 | 15.75
8 | Asian | 57 | 0.94 | 0.94 | 0.95 | 865.98 | 14.79
8 | African American | 3,508 | 0.90 | 0.90 | 0.91 | 848.57 | 16.17
8 | Hispanic | 493 | 0.90 | 0.91 | 0.91 | 851.80 | 14.59
8 | White | 234 | 0.93 | 0.93 | 0.93 | 868.26 | 14.43
10 | All Examinees | 3,466 | 0.91 | 0.92 | 0.92 | 946.80 | 18.80
10 | Male | 1,678 | 0.92 | 0.92 | 0.92 | 945.67 | 19.59
10 | Female | 1,748 | 0.91 | 0.91 | 0.91 | 948.10 | 17.84
10 | Asian | 60 | 0.89 | 0.89 | 0.89 | 964.82 | 12.73
10 | African American | 2,819 | 0.90 | 0.91 | 0.91 | 945.27 | 18.18
10 | Hispanic | 409 | 0.90 | 0.91 | 0.91 | 948.37 | 17.38
10 | White | 134 | 0.94 | 0.94 | 0.95 | 968.23 | 19.64


Table D3. Internal Consistency Reliability Coefficients for Examinee Subgroups: Science/Biology
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
5 | All Examinees | 4,697 | 0.89 | 0.89 | 0.89 | 548.40 | 13.28
5 | Male | 2,374 | 0.89 | 0.89 | 0.90 | 547.45 | 14.22
5 | Female | 2,296 | 0.89 | 0.89 | 0.89 | 549.44 | 12.13
5 | Asian | 79 | 0.90 | 0.91 | 0.91 | 557.41 | 9.53
5 | African American | 3,632 | 0.86 | 0.86 | 0.86 | 546.51 | 13.14
5 | Hispanic | 588 | 0.86 | 0.86 | 0.86 | 550.11 | 11.52
5 | White | 365 | 0.86 | 0.86 | 0.86 | 562.42 | 6.68
8 | All Examinees | 4,253 | 0.88 | 0.88 | 0.88 | 848.66 | 17.76
8 | Male | 2,088 | 0.89 | 0.89 | 0.89 | 847.90 | 18.47
8 | Female | 2,120 | 0.87 | 0.87 | 0.87 | 849.69 | 16.79
8 | Asian | 57 | 0.90 | 0.90 | 0.90 | 861.11 | 8.39
8 | African American | 3,416 | 0.85 | 0.85 | 0.86 | 847.23 | 17.82
8 | Hispanic | 493 | 0.84 | 0.85 | 0.85 | 850.23 | 15.10
8 | White | 233 | 0.91 | 0.91 | 0.91 | 865.06 | 11.40
High School | All Examinees | 3,693 | 0.85 | 0.86 | 0.86 | 947.91 | 14.76
High School | Male | 1,731 | 0.87 | 0.87 | 0.87 | 947.22 | 15.63
High School | Female | 1,882 | 0.84 | 0.84 | 0.84 | 948.92 | 13.45
High School | Asian | 69 | 0.88 | 0.88 | 0.88 | 955.90 | 9.04
High School | African American | 2,947 | 0.82 | 0.82 | 0.82 | 946.83 | 14.59
High School | Hispanic | 396 | 0.84 | 0.85 | 0.85 | 948.80 | 13.74
High School | White | 197 | 0.89 | 0.89 | 0.89 | 962.38 | 7.86


Table D4. Internal Consistency Reliability Coefficients for Examinee Subgroups: Composition
Grade | Subgroup | N | Alpha | Stratified Alpha | Feldt-Raju | Mean | SD
4 | All Examinees | 4,508 | 0.92 | 0.92 | 0.93 | 451.73 | 18.87
4 | Male | 2,284 | 0.93 | 0.92 | 0.93 | 448.48 | 19.16
4 | Female | 2,215 | 0.92 | 0.91 | 0.92 | 455.14 | 17.93
4 | Asian | 104 | 0.92 | 0.92 | 0.93 | 459.50 | 18.29
4 | African American | 3,293 | 0.91 | 0.91 | 0.91 | 448.95 | 18.15
4 | Hispanic | 623 | 0.90 | 0.90 | 0.91 | 453.93 | 17.44
4 | White | 458 | 0.85 | 0.85 | 0.87 | 466.71 | 17.74
7 | All Examinees | 4,176 | 0.91 | 0.90 | 0.92 | 754.33 | 15.76
7 | Male | 2,086 | 0.91 | 0.91 | 0.92 | 751.07 | 15.91
7 | Female | 2,083 | 0.90 | 0.89 | 0.91 | 757.60 | 14.90
7 | Asian | 55 | 0.89 | 0.88 | 0.90 | 765.44 | 14.56
7 | African American | 3,360 | 0.90 | 0.90 | 0.91 | 752.37 | 15.12
7 | Hispanic | 498 | 0.90 | 0.90 | 0.91 | 758.36 | 14.80
7 | White | 228 | 0.90 | 0.91 | 0.91 | 770.39 | 15.35
10 | All Examinees | 3,429 | 0.92 | 0.92 | 0.93 | 952.18 | 20.11
10 | Male | 1,616 | 0.93 | 0.93 | 0.93 | 948.47 | 19.80
10 | Female | 1,801 | 0.92 | 0.92 | 0.93 | 955.53 | 19.85
10 | Asian | 43 | 0.91 | 0.91 | 0.93 | 965.28 | 15.21
10 | African American | 2,879 | 0.92 | 0.92 | 0.93 | 950.67 | 20.24
10 | Hispanic | 364 | 0.91 | 0.91 | 0.92 | 956.28 | 15.71
10 | White | 123 | 0.91 | 0.91 | 0.92 | 970.93 | 18.80


Appendix E: Classification Consistency and Accuracy Estimates for All
Proficiency Levels for Examinee Subgroups
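For reading the tables that follow: consistency is the estimated proportion of examinees who would receive the same proficiency classification on two parallel administrations, kappa corrects that proportion for chance agreement, and accuracy compares observed classifications with estimated true status, with misclassifications split into false positive and false negative errors. A hedged sketch of the two agreement statistics from a cross-classification table (the counts below are hypothetical; the report's estimates are model-based, derived from a single administration rather than an actual retest):

    import numpy as np

    def consistency_and_kappa(table):
        """table: square cross-classification of proficiency levels
        assigned on two parallel administrations."""
        t = np.asarray(table, dtype=float)
        n = t.sum()
        agree = np.trace(t) / n  # raw agreement (classification consistency)
        chance = (t.sum(axis=0) * t.sum(axis=1)).sum() / n**2  # chance agreement
        return agree, (agree - chance) / (1 - chance)  # (consistency, kappa)

    # Hypothetical 3-level example:
    print(consistency_and_kappa([[40, 8, 2], [7, 55, 10], [1, 9, 68]]))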
Table E1. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Reading
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 2
Males | 0.70 | 0.58 | 0.79 | 0.10 | 0.11
Females | 0.69 | 0.56 | 0.77 | 0.11 | 0.12
Asian | 0.69 | 0.53 | 0.78 | 0.12 | 0.10
African American | 0.70 | 0.56 | 0.78 | 0.10 | 0.12
Hispanic | 0.69 | 0.56 | 0.78 | 0.11 | 0.12
White | 0.70 | 0.52 | 0.78 | 0.12 | 0.10

Grade 3
Males | 0.79 | 0.70 | 0.85 | 0.07 | 0.08
Females | 0.78 | 0.68 | 0.84 | 0.07 | 0.09
Asian | 0.73 | 0.54 | 0.80 | 0.10 | 0.11
African American | 0.79 | 0.69 | 0.85 | 0.06 | 0.08
Hispanic | 0.78 | 0.67 | 0.85 | 0.07 | 0.08
White | 0.77 | 0.56 | 0.84 | 0.07 | 0.09

Grade 4
Males | 0.78 | 0.67 | 0.84 | 0.07 | 0.09
Females | 0.77 | 0.65 | 0.84 | 0.07 | 0.09
Asian | 0.78 | 0.66 | 0.84 | 0.07 | 0.08
African American | 0.77 | 0.66 | 0.84 | 0.07 | 0.09
Hispanic | 0.77 | 0.65 | 0.84 | 0.07 | 0.09
White | 0.80 | 0.61 | 0.72 | 0.02 | 0.26

Grade 5
Males | 0.81 | 0.74 | 0.74 | 0.08 | 0.18
Females | 0.76 | 0.64 | 0.83 | 0.09 | 0.09
Asian | 0.76 | 0.63 | 0.83 | 0.10 | 0.07
African American | 0.76 | 0.64 | 0.83 | 0.08 | 0.09
Hispanic | 0.76 | 0.63 | 0.83 | 0.08 | 0.09
White | 0.74 | 0.54 | 0.81 | 0.10 | 0.10

Grade 6
Males | 0.78 | 0.67 | 0.85 | 0.07 | 0.08
Females | 0.76 | 0.63 | 0.83 | 0.08 | 0.09
Asian | 0.75 | 0.61 | 0.82 | 0.09 | 0.09
African American | 0.77 | 0.65 | 0.84 | 0.08 | 0.09
Hispanic | 0.77 | 0.64 | 0.83 | 0.09 | 0.08
White | 0.75 | 0.57 | 0.82 | 0.09 | 0.09


Table E1. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Reading (continued)
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 7
Males | 0.72 | 0.59 | 0.80 | 0.10 | 0.10
Females | 0.71 | 0.58 | 0.79 | 0.10 | 0.11
Asian | 0.73 | 0.59 | 0.80 | 0.09 | 0.10
African American | 0.71 | 0.58 | 0.79 | 0.10 | 0.11
Hispanic | 0.70 | 0.56 | 0.78 | 0.11 | 0.11
White | 0.79 | 0.62 | 0.85 | 0.09 | 0.06

Grade 8
Males | 0.72 | 0.60 | 0.81 | 0.09 | 0.10
Females | 0.73 | 0.62 | 0.81 | 0.09 | 0.10
Asian | 0.73 | 0.60 | 0.81 | 0.11 | 0.08
African American | 0.72 | 0.59 | 0.80 | 0.09 | 0.10
Hispanic | 0.72 | 0.58 | 0.80 | 0.10 | 0.10
White | 0.77 | 0.60 | 0.84 | 0.09 | 0.07

Grade 9
Males | 0.74 | 0.62 | 0.82 | 0.09 | 0.10
Females | 0.74 | 0.64 | 0.82 | 0.09 | 0.09
Asian | 0.77 | 0.65 | 0.85 | 0.06 | 0.09
African American | 0.74 | 0.63 | 0.81 | 0.09 | 0.09
Hispanic | 0.74 | 0.64 | 0.82 | 0.09 | 0.09
White | 0.80 | 0.69 | 0.86 | 0.06 | 0.08

Grade 10
Males | 0.75 | 0.64 | 0.83 | 0.07 | 0.10
Females | 0.74 | 0.62 | 0.82 | 0.09 | 0.09
Asian | 0.74 | 0.63 | 0.82 | 0.09 | 0.09
African American | 0.75 | 0.63 | 0.82 | 0.08 | 0.10
Hispanic | 0.72 | 0.60 | 0.81 | 0.09 | 0.10
White | 0.76 | 0.57 | 0.83 | 0.10 | 0.08


Table E2. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Mathematics
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 2
Males | 0.72 | 0.61 | 0.79 | 0.11 | 0.11
Females | 0.72 | 0.60 | 0.79 | 0.10 | 0.11
Asian | 0.73 | 0.58 | 0.78 | 0.10 | 0.12
African American | 0.72 | 0.60 | 0.79 | 0.10 | 0.11
Hispanic | 0.71 | 0.59 | 0.78 | 0.10 | 0.11
White | 0.74 | 0.55 | 0.76 | 0.14 | 0.10

Grade 3
Males | 0.78 | 0.69 | 0.85 | 0.07 | 0.08
Females | 0.78 | 0.69 | 0.85 | 0.07 | 0.08
Asian | 0.76 | 0.63 | 0.83 | 0.08 | 0.09
African American | 0.84 | 0.78 | 0.79 | 0.04 | 0.17
Hispanic | 0.76 | 0.65 | 0.83 | 0.08 | 0.10
White | 0.74 | 0.59 | 0.82 | 0.09 | 0.09

Grade 4
Males | 0.77 | 0.67 | 0.83 | 0.08 | 0.08
Females | 0.77 | 0.67 | 0.83 | 0.08 | 0.08
Asian | 0.82 | 0.72 | 0.87 | 0.06 | 0.07
African American | 0.76 | 0.66 | 0.83 | 0.08 | 0.08
Hispanic | 0.77 | 0.66 | 0.83 | 0.08 | 0.09
White | 0.79 | 0.64 | 0.85 | 0.08 | 0.07

Grade 5
Males | 0.77 | 0.67 | 0.83 | 0.08 | 0.09
Females | 0.76 | 0.66 | 0.83 | 0.09 | 0.09
Asian | 0.78 | 0.66 | 0.83 | 0.08 | 0.09
African American | 0.76 | 0.67 | 0.83 | 0.08 | 0.09
Hispanic | 0.76 | 0.66 | 0.83 | 0.08 | 0.09
White | 0.76 | 0.61 | 0.82 | 0.09 | 0.09

Grade 6
Males | 0.76 | 0.67 | 0.83 | 0.08 | 0.08
Females | 0.76 | 0.66 | 0.83 | 0.09 | 0.08
Asian | 0.94 | 0.88 | 0.81 | 0.03 | 0.16
African American | 0.76 | 0.65 | 0.83 | 0.08 | 0.09
Hispanic | 0.76 | 0.66 | 0.83 | 0.09 | 0.07
White | 0.82 | 0.67 | 0.87 | 0.06 | 0.07

Grade 7
Males | 0.75 | 0.64 | 0.82 | 0.08 | 0.09
Females | 0.75 | 0.63 | 0.82 | 0.09 | 0.09
Asian | 0.80 | 0.65 | 0.85 | 0.08 | 0.06
African American | 0.74 | 0.62 | 0.81 | 0.09 | 0.10
Hispanic | 0.76 | 0.64 | 0.83 | 0.09 | 0.08
White | 0.84 | 0.70 | 0.89 | 0.06 | 0.06


Table E2. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Mathematics (continued)
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 8
Males | 0.72 | 0.60 | 0.80 | 0.09 | 0.11
Females | 0.73 | 0.60 | 0.81 | 0.10 | 0.09
Asian | 0.79 | 0.67 | 0.85 | 0.09 | 0.06
African American | 0.72 | 0.58 | 0.80 | 0.10 | 0.10
Hispanic | 0.73 | 0.59 | 0.81 | 0.10 | 0.09
White | 0.83 | 0.70 | 0.88 | 0.06 | 0.06

Grade 10
Males | 0.73 | 0.62 | 0.80 | 0.09 | 0.10
Females | 0.73 | 0.60 | 0.80 | 0.10 | 0.10
Asian | 0.80 | 0.63 | 0.86 | 0.06 | 0.08
African American | 0.72 | 0.60 | 0.80 | 0.10 | 0.10
Hispanic | 0.72 | 0.59 | 0.80 | 0.10 | 0.11
White | 0.80 | 0.69 | 0.86 | 0.08 | 0.06


Table E3. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Science/Biology
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 5
Males | 0.72 | 0.60 | 0.80 | 0.10 | 0.10
Females | 0.71 | 0.58 | 0.80 | 0.10 | 0.11
Asian | 0.76 | 0.63 | 0.82 | 0.08 | 0.10
African American | 0.71 | 0.57 | 0.80 | 0.10 | 0.11
Hispanic | 0.70 | 0.55 | 0.79 | 0.10 | 0.11
White | 0.77 | 0.61 | 0.83 | 0.08 | 0.09

Grade 8
Males | 0.68 | 0.54 | 0.76 | 0.10 | 0.13
Females | 0.68 | 0.54 | 0.76 | 0.11 | 0.13
Asian | 0.74 | 0.59 | 0.82 | 0.09 | 0.09
African American | 0.67 | 0.52 | 0.76 | 0.10 | 0.14
Hispanic | 0.66 | 0.51 | 0.75 | 0.12 | 0.14
White | 0.79 | 0.65 | 0.85 | 0.08 | 0.07

High School
Males | 0.67 | 0.51 | 0.75 | 0.11 | 0.14
Females | 0.65 | 0.48 | 0.74 | 0.13 | 0.13
Asian | 0.74 | 0.55 | 0.81 | 0.09 | 0.10
African American | 0.65 | 0.48 | 0.74 | 0.13 | 0.14
Hispanic | 0.66 | 0.49 | 0.75 | 0.12 | 0.13
White | 0.78 | 0.62 | 0.84 | 0.08 | 0.08


Table E4. Classification Consistency and Accuracy Rates for All Cut Scores and Examinee
Subgroups: Composition
Subgroup | Consistency | Kappa | Accuracy | False Positive Errors | False Negative Errors

Grade 4
Males | 0.55 | 0.37 | 0.65 | 0.17 | 0.18
Females | 0.52 | 0.34 | 0.62 | 0.20 | 0.18
Asian | 0.54 | 0.35 | 0.64 | 0.18 | 0.18
African American | 0.53 | 0.35 | 0.63 | 0.18 | 0.19
Hispanic | 0.52 | 0.33 | 0.62 | 0.20 | 0.18
White | 0.56 | 0.34 | 0.65 | 0.22 | 0.13

Grade 7
Males | 0.59 | 0.43 | 0.69 | 0.15 | 0.15
Females | 0.57 | 0.40 | 0.68 | 0.18 | 0.15
Asian | 0.60 | 0.39 | 0.69 | 0.19 | 0.12
African American | 0.57 | 0.40 | 0.68 | 0.17 | 0.16
Hispanic | 0.59 | 0.43 | 0.70 | 0.16 | 0.14
White | 0.70 | 0.44 | 0.78 | 0.14 | 0.08

Grade 10
Males | 0.52 | 0.34 | 0.62 | 0.16 | 0.21
Females | 0.52 | 0.34 | 0.61 | 0.19 | 0.19
Asian | 0.55 | 0.34 | 0.64 | 0.22 | 0.15
African American | 0.51 | 0.34 | 0.61 | 0.18 | 0.21
Hispanic | 0.51 | 0.32 | 0.60 | 0.19 | 0.21
White | 0.66 | 0.42 | 0.73 | 0.16 | 0.10
