Presenter - John Cronin, Ph.D.
Contacting us:
NWEA Main Number: 503-624-1951
E-mail: rebecca.moore@nwea.org
This PowerPoint presentation and recommended resources are
available at our Slideshare site:
http://www.slideshare.net/JohnCronin4/colorado-assessment-summitteachereval
Considerations when using tests for
teacher evaluation
Key Colorado requirements related to
testing
• Assessment constitutes 50% of the evaluation.
• Statewide summative assessments are used for subjects in which they are
available. Districts will be on their own for other subjects.
• Use of the Colorado Growth Model with statewide assessment.
• A measure of individually attributed or collectively attributed student
growth.
• Local measure must be credible, valid (aligned), reliable, and inferences
from the measure must be supportable by evidence and logic.
• The law requires that the measures should support consistent inferences.
• Rating of ineffective or partially effective can lead to loss of non-
probationary status.
• If a value-added model is used, the model must be transparent enough to
permit external evaluation.
Unique characteristics of the
Colorado approach
• Student progress counts for 50% of the
evaluation.
• Teachers are evaluated on both a “catch up”
and “keep up” metric (at least on TCAP)
• The Colorado Growth Model will be used to
evaluate progress (at least on TCAP)
A finding of effectiveness or ineffectiveness is
more defensible when it is arrived at by:
1. Two or more assessments of different designs.
2. Two or more models of different designs.
3. As many cases as possible.
It is not good to choose tests or models for local
assessment in hopes that they will mimic the
state assessment.
If evaluators do not
differentiate their
ratings, then all
differentiation
comes from the test.
If performance
ratings aren’t
consistent with
school growth, that
will probably be
public information.
Results of Tennessee Teacher Evaluation Pilot
[Chart: y-axis 0% to 60%; x-axis rating levels 1 to 5; series: Value-added result, Observation Result]
Results of Georgia Teacher Evaluation Pilot
[Chart: Evaluator Rating; categories: Ineffective, Minimally Effective, Effective, Highly Effective]
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective
Teaching: Culminating Findings from the MET Project’s Three-Year Study
Model     State test   Student surveys   Classroom observations   Reliability coefficient*   Proportion of test variance explained
Model 1   81%          17%               2%                       .51                        26.0%
Model 2   50%          25%               25%                      .66                        43.5%
Model 3   33%          33%               33%                      .76                        57.7%
Model 4   25%          25%               50%                      .75                        56.2%
* Relative to state test value-added gain.
Reliability of evaluation weights in predicting
stability of student growth gains year to year
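Each model in the table is a different weighting of the same three components. The sketch below shows one way such a composite could be computed. It assumes the component scores are already on a common scale; the function name `composite_score` and the sample teacher scores are hypothetical and are not taken from the MET study.

```python
# Component weights (percent) for the four models listed in the table.
MODEL_WEIGHTS = {
    "Model 1": {"state_test": 81, "student_surveys": 17, "observations": 2},
    "Model 2": {"state_test": 50, "student_surveys": 25, "observations": 25},
    "Model 3": {"state_test": 33, "student_surveys": 33, "observations": 33},
    "Model 4": {"state_test": 25, "student_surveys": 25, "observations": 50},
}


def composite_score(components, weights):
    """Weighted average of component scores; weights are normalized to sum to 1."""
    total = sum(weights.values())
    return sum(components[name] * w / total for name, w in weights.items())


# Hypothetical teacher (illustrative numbers): weak state-test result, strong surveys.
teacher = {"state_test": 0.2, "student_surveys": 0.6, "observations": 0.4}
for model, weights in MODEL_WEIGHTS.items():
    print(model, round(composite_score(teacher, weights), 3))
```

The same teacher moves up or down depending on which model is chosen, which is why the choice of weights deserves as much scrutiny as the choice of test.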
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective
Teaching: Culminating Findings from the MET Project’s Three-Year Study
Observation by                                                   Reliability coefficient*   Proportion of test variance explained
Principal – 1                                                    .51                        26.0%
Principal – 2                                                    .58                        33.6%
Principal and other administrator                                .67                        44.9%
Principal and three short observations by peer observers         .67                        44.9%
Two principal observations and two peer observations             .66                        43.6%
Two principal observations and two different peer observers      .69                        47.6%
Two principal observations, one peer observation and three
short observations by peers                                      .72                        51.8%
* Relative to state test value-added gain.
Reliability of a variety of teacher observation
implementations
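In the observation table, each "proportion of test variance explained" figure equals the square of the reliability coefficient beside it, rounded to one decimal place. A short check using only the values listed in the table (the names `rows` and `variance_explained_pct` are illustrative):

```python
# (reliability coefficient, reported percent of variance explained) from the table
rows = [
    (0.51, 26.0), (0.58, 33.6), (0.67, 44.9), (0.67, 44.9),
    (0.66, 43.6), (0.69, 47.6), (0.72, 51.8),
]


def variance_explained_pct(r):
    """Percent of variance explained by a correlation-type coefficient r."""
    return round(r * r * 100, 1)


for r, reported in rows:
    print(r, variance_explained_pct(r), reported)
```

Read this way, even the most intensive observation design in the table (.72) leaves close to half of the variance unexplained.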
How tests are used to evaluate teachers and
principals
1. Testing
2. Metric (Growth or Gain Score)
3. Analysis (Value Added Effect Size and/or ranking)
4. Evaluation (Performance Rating)
Issues in the use of growth measures
Instructional alignment
Tests used for teacher evaluation
must align to the teacher’s
instructional responsibilities.
Common problems with instructional
alignment
• Using school level math and reading
results in the evaluation of music,
art, and other specials teachers.
• Using general tests of a discipline
(reading, math, science) as a major
component of the evaluation of high
school teachers delivering specialized
courses.
Florida Teachers Sue Over Evaluation System
New York Times, April 17, 2013
Seven Florida teachers have brought a federal lawsuit to protest job evaluation
policies that tether individual performance ratings to the test scores of students
who are not even in their classes. The suit, which was filed Tuesday in
conjunction with three local affiliates of the National Education Association in
Federal District Court for the Northern District of Florida in Gainesville, says
Florida’s two-year-old evaluation system violates teachers’ rights of due process
and equal protection. Under a 2011 law, schools and districts must evaluate
teachers in part based on how much their students learn, as measured by
standardized tests. But since Florida, like most states, administers only math and
reading tests and only in selected grades, many teachers do not teach tested
subjects. One of the plaintiffs, a first-grade teacher, was rated on the
basis of test scores of students in a different school in her
district, and another, who teaches vocational classes to aspiring
health care workers, was rated based on test scores of students in
grades and subjects she had never taught. “This lawsuit highlights the
absurdity of the current evaluation system,” said Andy Ford, president of the
Florida Education Association.
Expect consistent inconsistency!
Inconsistency occurs because of:
• Differences in test design.
• Differences in testing conditions.
• Differences in the models being applied to
evaluate growth.
Test Retest
The reliability problem –
Inconsistency in testing conditions
[Diagram: Test 1 and Test 2, each administered at Time 1 and Time 2]
The problem with spring-spring testing
[Timeline, 3/11 to 3/12: Teacher 1, Summer, Teacher 2]
Characteristics of value-added metrics
• Value-added metrics are inherently NORMATIVE.
• If below average = partially effective, then about half of a
typical staff will be rated partially effective.
• Value-added metrics can’t measure progress of the
larger group over time.
• Extreme performance is more likely to have alternate
explanations.
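The first three points can be illustrated with a small sketch. It assumes the simplest possible normative metric, where each teacher's score is the gap between their students' mean growth and the group mean; real value-added models are more elaborate, and all numbers here are invented for illustration.

```python
from statistics import mean


def value_added(scores):
    """Center growth scores on the group mean, as a purely normative metric would."""
    avg = mean(scores)
    return [s - avg for s in scores]


def share_below_average(scores):
    """Fraction of teachers whose centered score falls below zero."""
    return sum(v < 0 for v in value_added(scores)) / len(scores)


# Hypothetical mean student growth per teacher (illustrative numbers only).
year_1 = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.5]
year_2 = [s + 3.0 for s in year_1]  # every teacher's students grow 3 points more

print(share_below_average(year_1), share_below_average(year_2))
print(mean(value_added(year_1)), mean(value_added(year_2)))
```

Although every teacher improved in the second year, the share rated below average and the group's mean value-added are unchanged, because the metric only expresses standing relative to the group.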
New York City
• Margins of error can be very large
• Increasing n doesn't always decrease the
margin of error
• The margin of error in math is typically less
than in reading
Los Angeles Unified
• Teachers can easily rate in multiple categories
• The choice of model can have a large impact
• Models affect English more than math
• Teachers do better in some subjects than
others
• More complex models don't necessarily favor
the teacher
“The findings indicate that these modeling
choices can significantly influence outcomes
for individual teachers, particularly those in
the tails of the performance distribution who
are most likely to be targeted by high-stakes
policies.”
Ballou, D., Mokher, C. and Cavalluzzo, L. (2012) Using Value-Added Assessment for Personnel
Decisions: How Omitted Variables and Model Specification Influence Teachers’ Outcomes.
Instability at the tails of the
distribution
LA Times Teacher #1
LA Times Teacher #2
“Significant evidence of bias plagued the value-added model
estimated for the Los Angeles Times in 2010, including significant
patterns of racial disparities in teacher ratings both by the race of
the student served and by the race of the teachers (see
Green, Baker and Oluwole, 2012). These model biases raise the
possibility that Title VII disparate impact claims might also be filed
by teachers dismissed on the basis of their value-added estimates.
Additional analyses of the data, including richer models using
additional variables mitigated substantial portions of the bias in the
LA Times models (Briggs & Domingue, 2010).”
Baker, B. (2012, April 28). If it’s not valid, reliability doesn’t
matter so much! More on VAM-ing & SGP-ing
Teacher Dismissal.
Possible racial bias in models
Inconsistency among the Colorado
Growth Model and other value-added
approaches.
Issues with the Colorado Growth
Model
• When applied to MAP it discards the
advantages of a cross-grade scale and robust
growth norms.
• It is a descriptive and not a causal model.
• As currently applied it does not control for
factors outside the teacher’s influence that
may affect student growth.
A brief commentary on the Colorado Growth
Model
Its limitations
•It does not support inference.
•It does not take advantage of the
useful characteristics of a vertical
scale.
•It uses only prior scores and past
testing history to evaluate growth.
A brief commentary on the Colorado Growth
Model
Other limitations
•The model can’t be used for cross-
state comparisons.
• The model is problematic for
assessing long-term trends.
Measurement Issues
Moving from the model to
the teacher rating
Translating ranked data to ratings -
principles
• There is no “science” per se around translating a
ranking to a rating. If you call a bottom 40% teacher
ineffective that is a judgment.
• The rating process can be politicized.
• The process is easy to over-engineer.
New York Rating System
• 60 points assigned from classroom observation
• 20 points assigned from state assessment
• 20 points assigned from local assessment
• A score of 64 or less is rated ineffective.
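The point allocation above amounts to a simple sum with a cut score. The sketch below encodes only what the slide states (60, 20, and 20 points, and a composite of 64 or less being ineffective); the function names are hypothetical, and the cut scores for the other rating bands are not given here, so they are not modeled.

```python
def composite_rating_points(observation, state, local):
    """Sum the three components; maxima are 60, 20 and 20 points."""
    for value, maximum in ((observation, 60), (state, 20), (local, 20)):
        if not 0 <= value <= maximum:
            raise ValueError(f"component {value} outside 0..{maximum}")
    return observation + state + local


def is_ineffective(total_points):
    """A composite of 64 or less is rated ineffective, per the slide."""
    return total_points <= 64


total = composite_rating_points(observation=50, state=7, local=7)
print(total, is_ineffective(total))
```

In this example a teacher with 50 of 60 observation points still lands in the ineffective band, because the two assessment components together contribute only 14 points.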
Column bands (Growth Measures, points 0 to 40): Ineffective, Developing, Effective, Highly Effective
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
Ineffective (Observational)
0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
1 2 3 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
2 2 4 5 6 6 6 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
3 2 5 6 7 7 8 8 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
4 3 5 7 8 9 9 10 10 11 11 11 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15
5 3 6 8 9 10 11 11 12 12 13 13 14 14 14 14 15 15 15 15 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18
6 3 6 8 10 11 12 13 13 14 14 15 15 16 16 16 17 17 17 17 18 18 18 18 18 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 21
7 3 7 9 11 12 13 14 15 15 16 16 17 17 18 18 18 19 19 19 20 20 20 20 20 21 21 21 21 21 22 22 22 22 22 22 22 23 23 23 23 23
8 3 7 10 11 13 14 15 16 17 17 18 18 19 19 20 20 20 21 21 21 22 22 22 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25 25
9 3 8 10 12 14 15 16 17 18 18 19 20 20 21 21 22 22 23 23 23 24 24 24 24 25 25 25 25 26 26 26 26 26 27 27 27 27 27 27 28 28
10 3 8 11 13 14 16 17 18 19 20 20 21 22 22 23 23 24 24 25 25 25 26 26 26 27 27 27 27 28 28 28 28 29 29 29 29 29 29 30 30 30
11 3 8 11 13 15 17 18 19 20 21 22 22 23 24 24 25 25 26 26 27 27 27 28 28 28 29 29 29 30 30 30 30 31 31 31 31 31 32 32 32 32
12 4 8 12 14 16 17 19 20 21 22 23 24 24 25 26 26 27 27 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 33 33 34 34 34 34
13 4 9 12 14 16 18 20 21 22 23 24 25 26 26 27 28 28 29 29 30 30 31 31 31 32 32 33 33 33 34 34 34 34 35 35 35 35 36 36 36 36
14 4 9 12 15 17 19 20 22 23 24 25 26 27 27 28 29 30 30 31 31 32 32 33 33 33 34 34 35 35 35 36 36 36 37 37 37 37 38 38 38 38
15 4 9 13 15 18 19 21 23 24 25 26 27 28 29 29 30 31 31 32 33 33 34 34 35 35 35 36 36 37 37 37 38 38 38 39 39 39 40 40 40 40
Developing (Observational)
16 4 9 13 16 18 20 22 23 25 26 27 28 29 30 31 31 32 33 33 34 35 35 36 36 37 37 37 38 38 39 39 39 40 40 40 41 41 41 42 42 42
17 4 9 13 16 19 21 23 24 25 27 28 29 30 31 32 33 33 34 35 35 36 37 37 38 38 39 39 39 40 40 41 41 42 42 42 43 43 43 44 44 44
18 4 10 14 17 19 21 23 25 26 28 29 30 31 32 33 34 35 35 36 37 37 38 38 39 40 40 41 41 41 42 42 43 43 44 44 44 45 45 45 46 46
19 4 10 14 17 20 22 24 26 27 28 30 31 32 33 34 35 36 36 37 38 39 39 40 40 41 42 42 43 43 43 44 44 45 45 46 46 46 47 47 47 48
20 4 10 14 17 20 22 24 26 28 29 31 32 33 34 35 36 37 38 38 39 40 41 41 42 42 43 43 44 45 45 45 46 46 47 47 48 48 48 49 49 49
21 4 10 14 18 21 23 25 27 29 30 31 33 34 35 36 37 38 39 40 40 41 42 42 43 44 44 45 45 46 46 47 47 48 48 49 49 50 50 50 51 51
22 4 10 15 18 21 23 26 27 29 31 32 34 35 36 37 38 39 40 41 42 42 43 44 44 45 46 46 47 47 48 48 49 49 50 50 51 51 52 52 52 53
23 4 10 15 18 21 24 26 28 30 31 33 34 36 37 38 39 40 41 42 43 43 44 45 46 46 47 48 48 49 49 50 50 51 51 52 52 53 53 54 54 54
24 4 11 15 19 22 24 27 29 31 32 34 35 36 38 39 40 41 42 43 44 45 45 46 47 48 48 49 50 50 51 51 52 52 53 53 54 54 55 55 56 56
25 4 11 15 19 22 25 27 29 31 33 34 36 37 39 40 41 42 43 44 45 46 47 47 48 49 50 50 51 52 52 53 53 54 54 55 55 56 56 57 57 58
26 4 11 16 19 23 25 28 30 32 34 35 37 38 39 41 42 43 44 45 46 47 48 49 49 50 51 51 52 53 53 54 55 55 56 56 57 57 58 58 59 59
27 4 11 16 20 23 26 28 30 32 34 36 37 39 40 42 43 44 45 46 47 48 49 50 50 51 52 53 53 54 55 55 56 57 57 58 58 59 59 60 60 61
28 4 11 16 20 23 26 29 31 33 35 37 38 40 41 42 44 45 46 47 48 49 50 51 52 52 53 54 55 55 56 57 57 58 59 59 60 60 61 61 62 62
29 4 11 16 20 24 26 29 31 34 35 37 39 40 42 43 45 46 47 48 49 50 51 52 53 54 54 55 56 57 57 58 59 59 60 61 61 62 62 63 63 64
30 4 11 16 20 24 27 30 32 34 36 38 40 41 43 44 45 47 48 49 50 51 52 53 54 55 56 56 57 58 59 59 60 61 61 62 62 63 64 64 65 65
Effective (Observational)
31 4 11 17 21 24 27 30 32 35 37 39 40 42 43 45 46 47 49 50 51 52 53 54 55 56 57 57 58 59 60 61 61 62 63 63 64 64 65 66 66 67
32 4 11 17 21 25 28 30 33 35 37 39 41 43 44 46 47 48 50 51 52 53 54 55 56 57 58 59 59 60 61 62 62 63 64 64 65 66 66 67 68 68
33 4 12 17 21 25 28 31 33 36 38 40 42 43 45 46 48 49 50 52 53 54 55 56 57 58 59 60 61 61 62 63 64 64 65 66 66 67 68 68 69 69
34 4 12 17 21 25 28 31 34 36 38 40 42 44 46 47 49 50 51 53 54 55 56 57 58 59 60 61 62 63 63 64 65 66 66 67 68 68 69 70 70 71
35 4 12 17 22 25 29 32 34 37 39 41 43 45 46 48 49 51 52 53 55 56 57 58 59 60 61 62 63 64 64 65 66 67 68 68 69 70 70 71 72 72
36 4 12 17 22 26 29 32 35 37 39 41 43 45 47 49 50 52 53 54 55 57 58 59 60 61 62 63 64 65 66 66 67 68 69 69 70 71 72 72 73 74
37 4 12 17 22 26 29 32 35 38 40 42 44 46 48 49 51 52 54 55 56 58 59 60 61 62 63 64 65 66 67 68 68 69 70 71 71 72 73 74 74 75
38 4 12 18 22 26 30 33 36 38 40 43 45 46 48 50 52 53 55 56 57 58 60 61 62 63 64 65 66 67 68 69 69 70 71 72 73 73 74 75 75 76
39 4 12 18 22 26 30 33 36 39 41 43 45 47 49 51 52 54 55 57 58 59 61 62 63 64 65 66 67 68 69 70 71 71 72 73 74 75 75 76 77 77
40 4 12 18 23 27 30 33 36 39 41 44 46 48 50 51 53 55 56 57 59 60 61 63 64 65 66 67 68 69 70 71 72 73 73 74 75 76 77 77 78 79
41 4 12 18 23 27 31 34 37 39 42 44 46 48 50 52 54 55 57 58 60 61 62 63 65 66 67 68 69 70 71 72 73 74 75 75 76 77 78 78 79 80
42 5 12 18 23 27 31 34 37 40 42 45 47 49 51 53 54 56 58 59 60 62 63 64 66 67 68 69 70 71 72 73 74 75 76 76 77 78 79 80 80 81
43 5 12 18 23 27 31 34 37 40 43 45 47 49 51 53 55 57 58 60 61 63 64 65 66 68 69 70 71 72 73 74 75 76 77 78 78 79 80 81 82 82
44 5 12 18 23 28 31 35 38 41 43 46 48 50 52 54 56 57 59 60 62 63 65 66 67 69 70 71 72 73 74 75 76 77 78 79 80 80 81 82 83 84
45 5 13 19 24 28 32 35 38 41 44 46 48 51 53 54 56 58 60 61 63 64 66 67 68 69 71 72 73 74 75 76 77 78 79 80 81 82 82 83 84 85
Highly Effective (Observational)
46 5 13 19 24 28 32 35 39 41 44 47 49 51 53 55 57 59 60 62 63 65 66 68 69 70 71 73 74 75 76 77 78 79 80 81 82 83 83 84 85 86
47 5 13 19 24 28 32 36 39 42 45 47 49 52 54 56 58 59 61 63 64 66 67 69 70 71 72 74 75 76 77 78 79 80 81 82 83 84 85 85 86 87
48 5 13 19 24 29 32 36 39 42 45 47 50 52 54 56 58 60 62 63 65 66 68 69 71 72 73 74 76 77 78 79 80 81 82 83 84 85 86 87 87 88
49 5 13 19 24 29 33 36 40 43 45 48 50 53 55 57 59 61 62 64 66 67 69 70 71 73 74 75 77 78 79 80 81 82 83 84 85 86 87 88 89 89
50 5 13 19 24 29 33 37 40 43 46 48 51 53 55 57 59 61 63 65 66 68 69 71 72 74 75 76 77 79 80 81 82 83 84 85 86 87 88 89 90 90
51 5 13 19 25 29 33 37 40 43 46 49 51 54 56 58 60 62 64 65 67 69 70 72 73 74 76 77 78 79 81 82 83 84 85 86 87 88 89 90 91 92
52 5 13 19 25 29 33 37 41 44 47 49 52 54 56 58 61 62 64 66 68 69 71 72 74 75 77 78 79 80 82 83 84 85 86 87 88 89 90 91 92 93
53 5 13 19 25 30 34 37 41 44 47 50 52 55 57 59 61 63 65 67 68 70 72 73 75 76 77 79 80 81 82 84 85 86 87 88 89 90 91 92 93 94
54 5 13 20 25 30 34 38 41 44 47 50 53 55 57 60 62 64 66 67 69 71 72 74 75 77 78 80 81 82 83 85 86 87 88 89 90 91 92 93 94 95
55 5 13 20 25 30 34 38 41 45 48 50 53 56 58 60 62 64 66 68 70 71 73 75 76 78 79 80 82 83 84 85 87 88 89 90 91 92 93 94 95 96
56 5 13 20 25 30 34 38 42 45 48 51 54 56 58 61 63 65 67 69 70 72 74 75 77 78 80 81 82 84 85 86 87 89 90 91 92 93 94 95 96 97
57 5 13 20 25 30 35 38 42 45 48 51 54 56 59 61 63 65 67 69 71 73 74 76 78 79 81 82 83 85 86 87 88 90 91 92 93 94 95 96 97 98
58 5 13 20 26 30 35 39 42 46 49 52 54 57 59 62 64 66 68 70 72 73 75 77 78 80 81 83 84 85 87 88 89 90 92 93 94 95 96 97 98 99
59 5 13 20 26 31 35 39 43 46 49 52 55 57 60 62 64 66 68 70 72 74 76 77 79 81 82 83 85 86 88 89 90 91 92 94 95 96 97 98 99 100
60 5 13 20 26 31 35 39 43 46 49 52 55 58 60 63 65 67 69 71 73 75 76 78 80 81 83 84 86 87 88 90 91 92 93 95 96 97 98 99 100 101
Cheating
Atlanta Public Schools
Crescendo Charter Schools
Philadelphia Public Schools
Washington DC Public Schools
Houston Independent School District
Michigan Public Schools
Unintended Consequences?
• Many principals and teachers (including good ones)
will seek schools or teaching assignments that they
think will improve their results.
• Principals and teachers may game the system,
inadvertently or intentionally.
• Many teachers will seek opportunities to avoid
grades with standardized tests.
• Ranking metrics can discourage cooperation among
principals and teachers – finding ways to reward
teamwork and cooperation is important.
Case Study #1 - Mean value-added performance in mathematics by
school – fall to spring
[Bar chart by school; y-axis from -8.00 to 6.00]
Case Study #1 - Mean spring and fall test duration in minutes by
school
[Bar chart by school; y-axis from 0.00 to 90.00 minutes; series: Spring term, Fall term]
Case Study #1 - Mean value-added growth by school and test
duration
[Bar chart by school; y-axis from -10.00 to 8.00; series: Students taking 10+ minutes longer spring than fall, All other students]
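The comparison in Case Study #1 splits students by whether their spring test took at least 10 minutes longer than their fall test. A minimal sketch of that split follows. The student records and the function name are hypothetical; the gap between the two groups is built into the sample data and is not a reproduction of the case-study results.

```python
from statistics import mean

# Hypothetical student records: (fall_minutes, spring_minutes, value_added_growth)
students = [
    (40, 55, 3.0), (35, 50, 2.0), (45, 44, -1.0),
    (50, 52, 0.0), (30, 41, 4.0), (60, 58, -2.0),
]


def mean_growth_by_duration_change(records, threshold=10):
    """Mean growth for students whose spring test ran at least `threshold`
    minutes longer than fall, and for all other students."""
    longer = [g for fall, spring, g in records if spring - fall >= threshold]
    others = [g for fall, spring, g in records if spring - fall < threshold]
    return mean(longer), mean(others)


longer_mean, other_mean = mean_growth_by_duration_change(students)
print(longer_mean, other_mean)
```

Running the same split on real data before interpreting a school's value-added result is one way to check whether a difference in testing conditions, rather than instruction, could be driving the number.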
Case Study # 2
Differences in fall-spring test durations
[Pie chart, Mathematics: segments of 15%, 25%, and 60% across the categories Spring < Fall, Spring = Fall, Spring > Fall]
Differences in growth index score
based on fall-spring test durations
[Bar chart, Mathematics: Growth Index (0.0 to 6.0) for Spring < Fall, Spring = Fall, Spring > Fall]
Case Study # 2
How much of summer loss is really summer loss?
Differences in spring-fall test durations
[Pie chart: segments of 42%, 33%, and 25% across the categories Fall < Spring, Fall = Spring, Fall > Spring]
Differences in raw growth by spring-fall test duration
[Bar chart: raw growth (-5.0 to 0.0) for Fall < Spring, Fall = Spring, Fall > Spring]
Case Study # 2
Differences in fall-spring test duration (yellow-black) and
differences in growth index scores (green) by school
[Chart by school; axes: Growth Index (0.0 to 10.0) and Minutes (0 to 200); series: Growth Index, Fall test duration, Spring test duration]
Negotiated goals – Student Learning
Objectives
• Negotiated goals (SLOs) are likely to be
necessary in some subjects.
• It is difficult to set fair and reasonable goals
for improvement absent norms or context.
• It is likely that some goals will be absurdly high
and others way too low.
Ways to evaluate the attainability of a goal
• Prior performance
• Performance of peers within the system
• Performance of a norming group
One approach to evaluating the attainment
of goals.
Students in La Brea Elementary School show
mathematics growth equivalent to only 2/3 of the
average for students in their grade.
Level 4 – (Aspirational) – Students in La Brea Elementary School will
show mathematics growth equivalent to 1.5 times the average
for their grade.
Level 3 – (Proficient) Students in La Brea Elementary School will
show mathematics growth equivalent to the average for their
grade.
Level 2 – (Marginal) Students in La Brea Elementary School will
improve their mathematics growth relative to last year.
Level 1 – (Unacceptable) Students in La Brea Elementary School
do not improve their mathematics growth relative to last year.
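The four levels can be expressed as a small decision rule. The sketch below assumes growth is expressed as a multiple of the average growth for the grade (so last year's result is 2/3) and that "equivalent to" means "at least"; the function name `goal_level` is hypothetical.

```python
def goal_level(growth_ratio, last_year_ratio):
    """Return the level (1-4) for this year's growth, expressed as a
    multiple of the average growth for the grade."""
    if growth_ratio >= 1.5:
        return 4  # Aspirational: 1.5 times the grade average
    if growth_ratio >= 1.0:
        return 3  # Proficient: at least the grade average
    if growth_ratio > last_year_ratio:
        return 2  # Marginal: improved relative to last year
    return 1      # Unacceptable: no improvement on last year


baseline = 2 / 3  # last year's growth, per the example above
for ratio in (0.6, 0.8, 1.0, 1.6):
    print(ratio, goal_level(ratio, baseline))
```

Because the levels are anchored to last year's result and to the grade average, the same rule can be reused for another school by changing only the baseline.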
Is this goal attainable?
62% of students at John Glenn Elementary met or exceeded
proficiency in Reading/Literature last year. Their goal is to improve
their rate to 82% this year. Is the goal attainable?
Oregon schools – change in
Reading/Literature proficiency 2009-10 to
2010-11 among schools that started with
60% proficiency rates
Growth     Schools
> -30%     362
> -20%     351
> -10%     291
> 0%       173
> 10%      73
> 20%      14
> 30%      3
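The Oregon figures give a rough base rate for the John Glenn goal. The sketch below rests on one assumption that the slide does not state: that each count is cumulative, meaning the number of schools whose change exceeded that threshold (the counts fall steadily from 362 to 3, which is consistent with that reading). Under that assumption, the share of schools that gained more than 20 points can be computed directly. The names used are illustrative.

```python
# Counts from the chart, keyed by the change threshold in percentage points.
# ASSUMPTION: each count is cumulative (schools whose change exceeded the threshold).
schools_exceeding = {-30: 362, -20: 351, -10: 291, 0: 173, 10: 73, 20: 14, 30: 3}


def share_exceeding(threshold, counts=schools_exceeding, baseline=-30):
    """Share of schools whose change exceeded `threshold`, relative to the
    largest group in the chart (the `baseline` threshold)."""
    return counts[threshold] / counts[baseline]


goal_gain = 82 - 62  # the John Glenn goal: from 62% to 82% proficient
print(f"{share_exceeding(goal_gain):.1%} of schools gained more than {goal_gain} points")
```

If the cumulative reading is right, a 20-point gain was achieved by only a small fraction of comparable schools, which suggests the goal is far more aspirational than it first appears.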
Is this goal attainable and
rigorous?
45% of the students at La Brea Elementary showed average growth or
better last year. Their goal is to improve that rate to 50% this year. Is
their goal reasonable?
[Chart: Students with average or better annual
growth in Repus school district; y-axis 0% to 100%]
The selection of metrics matters
Students at La Brea Elementary School
will show growth equivalent to 150% of
grade level.
Students at Etsaw Middle School will
show growth equivalent to 150% of grade
level.
Scale score growth relative to NWEA’s
growth norm in mathematics
[Chart: Scale Score Growth (-1.0 to 7.0); x-axis values 2 through 9; series: Growth Index]
Percent of a year’s growth in
mathematics
[Chart: Percent of a Year’s Growth (0% to 200%); x-axis values 2 through 9; series: Mathematics]
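The two metrics above can tell different stories about the same students. The sketch below assumes that typical scale-score growth is smaller in later grades, so the same point surplus becomes a larger percentage. The norm values are invented for illustration and are NOT actual NWEA norms; the function names are hypothetical.

```python
# Hypothetical typical (norm) scale-score growth by grade; NOT actual NWEA norms.
HYPOTHETICAL_NORM_GROWTH = {3: 12.0, 7: 4.0}


def percent_of_years_growth(observed_growth, grade, norms=HYPOTHETICAL_NORM_GROWTH):
    """Observed growth as a percentage of typical growth for the grade."""
    return 100.0 * observed_growth / norms[grade]


def growth_needed(target_percent, grade, norms=HYPOTHETICAL_NORM_GROWTH):
    """Scale-score growth required to reach a target percent of a year's growth."""
    return norms[grade] * target_percent / 100.0


# The same 2-point surplus over the norm looks very different by grade.
print(percent_of_years_growth(14.0, 3))  # 2 points above a 12-point norm
print(percent_of_years_growth(6.0, 7))   # 2 points above a 4-point norm

# A 150% goal demands 6 extra points in grade 3 but only 2 extra in grade 7.
print(growth_needed(150, 3), growth_needed(150, 7))
```

Under these assumed norms, an identical "150% of grade level" goal asks three times as many additional scale-score points of the elementary school as of the middle school, which is why the metric should be chosen before the goal is written.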
Assessing the difficult to measure
• Encourage use of performance assessment and rubrics.
• Encourage outside scoring
– Use of peers in other buildings, professionals in the field,
contest judges
• Make use of resources
– Music educator, art educator, vocational professional
associations
– Available models – AP art portfolio.
– Use your intermediate agency
– Work across buildings
• Make use of classroom observation.
Possible legal issues
• Title VII of the Civil Rights Act of 1964 –
Disparate impact of sanctions on a protected
group.
• State statutes that provide tenure and other
related protections to teachers.
• Challenges to a finding of “incompetence”
stemming from the growth or value-added
data.
Recommendations
• Embrace the formative advantages of growth
measurement as well as the summative.
• Create comprehensive evaluation systems with
multiple measures of teacher effectiveness (Rand,
2010)
• Select measures as carefully as value-added models.
• Use multiple years of student achievement data.
• Understand the issues and the tradeoffs.
Thank you for attending this event

Weitere ähnliche Inhalte

Was ist angesagt?

MET_Gathering_Feedback_Practioner_Brief
MET_Gathering_Feedback_Practioner_BriefMET_Gathering_Feedback_Practioner_Brief
MET_Gathering_Feedback_Practioner_BriefPaul Fleischman
 
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...William Kritsonis
 
Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models llee18
 
Oral Defense presentation
Oral Defense presentationOral Defense presentation
Oral Defense presentationDwayne Squires
 
Wosmek Review Paper Presentation
Wosmek Review Paper PresentationWosmek Review Paper Presentation
Wosmek Review Paper PresentationBrad Wosmek
 
University of Houston DLN presentation
University of Houston DLN presentationUniversity of Houston DLN presentation
University of Houston DLN presentationnasateach
 
CypherWorx OST Effiacy Study Results 2015
CypherWorx OST Effiacy Study Results 2015CypherWorx OST Effiacy Study Results 2015
CypherWorx OST Effiacy Study Results 2015Steve Stookey
 
Post 7. comparative and non comparative evaluation in educational technology
Post 7. comparative and non comparative evaluation in educational technologyPost 7. comparative and non comparative evaluation in educational technology
Post 7. comparative and non comparative evaluation in educational technologymazin
 
Quantitative External Project: Kentucky Professional Development Framework Im...
Quantitative External Project: Kentucky Professional Development Framework Im...Quantitative External Project: Kentucky Professional Development Framework Im...
Quantitative External Project: Kentucky Professional Development Framework Im...LMweas
 
Comparative And Non Comparative Study
Comparative And Non Comparative StudyComparative And Non Comparative Study
Comparative And Non Comparative Studyu065932
 
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...Bart Rienties
 
Comparative and non-comparative evaluation in Educational technology
Comparative and non-comparative evaluation in Educational technologyComparative and non-comparative evaluation in Educational technology
Comparative and non-comparative evaluation in Educational technologyAlaa Sadik
 
Attendance and student performance arp (1)
Attendance and student performance arp (1)Attendance and student performance arp (1)
Attendance and student performance arp (1)Cindy Paynter
 
Using Data for Continuos School Improvement
Using Data for Continuos School ImprovementUsing Data for Continuos School Improvement
Using Data for Continuos School Improvementlindamtz88
 

Was ist angesagt? (19)

MET_Gathering_Feedback_Practioner_Brief
MET_Gathering_Feedback_Practioner_BriefMET_Gathering_Feedback_Practioner_Brief
MET_Gathering_Feedback_Practioner_Brief
 
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...
Dr. Lautrice Nickson, PhD Dissertation Defense, Dr. William Allan Kritsonis, ...
 
Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models Teacher opinions about the use of Value-Added models
Teacher opinions about the use of Value-Added models
 
Oral Defense presentation
Oral Defense presentationOral Defense presentation
Oral Defense presentation
 
Wosmek Review Paper Presentation
Wosmek Review Paper PresentationWosmek Review Paper Presentation
Wosmek Review Paper Presentation
 
SERF pres
SERF presSERF pres
SERF pres
 
University of Houston DLN presentation
University of Houston DLN presentationUniversity of Houston DLN presentation
University of Houston DLN presentation
 
CypherWorx OST Effiacy Study Results 2015
CypherWorx OST Effiacy Study Results 2015CypherWorx OST Effiacy Study Results 2015
CypherWorx OST Effiacy Study Results 2015
 
Post 7. comparative and non comparative evaluation in educational technology
Post 7. comparative and non comparative evaluation in educational technologyPost 7. comparative and non comparative evaluation in educational technology
Post 7. comparative and non comparative evaluation in educational technology
 
Quantitative External Project: Kentucky Professional Development Framework Im...
Quantitative External Project: Kentucky Professional Development Framework Im...Quantitative External Project: Kentucky Professional Development Framework Im...
Quantitative External Project: Kentucky Professional Development Framework Im...
 
Comparative And Non Comparative Study
Comparative And Non Comparative StudyComparative And Non Comparative Study
Comparative And Non Comparative Study
 
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...
Keynote: 7th eSTEeM Annual Conference Critical discussion of Student Evaluati...
 
Comparative and non-comparative evaluation in Educational technology
Comparative and non-comparative evaluation in Educational technologyComparative and non-comparative evaluation in Educational technology
Comparative and non-comparative evaluation in Educational technology
 
Action research proposal ppt
Action research proposal pptAction research proposal ppt
Action research proposal ppt
 
Debunking danielson
Debunking danielsonDebunking danielson
Debunking danielson
 
Attendance and student performance arp (1)
Attendance and student performance arp (1)Attendance and student performance arp (1)
Attendance and student performance arp (1)
 
Feedback as Dialogue
Feedback as DialogueFeedback as Dialogue
Feedback as Dialogue
 
Putting the Results into Practice
Putting the Results into PracticePutting the Results into Practice
Putting the Results into Practice
 
Using Data for Continuos School Improvement
Using Data for Continuos School ImprovementUsing Data for Continuos School Improvement
Using Data for Continuos School Improvement
 

Andere mochten auch

Forests burning at alarming rates in Canada, Russia
Colorado assessment summit_teacher_eval

  • 1. Presenter - John Cronin, Ph.D. Contacting us: NWEA Main Number: 503-624-1951 E-mail: rebecca.moore@nwea.org This PowerPoint presentation and recommended resources are available at our Slideshare site http://www.slideshare.net/JohnCronin4/colorado-assessment-summitteachereval Considerations when using tests for teacher evaluation
  • 2. Key Colorado requirements related to testing • Assessment constitutes 50% of the evaluation. • Statewide summative assessments are used for subjects in which they are available. Districts will be on their own for other subjects. • Use of the Colorado Growth Model with statewide assessment. • A measure of individually attributed or collectively attributed student growth. • Local measures must be credible, valid (aligned), and reliable, and inferences from the measures must be supportable by evidence and logic. • The law requires that the measures support consistent inferences. • A rating of ineffective or partially effective can lead to loss of non-probationary status. • If a value-added model is used, the model must be transparent enough to permit external evaluation.
  • 3. Unique characteristics of the Colorado approach • Student progress counts for 50% of the evaluation. • Teachers are evaluated on both a “catch up” and “keep up” metric (at least on TCAP) • The Colorado Growth Model will be used to evaluate progress (at least on TCAP)
  • 4. A finding of effectiveness or ineffectiveness is more defensible when it is arrived at by: 1. Two or more assessments of different designs. 2. Two or more models of different designs. 3. As many cases as possible. It is not good to choose tests or models for local assessment in hopes that they will mimic the state assessment.
  • 5. If evaluators do not differentiate their ratings, then all differentiation comes from the test.
  • 6. If performance ratings aren’t consistent with school growth, that will probably be public information.
  • 7. Results of Tennessee Teacher Evaluation Pilot [Chart: percentages (0% to 60%) at rating levels 1 to 5, comparing value-added result with observation result]
  • 8. Results of Georgia Teacher Evaluation Pilot [Chart: evaluator rating by category: Ineffective, Minimally Effective, Effective, Highly Effective]
  • 9. Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project's Three-Year Study. Reliability of evaluation weights in predicted stability of student growth gains year to year. Each entry gives the weighting, the reliability coefficient (relative to state test value-added gain), and the proportion of test variance explained:
      • Model 1 (state test 81%, student surveys 17%, classroom observations 2%): .51; 26.0%
      • Model 2 (state test 50%, student surveys 25%, classroom observations 25%): .66; 43.5%
      • Model 3 (state test 33%, student surveys 33%, classroom observations 33%): .76; 57.7%
      • Model 4 (classroom observations 50%, state test 25%, student surveys 25%): .75; 56.2%
  • 10. Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project's Three-Year Study. Reliability of a variety of teacher observation implementations. Each entry gives who observes, the reliability coefficient (relative to state test value-added gain), and the proportion of test variance explained:
      • Principal, one observation: .51; 26.0%
      • Principal, two observations: .58; 33.6%
      • Principal and other administrator: .67; 44.9%
      • Principal and three short observations by peer observers: .67; 44.9%
      • Two principal observations and two peer observations: .66; 43.6%
      • Two principal observations and two different peer observers: .69; 47.6%
      • Two principal observations, one peer observation, and three short observations by peers: .72; 51.8%
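In the two MET slides above, the "proportion of test variance explained" figures closely track the square of the reliability coefficient. The following is a minimal Python sketch of that relationship, assuming the coefficient can be treated like a correlation; the function name `variance_explained` is hypothetical, and the values are the ones listed on the observation slide.

```python
def variance_explained(reliability: float) -> float:
    """Return r squared as a percentage (assumes r acts like a correlation)."""
    return reliability ** 2 * 100


# Reliability coefficient -> percentage reported on the observation slide.
slide_values = {
    0.51: 26.0,
    0.58: 33.6,
    0.67: 44.9,
    0.66: 43.6,
    0.69: 47.6,
    0.72: 51.8,
}

for r, reported in slide_values.items():
    computed = variance_explained(r)
    print(f"r = {r:.2f}: computed {computed:.1f}%, slide reports {reported:.1f}%")
```

The weighting slide shows slightly different percentages for similar coefficients (for example 43.5% at .66), which is presumably rounding of the underlying unrounded coefficients.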
  • 11. How tests are used to evaluate teachers and principals: Testing → Metric (growth or gain score) → Analysis (value-added effect size and/or ranking) → Evaluation (performance rating)
  • 12. Issues in the use of growth measures Instructional alignment Tests used for teacher evaluation must align to the teacher’s instructional responsibilities.
  • 13. Common problems with instructional alignment • Using school-level math and reading results in the evaluation of music, art, and other specials teachers. • Using general tests of a discipline (reading, math, science) as a major component of the evaluation of high school teachers delivering specialized courses.
  • 14. Florida Teachers Sue Over Evaluation System New York Times, April 17, 2013 Seven Florida teachers have brought a federal lawsuit to protest job evaluation policies that tether individual performance ratings to the test scores of students who are not even in their classes. The suit, which was filed Tuesday in conjunction with three local affiliates of the National Education Association in Federal District Court for the Northern District of Florida in Gainesville, says Florida’s two-year-old evaluation system violates teachers’ rights of due process and equal protection. Under a 2011 law, schools and districts must evaluate teachers in part based on how much their students learn, as measured by standardized tests. But since Florida, like most states, administers only math and reading tests and only in selected grades, many teachers do not teach tested subjects. One of the plaintiffs, a first-grade teacher, was rated on the basis of test scores of students in a different school in her district, and another, who teaches vocational classes to aspiring health care workers, was rated based on test scores of students in grades and subjects she had never taught. “This lawsuit highlights the absurdity of the current evaluation system,” said Andy Ford, president of the Florida Education Association.
  • 16. Inconsistency occurs because of: • Differences in test design. • Differences in testing conditions. • Differences in the models being applied to evaluate growth.
  • 17. The reliability problem – Inconsistency in testing conditions [Diagram: test-retest comparison of Test 1 and Test 2 at Time 1 and Time 2]
  • 18. The reliability problem – Inconsistency in testing conditions [Diagram: the Test 1 / Test 2, Time 1 / Time 2 comparison repeated three times]
  • 19. The problem with spring-spring testing [Timeline, 3/11 through 3/12: Teacher 1, Summer, Teacher 2]
  • 20. Characteristics of value-added metrics • Value-added metrics are inherently NORMATIVE. • If below average = partially effective then half of the average staff will be partially effective. • Value-added metrics can’t measure progress of the larger group over time. • Extreme performance is more likely to have alternate explanations.
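The normative point on the slide above can be illustrated with a short Python sketch. It assumes "below average" means below the median of the group; the growth figures and the function name `below_average_share` are invented for illustration and are not data from the presentation.

```python
from statistics import median


def below_average_share(scores: list[float]) -> float:
    """Share of teachers whose score falls below the group median."""
    cutoff = median(scores)
    return sum(score < cutoff for score in scores) / len(scores)


# Hypothetical growth estimates for eight teachers.
growth = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.5]

# Every teacher improves by the same amount the following year.
improved = [g + 10 for g in growth]

print(below_average_share(growth))
print(below_average_share(improved))
```

Both calls return the same share, because a ranking against the group cannot register progress made by the whole group.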
  • 21. New York City • Margins of error can be very large • Increasing n doesn't always decrease the margin of error • The margin of error in math is typically less than reading
  • 22. Los Angeles Unified • Teachers can easily rate in multiple categories • The choice of model can have a large impact • Models affect English more than Math • Teachers do better in some subjects than others • More complex models don't necessarily favor the teacher
  • 23. “The findings indicate that these modeling choices can significantly influence outcomes for individual teachers, particularly those in the tails of the performance distribution who are most likely to be targeted by high-stakes policies.” Ballou, D., Mokher, C. and Cavalluzzo, L. (2012) Using Value-Added Assessment for Personnel Decisions: How Omitted Variables and Model Specification Influence Teachers’ Outcomes. Instability at the tails of the distribution LA Times Teacher #1 LA Times Teacher #2
  • 24. “Significant evidence of bias plagued the value-added model estimated for the Los Angeles Times in 2010, including significant patterns of racial disparities in teacher ratings both by the race of the student served and by the race of the teachers (see Green, Baker and Oluwole, 2012). These model biases raise the possibility that Title VII disparate impact claims might also be filed by teachers dismissed on the basis of their value-added estimates. Additional analyses of the data, including richer models using additional variables mitigated substantial portions of the bias in the LA Times models (Briggs & Domingue, 2010).” Baker, B. (2012, April 28). If it’s not valid, reliability doesn’t matter so much! More on VAM-ing & SGP-ing Teacher Dismissal. Possible racial bias in models
  • 25. Inconsistency among the Colorado Growth Model and other value-added approaches.
  • 26. Issues with the Colorado Growth Model • When applied to MAP it discards the advantages of a cross-grade scale and robust growth norms. • It is a descriptive and not a causal model. • As currently applied it does not control for factors outside the teacher’s influence that may affect student growth.
  • 27. A brief commentary on the Colorado Growth Model Its limitations • It does not support inference. • It does not take advantage of the useful characteristics of a vertical scale. • It uses only prior scores and past testing history to evaluate growth.
  • 28. A brief commentary on the Colorado Growth Model Other limitations • The model can’t be used for cross-state comparisons. • The model is problematic for assessing long-term trends.
  • 29. Measurement Issues Moving from the model to the teacher rating
  • 30. Translating ranked data to ratings - principles • There is no “science” per se around translating a ranking to a rating. If you call a bottom 40% teacher ineffective that is a judgment. • The rating process can be politicized. • The process is easy to over-engineer.
  • 31. New York Rating System • 60 points assigned from classroom observation • 20 points assigned from state assessment • 20 points assigned from local assessment • A score of 64 or less is rated ineffective.
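The point allocation on the New York slide above can be sketched in Python. This is a simplified illustration that assumes the three components are simply summed; the function names are hypothetical, and any additional rules in the real system are not modeled here.

```python
# Maximum points per component, as listed on the slide.
MAX_OBSERVATION = 60
MAX_STATE = 20
MAX_LOCAL = 20
INEFFECTIVE_CUTOFF = 64  # a score of 64 or less is rated ineffective


def composite_score(observation: int, state: int, local: int) -> int:
    """Sum the three components after checking each is within its range."""
    limits = (
        ("observation", observation, MAX_OBSERVATION),
        ("state", state, MAX_STATE),
        ("local", local, MAX_LOCAL),
    )
    for name, points, maximum in limits:
        if not 0 <= points <= maximum:
            raise ValueError(f"{name} points must be between 0 and {maximum}")
    return observation + state + local


def is_ineffective(total: int) -> bool:
    """Apply the cutoff stated on the slide."""
    return total <= INEFFECTIVE_CUTOFF


# A teacher with a perfect observation score but weak test-based scores.
total = composite_score(observation=60, state=2, local=2)
print(total, is_ineffective(total))
```

Under these assumptions, a teacher who earns every observation point still needs at least 5 of the 40 test-based points to clear the ineffective cutoff.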
  • 32. [Lookup chart: rating score by observational score and growth-measure score. Rows are observational scores 0 to 60, banded Ineffective (0 to 15), Developing (16 to 30), Effective (31 to 45), and Highly Effective (46 to 60). Columns are growth-measure scores 0 to 40, banded Ineffective, Developing, Effective, and Highly Effective. Cell values rise from 1 at observational 0 and growth 0 to 101 at observational 60 and growth 40.]
  • 33. Cheating • Atlanta Public Schools • Crescendo Charter Schools • Philadelphia Public Schools • Washington DC Public Schools • Houston Independent School District • Michigan Public Schools
  • 34. Unintended Consequences? • Many principals and teachers (including good ones) will seek schools or teaching assignments that they think will improve their results. • Principals and teachers may game the system, inadvertently or intentionally. • Many teachers will seek opportunities to avoid grades with standardized tests. • Ranking metrics can discourage cooperation among principals and teachers – finding ways to reward teamwork and cooperation is important.
  • 35. Case Study #1 - Mean value-added performance in mathematics by school – fall to spring [Chart: value-added axis from -8.00 to 6.00]
  • 36. Case Study #1 - Mean spring and fall test duration in minutes by school [Chart: duration axis from 0.00 to 90.00 minutes; series: Spring term, Fall term]
  • 37. Case Study #1 - Mean value-added growth by school and test duration [Chart: value-added axis from -10.00 to 8.00; series: Students taking 10+ minutes longer in spring than fall, All other students]
  • 38. Case Study #2 - Differences in fall-spring test durations, mathematics [Pie chart: categories Spring < Fall, Spring = Fall, Spring > Fall; shares shown: 15%, 25%, 60%] Differences in growth index score based on fall-spring test durations, mathematics [Bar chart: growth index axis from 0.0 to 6.0, same three categories]
  • 39. Case Study #2 - Differences in spring-fall test durations [Pie chart: categories Fall < Spring, Fall = Spring, Fall > Spring; shares shown: 42%, 33%, 25%] Differences in raw growth by spring-fall test duration [Bar chart: raw growth axis from -5.0 to 0.0, same three categories] How much of summer loss is really summer loss?
  • 40. Case Study #2 - Differences in fall-spring test duration (yellow-black) and differences in growth index scores (green) by school [Chart: growth index axis from 0.0 to 10.0; minutes axis from 0 to 200; series: School Growth Index, Fall test duration, Spring test duration]
  • 41. Negotiated goals – Student Learning Objectives • Negotiated goals (SLOs) are likely to be necessary in some subjects. • It is difficult to set fair and reasonable goals for improvement absent norms or context. • It is likely that some goals will be absurdly high and others way too low.
  • 42. Ways to evaluate the attainability of a goal • Prior performance • Performance of peers within the system • Performance of a norming group
  • 43. One approach to evaluating the attainment of goals. Baseline: Students in La Brea Elementary School show mathematics growth equivalent to only 2/3 of the average for students in their grade. • Level 4 (Aspirational) – Students in La Brea Elementary School will improve their mathematics growth to the equivalent of 1.5 times the average for their grade. • Level 3 (Proficient) – Students in La Brea Elementary School will improve their mathematics growth to the equivalent of the average for their grade. • Level 2 (Marginal) – Students in La Brea Elementary School will improve their mathematics growth relative to last year. • Level 1 (Unacceptable) – Students in La Brea Elementary School do not improve their mathematics growth relative to last year.
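The four-level rubric on this slide can be sketched as a small classification function. This is only an illustration: the function name `attainment_level`, the use of "at least" for the 1.5x and 1.0x cutoffs, and the sample ratios are assumptions, not part of the presentation.

```python
def attainment_level(growth_ratio: float, last_year_ratio: float) -> int:
    """Map a school's growth to the slide's four levels.

    growth_ratio is the school's growth divided by the average growth for
    the grade (1.0 means average). Treating the cutoffs as "at least" is
    an assumption made for this sketch.
    """
    if growth_ratio >= 1.5:
        return 4  # Aspirational: 1.5 times the grade average
    if growth_ratio >= 1.0:
        return 3  # Proficient: equal to the grade average
    if growth_ratio > last_year_ratio:
        return 2  # Marginal: better than last year, still below average
    return 1      # Unacceptable: no improvement over last year


# Baseline from the slide: growth equal to 2/3 of the grade average.
BASELINE = 2 / 3

# Hypothetical outcomes for this year, chosen only to exercise each level.
for ratio in (0.60, 0.80, 1.00, 1.60):
    print(f"ratio {ratio:.2f} -> level {attainment_level(ratio, BASELINE)}")
```

Anchoring Levels 3 and 4 to a grade-level average and Levels 1 and 2 to the school's own prior year is what lets the rubric draw on two of the reference points the presentation lists: a norming group and prior performance.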
  • 44. Is this goal attainable? 62% of students at John Glenn Elementary met or exceeded proficiency in Reading/Literature last year. Their goal is to improve their rate to 82% this year. Is the goal attainable? [Bar chart: Oregon schools – change in Reading/Literature proficiency 2009-10 to 2010-11 among schools that started with 60% proficiency rates; growth categories, in order: > -30%, > -20%, > -10%, > 0%, > 10%, > 20%, > 30%; bar values, in order: 362, 351, 291, 173, 73, 14, 3; count axis from 0 to 400]
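One way to read the chart on this slide is to ask what share of comparable schools achieved the gain the goal requires. The sketch below assumes that the seven bar values (362, 351, 291, 173, 73, 14, 3) pair in order with the seven growth categories, that each bar is a cumulative count of schools exceeding that change, and that the first bar (> -30%) is close enough to the full group to serve as the denominator. None of these readings is confirmed by the slide text.

```python
# Chart values as they appear on the slide, paired in order (an assumption).
thresholds = [-30, -20, -10, 0, 10, 20, 30]      # change in proficiency rate
schools_above = [362, 351, 291, 173, 73, 14, 3]  # schools exceeding each change

last_year_rate = 62  # percent proficient last year (from the slide)
goal_rate = 82       # percent proficient targeted this year (from the slide)
required_gain = goal_rate - last_year_rate       # 20 points

# Assumption: the "> -30%" bar approximates the whole comparison group.
total_schools = schools_above[0]
schools_meeting = schools_above[thresholds.index(required_gain)]
share = schools_meeting / total_schools

print(f"Required gain: {required_gain} points")
print(f"Schools exceeding that gain: {schools_meeting} of {total_schools}")
print(f"Share: {share:.1%}")
```

Under those assumptions only a small fraction of comparable schools improved by more than 20 points in a year, which is the kind of evidence the slide invites when it asks whether the goal is attainable.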
  • 45. Is this goal attainable and rigorous? 45% of the students at La Brea Elementary showed average growth or better last year. Their goal is to improve that rate to 50% this year. Is their goal reasonable? [Chart: Students with average or better annual growth in Repus school district; axis from 0% to 100%]
  • 46. The selection of metrics matters • Students at La Brea Elementary School will show growth equivalent to 150% of grade level. • Students at Etsaw Middle School will show growth equivalent to 150% of grade level.
  • 47. Scale score growth relative to NWEA’s growth norm in mathematics [Chart: growth index, scale score growth axis from -1.0 to 7.0, by grade 2 through 9]
  • 48. Percent of a year’s growth in mathematics [Chart: percent of a year’s growth, axis from 0% to 200%, by grade 2 through 9]
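Slides 46 to 48 contrast two ways of expressing the same growth: as a difference from the norm (growth index) and as a percentage of a year's growth. A minimal sketch of the arithmetic follows. The norm values are hypothetical, chosen only to show that typical growth is usually smaller in later grades; they are not NWEA norms, and the function names are illustrative.

```python
def growth_index(observed: float, norm: float) -> float:
    """Scale-score growth relative to the norm: observed minus typical growth."""
    return observed - norm


def percent_of_year(observed: float, norm: float) -> float:
    """Observed growth expressed as a percentage of typical annual growth."""
    return observed / norm * 100


# Hypothetical typical annual growth in scale-score points (NOT NWEA norms).
hypothetical_norms = {"elementary grade": 12.0, "middle school grade": 4.0}

# The same goal, "150% of grade-level growth", applied to both schools.
for label, norm in hypothetical_norms.items():
    target = 1.5 * norm
    print(
        f"{label}: target growth {target:.1f} points, "
        f"growth index {growth_index(target, norm):+.1f}, "
        f"percent of a year {percent_of_year(target, norm):.0f}%"
    )
```

With these illustrative numbers the identical 150% goal requires six points above the norm in the elementary grade but only two in the middle school grade, so the two schools named on slide 46 are not being asked for the same thing even though the wording of the goal is the same.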
  • 49. Assessing the difficult to measure • Encourage use of performance assessment and rubrics. • Encourage outside scoring – Use of peers in other buildings, professionals in the field, contest judges • Make use of resources – Music educator, art educator, vocational professional associations – Available models – AP art portfolio. – Use your intermediate agency – Work across buildings • Make use of classroom observation.
  • 50. Possible legal issues • Title VII of the Civil Rights Act of 1964 – Disparate impact of sanctions on a protected group. • State statutes that provide tenure and other related protections to teachers. • Challenges to a finding of “incompetence” stemming from the growth or value-added data.
  • 51. Recommendations • Embrace the formative advantages of growth measurement as well as the summative. • Create comprehensive evaluation systems with multiple measures of teacher effectiveness (Rand, 2010) • Select measures as carefully as value-added models. • Use multiple years of student achievement data. • Understand the issues and the tradeoffs.
  • 52. Presenter - John Cronin, Ph.D. Contacting us: NWEA Main Number: 503-624-1951. E-mail: rebecca.moore@nwea.org. This PowerPoint presentation and recommended resources are available at our Slideshare site: http://www.slideshare.net/JohnCronin4/colorado-assessment-summitteachereval Thank you for attending this event.