Testing in the classroom: Using
tests to promote learning
Richard P. Phelps
Universidad Finis Terrae, Santiago, Chile
January 7, 2014
Q. What is a standardized test?
A. An assessment with at least one aspect – in
its content or administration – standardized.

Q. What is the key advantage of standardized testing?
A. It is standardized.
Meta-analysis
• A method for summarizing a large research literature with a single, comparable measure.

© 2012, Richard P PHELPS

World Association of Education Research,
17th Congress, Reims, June, 2012

John Hattie’s meta-analyses of meta-analyses
John Hattie’s list
1. Student self-assessment/self-grading
2. Response to intervention
3. Teacher credibility
4. Providing formative assessments
5. Classroom discussion
6. Teacher clarity
7. Feedback
8. Reciprocal teaching
9. Teacher-student relationships fostered
10. Spaced vs. mass practice
11. Acceleration
12. Classroom behavioral techniques
13. Vocabulary programs
14. Repeated reading programs
15. Creativity programs
16. Student prior achievement
17. Self-questioning by students
18. Study skills
19. Problem-solving teaching
20. Not labeling students
21. Concept mapping
22. Cooperative vs individualistic learning
23. Direct instruction
24. Tactile stimulation programs
25. Mastery learning
26. Worked examples
27. Visual-perception programs
28. Peer tutoring
29. Cooperative vs competitive learning
30. Phonics instruction
31. Student-centered teaching
32. Classroom cohesion
33. Pre-term birth weight
34. Peer influences
35. Classroom management techniques
36. Outdoor-adventure programs
37. Home environment
38. Socio-economic status
The effect of testing on student achievement:
1910-2010

Richard P. PHELPS
The effect of testing on student achievement
• 12-year-long study
• analyzed close to 700 separate studies, and more than 1,600 separate effects
• 2,000 other studies were reviewed and found incomplete or inappropriate
• lacking sufficient time and money, hundreds of other studies will not be reviewed
Studies included in the meta-analyses

2. …when:
• a test is newly introduced, or newly removed
• quantity of testing is increased or reduced
• test stakes are introduced or increased, or removed or reduced
Number of studies of effects, by methodology type

Methodology type                                  Number of studies   Number of effects
Quantitative                                      177                 640
Surveys and public opinion polls (US & Canada)    247                 813
Qualitative                                       245                 245
TOTAL                                             669                 1698

Effect size: Interpretation

• d between 0.25 and 0.50 → weak effect
• d between 0.50 and 0.75 → medium effect
• d more than 0.75 → strong effect
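As a rough illustration of how these bands might be applied, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation, one common definition; the slides do not say which formula the study used) and labels it with the thresholds above. The function names, the handling of values that fall exactly on a boundary, and the sample scores are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def interpret(d):
    """Label the magnitude of d with the bands from the slide.

    Assumption: a value exactly on a boundary goes to the higher band,
    except 0.75 itself, because the slide says "more than 0.75".
    """
    size = abs(d)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "medium"
    if size >= 0.25:
        return "weak"
    return "below the weakest band"


# Hypothetical scores, invented for this example (not data from the study).
treatment_scores = [2, 4, 6]
control_scores = [1, 3, 5]
d = cohens_d(treatment_scores, control_scores)
label = interpret(d)
```

With these invented scores the two groups differ by one point and share a standard deviation of two, so d is 0.5 and the label is "medium".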
Which predictors matter?

Treatment group…                                             Mean effect size
…is made aware of performance, and control group is not      +0.98
…receives targeted instruction (e.g., remediation)           +0.96
…is tested with higher stakes than control group             +0.87
…is tested more frequently than control group                +0.85

Why tests?

● Students tend to study more, and learn more, when:
• they know they will be tested, but not precisely what will be tested
» (e.g.) An experiment comparing the gains of students given “take-home tests” with those given “in-class tests” found that the latter learned substantially more.
• there is reinforcement of material already studied
● Mastery learning experiments of the 1960s-1980s:
» Students learn more when asked to recall what they have learned.
» Up to a point, the more students are made to actively process information, and describe it to others, the better they learn.
Surveys and opinion polls:
Regular standardized tests, performance tests

Respondent opinion                        Regular tests (N ≈ 125)   Performance tests (N ≈ 50)
                                          d                         d
Achievement is increased                  1.2                       1.0
…weighted by size of study population     1.9                       0.5
Instruction is improved                   1.0                       1.4
…weighted by size of study population     0.9                       0.9
Tests help align instruction              1.0                       1.0
…weighted by size of study population     0.5                       0.9

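The rows labeled "weighted by size of study population" can differ sharply from the unweighted rows because a few large polls can dominate the average. A minimal sketch of the two kinds of mean follows; the `mean_effect` helper and the two polls are hypothetical, chosen only to show the arithmetic, and the slides do not describe the study's exact weighting scheme.

```python
def mean_effect(effects, weights=None):
    """Average effect sizes, optionally weighting each by its study population."""
    if weights is None:
        weights = [1] * len(effects)  # unweighted: every study counts equally
    if len(effects) != len(weights):
        raise ValueError("effects and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(d * w for d, w in zip(effects, weights)) / total


# Hypothetical polls (not from the study): (effect size d, number of respondents).
polls = [(1.5, 200), (0.5, 1800)]
effects = [d for d, _ in polls]
sizes = [n for _, n in polls]

unweighted = mean_effect(effects)        # each poll counts once
weighted = mean_effect(effects, sizes)   # the large poll pulls the mean toward 0.5
```

Here the unweighted mean is 1.0, while the weighted mean is 0.6, because the poll with 1,800 respondents carries nine times the weight of the poll with 200.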
Qualitative studies:
Effect on student achievement
244 studies conducted in the past century in over 30 countries

Direction of effect   Number of studies   Percent of studies   Percent without the inferred
Positive              204                 84                   93
Positive inferred     24                  10
Mixed                 5                   2                    2
No change             8                   3                    4
Negative              3                   1                    1
TOTAL                 244                 100                  100
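The two percentage columns can be reproduced from the study counts on the slide: the first divides by all 244 studies, and the second drops the 24 "positive inferred" studies and divides by the remaining 220. The short check below is only a sketch; the `percent_table` helper is a hypothetical name, and rounding to whole percents is an assumption that happens to match the slide.

```python
# Study counts as given on the slide.
counts = {
    "Positive": 204,
    "Positive inferred": 24,
    "Mixed": 5,
    "No change": 8,
    "Negative": 3,
}


def percent_table(counts, exclude=()):
    """Share of studies in each category, as whole percents."""
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    return {name: round(100 * n / total) for name, n in kept.items()}


all_studies = percent_table(counts)
without_inferred = percent_table(counts, exclude=("Positive inferred",))
```

Both results match the table: 84, 10, 2, 3, 1 across all studies, and 93, 2, 4, 1 once the inferred results are set aside.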
“Repeated retrieval during learning is the key to long-term retention.”
10 benefits of testing and their applications to education
Roediger, Putnam and Smith
Direct effects of testing
• Retrieval practice during tests enhances retention of the retrieved information (relative to not testing or even to studying): the “testing effect”
• Repeated retrieval produces knowledge that can be retrieved flexibly and transferred to other situations
• On open-ended assessments (e.g., essay tests), retrieval practice induced by tests helps students organize information into a coherent knowledge base.
• Repeated retrieval leads to easier retrieval of related information
SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
10 benefits of testing and their applications to education
Roediger, Putnam and Smith
Indirect effects of testing
• Students tested frequently study more and with more regularity.
• Tests permit students to discover gaps in their knowledge and adjust their study efforts to focus on difficult material.
• Students who study after taking a test learn more than if they had not taken a test.
• Students who self-test or are tested more frequently in class learn more.
SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
10 benefits of testing and their applications to education
Roediger, Putnam and Smith
Benefit 1: The Testing Effect: Retrieval Aids Later Retention
Benefit 2: Testing Identifies Gaps in Knowledge
Benefit 3: Testing Causes Students to Learn More from the Next Study Episode
Benefit 4: Testing Produces Better Organization of Knowledge
Benefit 5: Testing Improves Transfer of Knowledge to New Contexts
Benefit 6: Testing can Facilitate Retrieval of Material That was not Tested
Benefit 7: Testing Improves Metacognitive Monitoring
Benefit 8: Testing Prevents Interference from Prior Material when Learning New Material
Benefit 9: Testing Provides Feedback to Instructors
Benefit 10: Frequent Testing Encourages Students to Study
SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
10 benefits of testing and their applications to education
Roediger, Putnam and Smith
Most teachers should be testing much more frequently, with smaller, shorter, less consequential tests.
Students learn more when they are tested, but they learn best when the tests are “spaced”.
What is the optimal lapse of time between tests?
The best time to test again is just before students start forgetting the
information. This time lapse is shorter with discrete material, like
mathematics, than with other subjects. Some studies suggest that math
students should be tested at least once a week.
The more high-stakes decision points, the better the student performance?
Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country
[Figure. Vertical axis: Average Percent Correct (grades 7 & 8). Horizontal axis: Number of Quality Control Measures Used. Series: Top-Performing Countries, Bottom-Performing Countries.]

SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
Quality control has proportionally greater effect in poorer countries

Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country
[Figure. Vertical axis: Average Percent Correct (grades 7 & 8) (per GDP/capita). Horizontal axis: Number of Quality Control Measures Used (per GDP/capita).]

SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
What testing skills do teachers need…
…for interpreting information from large-scale tests?
Basic understanding of statistics:
- distributions, mean, median, skewness, kurtosis
- sampling error, measurement error
- type 1 / type 2 error, statistical power
- sampling (size, representativeness)
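To make a few of these terms concrete, the sketch below summarizes a small set of scores with the mean, median, skewness, excess kurtosis, and the standard error of the mean. The `describe` function, the moment-based formulas chosen for skewness and kurtosis, and the scores are illustrative assumptions, not part of the original slides.

```python
from math import sqrt
from statistics import mean, median, pstdev, stdev


def describe(scores):
    """Summarize a score distribution with a few basic statistics."""
    n = len(scores)
    m = mean(scores)
    sd = pstdev(scores)  # population standard deviation, used for the moments
    if sd == 0:
        raise ValueError("all scores are identical; shape statistics are undefined")
    z = [(x - m) / sd for x in scores]  # standardized scores
    return {
        "mean": m,
        "median": median(scores),
        "skewness": sum(v ** 3 for v in z) / n,             # > 0: long right tail
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3,  # 0 for a normal curve
        "standard_error": stdev(scores) / sqrt(n),          # sampling error of the mean
    }


# Hypothetical test scores, invented for this example.
scores = [1, 2, 3, 4, 10]
stats = describe(scores)
```

In this invented class one high score pulls the mean (4) above the median (3), and the positive skewness reflects the same long right tail.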

Protocols to help them explain tests to others:
- to students
- to parents
- to the media
What testing skills do teachers need…
…for developing and administering classroom tests?
Practice (with each other) in writing items / prompts / rubrics:
- unambiguous, relevant, unbiased

Understand that useful assessment can be very simple:
- e.g., save the last few minutes of each class to assess by asking students to record 2-3 concepts they learned that day
Learn the optimal frequency and spacing of tests for your subject field and grade level.
It is easy to know what you are teaching.

But, you can only know what students are learning if you assess.

Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 

Kürzlich hochgeladen (20)

A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 

Classroom testing: Using tests to promote learning

  • 1. Testing in the classroom: Using tests to promote learning Richard P. Phelps Universidad Finis Terrae, Santiago, Chile January 7, 2014
  • 2. Q. What is a standardized test? A. An assessment with at least one aspect – in its content or administration – standardized. Q. What is the key advantage of standardized testing? A. It is standardized.
  • 3. Meta-analysis • A method for summarizing a large research literature with a single, comparable measure. © 2012, Richard P PHELPS. World Association of Education Research, 17th Congress, Reims, June 2012
  • 4. John Hattie’s meta-analyses of meta-analyses
  • 5. John Hattie’s list 1. 11. Student self-assessment/self-grading; Response to intervention; Teacher credibility; Providing formative assessments; Classroom discussion; Teacher clarity; Feedback; Reciprocal teaching; Teacher-student relationships fostered; Spaced vs. mass practice 21. Concept mapping; Cooperative vs individualistic learning; Direct instruction; Tactile stimulation programs; Mastery learning; Worked examples; Visual-perception programs; Peer tutoring; Cooperative vs competitive learning; Phonics instruction; Acceleration; Classroom behavioral techniques; Vocabulary programs; Repeated reading programs; Creativity programs; Student prior achievement; Self-questioning by students; Study skills; Problem-solving teaching; Not labeling students 31. Student-centered teaching; Classroom cohesion; Pre-term birth weight; Peer influences; Classroom management techniques; Outdoor-adventure programs; Home environment; Socio-economic status
  • 6. The effect of testing on student achievement: 1910-2010 Richard P. PHELPS © 2012, Richard P PHELPS
  • 7. The effect of testing on student achievement • 12-year long study • analyzed close to 700 separate studies, and more than 1,600 separate effects • 2,000 other studies were reviewed and found incomplete or inappropriate • lacking sufficient time and money, hundreds of other studies will not be reviewed
  • 8. Studies included in the meta-analyses 2. …when: • a test is newly introduced, or newly removed • quantity of testing is increased or reduced • test stakes are introduced or increased, or removed or reduced
  • 9. Number of studies of effects, by methodology type
      Methodology type | Number of studies | Number of effects
      Quantitative | 177 | 640
      Surveys and public opinion polls (US & Canada) | 247 | 813
      Qualitative | 245 | 245
      TOTAL | 669 | 1698
  • 10. Effect size: Interpretation • d between 0.25 and 0.50: weak effect • d between 0.50 and 0.75: medium effect • d more than 0.75: strong effect
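The interpretation bands on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the study's procedure: the slide does not say which estimator of d was used, so the pooled-standard-deviation form of Cohen's d below is an assumption, as are the function names, the choice to place boundary values (0.50) in the higher band, and the use of the magnitude of d.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference with a pooled standard deviation.

    Assumption: the slides do not specify the estimator; this is one
    common textbook form.
    """
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def interpret(d):
    """Label an effect size with the slide's bands.

    Assumptions: the magnitude of d is used, a value of exactly 0.50
    goes to the higher band, and values under 0.25 get a placeholder
    label because the slide does not name that range.
    """
    size = abs(d)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "medium"
    if size >= 0.25:
        return "weak"
    return "below the slide's weakest band"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    d = cohens_d([2, 4, 6], [1, 3, 5])
    print(round(d, 2), interpret(d))
```

Under these assumptions, a mean effect such as +0.98 would fall in the "strong" band.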
  • 11. Which predictors matter?
      Treatment group… | Mean effect size
      …is made aware of performance, and control group is not | +0.98
      …receives targeted instruction (e.g., remediation) | +0.96
      …is tested with higher stakes than control group | +0.87
      …is tested more frequently than control group | +0.85
  • 12. Why tests?
      ● Students tend to study more, and learn more, when:
        • they know they will be tested, but not precisely what will be tested (e.g., in an experiment comparing gains of students given “take-home tests” with those given “in-class tests”, the latter learned substantially more)
        • there is reinforcement of material already studied
      ● Mastery learning experiments of the 1960s to 1980s:
        » Students learn more when asked to recall what they have learned.
        » Up to a point, the more students are made to actively process information, and describe it to others, the better they learn.
  • 13. Surveys and opinion polls: Regular standardized tests, performance tests
      Respondent opinion | Regular tests (N ≈ 125), d | Performance tests (N ≈ 50), d
      Achievement is increased | 1.2 | 1.0
      …weighted by size of study population | 1.9 | 0.5
      Instruction is improved | 1.0 | 1.4
      …weighted by size of study population | 0.9 | 0.9
      Tests help align instruction | 1.0 | 1.0
      …weighted by size of study population | 0.5 | 0.9
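The difference between the plain and the "weighted by size of study population" rows can be illustrated with a short sketch. The slide does not give the weighting formula, so the weighted arithmetic mean below is an assumption about what was done, and the effect sizes and respondent counts are hypothetical numbers chosen only to show how weighting can move the average.

```python
def mean_effect(effects, weights=None):
    """Average a list of effect sizes, optionally weighting each one.

    Assumption: "weighted by size of study population" is read here as
    a weighted arithmetic mean with one weight per study.
    """
    if weights is None:
        return sum(effects) / len(effects)
    return sum(d * w for d, w in zip(effects, weights)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical surveys: effect size and number of respondents.
    effects = [1.5, 0.9, 0.6]
    respondents = [200, 1000, 3000]
    print("unweighted:", round(mean_effect(effects), 2))
    print("weighted:  ", round(mean_effect(effects, respondents), 2))
```

When the largest surveys report smaller effects, the weighted mean drops below the unweighted one, and the reverse holds when large surveys report larger effects.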
  • 14. Qualitative studies: Effect on student achievement. 244 studies conducted in the past century in over 30 countries
      Direction of effect | Number of studies | Percent of studies | Percent without the inferred
      Positive | 204 | 84 | 93
      Positive inferred | 24 | 10 |
      Mixed | 5 | 2 | 2
      No change | 8 | 3 | 4
      Negative | 3 | 1 | 1
      TOTAL | 244 | 100 | 100
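The two percentage columns on this slide follow from the study counts. The sketch below recomputes them, using the counts from the slide; the function name and the rounding to whole percents are assumptions made for illustration.

```python
# Study counts by direction of effect, as given on the slide.
counts = {
    "Positive": 204,
    "Positive inferred": 24,
    "Mixed": 5,
    "No change": 8,
    "Negative": 3,
}


def percent_shares(counts, exclude=()):
    """Return each category's share of the total, in whole percents.

    Categories named in `exclude` are dropped before the total is
    computed, which is how "percent without the inferred" is read here.
    """
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    return {name: round(100 * n / total) for name, n in kept.items()}


if __name__ == "__main__":
    print(percent_shares(counts))
    print(percent_shares(counts, exclude=("Positive inferred",)))
```

Dropping the 24 inferred-positive studies shrinks the denominator from 244 to 220, which is why the positive share rises from 84 to 93 percent.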
  • 15. “Repeated retrieval during learning is the key to long-term retention.”
  • 16. 10 benefits of testing and their applications to education Roediger, Putnam and Smith Direct effects of testing Retrieval practice during tests enhances retention of the retrieved information (relative to not testing or even to studying) -- the “testing effect” Repeated retrieval produces knowledge that can be retrieved flexibly and transferred to other situations On open-ended assessments (e.g., essay tests) retrieval practice induced by tests helps students organize information into a coherent knowledge base. Repeated retrieval leads to easier retrieval of related information SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
  • 17. 10 benefits of testing and their applications to education Roediger, Putnam and Smith Indirect effects of testing Students tested frequently study more and with more regularity. Tests permit students to discover gaps in their knowledge and adjust their study efforts to focus on difficult material. Students who study after taking a test learn more than if they had not taken a test. Students who self-test or are tested more frequently in class learn more. SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
  • 18. 10 benefits of testing and their applications to education Roediger, Putnam and Smith Benefit 1: The Testing Effect: Retrieval Aids Later Retention Benefit 2: Testing Identifies Gaps in Knowledge Benefit 3: Testing Causes Students to Learn More from the Next Study Episode Benefit 4: Testing Produces Better Organization of Knowledge Benefit 5: Testing Improves Transfer of Knowledge to New Contexts Benefit 6: Testing can Facilitate Retrieval of Material That was not Tested Benefit 7: Testing Improves Metacognitive Monitoring Benefit 8: Testing Prevents Interference from Prior Material when Learning New Material Benefit 9: Testing Provides Feedback to Instructors Benefit 10: Frequent Testing Encourages Students to Study SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
  • 19. 10 benefits of testing and their applications to education Roediger, Putnam and Smith Benefit 1: The Testing Effect: Retrieval Aids Later Retention Benefit 2: Testing Identifies Gaps in Knowledge Benefit 3: Testing Causes Students to Learn More from the Next Study Episode Benefit 4: Testing Produces Better Organization of Knowledge Benefit 5: Testing Improves Transfer of Knowledge to New Contexts Benefit 6: Testing can Facilitate Retrieval of Material That was not Tested Benefit 7: Testing Improves Metacognitive Monitoring Benefit 8: Testing Prevents Interference from Prior Material when Learning New Material Benefit 9: Testing Provides Feedback to Instructors Benefit 10: Frequent Testing Encourages Students to Study SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.
  • 20. 10 benefits of testing and their applications to education Roediger, Putnam and Smith Most teachers should be testing much more frequently, with smaller, shorter, less consequential tests. Students learn more when they are tested, but they learn best when the tests are “spaced”. What is the optimal lapse of time between tests? The best time to test again is just before students start forgetting the information. This time lapse is shorter with discrete material, like mathematics, than with other subjects. Some studies suggest that math students should be tested at least once a week.
  • 21. The more high-stakes decision points, the better the student performance? [Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country. Vertical axis: Average Percent Correct (grades 7 & 8). Horizontal axis: Number of Quality Control Measures Used. Series: Top-Performing Countries, Bottom-Performing Countries.] SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
  • 22. Quality control has proportionally greater effect in poorer countries [Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country. Vertical axis: Average Percent Correct (grades 7 & 8) (per GDP/capita). Horizontal axis: Number of Quality Control Measures Used (per GDP/capita).] SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
  • 23. What testing skills do teachers need… …for interpreting information from large-scale tests? Basic understanding of statistics: - distributions, mean, median, skewness, kurtosis - sampling error, measurement error - type 1 / type 2 error, statistical power - sampling (size, representativeness) Protocols to help them explain tests to others: - to students - to parents - to the media
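As one concrete instance of the statistics listed on this slide, the sketch below summarizes a set of scores and estimates sampling error as the standard error of the mean, s / sqrt(n). The scores and the function name are hypothetical, and the formula assumes a simple random sample, which large-scale tests with complex sampling designs may not satisfy.

```python
from math import sqrt
from statistics import mean, median, stdev


def standard_error(scores):
    """Standard error of the mean: sample standard deviation / sqrt(n).

    Assumption: scores come from a simple random sample.
    """
    return stdev(scores) / sqrt(len(scores))


if __name__ == "__main__":
    # Hypothetical class scores, for illustration only.
    scores = [52, 61, 64, 70, 73, 75, 78, 90]
    print("mean:  ", round(mean(scores), 1))
    print("median:", median(scores))
    print("SE:    ", round(standard_error(scores), 2))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, which is one reason the size and representativeness of a sample matter when reading reported averages.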
  • 24. What testing skills do teachers need… …for developing and administering classroom tests? Practice (with each other) in writing items / prompts / rubrics: - unambiguous, relevant, unbiased Understand that useful assessment can be very simple: - e.g., save the last few minutes of each class to assess by asking students to record 2-3 concepts they learned that day Learn the optimal frequency and spacing of tests for your subject field and grade level.
  • 25. It is easy to know what you are teaching. But, you can only know what students are learning if you assess.

Editor's notes

  1. Non-standardized measures, such as teacher grades, are volatile, unreliable, and subjective.
  2. John Hattie has summarized thousands of studies of educational interventions. He is a New Zealander now living in Australia.
  3. In yellow are educational interventions related to testing. The second, “Response to Intervention”, is a special education program.
  4. Scatterplot of country-average 8th grade mathematics TIMSS score and number of high-stakes decision points used.
  5. Assessment is one area where teachers can help each other. Fellow teachers can serve as the item and bias review committees for each teacher’s tests.