How satisfied are you with the last assessment you gave? Would you describe your exam as a highly effective evaluation tool? How much information does it reveal about individual students’ abilities, and about the overall performance of your current class compared with previous classes? Do you trust your assessment to accurately identify which students “get it” and which ones clearly do not grasp the content or meet the standards required to pass your course?
A 3-step item analysis method based on an item’s difficulty level, discrimination values, and response frequencies provides a revealing look at the quality of your assessment by focusing your attention on the effectiveness of each test item and its contribution to the exam blueprint. It saves time and effort by identifying exactly which exam questions need editing, and how much editing is required, before you take any action; you will likely find that replacing an item with a brand-new question is unnecessary. Small, targeted improvements within just a few exam items, guided by a systematic review of the statistical results before you start editing, can drastically enhance item quality and eliminate the need to spend hours rewriting the entire exam. With this item analysis method, your future assessments can provide an accurate measurement of your students’ ability to apply nursing content and solve clinical problems.
4. Five Guidelines to Developing Effective Critical Thinking Exams
❑ Assemble the “basics.”
❑ Write critical thinking test items.
❑ Pay attention to housekeeping duties.
❑ Develop a test blueprint.
❑ Scientifically analyze all exams.
6. Bloom’s Taxonomy: Benjamin Bloom, 1956 (revised)
Terminology changes: “The graphic is a representation of the NEW verbiage associated with the long-familiar Bloom’s Taxonomy. Note the change from nouns to verbs [e.g., Application to Applying] to describe the different levels of the taxonomy. Note that the top two levels are essentially exchanged from the Old to the New version.” (Schultz, 2005) (Evaluation moved from the top to Evaluating, second from the top; Synthesis moved from second from the top to the top as Creating.) Source: http://www.odu.edu/educ/llschult/blooms_taxonomy.htm
11. Standards of Acceptance
❑ Item difficulty: 30% – 90%
❑ Item Discrimination Ratio (IDR): 25% and above
❑ PBCC: 0.20 and above
❑ KR-20: 0.70 and above
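As a quick way to operationalize these thresholds, here is a minimal sketch in Python that flags any item falling outside the per-item standards; the item statistics shown are hypothetical, and KR-20 (a whole-test statistic) is noted separately.

```python
# Minimal sketch: flag items against the per-item standards of acceptance.
# The item statistics below are hypothetical, for illustration only.
ITEMS = [
    {"id": 1, "difficulty": 0.85, "idr": 0.30, "pbcc": 0.35},
    {"id": 2, "difficulty": 0.25, "idr": 0.10, "pbcc": 0.05},
]

def flag_item(item):
    """Return the standards this item fails (empty list = meets all)."""
    problems = []
    if not 0.30 <= item["difficulty"] <= 0.90:
        problems.append("difficulty outside 30%-90%")
    if item["idr"] < 0.25:
        problems.append("IDR below 25%")
    if item["pbcc"] < 0.20:
        problems.append("PBCC below 0.20")
    return problems

for item in ITEMS:
    print(f"Item {item['id']}: {'; '.join(flag_item(item)) or 'meets all standards'}")

# KR-20 (0.70 and above) is evaluated once for the whole test, not per item.
```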
12. Thinking more about mean item difficulty on teacher-made tests…
Mean difficulty level for a teacher-made nursing exam should be 80 – 85%. So, why might low NCLEX-RN® pass rates persist when mean difficulty levels on teacher-made exams remain consistently within this desired range?
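For reference, an item’s difficulty level is the percentage of examinees who answered it correctly, and an exam’s mean difficulty is the average of those percentages. A minimal sketch, assuming scored responses are stored as one answer string per student (the data below are hypothetical):

```python
# Difficulty level = proportion of examinees answering the item correctly.
responses = ["ABCDA", "ABCCA", "ABDDA", "CBCDA"]  # one answer string per student
key = "ABCDA"                                      # answer key (hypothetical)

n_students = len(responses)
difficulties = [
    sum(r[i] == key[i] for r in responses) / n_students
    for i in range(len(key))
]
mean_difficulty = sum(difficulties) / len(difficulties)

print(["%.2f" % p for p in difficulties])          # per-item difficulty levels
print(f"mean difficulty: {mean_difficulty:.0%}")   # target: 80-85% for this exam
```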
13. …and one “absolute” rule about item difficulty
Since the mean difficulty level for a teacher-made nursing exam is 80 – 85%, what should the lowest acceptable value be for each test item on the exam? TEST ITEMS ANSWERED CORRECTLY BY 30% OR LESS of the examinees should always be considered too difficult, and the instructor must take action. Why?
14. …but what about high difficulty levels?
❑ Test items with high difficulty levels (>90%) often yield poor discrimination values.
❑ Is there a situation where faculty can legitimately expect that 100% of the class will answer a test item correctly, and be pleased when this happens?
❑ RULE OF THUMB ABOUT MASTERY ITEMS: Due to their negative impact on test discrimination and reliability, they should comprise no more than 10% of the test.
15. Standards of Acceptance
❑ Item difficulty: 30% – 90%
❑ Item Discrimination Ratio (IDR): 25% and above
❑ PBCC: 0.20 and above
❑ KR-20: 0.70 and above
16. Thinking more about item discrimination on teacher-made tests…
❑ IDR can be calculated quickly, but doesn’t consider the variance of the entire group. Use it to quickly identify items that have zero/negative discrimination values, since these need to be edited before being used again.
❑ PBCC is a more powerful measure of discrimination (see the sketch after this list).
➢ Correlates the correct answer to a single test item with the total test score of the student.
➢ Considers the variance of the entire student group, not just the lower and upper 27% groups.
➢ For a small ‘n,’ consider the cumulative value.
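A minimal sketch of both discrimination statistics, assuming responses have already been scored into a 0/1 matrix. Note that some testing programs correlate each item with the total score excluding that item; this simple version uses the full total.

```python
import statistics  # statistics.correlation requires Python 3.10+

def item_stats(correct_matrix):
    """correct_matrix[s][i] is 1 if student s answered item i correctly, else 0.
    Returns (IDR, PBCC) for each item."""
    totals = [sum(row) for row in correct_matrix]
    order = sorted(range(len(totals)), key=lambda s: totals[s])
    k = max(1, round(0.27 * len(totals)))        # size of the 27% groups
    lower, upper = order[:k], order[-k:]
    results = []
    for i in range(len(correct_matrix[0])):
        item = [row[i] for row in correct_matrix]
        # IDR: proportion correct in the upper group minus the lower group.
        idr = (sum(correct_matrix[s][i] for s in upper) / k
               - sum(correct_matrix[s][i] for s in lower) / k)
        # PBCC: Pearson correlation of the 0/1 item score with the total score.
        pbcc = (statistics.correlation(item, totals)
                if len(set(item)) > 1 and len(set(totals)) > 1 else 0.0)
        results.append((idr, pbcc))
    return results

# Demo with a hypothetical 4-student, 3-item scored matrix.
demo = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 1]]
for i, (idr, pbcc) in enumerate(item_stats(demo)):
    print(f"item {i}: IDR={idr:+.2f}, PBCC={pbcc:+.2f}")
```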
17. …what decisions need to be made about items?
❑ When a test item has poor difficulty and/or discrimination values, action is needed.
❑ All of these actions require that the exam be rescored (see the rescoring sketch after this list):
➢ Credit can be given for more than one choice.
➢ Test item can be nullified.
➢ Test item can be deleted.
❑ Each of these actions has a consequence, so faculty need to consider the consequences carefully when choosing an action. Faculty judgment is crucial when determining actions affecting test scores.
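To make the consequences concrete, here is a minimal rescoring sketch using hypothetical answer data. Deleting an item shortens the test, while nullifying one credits every examinee, so the same responses can yield different scores depending on the action chosen.

```python
responses = ["ABCD", "ADCC", "ACCD"]            # one answer string per student
key = {0: {"A"}, 1: {"B"}, 2: {"C"}, 3: {"D"}}  # item index -> accepted answers

key[1] = {"B", "C"}   # action: give credit for more than one choice on item 1
nullified = {2}       # action: nullify item 2 (everyone receives credit)
deleted = {3}         # action: delete item 3 (no longer counts at all)

def rescore(answers):
    score, length = 0, 0
    for i, choice in enumerate(answers):
        if i in deleted:
            continue              # deleted items shrink the test length
        length += 1
        if i in nullified or choice in key[i]:
            score += 1            # nullified items credit every examinee
    return score, length

for r in responses:
    score, length = rescore(r)
    print(f"{r}: {score}/{length}")
```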
19. Thinking more about adjusting the standard of acceptance for nursing tests…
❑ Remember that the key statistical concept inherent in calculating coefficients is VARIANCE.
❑ When there is less variance in test scores, reliability of the test will decrease, i.e., the KR-20 value will drop.
❑ What contributes to a lack of variance in nursing students’ test scores?
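As context for the variance point: with k items, item difficulty p_i, and total-score variance s^2, KR-20 = (k / (k - 1)) * (1 - sum(p_i * (1 - p_i)) / s^2), so shrinking score variance directly drags the coefficient down. A minimal sketch with hypothetical scored matrices:

```python
def kr20(correct_matrix):
    """KR-20 reliability; correct_matrix[s][i] is 1 if student s got item i right.
    Assumes at least two items and some variance in the total scores."""
    n, k = len(correct_matrix), len(correct_matrix[0])
    totals = [sum(row) for row in correct_matrix]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n   # population variance
    p = [sum(row[i] for row in correct_matrix) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

spread = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 1]]     # varied totals
clustered = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 1]]  # near-equal totals
print(f"KR-20, spread-out scores: {kr20(spread):.2f}")     # higher
print(f"KR-20, clustered scores: {kr20(clustered):.2f}")   # drops sharply
```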
21. …and a word about using Response Frequencies
Sometimes LESS is MORE when it comes to editing a test item. A review of the response frequency data can focus your editing. For items where 100% of students answer correctly, and no other options were chosen, make sure that this is indeed intentional (MASTERY ITEM) and not just reflective of an item that is too easy (>90% DIFFICULTY). Target rewriting the “zero” distracters – those options that are ignored by students. Replacing “zeros” with plausible options will immediately improve item DISCRIMINATION.
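A minimal sketch of reviewing the response frequencies for a single item (hypothetical data), flagging the “zero” distracters that no student chose:

```python
from collections import Counter

# Hypothetical responses from six students to one four-option item.
answers = ["A", "A", "A", "C", "A", "A"]
options = ["A", "B", "C", "D"]
key = "A"

freq = Counter(answers)
for opt in options:
    n = freq[opt]
    note = " (key)" if opt == key else (" <- zero distracter: rewrite" if n == 0 else "")
    print(f"{opt}: {n}{note}")
```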
22. 3-Step Method for Item Analysis
1. Review Difficulty Level
2. Review Discrimination Data
❑ Item Discrimination Ratio (IDR)
❑ Point Biserial Correlation Coefficient (PBCC)
3. Review Effectiveness of Alternatives
❑ Response Frequencies
❑ Non-distracters
Source: Morrison, S., Nibert, A., & Flick, J. (2006). Critical thinking and test item writing (2nd ed.). Houston, TX: Health Education Systems, Inc.
28. Content Validity
Does the test measure what it claims to measure?
29. Use a Blueprint to Assess a Test’s Validity
❑ Test Blueprint
➢ Reflects course objectives
➢ Rational/logical tool
➢ Testing software program
➢ Storage of item analysis data (last & cumulative)
➢ Storage of test item categories
36. Item Writing Tools for Success…
Knowledge
Test Blueprint
Testing Software
37. References
Morrison, S., Nibert, A., & Flick, J. (2006). Critical thinking and test item writing (2nd ed.). Houston, TX: Health Education Systems, Inc.
Morrison, S. (2004). Improving NCLEX-RN pass rates through internal and external curriculum evaluation. In M. Oermann & K. Heinrich (Eds.), Annual review of nursing education (Vol. 3). New York: Springer.
National Council of State Boards of Nursing. (2013). 2013 NCLEX-RN test plan. Chicago, IL: National Council of State Boards of Nursing. https://www.ncsbn.org/3795.htm
Nibert, A. (2010). Benchmarking for student progression throughout a nursing program: Implications for students, faculty, and administrators. In L. Caputi (Ed.), Teaching nursing: The art and science (2nd ed., Vol. 3, pp. 45-64). Chicago: College of DuPage Press.
38. Have Questions? Need More Info?
Thanks for your time & attention today!
866-429-8889