Volume XI, Number 2, Summer 2005
Issue Topic: Evaluation Methodology
Theory & Practice
Evaluation Theory or What Are Evaluation Methods for?
Mel Mark, professor of psychology at the Pennsylvania State University and president-elect of the American Evaluation
Association, discusses why theory is important to evaluation practice.
Hey, this issue of The Evaluation Exchange focuses on methods, including recent methodological developments. What’s
this piece on evaluation theory doing here? Was there some kind of mix-up?
No, it’s not a mistake. Although evaluation theory1 serves several purposes, perhaps it functions most importantly as a
guide to practice. Learning the latest methodological advance—whether it’s some new statistical adjustment for selection
bias or the most recent technique to facilitate stakeholder dialogue—without knowing the relevant theory is a bit like
learning what to do without knowing why or when.
What you risk is the equivalent of becoming really skilled at tuning your car’s engine without thinking about whether your
transportation needs involve going across town, overseas, or to the top of a skyscraper. Will Shadish, Tom Cook, and
Laura Leviton make the same point using a military metaphor: “Evaluation theories are like military strategies and tactics;
methods are like military weapons and logistics,” they say. “The good commander needs to know strategy and tactics to
deploy weapons properly or to organize logistics in different situations. The good evaluator needs theories for the same
reasons in choosing and employing methods.”2
The reasons to learn about evaluation theory go beyond the strategy/tactic or why/how distinction, however. Evaluation
theory does more than help us make good judgments about what kind of methods to use, under what circumstances, and
toward what forms of evaluation influence.
First, evaluation theories are a way of consolidating lessons learned, that is, of synthesizing prior experience. Carol
Weiss’ work can help evaluators develop a more sophisticated and nuanced understanding of the way organizations
make decisions and may be influenced by evaluation findings.3 Theories enable us to learn from the experience of others
(as the saying goes, we don’t live long enough to learn everything from our own mistakes). George Madaus, Michael
Scriven, and Daniel Stufflebeam had this function of evaluation theory in mind when they said that evaluators who are
unknowledgeable about theory are “doomed to repeat past mistakes and, equally debilitating, will fail to sustain and build
on past successes.”4
Second, comparing evaluation theories is a useful way of identifying and better understanding the key areas of debate
within the field. Comparative study of evaluation theory likewise helps crystallize what the unsettled issues are for
practice. When we read the classic exchange between Michael Patton and Carol Weiss,5 for example, we learn about
very different perspectives on what evaluation use can or should look like.
A third reason for studying evaluation theory is that theory should be an important part of our identities as evaluators, both
individually and collectively. If we think of ourselves in terms of our methodological skills, what is it that differentiates us
from many other people with equal (or even superior) methodological expertise? Evaluation theory. Evaluation theory, as
Will Shadish said in his presidential address to the American Evaluation Association, is “who we are.”6 But people come
to evaluation through quite varied pathways, many of which don’t involve explicit training in evaluation. That there are
myriad pathways into evaluation is, of course, a source of great strength for the field, bringing diversity of skills, opinions,
knowledge sets, and so on.
Despite the positive consequences of the various ways that people enter the field, this diversity also reinforces the
importance of studying evaluation theories. Methods are important, but, again, they need to be chosen in the service of
some larger end. Theory helps us figure out where an evaluation should be going and why—and, not trivially, what it is to
be an evaluator.
Of course, knowing something about evaluation theory doesn’t mean that choices about methods can be made
automatically. Indeed, lauded evaluation theorist Ernie House notes that while theorists typically have a high profile,
“practitioners lament that [theorists’] ideas are far too theoretical, too impractical. Practitioners have to do the project work
tomorrow, not jawbone fruitlessly forever.”7 Especially for newer theoretical work, the translation into practice may not be
clear—and sometimes not even feasible. Even evaluation theory that has withstood the test of time doesn’t automatically
translate into some cookbook or paint-by-number approach to evaluation practice.
More knowledge about evaluation theory can, especially at first, actually make methods choices harder. Why? Because
many evaluation theories take quite different stances about what kind of uses evaluation should focus on, and about how
evaluation should be done to achieve those uses. For example, to think about Donald Campbell8 as an evaluation theorist
is to highlight (a) the possibility of major choice points in the road, such as decisions about whether or not to implement
some new program; (b) the way decisions about such things often depend largely on the program’s potential effects (e.g.,
does universal pre-K lead to better school readiness and other desirable outcomes?); and (c) the benefits of either
randomized experiments or the best-available quasi-experimental data for assessing program effects.
In contrast, when we think about Joseph Wholey9 as an evaluation theorist, we focus on a very different way that
evaluation can contribute: through developing performance-measurement systems that program administrators can use to
improve their ongoing decision making. These performance measurement systems can help managers identify problem
areas and also provide them with good-enough feedback about the apparent consequences of decisions.
Choosing among these and the many other perspectives available in evaluation theories may seem daunting, especially
at first. But it’s better to learn to face the choices than to have them made implicitly by some accident of one’s
methodological training. In addition, theories themselves can help in the choosing. Some evaluation theories have huge
“default options.” These theories may not exactly say “one size fits all,” but they certainly suggest that one size fits darn
near all. Indeed, one of the dangers for those starting to learn about evaluation theory is becoming a true believer in one
of the first theories they encounter. When this happens, the new disciple may act like his or her preferred theory fits all
circumstances. Perhaps the most effective antidote to this problem is to be sure to learn about several evaluation theories
that take fairly different stances. Metaphorically, we probably need to be multilingual: No single evaluation theory should
be “spoken” in all the varying contexts we will encounter.
However, most, if not all, evaluation theories are contingent; that is, they prescribe (or at least are open to) quite different
approaches under different circumstances. As it turns out, there even exist theories that suggest very different bases for
contingent decision making. Put differently, there are theories that differ significantly on reasons for deciding to use one
evaluation design and not another.
These theories lead us to think about different “drivers” of contingent decision making. For example, Michael Patton’s well-
known Utilization-Focused Evaluation tells us to be contingent based on intended use by intended users. Almost any
method may be appropriate, if it is likely to help intended users make the intended use.10 Alternatively, in a recent book,
Huey-Tsyh Chen joins others who suggest that the choices made in evaluation should be driven by program
stage.11 Evaluation purposes and methods for a new program, according to Chen, would typically be different from those
for a mature program. Gary Henry, George Julnes, and I12 have suggested that choices among alternative evaluation
purposes and methods should be driven by a kind of analytic assessment of each one’s likely contribution to social
betterment.13
It can help to be familiar with any one of these fundamentally contingent evaluation theories. And, as is true of evaluation
theories in general, one or another may fit better, depending on the specific context. Nevertheless, the ideal would
probably be to be multilingual even in terms of these contingent evaluation theories. For instance, sometimes intended
use may be the perfect driver of contingent decision making, but in other cases decision making may be so distributed
across multiple parties that it isn’t feasible to identify specific intended users: Even the contingencies are contingent.
Evaluation theories are an aid to thoughtful judgment—not a dispensation from it. But as an aid to thoughtful choices
about methods, evaluation theories are indispensable.
1 Although meaningful distinctions could perhaps be made, here I am treating evaluation theory as equivalent
to evaluation model and to the way the term evaluation approach is sometimes used.
2 Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury
Park, CA: Sage.
3 For a recent overview, see Weiss, C. H. (2004). Rooting for evaluation: A Cliff Notes version of my work. In M. C. Alkin
(Ed.), Evaluation roots: Tracing theorists’ views and influences (pp. 12–65). Thousand Oaks, CA: Sage.
4 Madaus, G. F., Scriven, M., & Stufflebeam, D. L. (1983). Evaluation models. Boston: Kluwer-Nijhoff.
5 See the papers by Weiss and Patton in Vol. 9, No. 1 (1988) of Evaluation Practice, reprinted in M. Alkin (Ed.).
(1990). Debates on evaluation. Newbury Park, CA: Sage.
6 Shadish, W. (1998). Presidential address: Evaluation theory is who we are. American Journal of Evaluation, 19(1), 1–
19.
7 House, E. R. (2003). Evaluation theory. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of
educational evaluation (pp. 9–14). Boston: Kluwer Academic.
8 For an overview, see the chapter on Campbell in the book cited in footnote 2.
9 See, e.g., Wholey, J. S. (2003). Improving performance and accountability: Responding to emerging management
challenges. In S. I. Donaldson & M. Scriven (Eds.), Evaluating social programs and problems: Visions for the new
millennium. Mahwah, NJ: Lawrence Erlbaum.
10 Patton, M. Q. (1997). Utilization-focused evaluation: The new century text. Thousand Oaks, CA: Sage.
11 Chen, H.-T. (2005). Practical program evaluation: Assessing and improving planning, implementation, and effectiveness.
Thousand Oaks, CA: Sage.
12 Mark, M. M., Henry, G. T., & Julnes, G. (2000). Evaluation: An integrated framework for understanding, guiding, and
improving policies and programs. San Francisco, CA: Jossey-Bass.
13 Each of these three theories also addresses other factors that help drive contingent thinking about evaluation. In
addition, at least in certain places, there is more overlap among these models than this brief summary suggests.
Nevertheless, they do in general focus on different drivers of contingent decision making, as noted.
Mel Mark, Ph.D.
Professor of Psychology
The Pennsylvania State University
Department of Psychology
407 Moore Hall
University Park, PA 16802.
Tel: 814-863-1755
Email: m5m@psu.edu
kirkpatrick's learning and training
evaluation theory
Donald L Kirkpatrick's training evaluation model -
the four levels of learning evaluation
also below - HRD performance evaluation guide
Donald L Kirkpatrick, Professor Emeritus, University Of Wisconsin
(where he achieved his BBA, MBA and PhD), first published his ideas in
1959, in a series of articles in the Journal of American Society of
Training Directors. The articles were subsequently included in
Kirkpatrick's book Evaluating Training Programs (originally published in
1994; now in its 3rd edition - Berrett-Koehler Publishers).
Donald Kirkpatrick was president of the American Society for Training
and Development (ASTD) in 1975. Kirkpatrick has written several other
significant books about training and evaluation, more recently with his
similarly inclined son James, and has consulted with some of the world's
largest corporations.
Donald Kirkpatrick's 1994 book Evaluating Training Programs defined
his originally published ideas of 1959, thereby further increasing
awareness of them, so that his theory has now become arguably the
most widely used and popular model for the evaluation of training and
learning. Kirkpatrick's four-level model is now considered an industry
standard across the HR and training communities.
More recently Don Kirkpatrick formed his own company, Kirkpatrick
Partners, whose website provides information about their services and
methods, etc.
kirkpatrick's four levels of evaluation model
The four levels of Kirkpatrick's evaluation model essentially measure:
• reaction of student - what they thought and felt about
the training
• learning - the resulting increase in knowledge or
capability
• behaviour - extent of behaviour and capability
improvement and implementation/application
• results - the effects on the business or environment
resulting from the trainee's performance
All these measures are recommended for full
and meaningful evaluation of learning in organizations, although their
application broadly increases in complexity, and usually cost, as you move
up through the levels from 1 to 4.
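For readers who track their evaluation plans in code or a spreadsheet, the four levels can be captured in a simple lookup structure. The following is only a minimal illustrative sketch in Python; the level names and example methods are drawn from the grids below, while the field names and the describe() helper are hypothetical choices of my own, not part of Kirkpatrick's model:

# Illustrative sketch: Kirkpatrick's four levels as a simple lookup table.
# Field names ("measures", "example_methods") are arbitrary labels, not part of the model.
KIRKPATRICK_LEVELS = {
    1: {"name": "Reaction",
        "measures": "what trainees thought and felt about the training",
        "example_methods": ["feedback forms", "post-training surveys", "verbal reaction"]},
    2: {"name": "Learning",
        "measures": "the resulting increase in knowledge or capability",
        "example_methods": ["pre/post tests", "interview", "observation"]},
    3: {"name": "Behaviour",
        "measures": "extent of behaviour change and application back on the job",
        "example_methods": ["observation and interview over time", "360-degree feedback"]},
    4: {"name": "Results",
        "measures": "effects on the business or environment from the trainee's performance",
        "example_methods": ["existing KPIs and management reporting"]},
}

def describe(level: int) -> str:
    """Return a one-line description of a Kirkpatrick level."""
    entry = KIRKPATRICK_LEVELS[level]
    return f"Level {level} ({entry['name']}): measures {entry['measures']}"

if __name__ == "__main__":
    for lvl in sorted(KIRKPATRICK_LEVELS):
        print(describe(lvl))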
Quick Training Evaluation and Feedback Form, based on
Kirkpatrick's Learning Evaluation Model - (Excel file)
kirkpatrick's four levels of training evaluation
This grid illustrates the basic Kirkpatrick structure at a glance. The
second grid, beneath this one, is the same thing with more detail.
level 1 - Reaction
evaluation description and characteristics: Reaction evaluation is how the delegates felt about the training or learning experience.
examples of evaluation tools and methods: 'Happy sheets', feedback forms. Verbal reaction, post-training surveys or questionnaires.
relevance and practicability: Quick and very easy to obtain. Not expensive to gather or to analyse.

level 2 - Learning
evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge - before and after.
examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can also be used.
relevance and practicability: Relatively simple to set up; clear-cut for quantifiable skills. Less easy for complex learning.

level 3 - Behaviour
evaluation description and characteristics: Behaviour evaluation is the extent of applied learning back on the job - implementation.
examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change.
relevance and practicability: Measurement of behaviour change typically requires cooperation and skill of line-managers.

level 4 - Results
evaluation description and characteristics: Results evaluation is the effect on the business or environment by the trainee.
examples of evaluation tools and methods: Measures are already in place via normal management systems and reporting - the challenge is to relate to the trainee.
relevance and practicability: Individually not difficult; unlike whole organisation. Process must attribute clear accountabilities.
kirkpatrick's four levels of training evaluation in detail
This grid illustrates the Kirkpatrick structure in more detail, and in particular the
modern-day interpretation of the Kirkpatrick learning evaluation model,
its usage and implications, and examples of tools and methods. It covers
the same ground as the grid above but with more detail and
explanation:
Each level is described below under the same three headings as the grid above: evaluation description and characteristics; examples of evaluation tools and methods; relevance and practicability.
1. Reaction
evaluation description and characteristics: Reaction evaluation is how the delegates felt, and their personal reactions to the training or learning experience, for example: Did the trainees like and enjoy the training? Did they consider the training relevant? Was it a good use of their time? Did they like the venue, the style, timing, domestics, etc? Level of participation. Ease and comfort of experience. Level of effort required to make the most of the learning. Perceived practicability and potential for applying the learning.
examples of evaluation tools and methods: Typically 'happy sheets'. Feedback forms based on subjective personal reaction to the training experience. Verbal reaction which can be noted and analysed. Post-training surveys or questionnaires. Online evaluation or grading by delegates. Subsequent verbal or written reports given by delegates to managers back at their jobs.
relevance and practicability: Can be done immediately the training ends. Very easy to obtain reaction feedback. Feedback is not expensive to gather or to analyse for groups. Important to know that people were not upset or disappointed. Important that people give a positive impression when relating their experience to others who might be deciding whether to experience same.
2. Learning
evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge or intellectual capability from before to after the learning experience: Did the trainees learn what was intended to be taught? Did the trainee experience what was intended for them to experience? What is the extent of advancement or change in the trainees after the training, in the direction or area that was intended?
examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can be used before and after, although this is time-consuming and can be inconsistent. Methods of assessment need to be closely related to the aims of the learning. Measurement and analysis is possible and easy on a group scale. Reliable, clear scoring and measurements need to be established, so as to limit the risk of inconsistent assessment. Hard-copy, electronic, online or interview-style assessments are all possible.
relevance and practicability: Relatively simple to set up, but more investment and thought required than reaction evaluation. Highly relevant and clear-cut for certain training such as quantifiable or technical skills. Less easy for more complex learning such as attitudinal development, which is famously difficult to assess. Cost escalates if systems are poorly designed, which increases work required to measure and analyse.
3. Behaviour
evaluation description and characteristics: Behaviour evaluation is the extent to which the trainees applied the learning and changed their behaviour, and this can be immediately and several months after the training, depending on the situation: Did the trainees put their learning into effect when back on the job? Were the relevant skills and knowledge used? Was there noticeable and measurable change in the activity and performance of the trainees when back in their roles? Was the change in behaviour and new level of knowledge sustained? Would the trainee be able to transfer their learning to another person? Is the trainee aware of their change in behaviour, knowledge, skill level?
examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change. Arbitrary snapshot assessments are not reliable because people change in different ways at different times. Assessments need to be subtle and ongoing, and then transferred to a suitable analysis tool. Assessments need to be designed to reduce subjective judgement of the observer or interviewer, which is a variable factor that can affect reliability and consistency of measurements. The opinion of the trainee, which is a relevant indicator, is also subjective and unreliable, and so needs to be measured in a consistent, defined way. 360-degree feedback is a useful method and need not be used before training, because respondents can make a judgement as to change after training, and this can be analysed for groups of respondents and trainees. Assessments can be designed around relevant performance scenarios, and specific key performance indicators or criteria. Online and electronic assessments are more difficult to incorporate - assessments tend to be more successful when integrated within existing management and coaching protocols. Self-assessment can be useful, using carefully designed criteria and measurements.
relevance and practicability: Measurement of behaviour change is less easy to quantify and interpret than reaction and learning evaluation. Simple quick-response systems are unlikely to be adequate. Cooperation and skill of observers, typically line-managers, are important factors, and difficult to control. Management and analysis of ongoing subtle assessments are difficult, and virtually impossible without a well-designed system from the beginning. Evaluation of implementation and application is an extremely important assessment - there is little point in a good reaction and good increase in capability if nothing changes back in the job, therefore evaluation in this area is vital, albeit challenging. Behaviour change evaluation is possible given good support and involvement from line managers or trainees, so it is helpful to involve them from the start, and to identify benefits for them, which links to the level 4 evaluation below.
4. Results
evaluation description and characteristics: Results evaluation is the effect on the business or environment resulting from the improved performance of the trainee - it is the acid test. Measures would typically be business or organisational key performance indicators, such as: volumes, values, percentages, timescales, return on investment, and other quantifiable aspects of organisational performance, for instance numbers of complaints, staff turnover, attrition, failures, wastage, non-compliance, quality ratings, achievement of standards and accreditations, growth, retention, etc.
examples of evaluation tools and methods: It is possible that many of these measures are already in place via normal management systems and reporting. The challenge is to identify which of them relate to the trainee's input and influence, and how. Therefore it is important to identify and agree accountability and relevance with the trainee at the start of the training, so they understand what is to be measured. This process overlays normal good management practice - it simply needs linking to the training input. Failure to link to training input type and timing will greatly reduce the ease by which results can be attributed to the training. For senior people particularly, annual appraisals and ongoing agreement of key business objectives are integral to measuring business results derived from training.
relevance and practicability: Individually, results evaluation is not particularly difficult; across an entire organisation it becomes very much more challenging, not least because of the reliance on line-management, and the frequency and scale of changing structures, responsibilities and roles, which complicates the process of attributing clear accountability. Also, external factors greatly affect organisational and business performance, which cloud the true cause of good or poor results.
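To make the level 1 and level 2 measurements above concrete, here is a minimal Python sketch that summarises hypothetical 'happy sheet' ratings and pre/post test scores. The data, the 1-5 rating scale, and the simple mean-gain calculation are illustrative assumptions only, not a prescribed scoring scheme:

# Minimal sketch: summarising hypothetical level 1 (reaction) and level 2 (learning) data.
reaction_scores = [4, 5, 3, 4, 5]               # hypothetical 1-5 ratings from 'happy sheets'
pre_test = {"ana": 55, "ben": 60, "cara": 40}   # hypothetical pre-training test scores (%)
post_test = {"ana": 80, "ben": 75, "cara": 70}  # hypothetical post-training test scores (%)

avg_reaction = sum(reaction_scores) / len(reaction_scores)
gains = {name: post_test[name] - pre_test[name] for name in pre_test}
avg_gain = sum(gains.values()) / len(gains)

print(f"Average reaction rating (level 1): {avg_reaction:.1f} out of 5")
print(f"Average learning gain (level 2): {avg_gain:.1f} percentage points")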
Since Kirkpatrick established his original model, other theorists (for
example Jack Phillips), and indeed Kirkpatrick himself, have referred to
a possible fifth level, namely ROI (Return On Investment). In my view
ROI can easily be included in Kirkpatrick's original fourth level 'Results'.
The inclusion and relevance of a fifth level is therefore arguably only
relevant if the assessment of Return On Investment might otherwise be
ignored or forgotten when referring simply to the 'Results' level.
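For reference, the calculation usually associated with this proposed fifth level (following Jack Phillips) expresses net programme benefits as a percentage of programme costs. A minimal sketch with hypothetical figures:

# Illustrative ROI calculation (Phillips-style); all figures are hypothetical.
programme_benefits = 120_000   # monetary value attributed to the training
programme_costs = 80_000       # total cost of design, delivery and evaluation

roi_percent = (programme_benefits - programme_costs) / programme_costs * 100
benefit_cost_ratio = programme_benefits / programme_costs

print(f"ROI: {roi_percent:.0f}%")                         # 50%
print(f"Benefit-cost ratio: {benefit_cost_ratio:.1f}:1")  # 1.5:1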
Learning evaluation is a widely researched area. This is understandable
since the subject is fundamental to the existence and performance of
education around the world, not least universities, which of course
contain most of the researchers and writers.
While Kirkpatrick's model is not the only one of its type, for most
industrial and commercial applications it suffices; indeed most
organisations would be absolutely thrilled if their training and learning
evaluation, and thereby their ongoing people-development, were
planned and managed according to Kirkpatrick's model.
For reference, should you be keen to look at more ideas, there are
many to choose from...
• Jack Phillips' Five Level ROI Model
• Daniel Stufflebeam's CIPP Model (Context, Input, Process,
Product)
• Robert Stake's Responsive Evaluation Model
• Robert Stake's Congruence-Contingency Model
• Kaufman's Five Levels of Evaluation
• CIRO (Context, Input, Reaction, Outcome)
• PERT (Program Evaluation and Review Technique)
• Alkin's UCLA Model
• Michael Scriven's Goal-Free Evaluation Approach
• Provus's Discrepancy Model
• Eisner's Connoisseurship Evaluation Models
• Illuminative Evaluation Model
• Portraiture Model
• and also the American Evaluation Association
Also look at Leslie Rae's excellent training evaluation guidance and
tools available on this site, which, given Leslie's experience and
knowledge, will save you the job of researching and designing your own
tools.
evaluation of HRD function performance
If you are responsible for HR functions and services to internal and/or
external customers, you might find it useful to go beyond Kirkpatrick's
evaluation of training and learning, and also to evaluate
satisfaction among staff/customers with the HR department's
overall performance. The parameters for such an evaluation
ultimately depend on what your HR function is responsible for - in other
words, evaluate according to expectations.
Like anything else, evaluating customer satisfaction must first begin with
a clear appreciation of (internal) customers' expectations. Expectations -
agreed, stated, published or otherwise - provide the basis for evaluating
all types of customer satisfaction.
If people have expectations which go beyond the HR
department's stated and actual responsibilities, then the
matter must be pursued, because it will almost certainly offer
an opportunity to add value to HR's activities, and to add
value and competitive advantage to your organisation as a
whole. In this fast changing world, HR is increasingly the department
which is most likely to see and respond to new opportunities for the
support and development of your people - so respond, understand,
and do what you can to meet new demands when you see them.
If you are keen to know how well the HR department is meeting people's
expectations, a questionnaire and/or some group discussions will shed
light on the situation.
Here are some example questions. Effectively you should be asking
people to say how well the HR or HRD department has done the following:
• helped me
to identify, understand and prioritise my personal
development needs and wishes, in terms
of: skills, knowledge, experience and attitude (or personal well-
being, or emotional maturity, or mood, or mind-set, or any other
suitable term meaning mental approach, which people will respond
to)
• helped me to understand my own preferred learning
style and learning methods for acquiring
new skills, knowledge and attitudinal capabilities
• helped me to identify and obtain effective
learning and development that suits my preferred style and
circumstances
• helped me to measure my development, and for the
measurement to be clear to my boss and others in the organisation
who should know about my capabilities
• provided tools and systems to encourage and facilitate
my personal development
• and particularly helped to optimise
the relationship between me and my boss relating to assisting my
own personal development and well-being
• provided a working environment that protects
me from discrimination and harassment of any sort
• provided the opportunity for me to voice my grievances if I have
any, (in private, to a suitably trained person in the company whom I
trust) and then if I so wish for proper consideration and response to
be given to them by the company
• provided the opportunity for me to receive counselling and
advice in the event that I need private and supportive help of this
type, again from a suitably trained person in the company whom I
trust
• ensured that disciplinary processes are clear and fair, and
include the right of appeal
• ensured that recruitment and promotion of staff are
managed fairly and transparently
• ensured that systems and activities exist to keep all
staff informed of company plans, performance, etc., (as normally
included in a Team Briefing system)
• (if you dare...) ensured that people are paid and rewarded
fairly in relation to other company employees, and separately, paid
and rewarded fairly when compared to market norms (your CEO will
not like this question, but if you have a problem in this area it's best to
know about it...)
• (and for managers) helped me to ensure the development
needs of my staff are identified and supported
This is not an exhaustive list - just some examples. Many of the
examples contain elements which should under typical large company
circumstances be broken down to create more and smaller questions
about more specific aspects of HR support and services.
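If questions like those above are turned into a rating-scale questionnaire, the responses can be summarised very simply. Here is a minimal Python sketch; the item wordings and the 1-5 ratings are hypothetical examples:

# Minimal sketch: averaging hypothetical 1-5 ratings per HR questionnaire item.
responses = {
    "helped me identify and prioritise my development needs": [4, 3, 5, 4],
    "helped me understand my preferred learning style": [3, 3, 4, 2],
    "provided tools and systems for my personal development": [5, 4, 4, 5],
}

for item, ratings in responses.items():
    mean_rating = sum(ratings) / len(ratings)
    print(f"{mean_rating:.1f}  {item}")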
If you work in HR, or run an HR department, and consider that some of
these issues and expectations fall outside your remit, then consider who
else is responsible for them.
I repeat, in this fast changing world, HR is increasingly the department
which is most likely to see and respond to new opportunities for the
support and development of your people - so respond, understand,
and do what you can to meet new demands when you see them. In
doing so you will add value to your people and your organisation - and
your department.
Informing Practice Using Evaluation Models and Theories
Instructor: Dr. Melvin M. Mark, Professor of Psychology at the Pennsylvania State University
Description: Evaluators who are not aware of the contemporary and historical aspects of the profession
"are doomed to repeat past mistakes and, equally debilitating, will fail to sustain and build on past
successes." Madaus, Scriven and Stufflebeam (1983).
"Evaluation theories are like military strategy and tactics; methods are like military weapons and logistics.
The good commander needs to know strategy and tactics to deploy weapons properly or to organize
logistics in different situations. The good evaluator needs theories for the same reasons in choosing and
deploying methods." Shadish, Cook and Leviton (1991).
These quotes from Madaus et al. (1983) and Shadish et al. (1991) provide the perfect rationale for why the
serious evaluator should be concerned with models and theories of evaluation. The primary purpose of this
class is to overview major streams of evaluation theories (or models), and to consider their implications for
practice. Topics include: (1) why evaluation theories matter, (2) frameworks for classifying different
theories, (3) in-depth examination of 4-6 major theories, (4) identification of key issues on which evaluation
theories and models differ, (5) benefits and risks of relying heavily on any one theory, and (6) tools and
skills that can help you in picking and choosing from different theoretical perspectives in planning an
evaluation in a specific context. The overarching theme will be on practice implications, that is, on what
difference it would make for practice to follow one theory or some other.
Theories to be discussed will be ones that have had a significant impact on the evaluation field, that offer
perspectives with major implications for practice, and that represent important and different streams of
theory and practice. Case examples from the past will be used to illustrate key aspects of each theory's
approach to practice.
Participants will be asked to use the theories to question their own and others' practices, and to consider
what characteristics of evaluations will help increase their potential for use. Each participant will receive
Marvin Alkin's text, Evaluation Roots (Sage, 2004) and other materials.
The instructor's assumption will be that most people attending the session have some general familiarity
with the work of a few evaluation theorists, but that most will not themselves be scholars of evaluation
theory. At the same time, the course should be useful, whatever one's level of familiarity with evaluation
theory.
Applied Measurement for Evaluation
Instructor: Dr. Ann Doucette, TEI Director, Research Professor, Columbian College of Arts and Sciences,
George Washington University
Description: Successful evaluation depends on our ability to generate evidence attesting to the feasibility,
relevance and/or effectiveness of the interventions, services, or products we study. While theory guides our
designs and how we organize our work, it is measurement that provides the evidence we use in making
judgments about the quality of what we evaluate. Measurement, whether it results from self-report survey,
interview/focus groups, observation, document review, or administrative data must be systematic,
replicable, interpretable, reliable, and valid. While hard sciences such as physics and engineering have
advanced precise and accurate measurement (e.g., weight, length, mass, volume), the measurement used in
evaluation studies is often imprecise and characterized by considerable error. The quality of the inferences
made in evaluation studies is directly related to the quality of the measurement on which we base our
judgments. Judgments that interventions are ineffective may be flawed - a reflection of measures
that are imprecise and not sensitive to the characteristics we chose to evaluate. Evaluation attempts to
compensate for imprecise measurement with increasingly sophisticated statistical procedures to manipulate
data. The emphasis on statistical analysis all too often obscures the important characteristics of the
measures we choose. This class content will cover these topics:
• Assessing measurement precision: examining the precision of measures in relationship to the
degree of accuracy that is needed for what is being evaluated. Issues to be addressed include:
measurement/item bias, the sensitivity of measures in terms of developmental and cultural
issues, scientific soundness (reliability, validity, error, etc.), and the ability of the measure to
detect change over time.
• Quantification: Measurement is essentially assigning numbers to what is observed (direct and
inferential). Decisions about how we quantify observations and the implications these decisions
have for using the data resulting from the measures, as well as for the objectivity and certainty
we bring to the judgment made in our evaluations will be examined. This section of the course
will focus on the quality of response options, coding categories - Do response options/coding
categories segment the respondent sample in meaningful and useful ways?
• Issues and Considerations - using existing measures versus developing your own measures:
What to look for and how to assess whether existing measures are suitable for your evaluation
project will be examined. Issues associated with the development and use of new measures will
be addressed in terms of how to establish sound psychometric properties, and what cautionary
statements should accompany the interpretation of evaluation findings using these new
measures.
• Criteria for choosing measures: assessing the adequacy of measures in terms of the
characteristics of measurement - choosing measures that fit your evaluation theory and
evaluation focus (exploration, needs assessment, level of implementation, process, impact and
outcome). Measurement feasibility, practicability and relevance will be examined. Various
measurement techniques will be examined in terms of precision and adequacy, as well as the
implications of using screening, broad-range, and peaked tests.
• Error - influences on measurement precision: The characteristics of various measurement
techniques, assessment conditions (setting, respondent interest, etc.), and evaluator
characteristics will be addressed.
Participants will be provided with a copy of the text Measurement Theory in Action: Case Studies and
Exercises by Shultz, K. S., and Whitney, D. J. (Sage, 2004).
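As one small, concrete illustration of the reliability issues this class addresses, the sketch below computes Cronbach's alpha, a standard internal-consistency statistic, for a hypothetical set of item responses. The data and the choice of statistic are illustrative assumptions of mine, not material taken from the course:

# Minimal sketch: Cronbach's alpha for hypothetical item-level data.
# Rows = respondents, columns = items scored on the same scale.
import statistics

data = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(data[0])                                              # number of items
items = list(zip(*data))                                      # transpose: one tuple per item
item_variances = [statistics.variance(col) for col in items]  # sample variance per item
total_scores = [sum(row) for row in data]                     # each respondent's total score
total_variance = statistics.variance(total_scores)

alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")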
>> Return to top

Weitere ähnliche Inhalte

Was ist angesagt?

(My initial post)reflect on the focus area or system(s) for the
(My initial post)reflect on the focus area or system(s) for the (My initial post)reflect on the focus area or system(s) for the
(My initial post)reflect on the focus area or system(s) for the ssuserfa5723
 
Research Methodology
Research MethodologyResearch Methodology
Research MethodologyAneel Raza
 
Psyc327.Show2
Psyc327.Show2Psyc327.Show2
Psyc327.Show2TUPSYC327
 
Rch 7301, critical thinking for doctoral learners 1
  Rch 7301, critical thinking for doctoral learners 1   Rch 7301, critical thinking for doctoral learners 1
Rch 7301, critical thinking for doctoral learners 1 ssuserfa5723
 
Metpen chapter 7 UMA SEKARAN
Metpen chapter 7 UMA SEKARANMetpen chapter 7 UMA SEKARAN
Metpen chapter 7 UMA SEKARANDiyah Aprilia
 
Carma internet research module scale development
Carma internet research module   scale developmentCarma internet research module   scale development
Carma internet research module scale developmentSyracuse University
 
Business Research Methods
Business Research MethodsBusiness Research Methods
Business Research MethodsAmit Sarkar
 
Theories, models and frameworks 2017
Theories, models and frameworks 2017Theories, models and frameworks 2017
Theories, models and frameworks 2017Martha Seife
 
Contemporary research practices
Contemporary research practicesContemporary research practices
Contemporary research practicesCarlo Magno
 
Adler clark 4e ppt 08
Adler clark 4e ppt 08Adler clark 4e ppt 08
Adler clark 4e ppt 08arpsychology
 
Critical thinking measurement tools and assessment rubrics report
Critical thinking measurement tools and assessment rubrics   reportCritical thinking measurement tools and assessment rubrics   report
Critical thinking measurement tools and assessment rubrics reportLira Lei Ann Bondoc
 

Was ist angesagt? (20)

Problem ding the
          Problem          ding the           Problem          ding the
Problem ding the
 
(My initial post)reflect on the focus area or system(s) for the
(My initial post)reflect on the focus area or system(s) for the (My initial post)reflect on the focus area or system(s) for the
(My initial post)reflect on the focus area or system(s) for the
 
Reliability and validity
Reliability and  validityReliability and  validity
Reliability and validity
 
Research Methodology
Research MethodologyResearch Methodology
Research Methodology
 
Meta-Evaluation Theory
Meta-Evaluation TheoryMeta-Evaluation Theory
Meta-Evaluation Theory
 
Psyc327.Show2
Psyc327.Show2Psyc327.Show2
Psyc327.Show2
 
Aqa research methods 1
Aqa research methods 1Aqa research methods 1
Aqa research methods 1
 
Methods for Tina
Methods for TinaMethods for Tina
Methods for Tina
 
Rch 7301, critical thinking for doctoral learners 1
  Rch 7301, critical thinking for doctoral learners 1   Rch 7301, critical thinking for doctoral learners 1
Rch 7301, critical thinking for doctoral learners 1
 
Metpen chapter 7 UMA SEKARAN
Metpen chapter 7 UMA SEKARANMetpen chapter 7 UMA SEKARAN
Metpen chapter 7 UMA SEKARAN
 
Carma internet research module scale development
Carma internet research module   scale developmentCarma internet research module   scale development
Carma internet research module scale development
 
Using CATs and REAs to inform decision-making
Using CATs and REAs to inform decision-makingUsing CATs and REAs to inform decision-making
Using CATs and REAs to inform decision-making
 
Business Research Methods
Business Research MethodsBusiness Research Methods
Business Research Methods
 
Unit1 systematicreview
Unit1 systematicreviewUnit1 systematicreview
Unit1 systematicreview
 
Theories, models and frameworks 2017
Theories, models and frameworks 2017Theories, models and frameworks 2017
Theories, models and frameworks 2017
 
Critical Thinking
Critical ThinkingCritical Thinking
Critical Thinking
 
Contemporary research practices
Contemporary research practicesContemporary research practices
Contemporary research practices
 
7th semester BIM: Unit 1 critical thinking
7th semester BIM: Unit 1 critical thinking7th semester BIM: Unit 1 critical thinking
7th semester BIM: Unit 1 critical thinking
 
Adler clark 4e ppt 08
Adler clark 4e ppt 08Adler clark 4e ppt 08
Adler clark 4e ppt 08
 
Critical thinking measurement tools and assessment rubrics report
Critical thinking measurement tools and assessment rubrics   reportCritical thinking measurement tools and assessment rubrics   report
Critical thinking measurement tools and assessment rubrics report
 

Ähnlich wie New microsoft word document (2)

IIAlternative Approachesto Program EvaluationPart1.docx
IIAlternative Approachesto Program EvaluationPart1.docxIIAlternative Approachesto Program EvaluationPart1.docx
IIAlternative Approachesto Program EvaluationPart1.docxsheronlewthwaite
 
My presentation erin da802
My presentation   erin da802My presentation   erin da802
My presentation erin da802nida19
 
The Duty of Loyalty and Whistleblowing Please respond to the fol.docx
The Duty of Loyalty and Whistleblowing Please respond to the fol.docxThe Duty of Loyalty and Whistleblowing Please respond to the fol.docx
The Duty of Loyalty and Whistleblowing Please respond to the fol.docxcherry686017
 
8. brown & hudson 1998 the alternatives in language assessment
8. brown & hudson 1998 the alternatives in language assessment8. brown & hudson 1998 the alternatives in language assessment
8. brown & hudson 1998 the alternatives in language assessmentCate Atehortua
 
Running head SETTING UP RESEARCH1 Chapter 6 Methods of Measu.docx
Running head SETTING UP RESEARCH1  Chapter 6 Methods of Measu.docxRunning head SETTING UP RESEARCH1  Chapter 6 Methods of Measu.docx
Running head SETTING UP RESEARCH1 Chapter 6 Methods of Measu.docxtodd521
 
IDS 401 Milestone Four Gu
IDS  401 Milestone  Four  GuIDS  401 Milestone  Four  Gu
IDS 401 Milestone Four GuMalikPinckney86
 
Educational evaluation. ed8 chapter 6
Educational evaluation. ed8 chapter 6Educational evaluation. ed8 chapter 6
Educational evaluation. ed8 chapter 6Eddie Abug
 
Ppt methods of acquiring knowledge
Ppt methods of acquiring knowledgePpt methods of acquiring knowledge
Ppt methods of acquiring knowledgejyoti dwivedi
 
Chapter3-Methods_of_Research-Module.pdf
Chapter3-Methods_of_Research-Module.pdfChapter3-Methods_of_Research-Module.pdf
Chapter3-Methods_of_Research-Module.pdfJUNGERONA
 
Judgments and decisions in health care review: how to undertake ethical revie...
Judgments and decisions in health care review: how to undertake ethical revie...Judgments and decisions in health care review: how to undertake ethical revie...
Judgments and decisions in health care review: how to undertake ethical revie...Hugh Davies
 
Classroom-Based-Action-Research.pptx
Classroom-Based-Action-Research.pptxClassroom-Based-Action-Research.pptx
Classroom-Based-Action-Research.pptxMyleneDelaPena2
 
Organizational Diagnosis
Organizational DiagnosisOrganizational Diagnosis
Organizational Diagnosisjim
 
Do’s and don’t for Applied Technology.
Do’s and don’t for Applied Technology. Do’s and don’t for Applied Technology.
Do’s and don’t for Applied Technology. solo334
 

Ähnlich wie New microsoft word document (2) (20)

IIAlternative Approachesto Program EvaluationPart1.docx
IIAlternative Approachesto Program EvaluationPart1.docxIIAlternative Approachesto Program EvaluationPart1.docx
IIAlternative Approachesto Program EvaluationPart1.docx
 
My presentation erin da802
My presentation   erin da802My presentation   erin da802
My presentation erin da802
 
The Duty of Loyalty and Whistleblowing Please respond to the fol.docx
The Duty of Loyalty and Whistleblowing Please respond to the fol.docxThe Duty of Loyalty and Whistleblowing Please respond to the fol.docx
The Duty of Loyalty and Whistleblowing Please respond to the fol.docx
 
8. brown & hudson 1998 the alternatives in language assessment
8. brown & hudson 1998 the alternatives in language assessment8. brown & hudson 1998 the alternatives in language assessment
8. brown & hudson 1998 the alternatives in language assessment
 
Running head SETTING UP RESEARCH1 Chapter 6 Methods of Measu.docx
Running head SETTING UP RESEARCH1  Chapter 6 Methods of Measu.docxRunning head SETTING UP RESEARCH1  Chapter 6 Methods of Measu.docx
Running head SETTING UP RESEARCH1 Chapter 6 Methods of Measu.docx
 
IDS 401 Milestone Four Gu
IDS  401 Milestone  Four  GuIDS  401 Milestone  Four  Gu
IDS 401 Milestone Four Gu
 
Educational evaluation. ed8 chapter 6
Educational evaluation. ed8 chapter 6Educational evaluation. ed8 chapter 6
Educational evaluation. ed8 chapter 6
 
Ppt methods of acquiring knowledge
Ppt methods of acquiring knowledgePpt methods of acquiring knowledge
Ppt methods of acquiring knowledge
 
Chapter3-Methods_of_Research-Module.pdf
Chapter3-Methods_of_Research-Module.pdfChapter3-Methods_of_Research-Module.pdf
Chapter3-Methods_of_Research-Module.pdf
 
Judgments and decisions in health care review: how to undertake ethical revie...
Judgments and decisions in health care review: how to undertake ethical revie...Judgments and decisions in health care review: how to undertake ethical revie...
Judgments and decisions in health care review: how to undertake ethical revie...
 
RACFIntroduction to Evaluation
RACFIntroduction to EvaluationRACFIntroduction to Evaluation
RACFIntroduction to Evaluation
 
Marketing Research
Marketing ResearchMarketing Research
Marketing Research
 
Educational Evaluation
Educational EvaluationEducational Evaluation
Educational Evaluation
 
Ogc chap 7
Ogc chap 7Ogc chap 7
Ogc chap 7
 
Research Week 1
Research Week 1Research Week 1
Research Week 1
 
Research
ResearchResearch
Research
 
Classroom-Based-Action-Research.pptx
Classroom-Based-Action-Research.pptxClassroom-Based-Action-Research.pptx
Classroom-Based-Action-Research.pptx
 
26formres
26formres26formres
26formres
 
Organizational Diagnosis
Organizational DiagnosisOrganizational Diagnosis
Organizational Diagnosis
 
Do’s and don’t for Applied Technology.
Do’s and don’t for Applied Technology. Do’s and don’t for Applied Technology.
Do’s and don’t for Applied Technology.
 

Kürzlich hochgeladen

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 

Kürzlich hochgeladen (20)

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

New microsoft word document (2)

  • 1. Volume XI, Number 2, Summer 2005 Issue Topic: Evaluation Methodology Theory & Practice Evaluation Theory or What Are Evaluation Methods for? Quick Links • About The Evaluation Exchange • Table of Contents • PDF Version (152 kB) • Request reprint permission • How to cite Mel Mark, professor of psychology at the Pennsylvania State University and president-elect of the American Evaluation Association, discusses why theory is important to evaluation practice. Hey, this issue of The Evaluation Exchange focuses on methods, including recent methodological developments. What’s this piece on evaluation theory doing here? Was there some kind of mix-up? No, it’s not a mistake. Although evaluation theory1 serves several purposes, perhaps it functions most importantly as a guide to practice. Learning the latest methodological advance—whether it’s some new statistical adjustment for selection bias or the most recent technique to facilitate stakeholder dialogue—without knowing the relevant theory is a bit like learning what to do without knowing why or when. What you risk is the equivalent of becoming really skilled at tuning your car’s engine without thinking about whether your transportation needs involve going across town, overseas, or to the top of a skyscraper. Will Shadish, Tom Cook, and Laura Leviton make the same point using a military metaphor: “Evaluation theories are like military strategies and tactics; methods are like military weapons and logistics,” they say. “The good commander needs to know strategy and tactics to deploy weapons properly or to organize logistics in different situations. The good evaluator needs theories for the same reasons in choosing and employing methods.”2 The reasons to learn about evaluation theory go beyond the strategy/tactic or why/how distinction, however. Evaluation theory does more than help us make good judgments about what kind of methods to use, under what circumstances, and toward what forms of evaluation influence.
Evaluation theory, as Will Shadish said in his presidential address to the American Evaluation Association, is "who we are."6

But people come to evaluation through quite varied pathways, many of which don't involve explicit training in evaluation. That there are myriad pathways into evaluation is, of course, a source of great strength for the field, bringing diversity of skills, opinions, knowledge sets, and so on. Despite the positive consequences of the various ways that people enter the field, this diversity also reinforces the importance of studying evaluation theories. Methods are important, but, again, they need to be chosen in the service of some larger end. Theory helps us figure out where an evaluation should be going and why—and, not trivially, what it is to be an evaluator.

Of course, knowing something about evaluation theory doesn't mean that choices about methods can be made automatically. Indeed, lauded evaluation theorist Ernie House notes that while theorists typically have a high profile, "practitioners lament that [theorists'] ideas are far too theoretical, too impractical. Practitioners have to do the project work tomorrow, not jawbone fruitlessly forever."7 Especially for newer theoretical work, the translation into practice may not be clear—and sometimes not even feasible. Even evaluation theory that has withstood the test of time doesn't automatically translate into some cookbook or paint-by-number approach to evaluation practice.

More knowledge about evaluation theory can, especially at first, actually make methods choices harder. Why? Because many evaluation theories take quite different stances about what kind of uses evaluation should focus on, and about how evaluation should be done to achieve those uses.
For example, to think about Donald Campbell8 as an evaluation theorist is to highlight (a) the possibility of major choice points in the road, such as decisions about whether or not to implement some new program; (b) the way decisions about such things often depend largely on the program's potential effects (e.g., does universal pre-K lead to better school readiness and other desirable outcomes?); and (c) the benefits of either randomized experiments or the best-available quasi-experimental data for assessing program effects. In contrast, when we think about Joseph Wholey9 as an evaluation theorist, we focus on a very different way that evaluation can contribute: through developing performance-measurement systems that program administrators can use to improve their ongoing decision making. These performance-measurement systems can help managers identify problem areas and also provide them with good-enough feedback about the apparent consequences of decisions.

Choosing among these and the many other perspectives available in evaluation theories may seem daunting, especially at first. But it's better to learn to face the choices than to have them made implicitly by some accident of one's methodological training. In addition, theories themselves can help in the choosing. Some evaluation theories have huge "default options." These theories may not exactly say "one size fits all," but they certainly suggest that one size fits darn near all. Indeed, one of the dangers for those starting to learn about evaluation theory is becoming a true believer in one of the first theories they encounter. When this happens, the new disciple may act like his or her preferred theory fits all circumstances. Perhaps the most effective antidote to this problem is to be sure to learn about several evaluation theories that take fairly different stances. Metaphorically, we probably need to be multilingual: No single evaluation theory should be "spoken" in all the varying contexts we will encounter.

However, most, if not all, evaluation theories are contingent; that is, they prescribe (or at least are open to) quite different approaches under different circumstances. As it turns out, there even exist theories that suggest very different bases for contingent decision making. Put differently, there are theories that differ significantly on reasons for deciding to use one evaluation design and not another. These lead us to think about different "drivers" of contingent decision making. For example, Michael Patton's well-known Utilization-Focused Evaluation tells us to be contingent based on intended use by intended users. Almost any method may be appropriate, if it is likely to help intended users make the intended use.10 Alternatively, in a recent book, Huey-Tsyh Chen joins others who suggest that the choices made in evaluation should be driven by program stage.11 Evaluation purposes and methods for a new program, according to Chen, would typically be different from those for a mature program. Gary Henry, George Julnes, and I12 have suggested that choices among alternative evaluation purposes and methods should be driven by a kind of analytic assessment of each one's likely contribution to social betterment.13

It can help to be familiar with any one of these fundamentally contingent evaluation theories. And, as is true of evaluation theories in general, one or another may fit better, depending on the specific context. Nevertheless, the ideal would probably be to be multilingual even in terms of these contingent evaluation theories. For instance, sometimes intended use may be the perfect driver of contingent decision making, but in other cases decision making may be so distributed across multiple parties that it isn't feasible to identify specific intended users: Even the contingencies are contingent. Evaluation theories are an aid to thoughtful judgment—not a dispensation from it. But as an aid to thoughtful choices about methods, evaluation theories are indispensable.

1. Although meaningful distinctions could perhaps be made, here I am treating evaluation theory as equivalent to evaluation model and to the way the term evaluation approach is sometimes used.
2. Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage.
3. For a recent overview, see Weiss, C. H. (2004). Rooting for evaluation: A Cliff Notes version of my work. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists' views and influences (pp. 12–65). Thousand Oaks, CA: Sage.
4. Madaus, G. F., Scriven, M., & Stufflebeam, D. L. (1983). Evaluation models. Boston: Kluwer-Nijhoff.
5. See the papers by Weiss and Patton in Vol. 9, No. 1 (1988) of Evaluation Practice, reprinted in M. Alkin (Ed.). (1990). Debates on evaluation. Newbury Park, CA: Sage.
6. Shadish, W. (1998). Presidential address: Evaluation theory is who we are. American Journal of Evaluation, 19(1), 1–19.
7. House, E. R. (2003). Evaluation theory. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of educational evaluation (pp. 9–14). Boston: Kluwer Academic.
8. For an overview, see the chapter on Campbell in the book cited in footnote 2.
9. See, e.g., Wholey, J. S. (2003). Improving performance and accountability: Responding to emerging management challenges. In S. I. Donaldson & M. Scriven (Eds.), Evaluating social programs and problems: Visions for the new millennium. Mahwah, NJ: Lawrence Erlbaum.
10. Patton, M. Q. (1997). Utilization-focused evaluation: The new century text. Thousand Oaks, CA: Sage.
11. Chen, H.-T. (2005). Practical program evaluation: Assessing and improving planning, implementation, and effectiveness. Thousand Oaks, CA: Sage.
12. Mark, M. M., Henry, G. T., & Julnes, G. (2000). Evaluation: An integrated framework for understanding, guiding, and improving policies and programs. San Francisco, CA: Jossey-Bass.
13. Each of these three theories also addresses other factors that help drive contingent thinking about evaluation. In addition, at least in certain places, there is more overlap among these models than this brief summary suggests. Nevertheless, they do in general focus on different drivers of contingent decision making, as noted.

Mel Mark, Ph.D.
Professor of Psychology
The Pennsylvania State University
Department of Psychology
407 Moore Hall
University Park, PA 16802
Tel: 814-863-1755
Email: m5m@psu.edu

kirkpatrick's learning and training evaluation theory

Donald L Kirkpatrick's training evaluation model - the four levels of learning evaluation (also below: HRD performance evaluation guide).

Donald L Kirkpatrick, Professor Emeritus, University of Wisconsin (where he achieved his BBA, MBA and PhD), first published his ideas in 1959, in a series of articles in the Journal of the American Society of Training Directors. The articles were subsequently included in Kirkpatrick's book Evaluating Training Programs (originally published in 1994; now in its 3rd edition - Berrett-Koehler Publishers). Donald Kirkpatrick was president of the American Society for Training and Development (ASTD) in 1975. Kirkpatrick has written several other significant books about training and evaluation, more recently with his similarly inclined son James, and has consulted with some of the world's largest corporations.

Donald Kirkpatrick's 1994 book Evaluating Training Programs defined his originally published ideas of 1959, thereby further increasing awareness of them, so that his theory has now become arguably the most widely used and popular model for the evaluation of training and learning. Kirkpatrick's four-level model is now considered an industry standard across the HR and training communities. More recently Don Kirkpatrick formed his own company, Kirkpatrick Partners, whose website provides information about their services and methods.

kirkpatrick's four levels of evaluation model

The four levels of Kirkpatrick's evaluation model essentially measure:

• reaction of student - what they thought and felt about the training
• learning - the resulting increase in knowledge or capability
• behaviour - extent of behaviour and capability improvement and implementation/application
• results - the effects on the business or environment resulting from the trainee's performance

All these measures are recommended for full and meaningful evaluation of learning in organizations, although their application broadly increases in complexity, and usually cost, through the levels from 1 to 4.

Quick Training Evaluation and Feedback Form, based on Kirkpatrick's Learning Evaluation Model (Excel file)
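For readers who keep evaluation plans or feedback records in software, the four levels can also be captured in a very small data structure. The sketch below is purely illustrative and not part of Kirkpatrick's own materials; the class name, field names, and example instruments are assumptions chosen for this page, loosely based on the grids that follow.

```python
# Illustrative only: a minimal representation of Kirkpatrick's four levels,
# with example instruments assumed from the grids later on this page.
from dataclasses import dataclass

@dataclass
class EvaluationLevel:
    number: int
    name: str
    measures: str                   # what the level measures
    example_instruments: list[str]  # assumed examples, not an official list

KIRKPATRICK_LEVELS = [
    EvaluationLevel(1, "Reaction",
                    "what trainees thought and felt about the training",
                    ["happy sheets", "post-training questionnaires"]),
    EvaluationLevel(2, "Learning",
                    "the resulting increase in knowledge or capability",
                    ["pre/post tests", "interviews", "observation"]),
    EvaluationLevel(3, "Behaviour",
                    "application of learning back on the job",
                    ["observation over time", "360-degree feedback"]),
    EvaluationLevel(4, "Results",
                    "effects on the business or environment",
                    ["KPIs from existing management reporting"]),
]

if __name__ == "__main__":
    for level in KIRKPATRICK_LEVELS:
        print(f"Level {level.number} ({level.name}): {level.measures}")
```

A structure like this could, for instance, seed a feedback database or reporting template, with one record per level for each programme being evaluated.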
kirkpatrick's four levels of training evaluation

This grid illustrates the basic Kirkpatrick structure at a glance. The second grid, beneath this one, is the same thing with more detail.

Level 1 - Reaction
• Evaluation description and characteristics: Reaction evaluation is how the delegates felt about the training or learning experience.
• Examples of evaluation tools and methods: 'Happy sheets', feedback forms. Verbal reaction, post-training surveys or questionnaires.
• Relevance and practicability: Quick and very easy to obtain. Not expensive to gather or to analyse.

Level 2 - Learning
• Evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge - before and after.
• Examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can also be used.
• Relevance and practicability: Relatively simple to set up; clear-cut for quantifiable skills. Less easy for complex learning.

Level 3 - Behaviour
• Evaluation description and characteristics: Behaviour evaluation is the extent of applied learning back on the job - implementation.
• Examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change.
• Relevance and practicability: Measurement of behaviour change typically requires cooperation and skill of line-managers.

Level 4 - Results
• Evaluation description and characteristics: Results evaluation is the effect on the business or environment by the trainee.
• Examples of evaluation tools and methods: Measures are already in place via normal management systems and reporting - the challenge is to relate them to the trainee.
• Relevance and practicability: Individually not difficult; unlike the whole organisation. The process must attribute clear accountabilities.

kirkpatrick's four levels of training evaluation in detail

This grid illustrates Kirkpatrick's structure in detail, and particularly the modern-day interpretation of the Kirkpatrick learning evaluation model, usage, implications, and examples of tools and methods. It is the same format as the grid above but with more detail and explanation:

1. Reaction
• Evaluation description and characteristics: Reaction evaluation is how the delegates felt, and their personal reactions to the training or learning experience, for example: Did the trainees like and enjoy the training? Did they consider the training relevant? Was it a good use of their time? Did they like the venue, the style, timing, domestics, etc? Level of participation. Ease and comfort of experience. Level of effort required to make the most of the learning. Perceived practicability and potential for applying the learning.
• Examples of evaluation tools and methods: Typically 'happy sheets'. Feedback forms based on subjective personal reaction to the training experience. Verbal reaction which can be noted and analysed. Post-training surveys or questionnaires. Online evaluation or grading by delegates. Subsequent verbal or written reports given by delegates to managers back at their jobs.
• Relevance and practicability: Can be done immediately the training ends. Very easy to obtain reaction feedback. Feedback is not expensive to gather or to analyse for groups. Important to know that people were not upset or disappointed. Important that people give a positive impression when relating their experience to others who might be deciding whether to experience the same.

2. Learning
• Evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge or intellectual capability from before to after the learning experience: Did the trainees learn what was intended to be taught? Did the trainee experience what was intended for them to experience? What is the extent of advancement or change in the trainees after the training, in the direction or area that was intended?
• Examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can be used before and after, although this is time-consuming and can be inconsistent. Methods of assessment need to be closely related to the aims of the learning. Measurement and analysis is possible and easy on a group scale. Reliable, clear scoring and measurements need to be established, so as to limit the risk of inconsistent assessment. Hard-copy, electronic, online or interview-style assessments are all possible.
• Relevance and practicability: Relatively simple to set up, but more investment and thought required than reaction evaluation. Highly relevant and clear-cut for certain training such as quantifiable or technical skills. Less easy for more complex learning such as attitudinal development, which is famously difficult to assess. Cost escalates if systems are poorly designed, which increases the work required to measure and analyse.

3. Behaviour
• Evaluation description and characteristics: Behaviour evaluation is the extent to which the trainees applied the learning and changed their behaviour, and this can be immediately and several months after the training, depending on the situation: Did the trainees put their learning into effect when back on the job? Were the relevant skills and knowledge used? Was there noticeable and measurable change in the activity and performance of the trainees when back in their roles? Was the change in behaviour and new level of knowledge sustained? Would the trainee be able to transfer their learning to another person? Is the trainee aware of their change in behaviour, knowledge, skill level?
• Examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change. Arbitrary snapshot assessments are not reliable because people change in different ways at different times. Assessments need to be subtle and ongoing, and then transferred to a suitable analysis tool. Assessments need to be designed to reduce subjective judgement of the observer or interviewer, which is a variable factor that can affect reliability and consistency of measurements. The opinion of the trainee, which is a relevant indicator, is also subjective and unreliable, and so needs to be measured in a consistent, defined way. 360-degree feedback is a useful method and need not be used before training, because respondents can make a judgement as to change after training, and this can be analysed for groups of respondents and trainees. Assessments can be designed around relevant performance scenarios, and specific key performance indicators or criteria. Online and electronic assessments are more difficult to incorporate - assessments tend to be more successful when integrated within existing management and coaching protocols. Self-assessment can be useful, using carefully designed criteria and measurements.
• Relevance and practicability: Measurement of behaviour change is less easy to quantify and interpret than reaction and learning evaluation. Simple quick-response systems are unlikely to be adequate. Cooperation and skill of observers, typically line-managers, are important factors, and difficult to control. Management and analysis of ongoing subtle assessments are difficult, and virtually impossible without a well-designed system from the beginning. Evaluation of implementation and application is an extremely important assessment - there is little point in a good reaction and a good increase in capability if nothing changes back in the job, therefore evaluation in this area is vital, albeit challenging. Behaviour change evaluation is possible given good support and involvement from line managers or trainees, so it is helpful to involve them from the start, and to identify benefits for them, which links to the level 4 evaluation below.

4. Results
• Evaluation description and characteristics: Results evaluation is the effect on the business or environment resulting from the improved performance of the trainee - it is the acid test. Measures would typically be business or organisational key performance indicators, such as: volumes, values, percentages, timescales, return on investment, and other quantifiable aspects of organisational performance, for instance: numbers of complaints, staff turnover, attrition, failures, wastage, non-compliance, quality ratings, achievement of standards and accreditations, growth, retention, etc.
• Examples of evaluation tools and methods: It is possible that many of these measures are already in place via normal management systems and reporting. The challenge is to identify which of them relate to the trainee's input and influence, and how. Therefore it is important to identify and agree accountability and relevance with the trainee at the start of the training, so they understand what is to be measured. This process overlays normal good management practice - it simply needs linking to the training input. Failure to link to training input type and timing will greatly reduce the ease by which results can be attributed to the training. For senior people particularly, annual appraisals and ongoing agreement of key business objectives are integral to measuring business results derived from training.
• Relevance and practicability: Individually, results evaluation is not particularly difficult; across an entire organisation it becomes very much more challenging, not least because of the reliance on line-management, and the frequency and scale of changing structures, responsibilities and roles, which complicates the process of attributing clear accountability. Also, external factors greatly affect organisational and business performance, which can cloud the true cause of good or poor results.
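As a concrete illustration of the Learning row above ("assessments or tests before and after the training"), pre/post scores for the same trainees can be summarised with a simple gain-score analysis. This is a minimal sketch using invented scores, not a prescribed part of Kirkpatrick's model; the paired t-test is just one common way to check whether the apparent gain is more than noise.

```python
# Minimal sketch: summarising Level 2 (Learning) with paired pre/post test scores.
# The scores below are invented illustration data, not real evaluation results.
import numpy as np
from scipy import stats

pre = np.array([52, 61, 47, 70, 58, 66, 49, 73])    # test scores before training
post = np.array([68, 70, 59, 78, 66, 80, 61, 77])   # same trainees, after training

gain = post - pre
t_stat, p_value = stats.ttest_rel(post, pre)         # paired comparison

print(f"Mean gain: {gain.mean():.1f} points (SD {gain.std(ddof=1):.1f})")
print(f"Paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
```

A reliable Level 2 gain still says nothing about Levels 3 and 4; as the grid notes, behaviour and results need their own, usually slower, evidence.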
Since Kirkpatrick established his original model, other theorists (for example Jack Phillips), and indeed Kirkpatrick himself, have referred to a possible fifth level, namely ROI (Return On Investment). In my view ROI can easily be included in Kirkpatrick's original fourth level, 'Results'. A separate fifth level is therefore arguably only useful if the assessment of Return On Investment might otherwise be ignored or forgotten when referring simply to the 'Results' level.

Learning evaluation is a widely researched area. This is understandable, since the subject is fundamental to the existence and performance of education around the world, not least universities, which of course contain most of the researchers and writers. While Kirkpatrick's model is not the only one of its type, for most industrial and commercial applications it suffices; indeed most organisations would be absolutely thrilled if their training and learning evaluation, and thereby their ongoing people-development, were planned and managed according to Kirkpatrick's model.

For reference, should you be keen to look at more ideas, there are many to choose from:

• Jack Phillips' Five Level ROI Model
• Daniel Stufflebeam's CIPP Model (Context, Input, Process, Product)
• Robert Stake's Responsive Evaluation Model
• Robert Stake's Congruence-Contingency Model
• Kaufman's Five Levels of Evaluation
• CIRO (Context, Input, Reaction, Outcome)
• PERT (Program Evaluation and Review Technique)
• Alkin's UCLA Model
• Michael Scriven's Goal-Free Evaluation Approach
• Provus's Discrepancy Model
• Eisner's Connoisseurship Evaluation Models
• Illuminative Evaluation Model
• Portraiture Model
• and also the American Evaluation Association

Also look at Leslie Rae's excellent Training Evaluation and tools available on this site, which, given Leslie's experience and knowledge, will save you the job of researching and designing your own tools.

evaluation of HRD function performance

If you are responsible for HR functions and services to internal and/or external customers, you might find it useful to go beyond Kirkpatrick's evaluation of training and learning, and also to evaluate satisfaction among staff/customers with the HR department's overall performance. The parameters for such an evaluation ultimately depend on what your HR function is responsible for - in other words, evaluate according to expectations.
Like anything else, evaluating customer satisfaction must first begin with a clear appreciation of (internal) customers' expectations. Expectations - agreed, stated, published or otherwise - provide the basis for evaluating all types of customer satisfaction. If people have expectations which go beyond the HR department's stated and actual responsibilities, then the matter must be pursued, because it will almost certainly offer an opportunity to add value to HR's activities, and to add value and competitive advantage to your organisation as a whole. In this fast-changing world, HR is increasingly the department which is most likely to see and respond to new opportunities for the support and development of your people - so respond, understand, and do what you can to meet new demands when you see them.

If you are keen to know how well the HR department is meeting people's expectations, a questionnaire and/or some group discussions will shed light on the situation. Here are some example questions. Effectively you should be asking people to say how well the HR or HRD department has done the following:

• helped me to identify, understand and prioritise my personal development needs and wishes, in terms of: skills, knowledge, experience and attitude (or personal well-being, or emotional maturity, or mood, or mind-set, or any other suitable term meaning mental approach, which people will respond to)
• helped me to understand my own preferred learning style and learning methods for acquiring new skills, knowledge and attitudinal capabilities
• helped me to identify and obtain effective learning and development that suits my preferred style and circumstances
• helped me to measure my development, and for the measurement to be clear to my boss and others in the organisation who should know about my capabilities
• provided tools and systems to encourage and facilitate my personal development
• and particularly helped to optimise the relationship between me and my boss relating to assisting my own personal development and well-being
• provided a working environment that protects me from discrimination and harassment of any sort
• provided the opportunity for me to voice my grievances if I have any (in private, to a suitably trained person in the company whom I trust), and then, if I so wish, for proper consideration and response to be given to them by the company
• provided the opportunity for me to receive counselling and advice in the event that I need private and supportive help of this type, again from a suitably trained person in the company whom I trust
• ensured that disciplinary processes are clear and fair, and include the right of appeal
• ensured that recruitment and promotion of staff are managed fairly and transparently
• ensured that systems and activities exist to keep all staff informed of company plans, performance, etc. (as normally included in a Team Briefing system)
• (if you dare...) ensured that people are paid and rewarded fairly in relation to other company employees, and separately, paid and rewarded fairly when compared to market norms (your CEO will not like this question, but if you have a problem in this area it's best to know about it...)
• (and for managers) helped me to ensure the development needs of my staff are identified and supported

This is not an exhaustive list - just some examples. Many of the examples contain elements which should, under typical large-company circumstances, be broken down to create more and smaller questions about more specific aspects of HR support and services. If you work in HR, or run an HR department, and consider that some of these issues and expectations fall outside your remit, then consider who else is responsible for them. I repeat: in this fast-changing world, HR is increasingly the department which is most likely to see and respond to new opportunities for the support and development of your people - so respond, understand, and do what you can to meet new demands when you see them. In doing so you will add value to your people and your organisation - and your department.
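If the example questions above are put to staff as ratings (say 1 = very poorly to 5 = very well), the responses are straightforward to summarise per question. The following sketch is an assumed illustration only; the shortened question labels, the invented responses, and the 3.5 flagging threshold are arbitrary choices, not part of any standard HR instrument.

```python
# Minimal sketch: summarising a 1-5 HR satisfaction questionnaire per question.
# Question labels, responses and the 3.5 threshold are illustrative assumptions.
from statistics import mean

responses = {
    "identify and prioritise my development needs": [4, 3, 5, 2, 4],
    "understand my preferred learning style":        [3, 3, 4, 2, 3],
    "fair and transparent recruitment/promotion":    [5, 4, 4, 5, 4],
    "opportunity to voice grievances in private":    [2, 3, 2, 3, 2],
}

THRESHOLD = 3.5  # arbitrary cut-off for flagging areas needing attention

for question, scores in responses.items():
    avg = mean(scores)
    flag = "REVIEW" if avg < THRESHOLD else "ok"
    print(f"{avg:.1f}  {flag:6s}  {question}")
```

Low-scoring items then become natural candidates for the follow-up group discussions mentioned above.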
Informing Practice Using Evaluation Models and Theories

Instructor: Dr. Melvin M. Mark, Professor of Psychology at the Pennsylvania State University

Description: Evaluators who are not aware of the contemporary and historical aspects of the profession "are doomed to repeat past mistakes and, equally debilitating, will fail to sustain and build on past successes" (Madaus, Scriven and Stufflebeam, 1983). "Evaluation theories are like military strategy and tactics; methods are like military weapons and logistics. The good commander needs to know strategy and tactics to deploy weapons properly or to organize logistics in different situations. The good evaluator needs theories for the same reasons in choosing and deploying methods" (Shadish, Cook and Leviton, 1991). These quotes from Madaus et al. (1983) and Shadish et al. (1991) provide the perfect rationale for why the serious evaluator should be concerned with models and theories of evaluation.

The primary purpose of this class is to overview major streams of evaluation theories (or models), and to consider their implications for practice. Topics include: (1) why evaluation theories matter, (2) frameworks for classifying different theories, (3) in-depth examination of 4-6 major theories, (4) identification of key issues on which evaluation theories and models differ, (5) benefits and risks of relying heavily on any one theory, and (6) tools and skills that can help you in picking and choosing from different theoretical perspectives in planning an evaluation in a specific context. The overarching theme will be practice implications, that is, what difference it would make for practice to follow one theory or another. Theories to be discussed will be ones that have had a significant impact on the evaluation field, that offer perspectives with major implications for practice, and that represent important and different streams of theory and practice. Case examples from the past will be used to illustrate key aspects of each theory's approach to practice. Participants will be asked to use the theories to question their own and others' practices, and to consider what characteristics of evaluations will help increase their potential for use. Each participant will receive Marvin Alkin's text, Evaluation Roots (Sage, 2004), and other materials. The instructor's assumption will be that most people attending the session have some general familiarity with the work of a few evaluation theorists, but that most will not themselves be scholars of evaluation theory. At the same time, the course should be useful whatever one's level of familiarity with evaluation theory.

Applied Measurement for Evaluation

Instructor: Dr. Ann Doucette, TEI Director, Research Professor, Columbian College of Arts and Sciences, George Washington University

Description: Successful evaluation depends on our ability to generate evidence attesting to the feasibility, relevance and/or effectiveness of the interventions, services, or products we study. While theory guides our designs and how we organize our work, it is measurement that provides the evidence we use in making judgments about the quality of what we evaluate. Measurement, whether it results from self-report surveys, interviews/focus groups, observation, document review, or administrative data, must be systematic, replicable, interpretable, reliable, and valid. While hard sciences such as physics and engineering have advanced precise and accurate measurement (i.e., weight, length, mass, volume), the measurement used in evaluation studies is often imprecise and characterized by considerable error. The quality of the inferences made in evaluation studies is directly related to the quality of the measurement on which we base our judgments. Judgments that interventions are ineffective may be flawed - the reflection of measures that are imprecise and not sensitive to the characteristics we chose to evaluate. Evaluation attempts to compensate for imprecise measurement with increasingly sophisticated statistical procedures to manipulate data. The emphasis on statistical analysis all too often obscures the important characteristics of the measures we choose.

This class will cover the following topics:

• Assessing measurement precision: examining the precision of measures in relationship to the degree of accuracy that is needed for what is being evaluated. Issues to be addressed include: measurement/item bias, the sensitivity of measures in terms of developmental and cultural issues, scientific soundness (reliability, validity, error, etc.), and the ability of the measure to detect change over time.
• Quantification: Measurement is essentially assigning numbers to what is observed (direct and inferential). Decisions about how we quantify observations, and the implications these decisions have for using the data resulting from the measures, as well as for the objectivity and certainty we bring to the judgments made in our evaluations, will be examined. This section of the course will focus on the quality of response options and coding categories - do response options/coding categories segment the respondent sample in meaningful and useful ways?
• Issues and considerations - using existing measures versus developing your own: What to look for and how to assess whether existing measures are suitable for your evaluation project will be examined. Issues associated with the development and use of new measures will be addressed in terms of how to establish sound psychometric properties, and what cautionary statements should accompany interpretation of evaluation findings based on these new measures.
• Criteria for choosing measures: assessing the adequacy of measures in terms of the characteristics of measurement - choosing measures that fit your evaluation theory and evaluation focus (exploration, needs assessment, level of implementation, process, impact and outcome). Measurement feasibility, practicability and relevance will be examined. Various measurement techniques will be examined in terms of precision and adequacy, as well as the implications of using screening, broad-range, and peaked tests.
• Error - influences on measurement precision: The characteristics of various measurement techniques, assessment conditions (setting, respondent interest, etc.), and evaluator characteristics will be addressed.

Participants will be provided with a copy of the text Measurement Theory in Action: Case Studies and Exercises, by Shultz, K. S. and Whitney, D. J. (Sage, 2004).
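One routine check on the "scientific soundness (reliability, validity, error, etc.)" mentioned in the first topic is an internal-consistency estimate such as Cronbach's alpha, reported alongside the standard error of measurement. The sketch below is a generic illustration with invented item scores rather than course material; it assumes a small set of numerically scored items answered by the same respondents.

```python
# Illustrative sketch: Cronbach's alpha and standard error of measurement (SEM)
# for a small set of invented item scores (rows = respondents, cols = items).
import numpy as np

scores = np.array([
    [4, 5, 4, 3],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
], dtype=float)

k = scores.shape[1]                         # number of items
item_vars = scores.var(axis=0, ddof=1)      # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
sem = np.sqrt(total_var) * np.sqrt(1 - alpha)  # SEM in total-score units

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Standard error of measurement: {sem:.2f}")
```

A low alpha, or a standard error that is large relative to the decisions being made, is exactly the kind of imprecision the course description warns against.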