1. Validity, reliability, objectivity, adequacy, discrimination power, practicability, comparability, utility, and comprehensiveness are key characteristics of a good evaluation instrument.
2. Validity refers to a test accurately measuring what it is intended to measure. Reliability is consistency in a test's measurements. Objectivity means a test's scores are not affected by scorers' biases.
3. Other important characteristics include a test being adequate to measure objectives, able to discriminate levels of performance, practical to administer, allowing comparability of scores, useful for its intended purpose, and comprehensive in assessing objectives.
3. Validity
The validity of a test may be defined as the
accuracy with which the test measures what it
purports to measure.
According to Garrett, “The validity of a test
depends upon the fidelity with which it measures
what it intends to measure.”
Every test is constructed for some specific
purpose, and it is valid only for that purpose.
4. Types of Validity
There are various types of validity
1. Face Validity – It indicates only that a test appears,
on the surface, to measure what the test-maker
intends to measure, not what it actually measures.
This type of validity has very little significance.
2. Content Validity – It refers to the degree to which
a test samples the content area that is to be
measured.
3. Predictive Validity – It refers to the extent to
which a test can predict the future performance of
individuals.
5. 4. Concurrent Validity – It refers to the
relationship between scores on a measuring tool
and a criterion available at the same time.
Concurrent validity differs from predictive validity
only in the time dimension.
5. Construct Validity – It refers to the extent to
which a test reflects the constructs presumed to
underlie test performance, and the extent to
which it is based on theories regarding these
constructs.
6. Methods of Determining
Validity of a Test
The methods of determining the validity of an
achievement test are:
Correlating it with another test
Correlating with teacher rating
Analyzing the test to ensure that due weightage
has been given to content and objectives
Item-analysis
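Two of the methods above (correlating with another test, correlating with teacher ratings) reduce to computing a correlation coefficient. A minimal sketch in Python; the pupil scores and ratings are hypothetical illustrations, not data from the source:

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: 8 pupils' scores on the new achievement test,
# and the same pupils' teacher ratings serving as the criterion.
test_scores     = [62, 71, 55, 80, 68, 90, 47, 75]
teacher_ratings = [6, 7, 5, 8, 7, 9, 4, 8]

# A high coefficient suggests the test agrees with the criterion.
validity_coefficient = pearson(test_scores, teacher_ratings)
print(round(validity_coefficient, 2))
```

The same function, applied to scores on this test and on an already validated test, gives the other correlation-based estimate of validity.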
7. Reliability
Reliability of a test is the consistency with which the
test measures whatever it does measure.
A reliable test is a trustworthy test.
A reliable test should yield essentially the same (or
almost the same) scores when administered a second
time to the same pupils, provided no learning or
forgetting has taken place between the two
testings.
The degree of reliability is usually denoted by a
coefficient of correlation / reliability coefficient
8. Reliability depends on the following factors:
Appropriateness and definiteness of the task.
Consistency, stability, alertness or fatigued state
of the pupil who takes the test.
Consistency and objectivity of the scorer of the
test
9. Method of Determining Reliability
There are several methods to determine the
reliability of a test:
1. Test-Retest Method :- The same test is
administered twice to the same group of
students with a given time interval between the
two administrations of the test. From these two
sets of scores, a correlation is computed to find
the stability.
2. Parallel Form Method :- Used only where two
truly equivalent forms of the test have been
prepared – with the same number of items,
uniformity in content, etc. Scores on the two
forms are then correlated.
10. 3. Split-Half Method :- The test is split into two
halves (for example, odd- and even-numbered
items) and the correlation between the two
half-scores is found.
4. Rational Equivalence Method :- This method
utilizes two forms of tests with interchangeable
corresponding items and inter-item correlations.
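The test-retest and split-half procedures can be sketched as follows. The scores are hypothetical, and the split-half coefficient is stepped up with the Spearman-Brown formula to estimate the reliability of the full-length test:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Test-retest: the same test given twice to the same six pupils.
first_administration  = [40, 55, 33, 60, 47, 52]
second_administration = [42, 53, 35, 61, 45, 50]
test_retest_r = pearson(first_administration, second_administration)

# Split-half: scores of the same pupils on odd- and even-numbered items.
odd_half  = [12, 15, 9, 18, 14, 11]
even_half = [13, 14, 10, 17, 15, 10]
half_r = pearson(odd_half, even_half)

# Spearman-Brown correction: estimate for the full-length test.
full_r = 2 * half_r / (1 + half_r)
print(round(test_retest_r, 2), round(full_r, 2))
```

Note that the split-half coefficient measures only half a test, so the Spearman-Brown step is what makes it comparable with the test-retest estimate.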
11. Objectivity
A test is objective if the scores
assigned by different but equally
competent scorers are not affected by
the judgment, personal opinion or
bias of the scorers.
Objectivity is the opposite of
subjectivity.
Objectivity is a pre-requisite of
reliability and therefore of validity.
12. Objectivity of a test may be increased by:
Using more objective-type test items
Making essay-type test items more exact and
clear
Preparing a marking scheme or scoring key.
Setting realistic standards
Using the average score of two independent
examiners who evaluated the test.
13. Adequacy
Adequacy of a test means sufficiency and
suitability of that test.
A good test should include enough items to
measure the intended objectives and content.
14. Discrimination Power
A test should be able to discriminate the
respondents on the basis of the phenomena
measured.
• Bad items are eliminated and good items are
retained.
• A good item is one that is answered successfully
by about 50% of children.
• A good item must discriminate between superior
children and backward children.
• If an item is answered successfully by all
children, it is a bad item with no
discrimination power.
15. Method of Discrimination Power
1. Divide the group which pilot sampling is applied into two
groups, upper and lower groups.
2. Take the item, say, item number one, and find out how many
of upper group have done it correctly.
3. Find out how many of the lower group have done it correctly
4. Convert the difference in the number of correct responses
from the two groups into an index, using the formula
D.P. = (U − L)/N, where U and L are the number of correct
responses in the upper and lower groups, and N is the
number of cases in each group
5. If an item's discrimination power is very low or very high,
the item is eliminated
6. Items whose discrimination power ranges between 0.4 and 0.8
are included in the test.
Now the test is ready
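The steps above can be sketched in Python; the item responses are hypothetical pilot data (1 = correct, 0 = wrong):

```python
# Hypothetical responses to one item from a pilot sample already
# split into an upper and a lower group of N = 10 pupils each
# (steps 1-3 above).
upper = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1]  # upper group's answers
lower = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # lower group's answers

U, L, N = sum(upper), sum(lower), len(upper)

# Step 4: discrimination power D.P. = (U - L) / N
dp = (U - L) / N

# Step 6: retain only items whose D.P. falls between 0.4 and 0.8.
keep_item = 0.4 <= dp <= 0.8
print(dp, keep_item)
```

Here U = 9 and L = 3, so the item's discrimination power is 0.6 and it is retained.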
16. Practicability
Even if all the criteria for a test are satisfied,
it may still not be possible to conduct it from a
practical point of view. Economy of time, effort,
number of personnel and finance required must
also be of utmost concern to the planner of a
test.
17. Comparability
A test possesses comparability when the scores
obtained by administering it can be interpreted in
terms of a common base that has a natural or
accepted meaning.
Two methods for establishing comparability are:
(a) Making available equivalent forms of a test
(b) Making available adequate norms
18. Utility
A test has utility if it provides the test conditions
that facilitate realization of the purpose for
which it is meant. For achieving utility it is
essential that the test is constructed in the light
of a well-thought-out purpose and that its
interpretations are used in obtaining desirable
results.
19. Objective - Basedness
Evaluation is making a judgement about some
phenomenon or performance on the basis of
some pre-determined objectives.
Therefore a tool meant for evaluation should
measure attainment in terms of criteria
determined by instructional objectives. This is
possible only if the evaluator is definite about the
objectives, the degree of realization of which he
is going to evaluate.
Therefore each item of the tool should represent
an objective.
20. Comprehensiveness
It refers to the degree to which a test contains a
fairly wide sampling of items covering the
objectives or abilities, so that the resulting scores
are representative of the total performance in the
areas measured.