• Criteria to Consider when Constructing Good Tests
A. Validity – the degree to which the test measures what it is intended to
measure. It is the usefulness of the test for a given purpose and the most
important criterion of a good examination.
Factors Influencing the Validity of the Tests In General
1. Appropriateness of Test – it should measure the abilities, skills and
information it is supposed to measure.
2. Directions – they should indicate how the learners should answer and
record their answers.
3. Reading Vocabulary and Sentence Structure – these should be based on
the intellectual level of maturity and background experience of the
learners.
4. Difficulty of Items – the items should be neither too difficult nor too
easy, so that the test can discriminate the bright from the slow pupils.
5. Construction of Test Items – the items should not provide clues, so the
test does not become a test on clues, nor be ambiguous, so it does not
become a test on interpretation.
6. Length of the Test – the test should be just long enough to measure what
it is supposed to measure; a test that is too short cannot adequately
measure the performance we want to measure.
7. Arrangement of Items – the items should be arranged in ascending level of
difficulty, starting with the easy ones, so that the pupils will persist in
taking the test.
8. Patterns of Answer – the test should not allow patterns to form in the
answers.
Ways in Establishing Validity
1. Face Validity – is done by examining the physical appearance of the test
2. Content Validity – is done through a careful and critical examination of
the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically such that a set of
scores revealed by a test is correlated with the scores obtained in
another external predictor or measure.
a. Concurrent validity – describes the present status of the individual
by correlating the sets of scores obtained from two measures given
concurrently.
b. Predictive validity – describes the future performance of an
individual by correlating the sets of scores obtained from two
measures given at a longer time interval.
4. Construct Validity – is established statistically by comparing
psychological traits or factors that theoretically influence scores in a test.
a. Convergent Validity – is established if the instrument correlates with a
measure of a similar trait other than the one it is intended to measure,
e.g. a Critical Thinking Test may be correlated with a Creative Thinking Test.
b. Divergent Validity – is established if the instrument describes only
the intended trait and not other traits, e.g. a Critical Thinking Test
may not be correlated with a Reading Comprehension Test.
B. Reliability – it refers to the consistency of scores obtained by the same person
when retested using the same instrument or one that is parallel to it.
Factors Affecting Reliability
1. Length of the Test – as a general rule, the longer the test, the higher the
reliability. A longer test provides a more adequate sample of the behavior
being measured and is less distorted by chance factors like guessing.
2. Difficulty of the Test – ideally, achievement tests should be constructed
such that the average score is 50 percent correct and the scores range from
near zero to perfect. The bigger the spread of the scores, the more reliable
the measured difference is likely to be. A test is reliable if the coefficient
of correlation is not less than 0.85.
3. Objectivity – can be obtained by eliminating the bias, opinions or
judgments of the person who checks the test.
| Method | Type of Reliability Measure | Procedure | Statistical Measure |
|---|---|---|---|
| A. Test-Retest | Measure of stability | Give a test twice to the same group with any time interval between tests, from several minutes to several years. | Pearson r |
| B. Equivalent Forms | Measure of equivalence | Give parallel forms of tests with close time intervals between forms. | Pearson r |
| C. Test-Retest with Equivalent Forms | Measure of stability and equivalence | Give parallel forms of test with increased time intervals between forms. | Pearson r |
| D. Split Half | Measure of internal consistency | Give a test once. Score equivalent halves of the test, e.g. odd- and even-numbered items. | Pearson r & Spearman-Brown Formula |
| E. Kuder-Richardson | Measure of internal consistency | Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. | Kuder-Richardson Formula 20 and 21 |
Formulas for Measures of Correlation Used in Establishing Test Validity & Reliability
Pearson r
$$r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$
Where:
X – scores in a test
Y – scores in a retest
N – number of examinees
Spearman-Brown Formula
$$\text{reliability of the whole test} = \frac{2r_{oe}}{1 + r_{oe}}$$
Where:
$r_{oe}$ – reliability coefficient using the split-half or odd-even procedure
Kuder-Richardson Formula 20
$$KR_{20} = \frac{K}{K-1}\left[1 - \frac{\sum pq}{S^2}\right]$$
Where:
K – no. of items
p – proportion of the examinees who got the item right
q – proportion of the examinees who got the item wrong
$S^2$ – variance or the square of the standard deviation
Kuder-Richardson Formula 21
$$KR_{21} = \frac{K}{K-1}\left[1 - \frac{K\bar{p}\bar{q}}{S^2}\right]$$
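The formulas above can be sketched in code as follows. This is a minimal illustration, not a definitive implementation: the function names and all data are hypothetical, and the variance $S^2$ is assumed to be computed by dividing by N, which the handout does not specify.

```python
import math

def pearson_r(x, y):
    """Pearson r, built term by term from the formula above."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    mean_x2 = sum(a * a for a in x) / n
    mean_y2 = sum(b * b for b in y) / n
    numerator = mean_xy - mean_x * mean_y
    denominator = (math.sqrt(mean_x2 - mean_x ** 2)
                   * math.sqrt(mean_y2 - mean_y ** 2))
    return numerator / denominator

def spearman_brown(r_oe):
    """Whole-test reliability from the odd-even (split-half) coefficient."""
    return 2 * r_oe / (1 + r_oe)

def total_score_stats(item_scores):
    """Mean and variance of total scores (assumption: variance divides by N)."""
    totals = [sum(row) for row in item_scores]
    n = len(totals)
    mean_total = sum(totals) / n
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    return mean_total, variance

def kr20(item_scores):
    """item_scores: one row per examinee, 1 = item right, 0 = item wrong."""
    n = len(item_scores)
    k = len(item_scores[0])
    _, variance = total_score_stats(item_scores)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n  # proportion right
        sum_pq += p * (1 - p)                       # q = 1 - p
    return k / (k - 1) * (1 - sum_pq / variance)

def kr21(item_scores):
    """KR-21 uses only the mean and variance of the total scores."""
    k = len(item_scores[0])
    mean_total, variance = total_score_stats(item_scores)
    p_bar = mean_total / k
    q_bar = 1 - p_bar
    return k / (k - 1) * (1 - k * p_bar * q_bar / variance)

# Hypothetical data, for illustration only.
test_scores = [1, 2, 3, 4, 5]
retest_scores = [2, 4, 6, 8, 10]
r = pearson_r(test_scores, retest_scores)

whole_test = spearman_brown(0.6)

responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
kr20_value = kr20(responses)
kr21_value = kr21(responses)

print(round(r, 4), round(whole_test, 4),
      round(kr20_value, 4), round(kr21_value, 4))
```

In this made-up data the retest scores are exactly double the test scores, so r comes out as a perfect positive correlation.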
Interpretation of the Pearson r correlation value
1 – perfect positive correlation
Between 0.5 and 1 – high positive correlation
0.5 – positive correlation
Between 0 and 0.5 – low positive correlation
0 – zero correlation
Between −0.5 and 0 – low negative correlation
−0.5 – negative correlation
Between −1 and −0.5 – high negative correlation
−1 – perfect negative correlation
C. Administrability – the test should be administered with ease, clarity and
uniformity so that scores obtained are comparable. Uniformity can be obtained
by setting the time limit and oral instructions.
D. Scorability – the test should be easy to score such that directions for scoring
are clear, the scoring key is simple; provisions for answer sheets are made.
E. Economy – the test should be given in the cheapest way, which means that
separate answer sheets should be provided so that the test papers can be
reused from time to time.
F. Adequacy – the test should contain a wide sampling of items to determine the
educational outcomes or abilities so that the resulting scores are
representative of the total performance in the areas measured.
G. Authenticity – the test should simulate real-life situations.
• Shapes of the Frequency Polygons
1. Normal – bell-shaped curve
2. Positively skewed – most scores are below the mean and there are extremely high scores; $\bar{x} > \hat{x}$ (the mean is greater than the mode)
3. Negatively skewed – most scores are above the mean and there are extremely low scores; $\bar{x} < \hat{x}$ (the mean is lower than the mode)
4. Leptokurtic – highly peaked and the tails are more elevated above the baseline
5. Mesokurtic – moderately peaked
6. Platykurtic – flattened peak
7. Bimodal Curve – curve with two peaks or modes
8. Polymodal Curve – curve with three or more modes
9. Rectangular Distribution – there is no mode
• Four Types of Measurement Scales
| Measurement Scale | Characteristics | Example |
|---|---|---|
| 1. Nominal | Groups and labels data | Gender (1-male, 2-female) |
| 2. Ordinal | Ranks data; distances between points are indefinite | Income (1-low, 2-average, 3-high) |
| 3. Interval | Distances between points are equal; no absolute zero point | Test scores and temperature (a score of zero in a test does not mean no knowledge at all) |
| 4. Ratio | All of the above, except that it has an absolute zero point | Height, weight (a zero weight means no weight at all) |
Where, for Kuder-Richardson Formula 21: $\bar{p} = \dfrac{\bar{X}}{K}$; $\bar{q} = 1 - \bar{p}$
Measures of Central Tendency and Variability
| Assumptions When Used | Appropriate Statistical Tool: Measure of Central Tendency (describes the representative value of a set of data) | Appropriate Statistical Tool: Measure of Variability (describes the degree of spread or dispersion of a set of data) |
|---|---|---|
| When the frequency distribution is regular/symmetrical/normal; usually used when the data are numeric (interval or ratio) | Mean – the arithmetic average | Standard Deviation – the root-mean-square of the deviations from the mean |
| When the frequency distribution is irregular/skewed; usually used when the data are ordinal | Median – the middle score in a group of scores that are ranked | Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median |
| When the distribution of scores is normal and a quick answer is needed; usually used when the data are nominal | Mode – the score that occurs most frequently | Range – the difference between the highest and lowest score in a set of observations |
I. Procedure in the Computation of the Measures of Central Tendency
A. Mean
Procedure:
1. Mean of Ungrouped Data: used for few cases (N<30)
a. Get the sum of scores (ΣX)
b. Divide the sum by the number of cases (N)
Formula: 𝑋̅ = ∑ 𝑋/𝑁
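As a quick illustration of steps a and b, here is a minimal sketch; the scores are hypothetical.

```python
# Hypothetical ungrouped scores (few cases, N < 30).
scores = [12, 15, 18, 20, 25]

total = sum(scores)   # step a: sum of scores
n = len(scores)       # number of cases (N)
mean = total / n      # step b: divide the sum by N

print(mean)
```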
2. Mean of Grouped Data: used for many cases (N>30)
Two methods of computing the mean of grouped data are discussed here.
a. Using Midpoint Method
Procedures:
1) Group data in the form of a frequency distribution
2) Compute the midpoints of all class limits (M)
3) Multiply the midpoints by their frequencies (M x F)
4) Get the sum of the products of the midpoints and frequencies (Σ MF)
5) Divide the sum by the number of cases (N)
Formula: $\bar{X} = \dfrac{\sum MF}{N}$
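The midpoint method can be sketched as below. The frequency distribution is hypothetical, and each class is written as (lower limit, upper limit, frequency).

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

# Step 2: midpoint (M) of each class.
midpoints = [(low + high) / 2 for low, high, _ in classes]

# Steps 3 and 4: multiply midpoints by frequencies and sum the products.
sum_mf = sum(m * freq for m, (_, _, freq) in zip(midpoints, classes))

# Step 5: divide by the number of cases (N).
n = sum(freq for _, _, freq in classes)
mean = sum_mf / n

print(sum_mf, n, mean)
```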
b. Using Class Deviation Method
Procedures:
1) Choose your arbitrary starting point or origin from any of the class limits
2) Get the midpoint of the class limit that you have chosen as your starting point. Call this
your assumed mean (AM)
3) Get the deviation (D) of each class limit from the class limit where the assumed mean
is. The deviation of the class limit where the assumed mean is located is 0. Count up by
one (+1, +2, ...) for each successive class limit higher than this point of origin and
down by one (-1, -2, ...) for each successive class limit lower than the origin.
4) Multiply the frequencies by their corresponding deviations (FD)
5) Add the products of the frequencies and deviations (ΣFD)
6) Divide the sum by the number of cases (ΣFD/N)
7) Multiply the quotient by the size of the class interval (i)
8) Add the product to the assumed mean
Formula: $\bar{X} = AM + i\left(\dfrac{\sum FD}{N}\right)$
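The class deviation steps can be sketched as below, using a hypothetical frequency distribution with a class interval of 5 and the class 20-24 chosen arbitrarily as the origin.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
interval = 5          # size of the class interval (i)
origin_index = 2      # arbitrary choice: the class 20-24

# Step 2: midpoint of the chosen class is the assumed mean (AM).
origin_low, origin_high, _ = classes[origin_index]
assumed_mean = (origin_low + origin_high) / 2

# Step 3: deviations are 0 at the origin, +1, +2 above it, -1, -2 below it.
deviations = [index - origin_index for index in range(len(classes))]

# Steps 4 and 5: products FD and their sum.
sum_fd = sum(freq * d for (_, _, freq), d in zip(classes, deviations))

# Steps 6 to 8: AM + i * (sum of FD / N).
n = sum(freq for _, _, freq in classes)
mean = assumed_mean + interval * (sum_fd / n)

print(assumed_mean, sum_fd, mean)
```

For the same data, this agrees with the midpoint method; the choice of origin does not change the result.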
B. Median
1. Median of Ungrouped Data
There are several ways of computing the median for ungrouped data. The process
depends on the case.
Case 1: The total number of cases is an odd number
Procedure:
1.) Arrange the scores from the highest to lowest or vice versa
2.) Get the middlemost score. The score is the median score
Case 2: The total number of cases is an even number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa.
2.) Get the two middlemost scores
3.) Compute the average of the two middlemost scores. The average is the median score.
Case 3: The middlemost score occurs twice, thrice, or more number of times
Procedure:
1.) Get the middlemost score/s, its/their identical score/s and its/their counterparts either
above or below the middlemost score/s
2.) Compute their average and the average score is the median.
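Cases 1 and 2 can be sketched as below. The function name and scores are hypothetical, and Case 3 is not attempted here because its procedure depends on how the tied scores are counted.

```python
def median_ungrouped(scores):
    """Median for Case 1 (odd N) and Case 2 (even N)."""
    ordered = sorted(scores)          # arrange the scores
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # Case 1: the middlemost score
    # Case 2: average of the two middlemost scores
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_case = median_ungrouped([7, 3, 9, 5, 1])
even_case = median_ungrouped([4, 8, 6, 2])

print(odd_case, even_case)
```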
2. Median for Grouped Data
Procedure:
1.) Add up or accumulate the frequencies starting from the lowest to the highest class limit. Call
this the cumulative frequency. (CF)
2.) Find one half of the number of cases in the distribution. (N/2)
3.) Find the cumulative frequency which is equal or closest but higher than the half of the
number of cases. The class containing this frequency is the median class.
4.) Find the lowest limit (LL) of the median class.
5.) Get the cumulative frequency of the class below the median class. (CFb)
6.) Subtract this from the half of the number of cases in the distribution. (N/2 – CFb)
7.) Get the frequency of the median class. (FMdn)
8.) Find the class interval (i) then follow the given formula below.
Formula:
$$\tilde{X} = LL + i\left(\frac{\frac{N}{2} - CF_b}{F_{Mdn}}\right)$$
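The grouped median procedure can be sketched as below. The data are hypothetical, and LL is assumed here to be the exact lower limit of the class (0.5 below the stated limit for whole-number scores); substitute the stated limit if that is the convention in use.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency),
# listed from the lowest class to the highest.
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
interval = 5

n = sum(freq for _, _, freq in classes)
half_n = n / 2                                   # step 2: N/2

cumulative = 0
median = None
for low, high, freq in classes:
    cf_below = cumulative                        # CFb for this class
    cumulative += freq                           # step 1: cumulative frequency
    if cumulative >= half_n:                     # step 3: median class found
        lower_limit = low - 0.5                  # assumption: exact lower limit
        median = lower_limit + interval * (half_n - cf_below) / freq
        break

print(median)
```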
C. Mode
Procedure
1. Mode of Ungrouped Data
• Get the most frequent score
• when there are three or more modes, the distribution is called polymodal or multimodal
• when there is no mode, it is described as a rectangular distribution.
2. Mode for Grouped Data
a. Crude Mode – refers to the midpoint of the class limit with the highest frequency.
Procedure:
1.) Find the class limit with the highest frequency
2.) Get the midpoint of that class limit
3.) The midpoint of the class limit with the highest frequency is the crude mode
Where (in the median formula for grouped data):
LL = lowest limit of the median class
i = class interval
N/2 = half of the number of cases
CFb = cumulative frequency below the median class
FMdn = frequency of the median class
b. Refined Mode – refers to the mode obtained from an ordered arrangement or a class
frequency distribution
Procedure:
1.) Get the mean and the median of the grouped data.
2.) Multiply the median by three (3Mdn)
3.) Multiply the mean by two (2Mn)
4.) Subtract 2Mn from 3Mdn to get the Mode. (Md)
Formula: 𝑋̂ = 3𝑀𝑑𝑛 − 2𝑀𝑛
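Both the crude mode and the refined mode can be sketched as below. The frequency distribution is hypothetical; the mean and median are the values that the midpoint method and the grouped median formula give for this distribution, with the exact lower limit assumed for the median.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

# Crude mode: midpoint of the class with the highest frequency.
modal_low, modal_high, _ = max(classes, key=lambda c: c[2])
crude_mode = (modal_low + modal_high) / 2

# Refined mode: 3(median) - 2(mean), with values for this distribution.
mean = 21.25
median = 21.375
refined_mode = 3 * median - 2 * mean

print(crude_mode, refined_mode)
```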
 How will you interpret the Measures of Central Tendency?
1.) The value that represents a set of data will be the basis in determining whether the group is
performing better or poorer than the other groups.
II. Procedure in the computation of the Measures of Variability
A. Range (R)
1. For Ungrouped Data – the difference between the highest and lowest score
2. For Grouped Data – the difference between the highest limit of the highest class limit and
the lowest limit of the lowest class limit.
B. Standard Deviation (SD)
Procedure for Ungrouped Data
1.) Find the mean. (𝑋̅)
2.) Subtract the mean from each score to get the deviation. [$d = X - \bar{X}$]
3.) Square the deviation. ($d^2$)
4.) Get the sum of the squared deviations. ($\sum d^2$)
5.) Divide the sum by the number of cases minus one. [$\sum d^2 / (N - 1)$]
6.) Get the square root of the answer.
Formula: $SD = \sqrt{\dfrac{\sum d^2}{N-1}}$
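The six steps can be sketched as below, with hypothetical scores and the divisor N − 1 used in the formula above.

```python
import math

# Hypothetical ungrouped scores.
scores = [10, 12, 14, 16, 18]

n = len(scores)
mean = sum(scores) / n                      # step 1
deviations = [x - mean for x in scores]     # step 2: d = X - mean
sum_d2 = sum(d * d for d in deviations)     # steps 3 and 4
sd = math.sqrt(sum_d2 / (n - 1))            # steps 5 and 6

print(mean, sum_d2, round(sd, 4))
```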
Procedure for Grouped Data
A. Using Class Deviation Method
1.) Like what you did in the mean, get the deviation (d) and the product of the frequency and
deviation of each score. (fd)
2.) Multiply the product of the frequency and the deviation by the deviation. ($fd^2$)
3.) Get the sum of the products of the frequency and squared deviation. ($\sum fd^2$)
4.) Compute the standard deviation using the formula below.
Formula: $$SD = I\sqrt{\frac{\sum fd^2}{N} - \frac{\left(\sum fd\right)^2}{N^2}}$$
B. Using Midpoint Method
1.) Square the midpoint ($M^2$) and multiply it by the frequency; this is the same as
multiplying the midpoint (M) by the product of the frequency and the midpoint (FM).
2.) Write the products in another column and label it $FM^2$.
3.) Use the formula below to compute the Standard Deviation.
Formula:
$$SD = \sqrt{\frac{\sum FM^2}{N} - (\bar{X})^2}$$
Where (in the class deviation formula):
I = interval
N = Number of cases
Σfd = sum of the products of frequency and deviation
Σfd² = sum of the products of the frequency and squared deviation
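The two grouped-data methods can be compared in a short sketch. The frequency distribution is hypothetical, with a class interval of 5 and the class 20-24 chosen as the origin; both methods should give the same standard deviation.

```python
import math

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
interval = 5
origin_index = 2  # arbitrary origin: the class 20-24
n = sum(freq for _, _, freq in classes)

# A. Class deviation method.
deviations = [index - origin_index for index in range(len(classes))]
sum_fd = sum(freq * d for (_, _, freq), d in zip(classes, deviations))
sum_fd2 = sum(freq * d * d for (_, _, freq), d in zip(classes, deviations))
sd_deviation = interval * math.sqrt(sum_fd2 / n - (sum_fd ** 2) / (n ** 2))

# B. Midpoint method.
midpoints = [(low + high) / 2 for low, high, _ in classes]
sum_fm = sum(freq * m for (_, _, freq), m in zip(classes, midpoints))
sum_fm2 = sum(freq * m * m for (_, _, freq), m in zip(classes, midpoints))
mean = sum_fm / n
sd_midpoint = math.sqrt(sum_fm2 / n - mean ** 2)

print(round(sd_deviation, 4), round(sd_midpoint, 4))
```

Note that both grouped formulas divide by N, whereas the ungrouped procedure divides by N − 1, so results from the two are not directly interchangeable.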
• How will you interpret the standard deviation?
1.) The results will help you determine if the group is homogeneous or not.
2.) The results will also help you determine the number of students that fall below and above
the average performance.
Study how to do this:
• Mean – 1 SD and Mean + 1 SD give the limits of average ability
• The point right below Mean – 1 SD is the upper limit of below-average ability
• The point right above Mean + 1 SD is the lower limit of above-average ability
C. Quartile Deviation (QD)
1. Procedure in the Computation of QD for Ungrouped Data
1.) Arrange the scores in descending or ascending order
2.) Compute the rank of Q1, i.e. ¼(N); the result tells the rank of the Q1 score in the
ordered arrangement, counted from the bottom.
3.) Look for the score in this rank.
4.) Compute the rank of Q3, i.e. ¾(N); the result tells the rank of the Q3 score.
5.) Look for the Q3 score in this rank.
6.) Compute the QD:
$$QD = \frac{Q_3 - Q_1}{2}$$
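The rank-based procedure can be sketched as below. The function name and scores are hypothetical. The procedure does not say what to do when ¼(N) is not a whole number, so this sketch only accepts a number of cases divisible by 4.

```python
def quartile_deviation(scores):
    """QD from the scores at ranks N/4 and 3N/4, counted from the bottom."""
    ordered = sorted(scores)                 # step 1: ascending order
    n = len(ordered)
    if n % 4 != 0:
        # Assumption of this sketch: fractional ranks are not handled.
        raise ValueError("number of cases must be divisible by 4")
    q1 = ordered[n // 4 - 1]                 # steps 2 and 3 (rank is 1-based)
    q3 = ordered[3 * n // 4 - 1]             # steps 4 and 5
    return (q3 - q1) / 2                     # step 6

qd = quartile_deviation([22, 10, 18, 14, 12, 20, 15, 17])
print(qd)
```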
2. Procedure in the Computation of QD for Grouped Data
1.) Compute the value of the 1st quartile:
$$Q_1 = LL + \left(\frac{\frac{N}{4} - CF_b}{F_q}\right) i$$
2.) Compute the 3rd quartile:
$$Q_3 = LL + \left(\frac{\frac{3N}{4} - CF_b}{F_q}\right) i$$
3.) Compute the quartile deviation (half of the interquartile range):
$$QD = \frac{Q_3 - Q_1}{2}$$
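The grouped quartile computation can be sketched as below, using N/4 and 3N/4 as the target cumulative counts. The data and the helper name `grouped_point` are hypothetical, and LL is assumed to be the exact lower limit of the class (0.5 below the stated limit).

```python
def grouped_point(classes, interval, target):
    """Score below which `target` cases fall, by interpolation in a class."""
    cumulative = 0
    for low, high, freq in classes:
        cf_below = cumulative                # CFb for this class
        cumulative += freq
        if cumulative >= target:
            lower_limit = low - 0.5          # assumption: exact lower limit
            return lower_limit + interval * (target - cf_below) / freq
    raise ValueError("target exceeds the number of cases")

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
interval = 5
n = sum(freq for _, _, freq in classes)

q1 = grouped_point(classes, interval, n / 4)        # 1st quartile
q3 = grouped_point(classes, interval, 3 * n / 4)    # 3rd quartile
qd = (q3 - q1) / 2

print(q1, q3, qd)
```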
• How will you interpret the quartile deviation?
The results will also tell if the group is homogeneous or not. They will also tell
how many of the students fall below or above the region of acceptable
performance. To do this, study the instructions below.
• Median – 1 QD and Median + 1 QD give the limits of average ability
• The point right below Median – 1 QD is the upper limit of below-average
ability
• The point right above Median + 1 QD is the lower limit of above-average ability
STANDARD SCORES
• Indicate the pupil’s relative position by showing how far his raw score is
above or below average
• Express the pupil’s performance in terms of standard units from the mean
• Represented by the normal probability curve or what is commonly called the
normal curve
• Used to have a common unit to compare raw scores from different tests
1. PERCENTILE
• tells the percentage of examinees that lie below one’s score.
Formula: $$P_a = LL + i\left[\frac{a\% \cdot N - CF_b}{F_{P_a}}\right]$$
Where (for the quartile formulas):
Q1 – stands for the 1st quartile
LL – lowest limit
N/4 – one-fourth of the total number of the population
CFb – cumulative frequency below the quartile class
Fq – frequency of the class where the first quartile score falls
i – interval
Where (for the percentile formula):
LL – lowest limit of the class of a% N
CFb – cumulative frequency below the class of a% N
FPa – frequency of the class of a% N
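The percentile formula can be sketched as below. The data and the function name `percentile` are hypothetical, a is given in percent, and LL is assumed to be the exact lower limit of the class (0.5 below the stated limit).

```python
def percentile(classes, interval, a):
    """P_a for grouped data, where `a` is a percentage such as 25 or 90."""
    n = sum(freq for _, _, freq in classes)
    target = a * n / 100                     # a% of N
    cumulative = 0
    for low, high, freq in classes:
        cf_below = cumulative                # CFb for this class
        cumulative += freq
        if cumulative >= target:
            lower_limit = low - 0.5          # assumption: exact lower limit
            return lower_limit + interval * (target - cf_below) / freq
    raise ValueError("a must not exceed 100")

# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

p25 = percentile(classes, 5, 25)
p50 = percentile(classes, 5, 50)
p90 = percentile(classes, 5, 90)

print(p25, p50, p90)
```

With a = 25 and a = 50 the formula reduces to the first quartile and the median, which is a convenient check.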
2. Z-SCORES
• tells the number of standard deviations equivalent to a given raw score
Formula: $Z = \dfrac{X - \bar{X}}{SD}$
Note:
Z-score is negative when $X < \bar{X}$
Z-score is positive when $X > \bar{X}$
3. T-SCORES
• refers to any set of normally distributed standard scores that has a mean of
50 and a standard deviation of 10.
• computed after converting raw scores to z-scores, to get rid of negative values
Formula: $T\text{-score} = 50 + 10(Z)$
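The two conversions can be sketched as below; the raw scores, mean, and standard deviation are hypothetical.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

# Hypothetical class statistics: mean 75, SD 5.
z_above = z_score(85, 75, 5)   # raw score above the mean: positive z
z_below = z_score(70, 75, 5)   # raw score below the mean: negative z

t_above = t_score(z_above)
t_below = t_score(z_below)

print(z_above, t_above, z_below, t_below)
```

The score below the mean has a negative z-score but a positive T-score, which is the reason for the conversion.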
ASSIGNING GRADES/MARKS/RATINGS
A. Marking/Grading - is the process of assigning value to a performance
B. Mark/Grades/Ratings are symbols which:
Could be in –
• Percent such as: 70%, 75%, 80%, etc.
• Letters such as: A, B, C, D, or F
• Numbers such as: 1, 2, 3, 4, or 5
• Descriptive expressions such as:
Outstanding (O),
Very Satisfactory (VS),
Satisfactory (S),
Moderately Satisfactory (MS),
Needs Improvement (NI), etc.
[Note: Any symbol can be used provided that it has uniform meaning to all concerned]
Could represent –
• How a student is performing in relation to other students (Norm-Referenced
Grading)
• The extent to which a student has mastered a particular body of knowledge
(Criterion-Referenced Grading)
• How a student is performing in relation to a teacher’s judgment of his or her
potential (Grading in Relation to Teacher’s Judgment)
Could be for –
• Certification that gives assurance that a student has mastered a specific
content or achieved a certain level of accomplishment.
• Selection that provides a basis for identifying or grouping students for certain
educational paths or programs.
• Direction that provides information for diagnosis and planning.
• Motivation that emphasizes specific material or skills to be learned and
helps students to understand and improve their performance.
Could be based on –
• Examination results or test data
• Observations of student work
• Group evaluation activities
• Class discussions and recitations
• Homework
• Notebooks and note taking
• Reports, themes and research papers
• Discussions and debates
• Portfolios
• Projects
• Attitudes, etc.
Could be assigned by –
• Criterion-referenced grading, or grading based on fixed or absolute
standards, where a grade is assigned based on how well a student has met the
criteria or the well-defined objectives of a course that were spelled out in
advance.
It is then up to the student to earn the grade he or she wants to
receive regardless of how other students in the class have performed. This
is done by transmuting test scores into marks or ratings.
• Norm-referenced grading, or grading based on relative standards, where
a student’s grade reflects his or her level of achievement relative to the
performance of other students in the class.
In this system the grade is assigned based on the average of test
scores. The rating scales that are used in assigning grades are:
1.) The four-point rating scale, which uses the median and quartile deviation
of the test scores to group the scores into four groups; each group is
assigned the corresponding grade of A, B, C, or D, or 1, 2, 3, or 4.
2.) The five-point rating scale, which uses the median and quartile deviation
of the test scores to group the scores into five groups; each group is assigned
the corresponding grade of A, B, C, D, or F, or 1, 2, 3, 4, or 5.
• Point or Percentage Grading System, whereby the teacher assigns
points or percentages to various tests and class activities depending on
their importance. The total of these points will be the basis for the grade
assigned to the student.
• Contract Grading System, where each student agrees to work for a
particular grade according to agreed-upon standards.
• Guidelines in Grading Students
1.) Explain your grading system to the students early in the course and remind them
of the grading policies regularly.
2.) Base grades on a predetermined and reasonable set of standards.
3.) Base your grades on as much objective evidence as possible.
4.) Base grades on the student’s attitude as well as achievement, especially at the
elementary and high school level.
5.) Base grades on the student’s relative standing compared to classmates.
6.) Base grades on a variety of sources.
7.) As a rule, do not change grades.
8.) Become familiar with the grading policy of your school and with your colleagues’
standards.
9.) When failing a student, closely follow school procedures.
10.) Record grades on report cards and cumulative records.
11.) Guard against bias in grading.
12.) Keep pupils informed of their standing in the class.
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 

Kürzlich hochgeladen (20)

BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 

Criteria to consider when constructing good tests

  • 1. Criteria to Consider when Constructing Good Tests
A. Validity – the degree to which a test measures what it is intended to measure. It is the usefulness of the test for a given purpose and the most important criterion of a good examination.
Factors Influencing the Validity of Tests in General
1. Appropriateness of Test – it should measure the abilities, skills and information it is supposed to measure.
2. Directions – they should indicate how the learners should answer and record their answers.
3. Reading Vocabulary and Sentence Structure – they should match the learners' level of intellectual maturity and background experience.
4. Difficulty of Items – the items should be neither too difficult nor too easy, so that the test can discriminate the bright from the slow pupils.
5. Construction of Test Items – the items should not provide clues, so that the test does not become a test on clues, and should not be ambiguous, so that it does not become a test on interpretation.
6. Length of the Test – it should be just long enough to measure what it is supposed to measure, and not so short that it cannot adequately measure the performance we want to measure.
7. Arrangement of Items – the items should be arranged in ascending level of difficulty, starting with the easy ones, so that the pupils are encouraged to continue taking the test.
8. Patterns of Answer – it should not allow patterns to form in the answers to the test.
Ways of Establishing Validity
1. Face Validity – established by examining the physical appearance of the test.
2. Content Validity – established through a careful and critical examination of the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – established statistically by correlating a set of scores from the test with the scores obtained on another external predictor or measure.
a. Concurrent validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently.
b. Predictive validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
4. Construct Validity – established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
a. Convergent Validity – established if the instrument correlates with a measure of a similar trait other than the one it is intended to measure, e.g. a Critical Thinking Test may be correlated with a Creative Thinking Test.
b. Divergent Validity – established if the instrument describes only the intended trait and not other traits, e.g. a Critical Thinking Test may not be correlated with a Reading Comprehension Test.
B. Reliability – the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.
  • 2. Factors Affecting Reliability
1. Length of the Test – as a general rule, the longer the test, the higher the reliability. A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors like guessing.
2. Difficulty of the Test – ideally, achievement tests should be constructed such that the average score is 50 percent correct and the scores range from near zero to perfect. The bigger the spread of the scores, the more reliable the measured differences are likely to be. A test is considered reliable if the coefficient of correlation is not less than 0.85.
3. Objectivity – obtained by eliminating the bias, opinions or judgments of the person who checks the test.

| Method | Type of Reliability Measure | Procedure | Statistical Measure |
|---|---|---|---|
| A. Test-Retest | Measure of stability | Give a test twice to the same group, with any time interval between tests from several minutes to several years. | Pearson r |
| B. Equivalent Forms | Measure of equivalence | Give parallel forms of the test with close time intervals between forms. | Pearson r |
| C. Test-Retest with Equivalent Forms | Measure of stability and equivalence | Give parallel forms of the test with increased time intervals between forms. | Pearson r |
| D. Split Half | Measure of internal consistency | Give a test once. Score equivalent halves of the test, e.g. odd- and even-numbered items. | Pearson r and Spearman-Brown Formula |
| E. Kuder-Richardson | Measure of internal consistency | Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. | Kuder-Richardson Formula 20 and 21 |

Formulas for Measures of Correlation Used in Establishing Test Validity & Reliability
Pearson r
$$r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$
Where: X – scores in a test; Y – scores in a retest; N – number of examinees
Spearman-Brown Formula
$$\text{reliability of the whole test} = \frac{2\,r_{oe}}{1 + r_{oe}}$$
Where: $r_{oe}$ – reliability coefficient using the split-half or odd-even procedure
Kuder-Richardson Formula 20
$$KR_{20} = \frac{K}{K-1}\left[1 - \frac{\sum pq}{S^2}\right]$$
Where: K – number of items; p – proportion of the examinees who got the item right; q – proportion of the examinees who got the item wrong; $S^2$ – variance, or the square of the standard deviation
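The three formulas above can be sketched in Python as follows. This is only an illustration: the function names and the sample item matrix are made up for the example, and $S^2$ is computed here as the variance of the total scores divided by N, which is an assumption because the slide does not say whether N or N - 1 is intended.

```python
import math


def pearson_r(x, y):
    """Pearson r for paired score lists, following the formula above."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    return (mean_xy - mean_x * mean_y) / (math.sqrt(var_x) * math.sqrt(var_y))


def spearman_brown(r_oe):
    """Whole-test reliability from the split-half (odd-even) coefficient."""
    return 2 * r_oe / (1 + r_oe)


def kr20(item_matrix):
    """KR-20 for a matrix with one row per examinee and one column per item.

    Each entry is 1 (item answered right) or 0 (item answered wrong).
    Assumption: S^2 is the variance of total scores with divisor N.
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion right
        sum_pq += p * (1 - p)                       # q = 1 - p
    return (k / (k - 1)) * (1 - sum_pq / variance)


# Hypothetical data: 4 examinees, 3 items.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(answers))  # 0.75
```

For a split-half estimate, one would score the odd and even items separately, pass the two lists of half-scores to `pearson_r`, and then pass the result to `spearman_brown`.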
  • 3. Kuder-Richardson Formula 21
$$KR_{21} = \frac{K}{K-1}\left[1 - \frac{K\,\bar{p}\,\bar{q}}{S^2}\right]$$
Where: $\bar{p} = \bar{X}/K$; $\bar{q} = 1 - \bar{p}$
Interpretation of the Pearson r correlation value
- High positive correlation: from 0.5 (positive correlation) to 1 (perfect positive correlation)
- Low positive correlation: from 0 (zero correlation) to 0.5 (positive correlation)
- Low negative correlation: from -0.5 (negative correlation) to 0 (zero correlation)
- High negative correlation: from -1 (perfect negative correlation) to -0.5 (negative correlation)
C. Administrability – the test should be administered with ease, clarity and uniformity so that the scores obtained are comparable. Uniformity can be obtained by setting the time limit and oral instructions.
D. Scorability – the test should be easy to score: directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
E. Economy – the test should be given in the cheapest way, which means that answer sheets must be provided so that the test can be given from time to time.
F. Adequacy – the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
G. Authenticity – the test should simulate real-life situations.
Shapes of the Frequency Polygons
1. Normal – bell-shaped curve
2. Positively skewed – most scores are below the mean and there are extremely high scores, $\bar{x} > \hat{x}$ (the mean is greater than the mode)
3. Negatively skewed – most scores are above the mean and there are extremely low scores, $\bar{x} < \hat{x}$ (the mean is lower than the mode)
4. Leptokurtic – highly peaked, and the tails are more elevated above the baseline
5. Mesokurtic – moderately peaked
6. Platykurtic – flattened peak
7. Bimodal Curve – curve with two peaks or modes
8. Polymodal Curve – curve with three or more modes
9. Rectangular Distribution – there is no mode
Four Types of Measurement Scales

| Measurement Scale | Characteristics | Example |
|---|---|---|
| 1. Nominal | Groups and labels data | Gender (1 = male, 2 = female) |
| 2. Ordinal | Ranks data; distances between points are indefinite | Income (1 = low, 2 = average, 3 = high) |
| 3. Interval | Distances between points are equal; no absolute zero point | Test scores and temperature (a score of zero in a test does not mean no knowledge at all) |
| 4. Ratio | All of the above, except that it has an absolute zero point | Height, weight (a zero weight means no weight at all) |
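As a rough illustration of the Kuder-Richardson Formula 21 and the interpretation bands for Pearson r given earlier on this slide, the sketch below uses made-up function names. The slide lists 0.5 and -0.5 in two bands each, so placing those boundary values in the "high" bands is an assumption of this example.

```python
def kr21(k, mean, variance):
    """KR-21 from the number of items, the mean score and the score variance."""
    p_bar = mean / k       # average proportion answered right
    q_bar = 1 - p_bar      # average proportion answered wrong
    return (k / (k - 1)) * (1 - (k * p_bar * q_bar) / variance)


def describe_r(r):
    """Verbal label for a Pearson r value.

    Assumption: r = 0.5 and r = -0.5 are counted as 'high'.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "zero correlation"
    if r >= 0.5:
        return "high positive correlation"
    if r > 0:
        return "low positive correlation"
    if r > -0.5:
        return "low negative correlation"
    return "high negative correlation"


# Hypothetical test: 3 items, mean score 1.5, variance 1.25.
print(kr21(3, 1.5, 1.25))
print(describe_r(0.85))
```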
  • 4. Measures of Central Tendency and Variability
Appropriate statistical tools, by the assumptions under which they are used:

| Assumptions When Used | Measure of Central Tendency (describes the representative value of a set of data) | Measure of Variability (describes the degree of spread or dispersion of a set of data) |
|---|---|---|
| When the frequency distribution is regular, symmetrical or normal; usually used when the data are numeric (interval or ratio) | Mean – the arithmetic average | Standard Deviation – the root-mean-square of the deviations from the mean |
| When the frequency distribution is irregular or skewed; usually used when the data are ordinal | Median – the middle score in a group of ranked scores | Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median |
| When the distribution of scores is normal and a quick answer is needed; usually used when the data are nominal | Mode – the score that occurs most frequently | Range – the difference between the highest and lowest scores in a set of observations |

I. Procedure in the Computation of the Measures of Central Tendency
A. Mean
Procedure:
1. Mean of Ungrouped Data: used for few cases (N < 30)
a. Get the sum of the scores (ΣX)
b. Divide the sum by the number of cases (N)
Formula: $\bar{X} = \dfrac{\sum X}{N}$
2. Mean of Grouped Data: used for many cases (N > 30)
Two methods of computing the mean of grouped data are discussed here.
a. Using the Midpoint Method
Procedure:
1) Group the data in the form of a frequency distribution
2) Compute the midpoint of each class (M)
3) Multiply the midpoints by their frequencies (M x F)
4) Get the sum of the products of the midpoints and frequencies (ΣMF)
5) Divide the sum by the number of cases (N)
Formula: $\bar{X} = \dfrac{\sum MF}{N}$
b. Using the Class Deviation Method
Procedure:
1) Choose any of the classes as your arbitrary starting point or origin
2) Get the midpoint of the class that you have chosen as your starting point. Call this your assumed mean (AM)
3) Get the deviation (D) of each class from the class where the assumed mean is located. The deviation of that class is 0. Add one (+1) for each class higher than this point of origin and subtract one (-1) for each class lower than it, so the deviations run ..., -2, -1, 0, +1, +2, ...
4) Multiply the frequencies by their corresponding deviations (FD)
5) Add the products of the frequencies and deviations (ΣFD)
6) Divide the sum by the number of cases (ΣFD/N)
7) Multiply the quotient by the size of the class interval (i)
8) Add the product to the assumed mean
Formula: $\bar{X} = AM + i\left(\dfrac{\sum FD}{N}\right)$
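The two grouped-data procedures for the mean can be checked against each other with a short sketch. The class limits and frequencies below are invented for illustration, the function names are hypothetical, and the code assumes that the classes are listed in ascending order and all have the same width.

```python
def mean_midpoint(classes):
    """Midpoint method. classes: list of (lower, upper, frequency)."""
    n = sum(f for _, _, f in classes)
    sum_mf = sum(((lower + upper) / 2) * f for lower, upper, f in classes)
    return sum_mf / n


def mean_class_deviation(classes, origin_index):
    """Class deviation method with the class at origin_index as the origin."""
    midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
    interval = midpoints[1] - midpoints[0]   # i, assumes equal class widths
    assumed_mean = midpoints[origin_index]   # AM
    n = sum(f for _, _, f in classes)
    # D is 0 at the origin, +1, +2, ... above it and -1, -2, ... below it.
    sum_fd = sum(f * (idx - origin_index)
                 for idx, (_, _, f) in enumerate(classes))
    return assumed_mean + interval * (sum_fd / n)


# Hypothetical frequency distribution: (lower limit, upper limit, frequency).
distribution = [(10, 14, 2), (15, 19, 3), (20, 24, 5)]
print(mean_midpoint(distribution))            # 18.5
print(mean_class_deviation(distribution, 1))  # same value, up to float rounding
```

Whichever class is chosen as the origin, the class deviation method should return the same mean as the midpoint method; it only reduces the size of the numbers handled by hand.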
  • 5. B. Median
Procedure:
1. Median of Ungrouped Data
There are several ways of computing the median for ungrouped data. The process depends on the case.
Case 1: The total number of cases is an odd number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa
2.) Get the middlemost score. This score is the median
Case 2: The total number of cases is an even number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa
2.) Get the two middlemost scores
3.) Compute the average of the two middlemost scores. The average is the median
Case 3: The middlemost score occurs twice, thrice, or more times
Procedure:
1.) Get the middlemost score/s, its/their identical score/s and its/their counterparts either above or below the middlemost score/s
2.) Compute their average. The average score is the median
2. Median for Grouped Data
Procedure:
1.) Add up or accumulate the frequencies starting from the lowest to the highest class. Call this the cumulative frequency (CF)
2.) Find one half of the number of cases in the distribution (N/2)
3.) Find the cumulative frequency that is equal to, or closest to but higher than, half of the number of cases. The class containing this frequency is the median class
4.) Find the lowest limit (LL) of the median class
5.) Get the cumulative frequency of the class below the median class (CFb)
6.) Subtract this from half of the number of cases in the distribution (N/2 – CFb)
7.) Get the frequency of the median class (FMdn)
8.) Find the class interval (i), then use the formula below
Formula: $\tilde{X} = LL + i\left(\dfrac{\frac{N}{2} - CF_b}{F_{Mdn}}\right)$
Where: LL = lowest limit of the median class; i = class interval; N/2 = half of the number of cases; CFb = cumulative frequency below the median class; FMdn = frequency of the median class
C. Mode
Procedure:
1. Mode of Ungrouped Data
- Get the most frequent score
- When there are three or more modes, the distribution is called polymodal or multimodal
- When there is no mode, it is described as a rectangular distribution
2. Mode for Grouped Data
a. Crude Mode – refers to the midpoint of the class with the highest frequency.
Procedure:
1.) Find the class with the highest frequency
2.) Get the midpoint of that class
3.) The midpoint of the class with the highest frequency is the crude mode
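The grouped-data median and the crude mode described on this slide can be sketched as below. The data and function names are hypothetical. The code assumes each class is given by its lowest limit (LL) and frequency, that all classes share the same interval i, and that LL is the exact lower boundary of the class, so the midpoint is LL + i/2.

```python
def median_grouped(classes, interval):
    """Median for grouped data.

    classes: list of (lowest_limit, frequency) in ascending order.
    interval: the class interval i (same for every class).
    """
    n = sum(f for _, f in classes)
    half = n / 2
    cf_below = 0  # CFb: cumulative frequency below the current class
    for lowest_limit, freq in classes:
        # The median class is the first one whose cumulative frequency
        # reaches or exceeds N/2.
        if cf_below + freq >= half:
            return lowest_limit + interval * ((half - cf_below) / freq)
        cf_below += freq
    raise ValueError("empty frequency distribution")


def crude_mode(classes, interval):
    """Midpoint of the class with the highest frequency."""
    lowest_limit, _ = max(classes, key=lambda c: c[1])
    return lowest_limit + interval / 2


# Hypothetical distribution: (lowest limit, frequency), class interval 5.
distribution = [(9.5, 2), (14.5, 3), (19.5, 5)]
print(median_grouped(distribution, 5))  # 19.5
print(crude_mode(distribution, 5))      # 22.0
```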
  • 6. b. Refined Mode – refers to the mode obtained from an ordered arrangement or a class frequency distribution
Procedure:
1.) Get the mean and the median of the grouped data
2.) Multiply the median by three (3Mdn)
3.) Multiply the mean by two (2Mn)
4.) Subtract 2Mn from 3Mdn to get the mode (Md)
Formula: $\hat{X} = 3Mdn - 2Mn$
How will you interpret the Measures of Central Tendency?
1.) The value that represents a set of data is the basis for determining whether the group is performing better or poorer than the other groups.
II. Procedure in the Computation of the Measures of Variability
A. Range (R)
1. For Ungrouped Data – the difference between the highest and lowest scores
2. For Grouped Data – the difference between the highest limit of the highest class and the lowest limit of the lowest class
B. Standard Deviation (SD)
Procedure for Ungrouped Data
1.) Find the mean ($\bar{X}$)
2.) Subtract the mean from each score to get the deviation ($d = X - \bar{X}$)
3.) Square each deviation ($d^2$)
4.) Get the sum of the squared deviations ($\sum d^2$)
5.) Divide the sum by the number of cases minus one ($\sum d^2 / (N - 1)$)
6.) Get the square root of the answer
Formula: $SD = \sqrt{\dfrac{\sum d^2}{N - 1}}$
Procedure for Grouped Data
A. Using the Class Deviation Method
1.) As in the computation of the mean, get the deviation (d) of each class and the product of its frequency and deviation (fd)
2.) Multiply the product of the frequency and the deviation by the deviation ($fd^2$)
3.) Get the sum of the products of the frequency and squared deviation ($\sum fd^2$)
4.) Compute the standard deviation using the formula below
Formula: $SD = I\sqrt{\dfrac{\sum fd^2}{N} - \dfrac{\left(\sum fd\right)^2}{N^2}}$
Where: I = interval; N = number of cases; Σfd = sum of the products of frequency and deviation; Σfd² = sum of the products of the frequency and squared deviation
B. Using the Midpoint Method
1.) Multiply each midpoint by its frequency (FM)
2.) Multiply each FM by its midpoint M, write the products in another column and label it $FM^2$ (that is, F x M²)
3.) Use the formula below to compute the standard deviation
Formula: $SD = \sqrt{\dfrac{\sum FM^2}{N} - \left(\bar{X}\right)^2}$
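The three standard deviation procedures on this slide can be sketched as follows; the function names and data are illustrative only. The ungrouped formula divides by N - 1 while the grouped formulas divide by N, exactly as written on the slide, so the results are not expected to match between ungrouped and grouped data. The grouped functions assume equal class widths and classes listed in ascending order.

```python
import math


def sd_ungrouped(scores):
    """SD of raw scores: square root of the sum of d^2 over (N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    sum_d2 = sum((x - mean) ** 2 for x in scores)  # d = X - mean
    return math.sqrt(sum_d2 / (n - 1))


def sd_class_deviation(frequencies, interval, origin_index):
    """Grouped SD using class deviations d = ..., -1, 0, +1, ..."""
    n = sum(frequencies)
    deviations = [idx - origin_index for idx in range(len(frequencies))]
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(frequencies, deviations))
    return interval * math.sqrt(sum_fd2 / n - (sum_fd ** 2) / (n ** 2))


def sd_midpoint(midpoints, frequencies):
    """Grouped SD using class midpoints M and frequencies F."""
    n = sum(frequencies)
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
    sum_fm2 = sum(f * m * m for m, f in zip(midpoints, frequencies))
    return math.sqrt(sum_fm2 / n - mean ** 2)


# Hypothetical grouped data: midpoints 12, 17, 22 with class interval 5.
freqs = [2, 3, 5]
print(sd_class_deviation(freqs, 5, 1))
print(sd_midpoint([12, 17, 22], freqs))
print(sd_ungrouped([2, 4, 4, 4, 5, 5, 7, 9]))
```

On the same grouped data, the class deviation method and the midpoint method should agree up to floating-point rounding.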
  • 7. How will you interpret the standard deviation?
1.) The results will help you determine whether the group is homogeneous or not.
2.) The results will also help you determine the number of students that fall below and above the average performance.
Study how to do this:
- Mean – 1 SD and Mean + 1 SD give the limits of average ability
- The point right below – 1 SD is the upper limit of below-average ability
- The point right above + 1 SD is the lower limit of above-average ability
C. Quartile Deviation (QD)
1. Procedure in the Computation of QD for Ungrouped Data
1.) Arrange the scores in descending or ascending order
2.) Compute ¼ (N); the result gives the rank of the Q1 score in the ordered arrangement, counting from the bottom
3.) Look for the score in this rank
4.) Compute ¾ (N); the result gives the rank of the Q3 score
5.) Look for the Q3 score in this rank
6.) Compute the QD
$QD = \dfrac{Q_3 - Q_1}{2}$
2. Procedure in the Computation of QD for Grouped Data
1.) Compute the value of the 1st quartile
$Q_1 = LL + \left(\dfrac{\frac{N}{4} - CF_b}{F_q}\right) i$
2.) Compute the value of the 3rd quartile
$Q_3 = LL + \left(\dfrac{\frac{3N}{4} - CF_b}{F_q}\right) i$
3.) Compute the quartile deviation
$QD = \dfrac{Q_3 - Q_1}{2}$
Where: Q1 – the 1st quartile; LL – lowest limit of the quartile class; N/4 – one-fourth of the total number of cases; CFb – cumulative frequency below the quartile class; Fq – frequency of the class where the quartile score falls; i – interval
How will you interpret the quartile deviation?
The results will also tell whether the group is homogeneous or not. They will also tell how many of the students fall below or above the region of acceptable performance. To do this, study the instructions below.
- Median – 1 QD and Median + 1 QD give the limits of average ability
- The point right below – 1 QD is the upper limit of below-average ability
- The point right above + 1 QD is the lower limit of above-average ability
STANDARD SCORES
- Indicate the pupil's relative position by showing how far his raw score is above or below average
- Express the pupil's performance in terms of standard units from the mean
- Represented by the normal probability curve, commonly called the normal curve
- Used to have a common unit to compare raw scores from different tests
1. PERCENTILE
- tells the percentage of examinees that lie below one's score
Formula: $P_a = LL + i\left[\dfrac{aN - CF_b}{F_{P_a}}\right]$
Where: LL – lowest limit of the class of a% N; CFb – cumulative frequency below the class of a% N; FPa – frequency of the class of a% N
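A minimal sketch of the percentile formula for grouped data, with the quartile deviation obtained by taking Q1 and Q3 as the 25th and 75th percentiles. The function names and data are hypothetical, `a` is passed as a fraction (0.25 for 25%), and each class is assumed to be given by its exact lowest limit and frequency with a common interval i.

```python
def percentile_grouped(a, classes, interval):
    """P_a = LL + i * ((a*N - CFb) / F) for grouped data.

    a: fraction between 0 and 1 (e.g. 0.25 for the 25th percentile).
    classes: list of (lowest_limit, frequency) in ascending order.
    """
    n = sum(f for _, f in classes)
    target = a * n
    cf_below = 0  # CFb: cumulative frequency below the current class
    for lowest_limit, freq in classes:
        # The percentile class is the first whose cumulative frequency
        # reaches or exceeds a*N.
        if cf_below + freq >= target:
            return lowest_limit + interval * ((target - cf_below) / freq)
        cf_below += freq
    raise ValueError("empty frequency distribution")


def quartile_deviation_grouped(classes, interval):
    """QD = (Q3 - Q1) / 2, with Q1 = P_0.25 and Q3 = P_0.75."""
    q1 = percentile_grouped(0.25, classes, interval)
    q3 = percentile_grouped(0.75, classes, interval)
    return (q3 - q1) / 2


# Hypothetical distribution: (lowest limit, frequency), class interval 5.
distribution = [(9.5, 2), (14.5, 3), (19.5, 5)]
print(percentile_grouped(0.75, distribution, 5))    # 22.0
print(quartile_deviation_grouped(distribution, 5))
```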
  • 8. 2. Z-SCORES
- tell the number of standard deviations equivalent to a given raw score
Formula: $Z = \dfrac{X - \bar{X}}{SD}$
Note: the Z-score is negative when $X < \bar{X}$ and positive when $X > \bar{X}$
3. T-SCORES
- refer to any set of normally distributed standard scores that has a mean of 50 and a standard deviation of 10
- computed after converting raw scores to z-scores, to get rid of negative values
Formula: $T\text{-score} = 50 + 10(Z)$
ASSIGNING GRADES/MARKS/RATINGS
A. Marking/Grading – the process of assigning value to a performance
B. Marks/Grades/Ratings are symbols which:
Could be in –
- Percent, such as: 70%, 75%, 80%, etc.
- Letters, such as: A, B, C, D, or F
- Numbers, such as: 1, 2, 3, 4, or 5
- Descriptive expressions, such as: Outstanding (O), Very Satisfactory (VS), Satisfactory (S), Moderately Satisfactory (MS), Needs Improvement (NI), etc.
[Note: Any symbol can be used provided that it has a uniform meaning to all concerned]
Could represent –
- How a student is performing in relation to other students (Norm-Referenced Grading)
- The extent to which a student has mastered a particular body of knowledge (Criterion-Referenced Grading)
- How a student is performing in relation to a teacher's judgment of his or her potential (Grading in Relation to Teacher's Judgment)
Could be for –
- Certification, which gives assurance that a student has mastered a specific content or achieved a certain level of accomplishment
- Selection, which provides a basis for identifying or grouping students for certain educational paths or programs
- Direction, which provides information for diagnosis and planning
- Motivation, which emphasizes specific material or skills to be learned and helps students to understand and improve their performance
Could be based on –
- Examination results or test data
- Observations of student work
- Group evaluation activities
- Class discussions and recitations
  • 9.
- Homework
- Notebooks and note taking
- Reports, themes and research papers
- Discussions and debates
- Portfolios
- Projects
- Attitudes, etc.
Could be assigned by –
- Criterion-referenced grading, or grading based on fixed or absolute standards, where the grade is assigned according to how well a student has met the criteria or the well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings.
- Norm-referenced grading, or grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system the grade is assigned based on the average of the test scores. The rating scales used in assigning grades are:
1.) The four-point rating scale, which uses the median and quartile deviation of the test scores to group the scores into four; each group is assigned the corresponding grade of A, B, C, or D, or 1, 2, 3, or 4.
2.) The five-point rating scale, which uses the median and quartile deviation of the test scores to group the scores into five; each group is assigned the corresponding grade of A, B, C, D, or F, or 1, 2, 3, 4, or 5.
- Point or Percentage Grading System, whereby the teacher assigns points or percentages to the various tests and class activities depending on their importance. The total of these points is the basis for the grade assigned to the student.
- Contract Grading System, where each student agrees to work for a particular grade according to agreed-upon standards.
Guidelines in Grading Students
1.) Explain your grading system to the students early in the course and remind them of the grading policies regularly.
2.) Base grades on a predetermined and reasonable set of standards.
3.) Base your grades on as much objective evidence as possible.
4.) Base grades on the student's attitude as well as achievement, especially at the elementary and high school level.
5.) Base grades on the student's relative standing compared to classmates.
6.) Base grades on a variety of sources.
7.) As a rule, do not change grades.
8.) Become familiar with the grading policy of your school and with your colleagues' standards.
9.) When failing a student, closely follow school procedures.
10.) Record grades on report cards and cumulative records.
11.) Guard against bias in grading.
12.) Keep pupils informed of their standing in the class.
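Since norm-referenced marks depend on a pupil's standing relative to the group, the z-score and T-score formulas from the Standard Scores section can be sketched in a few lines. The function names and the sample scores are made up for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """T-score: standard score with mean 50 and standard deviation 10."""
    return 50 + 10 * z


# Hypothetical class: mean 75, SD 5.
z_high = z_score(85, 75, 5)   # 2.0, raw score above the mean
z_low = z_score(70, 75, 5)    # -1.0, raw score below the mean
print(t_score(z_high))        # 70.0
print(t_score(z_low))         # 40.0
```

The second pupil's negative z-score becomes a T-score of 40, which shows how the T-score removes negative values while keeping the pupil's position relative to the mean.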