Chapter 1
Overview of assessment:
Context, issues and trends
DEFINITION OF TERMS – test, measurement,
evaluation, and assessment
• A test is a subset of assessment intended to measure a test-taker's language
proficiency, knowledge, performance or skills
• Brown defined a test as a process of quantifying a test-taker’s performance
according to explicit procedures or rules.
• Assessment: the process of observing and measuring learning. It is an ongoing
process in educational practice which involves a multitude of methodological
techniques.
• It can consist of tests, projects, and portfolios.
• Evaluation involves the interpretation of information. When a tester or marker
evaluates, s/he “values” the results in such a way that the worth of the
performance is conveyed to the test-taker.
• Measurement is the assigning of numbers to certain attributes of objects, events,
or people according to a rule-governed system.
The relationship between tests, measurement and assessment
Stages/phases of development of the
examination system in our country
• Pre-Independence
• Razak Report
• Rahman Talib Report
• Cabinet Report
• Malaysia Education Blueprint (2013-2025)
The achievements of Malaysia Examination
Syndicate (MES)
Chapter 2
Role and Purposes of
Assessment In
T & L
Framework
Reasons / Purposes
of Assessment
Assessment of Learning /
Assessment for Learning
Assessment OF Learning
• the use of a task or an activity to measure, record, and report on a
student’s level of achievement with regard to specific learning
expectations.
• This type of assessment is also known as summative assessment.
• provide the focus to improve student achievement, give everyone the
information they need to improve student achievement, and apply
the pressure needed to motivate teachers to work harder to teach
and learn.
AOL
Assessment FOR Learning
• the use of a task or an activity for the purpose of determining student
progress during a unit or block of instruction.
• roughly equivalent to formative assessment: assessment intended to
promote further improvement of student learning during the learning
process.
• commonly known as formative and diagnostic assessments.
• students are provided valuable feedback on their own learning.
Importance of AFL
• reflects a view of learning in which assessment helps students learn better,
rather than just achieve a better mark
• involves assessment activities as part of learning and to inform the
planning of future learning
• includes clear goals for the learning activity
• provides effective feedback that motivates the learner and can lead to
improvement
• reflects a belief that all students can improve
• encourages self-assessment and peer assessment as part of the regular
classroom routines
• involves teachers, students and parents reflecting on evidence
• is inclusive of all learners.
Types of tests
Henning (1987) identifies six kinds of information that tests provide
about students. They are:
o Diagnosis and feedback
o Screening and selection
o Placement
o Program evaluation
o Providing research criteria
o Assessment of attitudes and socio-psychological differences
Type of tests Explanation
1. Proficiency tests - designed to assess the overall language ability of students at varying levels.
- usually developed by external bodies such as examination boards like Educational
Testing Services (ETS) or Cambridge ESOL.
- Standardized
2. Achievement tests - to see what a student has learned with regard to stated course outcomes
- usually administered at mid-and end- point of the semester or academic year.
- generally based on the specific course content or on the course objectives.
- cumulative, covering material drawn from an entire course or semester.
3. Diagnostic tests - seek to identify those language areas in which a student needs further help.
- is crucial for further course activities and providing students with remediation.
- placement tests often serve a dual function of both placement and diagnosis (Harris &
McCann, 1994; Davies et al., 1999).
4. Aptitude tests - designed to measure general ability or capacity to learn a foreign language a priori
(before taking a course) and ultimate predicted success in that undertaking.
- designed to apply to the classroom learning of any language
5. Progress tests - measure the progress that students are making towards defined course or programme
goals.
- administered at various stages throughout a language course to see what the students
have learned
6. Placement tests - designed to assess students’ level of language
ability for placement in an appropriate course or
class.
- indicates the level at which a student will learn
most effectively
- main aim is to create groups, which are
homogeneous in level.
Malaysian Context (KSSR)
• School-Based Assessment
Purpose:
1. to realign the education system from one that focuses on academic
excellence to a more holistic one
2. To ensure a more systematic mastery of knowledge by emphasising the
assessment of each child.
3. To achieve the aspiration of the National Philosophy of Education
towards developing well-rounded learners (JERIS)
4. to reduce exam-oriented learning among learners
5. to evaluate learners’ learning progress
Malaysian context ctd..
SBE features:
• Assessment for and of learning
• Standard-referenced Assessment (Performance Standard)
• *Formative tests which are assessed using Bands 1 to 6, HOTS (Higher
Order Thinking Skills)
• Holistic
• Integrated
SBE Component:
Academic:
• School Assessment (using Performance Standards)
• Centralised Assessment
Non-academic:
• Physical Activities, Sports and Co-curricular Assessment (PAJSK, e.g.
SEGAK)
• Psychometric/Psychological Tests (Aptitude test, Personality test)
SBE Instrument (WHO):
• Teachers
• Rationale:
• - Can continuously monitor their pupils’ growth
• - Can provide constructive feedback to help improve pupils’ learning
abilities
• - Better understand the context and environment most conducive to
assess pupils
• - Appraise and provide feedback based on the Performance Standards
HOW:
Observation, Performance, Project, Product, Hands-on, Written Essays,
Pencil and Paper, Worksheet, Open-ended Discussion, Quizzes,
Checklist, Homework.
Performance Standard:
a set of statements detailing the achievement and mastery of an individual within a
certain discipline, in a specific period of study based on an identified benchmark.
Chapter 3
Basic Testing
Terminology
Framework
Types of Tests
Norm-Referenced
and Criterion-
Referenced
Formative and
Summative
Objective and
Subjective
Norm-Referenced Test
- Definition: a test that measures a student’s achievement as compared to other students in the group. *Designed to yield a normal curve: 50% above and 50% below the mean.
- Purpose: determine performance differences among individuals and groups.
- Test items: arranged from easy to difficult, and able to discriminate examinees’ ability.
- Frequency: continuous assessment in the classroom.
- Appropriateness: summative evaluation.
- Example: public exams: UPSR, PMR, SPM, and STPM.

Criterion-Referenced Test (Mastery tests)
- Definition: an approach that provides information on a student’s mastery based on a criterion specified by the teacher. *Anyone who meets the criterion can get a high score.
- Purpose: determine learning mastery based on a specified criterion and standard.
- Test items: guided by minimum achievement in the related objectives.
- Frequency: continuous assessment.
- Appropriateness: formative evaluation.
- Example: mastery tests: monthly tests, coursework, projects, exercises in the classroom.
Norm-Referenced Test
- Purpose: to rank each pupil with respect to the achievement of others in broad areas of knowledge; to discriminate between high and low achievers; to show how a student’s performance compares to that of other test-takers.
- Content: measures broad skill areas sampled from a variety of textbooks, syllabi, and the judgment of curriculum experts.
- Item characteristics: each skill is usually tested by only a few items; items vary in difficulty; items are selected that discriminate between high and low achievers.

Criterion-Referenced Test
- Purpose: to determine whether each student has achieved specific skills or concepts; to find out how much students know before instruction begins and after it has finished; to classify students according to whether they have met an established standard.
- Content: measures specific skills which make up a designated curriculum; each skill is expressed as an instructional objective.
- Item characteristics: each skill is tested by at least 4 items in order to obtain an adequate sample of pupil performance; guessing is minimised; the items which test any given skill are parallel in difficulty.
SET B: Q1a)
Norm-Referenced test (Normal Curve)
• represents the norm or average performance of a population
and the scores that are above and below the average within
that population.
• include percentile ranks, standard scores, and other statistics
for the norm group on which the test was standardized.
• A certain percentage of the norm group falls within various
ranges along the normal curve.
• Depending on the range within which test scores fall, scores
correspond to various descriptors ranging from deficient to
superior.
• An examinee's test score is compared to that of a norm group
by converting the examinee's raw scores into derived or scale
scores.
• Test makers design the test so that most students will score
near the middle, and only a few will score low (the left side of the
curve) or high (the right side of the curve).
• Scores are usually reported as percentile ranks.
• The scores range from the 1st percentile to the 99th percentile, with
the average student’s score set at the 50th percentile.
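As a worked illustration (not taken from the slides), converting raw scores to percentile ranks under a normal curve can be sketched with Python’s standard library; the norm-group mean and standard deviation below are invented for the example.

```python
from statistics import NormalDist

# Hypothetical norm-group parameters (invented for illustration)
norm = NormalDist(mu=50, sigma=10)

def percentile_rank(raw_score: float) -> float:
    """Convert a raw score to a percentile rank (0-100) under the norm group."""
    return norm.cdf(raw_score) * 100

print(round(percentile_rank(50)))  # 50: the average score sits at the 50th percentile
print(round(percentile_rank(70)))  # 98: two standard deviations above the mean
```

A score at the mean always converts to the 50th percentile, which is exactly the property the normal curve of a norm-referenced test relies on.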
Positive Skew
• Positive skew is when the long
tail is on the positive side of the
peak, and some people say it is
"skewed to the right".
• The mean is on the right of the
peak value.
• the mean is greater than the
mode.
• distribution has scores clustered
to the left, with the tail
extending to the right.
Negative Skew
• The majority of the scores fall toward the upper end.
• The curve is not symmetrical and has more scores at the higher end
of the distribution, which will tend to reduce the reliability of the
test.
• Also called the mastery curve.
Problem:
• Scores are scrunched up around one point and thus
making it difficult to make decisions as many pupils
will be around that same point.
• Skewed distributions will also create problems as they
indicate violations of the assumption of normality that
underlies many of the other statistics that are used to
study test validity. (James Dean Brown, 1997)
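The skew directions described above can be checked numerically. This is a minimal sketch with invented score data, using the moment-based sample skewness: a positive value means the tail extends to the right (mean above the peak), a negative value means scores cluster at the high end (the mastery curve).

```python
from statistics import mean, stdev

def skewness(scores):
    """Sample skewness: third standardized moment (sign shows tail direction)."""
    m, s, n = mean(scores), stdev(scores), len(scores)
    return sum(((x - m) / s) ** 3 for x in scores) * n / ((n - 1) * (n - 2))

# Invented example data
easy_test = [55, 70, 80, 85, 88, 90, 92, 95, 96, 98]  # most scores high
hard_test = [2, 4, 5, 8, 10, 12, 15, 20, 30, 45]      # most scores low

print(skewness(easy_test) < 0)  # True: negative skew, tail to the left
print(skewness(hard_test) > 0)  # True: positive skew, tail to the right
```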
Formative
- Relation to instruction: occurs during instruction.
- Frequency: occurs on an ongoing basis.
- Relation to grading: not graded; information is used as feedback to students and teachers, and mastery is not expected when students are first introduced to a concept.
- Students’ role: active engagement, including self-assessment.
- Requirements for use: clearly defined learning targets that students understand; clearly defined criteria for success that students understand; use of descriptive rather than evaluative feedback.
- Examples: a process: observations, interviews, evidence from work samples, paper-and-pencil tasks.
- Purpose: designed to provide the information needed to adjust teaching and learning.

Summative
- Relation to instruction: occurs after instruction.
- Frequency: occurs at a particular point in time.
- Relation to grading: graded.
- Students’ role: passive engagement in design and monitoring.
- Requirements for use: a well-designed assessment blueprint that outlines the learning targets; well-designed test items using best practices.
- Examples: final assessment.
- Purpose: designed to provide information about the amount of learning that has occurred at a particular point.
Formative Vs Summative
Assessment For Learning (AFL)
- Involves both teachers and students in ongoing dialogue, descriptive feedback, and reflection throughout instruction.
- Elaboration: helps students identify their strengths and weaknesses and target areas that need work; lets teachers recognise where students are struggling and address problems immediately; gains as much information as possible about what the student has achieved, what has not been achieved, and what the student requires to best facilitate further progress; involves students and gives them opportunities to express their understandings.
- Benefit: creates clear expectations.

Assessment Of Learning (AOL)
- Elaboration: evaluates student learning at the end of an instructional unit by comparing it against some standard or benchmark; specific learning outcomes and standards are the reference points; grade levels may be the benchmarks for reporting; rubrics can be given to students before they begin working on a particular project so they know what is expected of them for each of the criteria.
- Benefit: includes different levels of difficulty; makes a judgment of student competency.
Formative examples
- Exit slips: ask students to solve one problem or answer one question on a small piece of paper. Students hand in the slips as “exit tickets” before they pass to their next class, go to lunch, or transition to another activity. The slips give teachers a way to quickly check progress toward skills mastery.
- Graphic organizers: when students complete mind maps or graphic organizers that show relationships between concepts, they are engaging in higher-level thinking. These organizers allow teachers to monitor student thinking about topics and lessons in progress.
- Self-assessments: one way to check for student understanding is to simply ask students to rate their learning. They can use a numerical scale, a thumbs up or down, or even smiley faces to show how confident they feel about their understanding of a topic.
- Think-pair-share: ask a question, give students time to think about it, pair students with a partner, and have students share their ideas. By listening in on the conversations, teachers can check student understanding and address any misconceptions. Students learn from each other when discussing their ideas on a topic.
- Observation: watching how students solve a problem can reveal further information about misunderstandings.
- Discussion: hearing how students reply to their peers can help a teacher better understand a student’s level of understanding.
- Categorizing: let students sort ideas into self-selected categories and ask them to explain why the concepts go together. This gives insight into how students view topics.

Summative examples
- Multiple choice, true/false, matching
- Short answer: fill in the blank; one- or two-sentence response
- Portfolios: portfolios allow students to collect evidence of their learning throughout the unit, quarter, semester, or year, rather than being judged on a number from a test taken one time.
- Projects: projects allow students to synthesize many concepts into one product or process. They require students to address real-world issues, put their learning to use, and demonstrate multiple related skills.
- Performance tasks: performance tasks are like mini-projects. They can be completed in a few hours, yet still require students to show mastery of a broad topic.
Set A Q1b) Benefits of integrating formative
and summative assessment
• The integration of summative assessments with formative practices
can make the assessment process more meaningful for students by
providing regular feedback that supports learning whilst also
contributing towards an overall picture of their learning.
• Integrated assessment practices can also help learners to understand
connections between learning and assessment. Developing students’
active involvement as assessors of their own learning supports them
in life-long learning beyond formal education.
• The integration of assessments facilitates the accumulation of
evidence which can be used for both formative and summative
purposes over time, reducing ‘teaching to the test’.
Objective vs Subjective

Objective items
- Have a single correct response: regardless of who scores a set of responses, an identical score will be obtained.
- Subjective judgments of the scorer do not influence an individual’s score.
- Also known as “selected-response” and “structured-response” items.
- Include multiple-choice, matching and alternative-choice items.
- Assess lower-level skills such as knowledge and comprehension.
- Relatively easy to administer, score and analyse.

Subjective items
- Typically do not have a single correct response.
- Subjective judgments of the scorer are an integral part of the scoring process.
- Also known as “free-response”, “constructed-response” and “supply-type” items.
- Include short-answer and essay items.
- Require students to produce what they know.
- Easy to construct.
5 Basic Terminology in Objective test
1. Receptive or selective response
Items for which the test-taker chooses from a set of responses, commonly called a selected type of response, rather than creating
(supplying) a response.
2. Stem
Every multiple-choice item consists of a stem (the ‘body’ of the item that presents a stimulus). The stem is the question or
assignment in an item. It may be a complete or an open sentence, in positive or negative form. The stem must be short, simple,
compact and clear. However, it must not easily give away the right answer.
3. Options or alternatives
They are known as a list of possible responses to a test item. There are usually between three and five options/alternatives
to choose from.
4. Key
This is the correct response, or the best one. In a good item, the correct answer
is not obvious as compared to the distractors.
5. Distractors
A distractor is an incorrect option (a ‘disturber’) included to draw test-takers away from the correct answer. An excellent distractor
is almost the same as the correct answer, but it is not correct.
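The five terms above map naturally onto a data structure. This hypothetical sketch (the item content is invented) represents one multiple-choice item with its stem, options, key and distractors, and scores a selected response objectively:

```python
from dataclasses import dataclass

@dataclass
class MCQItem:
    stem: str      # the question or stimulus presented to the test-taker
    options: list  # all alternatives shown
    key: str       # the correct (or best) response

    @property
    def distractors(self):
        # Distractors are simply the options that are not the key
        return [o for o in self.options if o != self.key]

    def score(self, response: str) -> int:
        """Objective scoring: 1 mark for the key, 0 otherwise."""
        return 1 if response == self.key else 0

# Invented example item
item = MCQItem(
    stem="A test that compares a pupil against other pupils is ___-referenced.",
    options=["norm", "criterion", "domain", "ipsative"],
    key="norm",
)
print(item.distractors)    # ['criterion', 'domain', 'ipsative']
print(item.score("norm"))  # 1
```

Because scoring is a pure comparison against the key, any scorer (human or machine) produces the identical result, which is what makes such items objective.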
SET B Q1b) Objective tests

Strengths
- Quick grading
- High inter-rater reliability: requires no judgment from the scorer
- Easy to administer, especially for a big group
- Wide coverage of topics in the outlined curriculum
- Precision in testing specific skills

Weaknesses
- Difficult to design; good distractors have to be considered
- Guessing effect: guessing is possible
- Low validity
- Difficult to construct HOTS questions
- Tests the skill rather than the content
General Guidelines for Objective Test items
MCQ Alternate-choice items
i.Design each item to measure a single objective;
ii.State both stem and options as simply and directly as possible;
iii.Make certain that the intended answer is clearly the correct one;
iv.(Optional) Use item indices to accept, discard or revise item.
1.Must have only one correct answer
2. Format the items vertically, not horizontally.
3. Avoid using ‘All of the above”, “None of the above”, or other
special distractors.
4. Use the author’s examples as a basis for developing your
items.
5. Avoid trick items which will mislead or deceive examinees into
answering incorrectly
An alternate-choice test item is a simple declarative sentence, one
portion of which is given with two different wordings.
E.g:
Ali seems to be (a) eager (b) hesitant in making decision to further his
studies.
The examinee's task is to choose the alternative that makes the sentence
most nearly true.
Rate of guessing is high
- It is difficult to write good alternate-choice items that cover all aspects.
Takes a shorter time
- Examiners take a shorter time to evaluate the examinee.
Trick questions are seldom appropriate
- Examiners need to test the examinee directly.
Avoid taking statements directly from the text and placing them out of context.
- This avoids confusion; such items would not test the examinees’ understanding
but their ability to find answers.
Use symbols other than T/F and Y/N
- Examiners could ask the examinee to underline the correct answers.
General guidelines Subjective test items
Short answer Essay items
Short-answer questions are open-ended questions that require students
to create an answer. They are commonly used in examinations to assess
the basic knowledge and understanding (low cognitive levels) of a topic
before more in-depth assessment questions are asked on the topic.
-Design short answer items which are appropriate assessment of the
learning objective
-Make sure the content of the short answer question measures
knowledge appropriate to the desired learning goal
-Express the questions with clear wordings and language which are
appropriate to the student population
-Ensure there is only one clearly correct answer in each question
-Ensure that the item clearly specifies how the question should be
answered
-Write the instructions clearly so as to specify the desired knowledge and
specificity of response
-Set the questions explicitly and precisely.
-Direct questions are better than those which require completing the
sentences.
-Let the students know what your marking style is like, is bullet point
format acceptable, or does it have to be an essay format?
-Prepare a structured marking sheet; allocate marks or part-marks for
acceptable answer(s).
-Do not make the correct answer a “giveaway” word
that could be guessed by students who do not
really know the information.
-In addition, avoid giving grammatical cues or other cues to the
correct answer.
Avoid using statements taken directly from the
curriculum.
-Develop grading criteria that lists all
acceptable answers to the test item. Have subject matter experts
determine the acceptable answers.
-Clearly state questions
not only to make essay tests easier for students to answer,
but also to make the responses easier to evaluate
-Specify and define what mental process you want the students to perform
(e.g., analyze, synthesize, compare, contrast, etc.).
-Do not assume learner is practiced with the process
-Avoid essay questions that require only factual knowledge, such as
questions beginning with interrogative pronouns (who, when, why, where)
-Avoid vague, ambiguous, or non-specific verbs
(consider, examine, discuss, explain) unless you include specific instructions
in developing responses
-Have each student answer all the questions
-Do not offer options for questions
-Structure the question to minimize subjective interpretations
Chapter 4
Basic Principle of
Assessment
SET A SECTION B (1)
SET B Q2 a)
Reliability (Brown)
• Consistent and dependable
- If you give the test to another pupil, or to a matched pupil on 2 different
occasions, it should yield similar results
• Consistent in its conditions across two or more administrations
• Gives clear directions for scoring / evaluation
• Has uniform rubrics for scoring / evaluation
• Lends itself to consistent application of those rubrics by the scorer
• Contains item / tasks that are unambiguous to the test-taker
Factors contributing to UNRELIABILITY of a test
1. Student-related reliability
- Temporary illness, fatigue, a ‘bad day’, anxiety, etc., which make an observed score
deviate from one’s true score.
2. Rater reliability – human error and bias while scoring.
- Inter-rater unreliability happens when 2 or more scorers award inconsistent scores
for the same test.
- Causes: unclear scoring criteria, fatigue, bias, carelessness.
3. Test administration reliability – conditions in which the test is administered
- Noise, room lighting, variation in temperature, condition of tables and chairs
4. Test reliability – the nature of the test itself can cause measurement errors.
- Duration of the test (too long, or tightly timed); poorly written test items, i.e.
ambiguous, generic, or having more than one answer.
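Inter-rater consistency can be checked numerically. As a minimal sketch (the rater scores are invented), one simple measure is the percentage of exact agreement between two raters on the same scripts:

```python
def exact_agreement(rater_a, rater_b):
    """Percentage of scripts on which two raters award the identical score."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same scripts")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Invented band scores (1-6) awarded by two raters to the same ten essays
rater_1 = [4, 5, 3, 6, 2, 4, 5, 3, 4, 6]
rater_2 = [4, 5, 4, 6, 2, 3, 5, 3, 4, 5]
print(exact_agreement(rater_1, rater_2))  # 70.0
```

A low agreement figure would point back to the causes listed above: unclear scoring criteria, fatigue, bias or carelessness.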
Validity
• second characteristic of good tests is validity, which refers to whether the
test is actually measuring what it claims to measure.
• The extent to which inferences made from assessment results are
appropriate, meaningful and useful in terms of the purpose of the
assessment (Gronlund, 1998)
A valid test:
1. Measures exactly what it proposes to measure
2. Does not measure irrelevant or ‘contaminating’ variable
3. Relies as much as possible on empirical evidence (performance)
4. Involves performance that samples the test’s criterion (objective)
5. Offers useful, meaningful information about a test-taker’s ability
6. Is supported by a theoretical rationale or argument
Face Validity: Do the assessment items appear to be appropriate?
• “determined impressionistically; for example by asking students whether the
examination was appropriate to the expectations” (Henning, 1987).
• as the degree to which a test looks right, and appears to measure the knowledge or
abilities it claims to measure, based on the subjective judgement of the examinees who
take it, the administrative personnel who decide on its use, and other psychometrically
unsophisticated observers.
High validity if:
1. Well-constructed,expected format with familiar tasks
2. Clearly doable within the allotted time limit
3. Items that are clear and uncomplicated
4. Directions that are crystal clear
5. Task that relate to their course work
6. A difficulty level that presents a reasonable challenge
Content Validity - Does the assessment content cover what you want to assess?
Have satisfactory samples of language and language skills been selected for testing?
• whether or not the content of the test is sufficiently representative and
comprehensive for the test to be a valid measure of what it is supposed to
measure” (Henning, 1987).
• “If a test samples the subject matter about which conclusions are to be drawn,
and if it requires the test-taker to perform the behaviour that is being measured”
(Mousavi,2002).
• Content validity can be verified through the use of a Table of Test Specification, which:
1. makes sure all content domains are represented in the test, and
2. gives detailed information on each content area: the level of skills, the level of
difficulty, the number of items, and the item representation for each content area,
skill or topic.
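A table of test specification can be sketched as a small data structure that records, per topic, the skill level and number of items, and then checks that every content domain is represented and the counts match the planned paper. All topic names and numbers here are invented for illustration:

```python
# Hypothetical table of test specification for a 20-item paper
blueprint = {
    "Reading: main ideas":   {"level": "comprehension", "items": 6},
    "Grammar: tenses":       {"level": "application",   "items": 8},
    "Vocabulary in context": {"level": "analysis",      "items": 6},
}

total_items = sum(row["items"] for row in blueprint.values())
assert total_items == 20, "item counts must match the planned test length"

# Report each content area's share of the test
for topic, row in blueprint.items():
    share = 100 * row["items"] / total_items
    print(f"{topic}: {row['items']} items ({share:.0f}%), level={row['level']}")
```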
Construct Validity –
Are you measuring what you think you're measuring? Is the test based
on the best available theory of language and language use?
• The extent to which a test measures a theoretical construct or
attribute
• Proficiency, communicative competence, and fluency are examples of
linguistic constructs;
• Self-esteem and motivation are psychological constructs.
Criterion-Related Validity
is usually expressed as a correlation between the test in question and the criterion measure.
-Concurrent (parallel) validity: Can you use the current test score to estimate
scores of other criteria? Does the test correlate with other existing
measures?
• The extent to which procedure correlates with the current behaviour of
subjects
• the use of another more reputable and recognised test to validate one’s
own test.
-Predictive validity: Is it accurate for you to use your existing students’ scores
to predict future students’ scores? Does the test successfully predict future
outcomes?
• The extent to which a procedure allows accurate prediction about a
subject’s future behaviour
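Criterion-related validity is expressed as a correlation coefficient between the test in question and the criterion measure. This sketch (the paired scores are invented) computes Pearson's r from scratch:

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two paired score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented scores: our new test vs. an established, recognised test
new_test = [40, 55, 60, 70, 85]
benchmark = [45, 50, 65, 72, 90]
r = pearson_r(new_test, benchmark)
print(round(r, 2))  # 0.97
```

An r close to 1.0, as here, would support a claim of concurrent validity; for predictive validity the criterion would instead be a future outcome, such as later course grades.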
Consequential Validity
• Encompasses all of the consequences of a test, including considering
its accuracy in measuring intended criteria, its impact in the
preparation of test takers, its effect on the learner, and the social
consequences of a test’s interpretation and use.
Practicality
• Refers to the logistical, administrative issues involved in making, giving, and
scoring an assessment instrument.
• Includes “cost, time to construct and administer, ease of scoring and ease of
reporting the results” (Mousavi, 2009)
Practical test:
1. Stays within budgetary limits
2. Stays within appropriate time constraint
3. Relatively easy to administer
4. Appropriately utilizes available human resources
5. Does not exceed available material resources
6. Has a scoring/evaluation procedure that is specific and time-efficient
Objectivity
• refers to the ability of teachers/examiners who mark the answer
scripts.
• The extent to which different examiners award the same score to the
same answer script.
High objective if:
1. Examiners are able to give the same score to the similar answers
guided by the marking scheme
Objective test = highest objectivity
Subjective test = lowest objectivity
Authenticity
• “the degree of correspondence of the characteristics of a given
language test task to the features of a target language task”
High authenticity if:
1. The language in the test is as natural as possible
2. Items are contextualized
3. Topics are meaningful
4. Some thematic organization to items is provided
5. Tasks correspond to real-world tasks
Washback
• refers to the impact that tests have on teaching and learning
Teacher
- Positive washback: induces teachers to cover their subject more thoroughly; improves teaching strategies; encourages a positive teaching-learning process.
- Negative washback: encourages teachers to build a “teaching to the test” curriculum; teachers may not fulfil the curriculum standard; the teaching of skills may be neglected.

Student
- Positive washback: makes students work harder.
- Negative washback: brings anxiety and distorts performance; leads students to form negative judgments towards tests.

Decision makers
- Positive washback: use the authority of high-stakes testing to achieve goals, e.g. improvement and the introduction of a new curriculum.
- Negative washback: overwhelmingly use tests to promote their political agendas.
Interpretability
• Test should be written in a clear,correct and simple language
• Avoid ambiguous questions and instruction
• Clarity is essential to enable the pupils know exactly what the
examiner wants them to do.
• Difficulty: the test questions should be appropriate in difficulty, neither too
hard nor too easy
• Should be progressive (from easy to difficult) to reduce stress and tension
Chapter 5
Designing Language
Classroom Test
Stages of Test
Construction
Explanation
Determining 1) What it is one wants to know
2) For what purpose
Aspects (questions that need to be answered)
- Examinees
- Kind of test
- Purpose (State)
- Abilities tested
- Accuracy of results
- Importance of backwash effect
- Scope of test
- Constraints set by the unavailability of expertise, facilities, time of construction, administration, and
scoring
Planning 1) Determine the content
Aspect
- Purpose (Describe)
- Characteristics of the test takers, the nature of the population of the examinees for whom the test
is being designed
- A plan for evaluating the qualities of test usefulness (reliability, validity, authenticity, practicality
inter-activeness, and impact)
Planning ctd - Nature of the ability we want measured
- Identify resources
- A plan for allocation and management of resources
- Format and timing
- Criteria
- Levels of performance
- Scoring procedures
Writing Test item writers’ characteristics:
• Experienced in test construction.
• Quite knowledgeable of the content of the test.
• Have the capacity in using language clearly and economically.
• Ready to sacrifice time and energy.
Other aspects:
• Sampling : test constructors choose widely from the whole area of the course content. (Not
including EVERYTHING under course content in 1 version of test)
• Decision regarding content validity and beneficial backwash
You’ve written it well when..
(/) It is a representative sample of the course material
Preparing You have to…
(/) Understand the major principles, techniques and experience
…before preparing test items.
AVOID preparing
• Test items which can be answered through test-wiseness.
Test-wiseness: examinees utilise the characteristics and formats of the test to guess the correct answer
Reviewing Principles for reviewing test items:
• The test should not be reviewed immediately after its construction, but after some considerable
time.
• Other teachers or testers should review it. In a language test, it is preferable if native speakers are
available to review the test.
Pre-testing • The tester should administer the newly developed test to a group of examinees similar to the target
group. PURPOSE: to analyse every individual item as well as the whole test.
• Numerical data (test results) should be collected to check the efficiency of the item, it should include
item facility and discrimination.
Validating • Identify IF
• Item Facility (IF) shows to what extent the item is easy or difficult.
• IF = number of correct responses (Σc) / total number of candidates (N)
• Correspondingly, item difficulty (ID) is the proportion of wrong responses:
ID = (Σw) / N
The results of these equations range from 0 to 1. An item with a facility index of 0 is too
difficult, and one with an index of 1 is too easy. The ideal item has a value of 0.5, and the
acceptable range for item facility is 0.37 to 0.63, i.e. below 0.37 an item is difficult, and
above 0.63 it is easy.
Too easy/Too hard = Low reliability
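The facility and discrimination computations above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the 0/1 response vector, the whole-test totals, and the top/bottom 27% split used for discrimination are all assumptions for the example.

```python
def item_facility(responses):
    """IF = number of correct responses (Sigma c) / total number of candidates (N)."""
    return sum(responses) / len(responses)

def item_discrimination(responses, totals, fraction=0.27):
    """IF among the top scorers minus IF among the bottom scorers (range -1 to 1).

    `totals` are whole-test scores, used to rank candidates; a common
    convention (assumed here) compares the upper and lower 27%.
    """
    ranked = [r for _, r in sorted(zip(totals, responses), reverse=True)]
    n = max(1, round(len(ranked) * fraction))
    return item_facility(ranked[:n]) - item_facility(ranked[-n:])

# 1 = correct, 0 = wrong, for one item across ten candidates (invented data)
item = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]
totals = [48, 45, 44, 20, 40, 18, 39, 15, 10, 37]  # whole-test scores

print(item_facility(item))                 # 0.6 -> inside the 0.37-0.63 range
print(item_discrimination(item, totals))   # 1.0 -> item separates strong from weak
```

An item with facility near 0.5 and high positive discrimination is retained; items outside the acceptable range, or with low discrimination, are revised or discarded during pre-testing.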
Preparing Test Blueprint / Test Specifications
• Test specs = an outline of your test /what it will “look like” + your guiding
plan for designing an instrument that effectively fulfils your desired
principles, especially validity.
• They include the following:
a description of its content
item types (methods, such as multiple-choice, cloze, etc.)
tasks (e.g. written essay, reading a short passage, etc.)
skills to be included
how the test will be scored
how it will be reported to students
What is an item?
• A tool, an instrument, instruction or question used to get feedback
from test-takers
• Evidence of something that is being measured.
• Useful information for consideration in measuring or asserting a
construct measurement.
• Can be classified as recall or thinking items.
• Recall item : an item that requires one to recall information in order to answer.
• Thinking item : an item that requires test-takers to use their thinking
skills to answer.
Sequential steps in designing test specs
• A broad outline of how the test will be organised
• Which of the eight sub-skills you will test
• What the various tasks and item types will be
• How results will be scored, reported to students, and used in future class
(washback)
Remember to…
Know the purpose of the test you are creating
Know as precisely as possible what it is you want to test
Not conduct a test hastily
Examine the objectives for the unit you are testing carefully
Bloom’s Taxonomy (Revised)
• Def : A systematic way of describing how a learner’s performance
develops from simple to complex levels in the affective,
psychomotor and cognitive domains of learning.
The Cognitive Dimension Process
Level 3C - 3
Categories & Cognitive
Processes
Definition
Factual Knowledge The basic elements students must know to be acquainted
with a discipline or solve problems in it
Conceptual Knowledge The interrelationships among the basic elements within a
larger structure that enable them to function together
Procedural Knowledge How to do something, methods of inquiry, and criteria for
using skills, algorithms, techniques, and methods
Metacognitive Knowledge Knowledge of cognition in general as well as awareness
and knowledge of one’s own cognition
The Knowledge Domain
SOLO Taxonomy
• Def : (Structure of the Observed Learning Outcome) a systematic way
of describing how a learner’s performance develops from simple to
complex levels in their learning.
• There are 5 stages, namely :
Prestructural, Unistructural and Multistructural, which are in the quantitative
phase, and Relational and Extended Abstract, which are in the qualitative
phase (Refer Figure 1.0)
• A means of classifying learning outcomes in terms of their complexity,
enabling teachers to assess students’ work in terms of its quality.
Figure 1.0
Functions of SOLO taxonomy
• An integrated strategy, to be used
In lesson design (learning outcomes intended)
In task guidance
In formative and summative assessment
In deconstructing exam questions to understand marks awarded
As a vehicle for self-assessment and peer-assessment
Advantages of SOLO taxonomy
Aspect
Structure of the taxonomy • Encourages viewing learning as an on-going process, moving from simple recall of facts
towards a deeper understanding; that learning is a series of interconnected webs that can
be built upon and extended.
• Consists of a series of cycles (especially between the Unistructural, Multistructural and
Relational levels), which allows for the development of breadth of knowledge as well
as depth.
In turn..
• Creating students who are “self-regulating, self-evaluating learners who were well motivated
by learning.”
SOLO based techniques • Use of constructive alignment encourages teachers to be more explicit when creating
learning objectives, focusing on what the student should be able to do and at which level.
In turn..
• Students will be able to make progress, and it allows for the creation of rubrics, for use in class, to
make the process explicit to the student.
Its HOTS properties • Scaffolds in-depth discussion
In turn..
• Encouraging students to develop interpretations, use research and critical thinking effectively to
develop their own answers, and write essays that engage with the critical conversation of
the field.
• May also be helpful in providing a range of techniques for differentiated learning.
Proponents of the SOLO taxonomy say..
• A model of learning outcomes that helps schools develop a common
understanding.
• A ‘framework for developing the quality of assessment’ and that it is
‘easily communicable to students’.
• Hattie outlines three levels of understanding: surface, deep and
conceptual. He indicates that:
“The most powerful model for understanding these three levels and
integrating them into learning intentions and success criteria is the
SOLO model.”
Critics of the SOLO taxonomy say…
• There is potential to misjudge the level of functioning.
• It has ‘conceptual ambiguity’; that the ‘categorisation’ is ‘unstable’.
• The structure is referred to as a hierarchy, which raises concerns when
complex processes, such as human thought, are categorised in this
manner.
Guidelines for constructing test items
Guideline Elaboration
Aim of test • Developed to precisely measure the objectives prescribed by the blueprint
• Meet quality standards
Range of the topics to be
tested
Measure the test-takers’ ability or proficiency in applying the knowledge and principles on the
topics that they have learnt
Range of skills to be tested • Have cognitive characteristics exemplifying understanding, problem-solving, critical
thinking, analysis, synthesis, evaluation and interpreting, rather than just declarative
knowledge.
• (Bloom’s taxonomy as a tool to use in item writing)
Test format Needs to be a logical and consistent stimulus format
Why?
For test item writers : help expedite the laborious process of writing test items as well as supply
a format for asking basic questions.
For test-takers :
• So that the questioning process in itself does not give unnecessary difficulty to answering
questions
• test takers can quickly read and understand the questions, since the format is expected
International and Cultural
Considerations (bias)
refrain from…
 the use of slang
 geographic references
 historical references or dates (holidays)
…that may not be understood by an international examinee.
Level of difficulty Assure that the test item…
 Has a planned number of questions at each level of difficulty
 Is able to determine mastery and non-mastery performance states
 Weak students could answer easy items
 Intermediate language proficiency students could answer easy and moderate items
 High language proficiency students could answer easy, moderate and advanced test items
 Encompasses all three levels of difficulty
Test format
• Refers to the layout of questions on a test. For example, the format of
a test could be two essay questions, 50 multiple- choice questions,
etc.
*Note : If you wish to know on the outlines of some large-scale
standardised tests, please refer to pages 64 & 65 in the PPG Module
Chapter 6
Assessing Language
Skills Content
Types of test items to assess language skills
Language Skills Elaboration
Listening Two kinds of listening tests:
• Tests that test specific aspects of listening, like sound discrimination
• Task based tests which test skills in accomplishing different types of listening tasks considered
important for the students being tested
Four types of listening performance from which assessment could be considered.
Intensive Listening for perception of the components (phonemes, words, intonation, discourse markers, etc.) of a larger stretch of
language.
Responsive Listening to a relatively short stretch of language ( a greeting, question, command, comprehension check, etc.) in order
to make an equally short response
Selective Processing stretches of discourse such as short monologues for several minutes in order to “scan” for certain
information. For example, to listen for names, numbers, grammatical category, directions (in a map exercise), or certain
facts and events.
Extensive Listening to develop a top-down, global understanding of spoken language. For example, listening to a conversation and
deriving a comprehensive message or purpose, and listening for the gist and making inferences.
Speaking Objective test : tests skills such as …
• Pronunciation
• Knowledge of what language is appropriate in different situations
• Language required in doing different things like describing, giving directions, giving instructions,
etc
Integrative task-based test : involves finding out if pupils can perform different tasks using spoken
language that is appropriate for the purpose and the context.
For example :
• Describing scenes shown in a picture
• Participating in a discussion about a given topic
• Narrating a story, etc.
CATEGORIES FOR ORAL ASSESSMENT (Refer yellow table)
Category Elaboration
Imitative • Ability to imitate a word or phrase or possibly a sentence/ pronunciation
• A number of prosodic (intonation, rhythm, etc.), lexical, and grammatical properties of language may be
included
Intensive • The production of short stretches of oral language designed to demonstrate competence in a narrow band of
grammatical, phrasal, lexical, or phonological relationships.
• Eg :directed response tasks (requests for specific production of speech), reading aloud, sentence and dialogue
completion, limited picture-cued tasks including simple sentences, and translation up to the simple sentence
level.
Responsive • Interaction and test comprehension but at somewhat limited level of very short conversation, standard
greetings, and small talk, simple requests and comments.
• The stimulus is almost always a spoken prompt (to preserve authenticity) with one or two follow-up questions or
retorts
Interactive • Increased length + complexity from responsive.
• May include multiple exchanges and/or multiple participants.
• Two types : (a) transactional language, which has the purpose of exchanging specific information, and (b)
interpersonal exchanges, which have the purpose of maintaining social relationships.
Extensive • Speeches, oral presentations, and storytelling, during which the opportunity for oral interaction from listeners is
either highly limited (perhaps to nonverbal responses) or ruled out altogether.
• Language style is more deliberative (planning is involved)
• May include informal monologue such as casually delivered speech (e.g., recalling a vacation in the mountains,
Reading
Meaning conveyed through reading text
Type Elaboration
Skimming Inspect lengthy passage rapidly
Scanning Locate specific information within a short
period of time
Receptive/ Intensive A form of reading aimed at discovering exactly
what the author seeks to convey
Responsive Respond to some point in a reading text
through writing or by answering questions
Meaning conveyed through reading text
Grammatical meaning Meanings that are expressed through
linguistic structures such as complex and
simple sentences and the correct
interpretation of those structures.
Informational meaning The concept or messages contained in the
text. May be assessed through various means
such as summary and précis writing.
Discourse meaning The perception of rhetorical functions
conveyed by the text.
Writer’s tone The writer’s tone – whether it is cynical,
sarcastic, sad or etc
Writing
Imitative • The ability to spell correctly and to perceive phoneme-grapheme correspondences in the English spelling
system
• The mechanics of writing
• Form is the primary focus while context and meaning are of secondary concern.
Intensive (controlled)
• Producing appropriate vocabulary within a context, collocation and idioms, and correct grammatical features
up to the length of a sentence.
Responsive • Perform at a limited discourse level, connecting sentences into a paragraph and creating a logically connected
sequence of two or three paragraphs.
• Tasks relate to pedagogical directives, lists of criteria, outlines, and other guidelines.
• Eg : brief narratives and descriptions, short reports, lab reports, summaries, brief responses to reading, and
interpretations of charts and graphs.
• Form-focused attention is mostly at the discourse level, with a strong emphasis on context and meaning.
Extensive • Implies successful management of all the processes and strategies of writing for all purposes, up to the length
of eg : an essay,
• Focus is on achieving a purpose, organizing and developing ideas logically, using details to support or
illustrate ideas, demonstrating syntactic and lexical variety and engaging in the process of multiple drafts to
achieve a final product.
• Focus on grammatical form is limited to occasional editing and proofreading of a draft
Brown’s (Assessing Skills)
Skill Type Test item
Listening Intensive Listening • Recognizing phonological and morphological elements
• Paraphrase recognition
Responsive Listening • Responding to a stimulus; conversation, requests
Selective Listening • Listening cloze
• Information transfer
• Sentence repetition
Extensive Listening • Dictation
• Communicative stimulus-response tasks
• Authentic listening tasks
Speaking Intensive Speaking • Directed response tasks
• Read-Aloud tasks
• Sentence/dialogue completion tasks and oral questionnaires
• Picture-cued tasks
Responsive Speaking • Q & A
• Giving instructions and directions
• Paraphrasing
Interactive Speaking • Interview
• Role-play
• Discussions and conversations
• Games
Extensive speaking • Oral presentations
• Picture-cued storytelling
• Retelling a story, news event
Reading Perceptive reading • Reading aloud
• Written response
• Multiple-choice
• Picture-cued items
Selective reading • Matching tasks
• Editing tasks
• Picture-cued tasks
• Gap-filling tasks
Interactive reading • Cloze tasks
• Impromptu reading + comprehension questions
• Short answer tasks
• Editing longer texts
• Scanning
• Ordering tasks
• Information transfer; reading charts, maps, graphs, diagrams
Extensive reading • Skimming tasks
• Summarizing and responding
• Notetaking and outlining
Writing Imitative writing • Writing letters, words and punctuation
• Spelling tasks and detecting phoneme – grapheme correspondences
Intensive (Controlled) writing • Dictation and dicto-comp
• Grammatical transformation tasks
• Picture-cued tasks
• Vocabulary assessment tasks
• Ordering tasks
• Short answer and sentence completion tasks
Writing Responsive and extensive writing • Paraphrasing
• Guided Q & A
• Paragraph constructions tasks
• Strategic options
• Standardized tests of responsive writing
Grammar &
Vocabulary
Selected response • Multiple-choice tasks
• Discrimination tasks
• Noticing tasks or consciousness-raising tasks
Limited production • Gap-filling tasks
• Short-answer tasks
• Dialogue-completion tasks
Extended production • Information gap tasks
• Role-play or simulation tasks
Objective and Subjective Test
Objective test • Tests that are graded objectively
• Include the multiple choice test, true false items
and matching items
• Similar to select type tests where students are
expected to select or choose the answer from a list
of options
Subjective test • Involve subjectivity in grading
• Include essays and short answer questions
• Similar to supply type as the students are expected
to supply the answer through their essay
Subjective + objective • Dictation test, filling in the blank type tests, as well
as interviews and role plays
Type of test : according to how students are
expected to respond
Selected response:
Do not create any language but rather
select the answer from a given list
Constructed response:
Produce language by writing, speaking,
or doing something else
Personal response:
Produce language but also allows each
students’ response to be different from
one another and for students to
“communicate what they want to
communicate”
Selected response : true-false, matching, multiple choice
Constructed response : fill-in, short answer, performance test
Personal response : conferences, portfolios, self and peer assessments
Types of test items to assess language content
Discrete : Language is seen to be made up of smaller units, and it may be
possible to test language by testing each unit at a time.
Integrative : Language is an integrated whole which cannot be broken up
into smaller units or elements.
Communicative test
• Students have to produce the language in an interactive setting involving
some degree of unpredictability, which is typical of any language
interaction situation.
The three principles of communicative tests are :
• involve performance;
• are authentic; and
• are scored on real-life outcomes
Limitation in applying the communicative test
• Issues of practicality, involving especially the amount of time and
extent of organisation to allow for such communicative elements to
emerge.
Advantages in applying the communicative
test
• Have valid language that are purposeful and can stimulate positive
washback in teaching and learning.
Chapter 7
Scoring, grading and
assessment criteria
Scoring approaches
Objective • Relies on quantified methods of evaluating
students’ writing
Holistic • The reader (examiner) reacts to the students’
compositions as a whole and a single score is
awarded to the writing
• Each score on the scale will be accompanied with
general descriptors of ability
• Related : Primary trait scoring
Analytical • Raters assess students’ performance on a variety of
categories which are hypothesised to make up the
skill of writing
Comparison between approaches
Scoring Approach Advantages Disadvantages
Holistic
 Quickly graded
 Provide a public standard that is
understood by the teachers and students
alike
 Relatively higher degree of rater reliability
 Applicable to the assessment of many
different topics
 Emphasise the students’ strengths rather
than their weaknesses.
 The single score may actually mask differences
across individual compositions.
 Does not provide a lot of diagnostic feedback
Analytical
 It provides clear guidelines in grading in the
form of the various components.
 Allows the graders to consciously address
important aspects of writing.
 Writing ability is unnaturally split up into
components.
Objective:
 Emphasises the students’ strengths rather
than their weaknesses.
 Still some degree of subjectivity involved.
 Accentuates negative aspects of the learner’s
writing without giving credit for what they can
do well.
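The holistic/analytical contrast above can be made concrete with a short sketch of analytical scoring: each category is rated separately and the ratings are combined with weights. The category names, weights, and band values below are illustrative assumptions, not a prescribed rubric.

```python
# Illustrative analytical-scoring weights; a real rubric would define its
# own categories, weights, and band descriptors.
ANALYTIC_WEIGHTS = {"content": 0.30, "organisation": 0.20,
                    "vocabulary": 0.20, "grammar": 0.20, "mechanics": 0.10}

def analytic_score(category_scores, weights=ANALYTIC_WEIGHTS):
    """Combine per-category ratings (each on a 0-100 band) into one mark."""
    return sum(weights[c] * s for c, s in category_scores.items())

# One student's per-category ratings (invented data)
essay = {"content": 80, "organisation": 70, "vocabulary": 75,
         "grammar": 60, "mechanics": 90}

print(round(analytic_score(essay), 1))  # 74.0
```

A holistic scorer, by contrast, would assign the single overall mark directly from general band descriptors; the analytical breakdown is what makes the diagnostic feedback (e.g. "grammar is the weak component here") possible.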
Questions you can attempt..
• Describe with examples how holistic and analytical rubrics can be
used to assess Year 6 pupils’ writing based on the following skill
- Write simple factual descriptions of things, events, scenes and what
one saw and did.
- Characteristics of each approach
Chapter 9
Reporting of Assessment
Data
Purposes of reporting
• Main purpose of tests is to obtain information concerning a particular
behaviour or characteristic.
• Evaluate the effectiveness of one’s own teaching or instructional
approach and implement the necessary changes
• Based on information obtained from tests, several different types of
decisions can be made.
Reporting methods
Norm - Referenced Assessment and Reporting Assessing and reporting a student's achievement and
progress in comparison to other students.
Criterion - Referenced Assessment and Reporting Assessing and reporting a student's achievement and
progress in comparison to predetermined criteria.
An outcomes-approach to assessment will provide
information about student achievement to enable
reporting against a standards framework.
An outcomes-approach Acknowledges that students, regardless of their class
or grade, can be working towards syllabus outcomes
anywhere along the learning continuum.
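The two reporting methods above interpret the same raw score differently, which a small sketch can show. The cohort scores and the cut score of 70 are invented for illustration.

```python
def percentile_rank(score, cohort):
    """Norm-referenced: percentage of the cohort scoring below this student."""
    below = sum(1 for s in cohort if s < score)
    return 100 * below / len(cohort)

def mastery(score, cut_score=70):
    """Criterion-referenced: pass/fail against a predetermined criterion."""
    return score >= cut_score

cohort = [45, 52, 58, 63, 70, 71, 76, 82, 88, 94]

print(percentile_rank(76, cohort))  # 60.0 -> scored above 60% of peers
print(mastery(76))                  # True -> meets the criterion
```

Note that the norm-referenced report would change if the cohort changed, while the criterion-referenced report would not; this is why an outcomes approach reports against a fixed standards framework rather than against classmates.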
Principles of effective and informative
assessment and reporting
Has clear, direct links with outcomes
Is integral to teaching and learning
Is balanced, comprehensive and varied
Is valid
Is fair
Engages the learner
Values teacher judgement
Is time efficient and manageable
Recognises individual achievement and progress
Involves a whole school approach
Actively involves parents
Conveys meaningful and useful information
Chapter 10
Issues and Concerns related
to assessment in Malaysian
Primary Schools
Components of PBS
School assessment Refers to written tests that assess subject learning. The test questions and marking
schemes are developed,
administered, scored, and reported by school teachers based on guidance from LP.
Central assessment Refers to written tests, project work, or oral tests (for languages) that assess subject
learning. LP develops the test questions and marking schemes. The tests are, however,
administered and marked by school teachers
Psychometric assessment Refers to aptitude tests and a personality inventory to assess students’ skills, interests,
aptitude, attitude and personality. Aptitude tests are used to assess students’
innate and acquired abilities, for example in thinking and problem solving. The personality
inventory is used to identify key traits and characteristics that make up the students’
personality. LP develops these instruments and provides guidelines for use.
Physical, sports, and co-curricular activities assessment
Refers to assessments of student performance and participation in physical and health
education, sports, uniformed bodies, clubs, and other non-school sponsored activities
Benefits of PBS
• enables students to be assessed on a broader range of output over a
longer period of time.
• Provides teachers with more regular information to take the
appropriate remedial actions for their students.
• Will hopefully reduce the overall emphasis on teaching to test, so that
teachers can focus more time on delivering meaningful learning as
stipulated in the curriculum.
Purpose of measurement and evaluationPurpose of measurement and evaluation
Purpose of measurement and evaluationRochelle Nato
 
Roles of Assessment in Classroom Instruction
Roles of Assessment in Classroom InstructionRoles of Assessment in Classroom Instruction
Roles of Assessment in Classroom InstructionJames Robert Villacorteza
 
Didactic assessment
Didactic assessmentDidactic assessment
Didactic assessmentAsterie83
 
Didactic assessment
Didactic assessmentDidactic assessment
Didactic assessmentAsterie83
 
Learning Assessmentm PPT.pptx
Learning Assessmentm PPT.pptxLearning Assessmentm PPT.pptx
Learning Assessmentm PPT.pptxMJSanchez8
 
Assessment of learning
Assessment of learningAssessment of learning
Assessment of learningKendral Flores
 
MEASUREMENT ASSESSMENT evaluation 1.pptx
MEASUREMENT ASSESSMENT evaluation 1.pptxMEASUREMENT ASSESSMENT evaluation 1.pptx
MEASUREMENT ASSESSMENT evaluation 1.pptxSajan Ks
 

Ähnlich wie La notes (1 7 & 9) (20)

Penaksiran Akademik
Penaksiran AkademikPenaksiran Akademik
Penaksiran Akademik
 
Assessment
AssessmentAssessment
Assessment
 
Assessment
AssessmentAssessment
Assessment
 
Evaluation of educational programs in nursing
Evaluation of educational programs in nursingEvaluation of educational programs in nursing
Evaluation of educational programs in nursing
 
Concept of classroom assessment by Dr. Shazia Zamir
Concept of classroom assessment by Dr. Shazia ZamirConcept of classroom assessment by Dr. Shazia Zamir
Concept of classroom assessment by Dr. Shazia Zamir
 
Basics of assessment
Basics of assessmentBasics of assessment
Basics of assessment
 
Assessment of student learning 1
Assessment of student learning 1Assessment of student learning 1
Assessment of student learning 1
 
Testing and evaluation
Testing and evaluationTesting and evaluation
Testing and evaluation
 
measurement assessment and evaluation
measurement assessment and evaluationmeasurement assessment and evaluation
measurement assessment and evaluation
 
ASSESSMENT.pptx
ASSESSMENT.pptxASSESSMENT.pptx
ASSESSMENT.pptx
 
Principles of language assessment.pptx
Principles of language assessment.pptxPrinciples of language assessment.pptx
Principles of language assessment.pptx
 
Purpose of measurement and evaluation
Purpose of measurement and evaluationPurpose of measurement and evaluation
Purpose of measurement and evaluation
 
AT & DT by Laxman Kumar R
AT & DT by Laxman Kumar RAT & DT by Laxman Kumar R
AT & DT by Laxman Kumar R
 
Roles of Assessment in Classroom Instruction
Roles of Assessment in Classroom InstructionRoles of Assessment in Classroom Instruction
Roles of Assessment in Classroom Instruction
 
Didactic assessment
Didactic assessmentDidactic assessment
Didactic assessment
 
Didactic assessment
Didactic assessmentDidactic assessment
Didactic assessment
 
Learning Assessmentm PPT.pptx
Learning Assessmentm PPT.pptxLearning Assessmentm PPT.pptx
Learning Assessmentm PPT.pptx
 
Assessment of learning
Assessment of learningAssessment of learning
Assessment of learning
 
MEASUREMENT ASSESSMENT evaluation 1.pptx
MEASUREMENT ASSESSMENT evaluation 1.pptxMEASUREMENT ASSESSMENT evaluation 1.pptx
MEASUREMENT ASSESSMENT evaluation 1.pptx
 
Evalution
Evalution Evalution
Evalution
 

Kürzlich hochgeladen

Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 

Kürzlich hochgeladen (20)

Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 

La notes (1 7 & 9)

  • 1. Chapter 1 Overview of assessment: Context, issues and trends
  • 2. DEFINITION OF TERMS – test, measurement, evaluation, and assessment • A test is a subset of assessment intended to measure a test-taker's language proficiency, knowledge, performance or skills. • Brown defined a test as a process of quantifying a test-taker’s performance according to explicit procedures or rules. • Assessment: the process of observing and measuring learning. It is an ongoing process in educational practice, which involves a multitude of methodological techniques. • It can consist of tests, projects, and portfolios. • Evaluation involves the interpretation of information. When a tester or marker evaluates, s/he “values” the results in such a way that the worth of the performance is conveyed to the test-taker. • Measurement is the assigning of numbers to certain attributes of objects, events, or people according to a rule-governed system.
  • 3. The relationship between tests, measurement and assessment
  • 4. Four stages/phases of development of the examination system in our country • Pre-Independence • Razak Report • Rahman Talib Report • Cabinet Report • Malaysia Education Blueprint (2013-2025)
  • 5. The achievements of Malaysia Examination Syndicate (MES)
  • 6. Chapter 2 Role and Purposes of Assessment In T & L
  • 7. Framework Reasons / Purposes of Assessment Assessment of Learning / Assessment for Learning
  • 8. Assessment OF Learning • the use of a task or an activity to measure, record, and report on a student’s level of achievement with regard to specific learning expectations. • This type of assessment is also known as summative assessment. • provides the focus to improve student achievement, gives everyone the information they need to improve student achievement, and applies the pressure needed to motivate teachers to work harder to teach and learn.
  • 9. AOL
  • 10. Assessment FOR Learning • the use of a task or an activity for the purpose of determining student progress during a unit or block of instruction. • is roughly equivalent to formative assessment – assessment intended to promote further improvement of student learning during the learning process. • commonly known as formative and diagnostic assessment. • students are provided valuable feedback on their own learning.
  • 11. Importance of AFL • reflects a view of learning in which assessment helps students learn better, rather than just achieve a better mark • involves assessment activities as part of learning and to inform the planning of future learning • includes clear goals for the learning activity • provides effective feedback that motivates the learner and can lead to improvement • reflects a belief that all students can improve • encourages self-assessment and peer assessment as part of the regular classroom routines • involves teachers, students and parents reflecting on evidence • is inclusive of all learners.
  • 12. Types of tests Henning (1987) identifies six kinds of information that tests provide about students. They are: o Diagnosis and feedback o Screening and selection o Placement o Program evaluation o Providing research criteria o Assessment of attitudes and socio-psychological differences
  • 13. Types of tests and explanations:
1. Proficiency tests – designed to assess the overall language ability of students at varying levels; usually developed by external bodies such as examination boards like Educational Testing Service (ETS) or Cambridge ESOL; standardized.
2. Achievement tests – used to see what a student has learned with regard to stated course outcomes; usually administered at the mid-point and end-point of the semester or academic year; generally based on the specific course content or on the course objectives; cumulative, covering material drawn from an entire course or semester.
3. Diagnostic tests – seek to identify those language areas in which a student needs further help; crucial for further course activities and for providing students with remediation; placement tests often serve a dual function of both placement and diagnosis (Harris & McCann, 1994; Davies et al., 1999).
4. Aptitude tests – designed to measure general ability or capacity to learn a foreign language a priori (before taking a course) and ultimate predicted success in that undertaking; designed to apply to the classroom learning of any language.
5. Progress tests – measure the progress that students are making towards defined course or programme goals; administered at various stages throughout a language course to see what the students have learned.
  • 14. 6. Placement tests – designed to assess students’ level of language ability for placement in an appropriate course or class; indicate the level at which a student will learn most effectively; the main aim is to create groups which are homogeneous in level.
  • 15. Malaysian Context (KSSR) • School-Based Assessment Purpose: 1. to realign the education system from one that focuses on academic excellence to a more holistic one 2. to ensure a more systematic mastery of knowledge by emphasising assessment of each child 3. to achieve the aspiration of the National Philosophy of Education towards developing well-rounded learners (JERIS) 4. to reduce exam-oriented learning among learners 5. to evaluate learners’ learning progress
  • 16. Malaysian context ctd.. SBE features: • Assessment for and of learning • Standard-referenced Assessment (Performance Standard) • *Formative tests which are assessed using Bands 1 to 6, HOTS (Higher Order Thinking Skills) • Holistic • Integrated
  • 17. SBE Component: Academic: • School Assessment (using Performance Standards) • Centralised Assessment Non-academic: • Physical Activities, Sports and Co-curricular Assessment (PAJSK : eg; SEGAK) • Psychometric/Psychological Tests (Aptitude test, Personality test)
  • 18. SBE Instrument (WHO): • Teachers • Rationale: • - Can continuously monitor their pupils’ growth • - Can provide constructive feedback to help improve pupils’ learning abilities • - Better understand the context and environment most conducive to assessing pupils • - Appraise and provide feedback based on the Performance Standards HOW: Observation, Performance, Project, Product, Hands-on, Written Essays, Pencil and Paper, Worksheet, Open-ended discussion, Quizzes, Checklist, Homework.
  • 19. Performance Standard: a set of statements detailing the achievement and mastery of an individual within a certain discipline, in a specific period of study, based on an identified benchmark.
  • 21. Framework Types of Tests Norm-Referenced and Criterion- Referenced Formative and Summative Objective and Subjective
  • 22. Norm-Referenced Test (NRT) vs Criterion-Referenced Test (CRT, mastery tests):
Definition – NRT: a test that measures a student’s achievement as compared to other students in the group; designed to yield a normal curve, with 50% above and 50% below the mean. CRT: an approach that provides information on a student’s mastery based on a criterion specified by the teacher; anyone who meets the criterion can get a high score.
Purpose – NRT: determine performance differences among individuals and groups. CRT: determine learning mastery based on a specified criterion and standard.
Test items – NRT: ordered from easy to difficult and able to discriminate between examinees’ abilities. CRT: guided by minimum achievement in the related objectives.
Frequency – both involve continuous assessment in the classroom.
Appropriateness and examples – NRT: summative evaluation; public exams such as UPSR, PMR, SPM and STPM. CRT: formative evaluation; mastery tests such as monthly tests, coursework, projects and classroom exercises.
  • 23. Norm-Referenced Test vs Criterion-Referenced Test (SET B: Q1a)
Purpose – NRT: to rank each pupil with respect to the achievement of others in broad areas of knowledge; to discriminate between high and low achievers; to show how a student’s performance compares to that of other test-takers. CRT: to determine whether each student has achieved specific skills or concepts; to find out how much students know before instruction begins and after it has finished; to classify students according to whether they have met an established standard.
Content – NRT: measures broad skill areas sampled from a variety of textbooks, the syllabus, and the judgment of curriculum experts. CRT: measures specific skills which make up a designated curriculum; each skill is expressed as an instructional objective.
Item characteristics – NRT: each skill is usually tested by relatively few items; items vary in difficulty; items are selected that discriminate between high and low achievers. CRT: each skill is tested by at least 4 items in order to obtain an adequate sample of pupil performance and minimize guessing; the items which test any given skill are parallel in difficulty.
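The contrast above can be sketched in code: the same raw scores yield different reports depending on whether they are interpreted against the group (norm-referenced) or against a fixed criterion (criterion-referenced). The pupil names and the 70% mastery threshold below are illustrative assumptions, not taken from the notes.

```python
# Hypothetical class scores (assumed data).
scores = {"Aina": 58, "Ben": 72, "Chong": 85, "Dina": 64, "Emil": 91}

# Norm-referenced interpretation: rank each pupil against the group.
ranked = sorted(scores, key=scores.get, reverse=True)
norm_report = {name: f"rank {i + 1} of {len(ranked)}"
               for i, name in enumerate(ranked)}

# Criterion-referenced interpretation: compare each pupil to a fixed
# criterion specified by the teacher (here, an assumed 70% threshold).
CRITERION = 70
criterion_report = {name: ("mastered" if s >= CRITERION else "not yet")
                    for name, s in scores.items()}

print(norm_report["Ben"])       # rank 3 of 5
print(criterion_report["Ben"])  # mastered
```

Note that under the criterion-referenced view, every pupil could in principle be reported "mastered", whereas ranks are zero-sum by construction.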
  • 24. Norm-Referenced test (Normal Curve) • represents the norm or average performance of a population and the scores that are above and below the average within that population. • includes percentile ranks, standard scores, and other statistics for the norm group on which the test was standardized. • A certain percentage of the norm group falls within various ranges along the normal curve. • Depending on the range within which test scores fall, scores correspond to various descriptors ranging from deficient to superior. • An examinee's test score is compared to that of a norm group by converting the examinee's raw scores into derived or scale scores. • Test-makers design the test so that most students will score near the middle, and only a few will score low (the left side of the curve) or high (the right side of the curve). • Scores are usually reported as percentile ranks. • The scores range from the 1st percentile to the 99th percentile, with the average student's score set at the 50th percentile.
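The conversion from a raw score to a percentile rank described above can be sketched as follows: standardize the raw score into a z-score, then look up the area under the normal curve below it. The norm-group mean and standard deviation used here are illustrative assumptions.

```python
from math import erf, sqrt

def percentile_rank(raw, mean, sd):
    """Convert a raw score to a percentile rank under a normal curve,
    using the standard-normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    z = (raw - mean) / sd          # derived (standard) score
    return round(100 * (1 + erf(z / sqrt(2))) / 2)

# Assumed norm-group parameters: mean 50, standard deviation 10.
print(percentile_rank(50, mean=50, sd=10))  # 50 -> the average score
print(percentile_rank(60, mean=50, sd=10))  # 84 -> one SD above the mean
```

This matches the slide's point that the average score sits at the 50th percentile, while scores one standard deviation above the mean fall near the 84th.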
  • 25. Positive Skew • Positive skew is when the long tail is on the positive side of the peak, and some people say it is "skewed to the right". • The mean is on the right of the peak value. • the mean is greater than the mode. • distribution has scores clustered to the left, with the tail extending to the right.
  • 26. Negative Skew • The majority of the scores fall toward the upper end. • The curve is not symmetrical and has more scores at the higher end of the distribution, which will tend to reduce the reliability of the test. • Also called the mastery curve. Problem: • Scores are scrunched up around one point, making it difficult to make decisions, as many pupils will be around that same point. • Skewed distributions also create problems because they indicate violations of the assumption of normality that underlies many of the other statistics used to study test validity (James Dean Brown, 1997).
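The direction of skew on slides 25–26 can be checked numerically with the standard moment coefficient of skewness: a negative value means the tail extends to the left (mastery curve), a positive value means it extends to the right. The two score lists below are hypothetical data invented for illustration.

```python
def skewness(xs):
    """Fisher-Pearson moment coefficient of skewness:
    g1 = m3 / m2**1.5, where m_k is the k-th central moment."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Hypothetical test scores (assumed data):
easy_test = [55, 70, 80, 85, 88, 90, 92, 94, 95, 96]  # clustered high
hard_test = [100 - x for x in easy_test]              # mirror image

print(skewness(easy_test) < 0)  # True -> negative skew, tail to the left
print(skewness(hard_test) > 0)  # True -> positive skew, tail to the right
```

A test that was too easy for the group produces the negatively skewed "mastery curve"; its mirror image, a test that was too hard, produces positive skew.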
  • 27. Characteristics of formative vs summative assessment:
Relation to instruction – Formative: occurs during instruction. Summative: occurs after instruction.
Frequency – Formative: occurs on an ongoing basis. Summative: occurs at a particular point in time to determine what students know.
Relation to grading – Formative: not graded; information is used as feedback to students and teachers, and mastery is not expected when students are first introduced to a concept. Summative: graded.
Students’ role – Formative: active engagement, including self-assessment. Summative: passive engagement in design and monitoring.
Requirements for use – Formative: clearly defined learning targets that students understand; clearly defined criteria for success that students understand; use of descriptive rather than evaluative feedback. Summative: a well-designed assessment blueprint that outlines the learning targets; well-designed test items using best practices.
Examples – Formative: a process; observations, interviews, evidence from work samples, paper-and-pencil tasks. Summative: final assessment.
Purpose – Formative: designed to provide information needed to adjust teaching and learning. Summative: designed to provide information about the amount of learning that has occurred at a particular point.
  • 28. Formative vs Summative: AFL or AOL?
Assessment FOR Learning – involves both teachers and students in ongoing dialogue, descriptive feedback, and reflection throughout instruction. It helps students identify their strengths and weaknesses and target areas that need work; lets teachers recognize where students are struggling and address problems immediately; and gains as much information as possible about what the student has achieved, what has not been achieved, and what the student requires to best facilitate further progress. It involves students and gives them opportunities to express their understandings. Benefit: creates clear expectations.
Assessment OF Learning – evaluates student learning at the end of an instructional unit by comparing it against some standard or benchmark. Specific learning outcomes and standards are reference points; grade levels may be the benchmarks for reporting. Rubrics can be given to students before they begin working on a particular project so they know what is expected of them for each of the criteria. Benefit: includes different levels of difficulty and makes a judgment of student competency.
  • 29. Examples of formative and summative assessment:
Formative examples –
Exit slips: ask students to solve one problem or answer one question on a small piece of paper. Students hand in the slips as “exit tickets” before passing to their next class, going to lunch, or transitioning to another activity. The slips give teachers a way to quickly check progress toward skills mastery.
Graphic organizers: when students complete mind maps or graphic organizers that show relationships between concepts, they’re engaging in higher-level thinking. These organizers allow teachers to monitor student thinking about topics and lessons in progress.
Self-assessments: one way to check for student understanding is to simply ask students to rate their learning. They can use a numerical scale, a thumbs up or down, or even smiley faces to show how confident they feel about their understanding of a topic.
Think-pair-share: ask a question, give students time to think about it, pair students with a partner, and have students share their ideas. By listening in on the conversations, teachers can check student understanding and assess any misconceptions. Students learn from each other when discussing their ideas on a topic.
Observation: watching how students solve a problem can reveal further information about misunderstandings.
Discussion: hearing how students reply to their peers can help a teacher better understand a student’s level of understanding.
Categorizing: let students sort ideas into self-selected categories, and ask them to explain why such concepts go together. This gives insight into how students view topics.
Summative examples –
Multiple choice, true/false, matching, short answer, fill in the blank, one- or two-sentence responses.
Portfolios: portfolios allow students to collect evidence of their learning throughout the unit, quarter, semester, or year, rather than being judged on a number from a test taken one time.
Projects: projects allow students to synthesize many concepts into one product or process. They require students to address real-world issues and put their learning to use to solve or demonstrate multiple related skills.
Performance tasks: performance tasks are like mini-projects. They can be completed in a few hours, yet still require students to show mastery of a broad topic.
  • 30. Set A Q1b) Benefits of integrating formative and summative assessment • The integration of summative assessments with formative practices can make the assessment process more meaningful for students by providing regular feedback that supports learning whilst also contributing towards an overall picture of their learning. • Integrated assessment practices can also help learners to understand connections between learning and assessment. Developing students’ active involvement as assessors of their own learning supports them in life-long learning beyond formal education. • The integration of assessments facilitates the accumulation of evidence which can be used for both formative and summative purposes over time, reducing ‘teaching to the test’.
  • 31. Objective vs Subjective items:
Objective: items with a single correct response; regardless of who scores a set of responses, an identical score will be obtained. Subjective: items that typically do not have a single correct response.
Objective: the subjective judgment of the scorer does not influence an individual’s score. Subjective: subjective judgments of the scorer are an integral part of the scoring process.
Objective: also known as “selected-response” and “structured-response” items. Subjective: also known as “free-response”, “constructed-response” and “supply-type” items.
Objective: include multiple-choice, matching and alternative-choice items. Subjective: include short-answer and essay items.
Objective: assess lower-level skills such as knowledge and comprehension. Subjective: require students to produce what they know.
Objective: relatively easy to administer, score and analyse. Subjective: easy to construct.
  • 32. 5 Basic Terminology in Objective tests 1. Receptive or selective response – items for which the test-taker chooses from a set of given responses rather than creating one (in contrast to a supply type of response, in which the test-taker constructs the answer). 2. Stem – every multiple-choice item consists of a stem (the ‘body’ of the item that presents a stimulus). The stem is the question or assignment in an item. It is in a complete or open, positive or negative sentence form. A stem must be short, simple, compact and clear; however, it must not easily give away the right answer. 3. Options or alternatives – the list of possible responses to a test item. There are usually between three and five options/alternatives to choose from. 4. Key – the correct response. The response can be either the correct or the best one. In a good item, the correct answer is not obvious compared with the distractors. 5. Distractors – a ‘disturber’ that is included to distract students from selecting the correct answer. An excellent distractor is almost the same as the correct answer, but it is not.
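The terminology above maps naturally onto a small data structure: a stem, a list of options, and a key, with every non-key option acting as a distractor. The sample item and field names below are illustrative assumptions, and the one-line scorer shows why objective items yield identical scores regardless of who marks them.

```python
# A minimal sketch of a multiple-choice item (hypothetical example).
item = {
    "stem": "The correct response in a multiple-choice item is called the ___.",
    "options": ["stem", "key", "distractor", "alternative"],
    "key": "key",  # the correct response
}

# Every option that is not the key acts as a distractor.
distractors = [o for o in item["options"] if o != item["key"]]

def score(response):
    """Objective scoring: 1 if the selected option matches the key, else 0.
    No scorer judgment is involved, so inter-rater reliability is perfect."""
    return int(response == item["key"])

print(distractors)   # ['stem', 'distractor', 'alternative']
print(score("key"))  # 1
print(score("stem")) # 0
```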
  • 33. Set B Q1b) Objective tests
Strengths:
- quick grading
- high inter-rater reliability; requires no judgement from the scorer
- easy to administer, especially to a big group
- wide coverage of topics in the outlined curriculum
- precision in testing specific skills
Weaknesses:
- difficult to design; good distractors must be considered
- guessing has a considerable effect on scores
- low validity
- difficult to construct HOTS questions
- tests the skill rather than the content
  • 34. General guidelines for objective test items
Multiple-choice questions:
i. Design each item to measure a single objective.
ii. State both stem and options as simply and directly as possible.
iii. Make certain that the intended answer is clearly the one correct one.
iv. (Optional) Use item indices to accept, discard or revise items.
1. Each item must have only one correct answer.
2. Format the items vertically, not horizontally.
3. Avoid using 'All of the above', 'None of the above', or other special distractors.
4. Use the author's examples as a basis for developing your items.
5. Avoid trick items which will mislead or deceive examinees into answering incorrectly.
Alternate-choice items:
An alternate-choice test item is a simple declarative sentence, one portion of which is given with two different wordings, e.g. 'Ali seems to be (a) eager (b) hesitant in making a decision to further his studies.' The examinee's task is to choose the alternative that makes the sentence most nearly true.
- The rate of guessing is high; it is difficult to write good alternate choices that cover all aspects.
- Takes a shorter time: examiners take less time to evaluate the examinee.
- Trick questions are seldom appropriate: examiners need to test the examinee directly.
- Avoid taking statements directly from the text and placing them out of context; this causes confusion and tests the examinee's ability to find answers rather than their understanding.
- Use symbols other than T/F or Y/N; for example, examiners could have the examinee underline the correct answers.
  • 35. General guidelines for subjective test items (short-answer and essay)
Short-answer questions are open-ended questions that require students to create an answer. They are commonly used in examinations to assess basic knowledge and understanding (low cognitive levels) of a topic before more in-depth assessment questions are asked on the topic.
- Design short-answer items that are an appropriate assessment of the learning objective.
- Make sure the content of the short-answer question measures knowledge appropriate to the desired learning goal.
- Express the questions in clear wording and language appropriate to the student population.
- Ensure there is only one clearly correct answer to each question.
- Ensure that the item clearly specifies how the question should be answered.
- Write the instructions clearly so as to specify the desired knowledge and specificity of response.
- Set the questions explicitly and precisely; direct questions are better than those which require completing a sentence.
- Let the students know what your marking style is like: is bullet-point format acceptable, or does it have to be essay format?
- Prepare a structured marking sheet; allocate marks or part-marks for acceptable answer(s).
- Do not make the correct answer a 'giveaway' word that could be guessed by students who do not really know the information.
- In addition, avoid giving grammatical or other cues to the correct answer, and avoid using statements taken directly from the curriculum.
- Develop grading criteria that list all acceptable answers to the test item; have subject-matter experts determine the acceptable answers.
- Clearly state questions, not only to make essay tests easier for students to answer but also to make the responses easier to evaluate.
- Specify and define what mental process you want the students to perform (e.g., analyse, synthesise, compare, contrast); do not assume the learner is practised in the process.
- Avoid writing essay questions that require only factual knowledge, such as those beginning with interrogative pronouns (who, when, why, where).
- Avoid vague, ambiguous, or non-specific verbs (consider, examine, discuss, explain) unless you include specific instructions for developing responses.
- Have each student answer all the questions; do not offer optional questions.
- Structure the question to minimise subjective interpretation.
  • 36. Chapter 4 Basic Principles of Assessment (Set A Section B (1); Set B Q2a)
  • 37. Reliability (Brown) • Consistent and dependable: if the same test is given to the same pupil or a matched pupil on two different occasions, it should yield a similar result. A reliable test: • is consistent in its conditions across two or more administrations • gives clear directions for scoring/evaluation • has uniform rubrics for scoring/evaluation • lends itself to consistent application of those rubrics by the scorer • contains items/tasks that are unambiguous to the test-taker
  • 38. Factors contributing to the UNRELIABILITY of a test 1. Student-related reliability: temporary illness, fatigue, a 'bad day', anxiety, etc., which make an observed score deviate from one's true score. 2. Rater reliability: human error and bias while scoring. Inter-rater unreliability occurs when two or more scorers award inconsistent scores for the same test, owing to unclear scoring criteria, fatigue, bias or carelessness. 3. Test administration reliability: the conditions in which the test is administered, such as noise, room lighting, variation in temperature, and the condition of tables and chairs. 4. Test reliability: the nature of the test itself can cause measurement errors, e.g. the duration of the test (too long, tightly timed) or poorly written test items that are ambiguous, generic, or have more than one answer.
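The rater-reliability point above can be illustrated with a quick computation. The Python sketch below uses exact agreement (the proportion of answer scripts on which two raters award the same mark) as one simple index of inter-rater consistency; the rater names and scores are invented, and the choice of exact agreement is an illustrative assumption, not a method prescribed by the module.

```python
# A minimal sketch of checking inter-rater consistency: the fraction of
# scripts on which two raters award exactly the same score.
# All data below is hypothetical.

def exact_agreement(rater_a, rater_b):
    """Fraction of scripts scored identically by both raters."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

rater_1 = [5, 4, 3, 5, 2, 4]   # scores awarded to six answer scripts
rater_2 = [5, 3, 3, 5, 2, 5]   # a second rater, same six scripts

print(exact_agreement(rater_1, rater_2))  # 4 of 6 scripts agree
```

A low agreement value would point back to the causes listed on the slide: unclear scoring criteria, fatigue, bias or carelessness.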
  • 39. Validity • The second characteristic of a good test is validity, which refers to whether the test is actually measuring what it claims to measure. • The extent to which inferences made from assessment results are appropriate, meaningful and useful in terms of the purpose of the assessment (Gronlund, 1998). A valid test: 1. Measures exactly what it proposes to measure 2. Does not measure irrelevant or 'contaminating' variables 3. Relies as much as possible on empirical evidence (performance) 4. Involves performance that samples the test's criterion (objective) 5. Offers useful, meaningful information about a test-taker's ability 6. Is supported by a theoretical rationale or argument
  • 40. Face Validity: Do the assessment items appear to be appropriate? • 'determined impressionistically; for example by asking students whether the examination was appropriate to their expectations' (Henning, 1987). • The degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgement of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers. Face validity is high if the test has: 1. A well-constructed, expected format with familiar tasks 2. Tasks that are clearly doable within the allotted time limit 3. Items that are clear and uncomplicated 4. Directions that are crystal clear 5. Tasks that relate to the test-takers' course work 6. A difficulty level that presents a reasonable challenge
  • 41. Content Validity: Does the assessment content cover what you want to assess? Have satisfactory samples of language and language skills been selected for testing? • 'whether or not the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure' (Henning, 1987). • 'If a test samples the subject matter about which conclusions are to be drawn, and if it requires the test-taker to perform the behaviour that is being measured' (Mousavi, 2002). • Content validity can be verified through a Table of Test Specifications, which: 1. makes sure all content domains are represented in the test; 2. gives detailed information on each content area; 3. specifies the level of skills; 4. indicates the level of difficulty; 5. states the number of items and the item representation for each content area, skill or topic.
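As an illustration of the guideline above, a Table of Test Specifications can be kept as plain data and sanity-checked before item writing. Everything in this Python sketch (topics, cognitive levels, item counts) is hypothetical.

```python
# A test blueprint (Table of Test Specifications) as plain data.
# Topics, skill levels and item counts are invented for illustration.

blueprint = {
    # topic: {cognitive level: number of items}
    "Reading comprehension": {"remember": 4, "apply": 4, "analyse": 2},
    "Grammar":               {"remember": 6, "apply": 3, "analyse": 1},
    "Vocabulary":            {"remember": 5, "apply": 3, "analyse": 2},
}

total_items = sum(n for levels in blueprint.values() for n in levels.values())
print("Total items:", total_items)

# Check that every content domain is represented (guideline 1 on the slide)
assert all(sum(levels.values()) > 0 for levels in blueprint.values())
```

Keeping the blueprint as data makes it easy to verify item representation per topic and per skill level before any item is written.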
  • 42. Construct Validity – Are you measuring what you think you're measuring? Is the test based on the best available theory of language and language use? • The extent to which a test measures a theoretical construct or attribute • Proficiency, communicative competence, and fluency are examples of linguistic constructs; • Self-esteem and motivation are psychological constructs.
  • 43. Criterion-Related Validity is usually expressed as a correlation between the test in question and the criterion measure. -Concurrent (parallel) validity: Can you use the current test score to estimate scores of other criteria? Does the test correlate with other existing measures? • The extent to which procedure correlates with the current behaviour of subjects • the use of another more reputable and recognised test to validate one’s own test. -Predictive validity: Is it accurate for you to use your existing students’ scores to predict future students’ scores? Does the test successfully predict future outcomes? • The extent to which a procedure allows accurate prediction about a subject’s future behaviour
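The correlation mentioned above can be made concrete. This Python sketch computes a Pearson product-moment correlation between scores on a new test and an established criterion measure; both score lists and the choice of Pearson's r are illustrative assumptions, not data from the module.

```python
# Concurrent validity is usually expressed as a correlation between the
# test in question and a recognised criterion measure. Scores invented.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

new_test  = [55, 62, 70, 48, 81, 66]   # scores on the test being validated
criterion = [58, 60, 75, 50, 85, 64]   # scores on a more reputable test

r = pearson_r(new_test, criterion)
print(round(r, 2))  # a high positive r supports concurrent validity
```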
  • 44. Consequential Validity • Encompasses all of the consequences of a test, including considering its accuracy in measuring intended criteria, its impact in the preparation of test takers, its effect on the learner, and the social consequences of a test’s interpretation and use.
  • 45. Practicality • Refers to the logistical and administrative issues involved in making, giving, and scoring an assessment instrument. • Includes 'cost, time to construct and administer, ease of scoring and ease of reporting the results' (Mousavi, 2009). A practical test: 1. Stays within budgetary limits 2. Stays within appropriate time constraints 3. Is relatively easy to administer 4. Appropriately utilises available human resources 5. Does not exceed available material resources 6. Has a scoring/evaluation procedure that is specific and time-efficient
  • 46. Objectivity • Refers to the consistency of the teachers/examiners who mark the answer scripts. • The extent to which different examiners award the same score to the same answer script. Objectivity is high if: 1. Examiners are able to give the same score to similar answers, guided by the marking scheme. Objective tests = highest objectivity; subjective tests = lowest objectivity.
  • 47. Authenticity • 'The degree of correspondence of the characteristics of a given language test task to the features of a target language task.' Authenticity is high if: 1. The language in the test is as natural as possible 2. Items are contextualised 3. Topics are meaningful 4. Some thematic organisation to items is provided 5. Tasks correspond to real-world tasks
  • 48. Washback • Refers to the impact that tests have on teaching and learning.
Teacher. Positive washback: induces teachers to cover their subject more thoroughly; improves teaching strategies; encourages a positive teaching and learning process. Negative washback: encourages teachers to create a 'teaching to the test' curriculum; teachers may not fulfil the curriculum standard; the teaching of skills may be neglected.
Student. Positive washback: makes students work harder. Negative washback: brings anxiety and distorts performance; leads students to form negative judgements about tests.
Decision makers. Positive washback: use the authority of high-stakes testing to achieve goals, to improve education and to introduce new curricula. Negative washback: overwhelmingly use tests to promote their political agendas and seize
  • 49. Interpretability • Tests should be written in clear, correct and simple language • Avoid ambiguous questions and instructions • Clarity is essential so that pupils know exactly what the examiner wants them to do • Difficulty: test questions should be appropriate in difficulty, neither too hard nor too easy • Items should be progressive (easier items first) to reduce stress and tension
  • 51. Stages of Test Construction
Determining: 1) What it is one wants to know 2) For what purpose. Questions that need answering: the examinees; the kind of test; the purpose (state it); the abilities tested; the accuracy of results; the importance of the backwash effect; the scope of the test; constraints set by the unavailability of expertise, facilities, and time for construction, administration, and scoring.
Planning: 1) Determine the content. Aspects: the purpose (describe it); the characteristics of the test-takers, i.e. the nature of the population of examinees for whom the test is being designed; a plan for evaluating the qualities of test usefulness (reliability, validity, authenticity, practicality, interactiveness, and impact).
  • 52. Stages of Test Construction
Planning (continued): the nature of the ability to be measured; identify resources; a plan for the allocation and management of resources; format and timing; criteria; levels of performance; scoring procedures.
Writing: test item writers should be experienced in test construction, knowledgeable about the content of the test, able to use language clearly and economically, and ready to sacrifice time and energy. Other aspects: Sampling, in which test constructors choose widely from the whole area of the course content (not including EVERYTHING under the course content in one version of the test); and decisions regarding content validity and beneficial backwash. You have written it well when it is a representative sample of the course material.
  • 53. Stages of Test Construction
Preparing: you have to understand the major principles and techniques, and gain experience, before preparing test items. AVOID preparing test items that can be answered through test-wiseness (examinees exploiting the characteristics and formats of the test to guess the correct answer).
Reviewing: the test should not be reviewed immediately after its construction, but after some considerable time; other teachers or testers should review it. For a language test, it is preferable if native speakers are available to review the test.
Pre-testing: the tester should administer the newly developed test to a group of examinees similar to the target group. PURPOSE: analyse every individual item as well as the whole test. Numerical data (test results) should be collected to check the efficiency of the items, including item facility and discrimination.
  • 54. Stages of Test Construction
Validating: identify the Item Facility (IF), which shows to what extent an item is easy or difficult. IF = number of correct responses (Σc) / total number of candidates (N); the corresponding item difficulty is the proportion of wrong responses, (Σw) / N. The facility index ranges from 0 to 1: an item with a facility index near 0 is too difficult, and near 1 is too easy. The ideal item has a value of 0.5, and the acceptable range for item facility is 0.37 to 0.63; below 0.37 an item is difficult, and above 0.63 it is easy. Items that are too easy or too hard lower reliability.
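The IF formula above can be applied directly. This Python sketch computes the facility index for a few hypothetical items and classifies them using the 0.37 to 0.63 acceptability range stated on the slide; the item names and response counts are invented.

```python
# Item analysis based on the facility formula IF = correct / total.
# The acceptability range (0.37 to 0.63) comes from the slide above;
# the data below is hypothetical.

def item_facility(correct: int, total: int) -> float:
    """Proportion of candidates answering the item correctly."""
    return correct / total

def classify_item(if_value: float) -> str:
    """Classify an item using the 0.37 to 0.63 acceptability range."""
    if if_value < 0.37:
        return "difficult"
    if if_value > 0.63:
        return "easy"
    return "acceptable"

# 40 candidates; each tuple is (item name, number of correct responses)
results = [("Q1", 38), ("Q2", 20), ("Q3", 9)]
for name, correct in results:
    fac = item_facility(correct, 40)
    print(name, round(fac, 2), classify_item(fac))
```

Items flagged as "difficult" or "easy" would be revised or discarded during validation, since extreme facility values lower reliability.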
  • 55. Preparing Test Blueprint / Test Specifications • Test specs = an outline of your test /what it will “look like” + your guiding plan for designing an instrument that effectively fulfils your desired principles, especially validity. • They include the following: a description of its content item types (methods, such as multiple-choice, cloze, etc.) tasks (e.g. written essay, reading a short passage, etc.) skills to be included how the test will be scored how it will be reported to students
  • 56. What is an item? • A tool, instrument, instruction or question used to get feedback from test-takers • Evidence of something that is being measured • Useful information for consideration in measuring or asserting a construct • Items can be classified as recall or thinking items: a recall item requires one to recall in order to answer; a thinking item requires test-takers to use their thinking skills to attempt it.
  • 57. Sequential steps in designing test specs • A broad outline of how the test will be organised • Which of the eight sub-skills you will test • What the various tasks and item types will be • How results will be scored, reported to students, and used in future class (washback) Remember to… Know the purpose of the test you are creating Know as precisely as possible what it is you want to test Not conduct a test hastily Examine the objectives for the unit you are testing carefully
  • 58. Bloom’s Taxonomy (Revised) • Def : A systematic way of describing how a learner’s performance develops from simple to complex levels in their affective, psychomotor and cognitive domain of learning.
  • 65. The Knowledge Domain (Categories & Cognitive Processes)
Factual Knowledge: the basic elements students must know to be acquainted with a discipline or solve problems in it.
Conceptual Knowledge: the interrelationships among the basic elements within a larger structure that enable them to function together.
Procedural Knowledge: how to do something; methods of inquiry; and criteria for using skills, algorithms, techniques, and methods.
Metacognitive Knowledge: knowledge of cognition in general, as well as awareness and knowledge of one's own cognition.
  • 66. SOLO Taxonomy • Def: (Structure of the Observed Learning Outcome) a systematic way of describing how a learner's performance develops from simple to complex levels in their learning. • There are 5 stages: Prestructural, Unistructural and Multistructural, which form a quantitative phase, and Relational and Extended Abstract, which form a qualitative phase (refer Figure 1.0). • A means of classifying learning outcomes in terms of their complexity, enabling teachers to assess students' work in terms of its quality.
  • 69. Functions of SOLO taxonomy • An integrated strategy, to be used In lesson design (learning outcomes intended) In task guidance In formative and summative assessment In deconstructing exam questions to understand marks awarded As a vehicle for self-assessment and peer-assessment
  • 70. Advantages of SOLO taxonomy
Structure of the taxonomy: • Encourages viewing learning as an ongoing process, moving from simple recall of facts towards a deeper understanding; learning becomes a series of interconnected webs that can be built upon and extended. • Consists of a series of cycles (especially between the Unistructural, Multistructural and Relational levels), which allows for the development of breadth of knowledge as well as depth. In turn, this creates students who are 'self-regulating, self-evaluating learners who were well motivated by learning.'
SOLO-based techniques: • The use of constructive alignment encourages teachers to be more explicit when creating learning objectives, focusing on what the student should be able to do and at which level. In turn, students will be able to make progress, and rubrics can be created for use in class to make the process explicit to the student.
Its HOTS properties: • Scaffolds in-depth discussion. In turn, this encourages students to develop interpretations, use research and critical thinking effectively to develop their own answers, and write essays that engage with the critical conversation of the field. • May also be helpful in providing a range of techniques for differentiated learning.
  • 71. Proponents of the SOLO taxonomy say.. • A model of learning outcomes that helps schools develop a common understanding. • A ‘framework for developing the quality of assessment’ and that it is ‘easily communicable to students’. • Hattie outlines three levels of understanding: surface, deep and conceptual. He indicates that: “The most powerful model for understanding these three levels and integrating them into learning intentions and success criteria is the SOLO model.”
  • 72. Critics of the SOLO taxonomy say… • There is potential to misjudge the level of functioning. • It has 'conceptual ambiguity'; the 'categorisation' is 'unstable'. • The structure is referred to as a hierarchy, which raises concerns when complex processes, such as human thought, are categorised in this manner.
  • 73. Guidelines for constructing test items
Aim of the test: items should be developed to precisely measure the objectives prescribed by the blueprint and meet quality standards.
Range of topics to be tested: measure the test-takers' ability or proficiency in applying the knowledge and principles of the topics that they have learnt.
Range of skills to be tested: items should have cognitive characteristics exemplifying understanding, problem-solving, critical thinking, analysis, synthesis, evaluation and interpretation, rather than just declarative knowledge (Bloom's taxonomy is a tool to use in item writing).
Test format: there needs to be a logical and consistent stimulus format. Why? For test item writers, it helps expedite the laborious process of writing test items and supplies a format for asking basic questions. For test-takers, the questioning process itself then adds no unnecessary difficulty to answering questions, and they can quickly read and understand the questions, since the format is expected.
  • 74. International and cultural considerations (bias): refrain from the use of slang, geographic references, and historical references or dates (holidays) that may not be understood by an international examinee.
Level of difficulty: assure that the test has a planned number of questions at each level of difficulty and can distinguish mastery from non-mastery performance: weak students can answer easy items; students of intermediate language proficiency can answer easy and moderate items; students of high language proficiency can answer easy, moderate and advanced items. The test should encompass all three levels of difficulty.
  • 75. Test format • Refers to the layout of questions on a test. For example, the format of a test could be two essay questions, 50 multiple-choice questions, etc. *Note: for the outlines of some large-scale standardised tests, refer to pages 64 & 65 in the PPG Module.
  • 77. Types of test items to assess language skills
Listening. Two kinds of listening tests: • tests that test specific aspects of listening, like sound discrimination • task-based tests which test skills in accomplishing different types of listening tasks considered important for the students being tested. Four types of listening performance from which assessment could be considered:
Intensive: listening for perception of the components (phonemes, words, intonation, discourse markers, etc.) of a larger stretch of language.
Responsive: listening to a relatively short stretch of language (a greeting, question, command, comprehension check, etc.) in order to make an equally short response.
Selective: processing stretches of discourse, such as short monologues of several minutes, in order to 'scan' for certain information; for example, listening for names, numbers, a grammatical category, directions (in a map exercise), or certain facts and events.
Extensive: listening to develop a top-down, global understanding of spoken language; for example, listening to a conversation and deriving a comprehensive message or purpose, listening for the gist, and making inferences.
  • 78. Language Skills Elaboration Speaking Objective test : tests skills such as … • Pronunciation • Knowledge of what language is appropriate in different situations • Language required in doing different things like describing, giving directions, giving instructions, etc Integrative task-based test : involves finding out if pupils can perform different tasks using spoken language that is appropriate for the purpose and the context. For example : • Describing scenes shown in a picture • Participating in a discussion about a given topic • Narrating a story, etc. CATEGORIES FOR ORAL ASSESSMENT (Refer yellow table)
  • 79. Categories for oral assessment
Imitative: the ability to imitate a word, phrase or possibly a sentence (pronunciation); a number of prosodic (intonation, rhythm, etc.), lexical, and grammatical properties of language may be included.
Intensive: the production of short stretches of oral language designed to demonstrate competence in a narrow band of grammatical, phrasal, lexical, or phonological relationships. E.g. directed response tasks (requests for specific production of speech), reading aloud, sentence and dialogue completion, limited picture-cued tasks including simple sentences, and translation up to the simple-sentence level.
Responsive: interaction and test comprehension, but at the somewhat limited level of very short conversations, standard greetings and small talk, simple requests and comments. The stimulus is almost always a spoken prompt (to preserve authenticity), with one or two follow-up questions or retorts.
Interactive: increased length and complexity compared with responsive; may include multiple exchanges and/or multiple participants. Two types: (a) transactional language, which has the purpose of exchanging specific information, and (b) interpersonal exchanges, which have the purpose of maintaining social relationships.
Extensive: speeches, oral presentations, and storytelling, during which the opportunity for oral interaction from listeners is either highly limited (perhaps to nonverbal responses) or ruled out altogether. Language style is more deliberative (planning is involved) and may include informal monologue, such as casually delivered speech (e.g., recalling a vacation in the mountains,
  • 80. Language Skills Elaboration Reading Meaning conveyed through reading text Type Elaboration Skimming Inspect lengthy passage rapidly Scanning Locate specific information within a short period of time Receptive/ Intensive A form of reading aimed at discovering exactly what the author seeks to convey Responsive Respond to some point in a reading text through writing or by answering questions
  • 81. Meaning conveyed through reading text
Grammatical meaning: meanings that are expressed through linguistic structures, such as complex and simple sentences, and the correct interpretation of those structures.
Informational meaning: the concepts or messages contained in the text; may be assessed through various means, such as summary and précis writing.
Discourse meaning: the perception of rhetorical functions conveyed by the text.
Writer's tone: whether it is cynical, sarcastic, sad, etc.
  • 82. Writing
Imitative: • the ability to spell correctly and to perceive phoneme-grapheme correspondences in the English spelling system • the mechanics of writing • form is the primary focus, while context and meaning are of secondary concern.
Intensive (controlled): • producing appropriate vocabulary within a context, collocations and idioms, and correct grammatical features up to the length of a sentence.
Responsive: • performing at a limited discourse level, connecting sentences into a paragraph and creating a logically connected sequence of two or three paragraphs. • Tasks relate to pedagogical directives, lists of criteria, outlines, and other guidelines. • E.g. brief narratives and descriptions, short reports, lab reports, summaries, brief responses to reading, and interpretations of charts and graphs. • Form-focused attention is mostly at the discourse level, with a strong emphasis on context and meaning.
Extensive: • implies successful management of all the processes and strategies of writing for all purposes, up to the length of, e.g., an essay. • The focus is on achieving a purpose, organising and developing ideas logically, using details to support or illustrate ideas, demonstrating syntactic and lexical variety, and engaging in the process of multiple drafts to achieve a final product. • Focus on grammatical form is limited to occasional editing and proofreading of a draft.
  • 83. Brown’s (Assessing Skills) Skill Type Test item Listening Intensive Listening • Recognizing phonological and morphological elements • Paraphrase recognition Responsive Listening • Responding to a stimulus; conversation, requests Selective Listening • Listening cloze • Information transfer • Sentence repetition Extensive Listening • Dictation • Communicative stimulus-response tasks • Authentic listening tasks Speaking Intensive Speaking • Directed response tasks • Read-Aloud tasks • Sentence/dialogue completion tasks and oral questionnaires • Picture-cued tasks Responsive Speaking • Q & A • Giving instructions and directions • Paraphrasing Interactive Speaking • Interview • Role-play • Discussions and conversations • Games Extensive speaking • Oral presentations • Picture-cued storytelling • Retelling a story, news event
  • 84. Skill Type Test item Reading Perceptive reading • Reading aloud • Written response • Multiple-choice • Picture-cued items Selective reading • Matching tasks • Editing tasks • Picture-cued tasks • Gap-filling tasks Interactive reading • Cloze tasks • Impromptu reading + comprehension questions • Short answer tasks • Editing longer texts • Scanning • Ordering tasks • Information transfer; reading charts, maps, graphs, diagrams Extensive reading • Skimming tasks • Summarizing and responding • Notetaking and outlining Writing Imitative writing • Writing letters, words and punctuation • Spelling tasks and detecting phoneme – grapheme correspondences Intensive (Controlled) writing • Dictation and dicto-comp • Grammatical transformation tasks • Picture-cued tasks • Vocabulary assessment tasks • Ordering tasks • Short answer and sentence completion tasks
  • 85. Skill Type • Test item Writing Responsive and extensive writing • Paraphrasing • Guided Q & A • Paragraph constructions tasks • Strategic options • Standardized tests of responsive writing Grammar & Vocabulary Selected response • Multiple-choice tasks • Discrimination tasks • Noticing tasks or consciousness-raising tasks Limited production • Gap-filling tasks • Short-answer tasks • Dialogue-completion tasks Extended production • Information gap tasks • Role-play or simulation tasks
  • 86. Objective and Subjective Test Objective test • Tests that are graded objectively • Include the multiple choice test, true false items and matching items • Similar to select type tests where students are expected to select or choose the answer from a list of options Subjective test • Involve subjectivity in grading • Include essays and short answer questions • Similar to supply type as the students are expected to supply the answer through their essay Subjective + objective • Dictation test, filling in the blank type tests, as well as interviews and role plays
  • 87. Type of test: according to how students are expected to respond
Selected response: students do not create any language but rather select the answer from a given list, e.g. true/false, matching, multiple choice.
Constructed response: students produce language by writing, speaking, or doing something else, e.g. fill-in, short answer, performance tests.
Personal response: students produce language, but each student's response can differ from the others' and students 'communicate what they want to communicate', e.g. conferences, portfolios, self- and peer-assessments.
  • 88. Types of test items to assess language content
  • 89. Discrete Integrative Language is seen to be made up of smaller units and it may be possible to test language by testing each unit at a time Language is that of an integrated whole which cannot be broken up into smaller units or elements
  • 90. Communicative test • Students have to produce language in an interactive setting involving some degree of unpredictability, which is typical of any language interaction situation.
  • 91. The three principles of communicative tests are : • involve performance; • are authentic; and • are scored on real-life outcomes
  • 92. Limitation in applying the communicative test • Issues of practicality, especially the amount of time and extent of organisation needed to allow such communicative elements to emerge. Advantages in applying the communicative test • It elicits valid language use that is purposeful and can stimulate positive washback in teaching and learning.
  • 93. Chapter 7 Scoring, grading and assessment criteria
  • 94. Scoring approaches
Objective
• Relies on quantified methods of evaluating students' writing
Holistic
• The reader (examiner) reacts to the student's composition as a whole and a single score is awarded to the writing
• Each score on the scale is accompanied by general descriptors of ability
• Related: primary trait scoring
Analytical
• Raters assess students' performance on a variety of categories which are hypothesised to make up the skill of writing
  • 95. Comparison between approaches
Holistic
• Advantages: quickly graded; provides a public standard understood by teachers and students alike; relatively high degree of rater reliability; applicable to the assessment of many different topics; emphasises the students' strengths rather than their weaknesses.
• Disadvantages: the single score may actually mask differences across individual compositions; does not provide a lot of diagnostic feedback.
Analytical
• Advantages: provides clear guidelines for grading in the form of the various components; allows graders to consciously address important aspects of writing.
• Disadvantages: writing ability is unnaturally split up into components.
Objective
• Advantages: emphasises the students' strengths rather than their weaknesses.
• Disadvantages: still some degree of subjectivity involved; accentuates negative aspects of the learner's writing without giving credit for what they can do well.
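The contrast between the two rubric-based approaches can be sketched in code: a holistic mark is a single rating on one scale, while an analytical mark is assembled from several weighted category ratings. The categories and weights below are hypothetical, not an official rubric:

```python
# Analytical scoring: combine ratings on separate categories into one mark.
# The categories and weights here are invented for illustration only.

WEIGHTS = {
    "content": 0.30,
    "organisation": 0.20,
    "vocabulary": 0.20,
    "language_use": 0.25,
    "mechanics": 0.05,
}

def analytic_score(ratings, weights=WEIGHTS):
    """Weighted sum of per-category ratings (each on a 0-100 scale)."""
    return sum(weights[c] * ratings[c] for c in weights)

ratings = {"content": 80, "organisation": 70, "vocabulary": 60,
           "language_use": 75, "mechanics": 90}
print(analytic_score(ratings))
```

Notice how the per-category ratings survive as diagnostic feedback, whereas a holistic score would collapse them into one number, which is exactly the trade-off summarised in the comparison above.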
  • 96. Questions you can attempt
• Describe with examples how holistic and analytical rubrics can be used to assess Year 6 pupils' writing based on the following skill: write simple factual descriptions of things, events, scenes and what one saw and did.
• Characteristics of each approach
  • 97. Chapter 9 Reporting of Assessment Data
  • 98. Purposes of reporting • Main purpose of tests is to obtain information concerning a particular behaviour or characteristic. • Evaluate the effectiveness of one’s own teaching or instructional approach and implement the necessary changes • Based on information obtained from tests, several different types of decisions can be made.
  • 100. Reporting methods
Norm-referenced assessment and reporting: assessing and reporting a student's achievement and progress in comparison to other students.
Criterion-referenced assessment and reporting: assessing and reporting a student's achievement and progress in comparison to predetermined criteria.
An outcomes approach to assessment provides information about student achievement to enable reporting against a standards framework. It acknowledges that students, regardless of their class or grade, can be working towards syllabus outcomes anywhere along the learning continuum.
  • 101. Principles of effective and informative assessment and reporting
• Has clear, direct links with outcomes
• Is integral to teaching and learning
• Is balanced, comprehensive and varied
• Is valid
• Is fair
• Engages the learner
• Values teacher judgement
• Is time-efficient and manageable
• Recognises individual achievement and progress
• Involves a whole-school approach
• Actively involves parents
• Conveys meaningful and useful information
  • 102. Chapter 10 Issues and Concerns related to assessment in Malaysian Primary Schools
  • 103. Components of PBS
School assessment: refers to written tests that assess subject learning. The test questions and marking schemes are developed, administered, scored, and reported by school teachers based on guidance from LP.
Central assessment: refers to written tests, project work, or oral tests (for languages) that assess subject learning. LP develops the test questions and marking schemes; the tests are, however, administered and marked by school teachers.
Psychometric assessment: refers to aptitude tests and a personality inventory to assess students' skills, interests, aptitude, attitude and personality. Aptitude tests are used to assess students' innate and acquired abilities, for example in thinking and problem solving. The personality inventory is used to identify key traits and characteristics that make up the student's personality. LP develops these instruments and provides guidelines for use.
Physical, sports, and co-curricular activities assessment: refers to assessments of student performance and participation in physical and health education, sports, uniformed bodies, clubs, and other non-school-sponsored activities.
  • 104. Benefits of PBS
• Enables students to be assessed on a broader range of output over a longer period of time.
• Provides teachers with more regular information to take appropriate remedial actions for their students.
• Will hopefully reduce the overall emphasis on teaching to the test, so that teachers can focus more time on delivering meaningful learning as stipulated in the curriculum.