Validity in psychological testing refers to whether a test measures what it claims to measure. This presentation discusses the categories of validation procedures in psychological testing: construct identification, criterion prediction, and content description.
5. Questions to Ask to differentiate
Reliability and Validity
In RELIABILITY
Is the test giving CONSISTENT results?
In VALIDITY
Is the test fulfilling its PURPOSE?
6. VALIDITY is more than
It measures what it is supposed to measure.
ALSO,
It verifies the test's capability to predict thinking or
behavior.
It also verifies whether the test will be taken seriously
by test takers.
8. Let's define them one by one:
1. Construct
Identification
Procedure
1.1. Age Differentiation
1.2. Convergent Validity
1.3. Discriminant Validity
1.4. Factor Analysis
1.5. Contrasted Groups
Is the test
measuring what
it claims to
measure?
10. CASE 1.1:
You are a school psychologist and you developed a
psychological test that measures intelligence.
According to your review of the literature
before you constructed the test, one quality of
intelligence is that it increases as one advances in age.
Now that you are done constructing the test, what can
you do to check whether the test you constructed is
really measuring intelligence?
11. General Steps in Case 1.1
1. Administer the test to different grade levels
(i.e., Grades 1, 2, 3) WHY?
2. Get the results
3. See whether mean scores increase with grade level; if they
do, the outcome supports the test's validity.
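The steps above can be sketched in code. This is a minimal illustration with made-up scores; the data, group sizes, and score scale are all assumptions, not real results:

```python
from statistics import mean

# Hypothetical raw scores from administering the new intelligence
# test to three grade levels (illustrative data only).
scores_by_grade = {
    "Grade 1": [12, 15, 11, 14, 13],
    "Grade 2": [18, 20, 17, 21, 19],
    "Grade 3": [25, 27, 24, 28, 26],
}

# Age differentiation: if the construct is age related, mean scores
# should rise with grade level.
means = {grade: mean(s) for grade, s in scores_by_grade.items()}
ordered = [means[g] for g in ("Grade 1", "Grade 2", "Grade 3")]
increases_with_age = ordered == sorted(ordered)

for grade, m in means.items():
    print(f"{grade}: mean = {m}")
print("Pattern consistent with age differentiation:", increases_with_age)
```

If the means do not rise with grade level, the test fails this check even before any correlation is computed.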
15. Important!
This is only applicable when the construct is AGE
RELATED. (How would you know?)
Through the review of literature (REVLIT).
16. Case 1.2
You have constructed a psychological test that measures
optimism. The following are the key details about
optimism you have read from the literature before you
constructed the test.
1) Optimism has nothing to do with age
2) Optimism is strongly correlated to happiness
What would you do to prove that your test is really
measuring optimism?
17. General Steps for Case 1.2
1. Find an established test STRONGLY RELATED to your
test construct (i.e., HAPPINESS)
2. Find a population
3. Administer the two tests to the same population
4. Note the scores for each individual (How many?)
5. Correlate the optimism and happiness scores
18. Let's review correlation
Coefficient    Interpretation
0.00-0.19      Very Weak
0.20-0.39      Weak
0.40-0.59      Moderate
0.60-0.79      Strong
0.80-1.00      Very Strong
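The table can be expressed as a small helper that also reports the direction of the relationship; the function name and return format are just for illustration:

```python
def interpret(r):
    """Map a correlation coefficient to the verbal labels in the
    table above, plus the direction of the relationship."""
    strength = abs(r)
    if strength < 0.20:
        label = "Very Weak"
    elif strength < 0.40:
        label = "Weak"
    elif strength < 0.60:
        label = "Moderate"
    elif strength < 0.80:
        label = "Strong"
    else:
        label = "Very Strong"
    direction = "direct" if r >= 0 else "inverse"
    return f"{label} ({direction})"

print(interpret(0.92))   # Very Strong (direct)
print(interpret(-0.92))  # Very Strong (inverse)
print(interpret(0.30))   # Weak (direct)
```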
19. Possible Outcomes for Case 1.2
Which coefficient fits each scenario? A. +0.92  B. -0.92  C. +0.30
First, the correlation coefficient is direct and strong: +0.92 (A).
Second, the correlation coefficient is inverse and strong: -0.92 (B).
Third, the correlation is weak: +0.30 (C). No pattern.
20. 1.2. Convergent Validity
1.2.1. Correlate the newly constructed test to an established
test with a RELATED construct, e.g.:
Optimism - Happiness
Neuroticism - General Anxiety
HOW would you know the constructs are related? Through the
REVLIT.
21. 1.2. Convergent Validity
1.2.2. Another approach (more recommended): correlate to an
ESTABLISHED test of exactly the SAME construct.
E.g., your newly constructed happiness test correlated with an
established one such as the Oxford Happiness Scale.
22. Why make another test if there are
tests already available in the market?
A. Improvement
If the existing test is for adults, you can make a test for children.
Or if a factor was found unrelated in previous
studies, you may revise the existing test and make a
new one.
B. Addition
If a new factor emerges as a result of
research.
23. Why make another test if there are
tests already available in the market?
C. Adaptation
If you want to adapt a foreign test and make it
indigenous to match your population.
E.g., the Stanford-Binet Intelligence Test
(Lewis Terman's adaptation included language
translation and the addition of items appropriate for the
local population).
For projective personality testing, there is the Hutt Adaptation
of the Bender-Gestalt Test.
24. Why make another test if there are tests
already available in the market?
D. To provide a test that is cheaper than those in the market.
(Analogy: branded medicine vs. generic drugs)
26. 1.3. Discriminant Validity
Correlating a test to an UNRELATED construct
(established test vs. newly constructed test).
E.g., the Wechsler Intelligence Test and your constructed
Optimism Scale.
27. General Steps 1.3
1. Find an ESTABLISHED test of an unrelated construct
(e.g., OPTIMISM vs. intelligence).
2. Administer both tests to the same population.
3. Correlate!
28. Possible Outcomes of Discriminant Validity
First, if the correlation is positive and strong:
the constructed TEST is INVALID.
(Why?) The tests were directly and strongly correlated, yet
discriminant validity requires unrelated constructs to be
uncorrelated.
Second, if the correlation is negative and strong:
the constructed TEST is INVALID. (Why?) They are strongly
correlated, with an inverse relationship.
Third, if the correlation is weak:
the constructed TEST is VALID.
29. Case 1.4
Case: You constructed a test that claims to
measure level of depression.
The country you are in has lost all its copies of the
established depression test that you could
possibly use to establish the validity of your test.
No other psychological test (related or unrelated to
your test) can be used to help you validate your
test. How can you prove that your test really
measures depression?
31. General Steps:
1. Find highly depressed people, administer the test, and
compare their scores with those of people who are not
depressed.
Since there is no established test you may use, you
cannot compute a correlation. In the contrasted groups
method, you only need to compare the group means.
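A sketch of the contrasted groups comparison, using invented scores; note that group membership would come from clinical diagnosis, not from the test being validated:

```python
from statistics import mean

# Hypothetical depression-test scores (illustrative data): one group
# clinically identified as highly depressed, one not depressed.
depressed     = [42, 38, 45, 40, 44, 39, 43]
not_depressed = [15, 12, 18, 14, 16, 13, 17]

# Contrasted groups: with no established test to correlate against,
# compare the group means; a clearly higher mean in the depressed
# group supports the validity of the new test.
m_dep, m_not = mean(depressed), mean(not_depressed)
print(f"Depressed group mean:     {m_dep:.1f}")
print(f"Not-depressed group mean: {m_not:.1f}")
print("Groups separated as expected:", m_dep > m_not)
```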
32. Case: You constructed a test that aims to
measure math ability.
Your REVLIT tells you that math ability contains
4 dimensions: D1, D2, D3, and D4.
You wonder if the test you constructed also
contains these 4 reported in REVLIT.
What would you do to confirm the existence of
these 4 dimensions?
34. 1.4. FACTOR ANALYSIS
i. It counts the NUMBER of dimensions
("factors") the test has.
ii. It identifies which items fall under each factor.
The same applies to any psychological test, e.g., an IQ test
(how many factors does it have?).
38. General Steps: (Factor Analysis)
1. Administer the test
2. Run factor analysis
Possible Outcomes
How does factor analysis decide? By correlating items
with each other: items that correlate highly are grouped
under the same factor.
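One common way to count factors is the Kaiser criterion: the number of eigenvalues of the item correlation matrix that exceed 1. The toy simulation below (invented loadings, numpy assumed available; real analyses use dedicated factor-analysis software) builds 6 items from 2 latent dimensions and recovers the factor count:

```python
import numpy as np

# Toy sketch of the Kaiser criterion. Simulate 200 respondents
# answering 6 items driven by 2 latent dimensions (e.g., D1 and D2
# from the REVLIT), plus noise.
rng = np.random.default_rng(0)
d1 = rng.normal(size=200)
d2 = rng.normal(size=200)
items = np.column_stack([
    d1 + 0.3 * rng.normal(size=200),   # items 1-3 load on D1
    d1 + 0.3 * rng.normal(size=200),
    d1 + 0.3 * rng.normal(size=200),
    d2 + 0.3 * rng.normal(size=200),   # items 4-6 load on D2
    d2 + 0.3 * rng.normal(size=200),
    d2 + 0.3 * rng.normal(size=200),
])

corr = np.corrcoef(items, rowvar=False)      # 6 x 6 item correlations
eigenvalues = np.linalg.eigvalsh(corr)
n_factors = int(np.sum(eigenvalues > 1.0))   # Kaiser criterion
print("Estimated number of factors:", n_factors)
```

Items built from the same latent dimension correlate highly with each other and weakly with the rest, which is exactly the pattern the eigenvalue count picks up.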
42. For Example:
Howard's Risk of Suicide Test (feature: high
predictive validity)
High scorers are more likely to commit suicide;
low scorers are less prone to commit suicide.
44. POSSIBLE USES OF A PREDICTING TEST
1. Prevention of psychological
disorders (guidance counselors)
-> identifies who should be counseled or undergo a
program before a disorder develops.
45. POSSIBLE USES OF A PREDICTING TEST
2. Knowing whom to hire (HR practitioners)
-> saves money by avoiding
hiring the wrong employees.
46. HOW CAN I PREDICT?
Key Term: “CRITERION”
Numerical Expression of Observable
Indicators of a test construct.
47. Examples:
Test Construct: Intelligence
Ask: Numerical Expression?
Criterion: School GWA, IQ Score
Test Construct: Extraversion
Ask: Numerical Expression?
Criterion: Number of events/parties attended
Test Construct: Aggression
Ask: Numerical Expression?
Criterion: Number of school offenses, fistfights.
48. CASE: Your principal bought an expensive
intelligence test to be used as a screening tool
for incoming freshman students. This test is
expensive because of its special ability: It can
predict students who will get high grades
three years after they are admitted to the
university.
Your principal thinks that it is a good test to be
used to help the school decide who will get
scholarships. As a school chief
psychometrician, what would you do to find
out whether this test can really predict
students who will get high college grades?
49. General Steps:
(Testing Predictive Validity of a Test)
1. Administer exam to incoming freshmen students.
2. Score
3. Select an appropriate future criterion
Construct: Intelligence
Appropriate criterion: third-year GPA
50. General Steps:
(Testing Predictive Validity of a Test)
4. WAIT for the future criterion
5. CORRELATE the two scores
            Entrance Exam    Third-Year CGPA
Subject 1        ?                 ?
Subject 2        ?                 ?
Subject 3        ?                 ?
51. Entrance Exam Test Score vs. Third-Year GPA
Say +.93
(interpret)
Relationship: DIRECT and STRONG
Higher entrance exam, higher CGPA (third year)
Lower entrance exam, lower CGPA (third year)
It means the TEST CAN PREDICT
52. Entrance Exam Test Score vs. Third-Year GPA
Say -.93
(interpret)
Relationship: INVERSE and STRONG
Higher entrance exam, lower CGPA (third year)
Lower entrance exam, higher CGPA (third year)
It means the TEST CAN PREDICT, but this inverse pattern is
ridiculous for an intelligence test.
53. Entrance Exam Test Score vs. Third-Year GPA
-0.18
(Interpret)
High or low entrance exam scores have nothing to do
with third-year CGPA.
It means the TEST CAN'T PREDICT.
54. Predictive Validity
Test -> Future criterion
The ability of the test to predict a criterion that will be
observed in the future.
56. Case:
Your HR head asked you to validate your company’s job
efficiency test being used to hire rank and file
employees (i.e., secretaries, encoders, gift wrappers,
salesladies, etc.).
Specifically, your boss wants to find out whether your
company's test can predict the future performance of
those taking it.
Your boss would like to know the answer tomorrow
afternoon. What would you do to confirm if this test
can really predict job efficiency or not?
57. General Steps:
1. ADMINISTER, Score test (Job efficiency)
2. Select existing criterion (?)
Supervisor’s rating (Last month)
3. Correlate test results and supervisor’s rating last
month.
59. 2. CONCURRENT VALIDITY
Test -> Immediately available criterion
The ability of the test to predict an already existing
CRITERION ("PAST").
It is faster than predictive validity.
60. The question now is: why do many still choose
predictive validity if concurrent is faster?
Predictive validity: high accuracy/certainty.
Concurrent validity: faster results.
62. The question now is: why do many still choose
predictive validity if concurrent is faster?
Concurrent validity is used only until predictive
validity is established.
65. Content Description
Procedures
KEY TERM: PRESENTABILITY of the test
Questions to ask:
A. Does the test look like a test?
B. Will test takers take the test seriously?
Now, as psychometricians,
how would we know that the TEST is presentable?
66. There are two methods of
Content Description Procedure
3.1. Content Validity
3.2. Face Validity
67. 3.1. Content Validity
Examines the APPROPRIATENESS of the items of a
psychological test.
Question to be answered:
Do the items belong to the test or not?
68. CONTENT VALIDITY
Are all the items in this test supposed to be in this
test? (“Fits in”)
How to know?
Example: Content Validity of the Fear Scale
69. General Steps in Content Validity
1. Review conceptual definition of fear
Remember, there are two definitions of a concept in a
thesis:
a. Operational definition: how is the variable
measured in the thesis?
b. Conceptual definition: how is the variable
defined in the REVLIT?
70. 2. Compare each item to
conceptual definition
E.g., conceptual definition: FEAR (from REVLIT) - an
emotion caused by a real or imagined threat that can
potentially result in danger, humiliation, or pain.
1. I feel afraid every time I think of myself undergoing a
strenuous physical activity.
2. I feel "cold" when I find my physical safety is
threatened (e.g., not knowing whether the food is
clean or not).
3. I get extremely uneasy when I am not going to do
things well in front of many people.
4. I get excited when I look forward to meeting a friend I
want to spend time with.
71. Who can establish content
validity?
1. Authors of the test (a problem: they may be biased
toward their own items)
2. Subject Matter Experts (SMEs)
72. Quantifying Content Validity
CVI = Ne / N
Ne = number of panelists who agree that an item belongs
N = total number of panelists
Scores range from 0 (low) to 1 (high).
Each panelist will either accept, reject, or ask to revise an item.
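A minimal sketch of this computation, assuming the ratio Ne/N (which matches the 0-to-1 range described above); the function name is illustrative:

```python
def content_validity_index(n_agree, n_panelists):
    """Item-level content validity index: Ne / N, where Ne is the
    number of panelists who agree the item belongs and N is the
    total number of panelists. Ranges from 0 (low) to 1 (high)."""
    if n_panelists <= 0 or not 0 <= n_agree <= n_panelists:
        raise ValueError("need 0 <= n_agree <= n_panelists and n_panelists > 0")
    return n_agree / n_panelists

# E.g., 4 of 5 subject matter experts agree an item fits the construct.
print(content_validity_index(4, 5))  # 0.8
```

An item-level index like this would be computed per item, then low-scoring items rejected or sent back for revision.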