Louzel Report - Reliability & validity

Tools of Research: Reliability and Validity Louzel M Linejan Presenter

Reliability
● refers to how consistently data are collected (Lee 2004).
● the degree to which a test consistently measures whatever it measures; it indicates the consistency of the scores produced (Raagas, 2009).
● the extent to which results are consistent over time and an accurate representation of the total population under study (Joppe 2000).
● concerns the replicability and consistency of the methods, conditions and results (Wiersma and Jurs, 2005).

Reliability
● expressed numerically, usually as a coefficient ranging from 0.0 to 1.0. If a test is perfectly reliable, the reliability coefficient is 1.0, meaning the score of each respondent perfectly reflects their true status with respect to the variable being measured.
● no test is perfectly reliable; scores are invariably affected by errors of measurement resulting from a variety of causes.

Methods of Estimating Reliability
1. Stability (also called Test-Retest Reliability)
- the degree to which results/scores on the same test are consistent over time. The more similar the scores on the test over time, the more stable or consistent the scores are.
- indicates score variation that occurs from one testing session to another.
- provides evidence that scores obtained on a test at one time (test) are the same or close to the same when the test is readministered at some other time (retest).
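Test-retest stability is commonly summarized as the correlation between the two sets of scores. The sketch below is only an illustration under that assumption: the helper `pearson_r` and the score lists are hypothetical and do not come from the presentation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of five respondents on the same test, taken at two times.
test_scores = [12, 15, 9, 18, 14]
retest_scores = [13, 14, 10, 17, 15]

# A value close to 1.0 suggests the scores are stable over time.
stability = pearson_r(test_scores, retest_scores)
print(round(stability, 2))
```

The same calculation applies to equivalent forms: replace the retest scores with the scores on the second form.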

Methods of Estimating Reliability
2. Equivalence (or Equivalent Forms)
- Two tests that are identical except for the actual items included.
- The two forms measure the same variable and have the same number of items, the same structure, the same difficulty level, and the same directions for administration, scoring and interpretation.
- If there is equivalence, the two tests can be used interchangeably. The correlation between scores on the two forms yields an estimate of their reliability.

Methods of Estimating Reliability
3. Internal Consistency Reliability (Methods of Internal Analysis)
- a commonly used form of reliability that deals with one test at a time. It is obtained through Split-Half, Kuder-Richardson and Cronbach Coefficient Alpha. Each provides information about the consistency among the items in a single test.
- applicable to instruments that have more than one item, as it refers to how homogeneous the items of a test are, or how well they measure a single construct.

Methods of Estimating Reliability
3. Internal Consistency Reliability
a. Split-Half Reliability
- A common approach is to split a test into two reasonably equivalent halves. These independent subtests are then used as the source of the two independent scores needed to estimate reliability.
- the simplest statistical technique; it randomly splits the questionnaire items into 2 groups. A score for each participant is then calculated based on each half of the scale.

Methods of Estimating Reliability
a. Split-Half Reliability
This procedure requires only 1 administration of the test. Test items are divided into 2 halves, and the items of the 2 halves are then scored independently. The problem with this method is that there are several ways in which a set of data can be split into two, so the results might stem from the way in which the data were split.
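The dependence on how the items are split can be seen with a small example. This is a rough sketch, not the presenter's procedure: the 0/1 item scores and the helper names `pearson_r` and `split_half_r` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(item_scores, first_half_idx):
    """Correlate each respondent's total on one half with the total on the other."""
    n_items = len(item_scores[0])
    second_half_idx = [i for i in range(n_items) if i not in first_half_idx]
    half_a = [sum(row[i] for i in first_half_idx) for row in item_scores]
    half_b = [sum(row[i] for i in second_half_idx) for row in item_scores]
    return pearson_r(half_a, half_b)

# Hypothetical data: rows are respondents, columns are 6 items scored 0 or 1.
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
]

# Same data, two different ways of forming the halves.
odd_even = split_half_r(scores, [0, 2, 4])    # items 1, 3, 5 vs items 2, 4, 6
first_last = split_half_r(scores, [0, 1, 2])  # first three items vs last three
print(round(odd_even, 2), round(first_last, 2))
```

The two coefficients differ even though the responses are identical, which is the weakness of the split-half method described above.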

Methods of Estimating Reliability
3. Internal Consistency Reliability
b. Kuder-Richardson
Kuder and Richardson developed two of the most widely accepted methods for estimating reliability: K-R20 and K-R21. These estimate internal consistency reliability by determining how all items in a test relate to all other test items and to the whole test. They are useful for true-false and multiple-choice items.

Methods of Estimating Reliability
3. Internal Consistency Reliability
b. Kuder-Richardson
K-R20 = most advisable if the “proportion of correct responses to a particular item” varies a lot; provides the mean of all possible split-half coefficients.
K-R21 = most advisable if the items do not vary much in difficulty, i.e., the “proportions of correct responses to a particular item” are more or less similar; may be substituted for K-R20 if it can be assumed that item difficulty levels are similar.
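A minimal sketch of both coefficients, assuming the standard textbook formulas: K-R20 = (k/(k-1)) * (1 - sum(p*q) / variance of total scores) and K-R21 = (k/(k-1)) * (1 - M*(k-M) / (k * variance)), where k is the number of items, p the proportion correct on an item, q = 1 - p, and M the mean total score. The data, the function names and the choice of population variance are assumptions made for illustration.

```python
def total_stats(item_scores):
    """Return (number of items, mean of total scores, population variance of totals)."""
    n = len(item_scores)
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return k, mean_total, var_total

def kr20(item_scores):
    """K-R20: uses each item's own proportion of correct responses (p)."""
    n = len(item_scores)
    k, _, var_total = total_stats(item_scores)
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in item_scores) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

def kr21(item_scores):
    """K-R21: assumes all items are about equally difficult, so only the mean is needed."""
    k, mean_total, var_total = total_stats(item_scores)
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))

# Hypothetical data: rows are respondents, columns are 6 items scored 0 or 1.
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
]

print(round(kr20(scores), 3), round(kr21(scores), 3))
```

On these data K-R21 comes out lower than K-R20, since the item difficulties are not all equal; the two agree only when every item has the same p value.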

Methods of Estimating Reliability
3. Internal Consistency Reliability
c. Cronbach Coefficient Alpha
Used only if the item scores are other than 0 and 1. This is advisable for essay items, problem solving and 5-point scale items. It is based on 2 or more parts of the test and requires only one administration of the test.

Validity
● the degree to which a test measures what it is supposed to measure and, consequently, permits appropriate interpretation of test scores (Raagas, 2009).
● determines whether the research truly measures that which it was intended to measure, or how truthful the research results are (Joppe 2000).
● refers to the ability of the survey questions to accurately measure what they claim to measure (Lee 2004).
● answers the question: Are we measuring what we want to measure? (Muijs, 2004)

Validity
● Validity is expressed in 3 degrees:
- highly valid
- moderately valid
- generally invalid
● The validation process begins with an understanding of the interpretation to be made from the tests or instruments.

Forms of Validity
● CONTENT VALIDITY
● CONSTRUCT VALIDITY
● CRITERION-RELATED VALIDITY

Content Validity
Requires Item Validity and Sampling Validity.
Item Validity - concerned with whether the test items are relevant to the intended content area.
Sampling Validity - concerned with how well the test's sample of items represents the total content area.

Criterion-related Validity

Construct Validity

Factors affecting Validity

Reliability and Validity
Suppose the reported reliability coefficient for a test was 0.24, which is definitely not good. Would this tell us something about the validity of the test? Yes, it would. It would show that the validity is not high, because if it were, the reliability would be higher.
What if a test is so hard that no respondent could answer even a single item? Scores would still be consistent, but not valid. If a test measures what it is supposed to measure, it is reliable, but a reliable test can consistently measure the wrong thing and be invalid.

Reliability and Validity
What if the reported reliability was 0.92, which is definitely high? Would this tell us anything about validity? Not really. It would only indicate that the test validity might also be high, because the reliability is high, but not necessarily; the test could be consistently measuring the wrong thing.
Reliability is necessary but not sufficient for establishing validity. A valid test is always reliable, but a reliable test is not always valid.

What can we do to make our instruments more reliable?
- Ensure that the questions we ask are clear and unambiguous. Clear and unambiguous questions are likely to be more reliable, and the same goes for items on a rating scale for observers.
- Measure each construct with more than one item.
- Ensure that the dependent variable is measured as precisely as possible.

Methods of Estimating Reliability
a. Split-Half Reliability
When we need to predict the reliability of a test twice as long as the given test, as in the split-halves method, the formula (the Spearman-Brown formula) is: r_full = 2 r_half / (1 + r_half), where r_half is the correlation between the two halves.
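The step from a half-test correlation to a full-test estimate can be sketched with the Spearman-Brown formula, r_full = 2 r_half / (1 + r_half). The function name and the `length_factor` parameter below are illustrative choices; with a factor of 2 the general form n*r / (1 + (n - 1)*r) reduces to the doubling case.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predict the reliability of a test `length_factor` times as long.

    With length_factor=2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)

# Hypothetical correlation between the two halves of a test.
r_half = 0.60
r_full = spearman_brown(r_half)
print(round(r_full, 2))
```

Because each half is only half as long as the full test, the raw half-to-half correlation understates the reliability of the whole test, and the corrected estimate is higher.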

Methods of Estimating Reliability
b. Kuder-Richardson
K-R20 = most advisable if the p values vary a lot.
K-R21 = most advisable if the items do not vary much in difficulty, i.e., the p values are more or less similar.

Methods of Estimating Reliability
c. Cronbach Coefficient Alpha
Used only if the item scores are other than 0 and 1. This is advisable for essay items, problem solving and 5-point scale items.
alpha = (k / (k - 1)) * (1 - (sum of si^2) / S^2)
where k = number of test items, si = standard deviation of a single test item and S = standard deviation of the examinees' total scores.
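A small sketch of the alpha calculation, assuming the standard formula alpha = (k / (k - 1)) * (1 - sum(si^2) / S^2) with k the number of items. The ratings, the function name and the use of sample variance from Python's `statistics` module are assumptions for illustration only.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(item_scores[0])
    # si^2: variance of each single item across respondents.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # S^2: variance of each respondent's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: 5 respondents answering 4 items on a 5-point scale.
ratings = [
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
]

alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Only one administration of the test is needed, since every quantity comes from the same set of responses.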
