NG BB 25 Measurement System Analysis - Attribute
1. UNCLASSIFIED / FOUO
National Guard
Black Belt Training
Module 25
Measurement
System Analysis (MSA)
Attribute Data
This material is not for general distribution, and its contents should not be quoted, extracted for publication, or otherwise copied or distributed without prior coordination with the Department of the Army, ATTN: ETF.
2. CPI Roadmap – Measure
8-STEP PROCESS
1. Validate the Problem
2. Identify Performance Gaps
3. Set Improvement Targets
4. Determine Root Cause
5. Develop Counter-Measures
6. See Counter-Measures Through
7. Confirm Results & Process
8. Standardize Successful Processes
Define | Measure | Analyze | Improve | Control
ACTIVITIES
• Map Current Process / Go & See
• Identify Key Input, Process, Output Metrics
• Develop Operational Definitions
• Develop Data Collection Plan
• Validate Measurement System
• Collect Baseline Data
• Identify Performance Gaps
• Estimate Financial/Operational Benefits
• Determine Process Stability/Capability
• Complete Measure Tollgate
TOOLS
• Process Mapping
• Process Cycle Efficiency/TOC
• Little's Law
• Operational Definitions
• Data Collection Plan
• Statistical Sampling
• Measurement System Analysis
• TPM
• Generic Pull
• Setup Reduction
• Control Charts
• Histograms
• Constraint Identification
• Process Capability
Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.
3. Learning Objective
Understand how to conduct and interpret a
measurement system analysis with Attribute Data
4. Attribute Measurement Systems
Most physical measurement systems use measurement devices that provide continuous data
For Measurement System Analysis with continuous data, we can use control charts or Gage R&R methods
Attribute/ordinal measurement systems use accept/reject criteria or ratings (such as 1 - 5) to determine whether an acceptable level of quality has been attained
Kappa and Kendall techniques can be used to evaluate these attribute and ordinal measurement systems
5. Are You Really Stuck With Attribute Data?
Many inspection or checking processes could collect continuous data, but attribute data is often used instead to simplify the task for the person taking and recording the data
Examples:
On-time delivery can be recorded in 2 ways:
a) in hours late, or
b) whether the delivery was on-time or late
Many functional tests evaluate a product on a continuous scale (temperature, pressure drop, voltage drop, dimensional, hardness, etc.) and record the results as pass/fail
Strive to get continuous data!
6. Attribute and Ordinal Measurements
Attribute and Ordinal measurements often rely on
subjective classifications or ratings
Examples include:
Rating different features of a service as either good or
bad, or on a scale from 1 to 5 with 5 being best
Rating different aspects of employee performance as
excellent, satisfactory, needs improvement
Rating wine on a) aroma, b) taste, and c) after taste
Should we evaluate these measurement systems before
using them to make decisions on our CPI project?
What are the consequences of not evaluating them?
7. MSA – Attribute Data
What methodologies are appropriate to assess
Attribute Measurement Systems?
Attribute Systems – the Kappa technique, which treats all misclassifications equally
Ordinal Systems – Kendall's technique, which considers the rank of the misclassification
For example, suppose we are judging an advertising service on a scale from 1 to 5. Inspector A rating the service a '1' while Inspector B rates it a '5' is a greater misclassification than Inspector A rating it a '4' while Inspector B rates it a '5'.
8. Data Scales
Nominal: Contains numbers that have no basis on which to arrange
in any order or to make any assumptions about the quantitative
difference between them. These numbers are just names or labels.
For example:
In an organization: Dept. 1 (Accounting), Dept. 2 (Customer
Service), Dept. 3 (Human Resources)
In an insurance co.: Business Line 1, Line 2, Line 3
Modes of transport: Mode 1 (air), Mode 2 (truck), Mode 3 (sea)
Ordinal: Contains numbers that can be ranked in some natural
sequence. This scale, however, cannot make an inference about the
degree of difference between the numbers. Examples:
On service performance: excellent, very good, good, fair, poor
Salsa taste test: mild, hot, very hot, makes me suffer
Customer survey: strongly agree, agree, disagree, strongly
disagree
9. Kappa Techniques
Kappa is appropriate for non-quantitative systems
such as:
Good or bad
Go/No Go
Differentiating noises (hiss, clank, thump)
Pass/fail
10. Kappa Techniques
Kappa for Attribute Data:
Treats all misclassifications equally
Does not assume that the ratings are equally
distributed across the possible range
Requires that the units be independent and that the
persons doing the judging or rating make their
classifications independently
Requires that the assessment categories be mutually
exclusive
11. Operational Definitions
There are some quality characteristics that are either difficult
or very time consuming to define
To assess classification consistency, several units must be
classified by more than one rater or judge
If there is substantial agreement among the raters, there is
the possibility, although no guarantee, that the ratings are
accurate
If there is poor agreement among the raters, the usefulness
of the rating is very limited
Poor attribute measurement systems can almost
always be traced to poor operational definitions
12. Consequences?
What are the important concerns?
What are the risks if agreement within and between
raters is not good?
Are bad items escaping to the next operation in the
process or to the external customer?
Are good items being reprocessed unnecessarily?
What is the standard for assessment?
How is agreement measured?
What is the Operational Definition for assessment?
13. What Is Kappa (K)?
K = (P_observed - P_chance) / (1 - P_chance)
P_observed: proportion of units on which both Judges agree = proportion both Judges agree are good + proportion both Judges agree are bad
P_chance (expected): proportion of agreements expected by chance = (proportion Judge A says good * proportion Judge B says good) + (proportion Judge A says bad * proportion Judge B says bad)
Note: the equation applies to a two-category analysis, e.g., good or bad
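The two-category Kappa above can be sketched in a few lines of code. This is an illustrative example, not part of the original module; the function name and counts are hypothetical:

```python
def kappa_two_category(both_good, a_good_b_bad, a_bad_b_good, both_bad):
    """Kappa for two judges classifying the same units as good/bad,
    from the four cells of their 2x2 agreement table."""
    n = both_good + a_good_b_bad + a_bad_b_good + both_bad
    p_observed = (both_good + both_bad) / n      # agree good + agree bad
    a_good = (both_good + a_good_b_bad) / n      # proportion Judge A says good
    b_good = (both_good + a_bad_b_good) / n      # proportion Judge B says good
    p_chance = a_good * b_good + (1 - a_good) * (1 - b_good)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical counts: judges agree on 15 good and 19 bad of 40 units
print(round(kappa_two_category(15, 3, 3, 19), 3))   # → 0.697
```

Perfect agreement gives K = 1; agreement no better than chance gives K = 0.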
14. Kappa
K = (P_observed - P_chance) / (1 - P_chance)
For perfect agreement, P observed = 1 and K=1
As a rule of thumb, if Kappa is lower than 0.7, the
measurement system is not adequate
If Kappa is 0.9 or above, the measurement system is
considered excellent
The lower limit for Kappa can range from 0 to -1
For P observed = P chance (expected), then K=0
Therefore, a Kappa of 0 indicates that the agreement is
the same as would be expected by random chance
15. Attribute MSA Guidelines
When selecting items for the study consider the
following:
If you only have two categories, good and bad, you
should have a minimum of 20 good and 20 bad
As a maximum, have 50 good and 50 bad
Try to keep approximately 50% good and 50% bad
Have a variety of degrees of good and bad
If only good items are chosen for the study, what
might happen to P-chance (expected)?
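A small numeric sketch (hypothetical proportions, not from the slides) shows why the sample mix matters: if the sample is mostly good and both judges call most items good, P-chance (expected) climbs toward 1, shrinking the denominator of Kappa and making even high raw agreement uninformative:

```python
# P_chance for two judges, each classifying items as good/bad
def p_chance(judge_a_good, judge_b_good):
    return judge_a_good * judge_b_good + (1 - judge_a_good) * (1 - judge_b_good)

print(round(p_chance(0.5, 0.5), 2))   # balanced sample → 0.5
print(round(p_chance(0.9, 0.9), 2))   # mostly-good sample → 0.82
```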
16. Attribute MSA Guidelines (Cont.)
If you have more than two categories, with one of the
categories being good and the other categories being
different error modes, you should have approximately
50% of the items being good and a minimum of 10%
of the items in each of the error modes
You might combine some of the error modes as
“other”
The categories should be mutually exclusive; if they are not, they should be combined as well
17. Within Rater/Repeatability Considerations
Have each rater evaluate the same item at least twice
Calculate a Kappa for each rater by creating separate
Kappa tables, one for each rater
If a Kappa measurement for a particular rater is small, that rater does not repeat well (poor within-rater consistency)
A rater who does not repeat well will also not agree well with the other raters, and this can mask how well the remaining raters agree among themselves
Calculate a between-rater Kappa by creating a Kappa table
from the first judgment of each rater
Between-rater Kappa will be made as pairwise comparisons
(A to B, B to C, A to C)
18. Example: Data Set = Attribute Ordinal.mtw
An educational testing organization is training five new
appraisers for the written portion of the twelfth-grade
standardized essay test
The appraisers' ability to rate essays consistent with the standards needs to be assessed
Each appraiser rated fifteen essays on a five-point scale
(-2, -1, 0, 1, 2)
The organization also rated the essays and supplied the “official
score”
Each essay was rated twice and the data captured in the file
Attribute Ordinal.mtw
Open the file and evaluate the appraisers' performance
19. Minitab and Attribute Measurement Systems
Stat>Quality Tools>Attribute Agreement Analysis
20. Minitab Dialog Box
1. Double click on the
appropriate variable
to place it in the
required dialog box:
Attribute = Rating
Samples = Sample
Appraisers = Appraiser
2. Click on OK
21. Within Appraiser Percent
This output represents the percent agreement and the 95%
confidence interval around that percentage
[Graph: Within Appraisers assessment agreement – percent agreement with 95.0% CI for each appraiser (Duncan, Hayes, Holmes, Montgomery, Simpson)]
22. Within Appraiser Session Window Output
This output is the same information contained in the graph
with the addition of a Between-Appraiser assessment
23. Let’s Do It Again
Stat>Quality Tools>Attribute Agreement Analysis
24. Introducing a Known Standard
1. Double click on the
appropriate variable
to place it in the
required dialog box
(same as before)
2. If you have a known
standard (the real answer)
for the items being inspected,
let Minitab know what column
that information is in.
3. Click on OK
25. Appraiser vs. Standard
[Graphs: Within Appraisers and Appraiser vs Standard – percent agreement with 95.0% CI for each appraiser (Duncan, Hayes, Holmes, Montgomery, Simpson)]
26. Within Appraiser
In addition to the Within-Appraiser
graphic, Minitab will give percentages
27. Each Appraiser vs. Standard
Some appraisers will repeat their own ratings well but
may not match the standard well (look at Duncan)
28. More Session Window Output
The session window will give percentage data as to how
all the appraisers did when judged against the standard
29. Kappa and Minitab
Minitab will calculate a Kappa for each (within) appraiser for each category
Note: This is only a part of the total data set for illustration
30. Kappa vs. Standard
Minitab will also calculate a Kappa statistic for each
appraiser as compared to the standard
Note: This is only a part of the total data set for illustration
31. Kappa and Minitab
Minitab will not provide a
Kappa between a specific
pair of appraisers, but will
provide an overall Kappa
between all appraisers for
each possible category of
response
How might this output help us improve our measurement system?
32. What If My Data Is Ordinal?
Stat>Quality Tools>Attribute Agreement Analysis
33. Ordinal Data
If your data is
Ordinal, you
must also check
this box
34. What Is Kendall’s Coefficient?
Kendall's coefficient can be thought of like an R-squared value: it measures the correlation between the responses, taking the ordering of the categories into account rather than treating the data as purely attribute. The lower the value, the more severe the misclassifications.
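For ordinal ratings, Minitab reports Kendall's coefficient of concordance. As a sketch of the underlying idea (an illustrative example that ignores the tie corrections real implementations apply), it measures how tightly the raters' rank sums cluster:

```python
def kendalls_w(rankings):
    """Kendall's W (no tie correction): rankings is a list of m lists,
    each assigning ranks 1..n to the same n items."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]  # rank sum per item
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)            # spread of rank sums
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Three raters in perfect agreement -> W = 1
print(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))   # → 1.0
```

W near 1 indicates strong concordance among the raters; W near 0 indicates no association.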
37. Exercise: Seeing Stars
Divide into teams of two
One person will be the rater and one the recorder
Have each rater inspect each star and determine if it is Good or Bad (Kappa)
Record the results in Minitab
Mix up the stars and repeat with the same rater 2 more times
Compare results to other raters and to the known standard
Take 30 minutes to complete the exercise and be prepared to
review your findings with the class
38. Takeaways
How to set up and conduct an MSA
Use attribute data only if the measurement cannot be converted to continuous data
Operational definitions are extremely important
Attribute measurement systems require a great deal
of maintenance
Kappa is an easy method to test how repeatable and
reproducible a subjective measurement system is
39. What other comments or questions do you have?
40. References
Cohen, J., “A Coefficient of Agreement for Nominal Scales,” Educational and Psychological Measurement, Vol. 20, pp. 37-46, 1960.
Futrell, D., “When Quality Is a Matter of Taste, Use Reliability Indexes,” Quality Progress, May 1995.
41. APPENDIX – A Practical Example of Kappa
Evaluating the Measurement System for
Determining Civilian Awards
42. Kappa Example #1
The Chief of Staff (COS) of the 1st Infantry Division is preparing for the
redeployment of 3 brigade combat teams supporting Operation Iraqi Freedom.
The Secretary of the General Staff (SGS) informs the COS that awards for civilian personnel (Department of the Army Civilians and military dependents) who provided volunteer support prior to and during the deployment are always a “significant emotional issue.” There are hundreds of submissions for awards.
A board of senior Army personnel decides who receives an award. The measurement system the board uses to determine who receives an award is a major concern due to differences between board members as well as differences within each board member.
The COS directs the SGS (a certified Army Black Belt) to conduct a
measurement system study using historical data to “level set” the board
members. Kappa for each board member as well as Kappa between board
members must be calculated.
The COS’s guidance is to retrain and/or replace board members until the measurement system is no longer a concern.
43. Consider the Following Data
• The Lean Six Sigma Pocket Toolbook, p.100-103 outlines
the procedures for calculating Kappa. Kappa is MSA for
attribute data.
• The SGS’s study involves two categories for recommendations, “Award” and “No Award”.
• We select 40 candidate packets from historical data and
ensure that 20 are definitely for “Award” and 20 are for “No
Award”.
• Board Members 1 and 2 evaluate each candidate’s packet. The results are shown in the tables on the following slides.
44. Consider the Following Data
[Table: board member evaluation data]
45. Consider the Following Data
[Table: board member evaluation data]
46. Contingency Table for Board Member 1
Populate Each Cell with the Evaluation Data
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
Board Member 1 – 1st : shows the results of Board Member 1’s 1st recommendations. The 1st board
member recommended an “Award” or “No Award” for each of the 40 candidates on the first review
of the files.
Board Member 1 – 2nd : shows the results of Board Member 1’s 2nd recommendations. The 1st
board member recommended an “Award” or “No Award” for each of the 40 candidates on the
second review of the files.
47. Contingency Table: Cell 1
The first cell represents the number of
times Board Member 1 recommended a
candidate should receive an “Award” in
both the first and second evaluation.
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
48. Contingency Table: Cell 2
The second cell represents the number of
times Board Member 1 recommended a
candidate as “No Award” the first time
and “Award” the second evaluation.
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
49. Contingency Table: Cell 3
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
The third cell represents the number of times Board Member 1
recommended “Award” on the first evaluation and “No Award”
on the second evaluation.
50. Contingency Table: Cell 4
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
The fourth cell represents the number of times Board
Member 1 recommended “No Award” on the first
evaluation and “No Award” on the second evaluation.
51. Contingency Table: Sums of Rows and Columns
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40
The numbers on the margins are the totals of the rows
and columns of data. The sum in both instances is 40,
the total number of candidate packets reviewed.
52. Contingency Table – Counts & Proportions
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award       15         3         18
                       No Award     3        19         22
                       Total       18        22         40

Contingency Table: Proportions   Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award      0.375     0.075     0.450
                       No Award   0.075     0.475     0.550
                       Total      0.450     0.550     1.000

(The 0.450 row total represents 18/40.)
Board Member 1 Proportions: the lower table is the data in the upper table represented as a proportion of the total.
53. Contingency Table – Sum of Percentages
Contingency Table: Proportions   Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award      0.375     0.075     0.450
                       No Award   0.075     0.475     0.550
                       Total      0.450     0.550     1.000
The marginal values are the sums of the proportions in the rows and columns; in each direction they must total 1.0
54. Calculating Kappa
K = (P_observed - P_chance) / (1 - P_chance)
P_observed: proportion of candidates for which both Board Members agree = proportion both Board Members agree are “Award” + proportion both Board Members agree are “No Award”
P_chance: proportion of agreements expected by chance = (proportion Board Member 1 says “Award” * proportion Board Member 2 says “Award”) + (proportion Board Member 1 says “No Award” * proportion Board Member 2 says “No Award”)
The verbiage for defining Kappa will vary slightly depending on whether
we are defining a Within-Rater Kappa or Between-Rater Kappa
55. Calculate Kappa for Board Member 1
Contingency Table: Proportions   Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 1 - 2nd   Award      0.375     0.075     0.450
                       No Award   0.075     0.475     0.550
                       Total      0.450     0.550     1.000
P_observed is the sum of the proportions on the diagonal:
P_observed = (0.375 + 0.475) = 0.850
P_chance is the marginal proportions for each classification multiplied and then summed:
P_chance = (0.450 * 0.450) + (0.550 * 0.550) = 0.505
Then K_Board Member 1 = (0.850 - 0.505) / (1 - 0.505) = 0.697
Kappa for Board Member 1 is sufficiently close to 0.700 that we conclude Board Member 1 exhibits adequate repeatability.
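The arithmetic above can be double-checked with a short script (an illustrative sketch using the proportions from the table, not part of the original slides):

```python
p_observed = 0.375 + 0.475                   # sum of the diagonal proportions
p_chance = 0.45 * 0.45 + 0.55 * 0.55         # marginal products, summed
kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(kappa, 3))   # → 0.697
```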
56. Calculate Kappa for Board Member 2
Contingency Table: Counts        Board Member 2 - 1st
                                 Award    No Award    Total
Board Member 2 - 2nd   Award      ___       ___        ___
                       No Award   ___       ___        ___
                       Total      ___       ___        ___

Contingency Table: Proportions   Board Member 2 - 1st
                                 Award    No Award    Total
Board Member 2 - 2nd   Award      ___       ___        ___
                       No Award   ___       ___        ___
                       Total      ___       ___        ___
K Board Member 2 = ?
57. Kappa Between Board Members
To calculate a Kappa between Board Members, we use a similar procedure.
We calculate Kappa from the first recommendations of each pair of Board Members.
NOTE: If there is a Board Member who has poor
Within-Board Member repeatability (less than 85%),
there is no need to calculate a Between-Board
Member rating.
58. Kappa – Board Member 1 to Board Member 2
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 2 - 1st   Award       14         5         19
                       No Award     4        17         21
                       Total       18        22         40
Number of times both board members
agreed the candidate should receive an “Award.”
(using their first evaluation)
59. Kappa Between Board Members
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 2 - 1st   Award       14         5         19
                       No Award     4        17         21
                       Total       18        22         40
Number of times Board Member 1
recommended “No Award” and Board Member
2 recommended “Award”. (using their first
evaluation)
60. Board Member 1 to Board Member 2 Kappa
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 2 - 1st   Award       14         5         19
                       No Award     4        17         21
                       Total       18        22         40
Number of times Board Member 1 recommended
“Award” and Board Member 2 recommended “No
Award” (using their first measurement)
61. Between Board Member Kappa
Contingency Table: Counts        Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 2 - 1st   Award       14         5         19
                       No Award     4        17         21
                       Total       18        22         40
Number of times both Board Members
agreed the candidate was “No Award”
(using their first measurement)
62. Kappa Between Board Members
Calculate Between-Board Member Kappa:
Contingency Table: Board Member 1 - 1st
Counts Award No Award
Member 2
Award 14 5 19
Board
The lower table
- 1st
represents the data
No Award 4 17 21 in the top with each
18 22 cell being
represented as a
percentage of the
Contingency Table:
Proportions
Board Member 1 - 1st total.
Award No Award
Member 2
Award 0.35 0.125 0.48
Board
- 1st
No Award 0.100 0.425 0.53
0.450 0.550
63. Remember How to Calculate Kappa?
K = (P_observed - P_chance) / (1 - P_chance)
P_observed: proportion of items on which both Board Members agree = proportion both Board Members agree are “Award” + proportion both Board Members agree are “No Award”
P_chance: proportion of agreements expected by chance = (proportion Board Member 1 says “Award” * proportion Board Member 2 says “Award”) + (proportion Board Member 1 says “No Award” * proportion Board Member 2 says “No Award”)
The verbiage for defining Kappa will vary slightly depending on whether we are defining a Within-Board Member Kappa or a Between-Board Member Kappa
64. Calculate Kappa for Board Member 1 to Board Member 2
Contingency Table: Proportions   Board Member 1 - 1st
                                 Award    No Award    Total
Board Member 2 - 1st   Award      0.350     0.125     0.475
                       No Award   0.100     0.425     0.525
                       Total      0.450     0.550     1.000
P_observed is the sum of the proportions on the diagonal:
P_observed = (0.350 + 0.425) = 0.775
P_chance is the marginal proportions for each classification multiplied and then summed:
P_chance = (0.450 * 0.475) + (0.550 * 0.525) = 0.503
Then K_Board Member 1/2 = (0.775 - 0.503) / (1 - 0.503) = 0.548
The Board Members evaluate candidate packets differently too often. The SGS
will retrain each Board Member before dismissing a Board Member and finding a
replacement.
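The between-member result can be checked directly from the counts in the table (an illustrative sketch, with hypothetical variable names):

```python
# Counts from the Board Member 1 vs. Board Member 2 table (first evaluations)
both_award, m1_no_m2_yes, m1_yes_m2_no, both_no_award = 14, 5, 4, 17
n = both_award + m1_no_m2_yes + m1_yes_m2_no + both_no_award   # 40 packets

p_observed = (both_award + both_no_award) / n          # 31/40 = 0.775
m1_award = (both_award + m1_yes_m2_no) / n             # 18/40 = 0.450
m2_award = (both_award + m1_no_m2_yes) / n             # 19/40 = 0.475
p_chance = m1_award * m2_award + (1 - m1_award) * (1 - m2_award)
kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(kappa, 3))   # → 0.548
```

Working from raw counts avoids the small rounding errors that creep in when pre-rounded marginal proportions are multiplied.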
65. Improvement Ideas
How might we improve this measurement system?
Additional training
Physical standards/samples
Rater certification (and periodic re-certification)
process
Better operational definitions
66. Kappa Conclusions
Is the current measurement system adequate?
Where would you focus your improvement efforts?
Which rater would you want to conduct any training that needs to be done?
Class Challenge: After the exposure to Minitab in the preceding slides, input the data from the previous example into Minitab. As homework, perform the analysis and compare the computer output, and its simplicity, with the manual calculations performed in the previous slides.
Hint: You will need to stack columns.