The document discusses a Six Sigma Green Belt certification course taught over 11 weeks. Key points:
- Students took a pre-test on the first day to establish their initial knowledge; the process capability was very poor, with a high failure rate.
- Midway through the course, students re-took portions of the test, showing some improvement on covered material but not on material not yet covered.
- At the end of the course, students re-took the full test. Scores improved overall, but the distribution was bimodal, which the notes attribute to students who were denied the right to sit for the certification exam over the work experience requirement. Test scores and process capability both improved significantly from the start.
2. Six Sigma is a problem-solving tool kit that seeks to improve the quality of process
outputs by identifying and removing the causes of defects (errors) and minimizing
variability in manufacturing and business processes.
Six Sigma Green Belts are the tactical leads for improvement within a job function
and are able to apply Lean Six Sigma concepts to their daily work.
The methods are universally applicable wherever a customer is being served.
3. This is a unique pedagogical approach and, philosophically, is quite “meta”. The
object under examination is in fact the actor performing the examination.
The most brilliant of teachers can write the most profound equation on a chalkboard,
and the most diligent of students can take pristine notes. However, learning only
occurs when the student is able to apply the material. Johann Wolfgang von Goethe
was correct when he said “Knowing is not enough; we must apply.”
Given the diversity of the students in terms of education, life experience, income
and industry, finding a common task to which to apply Lean Six Sigma (LSS) would
have been impossible. The only true commonality within the group was that they
were all humans and wanted to earn their Green Belt. We were able to leverage
this fact in developing the instructional roadmap for the course.
Also, the use of Shewhart Control Charts, which are used to differentiate
between common cause and special cause variation, is fairly novel in academic
settings.
4. The instructor for the course, Brandon Theiss, is a Senior Member of ASQ and a
graduate student at Rutgers University. Currently there is no course on this material
offered in the undergraduate Industrial and Systems Engineering Program at Rutgers.
This course provided an opportunity for students not only to be exposed to the material
but also to earn a nationally recognized certification in the tools, techniques and
methods of Six Sigma. It represented a first-of-its-kind partnership between the student
chapter of the IIE and the ASQ Princeton section.
Part of the proceeds from the course was used to fund the IIE trip to their national
conference in Orlando.
5. The cost of the course for students included the textbook and ASQ student
membership.
The professional rate included only the text.
The ASQ Certified Six Sigma Green Belt requires three or more years of work experience
in one or more areas of the Body of Knowledge. There was a very long and at times
heated exchange with the ASQ certification committee about what constitutes work
experience. A compromise was ultimately reached; however, a large number of
qualified students were still denied the right to sit for the exam.
6. The course met once per week over an 11-week period, from 6:30 to 9:30 PM. There
were two sessions per week, and students were free to attend either the Monday or
the Tuesday class, whichever was more convenient for their schedule.
7. Students were notified via email before the course began that an exam would be
administered on the first night.
This both provided a baseline for future improvement and showed students
directly the level of mastery they would need to attain to become certified.
8. Feedback in any system is critically important. With a course that meets only once per
week, having students wait a week for their results would be too long. By giving students
immediate feedback, we enabled them to make the best use of their study time and to
avoid mis-learning material by believing they had answered a question correctly when
in fact they had not.
10. A simple histogram of the exam results from the Monday section with a normal
distribution fit. The data does appear to be normal but has a very large standard
deviation of 11.8%.
11. The probability plot indicates that there is insufficient evidence to reject the null
hypothesis that the data is normally distributed. This is indicated by the P value, which
is the probability of seeing a departure from the normal model at least as large as the
one measured if the data really were normal. The null hypothesis of normality would
have been rejected if the P value had been less than alpha (5%), which corresponds to a
95% confidence level.
12. It is technically debatable whether the test scores are a continuous or a discrete
variable and whether an I chart is appropriate. However, the point is to introduce
students to control charts, starting with an Individuals chart.
Since no point lies above the Upper Control Limit or below the Lower Control Limit, the
process is in a state of “statistical control”. However, common sense shows that this
result is of little practical use, as the limits span from 17% to 95%. This was caused by
the large standard deviation observed.
This was used as an opportunity to discuss the difference between statistical
significance and practical significance. It reinforces the concept that the math does
not know where the numbers came from and can at best direct teams to derive the
true underlying meaning.
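As a rough illustration of how Individuals chart limits are conventionally obtained, the sketch below uses the average moving range (center line plus or minus 2.66 times the mean moving range). The scores are hypothetical and are not the course data; the function name is likewise only illustrative.

```python
from statistics import mean

def individuals_chart_limits(values):
    """Return (lcl, center, ucl) for an Individuals (I) chart."""
    # Moving ranges: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for moving ranges of span 2.
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width

# Hypothetical pre-test scores in percent (illustrative only, not the course data).
scores = [52, 61, 48, 70, 39, 66, 57, 44, 73, 50]
lcl, center, ucl = individuals_chart_limits(scores)
out_of_control = [s for s in scores if s < lcl or s > ucl]
print(f"LCL={lcl:.1f}  center={center:.1f}  UCL={ucl:.1f}")
print("points outside limits:", out_of_control)
```

With widely scattered scores the limits become very wide, which is the situation the note describes: no point signals, yet the chart says little of practical value.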
13. Again there is the technical question of whether the test scores are discrete or
continuous. The above Process Capability study requires that the data be considered
continuous. Process capability is essentially the probability of producing a product that
will meet your customer's specification. In this case the passing score (78%) sets that
limit. As you can see in the above chart, for every 1,000,000 students from the Monday
population who take the pre-test exam, ~970,000 will fail.
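The failure rate can be approximately reproduced from a normal model. The sketch below takes the passing score (78%) and the Monday standard deviation (11.8%) from these notes; the mean is an assumption, taken here as the midpoint of the reported I-chart limits (17% and 95%), since the notes do not state it.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

PASSING_SCORE = 78.0   # lower specification limit, from the notes
STD_DEV = 11.8         # Monday pre-test standard deviation, from the notes
# Assumption: the mean is the midpoint of the reported I-chart limits
# (17% and 95%); the notes do not give the mean directly.
ASSUMED_MEAN = (17.0 + 95.0) / 2.0

# Any score below the passing score counts as a defect (a failed exam).
fail_fraction = normal_cdf(PASSING_SCORE, ASSUMED_MEAN, STD_DEV)
failures_per_million = fail_fraction * 1_000_000
print(f"expected failures per million: {failures_per_million:,.0f}")
```

Under these assumptions the result lands close to the ~970,000 failures per million quoted in the capability study.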
14. Everyone has taken a test where the test taker believes there was a question that
either had the wrong answer or was too difficult. By charting the wrong answers per
question on an NP (or P) control chart, one can easily distinguish whether a question
was statistically significantly too difficult (above the UCL) or too easy (below the LCL).
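A minimal sketch of the P chart calculation for this use, assuming every question was answered by the same number of students. The counts and the class size of 30 are hypothetical, not taken from the course data.

```python
from math import sqrt

def p_chart_limits(defect_counts, sample_size):
    """Return (lcl, p_bar, ucl) for a P chart with a constant sample size."""
    p_bar = sum(defect_counts) / (len(defect_counts) * sample_size)
    sigma = sqrt(p_bar * (1.0 - p_bar) / sample_size)
    # Proportions cannot fall outside [0, 1], so clamp the 3-sigma limits.
    lcl = max(0.0, p_bar - 3.0 * sigma)
    ucl = min(1.0, p_bar + 3.0 * sigma)
    return lcl, p_bar, ucl

# Hypothetical data: number of students (out of 30) who missed each question.
STUDENTS = 30
wrong_per_question = [12, 15, 9, 28, 14, 11, 2, 16, 13, 10]
lcl, p_bar, ucl = p_chart_limits(wrong_per_question, STUDENTS)

# Questions are numbered from 1 in the order they appear on the exam.
too_hard = [q for q, w in enumerate(wrong_per_question, start=1) if w / STUDENTS > ucl]
too_easy = [q for q, w in enumerate(wrong_per_question, start=1) if w / STUDENTS < lcl]
print(f"LCL={lcl:.3f}  p-bar={p_bar:.3f}  UCL={ucl:.3f}")
print("too hard:", too_hard, " too easy:", too_easy)
```

A question flagged this way is only a candidate for review; the chart cannot say whether the question was badly worded or the topic was simply not yet understood.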
15. There were several students who handed in their exams very quickly. We wanted to
see if the amount of time a student spent on the exam affected their score. For
the Monday data set, it appears that it did.
18. Again, normality cannot be rejected, as indicated by a P value greater than 5%. It is,
however, notable that there is a clear outlier in the above plot.
19. Again we can see that there is clearly an outlier in the data set.
20. The Tuesday process is very similar in its inability to produce a unit meeting the
customer's expectations and again will generate ~970,000 failures for every million
students from the population who take the exam.
21. In the above graph it does appear that there were questions that a statistically
significant number of students got wrong.
22. Interestingly, the order in which students turned in their exams did not have an
effect on scores in the Tuesday data set.
26. The above shows a box plot comparing the two classes. The median appears to be
higher in the Tuesday class. However, is the difference significant?
27. An ANOVA was performed and resulted in a very high P value, which means
that there is not a statistically significant difference between the two population
means.
28. Nominal Group Technique -> Use when individuals overpower a group
Multi-Voting -> Reduce a large list of items to a workable number quickly
Affinity Diagram -> Group solutions
Force Field Analysis -> Overcome resistance to change
Tree Diagram -> Break the complex into the simple
Cause-and-Effect Diagram -> Identify root causes
34. The most common model of group development was proposed by Bruce Tuckman in
1965.
In order for a team to grow, face up to challenges, tackle problems, find solutions,
plan work, and deliver results, it must go through the following cycle:
Forming
- Team members get to know each other
- Members try to please each other
- Members may tend to agree too much on initial discussion topics
- Not much work is accomplished
- Members are oriented to the team goals
- Group is going through a “honeymoon period”
Storming
- Members voice their ideas
- Members come to understand the project scope and responsibilities
- Ideas and understanding cause conflict
- Not much work gets accomplished
- Disagreement slows down the team
Norming
35. Norming (continued)
- Members resolve their own conflicts
- Members come to a mutually agreed plan
- Some work gets done
- Members start to trust each other
Performing
- Large amount of work gets done
- Synergy realized
- Competent and autonomous decisions are made
Adjourning
- Team is disbanded, restructured or project re-scoped
- Regression to Forming stage
43. There does not appear to be a large change between the pre-test and the mid-term.
44. A t-test indicates that there is a statistically significant improvement, as shown by
the one-tailed P value.
45. ANOVA, on the other hand, indicates that there is not a statistically significant
difference between the two means.
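The t-test and ANOVA results look contradictory, but with only two groups a one-way ANOVA is mathematically equivalent to a two-tailed, equal-variance t-test: F equals t squared, and the ANOVA P value is twice the one-tailed P value. A borderline one-tailed result can therefore fail the ANOVA. The sketch below checks the F = t² identity on hypothetical scores; it assumes the t-test in question was a two-sample test with pooled variance, which the notes do not state.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1 / na + 1 / nb))

def anova_f(a, b):
    """One-way ANOVA F statistic for exactly two groups."""
    grand = mean(a + b)
    ss_between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    df_between, df_within = 1, len(a) + len(b) - 2
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores in percent (illustrative only, not the course data).
pre = [52, 61, 48, 70, 39, 66, 57, 44]
mid = [58, 60, 55, 74, 47, 63, 65, 52]
t = pooled_t(pre, mid)
f = anova_f(pre, mid)
print(f"t = {t:.4f}, t^2 = {t * t:.4f}, F = {f:.4f}")
```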
46. This displays a histogram of the changes in scores. About 40% of the students' scores
went down and about 60% went up.
47. This is a somewhat novel adaptation of a C chart that allows for negative values.
It shows that some students did much better, and some much worse, than the
other students.
48. A paired t-test shows that there was clearly a statistically significant improvement.
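A paired t-test works on each student's own change in score, so differences in ability between students cancel out and the test is more sensitive than comparing the two group means. A minimal sketch of the statistic, using hypothetical scores rather than the course data:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    # One difference per student: score after minus score before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1

# Hypothetical scores in percent for the same eight students (illustrative only).
pre = [52, 61, 48, 70, 39, 66, 57, 44]
mid = [58, 60, 55, 74, 47, 63, 65, 52]
t_stat, df = paired_t(pre, mid)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is then compared against the critical value of the t distribution for the chosen alpha and degrees of freedom, which a statistics package or a t table supplies.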
49. Why did the test scores not improve more dramatically? The exams cover all of
the material in the CSSGB BoK, but the course was only half complete. We therefore
looked at the material covered up to the midterm on both the pre-test and the
mid-term. The above pie charts show the percentage of covered material on each exam.
50. Not surprisingly students performed better on the material that was covered as
compared to the material that was not covered.
51. However, the students also scored better on that same material on the pre-test.
53. The change in the means indicates a ~8% improvement. However, is that statistically
significant?
54. ANOVA does indicate that there is a difference in the means. The students did in fact
learn the material that was covered.
55. There does not appear to be a difference in the scores on the material that had not
yet been covered in the course.
56. There was a small increase in the means (~2%). Is that significant?
57. No. There is not a statistically significant difference between the pre-test and
mid-term scores on the material that was not covered. This suggests that the two
exams were of roughly the same difficulty.
58. The process is still incapable of generating a passing score on the test.
59. Minitab is the de facto industry standard for statistical process control. Unfortunately,
the undergraduate program at Rutgers does not include any training in the software
suite. It is fairly intuitive; however, students needed additional instruction.
62. Unfortunately, as this course's primary purpose was to act as preparation for the
Green Belt exam, a larger focus could not be placed on this material. However, in an
industrial setting most projects fail in the control phase. Regression to the mean is
the natural trend. Anyone who has ever tried to lose weight or quit smoking knows
that the trouble is always in sustaining the improvement.
66. The distribution is in fact bimodal. Unfortunately, due to ASQ’s interpretation of the
meaning of work, a large number of qualified applicants were unable to sit for the
actual Green Belt exam. They became disenchanted with the course and make up the
lower mode of the distribution. This assumption was supported by a post hoc online
survey.
67. However, the test scores did appear to improve (even with the lower distribution).
68. The improvement was highly significant, as indicated by a P value of 4.91 x 10^-13.
69. On average, the students improved by 19.4%; only a few students' scores decreased.
70. The paired t-test results also confirm that the students' test scores improved.
71. A P Chart was again used to detect difficult questions.
72. The Pareto chart above shows the topics that generated the special cause variation in
the prior P chart.
73. The initial process capability was quite poor, producing ~970,000 failures per
1,000,000.
74. The final process capability, though still not best in class, is much better, producing
475,000 failures per million (the observed performance is used because the data was
already shown to be non-normal, as it is bimodal).
77. *Actual data for the national average has not yet been released.
As Confucius says, “I hear and I forget. I see and I remember. I do and I understand.”