TRANSFER
LEARNING
AND SE
TIM@MENZIES.US
WVU, JULY 2013
SOUND BITES
•  Ye olde worlde SE
•  “The” model of SE (defects, effort, etc)
•  21st century SE
•  Models (plural)
•  No generality in models
•  But, perhaps, generality in how we find those models
•  Transfer learning
2
3
WHAT IS TRANSFER LEARNING?
•  Source = old = Domain1 = <Eg1, P1>
•  Target = new = Domain2 = <Eg2, P2>
•  If we move from domain1 to domain2, do we have to start
afresh?
•  Or can we learn faster in “new” …
•  … Using lessons learned from “old”?
•  NSF funding (2013..2017):
•  Transfer learning in Software Engineering
•  Menzies, Layman, Shull, Diep
4
WHO CARES?
(WHAT’S AT
STAKE?)
•  “Transfer” is a core
scientific issue
•  Lack of transfer is the
scandal of SE
•  Replication in
empirical SE is rare
•  Conclusion instability
•  It all depends.
•  The full stop
syndrome
•  The result?
•  A funding crisis
5
MANUAL TRANSFER (WAR STORIES)
•  Brazil, SEL, 2002: need domain knowledge (but now gone)?
•  NSF, SEL, 2006: need better automatic support
•  Kitchenham, Mendes et al, TSE 2007: for = against
•  Zimmermann FSE, 2009: cross works in 4/600 times
6
WAR STORIES
(EFFORT ESTIMATION)
Effort = a · loc^x · y
•  learned using Boehm’s
methods
•  20*66% of NASA93
•  COCOMO attributes
•  Linear regression (log
pre-processor)
•  Sort the co-efficients
found for each member
of x,y
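A minimal sketch of the "linear regression (log pre-processor)" step, under stated assumptions: it fits only the size term of the model (y is held at 1, so the COCOMO attributes are ignored), and it uses six (KLOC, effort) pairs copied from the NASA93 rows quoted later in this deck. The function name `fit_log_linear` is hypothetical, and this is not Boehm's full calibration procedure.

```python
import math

# (KLOC, effort) pairs copied from the NASA93 rows quoted in this deck.
rows = [(25.9, 117.6), (24.6, 117.6), (7.7, 31.2),
        (8.2, 36.0), (9.7, 25.2), (2.2, 8.4)]

def fit_log_linear(pairs):
    """Least-squares fit of log(effort) = log(a) + x * log(loc)."""
    xs = [math.log(loc) for loc, _ in pairs]
    ys = [math.log(effort) for _, effort in pairs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    sxx = sum((u - mx) ** 2 for u in xs)
    x = sxy / sxx                 # exponent on loc
    a = math.exp(my - x * mx)     # multiplicative constant
    return a, x

a, x = fit_log_linear(rows)
print(f"effort ~ {a:.2f} * loc^{x:.2f}")
```

Repeating the fit on many random 66% samples, as the slide describes, would show how much a and x move from sample to sample.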
7
WAR STORIES
(DEFECT ESTIMATION)
8
SMARTER TRANSFER
(AUTOMATIC SUPPORT)
•  Don’t use all available training data
•  Use relevancy filtering [Turhan’09 ESE journal]
•  Use variance pruning ← this talk
•  Don’t use the raw attributes
•  Most are rubbish anyway
•  Feature selection [Chen’05, IEEE Software]
•  Feature synthesis ← this talk
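Relevancy filtering can be sketched as a nearest-neighbor filter: for every target example, keep only the k closest source examples and train on their union. This is an illustration of the idea, not the code from the cited paper; the function names, the Euclidean distance, and the value of k are all assumptions.

```python
import math

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def relevancy_filter(source, target, k=2):
    """Keep the source rows that are among the k nearest to some target row."""
    keep = set()
    for t in target:
        ranked = sorted(range(len(source)), key=lambda i: dist(source[i], t))
        keep.update(ranked[:k])
    return [source[i] for i in sorted(keep)]

# Hypothetical data: the far-away source row (100, 100) is filtered out.
source = [(0, 0), (1, 1), (50, 50), (51, 49), (100, 100)]
target = [(0, 1), (50, 51)]
print(relevancy_filter(source, target, k=2))
```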
9
ESEM, 2011 :
How to Find Relevant
Data for Effort Estimation
TIM MENZIES,
EKREM KOCAGUNELI
US DOD MILITARY PROJECTS
(LAST DECADE)
11
You must
segment to
find relevant
data
DOMAIN SEGMENTATIONS
12
Q: What to do
about rare
zones?
A: Select the nearest ones from the rest
But how?
IN THE LITERATURE: WITHIN VS
CROSS = ??
BEFORE THIS WORK
13
Kitchenham et al. TSE
2007
•  Within-company
learning (just use local
data)
•  Cross-company
learning (just use data
from other companies)
Results mixed
•  No clear win from cross
or within
Cross vs within are not
rigid boundaries
•  They are soft borders
•  And we can move a
few examples across
the border
•  And after making those
moves
•  “Cross” same as
“local”
SOME DATA DOES NOT DIVIDE
NEATLY ON EXISTING DIMENSIONS
14
THE LOCALITY(1) ASSUMPTION
15
Data divides best on one attribute
1.  development centers of developers;
2.  project type; e.g. embedded, etc;
3.  development language
4.  application type (MIS; GNC; etc);
5.  targeted hardware platform;
6.  in-house vs outsourced projects;
7.  Etc
If Locality(1) : hard to use
data across these boundaries
•  Then harder to build effort models:
•  Need to collect local data (slow)
THE LOCALITY(N) ASSUMPTION
16
Data divides best on
combination of attributes
If Locality(N)
• Easier to use data across
these boundaries
•  Relevant data spread all
around
•  little diamonds floating in
the dust
HOW TO FIND RELEVANT TRAINING
DATA?
17
              independent attributes
              w    x    y    z    class
similar 1     0    1    1    1    2
similar 2     0    1    1    1    3
different 1   7    7    6    2    5
different 2   1    9    1    8    8
different 3   5    4    2    6    10
alien 1       74   15   73   56   20
alien 2       77   45   13   6    40
alien 3       35   99   31   21   60
alien 4       49   55   37   4    80
Use similar?
Use more variant?
Use aliens ?
VARIANCE PRUNING
18
              independent attributes
              w    x    y    z    class
similar 1     0    1    1    1    2
similar 2     0    1    1    1    3
different 1   7    7    6    2    5
different 2   1    9    1    8    8
different 3   5    4    2    6    10
alien 1       74   15   73   56   20
alien 2       77   45   13   6    40
alien 3       35   99   31   21   60
alien 4       49   55   37   4    80
1) Sort the clusters by “variance”
2) Prune those high variance things
3) Estimate on the rest
“Easy path”: cull the examples
that hurt the learner
PRUNE !
KEEP !
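The three steps above can be sketched on the class column of the table. This is a simplified illustration, not TEAK itself: the cluster names are taken as given rather than learned, and both the `keep` cutoff and the use of the median as the estimate are assumptions.

```python
from statistics import median, pvariance

# Class values from the table, grouped by the row labels on the slide.
clusters = {
    "similar": [2, 3],
    "different": [5, 8, 10],
    "alien": [20, 40, 60, 80],
}

def variance_prune(groups, keep=2):
    """Sort clusters by class variance and keep the `keep` lowest."""
    ranked = sorted(groups, key=lambda name: pvariance(groups[name]))
    return ranked[:keep]

def estimate(groups, kept):
    """Estimate from the surviving examples (here: median class value)."""
    values = [v for name in kept for v in groups[name]]
    return median(values)

kept = variance_prune(clusters, keep=2)
print(kept, estimate(clusters, kept))
```

On this table the "alien" cluster has by far the largest class variance, so it is the one that gets pruned.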
TEAK: CLUSTERING + VARIANCE
PRUNING (TSE, JAN 2011)
19
• TEAK is a variance-based
instance selector
• It is built via GAC trees
• TEAK is a two-pass system
• First pass selects low-variance relevant projects
• Second pass retrieves
projects to estimate from
ESSENTIAL POINT
20
TEAK finds local regions important to the
estimation of particular cases
TEAK finds those regions via locality(N)
•  Not locality(1)
WITHIN AND CROSS DATASETS
21
Note: all
Locality(1)
divisions
EXPERIMENT1: PERFORMANCE
COMPARISON OF WITHIN AND CROSS-
SOURCE DATA
22
• TEAK on within & cross data for each dataset
group (lines separate groups)
• LOOCV used for runs
• 20 runs performed for each treatment
• Results evaluated w.r.t. MAR, MMRE,
MdMRE and Pred(30),
but see http://goo.gl/6q0tw
• If within data outperforms cross, the dataset is
highlighted with gray
• See only 2 datasets highlighted
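For readers unfamiliar with the evaluation measures named above, here is a short sketch using their usual definitions: MRE is |actual − predicted| / actual, MMRE and MdMRE are its mean and median, Pred(30) is the fraction of estimates with MRE at most 0.30, and MAR is the mean absolute residual. The function names and the sample numbers are hypothetical.

```python
from statistics import mean, median

def mre(actual, predicted):
    """Magnitude of relative error for one project."""
    return abs(actual - predicted) / actual

def scores(actuals, predictions, level=0.30):
    """Return MAR, MMRE, MdMRE and Pred(level) for paired estimates."""
    mres = [mre(a, p) for a, p in zip(actuals, predictions)]
    return {
        "MAR": mean(abs(a - p) for a, p in zip(actuals, predictions)),
        "MMRE": mean(mres),
        "MdMRE": median(mres),
        "Pred": sum(m <= level for m in mres) / len(mres),
    }

# Made-up efforts, for illustration only.
print(scores([100, 200, 400], [110, 150, 400]))
```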
EXPERIMENT 2: RETRIEVAL TENDENCY
OF TEAK FROM WITHIN AND CROSS-
SOURCE DATA
23
24
Diagonal (WC) vs.
Off-Diagonal (CC)
selection
percentages
sorted
Percentiles of diagonals
and off-diagonals
HIGHLIGHTS
25
1.  Don’t listen to everyone
•  When listening to a crowd, first
filter the noise
2.  Once the noise clears: bits of
me are similar to bits of you
•  Probability of selecting cross or
within instances is the same
3.  Cross-vs-within is not a
useful distinction
•  Locality(1) not informative
•  Enables “cross-company”
learning
TSE, 2013 :
LOCAL VS. GLOBAL
MODELS FOR EFFORT
ESTIMATION AND
DEFECT PREDICTION
TIM MENZIES, ANDREW BUTCHER (WVU)
ANDRIAN MARCUS (WAYNE STATE)
THOMAS ZIMMERMANN (MICROSOFT)
DAVID COK (GRAMMATECH)
WAR STORIES
(DEFECT ESTIMATION)
12/1/2011
27
ROOT CAUSE OF
CONCLUSION INSTABILITY?
HYPOTHESIS #1
Any one of….
•  Noisy data?
•  Too little data?
•  Poor statistical technique?
•  Stochastic choice within
data miner (e.g. random
forests)?
•  etc
HYPOTHESIS #2
SE is an inherently varied
activity
•  So conclusion instability
can’t be fixed
•  It must be managed
•  Needs different kinds of
data miners
28
29
Cluster then learn
(using envy)
•  Seek the fence
where the grass
is greener on
the other side.
•  Learn from
there
•  Test on
here
•  Cluster to find
“here” and
“there”
30
ENVY =
THE WISDOM OF THE COWS
31
@attribute recordnumber real
@attribute projectname {de,erb,gal,X,hst,slp,spl,Y}
@attribute cat2 {Avionics, application_ground, avionicsmonitoring, … }
@attribute center {1,2,3,4,5,6}
@attribute year real
@attribute mode {embedded,organic,semidetached}
@attribute rely {vl,l,n,h,vh,xh}
@attribute data {vl,l,n,h,vh,xh}
…
@attribute equivphyskloc real
@attribute act_effort real
@data
1,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,25.9,117.6
2,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,24.6,117.6
3,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,7.7,31.2
4,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,8.2,36
5,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,9.7,25.2
6,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,2.2,8.4
….
DATA = MULTI-DIMENSIONAL VECTORS
CAUTION: DATA MAY NOT DIVIDE
NEATLY ON RAW DIMENSIONS
The best description for SE projects may be synthesized
dimensions extracted from the raw dimensions
32
FASTMAP
33
Fastmap: Faloutsos [1995]
O(2N) generation of axis of large variability
•  Pick any point W;
•  Find X furthest from W,
•  Find Y furthest from X.
c = dist(X,Y)
All points have distance a,b to (X,Y)
•  x = (a^2 + c^2 − b^2) / (2c)
•  y = sqrt(a^2 − x^2)
Find median(x), median(y)
Recurse on four quadrants
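The projection steps above can be sketched as follows. It is a one-level illustration under stated assumptions: Euclidean distance, the first point used as the arbitrary W, and hypothetical function names; the recursion into each quadrant is left out.

```python
import math
from statistics import median

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def furthest(p, points):
    return max(points, key=lambda q: dist(p, q))

def fastmap_2d(points):
    """Project each point to (x, y) using two distant pivots X and Y."""
    w = points[0]                      # any point
    pivot_x = furthest(w, points)      # X: furthest from W
    pivot_y = furthest(pivot_x, points)  # Y: furthest from X
    c = dist(pivot_x, pivot_y)
    if c == 0:                         # all points identical
        return [(0.0, 0.0) for _ in points]
    coords = []
    for p in points:
        a, b = dist(p, pivot_x), dist(p, pivot_y)
        x = (a * a + c * c - b * b) / (2 * c)
        y = math.sqrt(max(0.0, a * a - x * x))  # guard against rounding
        coords.append((x, y))
    return coords

def quadrants(coords):
    """Split point indices into cells around median(x) and median(y)."""
    mx = median(x for x, _ in coords)
    my = median(y for _, y in coords)
    cells = {}
    for i, (x, y) in enumerate(coords):
        cells.setdefault((x > mx, y > my), []).append(i)
    return cells

pts = [(0, 0), (10, 0), (5, 5), (5, 0)]
print(fastmap_2d(pts))
```

Finding the two pivots takes two passes over the data, which is where the O(2N) on the slide comes from.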
HIERARCHICAL PARTITIONING
Grow
•  Find two orthogonal dimensions
•  Find median(x), median(y)
•  Recurse on four quadrants
Prune
•  Combine quadtree leaves with similar densities
•  Score each cluster by median score of class variable
34
35
Learning via “envy”
•  Seek the fence
where the grass
is greener on
the other side.
•  Learn from
there
•  Test on
here
•  Cluster to find
“here” and
“there”
36
ENVY =
THE WISDOM OF THE COWS
HIERARCHICAL PARTITIONING
Grow
•  Find two orthogonal dimensions
•  Find median(x), median(y)
•  Recurse on four quadrants
Prune
•  Combine quadtree leaves with similar densities
•  Score each cluster by median score of class variable
Where is grass greenest?
•  This cluster envies its neighbor with better score and max abs(score(this) - score(neighbor))
38
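The envy rule can be written down in a few lines. In this sketch "better" is assumed to mean a lower score (less effort, fewer defects); the function name, the cluster names, and the scores are hypothetical.

```python
def envied_neighbor(this, neighbors, score, better=lambda a, b: a < b):
    """Return the neighbor with a better score and the largest score gap."""
    candidates = [n for n in neighbors if better(score[n], score[this])]
    if not candidates:
        return None  # nothing to envy: this cluster is already the best nearby
    return max(candidates, key=lambda n: abs(score[this] - score[n]))

# Hypothetical median class scores per cluster.
score = {"c1": 50, "c2": 30, "c3": 10, "c4": 90}
print(envied_neighbor("c1", ["c2", "c3", "c4"], score))
```

Rules would then be learned from the envied cluster ("there") and tested on this one ("here").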
Q: HOW TO LEARN RULES FROM
NEIGHBORING CLUSTERS
A: it doesn’t really matter
•  Many competent rule learners
But to evaluate global vs local rules:
•  Use the same rule learner for local vs global rule learning
This study uses WHICH (Menzies [2010])
• Customizable scoring operator
• Faster termination
• Generates very small rules (good for explanation)
39
DATA FROM
HTTP://PROMISEDATA.ORG/DATA
Effort reduction =
{ NasaCoc, China } :
COCOMO or function points
Defect reduction =
{lucene, xalan, jedit, synapse, etc} :
CK metrics(OO)
Clusters have untreated class
distribution.
Rules select a subset of the
examples:
•  generate a treated class
distribution
40
Distributions have percentiles (25th, 50th, 75th, 100th; plotted on a 0 to 100 scale) for three cases:
•  Untreated
•  Global: treated with rules learned from all data
•  Local: treated with rules learned from neighboring cluster
Lower median efforts/defects (50th percentile)
Greater stability (75th – 25th percentile)
Decreased worst case (100th percentile)
BY ANY MEASURE,
LOCAL BETTER THAN GLOBAL
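The three measures used in this comparison (median, 75th minus 25th percentile, and worst case) can be computed as in the sketch below. The nearest-rank percentile, the function names, and the sample values are assumptions made for illustration; they are not the study's data.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(values)
    rank = max(1, -(-p * len(ordered) // 100))  # ceil(p * n / 100)
    return ordered[rank - 1]

def summary(values):
    """Median, stability (75th - 25th) and worst case (100th)."""
    return {
        "median": percentile(values, 50),
        "spread": percentile(values, 75) - percentile(values, 25),
        "worst": percentile(values, 100),
    }

def beats(treated_a, treated_b):
    """True if A is lower than B on all three measures."""
    a, b = summary(treated_a), summary(treated_b)
    return all(a[k] < b[k] for k in a)

# Made-up class values after treatment with local vs global rules.
local_rules, global_rules = [5, 10, 15, 20], [10, 20, 30, 40]
print(summary(local_rules), beats(local_rules, global_rules))
```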
41
RULES LEARNED IN EACH CLUSTER
What works best “here” does not work “there”
•  Misguided to try and tame conclusion instability
•  Inherent in the data
Can’t tame conclusion instability.
•  Instead, you can exploit it
•  Learn local lessons that do better than overly generalized global theories
42
RELATED WORK
Other clustering methods:
•  nbTree, Kohavi [1996],
•  consensus clustering
Outlier removal :
•  Yin [2011], Yoon [2010],
Clustering & case-based reasoning
•  Kocaguneli [2011], Turhan [2009],
Cuadrado [2007]
Design of experiments
•  Learn via envy
•  Faster than N*M cross-val
Localizations:
•  Expert-based Petersen [2009]: how to
know it is correct?
•  This work: auto-learning of contexts
Structured literature reviews:
•  Kitchenham [2007] + others
•  ?over-generalizations
Anything in any SE textbook
43
44
Conclusion
45
THE WISDOM OF THE CROWDS
49
THE WISDOM OF THE COWS
•  Seek the fence
where the grass
is greener on
the other side.
•  Learn from
there
•  Test on
here
•  Cluster to find
“here” and
“there”
50
ENVY =
THE WISDOM OF THE COWS
51
52
Franhouder july2013

Long journey of Ruby standard library at RubyConf AU 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 

Franhouder july2013

  • 2. SOUND BITES •  Ye olde worlde SE •  “The” model of SE (defects, effort, etc.) •  21st century SE •  Models (plural) •  No generality in models •  But, perhaps, generality in how we find those models •  Transfer learning
  • 4. WHAT IS TRANSFER LEARNING? •  Source = old = Domain1 = <Eg1, P1> •  Target = new = Domain2 = <Eg2, P2> •  If we move from Domain1 to Domain2, do we have to start afresh? •  Or can we learn faster in “new” … •  … using lessons learned from “old”? •  NSF funding (2013..2017): •  Transfer learning in Software Engineering •  Menzies, Layman, Shull, Diep
  • 5. WHO CARES? (WHAT’S AT STAKE?) •  “Transfer” is a core scientific issue •  Lack of transfer is the scandal of SE •  Replication in empirical SE is rare •  Conclusion instability •  It all depends. •  The full stop syndrome •  The result? •  A funding crisis
  • 6. MANUAL TRANSFER (WAR STORIES) •  Brazil, SEL, 2002: need domain knowledge (but now gone)? •  NSF, SEL, 2006: need better automatic support •  Kitchenham, Mendes et al, TSE 2007: for = against •  Zimmermann FSE, 2009: cross works in 4/600 times 6
  • 7. WAR STORIES (EFFORT ESTIMATION) $\mathit{Effort} = a \cdot \mathit{loc}^{x} \cdot y$ •  learned using Boehm’s methods •  20*66% of NASA93 •  COCOMO attributes •  Linear regression (log pre-processor) •  Sort the coefficients found for each member of x,y
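The "linear regression (log pre-processor)" step on slide 7 can be sketched as follows. This is a minimal illustration, not the procedure used in the talk: it fits only the size term, Effort = a · loc^x, by ordinary least squares on log-transformed values and leaves out the COCOMO multipliers (the y term). The function name `fit_power_law` and the five sample projects are hypothetical and are not NASA93 data.

```python
import math

def fit_power_law(locs, efforts):
    """Fit effort = a * loc**x by least squares on log(loc), log(effort)."""
    xs = [math.log(v) for v in locs]
    ys = [math.log(v) for v in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((u - mean_x) ** 2 for u in xs)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(xs, ys))
    exponent = sxy / sxx                      # slope in log space is x
    a = math.exp(mean_y - exponent * mean_x)  # intercept in log space is log(a)
    return a, exponent

# Hypothetical projects generated from effort = 3 * loc**1.1 (illustration only).
locs = [10.0, 25.0, 50.0, 100.0, 200.0]
efforts = [3.0 * loc ** 1.1 for loc in locs]
a, x = fit_power_law(locs, efforts)
```

Repeating the fit on many random subsamples and sorting the resulting coefficients is one way to see how stable (or unstable) the learned values are.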
  • 9. SMARTER TRANSFER (AUTOMATIC SUPPORT) •  Don’t use all available training data •  Use relevancy filtering [Turhan’09 ESE journal] •  Use variance pruning ← this talk •  Don’t use the raw attributes •  Most are rubbish anyway •  Feature selection [Chen’05, IEEE Software] •  Feature synthesis ← this talk
  • 10. ESEM, 2011 : How to Find Relevant Data for Effort Estimation TIM MENZIES, EKREM KOCAGUNELI
  • 11. USD DOD MILITARY PROJECTS (LAST DECADE) •  You must segment to find relevant data
  • 12. DOMAIN SEGMENTATIONS Q: What to do about rare zones? A: Select the nearest ones from the rest. But how?
  • 13. IN THE LITERATURE: WITHIN VS CROSS = ?? BEFORE THIS WORK Kitchenham et al. TSE 2007 •  Within-company learning (just use local data) •  Cross-company learning (just use data from other companies) Results mixed •  No clear win from cross or within Cross vs within are not rigid boundaries •  They are soft borders •  And we can move a few examples across the border •  And after making those moves •  “Cross” is the same as “local”
  • 14. SOME DATA DOES NOT DIVIDE NEATLY ON EXISTING DIMENSIONS
  • 15. THE LOCALITY(1) ASSUMPTION Data divides best on one attribute 1.  development centers of developers; 2.  project type, e.g. embedded, etc.; 3.  development language; 4.  application type (MIS, GNC, etc.); 5.  targeted hardware platform; 6.  in-house vs outsourced projects; 7.  etc. If Locality(1): hard to use data across these boundaries •  Then harder to build effort models: •  Need to collect local data (slow)
  • 16. THE LOCALITY(N) ASSUMPTION Data divides best on a combination of attributes If Locality(N): •  Easier to use data across these boundaries •  Relevant data spread all around •  Little diamonds floating in the dust
  • 17. HOW TO FIND RELEVANT TRAINING DATA?
                     independent attributes
                     w     x     y     z     class
      similar 1      0     1     1     1     2
      similar 2      0     1     1     1     3
      different 1    7     7     6     2     5
      different 2    1     9     1     8     8
      different 3    5     4     2     6     10
      alien 1        74    15    73    56    20
      alien 2        77    45    13    6     40
      alien 3        35    99    31    21    60
      alien 4        49    55    37    4     80
    Use similar? Use more variant? Use aliens?
  • 18. VARIANCE PRUNING
                     independent attributes
                     w     x     y     z     class
      similar 1      0     1     1     1     2
      similar 2      0     1     1     1     3
      different 1    7     7     6     2     5
      different 2    1     9     1     8     8
      different 3    5     4     2     6     10
      alien 1        74    15    73    56    20
      alien 2        77    45    13    6     40
      alien 3        35    99    31    21    60
      alien 4        49    55    37    4     80
    1) Sort the clusters by “variance” 2) Prune those high variance things 3) Estimate on the rest
    “Easy path”: cull the examples that hurt the learner
    [Figure labels: PRUNE! / KEEP!]
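The three steps on slide 18 can be sketched on the class column of the table above. This is only an illustration under stated assumptions: the rows are grouped into clusters by their name prefix (similar, different, alien), variance is the population variance of the class values, and the `keep_fraction` cut-off is a hypothetical parameter. It is not TEAK, which builds its clusters with GAC trees.

```python
from statistics import median, pvariance

# Class values from the slide's table, grouped by row-name prefix (an assumption).
clusters = {
    "similar": [2, 3],
    "different": [5, 8, 10],
    "alien": [20, 40, 60, 80],
}

def prune_by_variance(clusters, keep_fraction=0.67):
    """Sort clusters by class variance and keep the lowest-variance fraction."""
    ranked = sorted(clusters, key=lambda name: pvariance(clusters[name]))
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:n_keep]

# 1) sort by variance, 2) prune the high-variance clusters ...
kept = prune_by_variance(clusters)
# 3) ... and estimate on the rest (here: the median class value of what survives).
estimate = median(v for name in kept for v in clusters[name])
```

With these numbers the "alien" cluster has by far the largest class variance, so it is the one that gets culled.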
  • 19. TEAK: CLUSTERING + VARIANCE PRUNING (TSE, JAN 2011) •  TEAK is a variance-based instance selector •  It is built via GAC trees •  TEAK is a two-pass system •  First pass selects low-variance relevant projects •  Second pass retrieves projects to estimate from
  • 20. ESSENTIAL POINT TEAK finds local regions important to the estimation of particular cases TEAK finds those regions via locality(N) •  Not locality(1)
  • 21. WITHIN AND CROSS DATASETS Note: all Locality(1) divisions
  • 22. EXPERIMENT 1: PERFORMANCE COMPARISON OF WITHIN AND CROSS-SOURCE DATA •  TEAK on within & cross data for each dataset group (lines separate groups) •  LOOCV used for runs •  20 runs performed for each treatment •  Results evaluated w.r.t. MAR, MMRE, MdMRE and Pred(30), but see http://goo.gl/6q0tw •  If within data outperforms cross, the dataset is highlighted in gray •  Only 2 datasets are highlighted
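The evaluation measures named on slide 22 have standard definitions, sketched below. MRE is |actual − predicted| / actual; MAR is the mean absolute residual; MMRE and MdMRE are the mean and median MRE; Pred(30) is the percentage of estimates whose MRE is at most 0.30. The function name `summarize` and the four sample projects are hypothetical, not results from the talk.

```python
from statistics import mean, median

def summarize(actuals, predictions):
    """Compute MAR, MMRE, MdMRE and Pred(30) for paired actual/predicted efforts."""
    residuals = [abs(a - p) for a, p in zip(actuals, predictions)]
    mres = [r / a for r, a in zip(residuals, actuals)]
    return {
        "MAR": mean(residuals),       # mean absolute residual
        "MMRE": mean(mres),           # mean magnitude of relative error
        "MdMRE": median(mres),        # median magnitude of relative error
        "Pred(30)": 100.0 * sum(m <= 0.30 for m in mres) / len(mres),
    }

# Hypothetical actual vs. estimated efforts (illustration only).
actuals = [100, 200, 400, 50]
predictions = [110, 150, 400, 80]
scores = summarize(actuals, predictions)
```

Lower is better for MAR, MMRE and MdMRE; higher is better for Pred(30).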
  • 23. EXPERIMENT 2: RETRIEVAL TENDENCY OF TEAK FROM WITHIN AND CROSS-SOURCE DATA
  • 24. EXPERIMENT 2: RETRIEVAL TENDENCY OF TEAK FROM WITHIN AND CROSS-SOURCE DATA Diagonal (WC) vs. Off-Diagonal (CC) selection percentages sorted Percentiles of diagonals and off-diagonals
  • 25. HIGHLIGHTS 1.  Don’t listen to everyone •  When listening to a crowd, first filter the noise 2.  Once the noise clears: bits of me are similar to bits of you •  Probability of selecting cross or within instances is the same 3.  Cross-vs-within is not a useful distinction •  Locality(1) not informative •  Enables “cross-company” learning
  • 26. TSE, 2013 : LOCAL VS. GLOBAL MODELS FOR EFFORT ESTIMATION AND DEFECT PREDICTION TIM MENZIES, ANDREW BUTCHER (WVU) ANDRIAN MARCUS (WAYNE STATE) THOMAS ZIMMERMANN (MICROSOFT) DAVID COK (GRAMMATECH)
  • 28. ROOT CAUSE OF CONCLUSION INSTABILITY? HYPOTHESIS #1 Any one of…. •  Noisy data? •  Too little data? •  Poor statistical technique? •  Stochastic choice within data miner (e.g. random forests)? •  etc HYPOTHESIS #2 SE is an inherently varied activity •  So conclusion instability can’t be fixed •  It must be managed •  Needs different kinds of data miners 12/1/2011 28
  • 30. ENVY = THE WISDOM OF THE COWS •  Seek the fence where the grass is greener on the other side. •  Learn from there •  Test on here •  Cluster to find “here” and “there”
  • 31. DATA = MULTI-DIMENSIONAL VECTORS
      @attribute recordnumber real
      @attribute projectname {de,erb,gal,X,hst,slp,spl,Y}
      @attribute cat2 {Avionics, application_ground, avionicsmonitoring, … }
      @attribute center {1,2,3,4,5,6}
      @attribute year real
      @attribute mode {embedded,organic,semidetached}
      @attribute rely {vl,l,n,h,vh,xh}
      @attribute data {vl,l,n,h,vh,xh}
      …
      @attribute equivphyskloc real
      @attribute act_effort real
      @data
      1,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,25.9,117.6
      2,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,24.6,117.6
      3,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,7.7,31.2
      4,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,8.2,36
      5,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,9.7,25.2
      6,de,avionicsmonitoring,g,2,1979,semidetached,h,l,h,n,n,l,l,n,n,n,n,h,h,n,l,2.2,8.4
      …
  • 32. CAUTION: DATA MAY NOT DIVIDE NEATLY ON RAW DIMENSIONS The best description for SE projects may be synthesized dimensions extracted from the raw dimensions
  • 33. FASTMAP Fastmap: Faloutsos [1995] O(2N) generation of axis of large variability •  Pick any point W; •  Find X furthest from W; •  Find Y furthest from X. $c = \mathit{dist}(X,Y)$ All points have distances a, b to (X, Y) •  $x = (a^2 + c^2 - b^2)/(2c)$ •  $y = \sqrt{a^2 - x^2}$ Find median(x), median(y) Recurse on four quadrants
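The pivot selection and projection on slide 33 can be sketched as below. This is a minimal illustration, assuming Euclidean distance over numeric tuples and at least two distinct points (so that c > 0); the function names and the four sample points are hypothetical. The median split and the recursion into quadrants are left out.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))

def fastmap_axis(points, rng=random):
    """Pick two distant pivots in two passes, then project every point onto them."""
    w = rng.choice(points)                                  # any point W
    x_pivot = max(points, key=lambda p: dist(p, w))         # X: furthest from W
    y_pivot = max(points, key=lambda p: dist(p, x_pivot))   # Y: furthest from X
    c = dist(x_pivot, y_pivot)
    projected = []
    for p in points:
        a = dist(p, x_pivot)
        b = dist(p, y_pivot)
        x = (a * a + c * c - b * b) / (2 * c)    # cosine rule
        y = math.sqrt(max(0.0, a * a - x * x))   # clamp tiny negative rounding errors
        projected.append((x, y))
    return x_pivot, y_pivot, projected

# Hypothetical 2-D points (illustration only).
points = [(0, 0), (10, 0), (5, 5), (2, 1)]
x_pivot, y_pivot, projected = fastmap_axis(points, random.Random(0))
```

Each pass over the data costs N distance calculations, which is where the O(2N) on the slide comes from.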
  • 34. HIERARCHICAL PARTITIONING Grow: •  Find two orthogonal dimensions •  Find median(x), median(y) •  Recurse on four quadrants Prune: •  Combine quadtree leaves with similar densities •  Score each cluster by median score of class variable
  • 38. HIERARCHICAL PARTITIONING Grow: •  Find two orthogonal dimensions •  Find median(x), median(y) •  Recurse on four quadrants Prune: •  Combine quadtree leaves with similar densities •  Score each cluster by median score of class variable Where is grass greenest? •  This cluster envies its neighbor with better score and max abs(score(this) - score(neighbor))
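The envy rule on slide 38 can be sketched as a small selection function. This is an illustration only, assuming that a lower score is better (scores stand for median effort or defects per cluster) and that the neighbor relation is already known; the cluster names, scores and adjacency below are hypothetical.

```python
def envied_neighbor(cluster, neighbors, scores):
    """Return the better-scoring neighbor with the largest score gap, or None."""
    better = [n for n in neighbors[cluster] if scores[n] < scores[cluster]]
    if not better:
        return None  # nothing to envy: this cluster already beats its neighbors
    return max(better, key=lambda n: abs(scores[cluster] - scores[n]))

# Hypothetical median scores per cluster and a hypothetical adjacency map.
scores = {"c1": 300, "c2": 120, "c3": 450, "c4": 90}
neighbors = {
    "c1": ["c2", "c3"],
    "c2": ["c1", "c4"],
    "c3": ["c1"],
    "c4": ["c2"],
}
envy = {c: envied_neighbor(c, neighbors, scores) for c in scores}
```

Rules are then learned from the envied cluster ("there") and tested on the envious one ("here").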
  • 39. Q: HOW TO LEARN RULES FROM NEIGHBORING CLUSTERS? A: It doesn’t really matter •  Many competent rule learners But to evaluate global vs local rules: •  Use the same rule learner for local vs global rule learning This study uses WHICH (Menzies [2010]) •  Customizable scoring operator •  Faster termination •  Generates very small rules (good for explanation)
  • 40. DATA FROM HTTP://PROMISEDATA.ORG/DATA Effort reduction = {NasaCoc, China}: COCOMO or function points Defect reduction = {lucene, xalan, jedit, synapse, etc.}: CK metrics (OO) Clusters have untreated class distribution. Rules select a subset of the examples: •  generate a treated class distribution Distributions have percentiles (25th, 50th, 75th, 100th), compared for three treatments: •  untreated •  global: treated with rules learned from all data •  local: treated with rules learned from neighboring cluster [Figure: percentile plot, axis from 0 to 100]
  • 41. BY ANY MEASURE, LOCAL BETTER THAN GLOBAL •  Lower median efforts/defects (50th percentile) •  Greater stability (75th – 25th percentile) •  Decreased worst case (100th percentile)
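The three measures on slide 41 can be computed from any class distribution as sketched below. This is an illustration under stated assumptions: quartiles come from `statistics.quantiles` with the inclusive method, and the two sample distributions are hypothetical numbers, not results from the study.

```python
from statistics import quantiles

def percentile_summary(values):
    """Median, stability (75th - 25th) and worst case (100th) of a distribution."""
    q25, q50, q75 = quantiles(values, n=4, method="inclusive")
    return {
        "25th": q25,
        "50th": q50,             # median effort/defects: lower is better
        "75th": q75,
        "100th": max(values),    # worst case: lower is better
        "stability": q75 - q25,  # smaller spread means more stable
    }

# Hypothetical class distributions (illustration only).
untreated = percentile_summary([10, 20, 30, 40, 50])
local = percentile_summary([5, 8, 10, 12, 15])
```

A treatment is preferred when it lowers the median, narrows the 75th to 25th spread, and lowers the 100th percentile at the same time.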
  • 42. RULES LEARNED IN EACH CLUSTER What works best “here” does not work “there” •  Misguided to try and tame conclusion instability •  Inherent in the data Can’t tame conclusion instability. •  Instead, you can exploit it •  Learn local lessons that do better than overly generalized global theories 42
  • 43. RELATED WORK Other clustering methods: •  nbTree, Kohavi [1996] •  consensus clustering Outlier removal: •  Yin [2011], Yoon [2010] Clustering & case-based reasoning: •  Kocaguneli [2011], Turhan [2009], Cuadrado [2007] Design of experiments: •  Learn via envy •  Faster than N*M cross-val Localizations: •  Expert-based Petersen [2009]: how to know it is correct? •  This work: auto-learning of contexts Structured literature reviews: •  Kitchenham [2007] + others •  ?over-generalizations Anything in any SE textbook
  • 45. THE WISDOM OF THE CROWDS
  • 49. THE WISDOM OF THE COWS