bagusco@gmail.com
bagusco@ipb.ac.id
KDD Cup 2010: Overview
• The Challenge
   – How generally or narrowly do students learn? How quickly or
     slowly? Will the rate of improvement vary between students?
     What does it mean for one problem to be similar to another?

   – Is it possible to infer the knowledge requirements of problems
     directly from student performance data, without human analysis
     of the tasks?

   – This year's challenge asks you to predict student performance
     on mathematical problems from logs of student interaction with
     Intelligent Tutoring Systems.
KDD Cup 2010: Results
• Winners of KDD Cup 2010: All Teams
   – First Place: National Taiwan University
     Feature engineering and classifier ensembling for KDD CUP
     2010

   – First Runner Up: Zhang and Su
     Gradient Boosting Machines with Singular Value Decomposition

   – Second Runner Up: BigChaos @ KDD
     Collaborative Filtering Applied to Educational Data Mining
Outline
•   What is Ensemble Learning?
•   Why Ensemble?
•   How good is Ensemble?
•   What next?
Predictive Modeling
• Widely-used in many applications:
  – Business
     • Churn modeling, Scoring
  – Science
     • Chemometrics
  – Bio-Science
     • Efficacy modeling, Classification
  – Academics
     • Admission selection, student performance
Predictive Modeling
Training Set → Model Development → Predictive Rules → Prediction
                                          ↑
                                     New Data Set
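The workflow on this slide can be sketched in a few lines of Python. This is only an illustration: the pass/fail data, the single-threshold rule, and the function names `develop_model` and `predictive_rule` are assumptions made for the example, not part of the original slides.

```python
def develop_model(training_set):
    """Model development: pick the threshold with the fewest training errors."""
    candidates = sorted({x for x, _ in training_set})

    def errors(threshold):
        # Count cases where the rule "x >= threshold" disagrees with the label.
        return sum((x >= threshold) != y for x, y in training_set)

    return min(candidates, key=errors)


def predictive_rule(threshold):
    """Predictive rule: a function that can be applied to unseen cases."""
    return lambda x: x >= threshold


# Hypothetical training set: (hours studied, passed the exam?)
training_set = [(1, False), (2, False), (3, False),
                (5, True), (6, True), (8, True)]

threshold = develop_model(training_set)
rule = predictive_rule(threshold)

# New data set: apply the learned rule to cases without known outcomes.
new_data = [2.5, 7]
predictions = [rule(x) for x in new_data]
print(threshold, predictions)
```

The training set is used only to develop the rule; the new data set is touched only at prediction time.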
Classical Approach: Model Selection
• Which one is the best?
New Approach?: Ensemble
• Combine all models!
What is Ensemble?
• Single Expert   vs   Team of Experts
What is Ensemble?
Data Set
  → Training Set #1 → Learning → Model #1
  → Training Set #2 → Learning → Model #2
  → ……
  → Training Set #k → Learning → Model #k
Model #1, Model #2, ……, Model #k → Combiner → Ensemble Prediction
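The combiner step in the diagram can be illustrated with a minimal sketch. The three rule-based models, the customer record, and the names `majority_vote`, `average`, and `ensemble_predict` are hypothetical and chosen only for this example; real models would be learned from the k training sets.

```python
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Combiner for class labels: return the most common prediction."""
    return Counter(labels).most_common(1)[0][0]


def average(scores):
    """Combiner for numeric scores: return the mean prediction."""
    return mean(scores)


def ensemble_predict(models, x, combiner):
    """Collect one prediction per model, then pass them to the combiner."""
    return combiner([model(x) for model in models])


# Three hypothetical models for a churn-style decision.
label_models = [
    lambda c: "churn" if c["complaints"] >= 2 else "stay",
    lambda c: "churn" if c["months_active"] < 6 else "stay",
    lambda c: "churn" if c["monthly_spend"] < 10 else "stay",
]
customer = {"complaints": 3, "months_active": 24, "monthly_spend": 5}
label = ensemble_predict(label_models, customer, majority_vote)

# The same pattern with numeric scores and an averaging combiner.
score_models = [lambda c: 0.2, lambda c: 0.4, lambda c: 0.9]
score = ensemble_predict(score_models, customer, average)
print(label, score)
```

Two of the three models vote "churn", so the majority-vote combiner returns "churn" even though one model disagrees.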
Types of Ensemble
• Hybrid Ensemble
  – Combining several different learning algorithms into
    one prediction
  – e.g., combining the results of regression, a tree, neural
    nets, and a support vector machine

• Non-Hybrid Ensemble
  – Combining several learning models from the same
    algorithm into one prediction
Well-Known Ensembles
• Bagging
  – Fit one learning model to each bootstrap sample
  – Aggregate the predictions via averaging or majority vote
• Boosting (AdaBoost)
  – Fit learning models sequentially, giving higher weight to
    ‘difficult’ (previously misclassified) cases
  – Combine the predictions with a weighted vote
• Random Forest
  – Similar to bagging, with the addition of random feature
    selection when each learning model is generated
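As a rough illustration of bagging, the sketch below fits a one-feature threshold rule (a "stump") to each bootstrap sample and combines the results by majority vote. The toy data, the stump learner, and the function names are assumptions made for this example; they are not taken from the slides or from any particular library.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Base learner: the threshold rule with the fewest errors on the sample."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for above in (0, 1):  # label predicted when x >= t
            errors = sum((above if x >= t else 1 - above) != y
                         for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x >= t else 1 - above


def bagging(data, n_models, rng):
    """Fit one stump per bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        models.append(fit_stump(sample))
    return models


def predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


rng = random.Random(0)                         # fixed seed for repeatability
data = [(x, int(x >= 10)) for x in range(20)]  # toy data: label is 1 when x >= 10
models = bagging(data, n_models=25, rng=rng)
print(predict(models, 0), predict(models, 19))
```

Each stump sees a slightly different sample, so the thresholds differ from model to model; the vote smooths out those differences.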
How Good is Ensemble?
[Figure: error rate (vertical axis, 0 to 0.7) of tree, bagging, and adaboost; horizontal axis numbered 1 to 33]
  Source: Dietterich (1999)
How Good is Ensemble?
[Figure: AUC (vertical axis, 0.4 to 0.9) of CART, C45, Bagging, Random Forest, Rotation Forest, and Rotation Boost for DIY, Bank, Telecom1, and Mail-order]
Source: Bock & Poel (2011)
What Next
• Ensemble Predictive Models

• Class-Imbalance Models
  – Gradient Boosting, EasyEnsemble, BalanceCascade, SMOTEBoost

• Robust Predictive Models
  – Noise Ensemble
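One idea behind class-imbalance ensembles such as EasyEnsemble is to train each model on a balanced subset: all minority cases plus an equally sized random draw from the majority class. The sketch below shows only that sampling step, under assumed toy data and a hypothetical function name `balanced_subsets`; the published method also fits a boosted classifier to each subset, which is omitted here.

```python
import random


def balanced_subsets(majority, minority, n_subsets, rng):
    """Build subsets that pair every minority case with an equally sized
    random draw (without replacement) from the majority class."""
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, len(minority))
        subsets.append(drawn + minority)
    return subsets


rng = random.Random(42)                          # fixed seed for repeatability
majority = [(x, 0) for x in range(95)]           # 95 cases of the common class
minority = [(x, 1) for x in range(95, 100)]      # 5 cases of the rare class
subsets = balanced_subsets(majority, minority, n_subsets=4, rng=rng)

for subset in subsets:
    positives = sum(label for _, label in subset)
    print(len(subset), positives)
```

Each subset holds 10 cases, half from each class, and a separate model would then be trained on each one before combining their predictions.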
Ensemble in SAS/EM
THANK YOU
Bagus Sartono
Educational Background
• Bachelor of Science in Stats – IPB (2000)
• Master of Science in Stats – IPB (2004)
• PhD in Applied Economics – University of Antwerp (2012)
Professional Experience
• Lecturer – Dept of Stats, IPB
• Experienced Trainer in Analytics (Bank Indonesia, Bank Mandiri, Ganesha Cipta Informatika, CIFOR, LIPI, LPEM-UI, etc.)

Improving the Model’s Predictive Power with Ensemble Approaches
