SlideShare ist ein Scribd-Unternehmen logo
1 von 4
By Beibei Yang and Zheng Fang




January 7, 2009                                   1
Introduction
  Machine Learning: A computer program is said to learn from
  experience E with respect to some class of tasks T and performance
  measure P, if its performance at tasks in T, as measured by P, improves
  with experience E
    ith       i       E.
  Overfitting: Given a hypothesis space H, a hypothesis h H is said to
  overfit the training data if there exists some alternative hypothesis h’
  H,
  H such th t h h smaller error th h’ over th t i i examples, b t
         h that has        ll           than          the training      l   but
  h’ has smaller error than h over the entire distribution of instances.
  Decision tree: (aka classification trees or regression trees) In these
  tree structures, leaves represent classifications and b
  t     t t         l                  t l     ifi ti       d branches represent
                                                                   h            t
  conjunctions of features that lead to those classifications.
  Our implementation used the ID3 algorithm, which uses the entropy and
  gain of each node t create a classification t
      i f       h d to         t     l     ifi ti tree.




 January 7, 2009                                                                    2
Cause of Overfitting (1) Lack of training data
                      Standard
                         vs.
                    Insufficiency




  January 7, 2009                            3
Cause of Overfitting (2) Biased data
                    Standard
                       vs.
                      Bias




  January 7, 2009                      4

Weitere ähnliche Inhalte

Mehr von Beibei Yang

Hubway Half a Million Trip Data
Hubway Half a Million Trip DataHubway Half a Million Trip Data
Hubway Half a Million Trip DataBeibei Yang
 
Semantic Relatedness for Evaluation of Course Equivalencies
Semantic Relatedness for Evaluation of Course EquivalenciesSemantic Relatedness for Evaluation of Course Equivalencies
Semantic Relatedness for Evaluation of Course EquivalenciesBeibei Yang
 
Augmenting mobile 3 g using wifi
Augmenting mobile 3 g using wifiAugmenting mobile 3 g using wifi
Augmenting mobile 3 g using wifiBeibei Yang
 
91.650 Paper Presentation
91.650 Paper Presentation91.650 Paper Presentation
91.650 Paper PresentationBeibei Yang
 
Google Kernel Function
Google Kernel FunctionGoogle Kernel Function
Google Kernel FunctionBeibei Yang
 
Class Project Showcase: DNS Spoofing
Class Project Showcase: DNS SpoofingClass Project Showcase: DNS Spoofing
Class Project Showcase: DNS SpoofingBeibei Yang
 
Localization in HCI: Yahoo (US vs. China)
Localization in HCI: Yahoo (US vs. China)Localization in HCI: Yahoo (US vs. China)
Localization in HCI: Yahoo (US vs. China)Beibei Yang
 

Mehr von Beibei Yang (7)

Hubway Half a Million Trip Data
Hubway Half a Million Trip DataHubway Half a Million Trip Data
Hubway Half a Million Trip Data
 
Semantic Relatedness for Evaluation of Course Equivalencies
Semantic Relatedness for Evaluation of Course EquivalenciesSemantic Relatedness for Evaluation of Course Equivalencies
Semantic Relatedness for Evaluation of Course Equivalencies
 
Augmenting mobile 3 g using wifi
Augmenting mobile 3 g using wifiAugmenting mobile 3 g using wifi
Augmenting mobile 3 g using wifi
 
91.650 Paper Presentation
91.650 Paper Presentation91.650 Paper Presentation
91.650 Paper Presentation
 
Google Kernel Function
Google Kernel FunctionGoogle Kernel Function
Google Kernel Function
 
Class Project Showcase: DNS Spoofing
Class Project Showcase: DNS SpoofingClass Project Showcase: DNS Spoofing
Class Project Showcase: DNS Spoofing
 
Localization in HCI: Yahoo (US vs. China)
Localization in HCI: Yahoo (US vs. China)Localization in HCI: Yahoo (US vs. China)
Localization in HCI: Yahoo (US vs. China)
 

Kürzlich hochgeladen

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Kürzlich hochgeladen (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Class Project Showcase: Overfitting in Machine Learning

  • 1. By Beibei Yang and Zheng Fang January 7, 2009 1
  • 2. Introduction Machine Learning: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E ith i E. Overfitting: Given a hypothesis space H, a hypothesis h H is said to overfit the training data if there exists some alternative hypothesis h’ H, H such th t h h smaller error th h’ over th t i i examples, b t h that has ll than the training l but h’ has smaller error than h over the entire distribution of instances. Decision tree: (aka classification trees or regression trees) In these tree structures, leaves represent classifications and b t t t l t l ifi ti d branches represent h t conjunctions of features that lead to those classifications. Our implementation used the ID3 algorithm, which uses the entropy and gain of each node t create a classification t i f h d to t l ifi ti tree. January 7, 2009 2
  • 3. Cause of Overfitting (1) Lack of training data Standard vs. Insufficiency January 7, 2009 3
  • 4. Cause of Overfitting (2) Biased data Standard vs. Bias January 7, 2009 4