Linear vs Nonlinear
Credit Modeling
Marc Stein
Founder and CEO
Underwrite.ai
#H2OWORLD
Korean Credit Market
• Highly efficient credit system
• Very low default rate and commensurately low interest rates
How Credit Grade is Derived
The credit grade comes from a logistic regression model based upon four key attribute areas.
Distribution of Credit Grades
Efficiency of Current Model
Credit Grade AUC = 0.90640
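For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen bad loan is scored as riskier than a randomly chosen good loan. The sketch below computes it by direct pairwise comparison. The toy scores and labels are hypothetical and are not the data behind the slide; label 1 is assumed to mean a bad loan.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: higher score = riskier, label 1 = bad loan.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

toy_auc = auc(scores, labels)
print(f"toy AUC = {toy_auc:.5f}")
```

The pairwise loop is quadratic in the number of rows, which is fine for illustration; a rank-based formula would be the usual choice at the scale of a real credit portfolio.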
Efficiency of Current Model
This logistic regression model works very well.
It uses a small feature set very efficiently,
and as a linear model it performs strongly.
Nonlinear Approach
But what if we take a nonlinear
approach and use H2O Driverless AI (DAI)
to model the problem?
Nonlinear Approach
Are there gains to be had by using 763
variables in a combinatorial manner in
place of the linear model?
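As a toy illustration of why combining variables can help, consider an XOR-style target: neither input is informative alone, but their product is. The sketch below fits a small hand-written logistic regression twice, once on the two raw inputs and once with an added interaction column. Everything here (the data, the function names, the learning rate and step count) is a hypothetical setup for illustration and is unrelated to the credit data or to how Driverless AI builds features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, x):
    """Linear score: bias plus weighted sum of the features."""
    return weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))


def fit_logistic(rows, labels, steps=5000, lr=1.0):
    """Batch gradient descent on mean log loss; returns [bias, w1, w2, ...]."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = sigmoid(score(weights, x)) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# XOR-style toy target: positive only when exactly one input is 1.
raw = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

# Same rows with one extra "combinatorial" column: the product of the inputs.
crossed = [(a, b, a * b) for a, b in raw]

linear_w = fit_logistic(raw, labels)
crossed_w = fit_logistic(crossed, labels)

linear_auc = auc([score(linear_w, x) for x in raw], labels)
crossed_auc = auc([score(crossed_w, x) for x in crossed], labels)

print(f"raw inputs only:       AUC = {linear_auc:.2f}")
print(f"with interaction term: AUC = {crossed_auc:.2f}")
```

On this toy target the purely linear fit cannot rank the rows at all, while the single interaction column makes the classes separable. Real credit data will not be this stark, so the example only shows the mechanism, not the size of the gain.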
Nonlinear Approach
Experiment: CDS3, 2018-12-19 00:04, 1.4.2
Settings: 8/5/5, seed=828672342, GPUs enabled
Train data: CDS3_SELECTED Training.csv (60000, 67)
Validation data: CDS3_Selected Validate.csv (30000, 67)
Test data: CDS3_Selected Hold.csv (10000, 66)
Target column: outcome (binary, 99.258% target class)
System specs: Docker/Linux, 16 GB, 4 CPU cores, 1/1 GPU
Max memory usage: 2.98 GB, 0.595 GB GPU
Recipe: AutoDL (98 iterations, 8 individuals)
Validation scheme: user-given validation data
Feature engineering: 16749 features tested (210 selected)
Timing:
Data preparation: 8.89 secs
Model and feature tuning: 640.33 secs (49 models trained)
Feature evolution: 3085.32 secs (397 models trained)
Final pipeline training: 148.83 secs (1 model trained)
Validation score: AUC = 0.94953 +/- 0.0026775 (baseline)
Validation score: AUC = 0.95162 +/- 0.0026263 (final pipeline)
Test score: AUC = 0.95813 +/- 0.0072649 (final pipeline)
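A quick back-of-the-envelope check on the class balance in this summary: if the 99.258% figure is the majority-class share and it holds roughly equally in each file (neither is stated on the slide, so both are assumptions), the number of minority-class rows per split can be estimated as follows.

```python
# Assumption: 99.258% is the majority-class share in every split.
majority_share = 0.99258

# Row counts as reported in the experiment summary.
split_rows = {"train": 60000, "validation": 30000, "test": 10000}

# Expected number of minority-class rows in each split.
minority_rows = {
    name: rows * (1.0 - majority_share) for name, rows in split_rows.items()
}

for name, count in minority_rows.items():
    print(f"{name:<10} ~{count:.0f} minority-class rows")
```

Under those assumptions the hold-out file contains only a few dozen minority-class rows, which would be consistent with the test score carrying a wider +/- band than the validation score.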
Efficiency of Current Model vs DAI Model
Credit Grade AUC = 0.90640
DAI AUC = 0.95813
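To put the two figures on a common footing, the arithmetic below computes the absolute AUC gain, the equivalent Gini coefficients (Gini = 2 x AUC - 1), and the share of the remaining gap to a perfect AUC of 1.0 that the DAI model closes. The slides do not state whether both AUCs were measured on the same hold-out sample, so the comparison is indicative only.

```python
credit_grade_auc = 0.90640  # logistic regression credit grade
dai_auc = 0.95813           # Driverless AI final pipeline, test score


def gini(auc_value):
    """Gini coefficient as commonly used in credit scoring."""
    return 2.0 * auc_value - 1.0


# Absolute improvement in AUC.
auc_gain = dai_auc - credit_grade_auc

# Fraction of the distance to a perfect AUC of 1.0 that is closed.
shortfall_closed = auc_gain / (1.0 - credit_grade_auc)

print(f"AUC gain:          {auc_gain:.5f}")
print(f"Gini, grade model: {gini(credit_grade_auc):.5f}")
print(f"Gini, DAI model:   {gini(dai_auc):.5f}")
print(f"Shortfall closed:  {shortfall_closed:.1%}")
```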
Take Away
Even a highly efficient logistic regression model can
be significantly outperformed by a gradient boosting
machine (GBM) that incorporates more data.
Less Efficient Models
US Case Study
Large consumer lender with an overall
bad loan rate of 8.6%
US Case Study
Performance by Rate Tier
Performance by Rate Decile
Performance by FICO Decile
Performance by CVLink Decile
Performance by AI Decile
Combined Performance
Combined Performance
Marc Stein, Underwrite.ai - Driverless AI Use Cases in Finance and Cancer Genomics - H2O World SF

