Marc Stein, Underwrite.ai - Driverless AI Use Cases in Finance and Cancer Genomics - H2O World SF


This session was recorded in San Francisco on February 9th, 2019 and can be viewed here: https://youtu.be/6KY4CSA1AzU

Marc Stein is the founder and CEO of Underwrite.ai. Underwrite.ai applies advances in artificial intelligence derived from genomics and particle physics to provide lenders with non-linear, dynamic models of credit risk that radically outperform traditional approaches. Marc’s career has always revolved around deep interests in artificial intelligence, quantum physics, genomics, sugar cream pie, and all ice cream flavors found at Berthillon, along with the challenge of combining all of these in practical applications.

Published in: Technology


  1. Linear vs Nonlinear Credit Modeling. Marc Stein, Founder and CEO, Underwrite.ai. #H2OWORLD
  2. Korean Credit Market
     • Highly efficient credit system
     • Very low default rate and commensurately low interest rates
  3. How Credit Grade is Derived: This is a logistic regression model based upon four key attribute areas.
  4. Distribution of Credit Grades
  5. Efficiency of Current Model: Credit Grade AUC = 0.90640
  6. Efficiency of Current Model: This is a logistic regression model that works very well. It utilizes a small feature set very efficiently. This linear model is quite performant.
  7. Nonlinear Approach: But what if we take a nonlinear approach and use H2O and DAI to model the problem?
  8. Nonlinear Approach: Are there gains to be had by using 763 variables in a combinatorial manner in place of the linear model?
  9. Nonlinear Approach
     Experiment: CDS3, 2018-12-19 00:04, 1.4.2
     Settings: 8/5/5, seed=828672342, GPUs enabled
     Train data: CDS3_SELECTED Training.csv (60000, 67)
     Validation data: CDS3_Selected Validate.csv (30000, 67)
     Test data: CDS3_Selected Hold.csv (10000, 66)
     Target column: outcome (binary, 99.258% target class)
     System specs: Docker/Linux, 16 GB, 4 CPU cores, 1/1 GPU
     Max memory usage: 2.98 GB, 0.595 GB GPU
     Recipe: AutoDL (98 iterations, 8 individuals)
     Validation scheme: user-given validation data
     Feature engineering: 16749 features tested (210 selected)
     Timing:
       Data preparation: 8.89 secs
       Model and feature tuning: 640.33 secs (49 models trained)
       Feature evolution: 3085.32 secs (397 models trained)
       Final pipeline training: 148.83 secs (1 model trained)
     Validation score: AUC = 0.94953 +/- 0.0026775 (baseline)
     Validation score: AUC = 0.95162 +/- 0.0026263 (final pipeline)
     Test score: AUC = 0.95813 +/- 0.0072649 (final pipeline)
  10. Efficiency of Current Model vs DAI Model: Credit Grade AUC = 0.90640, DAI AUC = 0.95813
  11. Takeaway: A highly efficient logistic regression model can be significantly outperformed by a GBM model that incorporates more data.
  12. Less Efficient Models: US Case Study. Large consumer lender with an overall bad loan rate of 8.6%
  13. US Case Study
  14. Performance by Rate Tier
  15. Performance by Rate Decile
  16. Performance by FICO Decile
  17. Performance by CVLink Decile
  18. Performance by AI Decile
  19. Combined Performance
  20. Combined Performance
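
The slides compare models by AUC (slides 5 and 10) and by performance per score decile (slides 15 to 18), but the transcript does not show how either figure was computed. As a minimal sketch, assuming binary labels where 1 marks a bad loan and a score where higher means riskier, the two metrics could be computed in plain Python as below. Both conventions are assumptions, and the function names `auc` and `bad_rate_by_decile` are hypothetical, not taken from the talk or from Driverless AI.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC (Mann-Whitney U statistic); tied scores share an average rank.

    Assumption: label 1 = bad loan, and a higher score means higher predicted risk.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one example of each class")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bad_rate_by_decile(
    scores: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> List[float]:
    """Bad-loan rate per score bin, riskiest bin first.

    Loans are sorted by score (descending) and split into n_bins groups
    of near-equal size; each entry is the share of label-1 loans in a group.
    """
    n = len(scores)
    if n < n_bins:
        raise ValueError("need at least one loan per bin")
    order = sorted(range(n), key=lambda idx: scores[idx], reverse=True)
    rates = []
    for b in range(n_bins):
        group = order[b * n // n_bins : (b + 1) * n // n_bins]
        rates.append(sum(labels[idx] for idx in group) / len(group))
    return rates
```

Running both functions on the same hold-out set with two different scores (for example a credit grade and a model score) would give a like-for-like comparison: a score that ranks risk better yields a higher AUC and concentrates more of the bad loans in the first deciles.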
