Forecasting Peer-to-Peer Lending Risk
Archange Giscard Destine
Steven Lerner
Erblin Mehmetaj
Hetal Shah
September 10, 2016
Forbes
Peer-to-Peer Lending
• Investors and borrowers are linked by online service providers
• Growing rapidly
– $5.5B in the U.S. in 2014
– Over 100% annual growth rate today
– Expected to be a major player in consumer financing – over $150B by 2025
– Lending Club is the clear market leader
How Does It Work?
Borrowers
• Unsecured loan
• Rates often below credit cards
• Done online: quick and easy
Investors
• Higher rates, from 4% to 25+%
• Ability to spread risk: invest as little as $25 per loan
Lending Club
• Collects ~5% fee up front
• Collects ~1% on all loan payments
• Pursues collections
But roughly 14% of loans end in default, and all risk is assumed by the investor
Objectives
Current
Develop a tool to help investors avoid loans likely to default
A model to forecast probability of default, given loan information … emphasize default recall versus precision
Future Work
For investors interested in taking more risk, develop a tool to determine effective interest rate
A model forecasting impact of default (x, fraction of loan value)
Effective interest rate (z):
$z = \sqrt[n]{(1+i)^n - p \cdot x}$
where i = original interest
n = loan duration, yrs
p = probability of default
x = impact of default, fraction of loan value
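The effective interest rate expression can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample inputs are hypothetical, and the formula is reproduced as written on the slide. As written, the n-th root yields a growth factor (one plus a rate), so the second helper subtracts 1 to express it as a rate; that subtraction is an assumption, not something the slide states.

```python
def effective_growth_factor(i, n, p, x):
    """Evaluate the slide's expression: the n-th root of (1 + i)**n - p * x.

    i: original annual interest rate (0.12 means 12%)
    n: loan duration in years
    p: probability of default
    x: impact of default, as a fraction of loan value
    """
    total = (1.0 + i) ** n - p * x
    if total < 0:
        raise ValueError("expected loss exceeds the compounded loan value")
    return total ** (1.0 / n)


def effective_rate(i, n, p, x):
    """Assumption: report the growth factor minus 1 as an annual rate."""
    return effective_growth_factor(i, n, p, x) - 1.0


# Hypothetical inputs: a 3-year loan at 12%, with a 14% chance of default
# and half of the loan value lost when a default occurs.
print(round(effective_rate(0.12, 3, 0.14, 0.5), 4))
```

With p = 0 the helper returns the original rate, which is a quick sanity check on the expression.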
What’s Different From Prior Work
• Lending Club’s new historical data set increases modeling difficulty
• Other studies ignored macroeconomic features … which are important
[Figures: “Unsecured Personal Loan Delinquencies, 2Q16” and “Unemployment Rate and Charge-Off Rate” (0% to 12%, over 36 quarters); labeled values 1.3% and 7.7%; source: TransUnion]
Data Selection
• Loan data on completed loans from the Lending Club website
• Macroeconomic data
Measure State Fed. Value Slope* Reflection of:
Unemployment X X X Job loss & replacement difficulty
GDP X X X Overall economic activity
Disposable income X X X Cost/wage pressure
10-yr to 3-m T-bill spread X X Future economic growth
3-yr T-bill rate X X Short term inflation
Credit card rate (average) X X Alternative borrowing costs
* Slope is for 12 months prior, based on expert input
Data Ingestion: Sources
• Loan data: Lending Club website
– 111 features for each loan
– Historical data since June 2007
• Macroeconomic data
– Federal Reserve
– Bureau of Economic Analysis
– Bureau of Labor Statistics
– Cardhub
– National Conference of State Legislatures
• Collected data stored in data archive (PostgreSQL DB)
Data Ingestion → Wrangling → Computation / Analysis → Modeling → Reporting / Visualization
Data Wrangling… a big time consumer
220K instances, 111 features
• Initial data reduction
– 111 historical features → 29 features provided to investors
– Date range reduction to completed loans
• Data verification and cleanup
– Verify loan uniqueness
– Eliminate redundant data
– Eliminate non-informative features (URLs, free-form text, extremely sparse data, etc.)
– Trim entries: “months”, “%”, “+”, “years”, etc.
– Verify geographic scope
– Select uniform date structure for analysis and merging
– Address data that is both numeric and categorical
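The "trim entries" step in the cleanup list can be illustrated with a small helper that pulls the numeric part out of a raw text field. This is a minimal sketch, not the authors' code: the function name and the sample strings are assumptions chosen to resemble the suffixes the slide mentions.

```python
import re

# Matches the first integer or decimal number in a string.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def to_number(raw):
    """Return the first number found in a raw field as a float, or None."""
    if raw is None:
        return None
    match = _NUMBER.search(str(raw).replace(",", ""))
    return float(match.group()) if match else None


# Hypothetical raw values with the kinds of suffixes named on the slide.
for value in [" 36 months", "13.56%", "10+ years", "n/a"]:
    print(repr(value), "->", to_number(value))
```

Returning None for unparseable entries keeps them visible for the later NaN handling rather than silently coercing them to zero.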
Data Wrangling (cont’d)
• Address all NaN entries
• Analyze outliers
• Economic calculations
– Least-squares slopes
– Interpolating for quarterly and annual data
• Wrangle economic data: trimming entries and using consistent format
• Merge economic and loan data
Categorical and numerical wrangled data frames: 84K instances, 30 features (21 loan, 9 economic)
Surprise learning: LC only verifies data for 31% of loans!
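The least-squares slopes listed under economic calculations can be sketched as an ordinary linear fit over a fixed window of monthly values. The code below is an illustrative sketch only: the function names are hypothetical, the 12-month window follows the footnote on the Data Selection slide, and the sample series is made up.

```python
def least_squares_slope(values):
    """Slope of the ordinary least-squares line through (0, v0), (1, v1), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    x_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def trailing_slope(series, end_index, window=12):
    """Slope over the `window` observations ending at `end_index` (inclusive)."""
    start = end_index - window + 1
    if start < 0:
        raise ValueError("not enough history for the requested window")
    return least_squares_slope(series[start:end_index + 1])


# Made-up monthly unemployment rates (percent), falling 0.1 points per month.
unemployment = [8.0 - 0.1 * month for month in range(24)]
print(trailing_slope(unemployment, end_index=23))
```

The slope is in units of the measure per month, so a negative value here means unemployment was falling over the window.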
Data Analysis
• Initial data analysis shows little separation based on features
• What separation there is appears to be driven by macroeconomic variables
[Figure: plots comparing Paid and Default loans]
Data Analysis (cont’d)
Features initially deemed important showed little differentiation
[Figure: Default versus Paid, with the Overlap marked]
Modeling
• Tested several modeling algorithms
– Logistic Regression
– Random Forest
– Naïve Bayes (Bernoulli, Gaussian, Multinomial)
– K-Nearest Neighbors
– Gradient Boosting
– Voting Classifier
• Manual feature exploration
• Created pipeline
– Standardization
– Feature reduction via PCA and LDA
Best recall was 0.58 to 0.62 … was imbalanced data the issue?
Modeling (cont’d)
[Figure: Feature importance for random forest; Annual income is labeled]
[Figure: Feature importance for logistic regression; Annual income is labeled]
Modeling (cont’d.)
• Balanced data set via undersampling paid loans
– Little improvement
– Losing lots of instances
• Added hyper-parameter tuning using GridSearch … little improvement
• Balanced data via oversampling defaulted loans
– Extracted a representative data sample (85/15, paid/default) and set it aside
– Replicated the remaining defaults 6×
– Trained the model using an 80/20 split
– Ran the final test against the extracted (unseen) data
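The oversampling procedure above can be sketched with the standard library alone. This is a hedged illustration rather than the authors' code: the function name, the record layout (`id`, `status`), the hold-out fraction, and the seed are all assumptions. Only the 6× replication and the 80/20 split come from the slide.

```python
import random


def build_oversampled_sets(loans, holdout_frac=0.1, default_multiplier=6,
                           train_frac=0.8, seed=0):
    """Split loans into (train, validation, holdout).

    The holdout keeps the natural paid/default mix and is never oversampled.
    Defaults in the remaining data are replicated `default_multiplier` times
    before the train/validation split.
    """
    rng = random.Random(seed)
    shuffled = list(loans)
    rng.shuffle(shuffled)

    n_holdout = int(len(shuffled) * holdout_frac)
    holdout = shuffled[:n_holdout]
    remaining = shuffled[n_holdout:]

    paid = [loan for loan in remaining if loan["status"] == "paid"]
    defaults = [loan for loan in remaining if loan["status"] == "default"]

    balanced = paid + defaults * default_multiplier
    rng.shuffle(balanced)

    cut = int(len(balanced) * train_frac)
    return balanced[:cut], balanced[cut:], holdout


# Hypothetical data with roughly the 85/15 paid/default mix from the slide.
loans = [{"id": k, "status": "default" if k < 150 else "paid"}
         for k in range(1000)]
train, validation, holdout = build_oversampled_sets(loans)
print(len(train), len(validation), len(holdout))
```

One design consequence worth noting: because defaults are replicated before the 80/20 split, copies of the same loan can land in both the training and validation sets, so the held-out sample is the cleaner estimate of real-world performance.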
De minimis improvements
Modeling (cont’d)
• Sought expert advice
– Financial experts
– Modeling experts
• Adjusted feature set
– More responsive economic input
• 36/60 month lagging slopes → 12 month leading slopes
• 36/60 month averages → point values
– Added critical ratios and indices to expand feature set
• Tested binary encoding
De minimis improvements
Made a strategic decision to modify class weight to enhance default recall at the expense of default precision
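The effect of modifying class weight can be illustrated with a class-weighted log loss, in which each example's penalty is scaled by the weight of its true class. This is a generic sketch, not any particular library's implementation: the function names are hypothetical, and the 0.7 / 0.3 weights are borrowed from the logistic regression results reported in this deck.

```python
import math


def weighted_penalty(label, p_default, class_weight):
    """Log-loss penalty for one loan, scaled by the weight of its true class.

    label: 1 for default, 0 for paid
    p_default: predicted probability that the loan defaults
    """
    p = min(max(p_default, 1e-12), 1.0 - 1e-12)  # avoid log(0)
    loss = -math.log(p) if label == 1 else -math.log(1.0 - p)
    return class_weight[label] * loss


def weighted_log_loss(labels, probabilities, class_weight):
    """Average weighted penalty over a set of loans."""
    penalties = [weighted_penalty(y, p, class_weight)
                 for y, p in zip(labels, probabilities)]
    return sum(penalties) / len(penalties)


weights = {1: 0.7, 0: 0.3}  # default weighted more heavily than paid

# Two equally confident mistakes: a missed default and a false alarm.
missed_default = weighted_penalty(1, 0.2, weights)
false_alarm = weighted_penalty(0, 0.8, weights)
print(missed_default / false_alarm)
```

Because a missed default costs more than an equally confident false alarm, a model trained against such a loss leans toward flagging loans as defaults, which raises default recall and lowers default precision.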
Modeling: Metrics
Targeted 90+% default recall and 90+% paid precision
• Default recall = defaults identified / total defaults
• Paid precision = paids identified correctly / total instances identified as paid
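The two target metrics can be written directly from their definitions. The sketch below assumes labels are the strings "default" and "paid"; the function names and the toy data are hypothetical.

```python
def default_recall(actual, predicted):
    """Defaults identified / total defaults."""
    calls_on_defaults = [p for a, p in zip(actual, predicted) if a == "default"]
    if not calls_on_defaults:
        raise ValueError("no actual defaults in the data")
    return sum(p == "default" for p in calls_on_defaults) / len(calls_on_defaults)


def paid_precision(actual, predicted):
    """Paids identified correctly / total instances identified as paid."""
    truth_for_paid_calls = [a for a, p in zip(actual, predicted) if p == "paid"]
    if not truth_for_paid_calls:
        raise ValueError("no loans were predicted as paid")
    return sum(a == "paid" for a in truth_for_paid_calls) / len(truth_for_paid_calls)


# Toy example: two defaults and three paid loans.
actual = ["default", "default", "paid", "paid", "paid"]
predicted = ["default", "paid", "paid", "paid", "default"]
print(default_recall(actual, predicted), paid_precision(actual, predicted))
```

Default recall measures how many bad loans the tool catches, while paid precision measures how trustworthy a "paid" call is, which is what an investor acting on the tool's recommendations would care about.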
Modeling (cont’d)
Logistic Regression          Precision  Recall  F1 Score  Support
Default (weight = 0.7)       0.52       0.94    0.67      13,568
Paid (weight = 0.3)          0.77       0.20    0.31      14,547
Unseen / Imbalanced Results
Default                      0.16       0.97    0.20      115
Paid                         0.97       0.18    0.30      734
Random Forest                Precision  Recall  F1 Score  Support
Default (weight = 0.6)       0.53       0.92    0.68      13,568
Paid (weight = 0.4)          0.77       0.25    0.38      14,547
Unseen / Imbalanced Results
Default                      0.16       0.95    0.28      115
Paid                         0.97       0.24    0.39      734
What does default recall = 0.97 and default precision = 0.16 look like?
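One way to picture a default recall of 0.97 with a default precision of 0.16 is to back out approximate loan counts from the unseen-data support figures (115 defaults, 734 paid). The helper below is a sketch with hypothetical names. Because the reported metrics are rounded to two decimals, the counts it produces are approximate and will not reproduce every other figure in the table exactly.

```python
def counts_from_metrics(recall, precision, n_default, n_paid):
    """Approximate confusion-matrix counts for the default class."""
    caught = round(recall * n_default)      # defaults flagged as default
    missed = n_default - caught             # defaults labeled paid
    flagged = round(caught / precision)     # all loans flagged as default
    false_alarms = flagged - caught         # paid loans flagged as default
    cleared = n_paid - false_alarms         # paid loans labeled paid
    return {
        "caught": caught,
        "missed": missed,
        "false_alarms": false_alarms,
        "cleared": cleared,
    }


counts = counts_from_metrics(recall=0.97, precision=0.16,
                             n_default=115, n_paid=734)
print(counts)
```

Under these approximations the model catches roughly 112 of the 115 defaults but also flags on the order of 590 of the 734 loans that were actually paid, leaving about 150 loans it is willing to label as paid.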
Reporting
• Tool (online) to predict loan status and probability of default
– Investor enters loan info
– Tool fetches macroeconomic data
– The data above is passed to a web service, which executes the model and returns the predicted loan status and probability
• Tool developed using
– Flask interface with machine learning model as a RESTful webservice
– Jinja2 template
– HTML/CSS
– JavaScript
Demo
Conclusions
• Model effectively sequesters loans likely to default (97% default recall)
• Model cherry-picks loans not likely to default (97% paid precision)
• Achieving the above required class weighting, which drives default recall at the expense of default precision … potentially good loans are misclassified as default
• Root causes appear to be lack of data separation, lack of feature relevance and imbalanced data
Future Work
Project specific
• Can we maintain recall and drive up precision by using logistic regression on the total dataset followed by random forest on potential defaults?
• Can we identify or create more relevant features?
• Can we develop a tool for aggressive investors, providing impact of default?
General opportunity space around highly imbalanced data
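The two-stage idea raised in the first project-specific question can be sketched as a cascade: a first model screens every loan, and a second model re-examines only the loans the first one flags. The sketch below is an assumption-laden illustration. The function names are hypothetical, and the two stages are stand-in rules on made-up fields, not trained logistic regression or random forest models.

```python
def cascade_predict(loans, stage_one_is_risky, stage_two_is_default):
    """Label each loan using a high-recall screen followed by a second look.

    stage_one_is_risky: callable applied to every loan (the broad screen)
    stage_two_is_default: callable applied only to loans stage one flags
    """
    predictions = []
    for loan in loans:
        if not stage_one_is_risky(loan):
            predictions.append("paid")       # cleared by the first stage
        elif stage_two_is_default(loan):
            predictions.append("default")    # confirmed by the second stage
        else:
            predictions.append("paid")       # flagged, then cleared
    return predictions


# Stand-in rules on hypothetical fields; real stages would be fitted models.
def screen(loan):
    return loan["dti"] > 15


def second_look(loan):
    return loan["income"] < 40000


sample = [
    {"dti": 10, "income": 30000},
    {"dti": 25, "income": 90000},
    {"dti": 25, "income": 30000},
]
print(cascade_predict(sample, screen, second_look))
```

Any precision gain depends on the second stage separating true defaults from false alarms, and every loan it clears in error lowers overall default recall.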
[Figure: Logistic Regression and Random Forest]
The authors would like to recognize the open source software that made this work possible
Questions?
Archange Giscard Destine ad1373@georgetown.edu
Steven Lerner sll93@georgetown.edu
Erblin Mehmetaj em1109@georgetown.edu
Hetal Shah hrs41@georgetown.edu

Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 

Forecasting P2P Credit Risk based on Lending Club data

  • 1. Forecasting Peer-to-Peer Lending Risk
      Archange Giscard Destine, Steven Lerner, Erblin Mehmetaj, Hetal Shah
      September 10, 2016
      Forbes
  • 2. Peer-to-Peer Lending
      • Investors and borrowers are linked by online service providers
      • Growing rapidly
        – $5.5B in the U.S. in 2014
        – Over 100% annual growth rate today
        – Expected to be a major player in consumer financing: over $150B by 2025
        – Lending Club is the clear market leader
  • 3. How Does It Work?
      Borrowers
        • Unsecured loan
        • Rates often below credit cards
        • Done online: quick and easy
      Investors
        • Higher rates, from 4% to 25+%
        • Ability to spread risk: invest as little as $25 per loan
      Lending Club
        • Collects ~5% fee up front
        • Collects ~1% on all loan payments
        • Pursues collections
      But roughly 14% of loans end in default, and all risk is assumed by the investor.
  • 4. Objectives
      Current
        • Develop a tool to help investors avoid loans likely to default
        • A model to forecast probability of default, given loan information, emphasizing default recall over precision
      Future Work
        • For investors interested in taking more risk, develop a tool to determine effective interest rate
        • A model forecasting impact of default (x, fraction of loan value)
        • Effective interest rate: $z = \sqrt[n]{(1+i)^n - p\,x}$
          where i = original interest, n = loan duration (years), p = probability of default
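The effective-rate expression on slide 4 can be evaluated directly. The sketch below is a minimal illustration, not the authors' tool: the function name and the sample inputs are hypothetical. It also relies on one reading of the slide. As written, the expression equals 1 + i when p = 0, so it behaves like an annual growth factor; subtracting 1 to obtain a rate is an assumption made here, not something the slide states.

```python
def effective_growth_factor(i, n, p, x):
    """Evaluate the slide's expression: the n-th root of [(1 + i)^n - p * x].

    i: original annual interest rate (0.10 means 10%)
    n: loan duration in years
    p: probability of default
    x: impact of default, as a fraction of loan value
    """
    base = (1.0 + i) ** n - p * x
    if base < 0:
        raise ValueError("expected loss exceeds the compounded value")
    return base ** (1.0 / n)


def effective_rate(i, n, p, x):
    # Assumption (not on the slide): subtract 1 to turn the factor into a rate.
    return effective_growth_factor(i, n, p, x) - 1.0
```

With no default risk (p = 0) the factor reduces to 1 + i, which is a quick sanity check on any inputs.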
  • 5. What’s Different Than Prior Work
      • Lending Club’s new historical data set increases modeling difficulty
      • Other studies ignored macroeconomic features, which are important
      [Figure: Unemployment Rate and Charge Off Rate, over 36 quarters; axis 0% to 12%]
      [Figure: Unsecured Personal Loan Delinquencies, 2Q16; 1.3% to 7.7%; TransUnion]
  • 6. Data Selection
      • Loan data on completed loans from the Lending Club website
      • Macroeconomic data
      Measure | State / Fed. / Value / Slope* | Reflection of
      Unemployment | X X X | Job loss & replacement difficulty
      GDP | X X X | Overall economic activity
      Disposable income | X X X | Cost/wage pressure
      10-yr to 3-m T-bill spread | X X | Future economic growth
      3-yr T-bill rate | X X | Short term inflation
      Credit card rate (average) | X X | Alternative borrowing costs
      * Slope is for 12 months prior, based on expert input
  • 7. Data Ingestion: Sources
      • Loan data: Lending Club website
        – 111 features for each loan
        – Historical data since June 2007
      • Macroeconomic data
        – Federal Reserve
        – Bureau of Economic Analysis
        – Bureau of Labor Statistics
        – Cardhub
        – National Conference of State Legislatures
      • Collected data stored in data archive (PostgreSQL DB)
      Pipeline: Data Ingestion → Wrangling → Computation / Analysis → Modeling → Reporting / Visualization
  • 8. Data Wrangling… a big time consumer
      220K instances, 111 features
      • Initial data reduction
        – 111 historical features → 29 features provided to investors
        – Date range reduction to completed loans
      • Data verification and cleanup
        – Verify loan uniqueness
        – Eliminate redundant data
        – Eliminate non-informative features (URLs, free form, extremely sparse data, etc.)
        – Trim entries: “months”, “%”, “+”, “years”, etc.
        – Verify geographic scope
        – Select uniform date structure for analysis and merging
        – Address data that is both numeric and categorical
  • 9. Data Wrangling (cont’d)
      • Address all NaN entries
      • Analyze outliers
      • Economic calculations
        – Least-squares slopes
        – Interpolating for quarterly and annual data
      • Wrangle economic data: trimming entries and using consistent format
      • Merge economic and loan data
      Categorical and numerical wrangled data frames: 84K instances, 30 features (21 loan, 9 economic)
      Surprise learning: LC only verifies data for 31% of loans!
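The two economic calculations named on slide 9 (least-squares slopes and interpolation of quarterly data) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, observations are assumed to be evenly spaced, and linear interpolation is an assumed choice since the slides do not say which method was used.

```python
def least_squares_slope(values):
    """Ordinary least-squares slope of evenly spaced observations.

    The x-axis is the observation index (0, 1, 2, ...), so the result is
    "change per period", e.g. per month for 12 monthly values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def quarterly_to_monthly(quarterly):
    """Linearly interpolate quarterly points to monthly points.

    Assumption: each quarterly value sits on the first month of its quarter.
    """
    monthly = []
    for a, b in zip(quarterly, quarterly[1:]):
        for step in range(3):
            monthly.append(a + (b - a) * step / 3)
    monthly.append(quarterly[-1])
    return monthly
```

A 12-month slope for a loan would then be `least_squares_slope` applied to the 12 monthly values preceding the loan's issue date, which is how the footnote on slide 6 describes the slope window.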
  • 10. Data Analysis
      • Initial data analysis shows little separation based on features
      • What separation there is appears to be driven by macroeconomic variables
      [Figure legend: Paid, Default]
  • 11. Data Analysis (cont’d)
      Features initially deemed important showed little differentiation
      [Figure legend: Default, Paid, Overlap]
  • 12. Modeling
      • Tested several modeling algorithms
        – Logistic Regression
        – Random Forest
        – Naïve Bayes (Bernoulli, Gaussian, Multinomial)
        – K-Nearest Neighbors
        – Gradient Boosting
        – Voting Classifier
      • Manual feature exploration
      • Created pipeline
        – Standardization
        – Feature reduction via PCA and LDA
      Best recall was 0.58 to 0.62 … was imbalanced data the issue?
  • 13. Modeling (cont’d)
      [Figure: Feature importance for random forest; annual income labeled]
      [Figure: Feature importance for logistic regression, axis -0.8 to 0.6; annual income labeled]
  • 14. Modeling (cont’d.)
      • Balanced data set via undersampling paid loans
        – Little improvement
        – Losing lots of instances
      • Added hyper-parameter tuning using GridSearch … little improvement
      • Balanced data via oversampling defaulted loans
        – Extracted representative data sample (85/15, paid/default)
        – Multiplied remaining defaults 6X
        – Trained model using 80/20 split
        – Final test versus extracted (unseen) data
      De minimis improvements
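The oversampling procedure on slide 14 can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name, the `status` field, and the `holdout_fraction` parameter are assumptions. Only the 6X multiplier and the idea of setting aside a representative, unseen sample come from the slide.

```python
import random


def split_and_oversample(loans, holdout_fraction=0.1, multiplier=6, seed=0):
    """Set aside a class-proportional holdout, then replicate remaining defaults.

    loans: list of dicts with a "status" key equal to "paid" or "default"
           (a hypothetical record layout).
    Returns (training_rows, holdout_rows).
    """
    rng = random.Random(seed)
    paid = [loan for loan in loans if loan["status"] == "paid"]
    default = [loan for loan in loans if loan["status"] == "default"]
    rng.shuffle(paid)
    rng.shuffle(default)

    # Take the same fraction of each class so the holdout keeps the
    # original paid/default ratio (roughly 85/15 in the deck).
    n_paid = round(len(paid) * holdout_fraction)
    n_default = round(len(default) * holdout_fraction)
    holdout = paid[:n_paid] + default[:n_default]

    # Replicate only the defaults that remain in the training pool.
    training = paid[n_paid:] + default[n_default:] * multiplier
    rng.shuffle(training)
    return training, holdout
```

Because copies of the same default can land on both sides of a later 80/20 split of the training rows, scores from that split tend to look optimistic; the untouched holdout is the more reliable check, which matches the slide's final test against extracted, unseen data.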
  • 15. Modeling (cont’d)
      • Sought expert advice
        – Financial experts
        – Modeling experts
      • Adjusted feature set
        – More responsive economic input
          • 36/60 month lagging slopes → 12 month leading slopes
          • 36/60 month averages → point values
        – Added critical ratios and indices to expand feature set
      • Tested binary encoding
      De minimis improvements
      Made a strategic decision to modify class weight to enhance default recall at the expense of default precision
  • 16. Modeling: Metrics
      Targeted 90+% default recall and 90+% paid precision
      • Default recall = defaults identified / total defaults
      • Paid precision = paids identified correctly / total instances identified as paid
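The two target metrics on slide 16 translate directly into code. The sketch below is a minimal illustration; the function names and the string labels "default" and "paid" are assumptions made for the example.

```python
def default_recall(y_true, y_pred):
    """Defaults identified / total defaults."""
    total_defaults = sum(1 for t in y_true if t == "default")
    identified = sum(
        1 for t, p in zip(y_true, y_pred) if t == "default" and p == "default"
    )
    return identified / total_defaults


def paid_precision(y_true, y_pred):
    """Paids identified correctly / total instances identified as paid."""
    predicted_paid = sum(1 for p in y_pred if p == "paid")
    correct = sum(
        1 for t, p in zip(y_true, y_pred) if t == "paid" and p == "paid"
    )
    return correct / predicted_paid
```

For an investor, default recall measures how many bad loans the tool catches, while paid precision measures how trustworthy a "paid" prediction is for the loans the tool lets through.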
  • 17. Modeling (cont’d)
      Logistic Regression           Precision  Recall  F1 Score  Support
        Default (weight = 0.7)        0.52      0.94     0.67     13,568
        Paid (weight = 0.3)           0.77      0.20     0.31     14,547
        Unseen / Imbalanced Results
        Default                       0.16      0.97     0.20        115
        Paid                          0.97      0.18     0.30        734
      Random Forest                 Precision  Recall  F1 Score  Support
        Default (weight = 0.6)        0.53      0.92     0.68     13,568
        Paid (weight = 0.4)           0.77      0.25     0.38     14,547
        Unseen / Imbalanced Results
        Default                       0.16      0.95     0.28        115
        Paid                          0.97      0.24     0.39        734
      What does default recall = 0.97 and default precision = 0.16 look like?
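One way to picture the question that closes slide 17 is to back out approximate confusion-matrix counts from the reported recall, precision, and support. The sketch below is a rough reconstruction, not a figure from the deck: the function name is hypothetical, and because the published metrics are rounded to two decimals the counts are approximate.

```python
def counts_from_metrics(n_default, n_paid, recall, precision):
    """Approximate confusion counts for the 'default' class.

    recall    = caught defaults / all defaults
    precision = caught defaults / all loans flagged as default
    """
    caught = round(recall * n_default)    # true positives
    flagged = round(caught / precision)   # all loans flagged as default
    return {
        "defaults_caught": caught,
        "defaults_missed": n_default - caught,
        "paid_flagged_as_default": flagged - caught,
        "paid_passed": n_paid - (flagged - caught),
    }


# Logistic regression on the unseen sample: 115 defaults, 734 paid loans.
unseen = counts_from_metrics(115, 734, recall=0.97, precision=0.16)
```

Under these rounded inputs, roughly 700 of the 849 unseen loans are flagged as likely defaults, and about 588 of those were in fact paid. About 146 paid loans pass the screen alongside only 3 missed defaults, which is why paid precision stays near 0.97 while paid recall sits near 0.2.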
  • 18. Reporting
      • Tool (online) to predict loan status and probability of default
        – Investor enters loan info
        – Tool fetches macroeconomic data
        – Above data is passed to a web service, which executes the model and returns predicted loan status and probability
      • Tool developed using
        – Flask interface with machine learning model as a RESTful web service
        – Jinja2 template
        – HTML/CSS
        – JavaScript
  • 20. Conclusions
      • Model effectively sequesters loans likely to default (97% default recall)
      • Model cherry-picks loans not likely to default (97% paid precision)
      • Achieving the above required class weighting, which drives default recall at the expense of default precision … potentially good loans are misclassified as default
      • Root causes appear to be lack of data separation, lack of feature relevancy and imbalanced data
  • 21. Future Work
      Project specific
      • Can we maintain recall and drive up precision by using logistic regression on the total dataset followed by random forest on potential defaults?
      • Can we identify or create more relevant features?
      • Can we develop a tool for aggressive investors, providing impact of default?
      General opportunity space around highly imbalanced data
      [Diagram labels: Logistic Regression, Random Forest]
  • 22. The authors would like to recognize the open source software that made this work possible
      Questions?
      Archange Giscard Destine: ad1373@georgetown.edu
      Steven Lerner: sll93@georgetown.edu
      Erblin Mehmetaj: em1109@georgetown.edu
      Hetal Shah: hrs41@georgetown.edu