Pay No Attention to the Man
                Behind The Curtain




             The Changing Requirements of Business
                  Analytics in Financial Services

                            Jon C Farrar M.A.
                       Don’t Blame the Retriever….
Farrar -1                                            Don’t blame the Retriever;
                                                       Who threw the ball?
Introduction
      • “There is no business challenge that cannot be
        solved if one considers that a Business
        Challenge is simply a Tennis Ball waiting to be
        thrown….”
              – Jon Farrar




                               Don’t blame the Retriever;
                                 who threw the ball?



Once Upon A Time….




     There was this dream that everyone who
       needed a loan would always be treated fairly



But there were factors at work
             that made the dream almost
                      impossible




But the needs were so great….




Then all of a sudden
Someone invented something called Credit
  Scores




   They were a bit odd, at first, but they were also kind of an
      elegant accessory and they fit real good.
   Once folks found out about ‘em, Everybody wanted ‘em

Everybody was Happy…




They Seemed to go with EVERYTHING,




            and they were a little Magical besides…


But, trouble was a-brewin….




             The Wizard of OCC (“awk”) found out
             about the Credit Scores and he was not
             happy.
The Wizard of OCC thought Credit
                 Scores looked like this…




And The Wizard Wanted them to
                  look more like this…




So, The Wizard sent his Minions to do
                   some work….




Storm after Storm blew down on
               everyone using Credit Scores




And Because the Wizard of OCC
      Wasn’t always real clear about what he wanted
       everybody to do. People were confused…


                  REG B




So the Wizard Tried Again,


              OCC
              97-24




      More Confusion……



And Again….




               OCC
             2000-16




                       And Still, NOBODY seemed to know what
                       to do….



And then the wisest one of them all
                       had an Idea….
              C’mon, you guys!
              We just gotta go
              talk to the old bird




Toto’s Right!




So there was only one thing
                     left to do…..




       They formed their little group and they went off
         to see the Wizard….
       They just needed to TALK to him…..
So, they followed the
                FICO-Built Road



                       OCC




But that proved kinda scary,

             everybody said The Wizard was REAL
               MEAN!




And the Minions seemed to like that
                  everybody was confused




And all along the road
      There were “Empirically Derived”s
      And “Demonstrably and Statistically Sound”s
      There were Models, Reporting,
        and BackTests

             OH MY!




They knew they weren’t in Kansas
                       Anymore….




But They Carried On
             In spite of attempts to
               deter them from their
               road…




And When they finally got to
                    The Wizard

      They made an appointment with his Admin




When They First Got
             Inside, they WERE scared




But Then They Realized
                something funny




The Wizard of OCC wasn’t such a bad
                     guy after all




He Just wanted Everybody to
             Understand How the Ruby Slippers
                       Were Made
      So that they held up,
      and didn’t fall apart,
      and were the right size,
      and were available to all,
      And so everybody could buy
        and sell more slippers,
      In a kinda sorta fair way….
So the Wizard Of OCC Created




And everybody sort of understood,
              And everybody was sort of happy




It STILL wasn’t perfect,
                 But it was a gosh-darn sight better
                   than what came before…..

    REG B               OCC 97-24            OCC 2000-16




There was something for everybody




And Toto too….




Dorothy Understood that she
              needed to spread the word




And with the help of a very good
                     Travel Agent….




They loaded the Ruby Slippers
             and the New Instructions into
                   the Open Gray Box




And set off back to
             where it all started…




*



             * Well, for the time being anyway….




Pay No Attention… Part II
         • What we have learned thus far
             – Since the beginning, Models were Magical
             – Regulators were always concerned with Fairness and
               measurability
             – Models offer Promise but lots of confusion
                 • Models are used for lots of different functions
                 • Models are not always clearly understood
                 • Regulating them lagged behind their prevalence
                   and use
             – Multiple attempts to regulate, but never a clear one
             – Regulation is finally catching up, but still lagging
             – OCC 2011-12 is the best so far and gets a large part of the way there
                 • Dem’s da rules, Dat’s how we gotta play…


Models offered Promise but lots
                    of confusion too
      • We started using models for all sorts of different
        functions
      • Consumers started asking lots of questions
      • “You didn’t Score enough” didn’t cut it
      • “Lemme talk to your MANAGER!”




You see, time was…..

      Characteristic                                    Points
      Home Ownership
                         Own                                35
                         Rent                               25
                         Lives with parents                 20
                         Other                              15
      Years On Job
                         < 2 years                          15
                         2 – 5 years                        20
                         5 – 8 years                        20
                         8+ years                           16
      Credit History
                         < 2 years                           5
                         2 – 4 years                        10
                         4 – 7 years                        15
                         7+ years                           20
      Credit Report
                         < 3 Inquiries                      20
                         3+ Inquiries                        5
                         < 3 Satisfactory                   10
                         3+ Satisfactory                    25
                         Worst Rating 60+ Delinq           -10
                         Worst Rating Derog                -20
                         Worst Rating Satisfactory          20

      • Earlier Models were able to be very simply rendered
      • One just added up the points
      • If there were enough to pass the cutoff, the customer was approved
      • But still nobody really knew how to explain them
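
The add-up-the-points logic on this slide can be sketched in a few lines of Python. The point values are the ones in the scorecard; the dictionary keys, the cutoff of 100, and the sample applicant are hypothetical, chosen only for illustration, since the slide does not state a cutoff.

```python
# Point values copied from the slide; the key names are illustrative labels.
SCORECARD = {
    "home_ownership": {"own": 35, "rent": 25, "lives_with_parents": 20, "other": 15},
    "years_on_job": {"<2": 15, "2-5": 20, "5-8": 20, "8+": 16},
    "credit_history_years": {"<2": 5, "2-4": 10, "4-7": 15, "7+": 20},
    "inquiries": {"<3": 20, "3+": 5},
    "satisfactory_count": {"<3": 10, "3+": 25},
    "worst_rating": {"60+_delinq": -10, "derog": -20, "satisfactory": 20},
}

CUTOFF = 100  # hypothetical: the slide gives no cutoff value


def score(applicant):
    """Return (total points, points per characteristic) for one applicant."""
    breakdown = {name: SCORECARD[name][applicant[name]] for name in SCORECARD}
    return sum(breakdown.values()), breakdown


def decide(applicant, cutoff=CUTOFF):
    """Approve when the summed points reach the cutoff, otherwise decline."""
    total, _ = score(applicant)
    return "approve" if total >= cutoff else "decline"


# Hypothetical applicant, for illustration only.
applicant = {
    "home_ownership": "rent",
    "years_on_job": "2-5",
    "credit_history_years": "4-7",
    "inquiries": "<3",
    "satisfactory_count": "3+",
    "worst_rating": "satisfactory",
}

total, breakdown = score(applicant)
print(total, decide(applicant))
```

The per-characteristic breakdown is returned alongside the total because it is the raw material for any explanation of the decision; the sum alone says only "you didn't score enough".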

And we started using models for
                     all kinds of things
      Objective                 Tool                    Typical Sources of Data
      Credit Extension          New Account Scoring     Application, Credit Bureau
      Prescreen, Solicit        Solicitation Scoring    Credit Bureau, Demographics
      Line Authorizations,      Behavior Scoring        Masterfile, Purchases & Payments,
        Cross-selling,                                    Loan Details
        Collecting, Reissue
      Severe Collections,       Collection Scoring      Masterfile, Credit Bureau,
        Recovery                                          Loan Details




                                            Linear Regression Models
                                           Logistic Regression Models


Models offered Promise but lots
                       of confusion
      • Models used for lots of different functions
      • Consumers started asking lots of questions
             – Why did I only get that Loan Amount?
             – Why was I turned down?
             – Why didn’t you renew my Credit Line?
             – Why did you call me for a payment?




Models offered Promise but lots
                       of confusion
      • Models used for lots of different functions
      • Consumers gained Savvy and asked lots of
        questions
      • “You didn’t Score enough” didn’t cut it
             – Customers didn’t get it
             – Loan Officers also didn’t get it
             – The Tin Woodsman didn’t get it
                (and he had an Axe!)


And now look where we are
                               (not to mention where we’re going…)
      Objective:  Severe Collections / Recovery, Fraud, Attrition, Cross Sell,
                  Utilization, Propensity, Operations

      Tool:       Collection Scoring, plus
                  – Traditional MS Office Suite
                  – Data extraction tools
                  – Leading edge Statistical packages (SAS, SPSS, R)
                  – Data Mining packages
                  – Pattern Recognition Algorithms
                  – Classification and Regression Trees (CART®)
                  – Stochastic Gradient Boosting (TreeNet®)
                  – Programming and Application Languages

      Typical Sources of Data:
                  Masterfile, Credit Bureau, plus
                  – DataMarts, Data Warehouses
                  – Web Logs, Transactional Databases, Historical time series databases
                  – Internal system databases (DDA, Collection, Recovery, Financial, etc.)


Models offered Promise but lots
                      of confusion
      • Models used for lots of different functions
      • Consumers started asking lots of questions
      • “You didn’t Score enough” didn’t cut it




But how ya gonna keep ‘em
                  down on the Farm…?
      • Plethora of Modeling techniques and Methodologies
        are part of Statistical training
      • Reality Bites
             • Only a very small number of the statistical techniques one learns can
               actually be used in most business scenarios
             • Where we can apply them in Business, even fewer of those
               meet usability requirements
                – Tracking, Monitoring, Maintaining, Refreshing
                – Time to Develop, Validate, Test, Deploy
                – Extensible, Scalable, contributing to KPIs and Financial Measures like
                  ROA, RAROC, ROI, etc.
                – EXPLAINABLE! (ahhhh… back to the Regulations….in a moment…)
             • So in general, it makes more sense to use simpler types of
               models for most business applications

So How ya gonna keep ‘em down on
                         the Farm?
      • Easy.
      • Tell ‘em they have to follow 2011-12
      • They’ll NEVER leave!




OCC 2011-12

      • The design, theory, and logic underlying the
        model should be well documented and generally
        supported by published research and sound
        industry practice. The model methodologies and
        processing components that implement the
        theory, including the mathematical specification
        and the numerical techniques and
        approximations, should be explained in detail
        with particular attention to merits and
        limitations.

OCC 2011-12          (2)



      • Without adequate documentation, model risk
        assessment and management will be ineffective.
        Documentation of model development and
        validation should be sufficiently detailed so that
        parties unfamiliar with a model can understand
        how the model operates, its limitations, and its
        key assumptions. Documentation provides for
        continuity of operations, makes compliance with
        policy transparent, and helps track
        recommendations, responses, and exceptions.

Vital organs of 2011-12
      •      Oversight – Model Risk Management Division
              –   Manage Model Risk like any other type of risk
              –   Detailed Policies and procedures for Models, their uses and permitted Overrides
              –   Rigorous assessment of Data quality, relevance, appropriateness and documentation
              –   All model assumptions must be tracked and monitored
              –   Appropriateness of chosen Methodology must be defensible (design and construction)
              –   Audit and Compliance Signoffs
      •      Rigorous Testing before Implementation
              – Stress testing against multiple economic and Financial Scenarios to identify model uncertainty
                and potential for inaccuracy
      •      Independent Validation prior to Implementation (internal unit or Contracted
             External resource)
      •      Model used only for the population it was designed on
      •      Reporting formalized, with pre-established thresholds for performance effectiveness
             and stability
      •      Exhaustive documentation to EXPLAIN everything
              – Business Goals, Assumptions, Data, Intended Use, Methodology, How Model Works, ties to
                Policy and Procedures, Adverse Action, Testing, Validation and tracking protocols, etc.




EXPLAINING now is a
                          really BIG thing…
              The sum of the square
             roots of any two sides of
              an isosceles triangle is
             equal to the square root
             of the remaining side. Oh
                joy! Rapture! I got a
               brain! How can I ever
                 thank you enough?




Explaining Models
    • Logistic and Linear Regression Models are very well
      understood, have been reliably used in Business
      Applications for over 60 years, and when properly built are
      stable, very good predictors of outcomes
    • Logistic and Linear Regression Models are relatively easy to
      explain
             – A linear regression line has an equation of the form Y = a +
               bX, where X is the explanatory variable and Y is the dependent
               variable. The slope of the line is b, and a is the intercept (the value
               of Y when X = 0)*
             – Logistic regression is used for predicting binary outcomes
               (Bernoulli trials) rather than continuous outcomes, and models a
               transformation of the expected value as a linear function of the
               predictors, rather than the expected value itself**
       *http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm
     **http://en.wikipedia.org/wiki/Logistic_regression#Definition
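
As a minimal sketch of why these two model types are easy to explain, the Python below writes out both forms. The coefficients a and b are hypothetical, picked only so the arithmetic is easy to follow. The linear model predicts Y directly; the logistic model applies the same linear form to the log-odds, which is the "transformation of the expected value" mentioned above.

```python
import math


def linear_predict(x, a, b):
    """Linear regression line: Y = a + b*X."""
    return a + b * x


def logistic_predict(x, a, b):
    """Logistic regression: the log-odds are linear in X, so p = 1 / (1 + e^-(a + b*X))."""
    log_odds = a + b * x
    return 1.0 / (1.0 + math.exp(-log_odds))


def logit(p):
    """Log-odds of a probability: the transformation that is modeled as linear."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, for illustration only.
a, b = -2.0, 0.5

print(linear_predict(0, a, b))   # equals the intercept a
p = logistic_predict(4, a, b)    # log-odds = -2 + 0.5 * 4 = 0, so p = 0.5
print(p, logit(p))
```

In both cases one coefficient per variable tells the whole story: a one-unit change in X moves Y (or the log-odds) by b.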
Explaining Models                        (2)




      • Regression Models are generally built on assumptions of statistical normality
        (strictly, linear regression assumes normally distributed errors rather than
        normally distributed variables)
      • Both Linear and Logistic Models are founded on the
        correlative nature of multiple variables to predicted outcomes
        and require some type of linear relationship between each
        variable and the predicted outcome
             – Usually require the data to be transformed first, in a variety of ways,
               to establish an optimal linear relationship
             – Use a given variable only once in a given model, according to the
               (derived) linear relationship
                 • One variable (or range), one coefficient
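
A small sketch of the "transform first" step, using made-up data in which the outcome rises with the logarithm of the raw variable. The log transform is just one possible choice, used here as an assumption; the slide does not name a specific transformation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up data: the outcome is linear in log10 of the raw variable.
raw = [1, 10, 100, 1000, 10000]
outcome = [1 + 2 * math.log10(x) for x in raw]

# Transform the raw variable before fitting a line to it.
transformed = [math.log10(x) for x in raw]

r_raw = pearson(raw, outcome)
r_log = pearson(transformed, outcome)
print(round(r_raw, 3), round(r_log, 3))
```

The transformed variable lines up with the outcome far better than the raw one, which is what lets a single coefficient do the job.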



On The Other Hand…

     • Business Data is becoming less and less normally
       distributed
     • Businesses must now pay more and more attention to
       exceptions and outliers in order to maximize targeting and
       profitability
     • Linear and Logistic methodologies are no longer always
       adequate to solve the more complex business challenges
             – Some build model suites to address a single challenge
             – Lead times for development, validation, testing and
               documenting suites of models are therefore much more
               extended
             – Newer methodologies can help here, in the sense that often
               one model can be built, but…..
                • 2011-12 rears its head again….
2011-12 rears its head again
     • If ya’ can’t explain it, ya’ can’t use it
     • Neural Networks, Bayesian
       Networks, Stochastic Gradient Boosting, etc.
       all need to be explained
     • Mathematical formulas, and underpinnings
       like assumptions, must be justified, can be
       difficult to objectively explain, and may be
       difficult if not impossible to place into an
       Adverse Action context




Why CART is so cool…
      See, Decision Trees are “easy” because we can
        explain this one no problem:
             – INDUS <= 6.145
             – INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145
             – INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145
             – INDUS > 6.145 && PT > 18.65 && NOX > 0.755
             – INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165
             – INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165
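
One way to see why this tree is "easy" is to write its six terminal-node rules as plain if-statements. The sketch below copies the split points from the slide; the node labels A to F and the sample record are hypothetical, added only so the leaves can be referred to. The slide does not give the prediction at each node, so none is invented here.

```python
def terminal_node(rec):
    """Return (node label, rule text) for one record.

    Split points are copied from the slide. The labels A-F are
    hypothetical names for the six terminal nodes.
    """
    if rec["INDUS"] <= 6.145:
        return "A", "INDUS <= 6.145"
    if rec["PT"] <= 18.65:
        if rec["DIS"] <= 4.91145:
            return "B", "INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145"
        return "C", "INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145"
    if rec["NOX"] > 0.755:
        return "D", "INDUS > 6.145 && PT > 18.65 && NOX > 0.755"
    if rec["LSTAT"] <= 5.165:
        return "E", "INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165"
    return "F", "INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165"


# Hypothetical record, for illustration only.
sample = {"INDUS": 8.0, "PT": 20.0, "DIS": 3.0, "NOX": 0.6, "LSTAT": 4.0}
label, rule = terminal_node(sample)
print(label, "because", rule)
```

Every record lands in exactly one node, and the rule text for that node is the explanation.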



Even if it’s a bigger tree….
             – INDUS <= 6.145 && MV <= 45.7
             – INDUS <= 6.145 && MV > 45.7
             – INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145 && TAX <= 17.5
             – INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145 && TAX > 17.5
             – INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145
             – INDUS > 6.145 && PT > 18.65 && NOX > 0.755
             – INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165
             – INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165 && DIS <= 1.1333
             – INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165 && DIS > 1.1333



But how in Munchkin Land
             can you explain this thing?




And what if ya’ had
             something like THIS …                Oh MY!

      [Figure: rows of elements joined by “+” signs, trailing off in “…….”]

Even the TREES get confused…
             [Figure: rows of trees joined by plus signs, continuing …..]

BAD news

             SILENCE!

     • Ya really can’t explain this one
                                   OCC 2011-12


Good news
      • Ya CAN explain this one…..
             [Figure: rows of trees joined by plus signs, continuing ……..]
             But what the Kansas is this thing anyway?



A Woodman’s view of TreeNet®
     • Borrowing from Dan Steinberg’s introductory video….
        – TreeNet® is also called Stochastic Gradient Boosting
        – Its speed and accuracy are unparalleled in Modeling and it has a
          number of advantages over more traditional methodologies
                 • I will leave the Sales Pitch to Salford, but it is my favorite tool and we used it for
                   every kind of model you can think of
        – I am no expert but here is kind of how it works (and TreeNet® does
          this automatically and keeps track of it all for you):
                 • build an initial tree and identify what it got wrong (the misclassifications)
                 • using those errors as the new target, draw from your whole sample again and develop a
                   new tree based on that
                 • continue until more trees stop shrinking the errors. Could be hundreds or thousands of
                   builds, all happening very quickly
                 • You then “simply” add up what each of the individual trees contributes to the score
                   and Voilà!
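
For the C programmers in the room, the steps above can be sketched roughly as follows. This is only a toy, not TreeNet® itself: it assumes squared-error loss, one-split trees (stumps) on a single predictor, a fixed number of trees, a shrinkage factor, and a six-row data set invented for illustration. The names `Stump`, `fit_stump`, `boost` and `boosted_score` are hypothetical.

```c
#include <stddef.h>

#define N_CASES 6
#define N_TREES 50
#define LEARN_RATE 0.1   /* shrinkage applied to every tree (an assumption) */

/* One-split tree: predicts `left` when x < cut, otherwise `right`. */
typedef struct { double cut, left, right; } Stump;

/* Toy data, invented for illustration only. X is sorted and distinct. */
static const double X[N_CASES] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
static const double Y[N_CASES] = {0.0, 0.0, 0.0, 1.0, 1.0, 1.0};

/* Fit a stump to the current errors: try a cut halfway between each pair
   of neighbouring x values and keep the one with the smallest squared error. */
static Stump fit_stump(const double *x, const double *resid, size_t n)
{
    Stump best = {0.0, 0.0, 0.0};
    double best_sse = -1.0;
    for (size_t i = 0; i + 1 < n; i++) {
        double cut = 0.5 * (x[i] + x[i + 1]);
        double sl = 0.0, sr = 0.0;
        size_t nl = 0, nr = 0;
        for (size_t j = 0; j < n; j++) {
            if (x[j] < cut) { sl += resid[j]; nl++; }
            else            { sr += resid[j]; nr++; }
        }
        double ml = sl / (double)nl, mr = sr / (double)nr;
        double sse = 0.0;
        for (size_t j = 0; j < n; j++) {
            double d = resid[j] - (x[j] < cut ? ml : mr);
            sse += d * d;
        }
        if (best_sse < 0.0 || sse < best_sse) {
            best_sse = sse;
            best.cut = cut; best.left = ml; best.right = mr;
        }
    }
    return best;
}

static double stump_predict(const Stump *s, double x)
{
    return x < s->cut ? s->left : s->right;
}

/* Build N_TREES stumps; each one is fit to what the earlier ones got wrong. */
static void boost(Stump trees[N_TREES])
{
    double score[N_CASES] = {0.0};
    for (size_t t = 0; t < N_TREES; t++) {
        double resid[N_CASES];
        for (size_t j = 0; j < N_CASES; j++)
            resid[j] = Y[j] - score[j];          /* the current errors */
        trees[t] = fit_stump(X, resid, N_CASES);
        for (size_t j = 0; j < N_CASES; j++)
            score[j] += LEARN_RATE * stump_predict(&trees[t], X[j]);
    }
}

/* Final score = sum of every tree's (shrunken) contribution. */
static double boosted_score(const Stump trees[N_TREES], double x)
{
    double total = 0.0;
    for (size_t t = 0; t < N_TREES; t++)
        total += LEARN_RATE * stump_predict(&trees[t], x);
    return total;
}
```

Call `boost()` once on an array of `N_TREES` stumps, then `boosted_score()` for each new case. The scores creep toward the targets a little with every tree, which is why it can take hundreds of builds.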


Think about it like this….
     • So you get your one tree…

     • TreeNet® changes your target to the Misclasses (what that tree got wrong)
       and creates a second tree….

     • And TreeNet® does it again and again
       and again while you get a treat for Toto……

     • In the end, TreeNet® adds the contributions of
       all the trees together…..

                 [Figure: tree + tree + tree + tree + tree + tree + ….]
     • Then you simply export the code and implement the
       model!
Here’s a bit of what a TreeNet® Model looks like to a C Programmer
     /**********************************************************
      * The following C source code was automatically generated
      * by the TRANSLATE feature in Salford Predictive Miner(tm).
      * Modeling version: 6.6.0.091, Translation version: 6.6.0.091
      **********************************************************/
     #include <string.h> /* for strcmp() */
     #include <math.h> /* for exp() */

     /**********************************************************
      * **** APPLICATION-DEPENDENT MISSING VALUES ****
      * The two constants must be set **by you** to whatever
      * value(s) you use in your data management or programming
      * workflow to represent missing data.
      **********************************************************/
     const double DBL_MISSING_VALUE = /* value needed here! */ ;
     const int INT_MISSING_VALUE = /* value needed here! */ ;

     /************
      * PREDICTORS
      ************/
     double CRIM, ZN, INDUS, NOX, RM, AGE, DIS, RAD, TAX, PT, B, LSTAT, MV;

     /**********************************************************
      * Here come the treenets in the grove. A shell for calling them
      * appears at the end of this source file.
      **********************************************************/
     double TreeNet_1(double * const pProb0, double * const pProb1)
     {
       /* TreeNet version: 6.6.0.091 */
       /* TreeNet: TreeNet_1 */
       /* Timestamp: 2012043172135 */
       /* Grove: C:DOCUME~1OfficeLOCALS~1Temps5u137 */
       /* Target: CHAS */
       /* N trees: 197 */
       /* N target classes: 2 */

       double target, net_response = 0.0;
       int node, done;
       int response = 0;

       /***************************/
       /* Class-specific treenets */
       /***************************/
       double expsum = 0.0;
       double prob0, score0; /* CHAS = 0 */
       double prob1, score1; /* CHAS = 1 */

       /*******************************************************/
       /* The following predictors had no missing data in     */
       /* the learn sample, so the TreeNet model is unable to */
       /* accommodate missing data for them during scoring.   */
       /* They must be imputed. These particular values are   */
       /* the learn sample medians and/or modes. These are    */
       /* provided as a convenience, you may wish to replace  */
       /* these expressions with your own.                    */
       /*******************************************************/
       if (CRIM == DBL_MISSING_VALUE) CRIM = 0.2102;
       if (ZN == DBL_MISSING_VALUE) ZN = 0;
       if (INDUS == DBL_MISSING_VALUE) INDUS = 8.14;
       if (NOX == DBL_MISSING_VALUE) NOX = 0.515;
       if (RM == DBL_MISSING_VALUE) RM = 6.251;
       if (AGE == DBL_MISSING_VALUE) AGE = 74.3;
       if (DIS == DBL_MISSING_VALUE) DIS = 3.4211;
       if (RAD == DBL_MISSING_VALUE) RAD = 5;
       if (TAX == DBL_MISSING_VALUE) TAX = 207;
       if (PT == DBL_MISSING_VALUE) PT = 18.6;
       if (B == DBL_MISSING_VALUE) B = 192.11;
       if (LSTAT == DBL_MISSING_VALUE) LSTAT = 10.3;
       if (MV == DBL_MISSING_VALUE) MV = 21.7;

       /* Tree 1 of 197 */
       /* N terminal nodes = 6, Depth = 5 */
       target = 0.0;
       node = 1; /* start at root node */
       done = 0; /* set at terminal node */

       while (!done) switch (node) {
         case 1:
           if (NOX != DBL_MISSING_VALUE && NOX < 0.755) node = 2;
           else node = -6;
           break;
         case 2:
           if (TAX != DBL_MISSING_VALUE && TAX < 278) node = 3;
           else node = 5;
           break;
         case 3:
           if (RM != DBL_MISSING_VALUE && RM < 5.93) node = -1;
           else node = 4;
           break;


TreeNet® code2
     Code for the first 3 Trees in the Model…*

         case -1: target = -1.202511; node = 1; done = 1; break;
         case 4:
           if (LSTAT != DBL_MISSING_VALUE && LSTAT < 6.13) node = -2;
           else node = -3;
           break;
         case -2: target = -1.217944; node = 2; done = 1; break;
         case -3: target = -1.2337965; node = 3; done = 1; break;
         case 5:
           if (MV != DBL_MISSING_VALUE && MV < 27.3) node = -4;
           else node = -5;
           break;
         case -4: target = -1.2337965; node = 4; done = 1; break;
         case -5: target = -1.2231822; node = 5; done = 1; break;
         case -6: target = -1.2087922; node = 6; done = 1; break;
         default: /* error */ target = 0.0; done = 1; node = 0; break;
       }

       net_response += target;

       /* Tree 2 of 197 */
       /* N terminal nodes = 6, Depth = 5 */
       target = 0.0;
       node = 1; /* start at root node */
       done = 0; /* set at terminal node */

       while (!done) switch (node) {
         case 1:
           if (NOX != DBL_MISSING_VALUE && NOX < 0.7155) node = 2;
           else node = -6;
           break;
         case 2:
           if (PT != DBL_MISSING_VALUE && PT < 17.7) node = 3;
           else node = 5;
           break;
         case 3:
           if (TAX != DBL_MISSING_VALUE && TAX < 40.5) node = -1;
           else node = 4;
           break;
         case -1: target = 0.024272515; node = 1; done = 1; break;
         case 4:
           if (CRIM != DBL_MISSING_VALUE && CRIM < 0.191425) node = -2;
           else node = -3;
           break;
         case -2: target = -0.005427301; node = 2; done = 1; break;
         case -3: target = 0.0093125903; node = 3; done = 1; break;
         case 5:
           if (RM != DBL_MISSING_VALUE && RM < 5.5815) node = -4;
           else node = -5;
           break;
         case -4: target = 0.00081652142; node = 4; done = 1; break;
         case -5: target = -0.0047567333; node = 5; done = 1; break;
         case -6: target = 0.01884071; node = 6; done = 1; break;
         default: /* error */ target = 0.0; done = 1; node = 0; break;
       }

       net_response += target;

       /* Tree 3 of 197 */
       /* N terminal nodes = 6, Depth = 5 */
       (…..)


     *NOTE! We multiplied the results by 10,000 to eliminate double precision problems during implementation… Ask me!
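
If the wall of `switch` statements is hard on the eyes, the same scoring logic can be sketched as a small table walk. The split points and terminal-node values below are the ones shown for Tree 1 of 197 on the slides; everything else (the `Split` and `Tree` types, `walk_tree`, `net_response_for`, and the assumption that missing values were already imputed before scoring) is a hypothetical simplification, not what the TRANSLATE feature emits.

```c
#include <stddef.h>

/* Predictor indexes into the input row (names follow the slide). */
enum { NOX, TAX, RM, LSTAT, MV, N_PRED };

/* One internal node: go to `yes` when row[var] < cut, else to `no`.
   Positive targets are internal node numbers, negative ones are
   terminal nodes, matching the exported switch statement. */
typedef struct { int var; double cut; int yes, no; } Split;

/* Tree 1 of 197, internal nodes 1..5 as shown on the slides. */
static const Split TREE1[] = {
    {0,     0.0,    0,  0},    /* unused, so node numbers start at 1 */
    {NOX,   0.755,  2, -6},
    {TAX,   278.0,  3,  5},
    {RM,    5.93,  -1,  4},
    {LSTAT, 6.13,  -2, -3},
    {MV,    27.3,  -4, -5},
};

/* Terminal-node values for Tree 1; index = -(terminal node number). */
static const double LEAF1[] = {
    0.0, -1.202511, -1.217944, -1.2337965, -1.2337965, -1.2231822, -1.2087922
};

typedef struct { const Split *splits; const double *leaf; } Tree;

/* Walk one tree from the root until a terminal (negative) node is hit.
   Assumes every value in `row` has already been imputed. */
static double walk_tree(const Split *splits, const double *leaf, const double *row)
{
    int node = 1;                               /* start at root node */
    while (node > 0) {
        const Split *s = &splits[node];
        node = (row[s->var] < s->cut) ? s->yes : s->no;
    }
    return leaf[-node];
}

/* Same idea as `net_response += target;` repeated once per tree. */
static double net_response_for(const Tree *trees, size_t n_trees, const double *row)
{
    double total = 0.0;
    for (size_t t = 0; t < n_trees; t++)
        total += walk_tree(trees[t].splits, trees[t].leaf, row);
    return total;
}
```

Fed the learn-sample medians from the export (NOX 0.515, TAX 207, RM 6.251, LSTAT 10.3, MV 21.7), Tree 1 goes 1, 2, 3, 4 and lands in terminal node -3.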
Imagine that for THOUSANDS of trees….




But back to the Wizard of OCC….
    •    Forget about the code… it’s just text! IT can handle it!
    •    What you need to focus on is explaining it all for the Wizard….
    •    And that doesn’t mean slapping down a bunch of code lines
    •    The Wizard needs to understand how come the Ruby Slippers fit so well,
         how the Slippers were put together, and where the material comes from
         (the variables and weights that drive the results)
    •    Especially if you need to communicate to customers the effects of
         wearing the Slippers
             –   In modeling terms: say it is an Origination model that needs
                 Score Factor Codes for Adverse Action Letters…

    •    So here’s one way to do that….

CASE STUDY from Real Life….

       • Attrition Model – Customer will close all accounts
            • Needed Talking Points (Score Factors) to facilitate attempts to save customer accounts
            • Built TreeNet® model to predict probability that a customer will close all of their accounts
            • Identified CART Equivalent Rules for all Accounts
            • Pulled new out-of-sample data for recent periods
            • Scored and Validated the results against known outcomes
            • Based on the Probability, generated list of high risk accounts and pushed to Branches
              with Score Factors (rules) appended




Attrition Model Process
      •      Built TreeNet® Model
      •      Scored Validation Set using the model built
              • Created new data set appending probability score and Node identifier to each sample point
              • Identified Variable Importance
              • Used CART to derive a Regression tree using TreeNet® score as the target
              • Compared Variable Importances
              • Looked at rules governing each of the like nodes
              • Manually went through tree finding Terminal Nodes with like Mean values
              • Generalized like nodes based on rules and split thresholds, creating factors such as “Low
                   Balance,” “Short Time On Books,” “Diminishing Balance Over Last 6 Months,” etc.
              • Pruned Tree where possible (without fundamentally changing Rules and split thresholds)
              • Analyzed each step to understand Utility vs. Complexity tradeoffs
              • Tested outcome (same data) with the generalized variables
              • Tested with repeated out-of-sample Validation sets
              • Submitted process to Model Risk Management Unit, which independently validated
                   model and documentation
              • Implemented Model

A Schematical* Representation of what I just explained…
                            Initial Regression Tree (post-TreeNet®)




 *HAH! I love new words….

Look at which cases hit the same Node and group them

             Step 1
             CASEID   RESPONSE     NODE   CASEID   RESPONSE     NODE   CASEID RESPONSE     NODE   CASEID RESPONSE     NODE
                1     -0.000004613  20      31      -0.00000251  14      61    0.000006376   3      91     0.00001477  21
                2     -0.000064770  12      32      0.000005463   1      62    0.000006376   3      92     0.00001477  21
                3      0.002201621   9      33      -0.00000251  14      63    0.000006376   3      93     0.00001477  21
                4      0.002201621   9      34      0.000005463   1      64    0.000006376   3      94     0.00001477  21
                5      0.002201621   9      35      0.000005463   1      65   -0.000046164  13      95     0.00001477  21
                6      0.002201621   9      36       0.00002013   4      66   -0.000004613  20      96    0.000020736  22
                7      0.000004718  19      37       0.00002013   4      67    0.000004618   2      97    0.000020736  22
                8     -0.000000771  13      38      0.000016180  14      68    0.000005193  16      98    0.002227983  11
                9     -0.000004596  18      39      0.000016180  14      69    0.000005193  16      99    0.002227983  11
               10      0.000004718  19      40      0.002227895   7      70    0.000005193  16     100   -0.000068815  17
               11      0.000004718  19      41       0.00221521   5      71    0.000005463   1     101    0.001503144   3
               12      0.000004718  19      42       0.00221521   5      72    0.000005463   1     102    0.000014534  24
               13      0.000004718  19      43       0.00221521   5      73    0.000005463   1     103    0.000005463   1
               14      0.000005463   1      44     -0.000064770  12      74    0.000005463   1     104    0.000005463   1
               15      0.000005463   1      45     -0.000064770  12      75    0.000005463   1     105    0.000005463   1
               16      0.000005463   1      46      0.004469995   2      76    0.000005463   1     106    0.000005463   1
               17      0.000005463   1      47      0.004469995   2      77    0.000005463   1     107    0.000005463   1
               18      0.000005463   1      48     -0.000064770  12      78    0.000005463   1     108    0.000005463   1
               19      0.000015537   7      49      0.004469995   2      79    0.000005463   1     109    0.000005463   1
               20      0.000005463   1      50      0.004469995   2      80    0.000005463   1     110    0.000005463   1
               21      0.000005463   1      51     -0.000046164  13      81    0.001503144   3     111    0.000005463   1
               22      0.000005463   1      52     -0.000046164  13      82     0.00002013   4     112    0.000014534  24
               23      0.000005463   1      53       0.00222916   8      83    0.000016180   5     113    0.000005463   1
               24      0.000005463   1      54     -0.000046164  13      84     0.00002013   4     114    0.000005463   1
               25      0.000005463   1      55      0.000005193  16      85    0.000008446  15     115    0.000005463   1
               26      0.000005463   1      56      0.002220315   6      86    0.000008446  15     116    0.000005463   1
               27      0.000005463   1      57     -0.000004613  20      87    0.000008446  15     117    0.000005463   1
               28      0.000005463   1      58       0.00240118  10      88    0.000008446  15     118    0.000005463   1
               29      0.000005463   1      59      0.000006376   3      89     0.00240118  10     119    0.000005463   1
               30      0.000005463   1      60      0.000006376   3      90    0.002227983  11     120    0.000005463   1




Sort by Response and determine if it makes sense to group similar outcomes
             Step 2
             CASEID RESPONSE           NODE   CASEID RESPONSE         NODE   CASEID RESPONSE         NODE   CASEID RESPONSE         NODE
              100      -0.0000688150    17      18     0.0000054630    1      108     0.0000054630     1      38     0.0000161800    14
                2      -0.0000647700    12      20     0.0000054630    1      109     0.0000054630     1      39     0.0000161800    14
               44      -0.0000647700    12      21     0.0000054630    1      110     0.0000054630     1      83     0.0000161800    14
               45      -0.0000647700    12      22     0.0000054630    1      111     0.0000054630     1      36     0.0000201300     4
               48      -0.0000647700    12      23     0.0000054630    1      113     0.0000054630     1      37     0.0000201300     4
               51      -0.0000461640    13      24     0.0000054630    1      114     0.0000054630     1      82     0.0000201300     4
               52      -0.0000461640    13      25     0.0000054630    1      115     0.0000054630     1      84     0.0000201300     4
               54      -0.0000461640    13      26     0.0000054630    1      116     0.0000054630     1      96     0.0000207360    22
               65      -0.0000461640    13      27     0.0000054630    1      117     0.0000054630     1      97     0.0000207360    22
                1      -0.0000046130    20      28     0.0000054630    1      118     0.0000054630     1      81     0.0015031440     3
               57      -0.0000046130    20      29     0.0000054630    1      119     0.0000054630     1     101     0.0015031440     3
               66      -0.0000046130    20      30     0.0000054630    1      120     0.0000054630     1       3     0.0022016210     9
                9      -0.0000045960    18      32     0.0000054630    1       59     0.0000063760     3       4     0.0022016210     9
               31      -0.0000025100    14      34     0.0000054630    1       60     0.0000063760     3       5     0.0022016210     9
               33      -0.0000025100    14      35     0.0000054630    1       61     0.0000063760     3       6     0.0022016210     9
                8      -0.0000007710    13      71     0.0000054630    1       62     0.0000063760     3      41     0.0022152100     5
               67       0.0000046180     2      72     0.0000054630    1       63     0.0000063760     3      42     0.0022152100     5
                7       0.0000047180    21      73     0.0000054630    1       64     0.0000063760     3      43     0.0022152100     5
               10       0.0000047180    21      74     0.0000054630    1       85     0.0000084460    15      56     0.0022203150     6
               11       0.0000047180    21      75     0.0000054630    1       86     0.0000084460    15      40     0.0022278950     7
               12       0.0000047180    21      76     0.0000054630    1       87     0.0000084460    15      90     0.0022279830    11
               13       0.0000047180    21      77     0.0000054630    1       88     0.0000084460    15      98     0.0022279830    11
               55       0.0000051930    16      78     0.0000054630    1      102     0.0000145340    24      99     0.0022279830    11
               68       0.0000051930    16      79     0.0000054630    1      112     0.0000145340    24      53     0.0022291600     8
               69       0.0000051930    16      80     0.0000054630    1       91     0.0000147700    19      58     0.0024011800    10
               70       0.0000051930    16     103     0.0000054630    1       92     0.0000147700    19      89     0.0024011800    10
               14       0.0000054630     1     104     0.0000054630    1       93     0.0000147700    19      46     0.0044699950     2
               15       0.0000054630     1     105     0.0000054630    1       94     0.0000147700    19      47     0.0044699950     2
               16       0.0000054630     1     106     0.0000054630    1       95     0.0000147700    19      49     0.0044699950     2
               17       0.0000054630     1     107     0.0000054630    1       19     0.0000155370     7      50     0.0044699950     2
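
Steps 1 and 2 were done by hand and by eye, but the bookkeeping behind them can be sketched in C: average the response of the cases in each node, then flag pairs of nodes whose averages are close. The eight rows below are copied from the Step 2 table. The function names and the 5% relative tolerance are assumptions for illustration only; the deck does not say what threshold, if any, was used.

```c
#include <stddef.h>

#define MAX_NODE 24   /* highest terminal-node number in the tables */

typedef struct { int case_id; double response; int node; } Scored;

/* A few rows copied from the Step 2 table. */
static const Scored SAMPLE[] = {
    {36, 0.0000201300,  4}, {37, 0.0000201300,  4},
    {96, 0.0000207360, 22}, {97, 0.0000207360, 22},
    {53, 0.0022291600,  8},
    {98, 0.0022279830, 11}, {99, 0.0022279830, 11},
    {14, 0.0000054630,  1},
};
#define N_SAMPLE (sizeof SAMPLE / sizeof SAMPLE[0])

static double abs_val(double v) { return v < 0.0 ? -v : v; }

/* Step 1: average the response of the cases that land in each node.
   Nodes with no cases keep mean 0 and count 0. */
static void node_means(const Scored *rows, size_t n,
                       double mean[MAX_NODE + 1], size_t count[MAX_NODE + 1])
{
    for (int k = 0; k <= MAX_NODE; k++) { mean[k] = 0.0; count[k] = 0; }
    for (size_t i = 0; i < n; i++) {
        mean[rows[i].node] += rows[i].response;
        count[rows[i].node]++;
    }
    for (int k = 0; k <= MAX_NODE; k++)
        if (count[k] > 0) mean[k] /= (double)count[k];
}

/* Step 2: two nodes are candidates for consolidation when their means
   differ by no more than rel_tol times the larger of the two.
   The tolerance is an illustrative assumption. */
static int similar(double a, double b, double rel_tol)
{
    double scale = abs_val(a) > abs_val(b) ? abs_val(a) : abs_val(b);
    return abs_val(a - b) <= rel_tol * scale;
}
```

With a 5% tolerance, nodes 4 and 22 come out as a candidate pair, and so do nodes 8 and 11, while node 4 and node 8 do not. Whether a pair should actually be merged still depends on reading the rules behind each node.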




Similar Outcomes, Consolidate Rules?

       Tnode 4 0.00002013
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         BRANCH <= 417.5 &&
         C_BTL > 0.04427 &&
         PROFIT_ILE <= 24.95

       Tnode 22 0.00002074
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL <= 0.755 &&
         TRANS_NUM > 540.5 &&
         M_VAL <= 39.9

       Tnode 8 0.00222916
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         PROFIT_ILE > 24.95 &&
         BRANCH <= 290.5 &&
         TRANS_NUM > 1325.5 &&
         TRANS_NUM <= 2731

       Tnode 11 0.00222798
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         BRANCH > 290.5 &&
         BRANCH <= 417.5 &&
         NUM_PROD > 4.5 &&
         FEE_LEVEL > 0.5055 &&
         PROFIT_ILE > 24.95 &&
         PROFIT_ILE <= 96.05 &&
         MOS_ACTIVE <= 367.315 &&
         TRANS_NUM <= 1563
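
One way to shortlist terminal nodes with similar outcomes is to compare their node values pairwise. The sketch below is only an illustration: it assumes the number after each Tnode label is that node's predicted outcome, and the 5% relative tolerance and the function name `similar_pairs` are hypothetical choices, not something the slide specifies.

```python
from itertools import combinations

# Terminal-node values copied from the slide. Treating them as predicted
# outcomes is an assumption.
node_values = {4: 0.00002013, 22: 0.00002074, 8: 0.00222916, 11: 0.00222798}


def similar_pairs(values, rel_tol=0.05):
    """Return node pairs whose values differ by at most rel_tol (relative)."""
    pairs = []
    for a, b in combinations(sorted(values), 2):
        va, vb = values[a], values[b]
        if abs(va - vb) <= rel_tol * max(va, vb):
            pairs.append((a, b))
    return pairs


# Candidate pairs for rule consolidation.
print(similar_pairs(node_values))
```

With these four values the candidates are nodes 4 and 22, and nodes 8 and 11. Similar outcomes only nominate a pair; whether the rules themselves can be merged is a separate question.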

Node Rule Consolidation Decisions
       Tnode 4 0.00002013
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         BRANCH <= 417.5 &&
         C_BTL > 0.04427 &&
         PROFIT_ILE <= 24.95

       Tnode 22 0.00002074
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL <= 0.755 &&
         TRANS_NUM > 540.5 &&
         M_VAL <= 39.9

       TN4 and TN22
       Not good for Consolidation, too many differences

       Tnode 8 0.00222916
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         PROFIT_ILE > 24.95 &&
         BRANCH <= 290.5 &&
         TRANS_NUM > 1325.5 &&
         TRANS_NUM <= 2731

       Tnode 11 0.002227983
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         BRANCH > 290.5 &&
         BRANCH <= 417.5 &&
         NUM_PROD > 4.5 &&
         FEE_LEVEL > 0.5055 &&
         PROFIT_ILE > 24.95 &&
         PROFIT_ILE <= 96.05 &&
         MOS_ACTIVE <= 367.315 &&
         TRANS_NUM <= 1563

       TN8 and TN11
       Good for Consolidation, differences can be dealt with

       Rules Tnodes 8 & 11
       BRANCH = “NORTHERN THRU CENTRAL”
       YRS_OB > 6
       CONTACTS = “LOW”
       TRANS_NUM = “MODERATE”
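
The consolidation call above is a judgment, but a rough overlap count can support it. The sketch below is one possible way to do that, not the author's stated criterion: it stores each node rule as a set of (variable, operator, threshold) conditions copied from the slide and reports how many conditions two nodes share. The names `RULES`, `shared_conditions` and `overlap` are hypothetical.

```python
# Node rules as sets of (variable, operator, threshold) conditions,
# copied from the slide.
RULES = {
    4: {("YRS_OB", ">", 6.145), ("CONTACTS", "<=", 18.65), ("NUM_ACCTS", "<=", 4.5),
        ("BRANCH", "<=", 417.5), ("C_BTL", ">", 0.04427), ("PROFIT_ILE", "<=", 24.95)},
    22: {("YRS_OB", ">", 6.145), ("CONTACTS", ">", 18.65), ("FEE_LEVEL", "<=", 0.755),
         ("TRANS_NUM", ">", 540.5), ("M_VAL", "<=", 39.9)},
    8: {("YRS_OB", ">", 6.145), ("CONTACTS", "<=", 18.65), ("NUM_ACCTS", "<=", 4.5),
        ("C_BTL", ">", 0.04427), ("PROFIT_ILE", ">", 24.95), ("BRANCH", "<=", 290.5),
        ("TRANS_NUM", ">", 1325.5), ("TRANS_NUM", "<=", 2731)},
    11: {("YRS_OB", ">", 6.145), ("CONTACTS", "<=", 18.65), ("NUM_ACCTS", "<=", 4.5),
         ("C_BTL", ">", 0.04427), ("BRANCH", ">", 290.5), ("BRANCH", "<=", 417.5),
         ("NUM_PROD", ">", 4.5), ("FEE_LEVEL", ">", 0.5055), ("PROFIT_ILE", ">", 24.95),
         ("PROFIT_ILE", "<=", 96.05), ("MOS_ACTIVE", "<=", 367.315),
         ("TRANS_NUM", "<=", 1563)},
}


def shared_conditions(a, b):
    """Conditions that appear identically in both node rules."""
    return RULES[a] & RULES[b]


def overlap(a, b):
    """Jaccard overlap: shared conditions divided by all distinct conditions."""
    return len(RULES[a] & RULES[b]) / len(RULES[a] | RULES[b])


for a, b in [(4, 22), (8, 11)]:
    print(a, b, len(shared_conditions(a, b)), round(overlap(a, b), 2))
```

On these rules, TN4 and TN22 share a single condition (YRS_OB > 6.145), while TN8 and TN11 share five, which is consistent with the decisions on the slide.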
Determine if Pruning the tree will appreciably affect the generic rules

       TNode 5
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL <= 0.755 &&
         TRANS_NUM <= 540.5

       TNode 8
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL > 0.755

       Rules Tnodes 5 & 8
       BRANCH = “NORTHERN THRU CENTRAL”
       YRS_OB > 6
       CONTACTS = “LOW”
       TRANS_NUM = “MODERATE”




Analyze Differences: Full vs. Pruned Trees

     F_CASEID   F_NODE   P_CASEID   P_NODE   Same Node
        1          1        1          1       TRUE
        2          3        2          3       TRUE
        3          3        3          3       TRUE
        4          1        4          1       TRUE
        5          1        5          1       TRUE
        6          1        6          1       TRUE
        7         20        7          4       FALSE
        8         20        8          4       FALSE
        9         20        9          4       FALSE
        10        20        10         4       FALSE
        11        20        11         4       FALSE
        12        20        12         4       FALSE
        13        20        13         4       FALSE
        14        22        14         6       FALSE
        15        22        15         6       FALSE
        16        22        16         6       FALSE
        17        22        17         6       FALSE
        18        22        18         6       FALSE
        19        22        19         6       FALSE
        20        22        20         6       FALSE
        21        22        21         6       FALSE
        22        22        22         6       FALSE
        23        22        23         6       FALSE
        24        22        24         6       FALSE
        25        22        25         6       FALSE
        26        22        26         6       FALSE
        27        22        27         6       FALSE
        28        22        28         6       FALSE
        29        22        29         6       FALSE
        30        22        30         6       FALSE
        31        22        31         6       FALSE
        32        22        32         6       FALSE
        33        22        33         6       FALSE
        34        22        34         6       FALSE

     Counts by F_NODE (rows) and P_NODE (columns)
     F_NODE     1     2     3     4     5     6     7     8   Grand Total
          1   154                                                     154
          2           9                                                 9
          3                 5                                           5
          4                 5                                           5
          5                 1                                           1
          6                 9                                           9
          7                 3                                           3
          8                 9                                           9
          9                12                                          12
         10                 6                                           6
         11                 9                                           9
         12                 2                                           2
         13                 4                                           4
         14                10                                          10
         15                 1                                           1
         16                 9                                           9
         17                 1                                           1
         18                 4                                           4
         19                 9                                           9
         20                      21                                    21
         21                             5                               5
         22                                 208                       208
         23                                         2                   2
         24                                               8             8
Grand Total   154     9    99    21     5   208     2     8           506
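
The full-versus-pruned comparison can be sketched in a few lines of Python. This is an illustrative reconstruction using only the 34 cases listed in the case table (not the full 506), and the variable names `assignments`, `crosstab` and `targets` are hypothetical:

```python
from collections import Counter

# (F_NODE, P_NODE) for the 34 cases listed in the slide's case table.
assignments = ([(1, 1), (3, 3), (3, 3), (1, 1), (1, 1), (1, 1)]
               + [(20, 4)] * 7 + [(22, 6)] * 21)

# Cross-tabulate: how many cases fall in each (full node, pruned node) cell.
crosstab = Counter(assignments)

# Cases whose node label is unchanged (the "Same Node" column).
same_label = sum(f == p for f, p in assignments)

# For each full-tree node, collect the pruned-tree nodes its cases land in.
targets = {}
for f, p in crosstab:
    targets.setdefault(f, set()).add(p)

for (f, p), n in sorted(crosstab.items()):
    print(f"F_NODE {f:>2} -> P_NODE {p}: {n} cases")
print("same label:", same_label, "of", len(assignments))
```

A FALSE in the Same Node column only says the node label changed. In the counts table, for example, all 208 cases of full-tree node 22 land in pruned node 6 and nothing else does, so that group is intact under a new number, whereas full-tree nodes 3 through 19 are merged into pruned node 3.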


Variable Importance Changes

 Full Tree Variable Importance          Pruned Tree Variable Importance
 Variable            Score              Variable            Score
 M_VAL               100.00             M_VAL               100.00
 TRANS_NUM           90.09              CONTACTS            80.93
 CONTACTS            88.67              TRANS_NUM           80.71
 FEE_LEVEL           79.38              FEE_LEVEL           75.06
 BRANCH              70.26              YRS_OB              70.90
 YRS_OB              66.95              NUM_ACCTS           60.14
 REGION              56.57              REGION              56.84
 NUM_ACCTS           56.23              BRANCH              41.87
 PROFIT_ILE          39.96              FAM_MEMS            28.42
 FAM_MEMS            35.23              PROFIT_ILE          17.74
 C_BTL               34.01              C_BTL               16.08
 NUM_PROD            23.71              NUM_PROD            8.96
 MOS_ACTIVE          9.20
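
To see which variables moved most, the two importance lists can be lined up and differenced. A minimal sketch using the scores from the slide; the function name `importance_changes` and the choice to report a dropped variable with `None` are assumptions made for illustration:

```python
# Variable importance scores copied from the slide.
full = {"M_VAL": 100.00, "TRANS_NUM": 90.09, "CONTACTS": 88.67, "FEE_LEVEL": 79.38,
        "BRANCH": 70.26, "YRS_OB": 66.95, "REGION": 56.57, "NUM_ACCTS": 56.23,
        "PROFIT_ILE": 39.96, "FAM_MEMS": 35.23, "C_BTL": 34.01, "NUM_PROD": 23.71,
        "MOS_ACTIVE": 9.20}
pruned = {"M_VAL": 100.00, "CONTACTS": 80.93, "TRANS_NUM": 80.71, "FEE_LEVEL": 75.06,
          "YRS_OB": 70.90, "NUM_ACCTS": 60.14, "REGION": 56.84, "BRANCH": 41.87,
          "FAM_MEMS": 28.42, "PROFIT_ILE": 17.74, "C_BTL": 16.08, "NUM_PROD": 8.96}


def importance_changes(full_scores, pruned_scores):
    """Return {variable: pruned - full}, or None if the variable was dropped."""
    changes = {}
    for var, f_score in full_scores.items():
        p_score = pruned_scores.get(var)
        changes[var] = None if p_score is None else round(p_score - f_score, 2)
    return changes


changes = importance_changes(full, pruned)
dropped = [v for v, d in changes.items() if d is None]
kept = {v: d for v, d in changes.items() if d is not None}
biggest_drop = min(kept, key=kept.get)

print("dropped after pruning:", dropped)
print("largest decrease:", biggest_drop, kept[biggest_drop])
```

On these scores, MOS_ACTIVE no longer appears after pruning and BRANCH loses the most importance, while YRS_OB and NUM_ACCTS gain slightly.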




Before…

       Tnode 8 0.00222916
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         PROFIT_ILE > 24.95 &&
         BRANCH <= 290.5 &&
         TRANS_NUM > 1325.5 &&
         TRANS_NUM <= 2731

       Tnode 11 0.002227983
       YRS_OB > 6.145 &&
         CONTACTS <= 18.65 &&
         NUM_ACCTS <= 4.5 &&
         C_BTL > 0.04427 &&
         BRANCH > 290.5 &&
         BRANCH <= 417.5 &&
         NUM_PROD > 4.5 &&
         FEE_LEVEL > 0.5055 &&
         PROFIT_ILE > 24.95 &&
         PROFIT_ILE <= 96.05 &&
         MOS_ACTIVE <= 367.315 &&
         TRANS_NUM <= 1563

       TN8 and TN11
       Good for Consolidation, differences can be dealt with

       Rules Tnodes 8 & 11
       BRANCH = “NORTHERN THRU CENTRAL”
       YRS_OB > 6
       CONTACTS = “LOW”
       TRANS_NUM = “MODERATE”

and After…

       TNode 5
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL <= 0.755 &&
         TRANS_NUM <= 540.5

       TNode 8
       YRS_OB > 6.145 &&
         CONTACTS > 18.65 &&
         FEE_LEVEL > 0.755

       Rules TNodes 5 & 8
       BRANCH = “NORTHERN THRU CENTRAL”
       YRS_OB > 6
       CONTACTS = “LOW”
       TRANS_NUM = “MODERATE”




                Effect after pruning (Where Art Meets Science):
                • TNodes change from 8 and 11 to 5 and 8 (smaller tree)
                • “BRANCH” was kept since it applied prior to pruning and aids in list generation and routing
                • “YRS_OB” split threshold becomes a rounded, generalized threshold
                • Generalization can still be used
                         • In this example, “FEE_LEVEL” was not included (“<=” and “>” cancel each other out)
                         • “CONTACTS” thresholds change (“<=” becomes “>”), but the threshold can still be used within the “Low” designation
                         • “TRANS_NUM” was kept since it applied prior to pruning and aided in talking points
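
The FEE_LEVEL point can be checked mechanically: when one node requires FEE_LEVEL <= 0.755 and the other requires FEE_LEVEL > 0.755, the two conditions together admit every value, so the variable adds nothing to a combined rule. A small sketch of that check, assuming only the “<=” and “>” operators used on these slides; the helper names `interval` and `cancels` are hypothetical:

```python
import math


def interval(op, threshold):
    """Half-open interval (low, high] satisfied by 'x op threshold'."""
    if op == "<=":
        return (-math.inf, threshold)
    if op == ">":
        return (threshold, math.inf)
    raise ValueError(f"unsupported operator: {op}")


def cancels(cond_a, cond_b):
    """True if the two conditions together admit every value of the variable."""
    (lo_a, hi_a), (lo_b, hi_b) = sorted([interval(*cond_a), interval(*cond_b)])
    # The lower interval must start at -inf, the upper must end at +inf,
    # and there must be no gap between them.
    return lo_a == -math.inf and hi_b == math.inf and hi_a >= lo_b


# FEE_LEVEL conditions from pruned TNode 5 and TNode 8.
print(cancels(("<=", 0.755), (">", 0.755)))
```

If the two thresholds left a gap (for example <= 0.5 and > 0.755), the check would return False and the variable would still carry information.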
RUN, Toto, RUN!!!!
      • Implement the Dog-gone thing!




     Customer              Branch  Risk  Point 1              Point 2                         Point 3                         Point 4
     Bill Muchkinovski     200     Low   Long time on books   6 Mos. Moderate Balance         Moderate number products        Moderate Profit
     Millie Smoller        27      Med   Short time on books  6 Mos. Low Balance              Low Number Products             6 Mos. Low number contacts
     Beulah Diminuitive    343     Med   Short time on books  6 Mos. Low Balance              Low Number Products             6 Mos. High number contacts
     Casper Lollipopovich  721     High  Long time on books   6 Mos. Diminishing Balance      6 Mos. High Number of Contacts  Moderate Profit
     Martha Smallkind      14      High  Long time on books   6 Mos. Diminishing Balance      6 Mos. High Number of Contacts  Moderate Profit
     Elmo Munchkinovich    1       High  Long time on books   6 Mos. High Number of Contacts  6 Mos. High Balance             High Profit
Save those CUSTOMERS!




Happily Down the Road….




There’s No Place Like Home….




The End




Jon’s 30+ years of Predictive Modeling expertise comes from various segments of the financial
industry including Banking, Consumer Finance, Mortgage, and Modeling Vendor. He has experience
in the U.S., Canada, Australia and the United Kingdom. As SVP and Manager of Predictive Modeling
at Union Bank, Jon introduced Scoring technology in 1995 and provided Credit Risk research,
analytics and Customer Segmentation strategies, along with many of the Bank’s Business
Intelligence and Operations statistical models.

Jon’s Expertise includes Regulatory oversight and all things AVM (Automated Valuation Modeling).

In addition to Consulting and Expert Witness engagements, Jon holds a Master’s Degree in
Counseling Psychology and speaks at a variety of Industry conferences.

The now departed Zeppelin, best human being I ever knew, proudly displaying the four balls he
so loved to retrieve…

Contact Information:
jcf4now@sbcglobal.net




 
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
Best VIP Call Girls Noida Sector 18 Call Me: 8448380779
 
20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf
 
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...
 
Pooja 9892124323 : Call Girl in Juhu Escorts Service Free Home Delivery
Pooja 9892124323 : Call Girl in Juhu Escorts Service Free Home DeliveryPooja 9892124323 : Call Girl in Juhu Escorts Service Free Home Delivery
Pooja 9892124323 : Call Girl in Juhu Escorts Service Free Home Delivery
 
The Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfThe Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdf
 
Indore Real Estate Market Trends Report.pdf
Indore Real Estate Market Trends Report.pdfIndore Real Estate Market Trends Report.pdf
Indore Real Estate Market Trends Report.pdf
 
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx
 
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
 
03_Emmanuel Ndiaye_Degroof Petercam.pptx
03_Emmanuel Ndiaye_Degroof Petercam.pptx03_Emmanuel Ndiaye_Degroof Petercam.pptx
03_Emmanuel Ndiaye_Degroof Petercam.pptx
 
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
 
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
Solution Manual for Principles of Corporate Finance 14th Edition by Richard B...
 
CALL ON ➥8923113531 🔝Call Girls Gomti Nagar Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Gomti Nagar Lucknow best sexual serviceCALL ON ➥8923113531 🔝Call Girls Gomti Nagar Lucknow best sexual service
CALL ON ➥8923113531 🔝Call Girls Gomti Nagar Lucknow best sexual service
 
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
 
Gurley shaw Theory of Monetary Economics.
Gurley shaw Theory of Monetary Economics.Gurley shaw Theory of Monetary Economics.
Gurley shaw Theory of Monetary Economics.
 
VIP Call Girl in Mira Road 💧 9920725232 ( Call Me ) Get A New Crush Everyday ...
VIP Call Girl in Mira Road 💧 9920725232 ( Call Me ) Get A New Crush Everyday ...VIP Call Girl in Mira Road 💧 9920725232 ( Call Me ) Get A New Crush Everyday ...
VIP Call Girl in Mira Road 💧 9920725232 ( Call Me ) Get A New Crush Everyday ...
 
The Economic History of the U.S. Lecture 25.pdf
The Economic History of the U.S. Lecture 25.pdfThe Economic History of the U.S. Lecture 25.pdf
The Economic History of the U.S. Lecture 25.pdf
 
The Economic History of the U.S. Lecture 26.pdf
The Economic History of the U.S. Lecture 26.pdfThe Economic History of the U.S. Lecture 26.pdf
The Economic History of the U.S. Lecture 26.pdf
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
 

Don't Blame the Retriever; Who Threw the Ball

  • 1. Pay No Attention to the Man Behind The Curtain The Changing Requirements of Business Analytics in Financial Services Jon C Farrar M.A. Don’t Blame the Retriever…. Farrar -1 Don’t blame the Retriever; Who threw the ball?
  • 2. Introduction • “There is no business challenge that cannot be solved if one considers that a Business Challenge is simply a Tennis Ball waiting to be thrown….” – Jon Farrar Don’t blame the Retriever; who threw the ball?
  • 3. Once Upon A Time…. There was this dream that everyone who needed a loan would always be treated fairly
  • 4. But there were factors at work that made the dream almost impossible
  • 5. But the needs were so great….
  • 6. Then all of a sudden Someone invented something called Credit Scores They were a bit odd, at first, but they were also kind of an elegant accessory and they fit real good. Once folks found out about ‘em, Everybody wanted ‘em
  • 7. Everybody was Happy…
  • 8. They Seemed to go with EVERYTHING, and they were a little Magical besides…
  • 9. But, trouble was a-brewin…. The Wizard of OCC (“awk”) found out about the Credit Scores and he was not happy.
  • 10. The Wizard of OCC thought Credit Scores looked like this…
  • 11. And The Wizard Wanted them to look more like this…
  • 12. So, The Wizard sent his Minions to do some work….
  • 13. Storm after Storm blew down on everyone using Credit Scores
  • 14. And Because the Wizard of OCC Wasn’t always real clear about what he wanted everybody to do. People were confused… REG B
  • 15. So the Wizard Tried Again, OCC 97-24 More Confusion……
  • 16. And Again…. OCC 2000-16 And Still, NOBODY seemed to know what to do….
  • 17. And then the wisest one of them all had an Idea…. C’mon, you guys! We just gotta go talk to the old bird
  • 18. Toto’s Right!
  • 19. So there was only one thing left to do….. They formed their little group and they went off to see the Wizard…. They just needed to TALK to him…..
  • 20. So, they followed the FICO-Built Road OCC
  • 21. But that proved kinda scary, everybody said The Wizard was REAL MEAN!
  • 22. And the Minions seemed to like that everybody was confused
  • 23. And all along the road There were “Empirically Derived”s And “Demonstrably and Statistically Sound”s There were Models, Reporting, and BackTests OH MY!
  • 24. They knew they weren’t in Kansas Anymore….
  • 25. But They Carried On In spite of attempts to deter them from their road…
  • 26. And When they finally got to The Wizard They made an appointment with his Admin
  • 27. When They First Got Inside, they WERE scared
  • 28. But Then They Realized something funny
  • 29. The Wizard of OCC wasn’t such a bad guy after all
  • 30. He Just wanted Everybody to Understand How the Ruby Slippers Were Made So that they held up, and didn’t fall apart, and were the right size, and were available to all, And so everybody could buy and sell more slippers, In a kinda sorta fair way….
  • 31. So the Wizard Of OCC Created
  • 32. And everybody sort of understood, And everybody was sort of happy
  • 33. It STILL wasn’t perfect, But it was a gosh-darn sight better than what came before….. REG B, OCC 97-24, OCC 2000-16
  • 34. There was something for everybody
  • 35. And Toto too….
  • 36. Dorothy Understood that she needed to spread the word
  • 37. And with the help of a very good Travel Agent….
  • 38. They loaded the Ruby Slippers and the New Instructions into the Open Gray Box
  • 39. And set off back to where it all started…
  • 40. * * Well, for the time being anyway….
  • 41. Pay No Attention… Part II
      • What we have learned thus far
          – Since the beginning, Models were Magical
          – Regulators were always concerned with Fairness and measurability
          – Models offer Promise but lots of confusion
              • Models are used for lots of different functions
              • Models are not always clearly understood
              • Regulating them lagged behind their prevalence and use
          – Multiple attempts to regulate but never clear
          – Finally catching up but still lagging
          – OCC 2011-12 best so far, large way there
      • Dem’s da rules, Dat’s how we gotta play…
  • 42. Models offered Promise but lots of confusion too
      • We started using models for all sorts of different functions
      • Consumers started asking lots of questions
      • “You didn’t Score enough” didn’t cut it
      • “Lemme talk to your MANAGER!”
  • 43. You see, time was….. Earlier Models were able to be very simply rendered. One just added up the points. If there were enough to pass the cutoff, the customer was approved. But still nobody really knew how to explain them.
        Characteristic                    Points
        Home Ownership
          Own                               35
          Rent                              25
          Lives with parents                20
          Other                             15
        Years On Job
          < 2 years                         15
          2 – 5 years                       20
          5 – 8 years                       20
          8+ years                          16
        Credit History
          < 2 years                          5
          2 – 4 years                       10
          4 – 7 years                       15
          7+ years                          20
        Credit Report
          < 3 Inquiries                     20
          3+ Inquiries                       5
          < 3 Satisfactory                  10
          3+ Satisfactory                   25
          Worst Rating 60+ Delinq          -10
          Worst Rating Derog               -20
          Worst Rating Satisfactory         20
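The "just add up the points" logic of such a scorecard can be sketched in C. This is a minimal illustration, not the author's implementation: the point values are one reading of the slide's flattened table, only four of the characteristics are included, and the function names, the handling of values that fall exactly on a band boundary, and any cutoff passed in are assumptions.

```c
/* Hypothetical encoding of the Home Ownership characteristic. */
enum home { OWN, RENT, WITH_PARENTS, HOME_OTHER };

/* Points for home ownership, as read from the slide. */
int home_points(enum home h) {
    switch (h) {
    case OWN:          return 35;
    case RENT:         return 25;
    case WITH_PARENTS: return 20;
    default:           return 15;
    }
}

/* Points for years on the job. The 5-8 and 8+ values are one
   reading of the flattened slide; boundary handling is assumed. */
int job_points(double years) {
    if (years < 2.0) return 15;
    if (years < 5.0) return 20;
    if (years < 8.0) return 20;
    return 16;
}

/* Points for length of credit history (boundaries assumed). */
int history_points(double years) {
    if (years < 2.0) return 5;
    if (years < 4.0) return 10;
    if (years < 7.0) return 15;
    return 20;
}

/* Points for the number of credit inquiries. */
int inquiry_points(int inquiries) {
    return inquiries < 3 ? 20 : 5;
}

/* The whole "model": add up the points. */
int scorecard_total(enum home h, double job_years,
                    double history_years, int inquiries) {
    return home_points(h) + job_points(job_years)
         + history_points(history_years) + inquiry_points(inquiries);
}

/* Approve when the total reaches the cutoff (the cutoff is assumed). */
int is_approved(int total, int cutoff) {
    return total >= cutoff;
}
```

With an assumed cutoff of 80, a homeowner with 10 years on the job, 8 years of credit history and one inquiry totals 91 and passes, while a renter with one year of each and four inquiries totals 50 and does not.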
  • 44. And we started using models for all kinds of things
      • Objective: Credit Extension; Prescreen Solicit; Line Authorizations, Cross-selling, Reissue; Collecting; Severe Collections Recovery
      • Tool: New Account Scoring; Solicitation Scoring; Behavior Scoring; Collection Scoring
      • Typical Sources of Data: Application; Credit Bureau; Demographics; Masterfile; Purchases & Payments; Loan Details
      • Linear Regression Models, Logistic Regression Models
  • 45. Models offered Promise but lots of confusion
      • Models used for lots of different functions
      • Consumers started asking lots of questions
          – Why did I only get that Loan Amount?
          – Why was I turned down?
          – Why didn’t you renew my Credit Line?
          – Why did you call me for a payment?
  • 46. Models offered Promise but lots of confusion
      • Models used for lots of different functions
      • Consumers gained Savvy and asked lots of questions
      • “You didn’t Score enough” didn’t cut it
          – Customers didn’t get it
          – Loan Officers also didn’t get it
          – The Tin Woodsman didn’t get it (and he had an Axe!)
  • 47. And now look where we are (not to mention where we’re going…)
      • Objective: Collections, Severe Recovery, Fraud, Attrition, Cross Sell, Utilization, Propensity, Operations
      • Tool: Collection Scoring
          – Traditional: MS Office Suite, Data extraction tools
          – Leading edge: Statistical packages (SAS, SPSS, R), Data Mining packages, Pattern Recognition Algorithms, Classification and Regression Trees (CART®), Stochastic Gradient Boosting (TreeNet®), Programming and Application Languages
      • Typical Sources of Data: Masterfile, Credit Bureau; DataMarts, Data Warehouses; Web Logs, Transactional Databases, Historical time series databases; Internal system databases (DDA, Collection, Recovery, Financial, etc.)
  • 48. Models offered Promise but lots of confusion
      • Models used for lots of different functions
      • Consumers started asking lots of questions
      • “You didn’t Score enough” didn’t cut it
  • 49. But how ya gonna keep ‘em down on the Farm…?
      • A plethora of Modeling techniques and Methodologies are part of Statistical training
      • Reality Bites
      • Only a very small number of learned statistical techniques can actually be used in most business scenarios
      • Where we can apply them in Business, even fewer of those meet usability requirements
          – Tracking, Monitoring, Maintaining, Refreshing
          – Time to Develop, Validate, Test, Deploy
          – Extensible, Scalable, contribute to KPIs and Financial Measures like ROA, RAROC, ROI, etc.
          – EXPLAINABLE! (ahhhh… back to the Regulations….in a moment…)
      • So in general, it makes more sense to use simpler types of models for most business applications
  • 50. So How ya gonna keep ‘em down on the Farm?
      • Easy.
      • Tell ‘em they have to follow 2011-12
      • They’ll NEVER leave!
  • 51. OCC 2011-12
      • The design, theory, and logic underlying the model should be well documented and generally supported by published research and sound industry practice. The model methodologies and processing components that implement the theory, including the mathematical specification and the numerical techniques and approximations, should be explained in detail with particular attention to merits and limitations.
  • 52. OCC 2011-12 (2)
      • Without adequate documentation, model risk assessment and management will be ineffective. Documentation of model development and validation should be sufficiently detailed so that parties unfamiliar with a model can understand how the model operates, its limitations, and its key assumptions. Documentation provides for continuity of operations, makes compliance with policy transparent, and helps track recommendations, responses, and exceptions.
  • 53. Vital organs of 2011-12
      • Oversight
          – Model Risk Management Division
          – Manage Model Risk like any other type of risk
          – Detailed Policies and procedures for Models, their uses and permitted Overrides
          – Rigorous assessment of Data quality, relevance, appropriateness and documentation
          – All model assumptions must be tracked and monitored
          – Appropriateness of chosen Methodology must be defensible (design and construction)
          – Audit and Compliance Signoffs
      • Rigorous Testing before Implementation
          – Stress testing against multiple economic and Financial Scenarios to identify model uncertainty and potential for inaccuracy
      • Independent Validation prior to Implementation (internal unit or Contracted External resource)
      • Model used for the population it was designed on
      • Reporting formalized, with pre-established thresholds for performance effectiveness and stability
      • Exhaustive documentation to EXPLAIN everything
          – Business Goals, Assumptions, Data, Intended Use, Methodology, How Model Works, ties in to Policy and Procedures, Adverse Action, Testing, Validation and tracking protocols, etc.
  • 54. EXPLAINING now is a really BIG thing… The sum of the square roots of any two sides of an isosceles triangle is equal to the square root of the remaining side. Oh joy! Rapture! I got a brain! How can I ever thank you enough?
  • 55. Explaining Models
      • Logistic and Linear Regression Models are very well understood, have been reliably used in Business Applications for over 60 years, and when properly built are stable, very good predictors of outcomes
      • Logistic and Linear Regression Models are relatively easy to explain
          – A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0)*
          – Logistic regression is used for predicting binary outcomes (Bernoulli trials) rather than continuous outcomes, and models a transformation of the expected value as a linear function of the predictors, rather than the expected value itself**
      *http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm
      **http://en.wikipedia.org/wiki/Logistic_regression#Definition
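Written out, the two model forms described on this slide look as follows. The slide does not name the transformation; the logit (log-odds) link shown here is the standard one for logistic regression, and the symbol p, standing for the probability that the binary outcome equals 1, is introduced only for this illustration.

```latex
\begin{align*}
Y &= a + bX && \text{(linear regression)} \\
\ln\frac{p}{1-p} &= a + bX && \text{(logistic regression: the log-odds are linear in } X\text{)} \\
p &= \frac{1}{1 + e^{-(a + bX)}} && \text{(the same model solved for } p\text{)}
\end{align*}
```

In both cases the explanation offered to a regulator or a customer comes down to one coefficient b per variable: how much Y, or the log-odds, moves when X moves by one unit.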
  • 56. Explaining Models (2)
      • Regression Models generally assume a statistically normal distribution of variables and predicted outcomes
      • Both Linear and Logistic Models are founded on the correlative nature of multiple variables to predicted outcomes and require some type of linear relationship between each variable and the predicted outcome
          – Sometimes (generally) first require data to be transformed in a variety of ways to establish an optimal linear relationship
          – Use a given variable only once in a given model, according to the (derived) linear relationship
              • One variable (or range), one coefficient
  • 57. On The Other Hand…
      • Business Data is becoming less and less normally distributed
      • Businesses must now pay more and more attention to exceptions and outliers in order to maximize targeting and profitability
      • Linear and Logistic methodologies are no longer always adequate to solve the more complex business challenges
          – Some build model suites to address a single challenge
          – Lead times for development, validation, testing and documenting suites of models are therefore much more extended
          – Newer methodologies can help here, in the sense that often one model can be built, but…..
      • but 2011-12 rears its head again….
  • 58. 2011-12 rears its head again
      • If ya’ can’t explain it, ya’ can’t use it
      • Neural Networks, Bayesian Networks, Stochastic Gradient Boosting, etc. all need to be explained
      • Mathematical formulas, and underpinnings like assumptions, must be justified, can be difficult to objectively explain, and may be difficult if not impossible to place into an Adverse Action context
  • 59. Why CART is so cool… See, Decision Trees are “easy” because we can explain this one no problem:
        INDUS <= 6.145
        INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145
        INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145
        INDUS > 6.145 && PT > 18.65 && NOX > 0.755
        INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165
        INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165
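One way to see why such a tree is "easy" to explain: the six terminal-node rules on this slide translate directly into a handful of nested comparisons, and each node carries its own plain-text reason. The sketch below assumes the rules as de-interleaved from the slide; the node numbering and both function names are hypothetical.

```c
/* Returns a terminal-node number (1-6) for the small tree on the slide.
   Only the split rules come from the slide; the numbering is assumed. */
int small_tree_node(double INDUS, double PT, double DIS,
                    double NOX, double LSTAT) {
    if (INDUS <= 6.145)
        return 1;
    if (PT <= 18.65)
        return (DIS <= 4.91145) ? 2 : 3;
    if (NOX > 0.755)
        return 4;
    return (LSTAT > 5.165) ? 5 : 6;
}

/* The rule that defines each node, usable as a ready-made explanation. */
const char *small_tree_rule(int node) {
    switch (node) {
    case 1: return "INDUS <= 6.145";
    case 2: return "INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145";
    case 3: return "INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145";
    case 4: return "INDUS > 6.145 && PT > 18.65 && NOX > 0.755";
    case 5: return "INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165";
    case 6: return "INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165";
    default: return "unknown node";
    }
}
```

Every scored case lands in exactly one node, so the rule string for that node is the whole explanation of why it was scored that way.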
  • 60. Even if it’s a bigger tree….
        INDUS <= 6.145 && MV <= 45.7
        INDUS <= 6.145 && MV > 45.7
        INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145 && TAX <= 17.5
        INDUS > 6.145 && PT <= 18.65 && DIS <= 4.91145 && TAX > 17.5
        INDUS > 6.145 && PT <= 18.65 && DIS > 4.91145
        INDUS > 6.145 && PT > 18.65 && NOX > 0.755
        INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT <= 5.165
        INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165 && DIS > 1.1333
        INDUS > 6.145 && PT > 18.65 && NOX <= 0.755 && LSTAT > 5.165 && DIS <= 1.1333
  • 61. But how in Munchkin Land can you explain this thing?
  • 62. And what if ya’ had something like THIS … Oh MY! [figure: a long row of trees joined by + signs]
  • 63. Even the TREES get confused… [figure: a long row of trees joined by + signs]
  • 64. BAD news SILENCE! • ya really can’t explain this one OCC 2011-12
  • 65. Good news • Ya CAN explain this one….. But what the Kansas is this thing anyway? [figure: a long row of trees joined by + signs]
  • 66. A Woodman’s view of TreeNet®
      • Borrowing from Dan Steinberg’s introductory video….
          – TreeNet® is also called Stochastic Gradient Boosting
          – Its speed and accuracy are unparalleled in Modeling and it has a number of advantages over more traditional methodologies
              • I will leave the Sales Pitch to Salford, but it is my favorite tool and we used it for every kind of model you can think of
          – I am no expert but here is kind of how it works (and TreeNet® does this automatically and keeps track of it all for you):
              • build an initial tree and identify the misclassifications
              • using the misclassified cases as the target, pull your whole sample again, develop a new tree based on that
              • continue until you have exhausted your errors. Could be hundreds or thousands of builds, all happening very quickly
              • You then “simply” add up all of the weights of the variables in the individual trees and Voilà!
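The build-a-tree, chase-the-errors, add-it-all-up loop described on this slide can be sketched in C. This is not Salford's TreeNet®: it is a deliberately simplified boosting variant that fits one-split trees ("stumps") on a single predictor to the current residuals under squared error, with no sampling. Every name (Stump, fit_stump, boost, ensemble_predict), the number of trees, the learning rate and the 256-case limit are assumptions made for the illustration.

```c
#include <stddef.h>

#define N_TREES 50        /* assumed number of boosting rounds */
#define LEARN_RATE 0.1    /* assumed shrinkage applied to each tree */

/* A one-split tree ("stump") on a single predictor. */
typedef struct {
    double threshold;
    double left_value;    /* output when x <  threshold */
    double right_value;   /* output when x >= threshold */
} Stump;

/* Fit the stump that minimizes squared error against the residuals. */
Stump fit_stump(const double *x, const double *resid, size_t n) {
    Stump best = {0.0, 0.0, 0.0};
    double best_err = -1.0;               /* sentinel: nothing fitted yet */
    for (size_t s = 0; s < n; s++) {      /* try each x as a split point */
        double thr = x[s], sum_l = 0.0, sum_r = 0.0;
        size_t n_l = 0, n_r = 0;
        for (size_t i = 0; i < n; i++) {
            if (x[i] < thr) { sum_l += resid[i]; n_l++; }
            else            { sum_r += resid[i]; n_r++; }
        }
        double mean_l = n_l ? sum_l / (double)n_l : 0.0;
        double mean_r = n_r ? sum_r / (double)n_r : 0.0;
        double err = 0.0;
        for (size_t i = 0; i < n; i++) {
            double d = resid[i] - (x[i] < thr ? mean_l : mean_r);
            err += d * d;
        }
        if (best_err < 0.0 || err < best_err) {
            best_err = err;
            best.threshold = thr;
            best.left_value = mean_l;
            best.right_value = mean_r;
        }
    }
    return best;
}

double stump_predict(const Stump *t, double x) {
    return x < t->threshold ? t->left_value : t->right_value;
}

/* Each new stump targets what the earlier stumps still get wrong.
   trees must hold N_TREES stumps and pred must hold n doubles. */
void boost(const double *x, const double *y, size_t n,
           Stump *trees, double *pred) {
    double resid[256];                    /* sketch-only limit: n <= 256 */
    for (size_t i = 0; i < n; i++) pred[i] = 0.0;
    for (int t = 0; t < N_TREES; t++) {
        for (size_t i = 0; i < n; i++) resid[i] = y[i] - pred[i];
        trees[t] = fit_stump(x, resid, n);
        for (size_t i = 0; i < n; i++)
            pred[i] += LEARN_RATE * stump_predict(&trees[t], x[i]);
    }
}

/* Scoring a new case: add up the shrunken outputs of all the trees. */
double ensemble_predict(const Stump *trees, double x) {
    double sum = 0.0;
    for (int t = 0; t < N_TREES; t++)
        sum += LEARN_RATE * stump_predict(&trees[t], x);
    return sum;
}
```

The final score is nothing more than a sum over many small trees, which is why the exported code is long but simple, and why explaining the whole ensemble is harder than explaining any single tree in it.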
  • 67. Think about it like this….
      • So you get your one tree…
      • TreeNet® changes your target to the Misclasses and creates a second tree….
      • And TreeNet® does it again and again and again while you get a treat for Toto……
      • In the end, TreeNet® adds the weights of the variables in all the trees together….. [figure: trees joined by + signs]
      • Then you simply export the code and implement the model!
  • 68. Here’s a bit of what a TreeNet Model looks like to a C Programmer

        /***********************************************************
         * The following C source code was automatically generated
         * by the TRANSLATE feature in Salford Predictive Miner(tm).
         * Modeling version: 6.6.0.091, Translation version: 6.6.0.091
         ***********************************************************/

        #include <string.h> /* for strcmp() */
        #include <math.h>   /* for exp() */

        /***********************************************************
         **** APPLICATION-DEPENDENT MISSING VALUES ****
         * The two constants must be set **by you** to whatever
         * value(s) you use in your data management or programming
         * workflow to represent missing data.
         ***********************************************************/

        const double DBL_MISSING_VALUE = /* value needed here! */ ;
        const int INT_MISSING_VALUE = /* value needed here! */ ;

        /************
         * PREDICTORS
         ************/

        double CRIM, ZN, INDUS, NOX, RM, AGE, DIS, RAD, TAX, PT, B, LSTAT, MV;

        /***********************************************************
         * Here come the treenets in the grove. A shell for calling them
         * appears at the end of this source file.
         ***********************************************************/

        double TreeNet_1(double * const pProb0, double * const pProb1)
        {
          /* TreeNet version: 6.6.0.091 */
          /* TreeNet: TreeNet_1 */
          /* Timestamp: 2012043172135 */
          /* Grove: C:DOCUME~1OfficeLOCALS~1Temps5u137 */
          /* Target: CHAS */
          /* N trees: 197 */
          /* N target classes: 2 */

          double target, net_response = 0.0;
          int node, done;
          int response = 0;

          /***************************/
          /* Class-specific treenets */
          /***************************/

          double expsum = 0.0;
          double prob0, score0; /* CHAS = 0 */
          double prob1, score1; /* CHAS = 1 */

          /* The following predictors had no missing data in */
          /* the learn sample, so the TreeNet model is unable to */
          /* accommodate missing data for them during scoring. */
          /* They must be imputed. These particular values are */
          /* the learn sample medians and/or modes. These are */
          /* provided as a convenience, you may wish to replace */
          /* these expressions with your own. */

          if (CRIM == DBL_MISSING_VALUE) CRIM = 0.2102;
          if (ZN == DBL_MISSING_VALUE) ZN = 0;
          if (INDUS == DBL_MISSING_VALUE) INDUS = 8.14;
          if (NOX == DBL_MISSING_VALUE) NOX = 0.515;
          if (RM == DBL_MISSING_VALUE) RM = 6.251;
          if (AGE == DBL_MISSING_VALUE) AGE = 74.3;
          if (DIS == DBL_MISSING_VALUE) DIS = 3.4211;
          if (RAD == DBL_MISSING_VALUE) RAD = 5;
          if (TAX == DBL_MISSING_VALUE) TAX = 207;
          if (PT == DBL_MISSING_VALUE) PT = 18.6;
          if (B == DBL_MISSING_VALUE) B = 192.11;
          if (LSTAT == DBL_MISSING_VALUE) LSTAT = 10.3;
          if (MV == DBL_MISSING_VALUE) MV = 21.7;

          /* Tree 1 of 197 */
          /* N terminal nodes = 6, Depth = 5 */

          target = 0.0;
          node = 1; /* start at root node */
          done = 0; /* set at terminal node */

          while (!done) switch (node) {

            case 1:
              if (NOX != DBL_MISSING_VALUE && NOX < 0.755) node = 2;
              else node = -6;
              break;

            case 2:
              if (TAX != DBL_MISSING_VALUE && TAX < 278) node = 3;
              else node = 5;
              break;

            case 3:
              if (RM != DBL_MISSING_VALUE && RM < 5.93) node = -1;
              else node = 4;
              break;
  • 69. TreeNet® code (2): Code for the first 3 Trees in the Model…*

            case -1:
              target = -1.202511;
              node = 1;
              done = 1;
              break;

            case 4:
              if (LSTAT != DBL_MISSING_VALUE && LSTAT < 6.13) node = -2;
              else node = -3;
              break;

            case -2:
              target = -1.217944;
              node = 2;
              done = 1;
              break;

            case -3:
              target = -1.2337965;
              node = 3;
              done = 1;
              break;

            case 5:
              if (MV != DBL_MISSING_VALUE && MV < 27.3) node = -4;
              else node = -5;
              break;

            case -4:
              target = -1.2337965;
              node = 4;
              done = 1;
              break;

            case -5:
              target = -1.2231822;
              node = 5;
              done = 1;
              break;

            case -6:
              target = -1.2087922;
              node = 6;
              done = 1;
              break;

            default: /* error */
              target = 0.0;
              done = 1;
              node = 0;
              break;
          }

          net_response += target;

          /* Tree 2 of 197 */
          /* N terminal nodes = 6, Depth = 5 */

          target = 0.0;
          node = 1; /* start at root node */
          done = 0; /* set at terminal node */

          while (!done) switch (node) {

            case 1:
              if (NOX != DBL_MISSING_VALUE && NOX < 0.7155) node = 2;
              else node = -6;
              break;

            case 2:
              if (PT != DBL_MISSING_VALUE && PT < 17.7) node = 3;
              else node = 5;
              break;

            case 3:
              if (TAX != DBL_MISSING_VALUE && TAX < 40.5) node = -1;
              else node = 4;
              break;

            case -1:
              target = 0.024272515;
              node = 1;
              done = 1;
              break;

            case 4:
              if (CRIM != DBL_MISSING_VALUE && CRIM < 0.191425) node = -2;
              else node = -3;
              break;

            case -2:
              target = -0.005427301;
              node = 2;
              done = 1;
              break;

            case -3:
              target = 0.0093125903;
              node = 3;
              done = 1;
              break;

            case 5:
              if (RM != DBL_MISSING_VALUE && RM < 5.5815) node = -4;
              else node = -5;
              break;

            case -4:
              target = 0.00081652142;
              node = 4;
              done = 1;
              break;

            case -5:
              target = -0.0047567333;
              node = 5;
              done = 1;
              break;

            case -6:
              target = 0.01884071;
              node = 6;
              done = 1;
              break;

            default: /* error */
              target = 0.0;
              done = 1;
              node = 0;
              break;
          }

          net_response += target;

          /* Tree 3 of 197 */
          /* N terminal nodes = 6, Depth = 5 */
          (…..)

      *NOTE! We multiplied the results times 10,000 to eliminate double precision problems during implementation… Ask me!
  • 70. Imagine that for THOUSANDS of trees….
  • 71. But back to the Wizard of OCC….
      • Forget about the code… it’s just text! IT can handle it!
      • What you need to focus on is explaining it all for the Wizard….
      • And that doesn’t mean slapping down a bunch of code lines
      • The Wizard needs to understand how come the Ruby Slippers fit so well, how the Slippers were put together, and where the material comes from (the variables and weights that drive the results)
      • Especially if you need to communicate to customers the effects of wearing the Slippers
          – In modeling terms, like if it is an Origination model needing Score Factor Codes for Adverse Action Letters…
      • So here’s one way to do that….
  • 72. CASE STUDY from Real Life….
      • Attrition Model – Customer will close all accounts
      • Needed Talking Points (Score Factors) to facilitate attempts to save customer accounts
      • Built TreeNet® model to predict probability that a customer will close all of their accounts
      • Identified CART Equivalent Rules for all Accounts
      • Pulled new out of sample data for recent periods
      • Scored and Validated the results against known outcomes
      • Based on the Probability, generated list of high risk accounts and pushed to Branches with Score Factors (rules) appended
  • 73. Attrition Model Process
      • Built TreeNet® Model
      • Scored Validation Set using model built
      • Created new data set appending probability score and Node identifier to each sample point
      • Identified Variable Importance
      • Used CART to derive a Regression tree using TreeNet® score as the target
      • Compared Variable Importances
      • Looked at rules governing each of the like nodes
      • Manually went through tree finding Terminal Nodes with like Mean values
      • Generalized like nodes based on rules and split thresholds, creating factors such as “Low Balance,” “Short Time On Books,” “Diminishing Balance Over Last 6 Months,” etc.
      • Pruned Tree where possible (without fundamentally changing Rules and split thresholds)
      • Analyzed each step to understand Utility vs. Complexity tradeoffs
      • Tested outcome (same data) with the generalized variables
      • Tested with repeated out-of-sample Validation sets
      • Subjected process to Model Risk Management Unit which independently validated model and documentation
      • Implemented Model
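The step "derive a Regression tree using TreeNet® score as the target" is the hinge of the whole process: the single tree is fit to the ensemble's score, not to the original outcome. A regression tree is grown by repeatedly choosing the split that most reduces squared error, so one such step can be sketched as below. This is a generic least-squares split search written for illustration, not CART's actual implementation; the function name, the toy rows, and the scores are all made up.

```python
def best_split(rows, scores, variables):
    """Return (sse, variable, threshold) for the split with the lowest
    total squared error around the two child means."""

    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for var in variables:
        distinct = sorted({row[var] for row in rows})
        # Candidate thresholds sit midway between adjacent distinct values.
        for low, high in zip(distinct, distinct[1:]):
            threshold = (low + high) / 2
            left = [s for r, s in zip(rows, scores) if r[var] <= threshold]
            right = [s for r, s in zip(rows, scores) if r[var] > threshold]
            candidate = (sse(left) + sse(right), var, threshold)
            if best is None or candidate < best:
                best = candidate
    return best


# Toy data: `scores` plays the role of the ensemble's predicted probability.
rows = [
    {"YRS_OB": 2, "CONTACTS": 5},
    {"YRS_OB": 3, "CONTACTS": 30},
    {"YRS_OB": 8, "CONTACTS": 6},
    {"YRS_OB": 9, "CONTACTS": 25},
]
scores = [0.9, 0.8, 0.1, 0.2]
_, split_var, split_at = best_split(rows, scores, ["YRS_OB", "CONTACTS"])
```

Applying the same search recursively to each child yields the full tree, whose terminal nodes then supply readable rules of the form `VAR <= threshold && VAR > threshold`.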
  • 74. A Schematical* Representation of what I just explained… [Figure: Initial Regression Tree (post-TreeNet®)] *HAH! I love new words….
  • 75. Step 1: Look at which cases hit the same Node and group them
      CASEID      RESPONSE  NODE | CASEID      RESPONSE  NODE | CASEID      RESPONSE  NODE | CASEID      RESPONSE  NODE
           1  -0.000004613    20 |     31   -0.00000251    14 |     61   0.000006376     3 |     91    0.00001477    21
           2  -0.000064770    12 |     32   0.000005463     1 |     62   0.000006376     3 |     92    0.00001477    21
           3   0.002201621     9 |     33   -0.00000251    14 |     63   0.000006376     3 |     93    0.00001477    21
           4   0.002201621     9 |     34   0.000005463     1 |     64   0.000006376     3 |     94    0.00001477    21
           5   0.002201621     9 |     35   0.000005463     1 |     65  -0.000046164    13 |     95    0.00001477    21
           6   0.002201621     9 |     36    0.00002013     4 |     66  -0.000004613    20 |     96   0.000020736    22
           7   0.000004718    19 |     37    0.00002013     4 |     67   0.000004618     2 |     97   0.000020736    22
           8  -0.000000771    13 |     38   0.000016180    14 |     68   0.000005193    16 |     98   0.002227983    11
           9  -0.000004596    18 |     39   0.000016180    14 |     69   0.000005193    16 |     99   0.002227983    11
          10   0.000004718    19 |     40   0.002227895     7 |     70   0.000005193    16 |    100  -0.000068815    17
          11   0.000004718    19 |     41    0.00221521     5 |     71   0.000005463     1 |    101   0.001503144     3
          12   0.000004718    19 |     42    0.00221521     5 |     72   0.000005463     1 |    102   0.000014534    24
          13   0.000004718    19 |     43    0.00221521     5 |     73   0.000005463     1 |    103   0.000005463     1
          14   0.000005463     1 |     44  -0.000064770    12 |     74   0.000005463     1 |    104   0.000005463     1
          15   0.000005463     1 |     45  -0.000064770    12 |     75   0.000005463     1 |    105   0.000005463     1
          16   0.000005463     1 |     46   0.004469995     2 |     76   0.000005463     1 |    106   0.000005463     1
          17   0.000005463     1 |     47   0.004469995     2 |     77   0.000005463     1 |    107   0.000005463     1
          18   0.000005463     1 |     48  -0.000064770    12 |     78   0.000005463     1 |    108   0.000005463     1
          19   0.000015537     7 |     49   0.004469995     2 |     79   0.000005463     1 |    109   0.000005463     1
          20   0.000005463     1 |     50   0.004469995     2 |     80   0.000005463     1 |    110   0.000005463     1
          21   0.000005463     1 |     51  -0.000046164    13 |     81   0.001503144     3 |    111   0.000005463     1
          22   0.000005463     1 |     52  -0.000046164    13 |     82    0.00002013     4 |    112   0.000014534    24
          23   0.000005463     1 |     53    0.00222916     8 |     83   0.000016180     5 |    113   0.000005463     1
          24   0.000005463     1 |     54  -0.000046164    13 |     84    0.00002013     4 |    114   0.000005463     1
          25   0.000005463     1 |     55   0.000005193    16 |     85   0.000008446    15 |    115   0.000005463     1
          26   0.000005463     1 |     56   0.002220315     6 |     86   0.000008446    15 |    116   0.000005463     1
          27   0.000005463     1 |     57  -0.000004613    20 |     87   0.000008446    15 |    117   0.000005463     1
          28   0.000005463     1 |     58    0.00240118    10 |     88   0.000008446    15 |    118   0.000005463     1
          29   0.000005463     1 |     59   0.000006376     3 |     89    0.00240118    10 |    119   0.000005463     1
          30   0.000005463     1 |     60   0.000006376     3 |     90   0.002227983    11 |    120   0.000005463     1
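Step 1 is a plain group-by on the node identifier. Here is a minimal sketch using a handful of rows copied from the slide; the function name is mine.

```python
from collections import defaultdict

# (CASEID, RESPONSE, NODE) rows copied from the Step 1 slide.
SCORED = [
    (1, -0.000004613, 20),
    (2, -0.000064770, 12),
    (3, 0.002201621, 9),
    (4, 0.002201621, 9),
    (5, 0.002201621, 9),
    (6, 0.002201621, 9),
    (7, 0.000004718, 19),
    (66, -0.000004613, 20),
]


def group_by_node(scored):
    """Map each terminal node to the list of case ids that landed in it."""
    groups = defaultdict(list)
    for case_id, _response, node in scored:
        groups[node].append(case_id)
    return dict(groups)


groups = group_by_node(SCORED)
```

Every case in a terminal node shares that node's rule and its mean response, so from here on the node, not the individual case, is the unit of analysis.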
  • 76. Step 2: Sort by Response and determine if it makes sense to group similar outcomes
      CASEID       RESPONSE  NODE | CASEID       RESPONSE  NODE | CASEID       RESPONSE  NODE | CASEID       RESPONSE  NODE
         100  -0.0000688150    17 |     18   0.0000054630     1 |    108   0.0000054630     1 |     38   0.0000161800    14
           2  -0.0000647700    12 |     20   0.0000054630     1 |    109   0.0000054630     1 |     39   0.0000161800    14
          44  -0.0000647700    12 |     21   0.0000054630     1 |    110   0.0000054630     1 |     83   0.0000161800    14
          45  -0.0000647700    12 |     22   0.0000054630     1 |    111   0.0000054630     1 |     36   0.0000201300     4
          48  -0.0000647700    12 |     23   0.0000054630     1 |    113   0.0000054630     1 |     37   0.0000201300     4
          51  -0.0000461640    13 |     24   0.0000054630     1 |    114   0.0000054630     1 |     82   0.0000201300     4
          52  -0.0000461640    13 |     25   0.0000054630     1 |    115   0.0000054630     1 |     84   0.0000201300     4
          54  -0.0000461640    13 |     26   0.0000054630     1 |    116   0.0000054630     1 |     96   0.0000207360    22
          65  -0.0000461640    13 |     27   0.0000054630     1 |    117   0.0000054630     1 |     97   0.0000207360    22
           1  -0.0000046130    20 |     28   0.0000054630     1 |    118   0.0000054630     1 |     81   0.0015031440     3
          57  -0.0000046130    20 |     29   0.0000054630     1 |    119   0.0000054630     1 |    101   0.0015031440     3
          66  -0.0000046130    20 |     30   0.0000054630     1 |    120   0.0000054630     1 |      3   0.0022016210     9
           9  -0.0000045960    18 |     32   0.0000054630     1 |     59   0.0000063760     3 |      4   0.0022016210     9
          31  -0.0000025100    14 |     34   0.0000054630     1 |     60   0.0000063760     3 |      5   0.0022016210     9
          33  -0.0000025100    14 |     35   0.0000054630     1 |     61   0.0000063760     3 |      6   0.0022016210     9
           8  -0.0000007710    13 |     71   0.0000054630     1 |     62   0.0000063760     3 |     41   0.0022152100     5
          67   0.0000046180     2 |     72   0.0000054630     1 |     63   0.0000063760     3 |     42   0.0022152100     5
           7   0.0000047180    21 |     73   0.0000054630     1 |     64   0.0000063760     3 |     43   0.0022152100     5
          10   0.0000047180    21 |     74   0.0000054630     1 |     85   0.0000084460    15 |     56   0.0022203150     6
          11   0.0000047180    21 |     75   0.0000054630     1 |     86   0.0000084460    15 |     40   0.0022278950     7
          12   0.0000047180    21 |     76   0.0000054630     1 |     87   0.0000084460    15 |     90   0.0022279830    11
          13   0.0000047180    21 |     77   0.0000054630     1 |     88   0.0000084460    15 |     98   0.0022279830    11
          55   0.0000051930    16 |     78   0.0000054630     1 |    102   0.0000145340    24 |     99   0.0022279830    11
          68   0.0000051930    16 |     79   0.0000054630     1 |    112   0.0000145340    24 |     53   0.0022291600     8
          69   0.0000051930    16 |     80   0.0000054630     1 |     91   0.0000147700    19 |     58   0.0024011800    10
          70   0.0000051930    16 |    103   0.0000054630     1 |     92   0.0000147700    19 |     89   0.0024011800    10
          14   0.0000054630     1 |    104   0.0000054630     1 |     93   0.0000147700    19 |     46   0.0044699950     2
          15   0.0000054630     1 |    105   0.0000054630     1 |     94   0.0000147700    19 |     47   0.0044699950     2
          16   0.0000054630     1 |    106   0.0000054630     1 |     95   0.0000147700    19 |     49   0.0044699950     2
          17   0.0000054630     1 |    107   0.0000054630     1 |     19   0.0000155370     7 |     50   0.0044699950     2
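Once the nodes are sorted by response, "similar outcomes" can be flagged mechanically before anyone looks at the rules. The sketch below chains together neighbouring nodes whose mean responses differ by no more than a relative tolerance. The node means are taken from the slides; the 5% tolerance and the function name are assumptions of mine, since the deck describes this step as a manual review.

```python
# Node -> mean response, as shown on the slides for these five nodes.
NODE_MEANS = {
    4: 0.00002013,
    22: 0.000020736,
    11: 0.002227983,
    8: 0.00222916,
    2: 0.004469995,
}


def like_nodes(node_means, tol=0.05):
    """Sort nodes by mean response and chain neighbours whose means
    differ by at most `tol` relative to the larger magnitude."""
    ordered = sorted(node_means.items(), key=lambda item: item[1])
    groups = []
    previous = None
    for node, mean in ordered:
        close = (
            previous is not None
            and abs(mean - previous) <= tol * max(abs(mean), abs(previous))
        )
        if close:
            groups[-1].append(node)  # same outcome band as the neighbour
        else:
            groups.append([node])    # start a new band
        previous = mean
    return groups


candidates = like_nodes(NODE_MEANS)
```

A pair flagged this way is only a candidate. Whether the two rules can actually be merged is a separate question, which the next slides take up.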
  • 77. Similar Outcomes, Consolidate Rules?
      Tnode 4 (0.00002013): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && BRANCH <= 417.5 && C_BTL > 0.04427 && PROFIT_ILE <= 24.95
      Tnode 22 (0.00002074): YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL <= 0.755 && TRANS_NUM > 540.5 && M_VAL <= 39.9
      Tnode 8 (0.00222916): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && PROFIT_ILE > 24.95 && BRANCH <= 290.5 && TRANS_NUM > 1325.5 && TRANS_NUM <= 2731
      Tnode 11 (0.00222798): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && BRANCH > 290.5 && BRANCH <= 417.5 && NUM_PROD > 4.5 && FEE_LEVEL > 0.5055 && PROFIT_ILE > 24.95 && PROFIT_ILE <= 96.05 && MOS_ACTIVE <= 367.315 && TRANS_NUM <= 1563
  • 78. Node Rule Consolidation Decisions
      Tnode 4 (0.00002013): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && BRANCH <= 417.5 && C_BTL > 0.04427 && PROFIT_ILE <= 24.95
      Tnode 22 (0.00002074): YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL <= 0.755 && TRANS_NUM > 540.5 && M_VAL <= 39.9
      TN4 and TN22: Not good for Consolidation, too many differences
      Tnode 8 (0.00222916): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && PROFIT_ILE > 24.95 && BRANCH <= 290.5 && TRANS_NUM > 1325.5 && TRANS_NUM <= 2731
      Tnode 11 (0.002227983): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && BRANCH > 290.5 && BRANCH <= 417.5 && NUM_PROD > 4.5 && FEE_LEVEL > 0.5055 && PROFIT_ILE > 24.95 && PROFIT_ILE <= 96.05 && MOS_ACTIVE <= 367.315 && TRANS_NUM <= 1563
      TN8 and TN11: Good for Consolidation, differences can be dealt with
      Rules Tnodes 8 & 11: BRANCH = “NORTHERN THRU CENTRAL”; YRS_OB > 6; CONTACTS = “LOW”; TRANS_NUM = “MODERATE”
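The "too many differences" versus "differences can be dealt with" call was a judgment made by reading the rules, but counting shared conditions gives a quick first look at the same thing. The sketch below splits each rule on `&&` and compares the resulting sets; the rule strings are copied from the slide, while the helper names and the idea of using overlap counts as a screening aid are mine.

```python
RULES = {
    4: "YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && "
       "BRANCH <= 417.5 && C_BTL > 0.04427 && PROFIT_ILE <= 24.95",
    22: "YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL <= 0.755 && "
        "TRANS_NUM > 540.5 && M_VAL <= 39.9",
    8: "YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && "
       "C_BTL > 0.04427 && PROFIT_ILE > 24.95 && BRANCH <= 290.5 && "
       "TRANS_NUM > 1325.5 && TRANS_NUM <= 2731",
    11: "YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && "
        "C_BTL > 0.04427 && BRANCH > 290.5 && BRANCH <= 417.5 && "
        "NUM_PROD > 4.5 && FEE_LEVEL > 0.5055 && PROFIT_ILE > 24.95 && "
        "PROFIT_ILE <= 96.05 && MOS_ACTIVE <= 367.315 && TRANS_NUM <= 1563",
}


def conditions(rule):
    """Split a rule string into its individual conditions."""
    return {part.strip() for part in rule.split("&&")}


def overlap(node_a, node_b):
    """Return (shared conditions, total distinct conditions) for two nodes."""
    a, b = conditions(RULES[node_a]), conditions(RULES[node_b])
    return len(a & b), len(a | b)
```

On the slide's rules, nodes 4 and 22 share a single condition (`YRS_OB > 6.145`), while nodes 8 and 11 share five, which lines up with the decisions shown. The count is only a screen: it says nothing about whether the remaining differences (here BRANCH and TRANS_NUM ranges) can be generalized into one label.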
  • 79. Determine if Pruning the tree will appreciably affect the generic rules
      TNode 5: YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL <= 0.755 && TRANS_NUM <= 540.5
      TNode 8: YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL > 0.755
      Rules Tnodes 5 & 8: BRANCH = “NORTHERN THRU CENTRAL”; YRS_OB > 6; CONTACTS = “LOW”; TRANS_NUM = “MODERATE”
  • 80. Analyze Differences: Full vs. Pruned Trees
      F_CASEID  F_NODE  P_CASEID  P_NODE  Same Node
             1       1         1       1  TRUE
             2       3         2       3  TRUE
             3       3         3       3  TRUE
             4       1         4       1  TRUE
             5       1         5       1  TRUE
             6       1         6       1  TRUE
             7      20         7       4  FALSE
             8      20         8       4  FALSE
             9      20         9       4  FALSE
            10      20        10       4  FALSE
            11      20        11       4  FALSE
            12      20        12       4  FALSE
            13      20        13       4  FALSE
            14      22        14       6  FALSE
            15      22        15       6  FALSE
            16      22        16       6  FALSE
            17      22        17       6  FALSE
            18      22        18       6  FALSE
            19      22        19       6  FALSE
            20      22        20       6  FALSE
            21      22        21       6  FALSE
            22      22        22       6  FALSE
            23      22        23       6  FALSE
            24      22        24       6  FALSE
            25      22        25       6  FALSE
            26      22        26       6  FALSE
            27      22        27       6  FALSE
            28      22        28       6  FALSE
            29      22        29       6  FALSE
            30      22        30       6  FALSE
            31      22        31       6  FALSE
            32      22        32       6  FALSE
            33      22        33       6  FALSE
            34      22        34       6  FALSE
      Counts: F_NODE (rows) by P_NODE (columns 1 to 8). Each row has a single non-empty cell, equal to its Grand Total; the column position of that cell is not preserved in this transcript.
      F_NODE  Grand Total
           1          154
           2            9
           3            5
           4            5
           5            1
           6            9
           7            3
           8            9
           9           12
          10            6
          11            9
          12            2
          13            4
          14           10
          15            1
          16            9
          17            1
          18            4
          19            9
          20           21
          21            5
          22          208
          23            2
          24            8
      Grand Total by P_NODE: 1 = 154, 2 = 9, 3 = 99, 4 = 21, 5 = 5, 6 = 208, 7 = 2, 8 = 8; overall 506
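The counts table on this slide is a cross-tabulation of each case's node in the full tree against its node in the pruned tree. A small sketch using the 34 case rows listed on the slide follows; the function names are mine.

```python
from collections import Counter

# F_NODE and P_NODE for cases 1-34, as listed on the slide.
FULL = [1, 3, 3, 1, 1, 1] + [20] * 7 + [22] * 21
PRUNED = [1, 3, 3, 1, 1, 1] + [4] * 7 + [6] * 21


def crosstab(full, pruned):
    """Count cases for every (full-tree node, pruned-tree node) pair."""
    return Counter(zip(full, pruned))


def is_many_to_one(table):
    """True if each full-tree node maps to exactly one pruned-tree node,
    i.e. pruning merged nodes but never split one apart."""
    targets = {}
    for full_node, pruned_node in table:
        targets.setdefault(full_node, set()).add(pruned_node)
    return all(len(nodes) == 1 for nodes in targets.values())


table = crosstab(FULL, PRUNED)
```

Note that a FALSE in the "Same Node" column does not by itself mean the case is treated differently: node numbers are reassigned when the tree shrinks, so full-tree node 20 showing up as pruned node 4 may be the same segment under a new label. The cross-tab is what shows which full-tree nodes were folded together.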
  • 81. Variable Importance Changes
      Full Tree Variable Importance | Pruned Tree Variable Importance
      Variable      Score           | Variable      Score
      M_VAL        100.00           | M_VAL        100.00
      TRANS_NUM     90.09           | CONTACTS      80.93
      CONTACTS      88.67           | TRANS_NUM     80.71
      FEE_LEVEL     79.38           | FEE_LEVEL     75.06
      BRANCH        70.26           | YRS_OB        70.90
      YRS_OB        66.95           | NUM_ACCTS     60.14
      REGION        56.57           | REGION        56.84
      NUM_ACCTS     56.23           | BRANCH        41.87
      PROFIT_ILE    39.96           | FAM_MEMS      28.42
      FAM_MEMS      35.23           | PROFIT_ILE    17.74
      C_BTL         34.01           | C_BTL         16.08
      NUM_PROD      23.71           | NUM_PROD       8.96
      MOS_ACTIVE     9.20           |
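Comparing the two importance lists by rank makes the effect of pruning easier to read than comparing raw scores. The sketch below uses the scores from this slide; the function names are mine.

```python
FULL = {
    "M_VAL": 100.00, "TRANS_NUM": 90.09, "CONTACTS": 88.67,
    "FEE_LEVEL": 79.38, "BRANCH": 70.26, "YRS_OB": 66.95,
    "REGION": 56.57, "NUM_ACCTS": 56.23, "PROFIT_ILE": 39.96,
    "FAM_MEMS": 35.23, "C_BTL": 34.01, "NUM_PROD": 23.71,
    "MOS_ACTIVE": 9.20,
}
PRUNED = {
    "M_VAL": 100.00, "CONTACTS": 80.93, "TRANS_NUM": 80.71,
    "FEE_LEVEL": 75.06, "YRS_OB": 70.90, "NUM_ACCTS": 60.14,
    "REGION": 56.84, "BRANCH": 41.87, "FAM_MEMS": 28.42,
    "PROFIT_ILE": 17.74, "C_BTL": 16.08, "NUM_PROD": 8.96,
}


def ranks(importance):
    """Rank variables by score, 1 = most important."""
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {var: position for position, var in enumerate(ordered, start=1)}


def rank_shifts(full, pruned):
    """Map each full-tree variable to (full rank, pruned rank or None)."""
    full_ranks, pruned_ranks = ranks(full), ranks(pruned)
    return {var: (full_ranks[var], pruned_ranks.get(var)) for var in full_ranks}


shifts = rank_shifts(FULL, PRUNED)
```

A `None` in the second position marks a variable that no longer appears in the pruned tree's list, as with MOS_ACTIVE here.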
  • 82. Before…
      Tnode 8 (0.00222916): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && PROFIT_ILE > 24.95 && BRANCH <= 290.5 && TRANS_NUM > 1325.5 && TRANS_NUM <= 2731
      Tnode 11 (0.002227983): YRS_OB > 6.145 && CONTACTS <= 18.65 && NUM_ACCTS <= 4.5 && C_BTL > 0.04427 && BRANCH > 290.5 && BRANCH <= 417.5 && NUM_PROD > 4.5 && FEE_LEVEL > 0.5055 && PROFIT_ILE > 24.95 && PROFIT_ILE <= 96.05 && MOS_ACTIVE <= 367.315 && TRANS_NUM <= 1563
      TN8 and TN11: Good for Consolidation, differences can be dealt with
      Rules Tnodes 8 & 11: BRANCH = “NORTHERN THRU CENTRAL”; YRS_OB > 6; CONTACTS = “LOW”; TRANS_NUM = “MODERATE”
    and After…
      TNode 5: YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL <= 0.755 && TRANS_NUM <= 540.5
      TNode 8: YRS_OB > 6.145 && CONTACTS > 18.65 && FEE_LEVEL > 0.755
      Rules TNodes 5 & 8: BRANCH = “NORTHERN THRU CENTRAL”; YRS_OB > 6; CONTACTS = “LOW”; TRANS_NUM = “MODERATE”
    Effect after pruning (Where Art Meets Science):
      • TNodes change from 8 and 11 to 5 and 8 (Smaller tree)
      • “BRANCH” kept since it applied prior to pruning and aids in list generation and routing
      • “YRS_OB” split threshold becomes a rounded, generalized threshold
      • Generalization can still be used
      • In this example, “FEE_LEVEL” was not included (“<=” and “>” cancel each other out)
      • “CONTACTS” thresholds change (“<=” becomes “>”) but threshold still can be used within “Low” designation
      • “TRANS_NUM” was kept since it applied prior to pruning and aided in talking points
  • 83. RUN, Toto, RUN!!!!
      • Implement the Dog-gone thing!
      Customer             | Branch | Risk | Point 1             | Point 2                        | Point 3                        | Point 4
      Bill Muchkinovski    | 200    | Low  | Long time on books  | 6 Mos. Moderate Balance        | Moderate number products       | Moderate Profit
      Millie Smoller       | 27     | Med  | Short time on books | 6 Mos. Low Balance             | Low Number Products            | 6 Mos. Low number contacts
      Beulah Diminuitive   | 343    | Med  | Short time on books | 6 Mos. Low Balance             | Low Number Products            | 6 Mos. High number contacts
      Casper Lollipopovich | 721    | High | Long time on books  | 6 Mos. Diminishing Balance     | 6 Mos. High Number of Contacts | Moderate Profit
      Martha Smallkind     | 14     | High | Long time on books  | 6 Mos. Diminishing Balance     | 6 Mos. High Number of Contacts | Moderate Profit
      Elmo Munchkinovich   | 1      | High | Long time on books  | 6 Mos. High Number of Contacts | 6 Mos. High Balance            | High Profit
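Producing a branch list like this one comes down to two lookups per customer: probability to risk tier, and consolidated node group to talking points. The sketch below shows that shape only. The cut-offs, group labels, customers, and probabilities are all hypothetical, since the deck does not give the actual thresholds or the node-to-factor mapping; only the wording of the talking points is borrowed from the slide.

```python
# Hypothetical mapping from a consolidated node group to its talking points.
SCORE_FACTORS = {
    "G1": ["Long time on books", "6 Mos. Diminishing Balance",
           "6 Mos. High Number of Contacts", "Moderate Profit"],
    "G2": ["Short time on books", "6 Mos. Low Balance",
           "Low Number Products", "6 Mos. Low number contacts"],
}


def risk_tier(probability, med_cut=0.30, high_cut=0.60):
    """Bucket an attrition probability; both cut-offs are made up."""
    if probability >= high_cut:
        return "High"
    if probability >= med_cut:
        return "Med"
    return "Low"


def branch_list(scored):
    """Build list rows, highest attrition probability first."""
    rows = []
    for cust in sorted(scored, key=lambda c: c["probability"], reverse=True):
        rows.append({
            "customer": cust["customer"],
            "branch": cust["branch"],
            "risk": risk_tier(cust["probability"]),
            "points": SCORE_FACTORS[cust["group"]],
        })
    return rows


# Hypothetical scored customers.
SCORED = [
    {"customer": "Customer A", "branch": 27, "probability": 0.41, "group": "G2"},
    {"customer": "Customer B", "branch": 721, "probability": 0.72, "group": "G1"},
]
rows = branch_list(SCORED)
```

Keeping the talking points in a lookup keyed by node group, and out of the scoring code, means the wording can be reviewed and changed without touching the model.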
  • 84. Save those CUSTOMERS!
  • 85. Happily Down the Road….
  • 86. There’s No Place Like Home….
  • 87. The End
  • 88. Jon’s 30+ years of Predictive Modeling expertise comes from various segments of the financial industry including Banking, Consumer Finance, Mortgage, and Modeling Vendor. He has experience in the U.S., Canada, Australia and the United Kingdom. As SVP and Manager of Predictive Modeling at Union Bank, Jon introduced Scoring technology in 1995 and provided Credit Risk research, analytics and Customer Segmentation strategies, along with many of the Bank’s Business Intelligence and Operations statistical models. Jon’s Expertise includes Regulatory oversight and all things AVM (Automated Valuation Modeling). In addition to Consulting and Expert Witness engagements, Jon holds a Master’s Degree in Counseling Psychology and speaks at a variety of Industry conferences. Contact Information: jcf4now@sbcglobal.net [Photo caption: The now departed Zeppelin, best human being I ever knew, proudly displaying the four balls he so loved to retrieve…]