8. What is Deep Learning?
• This is a DL model with 2 hidden layers, c(4, 3)
• The input layer is the first layer, holding the inputs: Age, Income & FICO score
• The first hidden layer has 4 neurons and the second hidden layer has 3 neurons
• The final layer applies a softmax function, which converts the output scores to probabilities
• Learns non-linear relationships
• A black-box, brute-force algorithm for pattern recognition
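The softmax step in the final layer can be sketched in plain Python with numpy. This is only an illustration of the score-to-probability conversion described above, not H2O's implementation; the logit values below are hypothetical:

```python
import numpy as np

def softmax(z):
    """Convert raw output scores (logits) into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical raw scores from the final layer for 3 classes
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs.round(3))  # the class with the largest score gets the largest probability
print(probs.sum())     # probabilities sum to 1
```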
9. Scoring Metrics in Flow UI
Hidden layers = c(64,64)
Epochs = 200
Stopping Metric = MSE
Sampled data (25% of original)
10. Scoring Metrics in Flow UI
Hidden layers = c(64,64)
Epochs = 200
Stopping Metric = MSE
All available data
11. Scoring Metrics in Flow UI
Hidden layers = c(100, 100)
Epochs = 200
Stopping Metric = Classification error
All available data
12. Scoring Metrics in Flow UI
Hidden layers = c(200, 200)
Epochs = 200
Stopping Metric = Classification error
All available data
13. Scoring Metrics in Flow UI
Hidden layers = c(512)
Epochs = 200
Stopping Metric = Classification error
All available data
14. Scoring Metrics in Flow UI
Hidden layers = c(200, 200)
Epochs = 200
Stopping Metric = Classification error
All available data
L1 & L2 regularization
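The run above adds L1 & L2 regularization (H2O exposes these as the `l1` and `l2` parameters). As a minimal sketch of how the two penalties enter the training loss, with hypothetical weights and loss values:

```python
import numpy as np

def penalized_loss(base_loss, weights, l1=1e-5, l2=1e-5):
    """Add an L1 (sparsity) and an L2 (weight-decay) penalty to a base loss.

    l1 and l2 play the role of H2O deep learning's l1/l2 arguments;
    the numbers used here are illustrative only.
    """
    w = np.asarray(weights)
    return base_loss + l1 * np.abs(w).sum() + l2 * np.square(w).sum()

# Hypothetical base loss and weight vector
print(penalized_loss(0.5, [0.3, -0.2, 0.1], l1=0.01, l2=0.01))
```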
15. Tuning your DL model in H2O
• Other tuning options:
o Add more hidden layers (2 to 5)
o Change the number of neurons
o Use k-fold cross-validation for hyper-parameter tuning
o Change the input & hidden dropout ratios
o Change the adaptive rate: rho, epsilon
• Use dimensionality reduction, like GLRM or deep learning features
16. Generalized Low Ranking Models in H2O
What is a low-rank model?
A: a table (matrix) with m×n dimensions
Y (k rows): the archetypal features created from the columns of A
X: the reduced feature set for A, one row per row of A
So A can be approximately reconstructed from the product of X & Y
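GLRM is an H2O algorithm with its own losses and regularizers; as a sketch of the underlying idea, the rank-k factorization A ≈ XY can be illustrated with a truncated SVD in numpy (the data here is synthetic and built to have rank k, so the reconstruction comes out exact):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 100, 8, 2
# A: an m x n data table, constructed to have rank k
A = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))

# Truncated SVD gives the best rank-k factors in the least-squares sense
U, s, Vt = np.linalg.svd(A, full_matrices=False)
X = U[:, :k] * s[:k]   # m x k: reduced feature set, one row per row of A
Y = Vt[:k]             # k x n: archetypal features built from A's columns
A_hat = X @ Y          # approximate reconstruction of A

print(np.allclose(A, A_hat))  # here A truly has rank k, so this is True
```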
17. GLRM: Why use it?
• Reduces storage space: e.g. 10 GB → 100 MB
• Since predictions are made on the compressed data set, scoring is faster
• Imputes missing data
• Identifies and visualizes important features
18. GLRM: Compressing Pin codes
• Replace the Pin code column in the training data set with the low-rank factor X
• Then train using the modified data set
19. Ensembles in H2O
There are three types of ensembles:
1) Bagging, 2) Boosting & 3) Stacking (Super Learner)
The h2oEnsemble package uses the “Super Learner” algorithm:
• Start with the “Level-Zero” data
• Define L base learners
• Specify a meta-learner
• Perform k-fold cross-validation using each of the L learners
20. Ensembles in H2O: Super-learning
• p1 to pL are the predicted values from the k-fold CV for each of the L learners
• These values are combined into the Z table
• The meta-learner is then trained on (Z, y)
• The meta-learner (super learner) learns the optimal way to combine the base learners' predictions
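The steps above can be sketched in numpy. This is a toy stand-in for the h2oEnsemble workflow, not its API: the two base learners and the least-squares meta-learner are hypothetical choices made just to show how out-of-fold predictions form Z and how the meta-learner fits (Z, y):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 120
x = rng.uniform(-1, 1, size=(n, 1))
y = 3 * x[:, 0] + rng.normal(scale=0.1, size=n)

# Two toy base learners: each fit_* returns a predict function
def fit_mean(xtr, ytr):
    mu = ytr.mean()
    return lambda xte: np.full(len(xte), mu)

def fit_linear(xtr, ytr):
    X1 = np.c_[np.ones(len(xtr)), xtr]
    beta, *_ = np.linalg.lstsq(X1, ytr, rcond=None)
    return lambda xte: np.c_[np.ones(len(xte)), xte] @ beta

learners = [fit_mean, fit_linear]  # L = 2 base learners

# k-fold CV: the out-of-fold predictions p1..pL become the columns of Z
k = 5
folds = np.array_split(np.arange(n), k)
Z = np.zeros((n, len(learners)))
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    for j, fit in enumerate(learners):
        model = fit(x[train_idx], y[train_idx])
        Z[test_idx, j] = model(x[test_idx])

# Meta-learner on (Z, y): here, plain least squares finds combination weights
w, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(w.round(2))  # the linear base learner should get a weight near 1
```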
21. Resources
• H2O can be downloaded directly from the H2O Download section
• Open-source community forum: h2ostream
• Github repo
• Learning platform
• On Gitter
• On Youtube
22. Something about the author…
• LinkedIn Profile
• He is a data fanatic! He loves to crunch data and uncover hidden relationships, and he is open to collaborative relationships in the field of data analytics.
• If you have similar interests, send him an invitation request indicating how the two of you can collaborate.