H2O World - Machine Learning for non-data scientists
1. Calling Your Shots with Data
How to Ask Smarter Questions to Make Better Business Decisions
November 9, 2015
Jessica Lanford, Chen Huang
2. Need a handy conference guide?
Download our app, “H2O World 2015”
3. What do these stickers mean?
• I have H2O installed
• I have Python installed
• I have R installed
• I have the H2O World data sets
Pick up stickers or get install help at the information booth
4. Agenda
• Introduction
• Business Decision Process
• Tools & Resources
• Bridging the Communications Gap
• Q&A
5. Business Decision Process
• Make data-informed business decisions
• Ask business questions
• Define business problems
• Analysis Process
6. Motivators that Influence Business Decisions
Business decisions will impact:
• Customer product interaction
• Customer and company engagement
• Product development
• ???
[Diagram labels: Your customers; Your products / services / offerings; Your business; numbered 1 to 4]
7. Asking the “Right” Questions
The answers to the business questions will ultimately provide business context for the “Analysis Process”.
• Make data-informed business decisions
• Ask business questions
• Define business problems
• Analysis Process
12. What is Machine Learning?
• Machine reads the data, learns from the data, uses it to make predictions
• Can show you correlation but not necessarily causation
• Can find relationships and patterns within volumes of data that the human mind is incapable of processing
Note: There is no “right” or “best” model that a data scientist can use. The model used is dependent on the data, problem, and the data scientist.
13. Supervised Learning
Business Applications:
• Classification
  • Twitter sentiments: Rant -> negative, Rave -> positive
  • Coffee vs. tea vs. soda drinker
• Recommender systems
  • Netflix’s “More Like This”
  • Amazon’s “Customers Who Bought This Item Also Bought”
• Fraud detection
  • Authorizing transactions
Data Science Concept:
• Known right answer, using model to verify
• Algorithm tries to predict results
• Based on its training data, the program can make accurate decisions when given new data
• Examples of algorithms and models: GLM, DRF, GBM, Deep Learning
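To make the "known right answer" idea concrete, here is a minimal sketch of supervised learning using a 1-nearest-neighbor rule on the coffee vs. tea vs. soda example. The two features (cups per day, grams of sugar per cup), the training rows, and the `predict` helper are all invented for illustration; this is not one of the algorithms named on the slide (GLM, DRF, GBM, Deep Learning).

```python
import math

# Hypothetical training data: (cups per day, grams of sugar per cup) -> known label.
TRAINING = [
    ((4.0, 0.0), "coffee"),
    ((3.0, 5.0), "coffee"),
    ((2.0, 2.0), "tea"),
    ((1.0, 4.0), "tea"),
    ((1.0, 35.0), "soda"),
    ((2.0, 40.0), "soda"),
]

def predict(features, training=TRAINING):
    """Return the label of the training example closest to `features`."""
    nearest = min(training, key=lambda row: math.dist(row[0], features))
    return nearest[1]

# New, unlabeled people: the program answers based on what it saw in training.
print(predict((1.5, 38.0)))
print(predict((3.5, 1.0)))
```

The labels in `TRAINING` are the "known right answers"; the program is only as good as those examples.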
14. Unsupervised Learning
Business Applications:
• Anomaly detection
  • Outliers: detecting irregular heartbeats
  • Computer security with unauthorized access
• Clustering
  • Grouping users by salary
  • Grouping users by behavior
Data Science Concept:
• No “known” answer, using algorithms to determine answer
• Algorithm tries to identify patterns in the data
• General understanding of input data where no prediction is needed
• Examples of algorithms and models: K-means, PCA
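As a rough illustration of anomaly detection without any labels, the sketch below flags values that sit far from the average of the data itself. The heartbeat intervals, the `find_outliers` helper, and the threshold of 2 standard deviations are assumptions chosen for the example; real systems typically use more robust methods.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]

# Hypothetical intervals between heartbeats, in milliseconds.
intervals_ms = [800, 810, 795, 805, 790, 1400, 800, 815, 798, 802]
print(find_outliers(intervals_ms))
```

Nobody told the program which beat was irregular; it only looked for a value that does not fit the pattern of the rest.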
15. Classification (Supervised)
Business Applications:
• Will customers upgrade to new software?
• What age groups tested well for this new TV show? (marketing campaigns)
• Nigerian 419 (spam classification)
• Will the real Barack Obama please stand up? (fraud detection)
Data Science Concept:
• Classification is the process of taking an input and assigning a label to it.
• The labels could be binomial (Yes, No) or multinomial (High, Medium, Low).
• Examples of algorithms and models: Random Forest
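The "will customers upgrade?" question can be sketched as a binomial classifier with a single learned cut-off, sometimes called a decision stump. A Random Forest combines many decision trees, and a stump is the smallest possible tree, so this is a simplified stand-in and not a Random Forest itself. The usage-hours feature, the history rows, and both helper functions are hypothetical.

```python
# Hypothetical labeled history: (hours of weekly product use, did the customer upgrade?)
HISTORY = [
    (1, "No"), (2, "No"), (3, "No"), (5, "Yes"),
    (6, "No"), (8, "Yes"), (9, "Yes"), (12, "Yes"),
]

def fit_threshold(rows):
    """Pick the cut-off on hours that misclassifies the fewest training rows."""
    candidates = sorted({hours for hours, _ in rows})

    def errors(cut):
        # A row is an error when the rule's guess disagrees with the known label.
        return sum((hours >= cut) != (label == "Yes") for hours, label in rows)

    return min(candidates, key=errors)

def classify(hours, cut):
    """Assign the binomial label: 'Yes' at or above the cut-off, otherwise 'No'."""
    return "Yes" if hours >= cut else "No"

cut = fit_threshold(HISTORY)
print(cut, classify(10, cut), classify(2, cut))
```

Note that the learned rule still gets one training row wrong (the 6-hour customer who did not upgrade), which is typical: a classifier summarizes the data and rarely fits it perfectly.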
16. Regression (Supervised)
Business Applications:
• How much money would a user who has reached level 200 in CandyCrush spend on in-app purchases? (forecasting)
• How much would a customer expect to pay for car insurance based on age, gender, and car type? (prediction)
• How many registered meetup.com attendees will actually show up based on past event registration and attendance? (prediction)
Data Science Concept:
• Regression predicts a continuous numerical value as output
• Examples of algorithms and models: Linear Regression, Random Forest
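The meetup attendance question can be sketched with simple linear regression: fit a straight line through past (registered, attended) pairs, then read off a prediction for a new event. The event figures below are made up and deliberately tidy, and `fit_line` is a hypothetical helper that implements ordinary least squares for one input variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical past events: registrations and actual attendance.
registered = [50, 80, 100, 120, 150]
attended = [25, 40, 50, 60, 75]

slope, intercept = fit_line(registered, attended)

# Predicted attendance for a new event with 200 registrations.
print(round(slope * 200 + intercept))
```

Unlike classification, the output here is a number on a continuous scale, not a label.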
17. Deep Learning (Supervised and Unsupervised)
Business Applications:
• Scanning mug shots of suspects against FBI database (scanning image classification)
• Siri (language processing)
• Early detection of frustrated customers who call into call centers (audio processing)
Data Science Concept:
• Uses “features” (multiple variables impacting a result) to identify patterns
• Uses results to iteratively improve predictions for new data
18. Clustering (Unsupervised)
Business Applications:
• Identify different types of shoppers based on purchasing history to create exclusive promotions (market segmentation)
• Identifying groups of products people like to buy online
• Identify geographic locations where a national mobile carrier should install its next cellular tower to optimize for its user base
Data Science Concept:
• Grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups
• Examples of algorithms and models: K-means clustering, hierarchical clustering, DBSCAN
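The market segmentation example can be sketched with a bare-bones K-means on a single number per shopper. The spend figures, the starting centers, and the `k_means_1d` helper are assumptions made for illustration; production work would normally use a library implementation with several features and a smarter choice of starting centers.

```python
def k_means_1d(values, centers, iterations=10):
    """One-dimensional K-means: assign each value to its nearest center,
    then move each center to the mean of its group, and repeat."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical average monthly spend per shopper, in dollars.
spend = [12, 15, 18, 20, 95, 100, 110, 105]

centers, groups = k_means_1d(spend, centers=[0.0, 50.0])
print(centers)
print(groups)
```

The algorithm is never told which shoppers are "low" or "high" spenders; the two groups emerge from the data, and naming them is left to the business.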
19. Machine Learning Summary
Type of Machine Learning: Supervised
Business Examples:
• Calculating estimated lifetime value
• Forecasting and prediction
• Recommendation engine
• Fraud detection
Data Science Concepts:
• Classification
• Regression
• Deep Learning
Type of Machine Learning: Unsupervised
Business Examples:
• Anomaly detection
• Determining customer behavior
• Image, text, and audio processing
Data Science Concepts:
• Deep Learning
• Clustering
20. If you Want to Learn More…
• StackExchange: stats.stackexchange.com
• Quora: quora.com/Machine-Learning
• Data Science in H2O: http://docs.h2o.ai/h2oclassic/datascience/top.html
• Visualization Introduction to Machine Learning: r2d3.us/visual-intro-to-machine-learning-part-1
• Machine Learning Map: http://scikit-learn.org/stable/tutorial/machine_learning_map/