Slide notes from an interactive workshop at the Brussels Data Innovation Summit 2018. In this workshop for management and aspiring data scientists, we briefly covered 10 key Data Science applications. For each application, we described what it is about and illustrated it with a vivid example. As a bonus, we explained how we use such slides and workshops to help clients and prospects generate their own ideas for innovative data science projects.
1. Geert Verstraeten
Managing Partner - Python Predictions
Data Innovation Summit
27th June 2018
#DISUMMIT
Seeing the forest through the trees
Data science explained
in 10 concrete applications
2. Geert Verstraeten – Seeing the forest through the trees – @pythongeert – #DISUMMIT
Why this workshop?
My cocktail problem
in Data Science
In the beginning of my
career (2001), when I
mentioned to people I was
predicting human behavior
using data, I received a lot
of blank stares
3. Nowadays (2018) my
friend tells me that the
camera on his smartphone
has AI – and I’m doing the
blank staring -
“What does a camera with
AI do? And why would I
need that?”
AI, machine learning and
data science are hyped
today – I decided to focus
on some cool yet concrete
applications of data
science here
4. AIRPORT USE CASE IDEAS
SEEING THE FOREST THROUGH THE TREES
I’ve aimed to provide an idea of
what data science is, based on
10 concrete examples of 10
different and important data
science applications
5. AUDIO & TEXT ANALYTICS
Some towns in Belgium are experimenting with
a system to detect loneliness among older
people. When asked, most reply that they are not
lonely, but the audio information (such as intonation)
reveals much more than the text (link, in Dutch)
Extracting useful information out of text and audio inputs
Other examples: call centers may use
audio signals to detect whether
customers are likely to leave, or
whether clients call out of
loneliness rather than because of
issues with their product or service.
Chatbots are also cool examples
of improved text analytics
6. RECOMMENDER SYSTEMS
In his PhD for VDAB, Michael Reusens used
recommender systems to match jobs to
candidates. In that case it is not enough
that the candidate finds the job interesting:
the recruiter should also find the candidate
interesting, so it should be a match from
both sides (link)
Recommending the right offers or content to the right audience
Other examples: movie recommendations by Netflix are
probably the best-known application in Data Science,
but news recommendations work in similar ways, and
dating apps might face similar challenges as VDAB.
Recommending the right politician to the right voter could
work in the same way but is obviously more sensitive ☺
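A match from both sides can be sketched in a few lines. This is a minimal illustration, not the method from the PhD: it assumes two made-up interest scores between 0 and 1 per job (how much the candidate likes the job, and how much the recruiter likes the candidate) and combines them with a harmonic mean, so a job only ranks high when both sides are interested. Job titles, scores and function names are hypothetical.

```python
def reciprocal_score(candidate_interest, recruiter_interest):
    """Harmonic mean of two scores in [0, 1]: low as soon as one side is low."""
    if candidate_interest <= 0 or recruiter_interest <= 0:
        return 0.0
    total = candidate_interest + recruiter_interest
    return 2 * candidate_interest * recruiter_interest / total


def recommend_jobs(scores, top_n=2):
    """Rank jobs by reciprocal score; `scores` maps job -> (candidate, recruiter)."""
    ranked = sorted(
        scores,
        key=lambda job: reciprocal_score(*scores[job]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical scores for one candidate: (candidate interest, recruiter interest)
example_scores = {
    "data analyst": (0.9, 0.2),   # candidate is keen, recruiter is not
    "bi developer": (0.7, 0.7),   # both sides reasonably interested
    "data engineer": (0.5, 0.8),  # recruiter is keen, candidate lukewarm
}

recommended = recommend_jobs(example_scores)
print(recommended)
```

The job the candidate likes most ("data analyst") drops out of the top two here, because the recruiter side of the match is weak.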
7. COMPUTER VISION
SAS is contributing to a project
named WildTrack, an organization
counting the number of animals in
the wild for certain species. While
such studies depended on expert
track finders before, everyone
with a smartphone is now
encouraged to take photographs
of footprints, and using image
recognition, they are classified
into the right species, which leads
to better information (link)
Extracting useful information out of images and video
Many other examples: two high school students predict
forest fires by examining the soil – Octinion produced a
prototype robot for picking strawberries at the right time
– diagnosis of medical images surpassed human expert
level (for example for detecting melanoma) – robotic
weed control – detection of plastic waste in the ocean
8. PREDICTIVE ANALYTICS
Predicting behavior in a business context, with interpretation
One of the most interesting
projects we’ve worked on
started out as
employee burnout prediction
with sdworx, as presented at
the Data Innovation Summit
2017. In this case, we learned
a lot about absenteeism,
constructed a model of good
quality at the aggregated level,
and taught sdworx how
they can do similar projects for
their clients (link).
Some other examples: Companies are typically
predicting who will buy a product, who will become a
valuable client in the future, who will leave the
relationship, who will repay their debts, who will show
fraudulent behavior, who will leave their employer, …
In most of these cases, the interpretation of the model
is as important as its predictive quality. A great
summary can be found in this book
9. EXPERIMENTAL DESIGN
Testing different options and approaches to maximize impact
Prediction by itself does
not create value, but
the resulting actions do.
Yet to be sure value is
created, these actions
should be tested
through carefully
designed experiments.
Experimentation is common in
online environments, but also
useful offline. It may range
from simple A / B tests to
complex tests – for example
some debt collectors have
optimized the way they collect
through offline experimentation
with collection strategies
Few companies understand this
better than Booking.com, which created
a platform allowing hundreds of
employees to run hundreds of small
experiments at any time - resulting in
conversion levels 2-3x the industry
average (link)
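To make the idea of a simple A / B test concrete, here is a minimal sketch of how one might check whether variant B really converts better than variant A, or whether the difference could be chance. It uses a standard two-proportion z-test with only the Python standard library. The visitor and conversion counts are invented for illustration, and this is not a description of how Booking.com or anyone else mentioned above evaluates experiments.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the assumption that A and B perform equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 5000 visitors per variant
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value (for example below 0.05) suggests the difference between the two variants is unlikely to be pure chance. Whether it is large enough to matter for the business is a separate question.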
10. ANOMALY DETECTION
Data Science is in many cases concerned with
understanding patterns. Yet sometimes, it
may be interesting to look at what deviates. For
example, volcanic eruptions can be detected
based on anomalies in thermal features such as
geysers, hot springs and lava flows (link)
Some other examples: anomaly
detection is most commonly used
to detect fraud (see for example
how HSBC screens card fraud) and
crime in general (including
cybersecurity). This type of
behavior is very volatile, but
consistent in that it differs
from average, normal behavior.
Another remarkable example:
Chinese researchers are also
using anomalies in the gene pool
to identify potential athletes.
It may be interesting to detect what deviates from the pattern
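"Differs from normal behavior" can be turned into a very small rule of thumb. The sketch below flags values that lie far from the median, measured in units of the median absolute deviation (a so-called modified z-score). It is one simple technique among many, not the approach used in the examples above. The card amounts and the threshold of 3.5 are assumptions chosen for illustration.

```python
import statistics


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # all values (nearly) identical, nothing to flag
    return [
        v for v in values
        if abs(0.6745 * (v - median) / mad) > threshold
    ]


# Hypothetical daily card spend for one customer (in euro)
daily_spend = [42, 38, 45, 40, 39, 41, 950, 43]
anomalies = find_anomalies(daily_spend)
print(anomalies)
```

Using the median instead of the mean keeps one extreme value from distorting the definition of "normal", which matters when the anomalies are exactly what we are looking for.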
11. SEGMENTATION
Grouping objects based on their similarity
Perhaps one of the
oldest applications in the
book, but certainly
incredibly valuable for
exploratory work. One of
the coolest projects
we’ve done is predicting
needs of Private Banking
clients – not all wealthy
clients of a bank expect
the same benefits –
some like VIP events, but
others want immediate
financial information, etc
(no link or details due to
confidentiality).
Some other examples: Due to the
strategic nature of segmentation
projects, it proved difficult to find
concrete cases online. But in this
context, it may be fun to look at the
different segments of participants in
the Belgian Data Science community
(article and dashboard).
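Grouping objects based on their similarity can be illustrated with a tiny k-means sketch. This is not the method used in the confidential Private Banking project. The two features per client (events attended per year, logins to the banking app per month), the data and the starting centroids are all invented for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Very small k-means: assign points to the nearest centroid, then update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical clients: (events attended per year, app logins per month)
clients = [(8, 1), (9, 2), (7, 1), (1, 20), (0, 25), (2, 22)]
centroids, segments = kmeans(clients, centroids=[clients[0], clients[3]])
print(segments)
```

In this toy data the two segments correspond to clients who like events and clients who want frequent financial information. In a real project, choosing the features and the number of segments is most of the work.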
12. IOT ANALYTICS
Extracting information from connected devices
It is easy to find analytics of sensor
data, but more difficult to find cases
where the connectivity between
devices is key. Google Nest is a great
example – in the sense that it learns
from past behavior (e.g. when
someone is absent) and can
communicate to other devices (e.g. to
turn off the oven).
Some other examples:
The Belgian Red Devils
are training using
sensors that measure
fatigue to prevent
injuries, Volvo Trucks
claims to be able to
reduce standstill of its
fleet by 80% by
predicting the need for
maintenance, and
sensors are crucial for
many applications in
healthcare
13. NETWORK ANALYSIS
Sometimes the most important information is the network itself
In specific cases, the understanding
of a network of people and things
can help solve a case. For example,
in an analysis of the 9-11 attacks, it
has been estimated that the 9-11
network could have been
dismantled if just three central
nodes had been eliminated (link)
Some other examples: Network analysis
is a common technique for fraud and
crime detection. It can also serve to
detect churn (which customers will leave a
company), to understand which
companies are connected through
shareholders and which authors are
connected through publications, etc
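What "central nodes" means can be sketched with the simplest centrality measure: degree centrality, the share of other nodes each node is directly connected to. The network below is a made-up toy example with hypothetical node names, and the cited analysis may well have used other, more refined measures.

```python
from collections import Counter


def degree_centrality(edges):
    """Share of the other nodes that each node is directly connected to."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n_nodes = len(degree)
    if n_nodes < 2:
        return {node: 0.0 for node in degree}
    return {node: d / (n_nodes - 1) for node, d in degree.items()}


# Hypothetical undirected network: who is in contact with whom
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("D", "E"), ("E", "F"),
]
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

Removing the most central node in this toy network splits it into separate pieces, which is the intuition behind dismantling a network through a few key nodes.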
14. PROCESS MINING
Understand and reduce inefficient processes
One of the largest academic hospitals
in the Netherlands uses process
mining to reduce waiting times,
increase efficiency and optimize
treatments (link)
Processes are everywhere:
from job applications to
complaint handling, road
works, and new client
onboarding. Increasing their
efficiency often leads to
reduced costs and improved
(customer) experience.
Another concrete example of
using process mining to
improve the Customer
Journey: ING Belgium
15. About half the time was spent
explaining the examples I had
found in preparing the
session, with help from many
friends on LinkedIn
In the other half of the time,
the audience interacted with
their often colorful examples,
which resulted in a fun and
interactive session that
seemed to be appreciated
Looking back, this was an exciting
journey into exploring what Data
Science is today, and how it
leads to concrete value. Thanks
to all who participated and
provided input!