Real-time Machine Learning
Vinoth Kannan
Intelligent software architecture using Modified Lambda Architecture & Apache Mahout
SkillFactory 71
Vinoth.kannan@widas.de
Agenda
What is Machine Learning?
Need for Real-Time Machine Learning
What is Lambda Architecture?
What is Mahout?
How does a basic recommender engine work?
Some Use Cases
What is machine learning?
Introduction
Machine Learning from Streaming Data
Machine learning from streaming data can mean two things:
• A model that considers recent history. Example: it has been sunny and 30 degrees for the last two days, so it is unlikely to be -10 degrees and snowing the next day.
• A model that is updatable. Example: a retail sales model that remains accurate as the business gets larger.
Don't they both mean the same?
Introduction
Machine Learning from Streaming Data
• Model that considers recent history: time-series prediction (example: weather)
• Model that is updatable: non-stationary data distributions (example: retail sales)
Introduction
Machine Learning from Non-Stationary Data Distributions
Two ways to handle non-stationary data distributions:
• Incremental algorithms: machine learning algorithms that learn incrementally over the data.
• Periodic retraining: machine learning algorithms that are retrained periodically with a batch algorithm.
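The difference between the two approaches can be sketched in a few lines. This is a minimal illustration, not something from the talk: the "model" is just a mean, and the names `IncrementalMean`, `batch_mean` and the `daily_sales` values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IncrementalMean:
    """Incremental learner: updates its estimate one observation at a time."""
    count: int = 0
    mean: float = 0.0

    def update(self, x: float) -> None:
        self.count += 1
        # Running-mean update; no past observations need to be stored.
        self.mean += (x - self.mean) / self.count


def batch_mean(window: list[float]) -> float:
    """Batch learner: recomputes the estimate from scratch over a data window."""
    return sum(window) / len(window)


# Hypothetical sales figures whose level shifts halfway through (non-stationary).
daily_sales = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]

online = IncrementalMean()
for x in daily_sales:
    online.update(x)

# Periodic retraining: refit on the three most recent observations only.
retrained = batch_mean(daily_sales[-3:])
```

The incremental model never revisits old data but still averages over the whole history, while the periodically retrained model follows the shift because it is refit on a recent window.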
Introduction
The Challenge for the Best Big Data Technology
Hadoop: a batch processing system that can churn through huge volumes of data
Storm: a real-time complex event processing system that can process data streams
Wrong fight!
Hadoop + Storm = Real-time Big Data
It's a chance, not a challenge
Lambda Architecture!
Lambda Architecture
Overview
Speed Layer
Serving layer
Batch layer
Lambda Architecture
Overview with description
Speed layer
• Only new data
• Compensates for the high latency of serving layer updates
• Batch layer overrides speed layer
Serving layer
• Loads and exposes the batch views for querying
• Random access to batch views
Batch layer
• Immutable, constantly growing datasets
• Batch views are computed from this raw dataset
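The three layers described above can be sketched with in-memory stand-ins. This is a simplified illustration, not the Hadoop/Storm implementation from the talk: the class names, the event-count view and the `pending` list are all assumptions.

```python
from collections import Counter


class BatchLayer:
    """Holds the immutable, append-only master dataset and recomputes batch views."""

    def __init__(self) -> None:
        self.master: list[str] = []

    def append(self, events: list[str]) -> None:
        self.master.extend(events)  # raw data is only ever appended

    def compute_view(self) -> Counter:
        return Counter(self.master)  # view is recomputed from all raw data


class SpeedLayer:
    """Keeps a view over only the events the batch layer has not absorbed yet."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def on_event(self, event: str) -> None:
        self.view[event] += 1

    def reset(self) -> None:
        self.view.clear()  # batch results override the speed layer


class ServingLayer:
    """Exposes the batch view for random-access queries, merged with the speed view."""

    def __init__(self) -> None:
        self.batch_view: Counter = Counter()

    def load(self, batch_view: Counter) -> None:
        self.batch_view = batch_view

    def query(self, key: str, speed: SpeedLayer) -> int:
        return self.batch_view[key] + speed.view[key]


batch, speed, serving = BatchLayer(), SpeedLayer(), ServingLayer()
pending = []  # events waiting for the next batch run
for event in ["shoes", "hat", "shoes"]:
    speed.on_event(event)
    pending.append(event)

before = serving.query("shoes", speed)  # answered from the speed view only

# A batch run absorbs the pending events; the speed view is then discarded.
batch.append(pending)
serving.load(batch.compute_view())
speed.reset()
after = serving.query("shoes", speed)  # same answer, now from the batch view
```

The query result is the same before and after the batch run; only the layer that supplies it changes.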
Basic Idea behind Lambda architecture
query = function(all data)
- Nathan Marz
Big Data - Principles and best practices of scalable realtime data systems
Basic Idea behind Lambda
Perform some function over all data, from real-time data “0” to history data “n”
Lambda Architecture: Storm process (real-time data) + Hadoop process (history data) = Real-Time Big Data
Letting Hadoop process the history data makes the process faster
The Problem
Unanswered questions of Lambda architecture
[Diagram: Real-Time Big Data split into a real-time part and a batch process part]
Questions to be answered:
• How to define the boundary between the real-time and batch processes?
• How to synchronize the computation between the two systems?
• How to avoid gaps and overlaps?
• Which algorithm to use?
• How to avoid failures and provide a fault tolerance mechanism?
Modified Lambda Architecture
Presentation Layer
• The presentation layer must aggregate the outputs of Storm and Hadoop
• Users see the results of their events in less than 2 seconds
• Seamless merge between short-term and long-term data
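One way to get a seamless merge without gaps or overlaps is to split the data at a single boundary timestamp. The sketch below assumes that approach; the talk does not spell out its mechanism. The `Event` type, the integer clock and both counting functions are hypothetical, and both functions read the same in-memory list here, whereas in practice they would read from separate batch and real-time stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int    # event timestamp on a hypothetical integer clock
    item: str


def batch_count(events, item, boundary):
    """Long-term part: events strictly before the boundary."""
    return sum(1 for e in events if e.item == item and e.ts < boundary)


def realtime_count(events, item, boundary):
    """Short-term part: events at or after the boundary."""
    return sum(1 for e in events if e.item == item and e.ts >= boundary)


def merged_count(events, item, boundary):
    """Presentation-layer merge: each event falls on exactly one side."""
    return batch_count(events, item, boundary) + realtime_count(events, item, boundary)


events = [Event(1, "tent"), Event(2, "stove"), Event(3, "tent"), Event(5, "tent")]
total_tents = merged_count(events, "tent", boundary=3)
```

Because one side uses `<` and the other `>=` on the same boundary, moving the boundary changes which system answers for an event but never the merged total.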
Machine Learning with Mahout
What is Mahout?
Introduction
• Apache Software Foundation Java library
• Scalable “machine learning” library that runs mostly on Hadoop
• Currently Mahout mainly supports four use cases: recommendation, clustering, classification, and frequent itemset mining
• Core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm
Basic Recommender Algorithm
How it works
Today's focus: suggesting items to a user based on the current search
Basic Recommender Algorithm
Defining recommendation
Mahout implements a collaborative filtering framework
Two broad categories of recommender engine algorithms:
• User-based: recommends items by finding similar users. Harder to scale because of the dynamic nature of users.
• Item-based: calculates similarity between items and makes recommendations. Items usually don't change much, so the similarities can be calculated offline.
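The item-based approach can be sketched in plain Python. This is not Mahout's API; it is a minimal co-occurrence recommender under assumed data, and the `likes` table and both function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical user -> set of liked items.
likes = {
    "alice": {"tent", "stove", "boots"},
    "bob": {"tent", "stove"},
    "carol": {"stove", "boots"},
    "dave": {"tent"},
}


def item_cooccurrence(likes):
    """Offline step: count how often each pair of items is liked by the same user."""
    co = defaultdict(lambda: defaultdict(int))
    for items in likes.values():
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(user, likes, co, top_n=2):
    """Online step: score unseen items by co-occurrence with the user's liked items."""
    seen = likes[user]
    scores = defaultdict(int)
    for item in seen:
        for other, count in co[item].items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


co = item_cooccurrence(likes)
for_dave = recommend("dave", likes, co)
```

The co-occurrence table depends only on item pairs, so it can be rebuilt offline in a batch job while the lookup for a single user stays cheap.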
Basic Recommender Algorithm
Defining recommendation
User preference for an item
• Likes something
• Doesn't like something
• Doesn't care
Safe to assume: 1 click = 1 like = uniform preference
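With uniform preferences there are no rating values, only the fact that a user clicked an item. One common similarity measure for such data is the Tanimoto (Jaccard) coefficient. The sketch below assumes that measure and a hypothetical click log; the talk does not say which similarity it uses.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient: |A ∩ B| / |A ∪ B|, or 0.0 if both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical click log of (user, item) pairs; one click counts as one like.
clicks = [
    ("u1", "tent"), ("u1", "stove"),
    ("u2", "tent"), ("u2", "stove"),
    ("u3", "tent"),
    ("u1", "tent"),  # a repeated click adds no extra weight
]

users_by_item = {}
for user, item in clicks:
    users_by_item.setdefault(item, set()).add(user)

# Similarity between two items, based on the users who clicked each of them.
sim = tanimoto(users_by_item["tent"], users_by_item["stove"])
```

Storing users as sets is what makes the preference uniform: a second click by the same user on the same item changes nothing.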
Mahout Library of Algorithms
Lots of algorithms to Choose From
Use Cases
Real Time Machine Learning
eCommerce
Objective : Increase sales revenue
Match potential customer to the right product
Personalise user experience on web and email
Customer lifecycle management
Use Cases
Real Time Machine Learning
Financial Services
Objective : Real Time Fraud Detection
Compute patterns/predictors for individual customers
Classify and cluster customers and recalculate patterns and predictors
Set threshold across all data
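A per-customer predictor with one shared threshold could look like the sketch below. It is only an illustration under stated assumptions: the predictor is a running mean and standard deviation of transaction amounts (Welford's online algorithm), and `Z_THRESHOLD`, the minimum history of 5 transactions and all names are hypothetical.

```python
import math
from collections import defaultdict


class RunningStats:
    """Welford's online algorithm for mean and sample standard deviation."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


Z_THRESHOLD = 3.0  # hypothetical threshold, shared across all customers
profiles = defaultdict(RunningStats)  # one predictor per customer


def is_suspicious(customer: str, amount: float) -> bool:
    """Flag an amount far from the customer's own history, then learn from it."""
    stats = profiles[customer]
    std = stats.std()
    flagged = stats.n >= 5 and std > 0 and abs(amount - stats.mean) / std > Z_THRESHOLD
    # Simplification: flagged amounts are still added to the customer's profile.
    stats.update(amount)
    return flagged


history = [is_suspicious("c1", a) for a in [20.0, 22.0, 19.0, 21.0, 20.0, 23.0]]
alert = is_suspicious("c1", 500.0)
```

Each profile is updated per event, which suits the real-time path; reclassifying and clustering customers would be the periodic batch step.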
Use Cases
Real Time Machine Learning
Media
Objective : Generating Meta Data
Video/audio/text analysis
Find patterns/clusters for people, places, products, things
Use Cases
Real Time Machine Learning
Carbookplus
Objective : Generating Meta Data
Match potential trips to right destination
Recommend best gas station
Recommend contacts whom user might know
Match the right advertisers to customers based on vehicle needs
Summary
Ability to create real-time systems based on lambda architecture
Usefulness of predictive algorithms
Reason to concentrate on real-time predictions
Further Reading
http://storm-project.net/
http://mahout.apache.org/
http://hadoop.apache.org/
Thank You

More Related Content

What's hot

The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureArturo Pelayo
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AICori Faklaris
 
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)byteLAKE
 
Swarm intelligence
Swarm intelligenceSwarm intelligence
Swarm intelligenceEslam Hamed
 
Build a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-timeBuild a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-timeAmazon Web Services
 
ClouDedup - Secure De-duplication with encrypted data for cloud storage
ClouDedup - Secure De-duplication with encrypted data for cloud storageClouDedup - Secure De-duplication with encrypted data for cloud storage
ClouDedup - Secure De-duplication with encrypted data for cloud storageSagar Uday Kumar
 
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...Spark Summit
 
Artificial Intelligence - Machine Learning Vs Deep Learning
Artificial Intelligence - Machine Learning Vs Deep LearningArtificial Intelligence - Machine Learning Vs Deep Learning
Artificial Intelligence - Machine Learning Vs Deep LearningLogiticks
 
Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with SparkChris Johnson
 
Introduction to Quantum Computing from Math perspective
Introduction to Quantum Computing from Math perspectiveIntroduction to Quantum Computing from Math perspective
Introduction to Quantum Computing from Math perspectivePavel Belevich
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation SystemAnamta Sayyed
 
ppt Artificial intelligence .pptx
ppt Artificial intelligence .pptxppt Artificial intelligence .pptx
ppt Artificial intelligence .pptxAdityaKumar602198
 
Ethics in the use of Data & AI
Ethics in the use of Data & AI Ethics in the use of Data & AI
Ethics in the use of Data & AI Kalilur Rahman
 
Ant Colony Optimization: Routing
Ant Colony Optimization: RoutingAnt Colony Optimization: Routing
Ant Colony Optimization: RoutingAdrian Wilke
 
Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)Krishnaram Kenthapadi
 
RPA at Full Scale. Advancing From Initial Success to Sustainable Automation
RPA at Full Scale. Advancing From Initial Success to Sustainable AutomationRPA at Full Scale. Advancing From Initial Success to Sustainable Automation
RPA at Full Scale. Advancing From Initial Success to Sustainable AutomationUiPath
 
Responsible AI
Responsible AIResponsible AI
Responsible AINeo4j
 

What's hot (20)

Swarm Intelligence in Robotics
Swarm Intelligence in RoboticsSwarm Intelligence in Robotics
Swarm Intelligence in Robotics
 
The Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The FutureThe Future Of Work & The Work Of The Future
The Future Of Work & The Work Of The Future
 
An Introduction to Generative AI
An Introduction  to Generative AIAn Introduction  to Generative AI
An Introduction to Generative AI
 
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
AI for Manufacturing (Machine Vision, Edge AI, Federated Learning)
 
Swarm intelligence
Swarm intelligenceSwarm intelligence
Swarm intelligence
 
Build a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-timeBuild a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-time
 
ClouDedup - Secure De-duplication with encrypted data for cloud storage
ClouDedup - Secure De-duplication with encrypted data for cloud storageClouDedup - Secure De-duplication with encrypted data for cloud storage
ClouDedup - Secure De-duplication with encrypted data for cloud storage
 
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...
Building a Real-Time Fraud Prevention Engine Using Open Source (Big Data) Sof...
 
Artificial Intelligence - Machine Learning Vs Deep Learning
Artificial Intelligence - Machine Learning Vs Deep LearningArtificial Intelligence - Machine Learning Vs Deep Learning
Artificial Intelligence - Machine Learning Vs Deep Learning
 
Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with Spark
 
Introduction to Quantum Computing from Math perspective
Introduction to Quantum Computing from Math perspectiveIntroduction to Quantum Computing from Math perspective
Introduction to Quantum Computing from Math perspective
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 
ppt Artificial intelligence .pptx
ppt Artificial intelligence .pptxppt Artificial intelligence .pptx
ppt Artificial intelligence .pptx
 
Computer vision
Computer visionComputer vision
Computer vision
 
Ethics in the use of Data & AI
Ethics in the use of Data & AI Ethics in the use of Data & AI
Ethics in the use of Data & AI
 
Ant Colony Optimization: Routing
Ant Colony Optimization: RoutingAnt Colony Optimization: Routing
Ant Colony Optimization: Routing
 
Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)Responsible AI in Industry (ICML 2021 Tutorial)
Responsible AI in Industry (ICML 2021 Tutorial)
 
RPA at Full Scale. Advancing From Initial Success to Sustainable Automation
RPA at Full Scale. Advancing From Initial Success to Sustainable AutomationRPA at Full Scale. Advancing From Initial Success to Sustainable Automation
RPA at Full Scale. Advancing From Initial Success to Sustainable Automation
 
Responsible AI
Responsible AIResponsible AI
Responsible AI
 
How Siri Works
How Siri WorksHow Siri Works
How Siri Works
 

Viewers also liked

How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...Kai Wähner
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedGuido Schmutz
 
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Sabri Skhiri
 
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Brian O'Neill
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudScott Miao
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedGuido Schmutz
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Tin Ho
 
An Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsAn Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsJohann Schleier-Smith
 
Lambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale MLLambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale MLhuguk
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architecturesDaniel Marcous
 
Machine Learning system architecture – Microsoft Translator, a Case Study : ...
Machine Learning system architecture – Microsoft Translator, a Case Study :  ...Machine Learning system architecture – Microsoft Translator, a Case Study :  ...
Machine Learning system architecture – Microsoft Translator, a Case Study : ...Vishal Chowdhary
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Helena Edelson
 
Apache storm vs. Spark Streaming
Apache storm vs. Spark StreamingApache storm vs. Spark Streaming
Apache storm vs. Spark StreamingP. Taylor Goetz
 
Docker - An Introduction
Docker - An IntroductionDocker - An Introduction
Docker - An IntroductionKnoldus Inc.
 
Big Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseBig Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseFujio Turner
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMateusz Dymczyk
 

Viewers also liked (20)

How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
 
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
Real-Time Machine Learning at Industrial scale (University of Oxford, 9th Oct...
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms compared
 
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
Lambda Architecture 2.0 Convergence between Real-Time Analytics, Context-awar...
 
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
Data Pipelines & Integrating Real-time Web Services w/ Storm : Improving on t...
 
Achieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloudAchieve big data analytic platform with lambda architecture on cloud
Achieve big data analytic platform with lambda architecture on cloud
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture
 
An Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsAn Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time Applications
 
Lambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale MLLambda architecture on Spark, Kafka for real-time large scale ML
Lambda architecture on Spark, Kafka for real-time large scale ML
 
Arquitectura Lambda
Arquitectura LambdaArquitectura Lambda
Arquitectura Lambda
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architectures
 
Machine Learning system architecture – Microsoft Translator, a Case Study : ...
Machine Learning system architecture – Microsoft Translator, a Case Study :  ...Machine Learning system architecture – Microsoft Translator, a Case Study :  ...
Machine Learning system architecture – Microsoft Translator, a Case Study : ...
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
Apache storm vs. Spark Streaming
Apache storm vs. Spark StreamingApache storm vs. Spark Streaming
Apache storm vs. Spark Streaming
 
Docker - An Introduction
Docker - An IntroductionDocker - An Introduction
Docker - An Introduction
 
Big Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + CouchbaseBig Data - Fast Machine Learning at Scale + Couchbase
Big Data - Fast Machine Learning at Scale + Couchbase
 
Machine Learning for (JVM) Developers
Machine Learning for (JVM) DevelopersMachine Learning for (JVM) Developers
Machine Learning for (JVM) Developers
 

Similar to Real time machine learning

How Startups can leverage big data?
How Startups can leverage big data?How Startups can leverage big data?
How Startups can leverage big data?Rackspace
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataTrieu Nguyen
 
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...Grid Dynamics
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsKamalika Dutta
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBigDataExpo
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big DataInfochimps, a CSC Big Data Business
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureMongoDB
 
Data Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2OData Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2OSri Ambati
 
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demands
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demandsMongoDB .local Chicago 2019: MongoDB – Powering the new age data demands
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demandsMongoDB
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineTrieu Nguyen
 
Big data solutions explained for marketeers & business executives
Big data solutions explained for marketeers & business executivesBig data solutions explained for marketeers & business executives
Big data solutions explained for marketeers & business executivesAgile Delivery
 
Innovating With Data and Analytics
Innovating With Data and AnalyticsInnovating With Data and Analytics
Innovating With Data and AnalyticsVMware Tanzu
 
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...Flink Forward
 
Webinar: MongoDB and Hadoop - Working Together to provide Business Insights
Webinar: MongoDB and Hadoop - Working Together to provide Business InsightsWebinar: MongoDB and Hadoop - Working Together to provide Business Insights
Webinar: MongoDB and Hadoop - Working Together to provide Business InsightsMongoDB
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeMongoDB
 
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...Amazon Web Services
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...Databricks
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataStylight
 

Similar to Real time machine learning (20)

How Startups can leverage big data?
How Startups can leverage big data?How Startups can leverage big data?
How Startups can leverage big data?
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...Open Blueprint for Real-Time  Analytics in Retail: Strata Hadoop World 2017 S...
Open Blueprint for Real-Time Analytics in Retail: Strata Hadoop World 2017 S...
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time Systems
 
Big Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real TimeBig Data Expo 2015 - Talend Delivering Real Time
Big Data Expo 2015 - Talend Delivering Real Time
 
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise Architecture
 
Data Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2OData Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2O
 
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demands
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demandsMongoDB .local Chicago 2019: MongoDB – Powering the new age data demands
MongoDB .local Chicago 2019: MongoDB – Powering the new age data demands
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
Recommendation engine
Recommendation engineRecommendation engine
Recommendation engine
 
Big data solutions explained for marketeers & business executives
Big data solutions explained for marketeers & business executivesBig data solutions explained for marketeers & business executives
Big data solutions explained for marketeers & business executives
 
Innovating With Data and Analytics
Innovating With Data and AnalyticsInnovating With Data and Analytics
Innovating With Data and Analytics
 
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
Flink Forward Berlin 2017: Bas Geerdink, Martijn Visser - Fast Data at ING - ...
 
Webinar: MongoDB and Hadoop - Working Together to provide Business Insights
Webinar: MongoDB and Hadoop - Working Together to provide Business InsightsWebinar: MongoDB and Hadoop - Working Together to provide Business Insights
Webinar: MongoDB and Hadoop - Working Together to provide Business Insights
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data Lake
 
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...
Build Next Generation Real-time Applications with SAP HANA on AWS (BDT211) | ...
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big Data
 
Automated Analytics at Scale
Automated Analytics at ScaleAutomated Analytics at Scale
Automated Analytics at Scale
 


Real time machine learning

  • 1. Real-time Machine Learning. Vinoth Kannan. Intelligent software architecture using Modified Lambda architecture & Apache Mahout. SkillFactory 71. Vinoth.kannan@widas.de
  • 2. Agenda: What is Machine Learning? Need for Real Time Machine Learning. What is Lambda architecture? What is Mahout? How does a basic recommender engine work? Some Use Cases.
  • 3. What is machine learning?
  • 4. Introduction: Machine Learning from Streaming Data. This can mean a model that considers recent history (it has been sunny and 30 degrees for the last two days, so it is unlikely to be -10 degrees and snowing the next day) or a model that is updatable (a retail sales model that remains accurate as the business gets larger). Don't they both mean the same?
  • 5. Introduction: Machine Learning from Streaming Data. Time-series prediction (e.g. weather): a model that considers recent history. Non-stationary data distributions (e.g. retail sales): a model that is updatable.
  • 6. Introduction: Machine Learning from non-stationary data distributions. Incremental algorithms: machine learning algorithms that learn incrementally over the data. Batch algorithms: machine learning algorithms that are periodically re-trained with a batch algorithm.
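The difference between the two families on slide 6 can be sketched with the simplest possible "model", a mean. This is only an illustration under stated assumptions: `IncrementalMean`, `batch_mean` and the `sales` figures are hypothetical names and data, not taken from the slides.

```python
from statistics import fmean


class IncrementalMean:
    """Running mean that learns one observation at a time (incremental)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x: float) -> float:
        # Shift the current estimate toward the new observation;
        # no past observations need to be stored.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean


def batch_mean(history):
    """Batch variant: recompute from the full stored history on each re-train."""
    return fmean(history)


# Hypothetical daily sales figures, for illustration only.
sales = [10.0, 12.0, 11.0, 15.0]

model = IncrementalMean()
for x in sales:
    model.update(x)

incremental_result = model.mean
batch_result = batch_mean(sales)
```

Both routes arrive at the same estimate here. The incremental version does constant work per event, while the batch version needs the whole history each time it is re-trained.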
  • 7. Introduction: The Challenge for the Best Big Data Technology. Hadoop: batch processing system that can churn through huge volumes of data. Storm: real-time complex event processing system that can process data streams.
  • 9. Hadoop + Storm = Real-time Big Data. It's a chance, not a challenge: Lambda Architecture!
  • 11. Lambda Architecture: Overview with description. Speed layer • Only new data • Compensates for the high latency of serving layer updates • Batch layer overrides speed layer. Serving layer • Loads and exposes the batch views for querying • Random access to batch views. Batch layer • Immutable, constantly growing dataset • Batch views are computed from this raw dataset
  • 12. Basic Idea behind Lambda architecture: query = function(all data). Nathan Marz, Big Data: Principles and best practices of scalable realtime data systems
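Read literally, "query = function(all data)" says that any answer can be recomputed from the complete, immutable dataset. A minimal sketch of that idea follows; the event tuples and the `clicks_per_item` query are hypothetical examples, not part of the slides.

```python
from collections import Counter

# Hypothetical master dataset: an immutable, append-only record of
# (user, item) click events. A tuple is used so it cannot be mutated.
ALL_DATA = (
    ("alice", "book"),
    ("bob", "book"),
    ("alice", "lamp"),
)


def clicks_per_item(all_data):
    """A query written as a pure function of the complete dataset."""
    return Counter(item for _user, item in all_data)


result = clicks_per_item(ALL_DATA)
```

Because the function has no hidden state, re-running it over the same data always gives the same answer. Recomputing over all data is slow at scale, which is why the architecture precomputes batch views and adds a speed layer for recent events.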
  • 13. Basic Idea behind Lambda: Perform some function over the data, from real-time data “0“ to the history data “n“. [Diagram: Real Time Big Data under the Lambda Architecture, split into a Storm process and a Hadoop process.] Letting Hadoop process the history data makes the process faster.
  • 14. The Problem: Unanswered questions of Lambda architecture. [Diagram: Real Time Big Data split into Real-time and Batch Process.] Questions to be answered: • How to define the boundary between real-time and batch processing? • How to synchronize the computation between the two systems? • How to avoid gaps and overlaps? • Which algorithm to use? • How to avoid failures and provide a fault-tolerance mechanism?
  • 15. Modified Lambda Architecture: Presentation Layer • The presentation layer must aggregate the outputs of Storm and Hadoop • Users will see the result of their events in less than 2 seconds • Seamless merge between short-term and long-term data
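One way to picture the merge in the presentation layer, and the "gaps and overlaps" question from slide 14, is a single cutoff timestamp that assigns every event to exactly one side. The sketch below is an assumption about how such a merge could look, not the deck's implementation; the function names, the `cutoff` parameter and the sample events are hypothetical, and plain Python stands in for Hadoop and Storm.

```python
from collections import Counter


def batch_view(events, cutoff):
    """Counts from events strictly before the cutoff (the batch side)."""
    return Counter(item for ts, item in events if ts < cutoff)


def speed_view(events, cutoff):
    """Counts from events at or after the cutoff (the real-time side)."""
    return Counter(item for ts, item in events if ts >= cutoff)


def merged_view(events, cutoff):
    """Presentation-layer merge: add the two partial views together."""
    return batch_view(events, cutoff) + speed_view(events, cutoff)


# Hypothetical (timestamp, item) events; the batch run covers ts < 5.
events = [(1, "book"), (2, "lamp"), (5, "book"), (7, "book")]
cutoff = 5

merged = merged_view(events, cutoff)
```

Because the two predicates (`ts < cutoff` and `ts >= cutoff`) are complementary, no event is counted twice and none is dropped. This only works for aggregates that can be combined, such as counts or sums.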
  • 17. What is Mahout? Introduction • Apache Software Foundation Java library • Scalable “machine learning“ library that runs mostly on Hadoop • Currently Mahout mainly supports four use cases: recommendation, clustering, classification, frequent itemset mining • Core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm
  • 18. Basic Recommender algorithm: How it works. Today's focus: suggesting items to a user based on the current search
  • 19. Basic Recommender algorithm: Defining recommendation. Mahout implements a collaborative filtering framework. Two broad categories of recommender engine algorithms: User-based: recommends items by finding similar users; harder to scale because of the dynamic nature of users. Item-based: calculates similarity between items and makes recommendations; items usually don't change much, so the similarities can be calculated offline.
  • 20. Basic Recommender algorithm: Defining recommendation. User preference for an item: • Like something • Don't like something • Don't care. Safe to assume: 1 click = 1 like = uniform preference.
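With uniform preferences (a click is a like), item-based similarity can be computed from nothing more than the set of users who clicked each item. The sketch below uses the Tanimoto (Jaccard) coefficient, a common choice for such boolean data. It is plain Python rather than Mahout's API, and the click data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical click log: each (user, item) pair counts as one "like".
clicks = [
    ("alice", "book"), ("alice", "lamp"),
    ("bob", "book"), ("bob", "lamp"),
    ("carol", "book"), ("carol", "pen"),
]


def users_by_item(clicks):
    """Index the click log as item -> set of users who clicked it."""
    index = defaultdict(set)
    for user, item in clicks:
        index[item].add(user)
    return dict(index)


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient: shared users / all users of either item."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(item, clicks):
    """Rank all other items by similarity to `item`, most similar first."""
    index = users_by_item(clicks)
    target = index.get(item, set())
    scores = {
        other: tanimoto(target, users)
        for other, users in index.items()
        if other != item
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = similar_items("lamp", clicks)
```

Since item-to-item scores depend only on the accumulated click sets, they can be computed offline in a batch job and looked up at request time, which matches the scaling argument for the item-based approach.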
  • 21. Mahout Library of Algorithms: Lots of algorithms to choose from
  • 22. Use Cases: Real Time Machine Learning. eCommerce. Objective: Increase sales revenue. Match potential customers to the right product. Personalise the user experience on web and email. Customer lifecycle management.
  • 23. Use Cases: Real Time Machine Learning. Financial Services. Objective: Real-time fraud detection. Compute patterns/predictors for individual customers. Classify and cluster customers and recalculate patterns and predictors. Set thresholds across all data.
  • 24. Use Cases: Real Time Machine Learning. Media. Objective: Generating metadata. Video/audio/text analysis. Find patterns/clusters for people, places, products, things.
  • 25. Use Cases: Real Time Machine Learning. Carbookplus. Objective: Generating metadata. Match potential trips to the right destination. Recommend the best gas station. Recommend contacts whom the user might know. Match the right advertisers to customers based on vehicle needs.
  • 26. Summary: Ability to create real-time systems based on lambda architecture. Usefulness of predictive algorithms. Reasons to concentrate on real-time predictions. Further reading: http://storm-project.net/ http://mahout.apache.org/ http://hadoop.apache.org/