SlideShare a Scribd company logo
1 of 36
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.kai-waehner.de
jDays - Gothenburg, Sweden (March 2017)
Advanced Analytics and Machine Learning
with R, Spark, H2O and TensorFlow for Real Time Processing
© Copyright 2000-2017 TIBCO Software Inc.
Apply Big Data Analytics to Real Time Processing
© Copyright 2000-2017 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
© Copyright 2000-2017 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
Machine Learning
…. allows computers to find hidden insights without being
explicitly programmed where to look.
Real World Examples of Machine Learning
Spam Detection
Search Results +
Product Recommendation
Picture Detection
(Friends, Locations, Products)
Machine Learning is already present in daily life…
Now, every enterprise is beginning to leverage it!
The Next Disruption:
Google Beats Go Champion
© Copyright 2000-2017 TIBCO Software Inc.
From Insight to Action - Closed Loop for Big Data Analytics
Insight ActionEVENTSEVENTS
© Copyright 2000-2017 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
© Copyright 2000-2017 TIBCO Software Inc.
Analytical Pipeline
1. Data Access
2. Data Preparation
3. Exploratory Data Analysis
4. Model Building
5. Model Validation
6. Model Execution
7. Deployment
© Copyright 2000-2017 TIBCO Software Inc.
Variety of Data in Enterprises
Custom	GUI-driven	
data	access	via	SDK
Siebel
eBusiness
Local	data	sources
AccessExcel STDF
Drag-and-drop
MySQL
SQL	Server
Oracle
Information	Services
(join,	transform,	reusable,	
parameterized,	dynamic	query	
for	in-memory	use)
Databases
JDBC/ODBC
Hadoop
SFDC
PostgreSQL
Teradata
Netezza
Etc.XML
RDBMS
Flat
Files
Spread-
sheets
Web
Services
Oracle
E-Business
RDBMS
RDBMS
RDBMS
SAP BWSAP R/3 D
A
T
A
F
A
B
R
I
C
Salesforce
ODBC
OLE	DB
SqlClient
Direct	
connection
Oracle
TeradataAsterMS	SSAS
Teradata
Direct	Query
(dynamically	query	and	retrieve	data	for	
visualization	and	analysis)
Databases
MySQL
Etc.
OBIEE
Netezza
Hadoop
© Copyright 2000-2017 TIBCO Software Inc.
Data Preparation
http://www.slideshare.net/odsc/feature-engineering
Data Preparation
Visual Analytics - Interactive Brush-Linked
© Copyright 2000-2017 TIBCO Software Inc.
© Copyright 2000-2017 TIBCO Software Inc.
Model Building
A model is a simplification of the truth
that helps you with decision making.
© Copyright 2000-2017 TIBCO Software Inc.
Cross-Validation Procedure
https://genome.tugraz.at/proclassify/help/pages/XV.html
© Copyright 2000-2017 TIBCO Software Inc.
Execution via Code / Scripting
Execution within the Visual Analytics Tooling
© Copyright 2000-2017 TIBCO Software Inc.
Customer Churn with Random Forest Algorithm:
Select variables
for the model
© Copyright 2000-2017 TIBCO Software Inc.
Frameworks and Tooling
Advanced Analytics and Big Data Tools for Data Scientists
Many more ….
Portable Format
for Analytics (PFA)
© Copyright 2000-2017 TIBCO Software Inc.
Demystify Data Science for the Business Analyst
Leverage Machine Learning
without the help of a Data Scientist
Development of Analytic Models
with R, TensorFlow, Apache Spark, RapidMiner, TIBCO Spotfire
Live DemoLive Demo
© Copyright 2000-2017 TIBCO Software Inc.
Agenda
1) Machine Learning and Big Data Analytics
2) Building an Analytic Model
3) Applying an Analytic Model in Real Time
© Copyright 2000-2017 TIBCO Software Inc.
Analytical Pipeline
1. Data Access
2. Data Preparation
3. Exploratory Data Analysis
4. Model Building
5. Model Validation
6. Model Execution
7. Deployment
© Copyright 2000-2017 TIBCO Software Inc.
Streaming Analytics - Processing Pipeline
APIs
Adapters /
Channels
Integration
Messaging
Stream Ingest
Transformation
Aggregation
Enrichment
Filtering
Stream
Preprocessing
Process
Management
Analytics
(Real Time)
Applications
& APIs
Analytics /
DW Reporting
Stream
Outcomes
• Contextual Rules
• Windowing
• Patterns
• Analytics
• Deep ML
• …
Stream Analytics &
Processing
Index / SearchNormalization
Applying an Analytic Model
is just a piece of the puzzle!
© Copyright 2000-2017 TIBCO Software Inc.
Frameworks and Products
(no complete list!)
OPEN SOURCE CLOSED SOURCE
PRODUCT
FRAMEWORK
Azure Microsoft
Stream Analytics
© Copyright 2000-2017 TIBCO Software Inc.
How to
apply analytic models
to real time processing
without redevelopment?
Stream
Processing
H20.ai
Open
Source
R
TERR
Spark
ML
MATLAB
SAS
PMML
Apache Spark ML and Spark Streaming with PMML Models
https://github.com/jpmml/jpmml-spark
© Copyright 2000-2017 TIBCO Software Inc.
© Copyright 2000-2017 TIBCO Software Inc.
TIBCO StreamBase Connector for R and TERR
© Copyright 2000-2017 TIBCO Software Inc.
TIBCO StreamBase Connector for H2O.ai
© Copyright 2000-2017 TIBCO Software Inc.
TIBCO StreamBase Connector for PMML
Scenario: Predictive Scrapping of Parts in an Assembly Line
Station 1 Station 2
Cost Before
9€
7€ 13€
Total Cost
29€
(or more)
Scrap? Scrap?
Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
Fast Data Architecture for Predictive Maintenance
Operational	Analytics
Operations
Live	UI
CSV Batch
JSON Real Time
XML Real Time
Streaming	AnalyticsAction
Aggregate
Rules
Analytics
Correlate
Live	Datamart
Continuous	query	
processing
Alerts
Manual	action,	
escalation
HISTORICAL	ANALYSIS Data	
Scientists
Flume
HDFS
Spotfire
R	/	TERR
HDFS
Hadoop (Cloudera)
StreamBase
TIBCO Fast Data Platform
H2O
Oracle	RDBMS
Avro Parquet … PMML
Internal	Data
TIBCO Spotfire with H2O Integration
© Copyright 2000-2017 TIBCO Software Inc.
Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
TIBCO StreamBase / Live Datamart + H2O.ai
Live DemoLive Demo
© Copyright 2000-2017 TIBCO Software Inc.
From Insight to Action - Closed Loop for Big Data Analytics
Insight Action
MONITOR
PREDICT
ACT
DECIDE
MODEL
ACCESS
ANALYZE
WRANGLE
© Copyright 2000-2017 TIBCO Software Inc.
Key Take-Aways
Ø Insights are hidden in Historical Data on Big Data Platforms
Ø Machine Learning and Big Data Analytics find these Insights by building Analytics Models
Ø Event Processing uses these Models (without Redevelopment) to take Action in Real Time
Questions? Please contact me!
Kai Wähner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
LinkedIn

More Related Content

What's hot

Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your DataBig Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Kai Wähner
 
Javaedge 2010-cschalk
Javaedge 2010-cschalkJavaedge 2010-cschalk
Javaedge 2010-cschalk
Chris Schalk
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Shirshanka Das
 

What's hot (20)

How to Apply Big Data Analytics and Machine Learning to Real Time Processing ...
How to Apply Big Data Analytics and Machine Learning to Real Time Processing ...How to Apply Big Data Analytics and Machine Learning to Real Time Processing ...
How to Apply Big Data Analytics and Machine Learning to Real Time Processing ...
 
On Demand BI
On Demand BIOn Demand BI
On Demand BI
 
Connect Faster with SnapLogic at Workday Rising
Connect Faster with SnapLogic at Workday RisingConnect Faster with SnapLogic at Workday Rising
Connect Faster with SnapLogic at Workday Rising
 
Everything you need to know about cloud migration(Build Stuff 2021)
Everything you need to know about cloud migration(Build Stuff 2021)Everything you need to know about cloud migration(Build Stuff 2021)
Everything you need to know about cloud migration(Build Stuff 2021)
 
Modern Data Platforms
Modern Data Platforms Modern Data Platforms
Modern Data Platforms
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
 
The Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data PlatformThe Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
The Scout24 Data Landscape Manifesto: Building an Opinionated Data Platform
 
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
 
The Scout24 Data Platform - a technical deep dive
The Scout24 Data Platform - a technical deep diveThe Scout24 Data Platform - a technical deep dive
The Scout24 Data Platform - a technical deep dive
 
Big Data Application Architectures - IoT
Big Data Application Architectures - IoTBig Data Application Architectures - IoT
Big Data Application Architectures - IoT
 
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
 
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your DataBig Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your Data
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
 
Data analysis trend 2015 2016 v071
Data analysis trend 2015 2016 v071Data analysis trend 2015 2016 v071
Data analysis trend 2015 2016 v071
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
 
Javaedge 2010-cschalk
Javaedge 2010-cschalkJavaedge 2010-cschalk
Javaedge 2010-cschalk
 
Hadoop for Humans: Introducing SnapReduce 2.0
Hadoop for Humans: Introducing SnapReduce 2.0Hadoop for Humans: Introducing SnapReduce 2.0
Hadoop for Humans: Introducing SnapReduce 2.0
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
451 Research Impact Report
451 Research Impact Report451 Research Impact Report
451 Research Impact Report
 
Native Spark Executors on Kubernetes: Diving into the Data Lake - Chicago Clo...
Native Spark Executors on Kubernetes: Diving into the Data Lake - Chicago Clo...Native Spark Executors on Kubernetes: Diving into the Data Lake - Chicago Clo...
Native Spark Executors on Kubernetes: Diving into the Data Lake - Chicago Clo...
 

Similar to R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics

Similar to R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics (20)

How to Leverage Machine Learning (R, Hadoop, Spark, H2O) for Real Time Proces...
How to Leverage Machine Learning (R, Hadoop, Spark, H2O) for Real Time Proces...How to Leverage Machine Learning (R, Hadoop, Spark, H2O) for Real Time Proces...
How to Leverage Machine Learning (R, Hadoop, Spark, H2O) for Real Time Proces...
 
Machine Learning Applied to Real Time Scoring in Manufacturing and Energy Uti...
Machine Learning Applied to Real Time Scoring in Manufacturing and Energy Uti...Machine Learning Applied to Real Time Scoring in Manufacturing and Energy Uti...
Machine Learning Applied to Real Time Scoring in Manufacturing and Energy Uti...
 
Findability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learningFindability Day 2016 - Big data analytics and machine learning
Findability Day 2016 - Big data analytics and machine learning
 
HOW TO APPLY BIG DATA ANALYTICS AND MACHINE LEARNING TO REAL TIME PROCESSING ...
HOW TO APPLY BIG DATA ANALYTICS AND MACHINE LEARNING TO REAL TIME PROCESSING ...HOW TO APPLY BIG DATA ANALYTICS AND MACHINE LEARNING TO REAL TIME PROCESSING ...
HOW TO APPLY BIG DATA ANALYTICS AND MACHINE LEARNING TO REAL TIME PROCESSING ...
 
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
Big Data LDN 2017: How Big Data Insights Become Easily Accessible With Workfl...
 
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo JapanAI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
 
Make your application stand out with bi that blends in
Make your application stand out with bi that blends inMake your application stand out with bi that blends in
Make your application stand out with bi that blends in
 
JASPERSOFT LIVE DEMO - NAM
JASPERSOFT LIVE DEMO - NAMJASPERSOFT LIVE DEMO - NAM
JASPERSOFT LIVE DEMO - NAM
 
Introduction to jaspersoft7 customer webinar
Introduction to jaspersoft7 customer webinarIntroduction to jaspersoft7 customer webinar
Introduction to jaspersoft7 customer webinar
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
 
2018 Oracle Impact 발표자료: Oracle Enterprise AI
2018  Oracle Impact 발표자료: Oracle Enterprise AI2018  Oracle Impact 발표자료: Oracle Enterprise AI
2018 Oracle Impact 발표자료: Oracle Enterprise AI
 
Accelerate IoT Development with KnowThings.io
Accelerate IoT Development with KnowThings.ioAccelerate IoT Development with KnowThings.io
Accelerate IoT Development with KnowThings.io
 
AI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter BusinessAI Foundations: Simpler Technologies, Smarter Business
AI Foundations: Simpler Technologies, Smarter Business
 
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWSIOT311_Customer Stories of Things, Cloud, and Analytics on AWS
IOT311_Customer Stories of Things, Cloud, and Analytics on AWS
 
Streaming Analytics for IoT-Oriented Applications
Streaming Analytics for IoT-Oriented ApplicationsStreaming Analytics for IoT-Oriented Applications
Streaming Analytics for IoT-Oriented Applications
 
SAP Leonardo
SAP LeonardoSAP Leonardo
SAP Leonardo
 
BUILD with Microsoft - Radu Stefan
 BUILD with Microsoft - Radu Stefan BUILD with Microsoft - Radu Stefan
BUILD with Microsoft - Radu Stefan
 
Machine Learning at the Edge
Machine Learning at the EdgeMachine Learning at the Edge
Machine Learning at the Edge
 
Apache spark empowering the real time data driven enterprise - StreamAnalytix...
Apache spark empowering the real time data driven enterprise - StreamAnalytix...Apache spark empowering the real time data driven enterprise - StreamAnalytix...
Apache spark empowering the real time data driven enterprise - StreamAnalytix...
 
Bitrock manufacturing
Bitrock manufacturing Bitrock manufacturing
Bitrock manufacturing
 

More from Kai Wähner

Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
Kai Wähner
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
Kai Wähner
 

More from Kai Wähner (20)

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics

  • 1. Kai Wähner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.kai-waehner.de jDays - Gothenburg, Sweden (March 2017) Advanced Analytics and Machine Learning with R, Spark, H2O and TensorFlow for Real Time Processing
  • 2. © Copyright 2000-2017 TIBCO Software Inc. Apply Big Data Analytics to Real Time Processing
  • 3. © Copyright 2000-2017 TIBCO Software Inc. Agenda 1) Machine Learning and Big Data Analytics 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time
  • 4. © Copyright 2000-2017 TIBCO Software Inc. Agenda 1) Machine Learning and Big Data Analytics 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time
  • 5. Machine Learning …. allows computers to find hidden insights without being explicitly programmed where to look.
  • 6. Real World Examples of Machine Learning Spam Detection Search Results + Product Recommendation Picture Detection (Friends, Locations, Products) Machine Learning is already present in daily life… Now, every enterprise is beginning to leverage it! The Next Disruption: Google Beats Go Champion
  • 7. © Copyright 2000-2017 TIBCO Software Inc. From Insight to Action - Closed Loop for Big Data Analytics Insight ActionEVENTSEVENTS
  • 8. © Copyright 2000-2017 TIBCO Software Inc. Agenda 1) Machine Learning and Big Data Analytics 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time
  • 9. © Copyright 2000-2017 TIBCO Software Inc. Analytical Pipeline 1. Data Access 2. Data Preparation 3. Exploratory Data Analysis 4. Model Building 5. Model Validation 6. Model Execution 7. Deployment
  • 10. © Copyright 2000-2017 TIBCO Software Inc. Variety of Data in Enterprises Custom GUI-driven data access via SDK Siebel eBusiness Local data sources AccessExcel STDF Drag-and-drop MySQL SQL Server Oracle Information Services (join, transform, reusable, parameterized, dynamic query for in-memory use) Databases JDBC/ODBC Hadoop SFDC PostgreSQL Teradata Netezza Etc.XML RDBMS Flat Files Spread- sheets Web Services Oracle E-Business RDBMS RDBMS RDBMS SAP BWSAP R/3 D A T A F A B R I C Salesforce ODBC OLE DB SqlClient Direct connection Oracle TeradataAsterMS SSAS Teradata Direct Query (dynamically query and retrieve data for visualization and analysis) Databases MySQL Etc. OBIEE Netezza Hadoop
  • 11. © Copyright 2000-2017 TIBCO Software Inc. Data Preparation http://www.slideshare.net/odsc/feature-engineering Data Preparation
  • 12. Visual Analytics - Interactive Brush-Linked © Copyright 2000-2017 TIBCO Software Inc.
  • 13. © Copyright 2000-2017 TIBCO Software Inc. Model Building A model is a simplification of the truth that helps you with decision making.
  • 14. © Copyright 2000-2017 TIBCO Software Inc. Cross-Validation Procedure https://genome.tugraz.at/proclassify/help/pages/XV.html
  • 15. © Copyright 2000-2017 TIBCO Software Inc. Execution via Code / Scripting
  • 16. Execution within the Visual Analytics Tooling © Copyright 2000-2017 TIBCO Software Inc. Customer Churn with Random Forest Algorithm: Select variables for the model
  • 17. © Copyright 2000-2017 TIBCO Software Inc. Frameworks and Tooling
  • 18. Advanced Analytics and Big Data Tools for Data Scientists Many more …. Portable Format for Analytics (PFA)
  • 19. © Copyright 2000-2017 TIBCO Software Inc. Demystify Data Science for the Business Analyst Leverage Machine Learning without the help of a Data Scientist
  • 20. Development of Analytic Models with R, TensorFlow, Apache Spark, RapidMiner, TIBCO Spotfire Live DemoLive Demo
  • 21. © Copyright 2000-2017 TIBCO Software Inc. Agenda 1) Machine Learning and Big Data Analytics 2) Building an Analytic Model 3) Applying an Analytic Model in Real Time
  • 22. © Copyright 2000-2017 TIBCO Software Inc. Analytical Pipeline 1. Data Access 2. Data Preparation 3. Exploratory Data Analysis 4. Model Building 5. Model Validation 6. Model Execution 7. Deployment
  • 23. © Copyright 2000-2017 TIBCO Software Inc. Streaming Analytics - Processing Pipeline APIs Adapters / Channels Integration Messaging Stream Ingest Transformation Aggregation Enrichment Filtering Stream Preprocessing Process Management Analytics (Real Time) Applications & APIs Analytics / DW Reporting Stream Outcomes • Contextual Rules • Windowing • Patterns • Analytics • Deep ML • … Stream Analytics & Processing Index / SearchNormalization Applying an Analytic Model is just a piece of the puzzle!
  • 24. © Copyright 2000-2017 TIBCO Software Inc. Frameworks and Products (no complete list!) OPEN SOURCE CLOSED SOURCE PRODUCT FRAMEWORK Azure Microsoft Stream Analytics
  • 25. © Copyright 2000-2017 TIBCO Software Inc. How to apply analytic models to real time processing without redevelopment? Stream Processing H20.ai Open Source R TERR Spark ML MATLAB SAS PMML
  • 26. Apache Spark ML and Spark Streaming with PMML Models https://github.com/jpmml/jpmml-spark © Copyright 2000-2017 TIBCO Software Inc.
  • 27. © Copyright 2000-2017 TIBCO Software Inc. TIBCO StreamBase Connector for R and TERR
  • 28. © Copyright 2000-2017 TIBCO Software Inc. TIBCO StreamBase Connector for H2O.ai
  • 29. © Copyright 2000-2017 TIBCO Software Inc. TIBCO StreamBase Connector for PMML
  • 30. Scenario: Predictive Scrapping of Parts in an Assembly Line Station 1 Station 2 Cost Before 9€ 7€ 13€ Total Cost 29€ (or more) Scrap? Scrap? Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
  • 31. Fast Data Architecture for Predictive Maintenance Operational Analytics Operations Live UI CSV Batch JSON Real Time XML Real Time Streaming AnalyticsAction Aggregate Rules Analytics Correlate Live Datamart Continuous query processing Alerts Manual action, escalation HISTORICAL ANALYSIS Data Scientists Flume HDFS Spotfire R / TERR HDFS Hadoop (Cloudera) StreamBase TIBCO Fast Data Platform H2O Oracle RDBMS Avro Parquet … PMML Internal Data
  • 32. TIBCO Spotfire with H2O Integration © Copyright 2000-2017 TIBCO Software Inc. Example: Predictive Analytics for Manufacturing (“scrap parts as early as possible”)
  • 33. TIBCO StreamBase / Live Datamart + H2O.ai Live DemoLive Demo
  • 34. © Copyright 2000-2017 TIBCO Software Inc. From Insight to Action - Closed Loop for Big Data Analytics Insight Action MONITOR PREDICT ACT DECIDE MODEL ACCESS ANALYZE WRANGLE
  • 35. © Copyright 2000-2017 TIBCO Software Inc. Key Take-Aways Ø Insights are hidden in Historical Data on Big Data Platforms Ø Machine Learning and Big Data Analytics find these Insights by building Analytics Models Ø Event Processing uses these Models (without Redevelopment) to take Action in Real Time
  • 36. Questions? Please contact me! Kai Wähner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.kai-waehner.de LinkedIn