SlideShare ist ein Scribd-Unternehmen logo
1 von 93
Fast Data and Streaming Analytics
in the Era of Hadoop, R and Apache Spark
Kai Wähner
kwaehner@tibco.com
@KaiWaehner
www.kai-waehner.de
LinkedIn / Xing  Please connect!
Key Messages
– Streaming Analytics processes Data while it is in Motion!
– Automation and Proactive Human Interaction are BOTH needed!
– Time to Market is the Key Requirement for most Use Cases!
Agenda
– Real World Use Cases
– Introduction to Stream Processing
– Market Overview
– Relation to other Big Data Components
Agenda
– Real World Use Cases
– Introduction to Stream Processing
– Market Overview
– Relation to other Big Data Components
© Copyright 2015 TIBCO Software Inc.
Find and Act on “Critical Business Moments”
“Business Moments” occur in Every Facet of Enterprise Operations, they drive
competitive differentiation, customer satisfaction and business success!
Optimize
Pricing Identify
fraud
Make cross-
sell offers
Restock
inventory
Reroute
trucks
Deliver proactive
customer service
Predict equipment
failure & fix
proactively Anticipate
and handle
disruptions
Operational Intelligence in Action
© Copyright 2000-2015 TIBCO Software Inc.
Actions by Operations
Human decisions in real time informed by
up to date information
The Challenge:
Empower operations staff to see and
seize key business moments6
Automated action based on models of history
combined with live context and business rules
The Challenge:
Create, understand, and deploy algorithms &
rules that automate key business reactions
Machine-to-Machine Automation
7
Success Story
© Copyright 2000-2015 TIBCO Software Inc.
Predictive Fault Management
© Copyright 2000-2013 TIBCO Software Inc.
“An outage on one well can cost $10M per
hour. We have 20-100 outages per year.“
- Drilling operations VP, major oil company
Data Monitoring
• Motor temperature
• Motor vibration
• Current
• Intake pressure
• Intake temperature
 Flow
Electrical power cable
Pump
Intake
Protector
ESP motor
Pump monitoring unit
Electric Submersible Pumps (ESP)
Predictive Analytics (Fault Management)
Voltage
Temperature
Vibration
Device
history
Temporal analytic: “If vibration spike is followed by temp spike then
voltage spike [within 12 minutes] then flag high severity alert.”
Predictive Analytics (Fault Management)
Live Surveillance of Equipment
© Copyright 2000-2014 TIBCO Software Inc.
Continuous, live geospatial display of pump
health and predictive signal breeches
Alerts based on
predictive signals
Compare live readings and signals
to historical average and means
Continuous, live visualization
of stats per 100’s of wells
12
Success Story
© Copyright 2000-2015 TIBCO Software Inc.
Smart Manufacturing
IoT for High Tech Manufacturing Yield Optimization
© Copyright 2000-2014 TIBCO Software Inc.
• Before: Solar Panel Manufacturer with No Unified View of
Manufacturing Process
– Multiple manufacturing facilities, multiple processes – no way to compare
production to yield expectations
• Negative Consequences: Sub-Optimal Production
– Operations are sub-optimal: high tolerance leads to better yield but less
output; tight tolerance means high throughput but lower yield
• Business Outcome: Higher Yield and More Runs
– Process Manufacturing can run tighter tolerances and adjust them mid-run,
predicting yield and adjusting to changing variables
– Systems proactively re-route high-value customers around affected network
areas in real-time
• How We Do It: The TIBCO Fast Data Platform
– IoT, Spotfire, StreamBase, and TERR for predictive modeling, high-speed
network by TIBCO “For every 1% increase in shipped
product, we make $11MM in profit.
The demand is there, we just need to
fulfill it.”
- Head of Quality, Solar Panel Manufacturer
High Tech Manufacturing Yield Optimization
© Copyright 2000-2014 TIBCO Software Inc.
Live streaming datamart analysis
Continuous update and exploration of
top yield metrics; take action
High Tech Manufacturing Yield Optimization
© Copyright 2000-2014 TIBCO Software Inc.
Continuously computed real-time analytics on streams by
StreamBase (thresholds, min / max, average)
Analysis, alerts and triggers are based on streaming analytics
High Tech Manufacturing Yield Optimization
© Copyright 2000-2014 TIBCO Software Inc.
Manufacturing operations staff
drill down on any machine, any
time, to inspect and fix problems
before they impact yield
17
Success Story
© Copyright 2000-2015 TIBCO Software Inc.
Crowd Management
18
© Copyright 2000-2015 TIBCO Software Inc.
Crowd Management (Stadium, Airport, Conference, …)
Sacramento Kings  World’s Smartest Building
© Copyright 2000-2015 TIBCO Software Inc.
20
© Copyright 2000-2014 TIBCO Software Inc.
21
© Copyright 2000-2014 TIBCO Software Inc.
22
© Copyright 2000-2014 TIBCO Software Inc.
23
© Copyright 2000-2014 TIBCO Software Inc.
24
© Copyright 2000-2014 TIBCO Software Inc.
25
© Copyright 2000-2014 TIBCO Software Inc.
26
© Copyright 2000-2014 TIBCO Software Inc.
27
Success Story
© Copyright 2000-2015 TIBCO Software Inc.
Retailing in the 21st Century
Challenges of the 21st Century Retailer
• Retailing and Retail Challenges are changing
• Consumers expect better and integrated customer experience across all channels
– Rapid adoption of mobile is a major driver
– Customers want an integrated service across physical and digital channels… Simultaneously
– Customer experience is becoming one of the main differentiators
• Real-Time, one-on-one marketing can:
– Improve a retailer’s relevance with the customer
– Increase customer wallet-share
• Key to being able to achieve this is:
– Identifying and knowing your customer, in depth in real-time
– Understanding the opportunity their past behavior reveals
– Understanding your inventory (availability, velocity, pipeline)
29
© Copyright 2000-2014 TIBCO Software Inc.
All Customers are different… Treat them that way…
Capture – Engage – Expand - Monetize
Patterns – Real time
MOREPERSONAL
MORE CONTEXT
social
CRM
POS
mobileweb
e-mails
National Retailer Loyalty 2015
© Copyright 2000-2015 TIBCO Software Inc.
Top Benefits
• Smart cross-selling based in iBeacons
• Location-based services in real time
• Leveraging partner offerings
New Real-Time Fraud Detection
Based on Deep Historical Insight
Real-time fraud action can be taken based on
historical insight – system not “whiplashed”
by real-time events
Streaming Analytics for Gift Card Fraud Protection
32
© Copyright 2000-2015 TIBCO Software Inc.
Internet of Things
Hybrid Stores
Smart Tags
Smart Shelves
Smart Warehouse
Faster Delivery
Buy Online
Pickup at Store
Same Day
Delivery
Omni Channel 2.0
Store Fulfillment
Social Media
Predictive
Shopping
National Retailer Loyalty 2018
33
Great success stories, but …
© Copyright 2000-2015 TIBCO Software Inc.
… how to realize these use cases?
34
© Copyright 2000-2014 TIBCO Software Inc.
Real Time Close Loop
Model
Develop model
Deploy into Stream
Processing flow
Act
Automatically monitor
real-time transactions
Automatically trigger
action
Analyze
Analyze data via Data
Discovery
Uncover patterns,
trends, correlations
Agenda
– Real World Use Cases
– Introduction to Stream Processing
– Market Overview
– Relation to other Big Data Components
Traditional Data Processing: Challenges
• Introduces too much “decision
latency” into the business.
• Responses are delivered “after-the-
fact”.
• Maximum value of the identified
situation is lost.
– Cross-sell / up-sell opportunities are lost,
impending equipment failure is missed,
business processes are slow to respond
and lack timely context.
• Decisions are made on old and stale
data.
© Copyright 2000-2015 TIBCO Software Inc.
Store
Analyze
Act
The New Era: Fast Data Processing
• Events are analyzed and processed
in real-time as they arrive.
• Decisions are timely, contextual, and
based on fresh data.
• Decision latency is eliminated,
resulting in:
 Superior Customer Experience
 Operational Excellence
 Instant Awareness and Timely Decisions
© Copyright 2000-2015 TIBCO Software Inc.
Act
Analyze
Store
Streaming Analytics
© Copyright 2000-2015 TIBCO Software Inc.
time
1 2 3 4 5 6 7 8 9
Event Streams
• Continuous Queries
• Sliding Windows
• Filter
• Aggregation
• Correlation
• …
39
Act while data is in motion!
Time
Business
Value
Business Event
Data Ready for Analysis
Analysis Completed
Decision Made
$$$$
$$$
$$
$ Action Taken
Stream Processing
speeds action and
increases business
value by seizing
opportunities while
they matter
Operational Analytics
Operations
Live UI
SENSOR DATA
TRANSACTIONS
MESSAGE BUS
MACHINE DATA
SOCIAL DATA
Streaming AnalyticsAction
Aggregate
Rules
Stream Processing
Analytics
Correlate
Live Datamart
Continuous query
processing
Alerts
Manual action,
escalation
HISTORICAL ANALYSIS
MS Excel
SAS
Data
Scientists
Cleansed
Data
History
Data Discovery
R
Enterprise Service Bus
ERP MDM DB WMS
SOA
BIG DATA
Data Warehouse, Hadoop
InternalData
IntegrationBus
API
Event Server
Streaming Analytics Reference Architecture
Spark
Agenda
– Real World Use Cases
– Introduction to Stream Processing
– Market Overview
– Relation to other Big Data Components
Operational Analytics
Operations
Live UI
SENSOR DATA
TRANSACTIONS
MESSAGE BUS
MACHINE DATA
SOCIAL DATA
Streaming AnalyticsAction
Aggregate
Rules
Stream Processing
Analytics
Correlate
Live Datamart
Continuous query
processing
Alerts
Manual action,
escalation
HISTORICAL ANALYSIS
MS Excel
SAS
Data
Scientists
Cleansed
Data
History
Data Discovery
R
Enterprise Service Bus
ERP MDM DB WMS
SOA
BIG DATA
Data Warehouse, Hadoop
InternalData
IntegrationBus
API
Event Server
Streaming Analytics Reference Architecture
Spark
44
Alternatives for Stream Processing
Time
to
Market
Streaming
Frameworks
Streaming
Products
Slow Fast
Streaming
Concepts
IncludesIncludes
© Copyright 2000-2015 TIBCO Software Inc.
Concepts (Continuous Queries, Sliding Windows)
Patterns (Counting, Sequencing, Tracking, Trends)
Build everything by yourself! 
45
What Streaming Alternative do you need?
Time
to
Market
Streaming
Frameworks
Streaming
Products
Slow Fast
Streaming
Concepts
© Copyright 2000-2015 TIBCO Software Inc.
46
Usually not an option ...
© Copyright 2000-2015 TIBCO Software Inc.
… as there are a lot of
Frameworks and
Products available!
47
Alternatives
© Copyright 2000-2015 TIBCO Software Inc.
OPEN SOURCE CLOSED SOURCE
PRODUCT
FRAMEWORK
(no complete list!)
Library (Java, .NET, Python)
Query Language (often similar to SQL)
Scalability (horizontal and vertical, fail over)
Connectivity (technologies, markets, products)
Operators (Filter, Sort, Aggregate)
48
What Streaming Alternative do you need?
Time
to
Market
Streaming
Frameworks
Streaming
Products
Slow Fast
Streaming
Concepts
© Copyright 2000-2015 TIBCO Software Inc.
49
Apache Storm
© Copyright 2000-2015 TIBCO Software Inc.
Spout Bolt
50
Apache Storm – Hello World
© Copyright 2000-2015 TIBCO Software Inc.
http://wpcertification.blogspot.ch/2014/02/helloworld-apache-storm-word-counter.html
51
Amazon Kinesis
© Copyright 2000-2015 TIBCO Software Inc.
https://aws.amazon.com/kinesis/
AWS S3 RedShift DynamoDB
52
Amazon Kinesis – Hello World
© Copyright 2000-2015 TIBCO Software Inc.
53
Amazon Kinesis – The Cloud ...
© Copyright 2000-2015 TIBCO Software Inc.
… is easy to setup and scale!
But you do not have full control 
• Any data that is older than 24 hours is automatically deleted
• Every Kinesis application consists of just one procedure, so you can’t use Kinesis to
perform complex stream processing unless you connect multiple applications
• Kinesis can only support a maximum size of 50KB for each data item
http://diamondstream.com/amazon-kinesis-big-real-time-data-processing-solution/
(blog post from 2014, might be outdated, but shows that you do not have full control over a cloud service)
54
Apache Spark
© Copyright 2000-2015 TIBCO Software Inc.
General Data-processing Framework
 However, focus is especially on Analytics (these days)
http://fortune.com/2015/09/09/cloudera-spark-mapreduce/
55
Apache Spark – Focus on Analytics
© Copyright 2000-2015 TIBCO Software Inc.
http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/
http://fortune.com/2015/09/09/cloudera-spark-mapreduce/
http://www.ebaytechblog.com/2014/05/28/using-spark-to-ignite-data-analytics/
http://www.forbes.com/sites/paulmiller/2015/06/15/ibm-backs-apache-spark-for-big-data-analytics/
“[IBM’s initiatives] include:
• deepening the integration between Apache Spark and
existing IBM products like the Watson Health Cloud;
• open sourcing IBM’s existing SystemML machine
learning technology;
56
Spark Streaming
© Copyright 2000-2015 TIBCO Software Inc.
Spark Streaming
• is no real streaming solution
• uses micro-batches
• cannot process data in real-time (i.e. no ultra-low latency)
• allows easy combination with other Spark components (SQL, Machine Learning, etc.)
57
Apache Spark – Hello World
© Copyright 2000-2015 TIBCO Software Inc.
Spark Streaming API
Spark Core API
58
Alternatives
© Copyright 2000-2015 TIBCO Software Inc.
OPEN SOURCE CLOSED SOURCE
PRODUCT
FRAMEWORK
(no complete list!)
Visual IDE (Dev, Test, Debug)
Simulation (Feed Testing, Test Generation)
Live UI (monitoring, proactive interaction)
Maturity (24h support, consulting)
Integration (ootb integration: ESB, MDM, etc.)
Library (Java, .NET, Python)
Query Language (often similar to SQL)
Scalability (horizontal and vertical, fail over)
Connectivity (technologies, markets, products)
Operators (Filter, Sort, Aggregate)
What Streaming Alternative do you need?
Time
to
Market
Streaming
Frameworks
Streaming
Products
Slow Fast
Streaming
Concepts
60
IBM InfoSphere Streams
© Copyright 2000-2015 TIBCO Software Inc.
61
IBM InfoSphere Streams
© Copyright 2000-2015 TIBCO Software Inc.
https://developer.ibm.com/streamsdev/wp-content/uploads/sites/15/2014/04/Streams-and-Storm-April-2014-Final.pdf
TIBCO StreamBase
• Performance: Latency, Throughput, Scalability
– Multi-threaded and clustered server from version 1
– High throughput: Millions of messages, 100,000s of quotes, 10,000s of orders
– Low-latency: microsecond latency for algo trading, pre-trade risk, market data
• Take Advantage of High Performance Hardware
– Multicore (12, 24, 32 core) large memory (10s of gigabytes)
– 64-bit Linux, Windows, Solaris deployment
– Hardware acceleration (GPU, Solace, Tervela)
• Enterprise Deployment
– High availability and fault tolerance
– Distributed state management for large data sets
– Management and monitoring tools
– Security and entitlements Integration
– Continuous deployment and QA Process Support
StreamSQL compiler
and static optimizer
In process, in thread
adapter architecture
Visual parallelism and
scaling
ActiveSpaces
integration for
distributed shared state
Data parallelism and
dispatch
StreamBase Server
Innovations
“The StreamBase engine is for real. We couldn’t break it, and believe
me, I tried” SVP Development, Top 5 Broker Dealer
StreamBase: The Power of Visual Programming
© Copyright 2000-2015 TIBCO Software Inc.
1) Get ideas into
market in days or
weeks, not months or
years
2) Unlock the power of
IT and data scientists
working together
64
© Copyright 2000-2013 TIBCO Software Inc.
Code Anyone Can Read
Limit Gift Card Activation Amounts at One Location
Aggregate
Capture card activations per
location
Sales too high!
Log to any
database
No Fraud
Sales too high?
Visual Debugger
Feed Simulation
Unit Testing
“StreamBase’s modeling tools are easy to
use and will enable the exchange to
quickly react to the ever changing needs of
our customers.” Steve Goldman,
Director of Enterprise Architecture
StreamBase Development Studio
Live Datamart
Continuous Query Processor Alerts
BusinessEvents
FTL
EMS
ActiveSpaces
Live Datamart
BusinessWorks
Social Media Data
Market Data
Sensor Data
Historical
Data
ActiveSpaces Datagrid
Enterprise
dataMarket Data
IoT
Mobile
Social
LiveView Desktop
Command & Control
ACTION
Continuous Query
67
Dynamic aggregation
Live visualization
Ad-hoc continuous query
Alerts
Action
LiveView Desktop
Live Datamart Clients and APIs
• Rich Desktop Client
– Drag&Drop, no coding
• Rich Web Client
– Drag&Drop, no coding
• HTML5 and Javascript API
– D3, jQuery, ExtJS, Google Charts,
Bing, AngularJS
• .NET API
– For custom .NET development
• Java API
– For custom Java GUI development
• Combination
– Rich Client + HTML5 Extensions
Predictive Sensor Analytics
Live Demo (Stream Processing)
70
Spoilt for Choice – Which one to choose?
© Copyright 2000-2015 TIBCO Software Inc.
What are the
key aspects?
71
What do you need (out-of-the-box)?
© Copyright 2000-2015 TIBCO Software Inc.
• A stream processing programming language for streaming analytics
• Visual development and debugging instead of coding
• Out-of-the-box connectivity to streaming and historical data sources
• Performance (real-time vs. micro-batches)
• Automated monitoring and alerts
• Live UI for proactive human interaction
• Maturity and proven deployments
• Fault tolerance
• Commercial support
• Professional services and training
72
Spoilt for Choice – Framework or Product?
© Copyright 2000-2015 TIBCO Software Inc.
Does it make sense
to combine both?
Example: Apache Storm + TIBCO Live Datamart
External
Data
Snapshot Results
Continuous Query Processor
Query
TIBCO Live Datamart
Continuous
Alerting
Active Tables Active Tables
Continuous
Updates
Clients
Message Bus
Public Data
Customer Data
StreamBase Bolt StreamBase Spout
Operational
Data
StreamBase Bolt and Spout connect
Apache Storm to StreamBase to provide
real-time analytics on operational data
Agenda
– Real World Use Cases
– Introduction to Stream Processing
– Market Overview
– Relation to other Big Data Components
Operational Analytics
Operations
Live UI
SENSOR DATA
TRANSACTIONS
MESSAGE BUS
MACHINE DATA
SOCIAL DATA
Streaming AnalyticsAction
Aggregate
Rules
Stream Processing
Analytics
Correlate
Live Datamart
Continuous query
processing
Alerts
Manual action,
escalation
HISTORICAL ANALYSIS
MS Excel
SAS
Data
Scientists
Cleansed
Data
History
Data Discovery
R
Enterprise Service Bus
ERP MDM DB WMS
SOA
BIG DATA
Data Warehouse, Hadoop
InternalData
IntegrationBus
API
Event Server
Streaming Analytics Reference Architecture
Spark
76
© Copyright 2000-2014 TIBCO Software Inc.
Real Time Close Loop
Model
Develop model
Deploy into Stream
Processing flow
Act
Automatically monitor
real-time transactions
Automatically trigger
action
Analyze
Analyze data via Data
Discovery
Uncover patterns,
trends, correlations
Real Time Close Loop: Understand – Anticipate – Act
Big Data
 store everything
in Hadoop, DWH, NoSQL, etc.
 even without structure
 even if you do not need it today
http://blogs.teradata.com/international/tag/hadoop/
Real Time Close Loop: Understand – Anticipate – Act
Data Discovery + Statistics + Machine Learning
to find insights and patterns in historical data
Real Time Close Loop: Understand – Anticipate – Act
Streaming Analytics
to operationalize insights
and patterns in real time
Stream
Processing
Hadoop
Open
Source
R
TERR
SAS
MATLAB
In-
database
analytics
Spark
R with Revolution Analytics (now Microsoft)
© Copyright 2000-2015 TIBCO Software Inc.
Open Source GPL License
http://www.revolutionanalytics.com/webinars/introducing-revolution-r-open-enhanced-open-source-r-distribution-revolution-analytics
R with TIBCO Runtime for R (TERR)
TIBCO TERR delivers production-grade R analytics to
enterprises
 Flexibility & analytic power of R language
 Time-to-market agility
 Enterprise-grade platform
• A TIBCO licensed & supported product
• Not GPL, not a repackaging of the Open source R engine
• Deployment in TIBCO products and 3rd party applications
(e.g. Hadoop)
http://spotfire.tibco.com/discover-spotfire/what-does-spotfire-do/predictive-analytics/tibco-enterprise-runtime-for-r-terr
Use Open Source R or Not?
© Copyright 2000-2015 TIBCO Software Inc.
http://www.forbes.com/sites/danwoods/2015/01/27/microsofts-revolution-analytics-acquisition-is-the-wrong-way-to-embrace-r/
Spark MLlib
© Copyright 2000-2015 TIBCO Software Inc.
MLlib is Spark’s machine learning
(ML) library. Its goal is to make
practical machine learning scalable
and easy.
It consists of common learning
algorithms and utilities, including
classification, regression,
clustering, collaborative filtering,
dimensionality reduction, as well as
lower-level optimization primitives
and higher-level pipeline APIs.
You can even combine Mllib module with R language
Predictive Sensor Analytics
Live Demo (Data Discovery, Statistics)
© Copyright 2000-2013 TIBCO Software Inc.
80% of betting happens
AFTER the game begins
TODAY
Case Study: Streaming Analytics for Betting
• Situation: Today, 80% of Betting is Done After the
Game Starts
• It’s not your father’s bookie anymore!
• Problem: How to Analyze Big Betting Data?
• Thousands of concurrent games, constantly adjusting odds, dozens of
betting networks – firms must correlate millions of events a day to find
the best betting opportunities in real-time
• Solution: TIBCO for Fast Data Architecture
• TXOdds uses TIBCO to correlate, aggregate, and analyze large
volumes of streaming betting data in real-time and publish innovative
predictive betting analytics to their customers
• Result: TXOdds First to Market with Innovative Zero
Latency Betting Analytics
• Innovative real-time analytics help players who can process electronic
data in real-time the edge
“With StreamBase, in two
months we had our first betting
analytics feed live, and we
continually deploy new ideas
and evolve our old ones.”
- Alex Kozlenkov, VP of technology, TXOdds
87
“WHEN 5 KEY BOOKIES RAISE
THE SAME ODDS IN A 5-SECOND
WINDOW, BET LESS”
? ? ? ? ?? ???
88
“WHEN THE REAL-TIME ODDS ARE
5% GREATER THAN THE HISTORICAL
SPREAD, INCREASE MY BET”
? ? ? ? ? ? ? ? ? ? ? ? ? ?
? ? ? ? ? ? ? ? ? ? ? ? ? ?
Reference Architecture: Streaming Betting Analytics
Event Processing
MONITOR
REAL-TIME ANALYTICS
AGGREGATE
HISTORICAL COMPARISON
Predictive odds
analytics
Zero Latency
Betting Analytics
GLOBAL, DISTRIBUTED INFRASTRUCTURE
Historical odds
deviations
B
U
S
BETTING LINES
SCORES
NEWS
HADOOP
Context:
Historical
Betting Data,
Odds, Outcomes
B
U
S
CACHE CACHE CACHE
Real-Time Analytics
CORRELATE
StreamBase LiveView
SOCIAL
Twitter
(#TomBradyBrokenLeg)
Twitter (#Boston)
Brady’s Stats
Actionable
Insights
Real-Time Social Media Analytics
Twitter (#NFL)
Something relevant happening?
Every minute counts!
Change Odds (automated or manually triggered):
• Stop live-betting for the currently running game?
• How many interceptions will the Quarterback throw?
• Will the Patriots win the Super Bowl?
• …
91
Real-Time Social Sentiment Analysis
Did you get the Key Message?
– Streaming Analytics processes Data while it is in Motion!
– Automation and Proactive Human Interaction are BOTH needed!
– Time to Market is the Key Requirement for most Use Cases!
Key Messages
Questions?
Kai Wähner
kwaehner@tibco.com
@KaiWaehner
www.kai-waehner.de
LinkedIn / Xing  Please connect!

Weitere ähnliche Inhalte

Was ist angesagt?

Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...Flink Forward
 
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksNotebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksMichelle Ufford
 
Lightning-fast Analytics for Workday transactional data
Lightning-fast Analytics for Workday transactional dataLightning-fast Analytics for Workday transactional data
Lightning-fast Analytics for Workday transactional dataPavel Hardak
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkDatabricks
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4jNeo4j
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingTill Rohrmann
 
Geospatial Advancements in Elasticsearch
Geospatial Advancements in ElasticsearchGeospatial Advancements in Elasticsearch
Geospatial Advancements in ElasticsearchElasticsearch
 
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-Takahiro Kubo
 
大規模言語モデルとChatGPT
大規模言語モデルとChatGPT大規模言語モデルとChatGPT
大規模言語モデルとChatGPTnlab_utokyo
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdfLars Albertsson
 
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata StorageDataWorks Summit/Hadoop Summit
 
Machine Learning with PyCaret
Machine Learning with PyCaretMachine Learning with PyCaret
Machine Learning with PyCaretDatabricks
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time dataAmazon Web Services
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonSimon Frid
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 

Was ist angesagt? (20)

Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
 
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooksNotebooks @ Netflix: From analytics to engineering with Jupyter notebooks
Notebooks @ Netflix: From analytics to engineering with Jupyter notebooks
 
Lightning-fast Analytics for Workday transactional data
Lightning-fast Analytics for Workday transactional dataLightning-fast Analytics for Workday transactional data
Lightning-fast Analytics for Workday transactional data
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache Spark
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4j
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 
Geospatial Advancements in Elasticsearch
Geospatial Advancements in ElasticsearchGeospatial Advancements in Elasticsearch
Geospatial Advancements in Elasticsearch
 
Sketchnotes
SketchnotesSketchnotes
Sketchnotes
 
Spark MLlibではじめるスケーラブルな機械学習
Spark MLlibではじめるスケーラブルな機械学習Spark MLlibではじめるスケーラブルな機械学習
Spark MLlibではじめるスケーラブルな機械学習
 
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-
文章を読み、理解する機能の獲得に向けて-Machine Comprehensionの研究動向-
 
大規模言語モデルとChatGPT
大規模言語モデルとChatGPT大規模言語モデルとChatGPT
大規模言語モデルとChatGPT
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdf
 
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storagehive HBase Metastore - Improving Hive with a Big Data Metadata Storage
hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
 
Machine Learning with PyCaret
Machine Learning with PyCaretMachine Learning with PyCaret
Machine Learning with PyCaret
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time data
 
The C10k Problem
The C10k ProblemThe C10k Problem
The C10k Problem
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in Python
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 

Ähnlich wie Streaming Analytics - Comparison of Open Source Frameworks and Products

TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...Nelson Petracek
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Big Data Spain
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Kai Wähner
 
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...Kai Wähner
 
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)How Analytic Solutions Drive Real-world Change (Interesting Use Cases)
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)TIBCO Jaspersoft
 
Data Science Case Studies: The Internet of Things: Implications for the Enter...
Data Science Case Studies: The Internet of Things: Implications for the Enter...Data Science Case Studies: The Internet of Things: Implications for the Enter...
Data Science Case Studies: The Internet of Things: Implications for the Enter...VMware Tanzu
 
apidays LIVE Australia - Events are Cool Again! by Nelson Petracek
apidays LIVE Australia -  Events are Cool Again! by Nelson Petracekapidays LIVE Australia -  Events are Cool Again! by Nelson Petracek
apidays LIVE Australia - Events are Cool Again! by Nelson Petracekapidays
 
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...Chief Analytics Officer Forum
 
The Business Justification for APM
The Business Justification for APMThe Business Justification for APM
The Business Justification for APMJonah Kowall
 
EVAM_Streaming Analytics_v1.5
EVAM_Streaming Analytics_v1.5EVAM_Streaming Analytics_v1.5
EVAM_Streaming Analytics_v1.5John Nikolaidis
 
Take Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessTake Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessInside Analysis
 
Real-Time Analytics for Industries
Real-Time Analytics for IndustriesReal-Time Analytics for Industries
Real-Time Analytics for IndustriesAvadhoot Patwardhan
 
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AI
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AIBig Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AI
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AIMatt Stubbs
 
Predictive Analytics and the Industrial Internet of Manufacturing Things with...
Predictive Analytics and the Industrial Internet of Manufacturing Things with...Predictive Analytics and the Industrial Internet of Manufacturing Things with...
Predictive Analytics and the Industrial Internet of Manufacturing Things with...gogo6
 
Mi intellithink c
Mi intellithink cMi intellithink c
Mi intellithink cethirajk1
 
Sensor Data Management & Analytics: Advanced Process Control
Sensor Data Management & Analytics: Advanced Process ControlSensor Data Management & Analytics: Advanced Process Control
Sensor Data Management & Analytics: Advanced Process ControlTIBCO_Software
 
Bitrock manufacturing
Bitrock manufacturing Bitrock manufacturing
Bitrock manufacturing cosma_r
 
Vitria IoT Analytics Platform
Vitria IoT Analytics PlatformVitria IoT Analytics Platform
Vitria IoT Analytics PlatformAbhishek Sood
 
How to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayHow to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayDevOps.com
 

Ähnlich wie Streaming Analytics - Comparison of Open Source Frameworks and Products (20)

TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
 
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...
Intelligent Business Process Management Suites (iBPMS) - The Next-Generation ...
 
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)How Analytic Solutions Drive Real-world Change (Interesting Use Cases)
How Analytic Solutions Drive Real-world Change (Interesting Use Cases)
 
Data Science Case Studies: The Internet of Things: Implications for the Enter...
Data Science Case Studies: The Internet of Things: Implications for the Enter...Data Science Case Studies: The Internet of Things: Implications for the Enter...
Data Science Case Studies: The Internet of Things: Implications for the Enter...
 
TIBCO OEM Partnership
TIBCO OEM PartnershipTIBCO OEM Partnership
TIBCO OEM Partnership
 
apidays LIVE Australia - Events are Cool Again! by Nelson Petracek
apidays LIVE Australia -  Events are Cool Again! by Nelson Petracekapidays LIVE Australia -  Events are Cool Again! by Nelson Petracek
apidays LIVE Australia - Events are Cool Again! by Nelson Petracek
 
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
TIBCO presentation at the Chief Analytics Officer Forum East Coast 2016 (#CAO...
 
The Business Justification for APM
The Business Justification for APMThe Business Justification for APM
The Business Justification for APM
 
EVAM_Streaming Analytics_v1.5
EVAM_Streaming Analytics_v1.5EVAM_Streaming Analytics_v1.5
EVAM_Streaming Analytics_v1.5
 
Take Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven BusinessTake Action: The New Reality of Data-Driven Business
Take Action: The New Reality of Data-Driven Business
 
Real-Time Analytics for Industries
Real-Time Analytics for IndustriesReal-Time Analytics for Industries
Real-Time Analytics for Industries
 
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AI
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AIBig Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AI
Big Data LDN 2018: ACCELERATING YOUR ANALYTICS JOURNEY WITH REAL-TIME AI
 
Predictive Analytics and the Industrial Internet of Manufacturing Things with...
Predictive Analytics and the Industrial Internet of Manufacturing Things with...Predictive Analytics and the Industrial Internet of Manufacturing Things with...
Predictive Analytics and the Industrial Internet of Manufacturing Things with...
 
Mi intellithink c
Mi intellithink cMi intellithink c
Mi intellithink c
 
Sensor Data Management & Analytics: Advanced Process Control
Sensor Data Management & Analytics: Advanced Process ControlSensor Data Management & Analytics: Advanced Process Control
Sensor Data Management & Analytics: Advanced Process Control
 
Bitrock manufacturing
Bitrock manufacturing Bitrock manufacturing
Bitrock manufacturing
 
Vitria IoT Analytics Platform
Vitria IoT Analytics PlatformVitria IoT Analytics Platform
Vitria IoT Analytics Platform
 
How to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring TodayHow to Handle the Realities of DevOps Monitoring Today
How to Handle the Realities of DevOps Monitoring Today
 

Mehr von Kai Wähner

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Kai Wähner
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?Kai Wähner
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKai Wähner
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaKai Wähner
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareKai Wähner
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Kai Wähner
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail IndustryKai Wähner
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Kai Wähner
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Kai Wähner
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
 

Mehr von Kai Wähner (20)

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 

Kürzlich hochgeladen

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Kürzlich hochgeladen (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

Streaming Analytics - Comparison of Open Source Frameworks and Products

  • 1. Fast Data and Streaming Analytics in the Era of Hadoop, R and Apache Spark Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de LinkedIn / Xing  Please connect!
  • 2. Key Messages – Streaming Analytics processes Data while it is in Motion! – Automation and Proactive Human Interaction are BOTH needed! – Time to Market is the Key Requirement for most Use Cases!
  • 3. Agenda – Real World Use Cases – Introduction to Stream Processing – Market Overview – Relation to other Big Data Components
  • 4. Agenda – Real World Use Cases – Introduction to Stream Processing – Market Overview – Relation to other Big Data Components
  • 5. © Copyright 2015 TIBCO Software Inc. Find and Act on “Critical Business Moments” “Business Moments” occur in Every Facet of Enterprise Operations, they drive competitive differentiation, customer satisfaction and business success! Optimize Pricing Identify fraud Make cross- sell offers Restock inventory Reroute trucks Deliver proactive customer service Predict equipment failure & fix proactively Anticipate and handle disruptions
  • 6. Operational Intelligence in Action © Copyright 2000-2015 TIBCO Software Inc. Actions by Operations Human decisions in real time informed by up to date information The Challenge: Empower operations staff to see and seize key business moments6 Automated action based on models of history combined with live context and business rules The Challenge: Create, understand, and deploy algorithms & rules that automate key business reactions Machine-to-Machine Automation
  • 7. 7 Success Story © Copyright 2000-2015 TIBCO Software Inc. Predictive Fault Management
  • 8. © Copyright 2000-2013 TIBCO Software Inc. “An outage on one well can cost $10M per hour. We have 20-100 outages per year.“ - Drilling operations VP, major oil company
  • 9. Data Monitoring • Motor temperature • Motor vibration • Current • Intake pressure • Intake temperature  Flow Electrical power cable Pump Intake Protector ESP motor Pump monitoring unit Electric Submersible Pumps (ESP) Predictive Analytics (Fault Management)
  • 10. Voltage Temperature Vibration Device history Temporal analytic: “If vibration spike is followed by temp spike then voltage spike [within 12 minutes] then flag high severity alert.” Predictive Analytics (Fault Management)
  • 11. Live Surveillance of Equipment © Copyright 2000-2014 TIBCO Software Inc. Continuous, live geospatial display of pump health and predictive signal breeches Alerts based on predictive signals Compare live readings and signals to historical average and means Continuous, live visualization of stats per 100’s of wells
  • 12. 12 Success Story © Copyright 2000-2015 TIBCO Software Inc. Smart Manufacturing
  • 13. IoT for High Tech Manufacturing Yield Optimization © Copyright 2000-2014 TIBCO Software Inc. • Before: Solar Panel Manufacturer with No Unified View of Manufacturing Process – Multiple manufacturing facilities, multiple processes – no way to compare production to yield expectations • Negative Consequences: Sub-Optimal Production – Operations are sub-optimal: high tolerance leads to better yield but less output; tight tolerance means high throughput but lower yield • Business Outcome: Higher Yield and More Runs – Process Manufacturing can run tighter tolerances and adjust them mid-run, predicting yield and adjusting to changing variables – Systems proactively re-route high-value customers around affected network areas in real-time • How We Do It: The TIBCO Fast Data Platform – IoT, Spotfire, StreamBase, and TERR for predictive modeling, high-speed network by TIBCO “For every 1% increase in shipped product, we make $11MM in profit. The demand is there, we just need to fulfill it.” - Head of Quality, Solar Panel Manufacturer
  • 14. High Tech Manufacturing Yield Optimization © Copyright 2000-2014 TIBCO Software Inc. Live streaming datamart analysis Continuous update and exploration of top yield metrics; take action
  • 15. High Tech Manufacturing Yield Optimization © Copyright 2000-2014 TIBCO Software Inc. Continuously computed real-time analytics on streams by StreamBase (thresholds, min / max, average) Analysis, alerts and triggers are based on streaming analytics
  • 16. High Tech Manufacturing Yield Optimization © Copyright 2000-2014 TIBCO Software Inc. Manufacturing operations staff drill down on any machine, any time, to inspect and fix problems before they impact yield
  • 17. 17 Success Story © Copyright 2000-2015 TIBCO Software Inc. Crowd Management
  • 18. 18 © Copyright 2000-2015 TIBCO Software Inc. Crowd Management (Stadium, Airport, Conference, …)
  • 19. Sacramento Kings  World’s Smartest Building © Copyright 2000-2015 TIBCO Software Inc.
  • 20. 20 © Copyright 2000-2014 TIBCO Software Inc.
  • 21. 21 © Copyright 2000-2014 TIBCO Software Inc.
  • 22. 22 © Copyright 2000-2014 TIBCO Software Inc.
  • 23. 23 © Copyright 2000-2014 TIBCO Software Inc.
  • 24. 24 © Copyright 2000-2014 TIBCO Software Inc.
  • 25. 25 © Copyright 2000-2014 TIBCO Software Inc.
  • 26. 26 © Copyright 2000-2014 TIBCO Software Inc.
  • 27. 27 Success Story © Copyright 2000-2015 TIBCO Software Inc. Retailing in the 21st Century
  • 28. Challenges of the 21st Century Retailer • Retailing and Retail Challenges are changing • Consumers expect better and integrated customer experience across all channels – Rapid adoption of mobile is a major driver – Customers want an integrated service across physical and digital channels… Simultaneously – Customer experience is becoming one of the main differentiators • Real-Time, one-on-one marketing can: – Improve a retailer’s relevance with the customer – Increase customer wallet-share • Key to being able to achieve this is: – Identifying and knowing your customer, in depth in real-time – Understanding the opportunity their past behavior reveals – Understanding your inventory (availability, velocity, pipeline)
  • 29. 29 © Copyright 2000-2014 TIBCO Software Inc. All Customers are different… Treat them that way… Capture – Engage – Expand - Monetize Patterns – Real time MOREPERSONAL MORE CONTEXT social CRM POS mobileweb e-mails
  • 30. National Retailer Loyalty 2015 © Copyright 2000-2015 TIBCO Software Inc. Top Benefits • Smart cross-selling based in iBeacons • Location-based services in real time • Leveraging partner offerings
  • 31. New Real-Time Fraud Detection Based on Deep Historical Insight Real-time fraud action can be taken based on historical insight – system not “whiplashed” by real-time events Streaming Analytics for Gift Card Fraud Protection
  • 32. 32 © Copyright 2000-2015 TIBCO Software Inc. Internet of Things Hybrid Stores Smart Tags Smart Shelves Smart Warehouse Faster Delivery Buy Online Pickup at Store Same Day Delivery Omni Channel 2.0 Store Fulfillment Social Media Predictive Shopping National Retailer Loyalty 2018
  • 33. 33 Great success stories, but … © Copyright 2000-2015 TIBCO Software Inc. … how to realize these use cases?
  • 34. 34 © Copyright 2000-2014 TIBCO Software Inc. Real Time Close Loop Model Develop model Deploy into Stream Processing flow Act Automatically monitor real-time transactions Automatically trigger action Analyze Analyze data via Data Discovery Uncover patterns, trends, correlations
  • 35. Agenda – Real World Use Cases – Introduction to Stream Processing – Market Overview – Relation to other Big Data Components
  • 36. Traditional Data Processing: Challenges • Introduces too much “decision latency” into the business. • Responses are delivered “after-the- fact”. • Maximum value of the identified situation is lost. – Cross-sell / up-sell opportunities are lost, impending equipment failure is missed, business processes are slow to respond and lack timely context. • Decisions are made on old and stale data. © Copyright 2000-2015 TIBCO Software Inc. Store Analyze Act
  • 37. The New Era: Fast Data Processing • Events are analyzed and processed in real-time as they arrive. • Decisions are timely, contextual, and based on fresh data. • Decision latency is eliminated, resulting in:  Superior Customer Experience  Operational Excellence  Instant Awareness and Timely Decisions © Copyright 2000-2015 TIBCO Software Inc. Act Analyze Store
  • 38. Streaming Analytics © Copyright 2000-2015 TIBCO Software Inc. time 1 2 3 4 5 6 7 8 9 Event Streams • Continuous Queries • Sliding Windows • Filter • Aggregation • Correlation • …
  • 39. 39 Act while data is in motion! Time Business Value Business Event Data Ready for Analysis Analysis Completed Decision Made $$$$ $$$ $$ $ Action Taken Stream Processing speeds action and increases business value by seizing opportunities while they matter
  • 40. Operational Analytics Operations Live UI SENSOR DATA TRANSACTIONS MESSAGE BUS MACHINE DATA SOCIAL DATA Streaming AnalyticsAction Aggregate Rules Stream Processing Analytics Correlate Live Datamart Continuous query processing Alerts Manual action, escalation HISTORICAL ANALYSIS MS Excel SAS Data Scientists Cleansed Data History Data Discovery R Enterprise Service Bus ERP MDM DB WMS SOA BIG DATA Data Warehouse, Hadoop InternalData IntegrationBus API Event Server Streaming Analytics Reference Architecture Spark
  • 41. Agenda – Real World Use Cases – Introduction to Stream Processing – Market Overview – Relation to other Big Data Components
  • 42. Operational Analytics Operations Live UI SENSOR DATA TRANSACTIONS MESSAGE BUS MACHINE DATA SOCIAL DATA Streaming AnalyticsAction Aggregate Rules Stream Processing Analytics Correlate Live Datamart Continuous query processing Alerts Manual action, escalation HISTORICAL ANALYSIS MS Excel SAS Data Scientists Cleansed Data History Data Discovery R Enterprise Service Bus ERP MDM DB WMS SOA BIG DATA Data Warehouse, Hadoop InternalData IntegrationBus API Event Server Streaming Analytics Reference Architecture Spark
  • 43. 44 Alternatives for Stream Processing Time to Market Streaming Frameworks Streaming Products Slow Fast Streaming Concepts IncludesIncludes © Copyright 2000-2015 TIBCO Software Inc.
  • 44. Concepts (Continuous Queries, Sliding Windows) Patterns (Counting, Sequencing, Tracking, Trends) Build everything by yourself!  45 What Streaming Alternative do you need? Time to Market Streaming Frameworks Streaming Products Slow Fast Streaming Concepts © Copyright 2000-2015 TIBCO Software Inc.
  • 45. 46 Usually not an option ... © Copyright 2000-2015 TIBCO Software Inc. … as there are a lot of Frameworks and Products available!
  • 46. 47 Alternatives © Copyright 2000-2015 TIBCO Software Inc. OPEN SOURCE CLOSED SOURCE PRODUCT FRAMEWORK (no complete list!)
  • 47. Library (Java, .NET, Python) Query Language (often similar to SQL) Scalability (horizontal and vertical, fail over) Connectivity (technologies, markets, products) Operators (Filter, Sort, Aggregate) 48 What Streaming Alternative do you need? Time to Market Streaming Frameworks Streaming Products Slow Fast Streaming Concepts © Copyright 2000-2015 TIBCO Software Inc.
  • 48. 49 Apache Storm © Copyright 2000-2015 TIBCO Software Inc. Spout Bolt
  • 49. 50 Apache Storm – Hello World © Copyright 2000-2015 TIBCO Software Inc. http://wpcertification.blogspot.ch/2014/02/helloworld-apache-storm-word-counter.html
  • 50. 51 Amazon Kinesis © Copyright 2000-2015 TIBCO Software Inc. https://aws.amazon.com/kinesis/ AWS S3 RedShift DynamoDB
  • 51. 52 Amazon Kinesis – Hello World © Copyright 2000-2015 TIBCO Software Inc.
  • 52. 53 Amazon Kinesis – The Cloud ... © Copyright 2000-2015 TIBCO Software Inc. … is easy to setup and scale! But you do not have full control  • Any data that is older than 24 hours is automatically deleted • Every Kinesis application consists of just one procedure, so you can’t use Kinesis to perform complex stream processing unless you connect multiple applications • Kinesis can only support a maximum size of 50KB for each data item http://diamondstream.com/amazon-kinesis-big-real-time-data-processing-solution/ (blog post from 2014, might be outdated, but shows that you do not have full control over a cloud service)
  • 53. 54 Apache Spark © Copyright 2000-2015 TIBCO Software Inc. General Data-processing Framework  However, focus is especially on Analytics (these days) http://fortune.com/2015/09/09/cloudera-spark-mapreduce/
  • 54. 55 Apache Spark – Focus on Analytics © Copyright 2000-2015 TIBCO Software Inc. http://aptuz.com/blog/is-apache-spark-going-to-replace-hadoop/ http://fortune.com/2015/09/09/cloudera-spark-mapreduce/ http://www.ebaytechblog.com/2014/05/28/using-spark-to-ignite-data-analytics/ http://www.forbes.com/sites/paulmiller/2015/06/15/ibm-backs-apache-spark-for-big-data-analytics/ “[IBM’s initiatives] include: • deepening the integration between Apache Spark and existing IBM products like the Watson Health Cloud; • open sourcing IBM’s existing SystemML machine learning technology;
  • 55. 56 Spark Streaming © Copyright 2000-2015 TIBCO Software Inc. Spark Streaming • is no real streaming solution • uses micro-batches • cannot process data in real-time (i.e. no ultra-low latency) • allows easy combination with other Spark components (SQL, Machine Learning, etc.)
  • 56. 57 Apache Spark – Hello World © Copyright 2000-2015 TIBCO Software Inc. Spark Streaming API Spark Core API
  • 57. 58 Alternatives © Copyright 2000-2015 TIBCO Software Inc. OPEN SOURCE CLOSED SOURCE PRODUCT FRAMEWORK (no complete list!)
  • 58. Visual IDE (Dev, Test, Debug) Simulation (Feed Testing, Test Generation) Live UI (monitoring, proactive interaction) Maturity (24h support, consulting) Integration (ootb integration: ESB, MDM, etc.) Library (Java, .NET, Python) Query Language (often similar to SQL) Scalability (horizontal and vertical, fail over) Connectivity (technologies, markets, products) Operators (Filter, Sort, Aggregate) What Streaming Alternative do you need? Time to Market Streaming Frameworks Streaming Products Slow Fast Streaming Concepts
  • 59. 60 IBM InfoSphere Streams © Copyright 2000-2015 TIBCO Software Inc.
  • 60. 61 IBM InfoSphere Streams © Copyright 2000-2015 TIBCO Software Inc. https://developer.ibm.com/streamsdev/wp-content/uploads/sites/15/2014/04/Streams-and-Storm-April-2014-Final.pdf
  • 61. TIBCO StreamBase • Performance: Latency, Throughput, Scalability – Multi-threaded and clustered server from version 1 – High throughput: Millions of messages, 100,000s of quotes, 10,000s of orders – Low-latency: microsecond latency for algo trading, pre-trade risk, market data • Take Advantage of High Performance Hardware – Multicore (12, 24, 32 core) large memory (10s of gigabytes) – 64-bit Linux, Windows, Solaris deployment – Hardware acceleration (GPU, Solace, Tervela) • Enterprise Deployment – High availability and fault tolerance – Distributed state management for large data sets – Management and monitoring tools – Security and entitlements Integration – Continuous deployment and QA Process Support StreamSQL compiler and static optimizer In process, in thread adapter architecture Visual parallelism and scaling ActiveSpaces integration for distributed shared state Data parallelism and dispatch StreamBase Server Innovations “The StreamBase engine is for real. We couldn’t break it, and believe me, I tried” SVP Development, Top 5 Broker Dealer
  • 62. StreamBase: The Power of Visual Programming © Copyright 2000-2015 TIBCO Software Inc. 1) Get ideas into market in days or weeks, not months or years 2) Unlock the power of IT and data scientists working together
  • 63. 64 © Copyright 2000-2013 TIBCO Software Inc. Code Anyone Can Read Limit Gift Card Activation Amounts at One Location Aggregate Capture card activations per location Sales too high! Log to any database No Fraud Sales too high?
  • 64. Visual Debugger Feed Simulation Unit Testing “StreamBase’s modeling tools are easy to use and will enable the exchange to quickly react to the ever changing needs of our customers.” Steve Goldman, Director of Enterprise Architecture StreamBase Development Studio
  • 65. Live Datamart Continuous Query Processor Alerts BusinessEvents FTL EMS ActiveSpaces Live Datamart BusinessWorks Social Media Data Market Data Sensor Data Historical Data ActiveSpaces Datagrid Enterprise dataMarket Data IoT Mobile Social LiveView Desktop Command & Control ACTION Continuous Query
  • 66. 67 Dynamic aggregation Live visualization Ad-hoc continuous query Alerts Action LiveView Desktop
  • 67. Live Datamart Clients and APIs • Rich Desktop Client – Drag&Drop, no coding • Rich Web Client – Drag&Drop, no coding • HTML5 and Javascript API – D3, jQuery, ExtJS, Google Charts, Bing, AngularJS • .NET API – For custom .NET development • Java API – For custom Java GUI development • Combination – Rich Client + HTML5 Extensions
  • 68. Predictive Sensor Analytics Live Demo (Stream Processing)
  • 69. 70 Spoilt for Choice – Which one to choose? © Copyright 2000-2015 TIBCO Software Inc. What are the key aspects?
  • 70. 71 What do you need (out-of-the-box)? © Copyright 2000-2015 TIBCO Software Inc. • A stream processing programming language for streaming analytics • Visual development and debugging instead of coding • Out-of-the-box connectivity to streaming and historical data sources • Performance (real-time vs. micro-batches) • Automated monitoring and alerts • Live UI for proactive human interaction • Maturity and proven deployments • Fault tolerance • Commercial support • Professional services and training
  • 71. 72 Spoilt for Choice – Framework or Product? © Copyright 2000-2015 TIBCO Software Inc. Does it make sense to combine both?
  • 72. Example: Apache Storm + TIBCO Live Datamart External Data Snapshot Results Continuous Query Processor Query TIBCO Live Datamart Continuous Alerting Active Tables Active Tables Continuous Updates Clients Message Bus Public Data Customer Data StreamBase Bolt StreamBase Spout Operational Data StreamBase Bolt and Spout connect Apache Storm to StreamBase to provide real-time analytics on operational data
  • 73. Agenda – Real World Use Cases – Introduction to Stream Processing – Market Overview – Relation to other Big Data Components
  • 74. Operational Analytics Operations Live UI SENSOR DATA TRANSACTIONS MESSAGE BUS MACHINE DATA SOCIAL DATA Streaming AnalyticsAction Aggregate Rules Stream Processing Analytics Correlate Live Datamart Continuous query processing Alerts Manual action, escalation HISTORICAL ANALYSIS MS Excel SAS Data Scientists Cleansed Data History Data Discovery R Enterprise Service Bus ERP MDM DB WMS SOA BIG DATA Data Warehouse, Hadoop InternalData IntegrationBus API Event Server Streaming Analytics Reference Architecture Spark
  • 75. 76 © Copyright 2000-2014 TIBCO Software Inc. Real Time Close Loop Model Develop model Deploy into Stream Processing flow Act Automatically monitor real-time transactions Automatically trigger action Analyze Analyze data via Data Discovery Uncover patterns, trends, correlations
  • 76. Real Time Close Loop: Understand – Anticipate – Act Big Data  store everything in Hadoop, DWH, NoSQL, etc.  even without structure  even if you do not need it today http://blogs.teradata.com/international/tag/hadoop/
  • 77. Real Time Close Loop: Understand – Anticipate – Act Data Discovery + Statistics + Machine Learning to find insights and patterns in historical data
  • 78. Real Time Close Loop: Understand – Anticipate – Act Streaming Analytics to operationalize insights and patterns in real time Stream Processing Hadoop Open Source R TERR SAS MATLAB In- database analytics Spark
  • 79. R with Revolution Analytics (now Microsoft) © Copyright 2000-2015 TIBCO Software Inc. Open Source GPL License http://www.revolutionanalytics.com/webinars/introducing-revolution-r-open-enhanced-open-source-r-distribution-revolution-analytics
  • 80. R with TIBCO Runtime for R (TERR) TIBCO TERR delivers production-grade R analytics to enterprises  Flexibility & analytic power of R language  Time-to-market agility  Enterprise-grade platform • A TIBCO licensed & supported product • Not GPL, not a repackaging of the Open source R engine • Deployment in TIBCO products and 3rd party applications (e.g. Hadoop) http://spotfire.tibco.com/discover-spotfire/what-does-spotfire-do/predictive-analytics/tibco-enterprise-runtime-for-r-terr
  • 81. Use Open Source R or Not? © Copyright 2000-2015 TIBCO Software Inc. http://www.forbes.com/sites/danwoods/2015/01/27/microsofts-revolution-analytics-acquisition-is-the-wrong-way-to-embrace-r/
  • 82. Spark MLlib © Copyright 2000-2015 TIBCO Software Inc. MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. It consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as lower-level optimization primitives and higher-level pipeline APIs. You can even combine Mllib module with R language
  • 83. Predictive Sensor Analytics Live Demo (Data Discovery, Statistics)
  • 84. © Copyright 2000-2013 TIBCO Software Inc. 80% of betting happens AFTER the game begins TODAY
  • 85. Case Study: Streaming Analytics for Betting • Situation: Today, 80% of Betting is Done After the Game Starts • It’s not your father’s bookie anymore! • Problem: How to Analyze Big Betting Data? • Thousands of concurrent games, constantly adjusting odds, dozens of betting networks – firms must correlate millions of events a day to find the best betting opportunities in real-time • Solution: TIBCO for Fast Data Architecture • TXOdds uses TIBCO to correlate, aggregate, and analyze large volumes of streaming betting data in real-time and publish innovative predictive betting analytics to their customers • Result: TXOdds First to Market with Innovative Zero Latency Betting Analytics • Innovative real-time analytics help players who can process electronic data in real-time the edge “With StreamBase, in two months we had our first betting analytics feed live, and we continually deploy new ideas and evolve our old ones.” - Alex Kozlenkov, VP of technology, TXOdds
  • 86. 87 “WHEN 5 KEY BOOKIES RAISE THE SAME ODDS IN A 5-SECOND WINDOW, BET LESS” ? ? ? ? ?? ???
  • 87. 88 “WHEN THE REAL-TIME ODDS ARE 5% GREATER THAN THE HISTORICAL SPREAD, INCREASE MY BET” ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
  • 88. Reference Architecture: Streaming Betting Analytics Event Processing MONITOR REAL-TIME ANALYTICS AGGREGATE HISTORICAL COMPARISON Predictive odds analytics Zero Latency Betting Analytics GLOBAL, DISTRIBUTED INFRASTRUCTURE Historical odds deviations B U S BETTING LINES SCORES NEWS HADOOP Context: Historical Betting Data, Odds, Outcomes B U S CACHE CACHE CACHE Real-Time Analytics CORRELATE StreamBase LiveView SOCIAL
  • 89. Twitter (#TomBradyBrokenLeg) Twitter (#Boston) Brady’s Stats Actionable Insights Real-Time Social Media Analytics Twitter (#NFL) Something relevant happening? Every minute counts! Change Odds (automated or manually triggered): • Stop live-betting for the currently running game? • How many interceptions will the Quarterback throw? • Will the Patriots win the Super Bowl? • …
  • 91. Did you get the Key Message?
  • 92. – Streaming Analytics processes Data while it is in Motion! – Automation and Proactive Human Interaction are BOTH needed! – Time to Market is the Key Requirement for most Use Cases! Key Messages