Data @Altocloud
Maciej Dabrowski, Chief Data Scientist
HUG 05/2017 Dublin
Modern Customer Engagement
Digital Messaging
Connect with customers by having live web chat or SMS conversations, sending targeted messages and offers.
• SMS
• Web Chat
• FB Messenger, Twitter DM
• Offers & Surveys
• Scheduled Callbacks
Customer Journey Analytics
Connect the dots with live analytics and AI to discover, analyse and predict customer behaviour patterns.
• Customer Context
• Behaviour Analytics
• Call Attribution to Campaigns
• Predictive Models
Real Time Communications
Connect in real time using voice, video and screensharing to engage with exceptional customer service.
• Voice Calls
• Video
• Screen-share
How Companies Benefit
• Engage at the best time
• Accelerate revenue conversion
• Improve Customer Experience
• Resolve issues quickly
• Reduce calls / workload
• Increase First Call Resolution
• Reduce bounce and abandons
25 people
1 dragon
2 locations
8 nationalities
having fun
…and growing!
Altocloud Holistic Customer Journey
[Architecture diagram. Labels: event streams (web events; call, IVR and ticket events), queues, event processors, enrichment, aggregation, segmentation, model evaluation, batch model learning, storage, outcome probabilities, real-time customer journey, actions creation, actions (Marketing Automation, CRM, Web Hook)]
Holistic view of your customers
Focus on real-time analytics
Make predictions on live visitors in real-time (in seconds) by:
Ingesting customer actions (events) and context
Building predictive models
Offering actions to customers based on real-time predictions
DISCLAIMER: NO LIPSTICK
This is not a sales pitch
Learn from mistakes of others
Show what works and what does not
Agenda
Engineering challenges
Data platform
AI platform and workloads
Engineering challenges
Product complexity
Communication platform
Data platform
Scale
Millions of events per day
Billions of events overall
Typically no stable schemas
Real-time aspects
Response in second(s)
Streaming nature
Reliability
24/7 availability
Services go down
Servers disappear
Altocloud Platform
[Diagram: the Altocloud data platform shown as part of the Altocloud platform. Labels: APIs, message queues, data processors, storage]
Tools that we use
Focus on open source (Apache)
Tools that we use - data
Why Spark
Fast for iterative algorithms (important for Machine Learning)
Good integration with other tools (Kafka and Cassandra)
One code base for streaming and batch processing
Easy to deploy and maintain
Growing ecosystem (SQL, MLlib, GraphX, …)
Large open-source community
Data source: Kafka
Pub-sub message broker
Fast: hundreds of MB/s on a single broker
Scalable: partitioned data streams
Durable: messages persisted and replicated
Distributed: Strong durability and fault-tolerance
Downside: requires ZooKeeper
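The "partitioned data streams" point can be illustrated with a small sketch. This is plain Python rather than a Kafka client: the partition count, the `visitor_id` key and the CRC32 hash are assumptions chosen for the illustration (Kafka clients use their own hash function). It only shows why keying by visitor keeps each visitor's events together and in order.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events):
    """Append each event to the log of the partition chosen by its key."""
    logs = defaultdict(list)
    for event in events:
        logs[partition_for(event["visitor_id"])].append(event)
    return logs


# Hypothetical events; field names are not taken from the slides.
events = [
    {"visitor_id": "v1", "type": "page_view"},
    {"visitor_id": "v2", "type": "search"},
    {"visitor_id": "v1", "type": "form_fill"},
]
logs = route(events)
```

Because the hash is deterministic, every event for `v1` lands in the same partition in arrival order, so a single consumer can rebuild that visitor's journey without coordinating with others.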
Scalable storage
Easy to set up
High availability - no master
Great performance
CQL: SQL-like querying
Great support and bug-free drivers from DataStax
Key: design your schema around queries
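A rough sketch of what "design your schema around queries" can mean in practice. It is plain Python, not CQL or a database driver, and the query, the `(org_id, day)` partition key and the timestamp clustering key are all hypothetical. The idea is to pick the one query you need first and then lay the data out so that the query is a single range scan.

```python
import bisect
from collections import defaultdict

# Hypothetical query: "events for one organisation on one day, in time order".
# Partition key: (org_id, day). Clustering key: timestamp.
events_by_org_day = defaultdict(list)


def write_event(org_id, day, ts, payload):
    """Insert a row so that each partition stays sorted by timestamp."""
    bisect.insort(events_by_org_day[(org_id, day)], (ts, payload))


def read_events(org_id, day, ts_from, ts_to):
    """Answer the query from one partition with a range scan over [ts_from, ts_to)."""
    rows = events_by_org_day[(org_id, day)]
    lo = bisect.bisect_left(rows, (ts_from,))
    hi = bisect.bisect_left(rows, (ts_to,))
    return rows[lo:hi]


write_event("acme", "2017-05-01", 30, "search")
write_event("acme", "2017-05-01", 10, "page_view")
write_event("acme", "2017-05-01", 20, "form_fill")
write_event("other", "2017-05-01", 15, "page_view")
window = read_events("acme", "2017-05-01", 10, 30)
```

A different query (say, all events of one type across organisations) would need its own layout, which is the trade-off the slide's key point hints at.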
Data
Demographic
device
location
organisation
contact details, and more
JSON
Events:
page views
form fills
searches
purchases
IVR / telephony
custom events
…
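Since events arrive as JSON and, as noted earlier, typically without stable schemas, ingestion has to tolerate unknown fields. A minimal sketch, assuming a small required envelope (`visitor_id`, `type`, `ts`); those names are hypothetical and not taken from the slides.

```python
import json

REQUIRED = ("visitor_id", "type", "ts")  # hypothetical minimal envelope


def parse_event(raw: str):
    """Return a normalised event, or None if the envelope is unusable."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(doc, dict) or any(k not in doc for k in REQUIRED):
        return None
    envelope = {k: doc[k] for k in REQUIRED}
    # Everything else is kept as free-form attributes instead of fixed columns.
    envelope["attributes"] = {k: v for k, v in doc.items() if k not in REQUIRED}
    return envelope


good = parse_event('{"visitor_id": "v1", "type": "page_view", "ts": 1, "url": "/pricing"}')
bad = parse_event("not json")
```

Keeping unknown fields under `attributes` lets custom events flow through without a schema change, at the cost of validating them later.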
Altocloud Data Platform
[Diagram labels: platform APIs, data APIs, data ingestion, message queues, data processors, query layer, storage layer]
Goals for Analytics platform
Easy to scale
As real-time as possible
Performance vs. flexibility
~80% of queries known upfront
Limited resources
Low latency
Analytics
[Diagram with numbered steps 1 to 7. Labels: APIs, message queues, data processors, query layer, storage layer, event storage, events, event metadata, dimensions, views, aggregations]
Summary
Materialise views for buckets every minute
Hourly roll-ups on raw events
Some numbers:
1bn+ events / day on 8 cores (Spark)
Sub-second query time
Lessons learned:
Know your data partitioning
Idempotent design is key!
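The summary above combines minute buckets, hourly roll-ups and idempotent writes. The following is a simplified plain-Python sketch of how those three ideas can fit together; it is not the Spark job from the talk, and the event ids and in-memory store are assumptions. Storing event ids per bucket, rather than incrementing a counter, is what makes replays harmless.

```python
from collections import defaultdict

# Bucket start (epoch seconds, rounded down to the minute) -> ids of events seen.
minute_buckets = defaultdict(set)


def ingest(event_id: str, ts: int) -> None:
    """Idempotent write: replaying the same event leaves the bucket unchanged."""
    minute_buckets[ts - ts % 60].add(event_id)


def minute_view() -> dict:
    """Materialised per-minute counts."""
    return {minute: len(ids) for minute, ids in minute_buckets.items()}


def hourly_rollup() -> dict:
    """Roll the stored events up into per-hour counts."""
    hours = defaultdict(int)
    for minute, ids in minute_buckets.items():
        hours[minute - minute % 3600] += len(ids)
    return dict(hours)


ingest("e1", 5)
ingest("e2", 65)
ingest("e1", 5)      # replay after a failure: no double counting
ingest("e3", 3700)
```

With a plain counter, the replayed `e1` would have been counted twice; with set semantics, a failed batch can simply be re-run.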
Outcome Probabilities
AI platform
Goal: predict probability of customer X achieving goal Y
Train Models per Outcome and Business (1000s)
Apply models to each event in real time (5 s)
Flexibility to add new data features on demand
Different dataset sizes forcing different algorithms
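To make "models per outcome and business" concrete, here is a toy sketch of a model registry keyed by `(business, outcome)` and scored per event. The slides do not name an algorithm; logistic regression, the weights and the feature names are all assumptions picked to keep the example short.

```python
import math

# Hypothetical registry: one set of weights per (business, outcome) pair.
MODELS = {
    ("acme", "purchase"): {"bias": -2.0, "weights": {"page_views": 0.3, "searches": 0.5}},
    ("acme", "callback"): {"bias": -1.0, "weights": {"form_fills": 1.2}},
}


def predict(business: str, outcome: str, features: dict) -> float:
    """Probability of the outcome under a logistic model; missing features count as 0."""
    model = MODELS[(business, outcome)]
    z = model["bias"] + sum(
        w * features.get(name, 0.0) for name, w in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def score_event(business: str, features: dict) -> dict:
    """Score one event against every outcome model registered for the business."""
    return {
        outcome: predict(b, outcome, features)
        for (b, outcome) in MODELS
        if b == business
    }


scores = score_event("acme", {"page_views": 4, "searches": 2})
```

Treating missing features as zero is one simple way to let new features be added on demand without retraining every existing model at once.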
Spark ML Pipeline
“Decode” Spark ML pipeline & stages
Combine feature & model pipelines per-outcome
“Compose” per-outcome pipeline in streaming
Apply different pipelines per event in streaming batch
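The composition idea above can be sketched without Spark. The code below is a plain-Python stand-in, not the Spark ML API: `compose`, the stage functions and the scoring rules are hypothetical. It shows a feature stage and a model stage chained per outcome, then a micro-batch grouped by outcome so that each group runs through its own pipeline.

```python
from collections import defaultdict


def compose(*stages):
    """Chain stages into one callable pipeline (output of one feeds the next)."""
    def pipeline(value):
        for stage in stages:
            value = stage(value)
        return value
    return pipeline


# Hypothetical feature stages.
def count_pages(event):
    return {"n_pages": len(event.get("pages", []))}


def has_form(event):
    return {"form": 1.0 if event.get("form_filled") else 0.0}


# One composed pipeline per outcome: feature stage, then a toy scoring rule.
PIPELINES = {
    "purchase": compose(count_pages, lambda f: min(1.0, 0.1 * f["n_pages"])),
    "callback": compose(has_form, lambda f: 0.8 * f["form"]),
}


def apply_to_batch(batch):
    """Group a micro-batch by outcome, then run each group through its own pipeline."""
    groups = defaultdict(list)
    for event in batch:
        groups[event["outcome"]].append(event)
    return {
        outcome: [PIPELINES[outcome](e) for e in events]
        for outcome, events in groups.items()
    }


batch = [
    {"outcome": "purchase", "pages": ["/a", "/b", "/c"]},
    {"outcome": "callback", "form_filled": True},
    {"outcome": "purchase", "pages": []},
]
result = apply_to_batch(batch)
```

Grouping first means each pipeline is looked up once per group rather than once per event, which matters when thousands of pipelines share one streaming batch.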
Key takeaways
Streaming over batch - highly reactive, low latency
Design for idempotent processing: things will always fail
Open source is great (most of the time) and cheap
macdab@altocloud.com
Complex algorithms behind simple UX

2017 05 Hadoop User Group Meetup Dublin


Editor's notes

  1. One code base for batch and streaming. Richer API (e.g. window functions). HDFS is the only requirement (and only if you want to do checkpointing).
  2. Fast: a single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.
     Scalable: Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers.
     Durable: messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
     Distributed by design: Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.