•  SaaS Company – since 2008
•  Social media analytics: track and measure the activity of
brands and personalities, providing information for
market research & brand comparison
•  Multi-language technology (English, Portuguese
and Spanish)
•  Leader in Latin America, with operations in 5
countries, customers in LatAm and US
•  1 of 34 Twitter Certified Program partners worldwide
Our customers
Ranking  Brand 1                     Brand 2                     Brand 3
         Q2            Q3            Q2            Q3            Q2            Q3
1°       Flavor        Breakfast     Flavor        Flavor        Advertising   Flavor
2°       Healthy       Flavor        Packaging     Brand I love  Flavor        Breakfast
3°       Components    Components    Healthy       Packaging     Healthy       Healthy
4°       Advertising   Healthy       Components    Addiction     Components    Advertising
5°       Enquiries     Desire        Prices        Consumption   Prices        Components
TOTAL    1,401         8,189         463           5,519         1,081         2,445
Share of Topics
Which conversations are my brand and my competitors driving?
smx.io/reinvent #reinvent
Challenges
Challenges: Variety
• Different data sources
• Different APIs
• SLA
• Method (Pull or Push)
• Rate-limit and backoff strategy
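A minimal sketch of a backoff strategy for a rate-limited pull API, assuming exponential backoff with full jitter. The `RateLimited` exception and the `fetch` callable are hypothetical stand-ins for whatever client each data source needs; the talk does not say which strategy was used.

```python
import random
import time


class RateLimited(Exception):
    """Raised by a (hypothetical) API client when the rate limit is hit."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fetch(), retrying with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time in [0, ceiling] so that many
            # crawlers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the retry logic can be exercised without actually waiting.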
Challenges: Velocity
•  Updates every second
•  Top users, top hashtags each
minute
•  Post-event analysis runs as a batch
over the complete
dataset
•  Spikes of 20,000+ tweets per
minute
[Chart: spikes annotated "Last TV Debate" and "Results Announced"]
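The "top hashtags each minute" requirement can be sketched as a tumbling one-minute window over a stream of tweets. This is an illustrative, in-memory version only; the `(timestamp_seconds, text)` input shape and the whitespace tokenization are assumptions, not the pipeline described in the talk.

```python
from collections import Counter, defaultdict


def top_hashtags_per_minute(tweets, k=3):
    """Group (timestamp_seconds, text) pairs into one-minute windows and
    return {minute_start: [(hashtag, count), ...]} with the k most common."""
    windows = defaultdict(Counter)
    for ts, text in tweets:
        minute_start = int(ts) // 60 * 60  # start of the tumbling window
        for token in text.split():
            if token.startswith("#") and len(token) > 1:
                windows[minute_start][token.lower()] += 1
    return {minute: counts.most_common(k)
            for minute, counts in sorted(windows.items())}


sample = [
    (0, "#reinvent rocks"),
    (10, "hello #ReInvent #aws"),
    (59, "#aws"),
    (60, "#spark"),
]
print(top_hashtags_per_minute(sample))
```

At 20,000+ tweets per minute the exact counters would likely be replaced by a distributed or approximate structure, but the windowing idea stays the same.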
Challenges: Meaning
• Disambiguation
• Data Enrichment
– Demographics
– Sentiment
– Influencers
• Human Analysis
[Examples shown: PAN; Orange (telecom); Oi (telecom) vs. "Hi!"]
Challenges: Alert & Report
• Clear & understandable UI
• Slice-and-dice for business users (not BI experts)
• Real-time alerts for anomalies
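One simple way to raise real-time alerts on volume anomalies is a rolling z-score over recent per-minute counts. The sketch below is an assumption about how such an alert could work, not the method used in the product; the window size, threshold, and the floor of 1.0 on the standard deviation are all illustrative choices.

```python
from collections import deque
from statistics import mean, pstdev


class SpikeDetector:
    """Flags a count as anomalous when it sits far above the recent mean."""

    def __init__(self, window=30, threshold=3.0, min_points=5):
        self.history = deque(maxlen=window)  # most recent per-minute counts
        self.threshold = threshold
        self.min_points = min_points

    def observe(self, count):
        """Return True if count is anomalously high versus recent history."""
        is_anomaly = False
        if len(self.history) >= self.min_points:
            mu = mean(self.history)
            # Floor sigma at 1.0 so a perfectly flat history still works.
            sigma = max(pstdev(self.history), 1.0)
            is_anomaly = (count - mu) / sigma > self.threshold
        self.history.append(count)
        return is_anomaly


detector = SpikeDetector(window=10)
counts = [100, 110, 90, 105, 95, 100, 20000, 102]
print([detector.observe(c) for c in counts])
```

Only upward deviations trigger an alert here; a drop in volume (for example, a stalled crawler) would need a separate check.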
Architecture Evolution
Drivers for Architecture Evolution
•  More customers, bigger customers
•  Add new features
•  Keep costs under control
Architecture Evolution
[Chart: active customers by architecture iteration (#1 to #4)]
Architecture – 1st iteration
What we needed:
• Complete data isolation
• Trying different solutions/offerings
Architecture – 1st iteration
What we did:
• All-in-one approach
• Multi instance architecture
• Simple vertical scalability
• MySQL performance tuning
Architecture – 1st iteration
What we've learned:
• Multi-instance is harder to administer, but
minimizes the impact of instability on customers
• Vertical scalability: poor resource management
• MySQL schema changes translate into downtime
Architecture – 2nd iteration
What we needed:
• Separation of Responsibilities (crawling,
processing)
• Horizontal Scalability
• Fast Provisioning
• Cost reduction
Architecture – 2nd iteration
What we changed:
• Migrated to AWS
• RabbitMQ (Single Node)
• Replaced MySQL with RDS
• CloudFormation
• Auto Scaling Groups
Architecture – 2nd iteration
What we've learned:
• PIOPS →
• Tuning the auto scaling policies can be hard
• CloudFormation: great for migration, not enough
for daily ops
Architecture – 3rd iteration
What we needed:
• Deliver new features (near real time, more complex analytics)
• Scale Fast
• Be resilient against failure
• Add and improve data sources
• Keep costs under control (always)
Architecture – 3rd iteration
What we changed:
• Apache Storm
• RabbitMQ HA
• EMR (Hadoop/Hive)
• CloudFormation + Chef
• Glacier + S3 lifecycle policies
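As an illustration of the Glacier + S3 lifecycle item, the sketch below builds a lifecycle configuration that moves objects under a prefix to Glacier after a number of days. The prefix `raw/tweets/`, the 30-day threshold, and the helper name `archive_rule` are hypothetical; the dictionary follows the shape of an S3 lifecycle configuration and could, for example, be passed as `LifecycleConfiguration` to boto3's `put_bucket_lifecycle_configuration`.

```python
import json


def archive_rule(prefix, glacier_after_days, expire_after_days=None):
    """Build one lifecycle rule that transitions a prefix to Glacier."""
    rule = {
        "ID": f"archive-{prefix.strip('/').replace('/', '-')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }
    if expire_after_days is not None:
        # Optional: delete objects entirely after this many days.
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


# Hypothetical values: archive raw tweets after 30 days, never expire them.
lifecycle = {"Rules": [archive_rule("raw/tweets/", 30)]}
print(json.dumps(lifecycle, indent=2))
```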
Architecture – 3rd iteration
What we've learned:
• Spot instances + Reserved instances
• Hive = SQL → SQL scripts are hard to test
• Bulk upserts on RDS can be expensive (PIOPS)
• DynamoDB is great, but expensive (for our use-case)
Dashboard
Architecture – 4th iteration
What we needed:
• Monitor millions of social media profiles
• Make data accessible (exploration, PoC)
• Improve UI response times
• Test our data pipelines
• Reprocess faster
Architecture – 4th iteration
What we changed:
• Cassandra (DSE)
• MongoDB MMS
• Apache Spark
Architecture – 4th iteration
What we've learned:
•  Leverage the AWS ecosystem
•  DataStax AMI + OpsCenter integration
•  MongoDB MMS: automation magic!
•  Apache Spark unit testing + EC2 launch scripts
•  EMR doesn’t have the latest stable versions
Architecture Evolution
[Chart: costs and active customers by architecture iteration (#1 to #4)]
Lessons Learned
•  Automate from day 1 (CloudFormation + Chef)
•  Monitor system activity and understand your data
patterns, e.g. with Logstash (ELK)
•  Always have a Source of Truth (S3 + Glacier)
•  Make your Source of Truth searchable
Lessons Learned (II)
• Approximation is a good thing: HLL (HyperLogLog), CMS (Count-Min Sketch), Bloom filters
• Write your pipelines with reprocessing
needs in mind
•  Avoid framework explosion at all costs
• The AWS ecosystem allows rapid prototyping
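To make the approximation point concrete, here is a small Count-Min Sketch (the "CMS" above) that estimates how often each item has been seen using fixed memory. It is a teaching sketch, not the implementation used in the talk: the width, depth, and the choice of salted BLAKE2b as the hash family are assumptions. Estimates can overcount because of hash collisions but never undercount.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency estimator: may overcount, never undercounts."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, item):
        # One independent hash per row, derived by salting BLAKE2b with the row.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._indexes(item):
            self.table[row][col] += count

    def estimate(self, item):
        # The minimum across rows is the least collision-inflated counter.
        return min(self.table[row][col] for row, col in self._indexes(item))


cms = CountMinSketch()
for tag in ["#reinvent", "#aws", "#reinvent", "#reinvent"]:
    cms.add(tag)
print(cms.estimate("#reinvent"), cms.estimate("#aws"))
```

Memory use is `width * depth` counters regardless of how many distinct hashtags arrive, which is the trade-off that makes this attractive for high-volume streams.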
Socialmetrix NextGen
2015
Architecture Evolution
[Chart: active customers by architecture iteration (#1 to #4)]
Architecture NextGen
•  Reduce moving parts
•  Apache Spark as central processing framework
–  Realtime (Micro-batch)
–  Batch-processing
•  Kafka (Message Broker)
•  Cassandra (Time-series storage)
•  ElasticSearch (Content Indexer)
To infinity … and beyond!
Architecture Evolution
[Chart: active customers by architecture iteration (#1 to #4, NextGen)]
Gustavo Arjones, CTO
@arjones | gustavo@socialmetrix.com
Sebastian Montini, Solutions Architect
@sebamontini | sebastian@socialmetrix.com
Let’s talk at Venetian-Titian Hallway
Feedback and Q&A
Please give us your feedback on this
presentation
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Join the conversation on Twitter with #reinvent
ARC202
Thank you!

 
MongoDB, RabbitMQ y Applicaciones en Nube
MongoDB, RabbitMQ y Applicaciones en NubeMongoDB, RabbitMQ y Applicaciones en Nube
MongoDB, RabbitMQ y Applicaciones en Nube
 

Kürzlich hochgeladen

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Kürzlich hochgeladen (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

AWS re:Invent 2014 | (ARC202) Real-World Real-Time Analytics

  • 1. Socialmetrix
      • SaaS company, since 2008
      • Social media analytics: track and measure the activity of brands and personalities, providing information for market research and brand comparison
      • Multi-language technology (English, Portuguese and Spanish)
      • Leader in Latin America, with operations in 5 countries and customers in LatAm and the US
      • 1 of 34 in the Twitter Certified Program worldwide
  • 5. Share of Topics
      Which conversations are my brand and my competitors driving?

      Ranking   Brand 1                     Brand 2                     Brand 3
                Q2            Q3            Q2            Q3            Q2            Q3
      1°        Flavor        Breakfast     Flavor        Flavor        Advertising   Flavor
      2°        Healthy       Flavor        Packaging     Brand I love  Flavor        Breakfast
      3°        Components    Components    Healthy       Packaging     Healthy       Healthy
      4°        Advertising   Healthy       Components    Addiction     Components    Advertising
      5°        Enquiries     Desire        Prices        Consumption   Prices        Components
      TOTAL     1.401         8.189         463           5.519         1.081         2.445
  • 8. Challenges: Variety
      • Different data sources
      • Different APIs
      • SLA
      • Method (pull or push)
      • Rate limits, backoff strategy
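The "Rate-Limit, Backoff strategy" bullet can be illustrated with a minimal sketch of exponential backoff with jitter. This is not the code used by the speakers: `RateLimited`, `fetch_with_backoff`, and all parameter names and defaults are hypothetical, and a real crawler would map each source's own rate-limit response onto the exception.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by an API client when a source rate-limits us."""


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, rng=random.random):
    """Call fetch() and retry on RateLimited, waiting longer after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            # The cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: sleep a random fraction of the cap so that many
            # crawlers do not all retry at the same instant.
            sleep(cap * rng())
```

Passing `sleep` and `rng` as parameters is only a convenience for testing the retry schedule without real waiting.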
  • 9. Challenges: Velocity
      • Updates every second
      • Top users and top hashtags each minute
      • Post-event analyses run as batch jobs over the complete dataset
      • Spikes of 20,000+ tweets per minute
      [Chart annotations: Last TV Debate, Results Announced]
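As a rough illustration of "top hashtags each minute", the sketch below groups tweets into one-minute tumbling windows and keeps the most frequent hashtags per window. It works over an in-memory list, so it shows the logic only, not a streaming implementation; the function name, the `(epoch_seconds, text)` input shape, and the hashtag regular expression are assumptions.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def top_hashtags_per_minute(tweets, k=3):
    """tweets: iterable of (epoch_seconds, text).

    Returns {minute_start: [(hashtag, count), ...]} with at most k entries
    per one-minute window, most frequent first.
    """
    windows = defaultdict(Counter)
    for ts, text in tweets:
        minute = int(ts) - int(ts) % 60  # start of the tumbling window
        windows[minute].update(tag.lower() for tag in HASHTAG.findall(text))
    return {minute: counts.most_common(k)
            for minute, counts in sorted(windows.items())}


sample = [(0, "#reinvent rocks"), (10, "#Reinvent #aws"),
          (59, "#aws #reinvent"), (60, "#spark")]
print(top_hashtags_per_minute(sample))
```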
  • 11. Challenges: Alert & Report
      • Clear and understandable UI
      • Slice-and-dice for business users (not BI experts)
      • Real-time alerts for anomalies
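One simple way to raise "real-time alerts for anomalies" is to compare each new per-minute count with a rolling window of recent counts. The sketch below uses a z-score threshold; the slides do not say which method the speakers used, so the class name, the window size, the minimum history of 5 samples, and the threshold of 3 standard deviations are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class SpikeDetector:
    """Flags a count that is far above the recent average (z-score test)."""

    def __init__(self, window=30, threshold=3.0, min_history=5):
        self.history = deque(maxlen=window)  # most recent counts only
        self.threshold = threshold
        self.min_history = min_history

    def observe(self, count):
        """Return True if count is a spike versus the window, then record it."""
        alert = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            # A flat history (sigma == 0) never alerts in this sketch.
            alert = sigma > 0 and (count - mu) / sigma > self.threshold
        self.history.append(count)
        return alert


detector = SpikeDetector()
for per_minute in [100, 110, 90, 105, 95, 100, 20000]:
    if detector.observe(per_minute):
        print("spike:", per_minute)
```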
  • 13. Drivers for Architecture Evolution
      • More customers, bigger customers
      • Add new features
      • Keep costs under control
  • 15. Architecture – 1st iteration
      What we needed:
      • Complete data isolation
      • Trying different solutions/offerings
  • 16. Architecture – 1st iteration
      What we did:
      • All-in-one approach
      • Multi-instance architecture
      • Simple vertical scalability
      • MySQL performance tuning
  • 17. Architecture – 1st iteration
      What we've learned:
      • Multi-instance is harder to administer, but minimizes the impact of instability on customers
      • Vertical scalability: poor resource management
      • MySQL schema changes translate into downtime
  • 18. Architecture – 2nd iteration
      What we needed:
      • Separation of responsibilities (crawling, processing)
      • Horizontal scalability
      • Fast provisioning
      • Cost reduction
  • 19. Architecture – 2nd iteration
      What we changed:
      • Migrated to AWS
      • RabbitMQ (single node)
      • Replaced MySQL with RDS
      • CloudFormation
      • Auto Scaling groups
  • 20. Architecture – 2nd iteration
      What we've learned:
      • PIOPS →
      • Tuning the auto scaling policies can be hard
      • CloudFormation: great for migration, not enough for daily ops
  • 21. Architecture – 3rd iteration
      What we needed:
      • Deliver new features (NRT, more complex analytics)
      • Scale fast
      • Be resilient against failure
      • Add and improve data sources
      • Keep costs under control (always)
  • 22. Architecture – 3rd iteration
      What we changed:
      • Apache Storm
      • RabbitMQ HA
      • EMR (Hadoop/Hive)
      • CloudFormation + Chef
      • Glacier + S3 lifecycle policies
  • 23. Architecture – 3rd iteration
      What we've learned:
      • Spot instances + Reserved instances
      • Hive = SQL → SQL scripts are hard to test
      • Bulk upserts on RDS can be expensive (PIOPS)
      • DynamoDB is great, but expensive (for our use case)
  • 25. Architecture – 4th iteration
      What we needed:
      • Monitor millions of social media profiles
      • Make data accessible (exploration, PoC)
      • Improve UI response times
      • Test our data pipelines
      • Reprocess data (faster)
  • 26. Architecture – 4th iteration
      What we changed:
      • Cassandra (DSE)
      • MongoDB MMS
      • Apache Spark
  • 27. Architecture – 4th iteration
      What we've learned:
      • Leverage the AWS ecosystem
      • DataStax AMI + OpsCenter integration
      • MongoDB MMS: automation magic!
      • Apache Spark unit testing + EC2 launch scripts
      • EMR doesn't have the latest stable versions
  • 31. Lessons Learned
      • Automate from day 1 (CloudFormation + Chef)
      • Monitor system activity and understand your data patterns, e.g. with Logstash (ELK)
      • Always have a Source of Truth (S3 + Glacier)
      • Make your Source of Truth searchable
  • 32. Lessons Learned (II)
      • Approximation is a good thing: HLL (HyperLogLog), CMS (Count-Min Sketch), Bloom filters
      • Write your pipelines with reprocessing needs in mind
      • Avoid framework explosion at all costs
      • The AWS ecosystem allows rapid prototyping
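To make "approximation is a good thing" concrete, here is a minimal Count-Min Sketch (the "CMS" in the bullet above), which estimates item frequencies in fixed memory. It can overcount when items collide but never undercounts. This is an illustrative sketch, not the speakers' implementation; the class name, the default `width` and `depth`, and the use of salted BLAKE2b as the hash family are assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency estimator: may overestimate, never underestimates."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        for row in range(self.depth):
            digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8,
                                     salt=row.to_bytes(8, "little")).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the smallest cell is the best guess.
        return min(self.table[row][col] for row, col in self._cells(item))


cms = CountMinSketch()
for _ in range(1000):
    cms.add("#reinvent")
cms.add("#aws", 5)
print(cms.estimate("#reinvent"), cms.estimate("#aws"))
```

Memory use is `width * depth` counters regardless of how many distinct hashtags are seen, which is the trade-off the bullet points at: a small, bounded error in exchange for predictable cost.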
  • 35. Architecture NextGen
      • Reduce moving parts
      • Apache Spark as central processing framework
          – Real time (micro-batch)
          – Batch processing
      • Kafka (message broker)
      • Cassandra (time-series storage)
      • ElasticSearch (content indexer)
  • 36. To infinity … and beyond!
      [Chart: Architecture Evolution, active customers (axis 0 to 120) for iterations #1 to #4 and NextGen]
  • 37. Feedback and Q&A
      • Gustavo Arjones, CTO: @arjones | gustavo@socialmetrix.com
      • Sebastian Montini, Solutions Architect: @sebamontini | sebastian@socialmetrix.com
      • Let's talk at the Venetian-Titian Hallway
  • 38. Thank you!
      • Please give us your feedback on this presentation (ARC202)
      • Join the conversation on Twitter with #reinvent
      © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.