SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Rebuilding Web Tracking
Infrastructure for Scale
Stephen Oakley
Principal Engineer
Marketo
What is Marketo?
Page 3
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
What is Web Tracking at Marketo?
• Ingest web page visits and clicks on customer’s website
• Trigger campaigns in response to web activity
• Trigger real-time personalization of web experience
• Provide lead level analytics for known leads
• Provide aggregate analytics for all lead activity
• Typically known leads < 10 % of all traffic
Page 4
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Legacy Web Tracking Infrastructure
Page 5
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Legacy Web Tracking Infrastructure
Page 6
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Legacy Problems
• Throughput limitations – 2 million activities per day
• Processing delays can be on the order of hours
• Large customers cause web server brownouts
• Web reporting does not scale
• Fixed-sized clusters prohibit horizontal scaling
• Brittle infrastructure prevents feature development
The Vision
Page 8
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Orion Initiative
• Increase scale to support IoT for Marketers
• Support billions of marketing activities each day
• Trigger on activities in near real time (< 2 minute @ 99th %)
• Reduce operational costs
• Improve multitenancy and QoS
Requirements
Page 10
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Business Requirements
• 200 MM activities per customer per day
• Near real-time web activity processing (SLA of < 1
minute lag)
• Improve cost efficiency
• Improve flexibility for feature enhancements
Page 11
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Technical Requirements
• Multitenancy support with brownout protections
• Infrastructure must scale horizontally
• Decouple web processing from downstream processing
• Anonymous leads should cost next to nothing to track
Architecture & Design
Page 13
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Page 14
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Page 15
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Why Hbase + Phoenix?
• Horizontally scalable
• Leverages the Hadoop cluster for storage and scaling
• Provides secondary indices for query patterns through
Phoenix
• Natural integration with JDBC and Spark JDBC RDDs
Page 16
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Page 17
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Marketo Lambda Architecture
Spark Streaming
Consumers
Campaign Triggers
Solr Indexing
Solr
Spark Streaming Indexer
Ingestion Processor
Scala/Tomcat
HBase
Kafka
CRM Sync
Partner APIs
Other Marketing
Activities
Web Activity
RTP Activity
Mobile Activity
Marketo UI
Campaign Detail
Lead Detail
Other Clients
CRM Sync
Revenue Cycle Analylitcs
APIs
Email Report Loader
Web Activity Processor
Page 18
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Why Spark Streaming?
• Micro-batching provides sink-side efficiencies
• This is especially important with MySQL touchpoints
• Great integration with Kafka
• No strict real-time processing requirements
• Great community and industry adoption
Page 19
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Multitenancy
• One topic per customer (sized by volume)
• Traffic storms are isolated to a single customer
• Fairness/throttling is easy to control
• Spark Streaming job consumes from many topics
• Allows us to turn a customer off under error conditions
• See “Elastic Streaming” by Neelesh Shastry –
Spark Summit
Page 20
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Making Spark Streaming Performant
• Coalesce small partitions for the same customer
• Aggressive caching of metadata (mostly from MySQL)
• Heavily leverage Scala future composition for parallelism
• Persist RDDs that are used for multiple outputs
• e.g. write to Kafka and Activity Service
Page 21
Marketo Proprietary and Confidential | © Marketo, Inc.
10/31/2016
Making Anonymous Traffic Cheap
• High costs of web traffic in legacy system
• MySQL storage for all traffic
• Down streaming processing of all events (even anonymous)
• V2 only processes and stores known traffic in MySQL
• Defer triggering for anonymous data until promotion
• Rolled out to our highest volume customers
• Processing latencies < 30s (at 99.9th %)
• Allowed key customers to scale from ~2MM/day to > 20
MM/day
Impact and Results
• Mitigations of straggler effects on processing delays
• Adding sessionization for web reporting
• Scaling Kafka topics as customers increase volume
• Globally distributed ingestion for a single customer
Future Work
We’re Hiring!
Http://Marketo.Jobs
Q & A

Weitere ähnliche Inhalte

Was ist angesagt?

Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...DataWorks Summit/Hadoop Summit
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobileDataWorks Summit
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
Data Regions: Modernizing your company's data ecosystem
Data Regions: Modernizing your company's data ecosystemData Regions: Modernizing your company's data ecosystem
Data Regions: Modernizing your company's data ecosystemDataWorks Summit/Hadoop Summit
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...DataWorks Summit
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017alanfgates
 
Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10Timothy Spann
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...DataWorks Summit
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
Realizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamRealizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamDataWorks Summit
 
Curing the Kafka Blindness – Streams Messaging Manager
Curing the Kafka Blindness – Streams Messaging ManagerCuring the Kafka Blindness – Streams Messaging Manager
Curing the Kafka Blindness – Streams Messaging ManagerDataWorks Summit
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNDataWorks Summit
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkJoe Percivall
 
Fast SQL on Hadoop, Really?
Fast SQL on Hadoop, Really?Fast SQL on Hadoop, Really?
Fast SQL on Hadoop, Really?DataWorks Summit
 

Was ist angesagt? (20)

Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Data Regions: Modernizing your company's data ecosystem
Data Regions: Modernizing your company's data ecosystemData Regions: Modernizing your company's data ecosystem
Data Regions: Modernizing your company's data ecosystem
 
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Even...
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
 
Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10Introduction to Apache NiFi 1.10
Introduction to Apache NiFi 1.10
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFi
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ...
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
Realizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamRealizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache Beam
 
Curing the Kafka Blindness – Streams Messaging Manager
Curing the Kafka Blindness – Streams Messaging ManagerCuring the Kafka Blindness – Streams Messaging Manager
Curing the Kafka Blindness – Streams Messaging Manager
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARN
 
Apache deep learning 101
Apache deep learning 101Apache deep learning 101
Apache deep learning 101
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talk
 
Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?
 
Fast SQL on Hadoop, Really?
Fast SQL on Hadoop, Really?Fast SQL on Hadoop, Really?
Fast SQL on Hadoop, Really?
 

Andere mochten auch

From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...DataWorks Summit/Hadoop Summit
 
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...DataWorks Summit/Hadoop Summit
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...DataWorks Summit/Hadoop Summit
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEDataWorks Summit/Hadoop Summit
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANDataWorks Summit/Hadoop Summit
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...DataWorks Summit/Hadoop Summit
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceDataWorks Summit/Hadoop Summit
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...DataWorks Summit/Hadoop Summit
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...DataWorks Summit/Hadoop Summit
 

Andere mochten auch (20)

The truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on HadoopThe truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on Hadoop
 
Comparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBaseComparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBase
 
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
 
SEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile gamesSEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile games
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
 
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
 
#HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course #HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
 
Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
 

Ähnlich wie Rebuilding Web Tracking Infrastructure for Scale

Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Continuent
 
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...HostedbyConfluent
 
Adobe Ask the AEM Community Expert Session Oct 2016
Adobe Ask the AEM Community Expert Session Oct 2016Adobe Ask the AEM Community Expert Session Oct 2016
Adobe Ask the AEM Community Expert Session Oct 2016AdobeMarketingCloud
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Tugdual Grall
 
Accelerating a Path to Digital with a Cloud Data Strategy
Accelerating a Path to Digital with a Cloud Data StrategyAccelerating a Path to Digital with a Cloud Data Strategy
Accelerating a Path to Digital with a Cloud Data StrategyMongoDB
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopDataWorks Summit/Hadoop Summit
 
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptx
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptxSUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptx
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptxVasiliy Fomichev
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Big Data Spain
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Kai Wähner
 
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...DevOps.com
 
Understanding the Top Four Use Cases for IoT
Understanding the Top Four Use Cases for IoTUnderstanding the Top Four Use Cases for IoT
Understanding the Top Four Use Cases for IoTVoltDB
 
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...ServiceRocket
 
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...ServiceRocket
 
Acting on Real-time Behavior: How Peak Games Won Transactions
Acting on Real-time Behavior: How Peak Games Won TransactionsActing on Real-time Behavior: How Peak Games Won Transactions
Acting on Real-time Behavior: How Peak Games Won TransactionsVoltDB
 
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...Continuent
 
Digital Transformation in Market Data and Trading Platforms
Digital Transformation in Market Data and Trading PlatformsDigital Transformation in Market Data and Trading Platforms
Digital Transformation in Market Data and Trading PlatformsSolace
 
The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsAerospike, Inc.
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyMongoDB
 

Ähnlich wie Rebuilding Web Tracking Infrastructure for Scale (20)

Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
Marketing Automation at Scale: How Marketo Solved Key Data Management Challen...
 
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w...
 
Adobe Ask the AEM Community Expert Session Oct 2016
Adobe Ask the AEM Community Expert Session Oct 2016Adobe Ask the AEM Community Expert Session Oct 2016
Adobe Ask the AEM Community Expert Session Oct 2016
 
Big Kahuna
Big KahunaBig Kahuna
Big Kahuna
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Accelerating a Path to Digital with a Cloud Data Strategy
Accelerating a Path to Digital with a Cloud Data StrategyAccelerating a Path to Digital with a Cloud Data Strategy
Accelerating a Path to Digital with a Cloud Data Strategy
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
 
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptx
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptxSUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptx
SUGCON NA 2023 - Crafting Lightning Fast Composable Experiences.pptx
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
 
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
Streaming Analytics Comparison of Open Source Frameworks, Products, Cloud Ser...
 
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...
Deliver Unrivaled End-User Experience With Confidence - How Synthetic Monitor...
 
Understanding the Top Four Use Cases for IoT
Understanding the Top Four Use Cases for IoTUnderstanding the Top Four Use Cases for IoT
Understanding the Top Four Use Cases for IoT
 
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...
Using SQL and Salesforce Data to Build a Product Catalog (or Anything) in Con...
 
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...
Using SQL and Salesforce data to build a Product Catalog (or anything) in Con...
 
Acting on Real-time Behavior: How Peak Games Won Transactions
Acting on Real-time Behavior: How Peak Games Won TransactionsActing on Real-time Behavior: How Peak Games Won Transactions
Acting on Real-time Behavior: How Peak Games Won Transactions
 
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...
Webinar Slides: High Volume MySQL HA: SaaS Continuous Operations with Terabyt...
 
JAMStack
JAMStackJAMStack
JAMStack
 
Digital Transformation in Market Data and Trading Platforms
Digital Transformation in Market Data and Trading PlatformsDigital Transformation in Market Data and Trading Platforms
Digital Transformation in Market Data and Trading Platforms
 
The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial Informatics
 
Accelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data StrategyAccelerating a Path to Digital With a Cloud Data Strategy
Accelerating a Path to Digital With a Cloud Data Strategy
 

Mehr von DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

Mehr von DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Kürzlich hochgeladen

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Kürzlich hochgeladen (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Rebuilding Web Tracking Infrastructure for Scale

  • 1. Rebuilding Web Tracking Infrastructure for Scale Stephen Oakley Principal Engineer Marketo
  • 3. Page 3 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 What is Web Tracking at Marketo? • Ingest web page visits and clicks on customer’s website • Trigger campaigns in response to web activity • Trigger real-time personalization of web experience • Provide lead level analytics for known leads • Provide aggregate analytics for all lead activity • Typically known leads < 10 % of all traffic
  • 4. Page 4 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Legacy Web Tracking Infrastructure
  • 5. Page 5 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Legacy Web Tracking Infrastructure
  • 6. Page 6 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Legacy Problems • Throughput limitations – 2 million activities per day • Processing delays can be on the order of hours • Large customers cause web server brownouts • Web reporting does not scale • Fixed-sized clusters prohibit horizontal scaling • Brittle infrastructure prevents feature development
  • 8. Page 8 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Orion Initiative • Increase scale to support IoT for Marketers • Support billions of marketing activities each day • Trigger on activities in near real time (< 2 minute @ 99th %) • Reduce operational costs • Improve multitenancy and QoS
  • 10. Page 10 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Business Requirements • 200 MM activities per customer per day • Near real-time web activity processing (SLA of < 1 minute lag) • Improve cost efficiency • Improve flexibility for feature enhancements
  • 11. Page 11 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Technical Requirements • Multitenancy support with brownout protections • Infrastructure must scale horizontally • Decouple web processing from downstream processing • Anonymous leads should cost next to nothing to track
  • 13. Page 13 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016
  • 14. Page 14 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016
  • 15. Page 15 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Why Hbase + Phoenix? • Horizontally scalable • Leverages the Hadoop cluster for storage and scaling • Provides secondary indices for query patterns through Phoenix • Natural integration with JDBC and Spark JDBC RDDs
  • 16. Page 16 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016
  • 17. Page 17 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Marketo Lambda Architecture Spark Streaming Consumers Campaign Triggers Solr Indexing Solr Spark Streaming Indexer Ingestion Processor Scala/Tomcat HBase Kafka CRM Sync Partner APIs Other Marketing Activities Web Activity RTP Activity Mobile Activity Marketo UI Campaign Detail Lead Detail Other Clients CRM Sync Revenue Cycle Analylitcs APIs Email Report Loader Web Activity Processor
  • 18. Page 18 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Why Spark Streaming? • Micro-batching provides sink-side efficiencies • This is especially important with MySQL touchpoints • Great integration with Kafka • No strict real-time processing requirements • Great community and industry adoption
  • 19. Page 19 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Multitenancy • One topic per customer (sized by volume) • Traffic storms are isolated to a single customer • Fairness/throttling is easy to control • Spark Streaming job consumes from many topics • Allows us to turn a customer off under error conditions • See “Elastic Streaming” by Neelesh Shastry – Spark Summit
  • 20. Page 20 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Making Spark Streaming Performant • Coalesce small partitions for the same customer • Aggressive caching of metadata (mostly from MySQL) • Heavily leverage Scala future composition for parallelism • Persist RDDs that are used for multiple outputs • e.g. write to Kafka and Activity Service
  • 21. Page 21 Marketo Proprietary and Confidential | © Marketo, Inc. 10/31/2016 Making Anonymous Traffic Cheap • High costs of web traffic in legacy system • MySQL storage for all traffic • Down streaming processing of all events (even anonymous) • V2 only processes and stores known traffic in MySQL • Defer triggering for anonymous data until promotion
  • 22. • Rolled out to our highest volume customers • Processing latencies < 30s (at 99.9th %) • Allowed key customers to scale from ~2MM/day to > 20 MM/day Impact and Results
  • 23. • Mitigations of straggler effects on processing delays • Adding sessionization for web reporting • Scaling Kafka topics as customers increase volume • Globally distributed ingestion for a single customer Future Work

Hinweis der Redaktion

  1. Next phase was when we were ready to validate our newly built event ingestion system Marketo is a powerful Engagement Marketing Platform. There are several applications that make up the platform, such as ABM, Marketing analytics, predictive content, Digital Ads, and Marketing Automation. Marketing automation is what we are focusing on today. Marketing Automation enables the marketer to create, automate and measure marketing campaigns across channels. A simple example of an automated campaign or workflow is User visits your website and fills out a form Web tracking sees that they spent most their time looking at pages about spark streaming Automatically Send an email to the user to Invite them to a webinar on spark streaming services If they attend the webinar, register their interests in your crm and request a sales person contacts the user The campaigns can be complex and can reach out and track customers across channels like web, email, mobile, social
  2. Explain what a known vs anonymous lead is Known is targetable on other channels, anonymous is only web activity Speak to how the traffic patterns are heavily skewed toward anonymous given our customer base Talk about how anonymous converts to known. Aggregate analytics include company web report, landing page reports, etc.
  3. Speak to the pod Mention how there are many many pods
  4. An additional complication is the fact that the same two webservers also serve the mlm app, soap apis, and the landing pages
  5. Although the talk isn’t about the project…  we have a few slides up front to set the context around what we are working on If you have been near technology at all in the last couple of years you know that the world has become very connected.   The number of connected devices blows my mind.  It’s not just phones anymore…   Amazon dash buttons, coffee makers, propane tanks, garage doors.  These devices are sending 10’s of billions of activities and user interactions every day... Orion is our platfor Our marketing platform ingests the user interactions process them into relevant marketing touchpoints Its enables marketers to create marketing campaigns around these activities to build relationships with their customers Become the fabric for marketers Its been a great experience building this
  6. Here are a few of the requirements Near real time processing At least a 1 billion activities per customer per day. customer demands from increasing devices caused us to evaluate next get queueing and streaming... reduction in infrastructure COGS primarily from expensive enterprise class filers... reduction in people COGS by gained efficiency from reducing tech stack from using too many similar technologies ... Multitenant… of course Secure Customer isolation and improved resource management
  7. Arch requirement driven from biz requirement Improve utilization over the existing system Lots of customers in same infra, without starving Encryption from day 1 for safe data storage Aim for horz scalability Coming from standard 3 tier app Radically reduce processing latency Eliminate backlogs Brownout protection
  8. A few words about the architecture Main goal is to inject, process and store marketing events
  9. Details overview of Munchkin FE component Spray.io for MFE Frontend has the simple job of verifying subscription status, collecting metrics and persisting to kafka Use Avro to allow for schema evolution, strong typing and compact representation in topic Use Schema registry to allow the schema to be upgraded by the producer and them automatically picked up by the spark streaming component Use asynchronous API for kafka to allow high throughput.
  10. Details overview of LeadService component Spray.io for leadservice Hbase for Cookie and anonymous lead storage Salted table Key structure is subscription-cookie-leadid Secondary index for subscription-lead-createdat MySQL for known lead storage Masterdata for reverse ip information enrichments
  11. Overall view for the system Describe how there is a Kafka topic per subscription Spark streaming transforms the raw events into activities by Enriching with web page metadata from MySQL Lead and reverse IP enrichment from LeadService Persist activities to AS for storage and secondary processing (e.g. triggering and solr indexing) Push enriched web events to Kafka for the downstream Druid OLAP infrastructure.
  12. High level diagram of our event processor Enhanced Lambda Architecture Inbound activities written to Ingestion Processor Hbase and then Kafka High volume (e.g. web) activities First written to Kafka, then enriched Spark Streaming applications consume events from Kafka Solr Indexing Email Reports Campaign Processing HBase is used for simple historical queries, and is system of record
  13. While it is not “true” streaming, we exactly need this as an optimization
  14. Our multitenant Kafka framework coalesces small kafka paritions into large spark rdd partitions to improve batch utilization Several components of the event enrichment requires outbound RPC calls, using async clients and performing the calls in parallel and then composing the futures pipelines the computation and significantly improves throughput. Caching web assets and cookies for temporal locality Cache is > 60% of the executor memory Enriched events are written out to multiple sources and be selective about persisting RDDS prevents recomputing expensive transformations (multiple RPC calls or MySQL queries)
  15. Traditionally both anonymous and known data was treated equally in MLM. This is problematic because Anonymous volumes are usually 10-20x higher than known. Additionally there is very little intrinsic value in performing downstream processing on anonymous data since you cannot target anonymous leads for Campaigns. To improve this, in Munchkin V2 we only allow known traffic to flow to downstream processing. Anonymous data is passed for downstream processing when the lead converts to a known lead Via form fillout, api calls, etc.
  16. Reiterates my points on the last slide. I included in case you wanted to look at the slides later
  17. Give a quick overview of the activities architecture. Introduce Kafka in the presentation
  18. Spend more time on this – purple is our code , teal is spark standard # SubscriptionRegistry is using ZK # OffsetManager is a library, uses low level kafka consumer API # Provisioning framework – Sirius, a new subscription provisioned to registry via oozie