22. • Rolled out to our highest volume customers
• Processing latencies < 30s (at 99.9th %)
• Allowed key customers to scale from ~2MM/day to >20MM/day
Impact and Results
23. • Mitigating straggler effects on processing delays
• Adding sessionization for web reporting
• Scaling Kafka topics as customers increase volume
• Globally distributed ingestion for a single customer
Future Work
The next phase was validating our newly built event ingestion system
Marketo is a powerful Engagement Marketing Platform. Several applications make up the platform, such as ABM, marketing analytics, predictive content, digital ads, and marketing automation. Marketing automation is our focus today: it enables the marketer to create, automate, and measure marketing campaigns across channels. A simple example of an automated campaign or workflow is:
A user visits your website and fills out a form
Web tracking sees that they spent most of their time looking at pages about Spark Streaming
Automatically send the user an email inviting them to a webinar on Spark Streaming services
If they attend the webinar, register their interest in your CRM and request that a salesperson contact the user
The campaigns can be complex, reaching out to and tracking customers across channels like web, email, mobile, and social
Explain what a known vs anonymous lead is
Known is targetable on other channels, anonymous is only web activity
Speak to how the traffic patterns are heavily skewed toward anonymous given our customer base
Talk about how anonymous converts to known.
Aggregate analytics include company web report, landing page reports, etc.
Speak to the pod
Mention how there are many many pods
An additional complication is the fact that the same two webservers also serve the MLM app, the SOAP APIs, and the landing pages
Although the talk isn’t about the project… we have a few slides up front to set the context around what we are working on
If you have been near technology at all in the last couple of years you know that the world has become very connected.
The number of connected devices blows my mind. It’s not just phones anymore…
Amazon dash buttons, coffee makers, propane tanks, garage doors. These devices are sending 10’s of billions of activities and user interactions every day...
Orion is our platform
Our marketing platform ingests user interactions and processes them into relevant marketing touchpoints
It enables marketers to create marketing campaigns around these activities to build relationships with their customers
Become the fabric for marketers
It’s been a great experience building this
Here are a few of the requirements
Near real time processing
At least 1 billion activities per customer per day.
Customer demands from a growing number of devices caused us to evaluate next-gen queueing and streaming...
Reduction in infrastructure COGS, primarily from expensive enterprise-class filers...
Reduction in people COGS through efficiency gained by consolidating a tech stack that had too many similar technologies...
Multitenant… of course
Secure
Customer isolation and improved resource management
Architecture requirements driven by business requirements
Improve utilization over the existing system
Lots of customers in same infra, without starving
Encryption from day 1 for safe data storage
Aim for horizontal scalability
Coming from standard 3 tier app
Radically reduce processing latency
Eliminate backlogs
Brownout protection
A few words about the architecture
Main goal is to ingest, process and store marketing events
Details overview of Munchkin FE component
Spray.io for MFE
Frontend has the simple job of verifying subscription status, collecting metrics, and persisting to Kafka
Use Avro to allow for schema evolution, strong typing and compact representation in topic
Use Schema Registry to allow the schema to be upgraded by the producer and then automatically picked up by the Spark Streaming component
Use asynchronous API for kafka to allow high throughput.
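As a rough illustration of that asynchronous produce path, here is a minimal sketch. The producer class, topic name, and callbacks are hypothetical: an in-memory buffer stands in for a real Kafka client, and JSON stands in for Avro to keep the sketch dependency-free.

```python
import json
from typing import Callable

# Hypothetical in-memory stand-in for an async Kafka producer:
# send() buffers the record and returns immediately; delivery
# callbacks fire when flush() drains the buffer, mimicking the
# high-throughput asynchronous API described above.
class AsyncProducerSketch:
    def __init__(self):
        self._buffer = []

    def send(self, topic: str, value: dict,
             on_delivery: Callable[[str], None]) -> None:
        # Serialize up front (the real system uses Avro).
        payload = json.dumps(value).encode("utf-8")
        self._buffer.append((topic, payload, on_delivery))

    def flush(self) -> int:
        delivered = 0
        for topic, _payload, callback in self._buffer:
            callback(topic)  # signal delivery to the caller
            delivered += 1
        self._buffer.clear()
        return delivered

producer = AsyncProducerSketch()
acked = []
for i in range(3):
    producer.send("web-events-sub42", {"event": "pageview", "seq": i},
                  on_delivery=acked.append)
count = producer.flush()
print(count, len(acked))  # 3 3
```

The point of the shape is that callers never block on the broker; delivery is confirmed later via callback.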
Details overview of LeadService component
Spray.io for leadservice
Hbase for Cookie and anonymous lead storage
Salted table
Key structure is subscription-cookie-leadid
Secondary index for subscription-lead-createdat
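The salted key layout above can be sketched like this; the salt width, bucket count, and separator are assumptions for illustration, not the real table's parameters.

```python
import hashlib

SALT_BUCKETS = 16  # assumed; the actual bucket count isn't stated

def salted_row_key(subscription: str, cookie: str, lead_id: str) -> str:
    # Prefix the subscription-cookie-leadid key with a short
    # hash-derived salt so writes spread across region servers
    # instead of hot-spotting a single region.
    salt = int(hashlib.md5(cookie.encode()).hexdigest(), 16) % SALT_BUCKETS
    return f"{salt:02d}-{subscription}-{cookie}-{lead_id}"

key = salted_row_key("sub42", "cookie-abc", "lead-7")
print(key)
```

A scan for one subscription then has to fan out across all salt buckets, which is the usual trade-off for avoiding write hot spots.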
MySQL for known lead storage
Masterdata for reverse ip information enrichments
Overall view for the system
Describe how there is a Kafka topic per subscription
Spark streaming transforms the raw events into activities by
Enriching with web page metadata from MySQL
Lead and reverse IP enrichment from LeadService
Persist activities to AS for storage and secondary processing (e.g. triggering and solr indexing)
Push enriched web events to Kafka for the downstream Druid OLAP infrastructure.
High level diagram of our event processor
Enhanced Lambda Architecture
Inbound activities written to Ingestion Processor
Hbase and then Kafka
High volume (e.g. web) activities
First written to Kafka, then enriched
Spark Streaming applications consume events from Kafka
Solr Indexing
Email Reports
Campaign Processing
HBase is used for simple historical queries, and is system of record
While it is not “true” streaming, this is exactly what we need as an optimization
Our multitenant Kafka framework coalesces small Kafka partitions into large Spark RDD partitions to improve batch utilization
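The coalescing idea can be sketched as plain greedy grouping; the function name, partition names, and target size are illustrative, not the framework's actual API.

```python
def coalesce_partitions(partition_sizes: dict, target: int) -> list:
    """Greedy grouping: pack small Kafka partitions together until
    each group approaches `target` messages, yielding fewer, fuller
    Spark partitions per micro-batch."""
    groups, current, current_size = [], [], 0
    # Largest-first keeps one big partition from pairing badly.
    for name, size in sorted(partition_sizes.items(), key=lambda kv: -kv[1]):
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups

# Four per-subscription Kafka partitions of very different volume:
sizes = {"sub1-0": 50, "sub2-0": 30, "sub3-0": 900, "sub4-0": 20}
print(coalesce_partitions(sizes, target=1000))
```

With a 1000-message target the four Kafka partitions collapse into a single Spark partition, so one task does useful work instead of four mostly idle ones.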
Several components of the event enrichment require outbound RPC calls. Using async clients, performing the calls in parallel, and then composing the futures pipelines the computation and significantly improves throughput.
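A toy version of that pattern, with simulated stand-ins for the LeadService and reverse-IP RPCs (all names hypothetical):

```python
import asyncio

# Simulated enrichment RPCs; each "call" just sleeps briefly to
# mimic network latency.
async def lookup_lead(cookie: str) -> dict:
    await asyncio.sleep(0.01)
    return {"lead_id": f"lead-for-{cookie}"}

async def reverse_ip(ip: str) -> dict:
    await asyncio.sleep(0.01)
    return {"company": f"org-at-{ip}"}

async def enrich(event: dict) -> dict:
    # Issue both RPCs concurrently and compose the results,
    # rather than paying the two latencies back to back.
    lead, geo = await asyncio.gather(
        lookup_lead(event["cookie"]), reverse_ip(event["ip"]))
    return {**event, **lead, **geo}

event = {"cookie": "abc", "ip": "10.0.0.1"}
enriched = asyncio.run(enrich(event))
print(enriched["lead_id"], enriched["company"])
```

The composition keeps total enrichment latency near the slowest single call instead of the sum of all calls.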
Caching web assets and cookies for temporal locality
Cache is > 60% of the executor memory
Enriched events are written out to multiple sinks; being selective about persisting RDDs prevents recomputing expensive transformations (multiple RPC calls or MySQL queries)
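The temporal-locality caching can be illustrated with a simple memoized lookup; the cache size and function name here are illustrative, not the actual >60%-of-executor-memory configuration.

```python
from functools import lru_cache

lookups = {"count": 0}

@lru_cache(maxsize=4096)  # illustrative size only
def page_title(url: str) -> str:
    # Stand-in for the MySQL web-page-metadata lookup; the counter
    # shows how temporal locality turns repeats into cache hits.
    lookups["count"] += 1
    return f"title-of-{url}"

for url in ["/pricing", "/pricing", "/docs", "/pricing"]:
    page_title(url)
print(lookups["count"])  # 2: only the distinct URLs hit the backing store
```

Because web traffic revisits the same pages and cookies in bursts, even a modest cache absorbs most backing-store reads.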
Traditionally, both anonymous and known data were treated equally in MLM. This is problematic because anonymous volumes are usually 10-20x higher than known. Additionally, there is very little intrinsic value in performing downstream processing on anonymous data, since you cannot target anonymous leads for campaigns.
To improve this, in Munchkin V2 we only allow known traffic to flow to downstream processing.
Anonymous data is passed for downstream processing once the lead converts to a known lead via a form fill-out, API calls, etc.
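The gating-and-replay behavior can be sketched as follows; all names are hypothetical.

```python
from collections import defaultdict

# Minimal sketch of the Munchkin V2 gating rule: only known-lead
# activity flows downstream; anonymous history is buffered and
# replayed once the lead converts (form fill-out, API call, etc.).
anonymous_history = defaultdict(list)
downstream = []

def on_activity(cookie: str, activity: str, known: bool) -> None:
    if known:
        downstream.append(activity)
    else:
        anonymous_history[cookie].append(activity)

def on_conversion(cookie: str) -> None:
    # Lead became known: release its buffered anonymous activity.
    downstream.extend(anonymous_history.pop(cookie, []))

on_activity("c1", "view:/pricing", known=False)
on_activity("c1", "view:/docs", known=False)
on_activity("c2", "email:open", known=True)
print(len(downstream))   # 1: anonymous traffic held back
on_conversion("c1")      # c1 fills out a form
print(len(downstream))   # 3: buffered history replayed downstream
```

This keeps the 10-20x anonymous volume out of the expensive downstream path while preserving the full history for leads that do convert.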
This reiterates my points from the last slide. I included it in case you want to look at the slides later
Give a quick overview of the activities architecture. Introduce Kafka in the presentation
Spend more time on this – purple is our code, teal is standard Spark
# SubscriptionRegistry uses ZooKeeper
# OffsetManager is a library that uses the low-level Kafka consumer API
# Provisioning framework – Sirius; a new subscription is provisioned into the registry via Oozie
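A rough sketch of the bookkeeping such an OffsetManager has to perform; the class shape and method names are assumptions, not the actual library API.

```python
# When using the low-level Kafka consumer API, the application
# tracks its own committed offset per (topic, partition) and
# advances it only after a batch is fully processed.
class OffsetManagerSketch:
    def __init__(self):
        self._committed = {}  # (topic, partition) -> next offset to read

    def next_offset(self, topic: str, partition: int) -> int:
        return self._committed.get((topic, partition), 0)

    def commit(self, topic: str, partition: int, last_processed: int) -> None:
        # The commit point only moves forward; a failed batch
        # re-reads from the old offset, giving at-least-once
        # processing.
        key = (topic, partition)
        self._committed[key] = max(self._committed.get(key, 0),
                                   last_processed + 1)

om = OffsetManagerSketch()
print(om.next_offset("web-events", 0))  # 0: fresh partition
om.commit("web-events", 0, last_processed=41)
print(om.next_offset("web-events", 0))  # 42
```

Owning the offsets (instead of relying on auto-commit) is what lets the streaming job tie "processed" to its own batch boundaries.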