1. How we process half a billion mentions a day
George & Shrikar
2. Agenda
Who we are
Some numbers about our system
Open-Source Technologies we use
Architecture of the System
Component Overview
3. Who We Are
Social Media Analytics, Monitoring and
Engagement Company
(www.viralheat.com)
We are based in San Mateo, CA
4. Data Crunched Daily
In total, we ingest around 1 TB of social data
into our infrastructure every day.
Social data sources:
Twitter, Facebook, LinkedIn, Pinterest, blogs,
etc.
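As a rough sanity check, the daily volume can be combined with the "half a billion mentions a day" figure from the title to estimate the average footprint per mention and the sustained ingest rate. This is a back-of-the-envelope sketch only: it assumes a decimal terabyte (10^12 bytes) and that the whole 1 TB is attributable to mentions, neither of which the slides state.

```python
# Back-of-the-envelope figures derived from the slide numbers.
# Assumptions (not stated in the slides): 1 TB = 10**12 bytes, and all
# ingested bytes belong to mentions.
BYTES_PER_DAY = 10**12          # ~1 TB of social data per day
MENTIONS_PER_DAY = 500_000_000  # "half a billion mentions a day"
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_mention = BYTES_PER_DAY / MENTIONS_PER_DAY
mentions_per_second = MENTIONS_PER_DAY / SECONDS_PER_DAY

print(f"average size per mention: {bytes_per_mention:.0f} bytes")
print(f"sustained ingest rate: {mentions_per_second:.0f} mentions/second")
```

Under these assumptions, each mention averages about 2 KB, and the system has to sustain several thousand mentions per second around the clock.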
5. How We Manage It
Redis
MySQL
Riak
Elasticsearch
Memcached
Storm (Real time data processing)
Beanstalk
7. Deep Dive
The Processor tags each social mention with
sentiment and intent.
It handles around 100 million social mentions
every 5 hours.
Elasticsearch indexes and ranks the social
data.
Stats computes the analytics for each
keyword, grouped by sentiment and intent.
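The grouping performed by Stats can be illustrated with a minimal sketch: count mentions per (keyword, sentiment, intent) combination. The field names, label values, and sample records below are hypothetical and chosen only for illustration; the slides do not describe the actual schema or implementation.

```python
from collections import Counter

# Hypothetical tagged mentions, as they might look after the Processor step.
# Field names and label values are assumptions, not the real schema.
mentions = [
    {"keyword": "acme", "sentiment": "positive", "intent": "purchase"},
    {"keyword": "acme", "sentiment": "positive", "intent": "purchase"},
    {"keyword": "acme", "sentiment": "negative", "intent": "complaint"},
    {"keyword": "globex", "sentiment": "neutral", "intent": "question"},
]


def aggregate(tagged_mentions):
    """Count mentions per (keyword, sentiment, intent) group."""
    counts = Counter()
    for mention in tagged_mentions:
        key = (mention["keyword"], mention["sentiment"], mention["intent"])
        counts[key] += 1
    return counts


stats = aggregate(mentions)
for (keyword, sentiment, intent), count in sorted(stats.items()):
    print(keyword, sentiment, intent, count)
```

At production volume these counts would more plausibly be maintained incrementally or computed by the search or storage layer than by an in-memory loop. The sketch only shows the shape of the grouping.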
8. Near Realtime
We use Storm for our near-real-time data
pipeline.
Benefits: scalable, fault-tolerant, and easy to
operate.
It is also easy to load data from, and store data to,
existing databases and queues.
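Storm models a pipeline as spouts (stream sources) feeding bolts (processing steps). The sketch below mimics that shape with plain Python generators so the data flow is easy to follow. It does not use Storm's API, it runs in a single process, and the function names, the in-memory queue, and the toy sentiment rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def mention_spout(queue: Iterable[str]) -> Iterator[dict]:
    """Source stage: emit each raw text from a queue as a record."""
    for text in queue:
        yield {"text": text}


def tagging_bolt(stream: Iterable[dict]) -> Iterator[dict]:
    """Processing stage: attach a sentiment label to each record.

    The keyword rule is a placeholder, not a real sentiment model.
    """
    for mention in stream:
        lowered = mention["text"].lower()
        sentiment = "positive" if "love" in lowered else "neutral"
        yield {**mention, "sentiment": sentiment}


def counting_bolt(stream: Iterable[dict]) -> Counter:
    """Terminal stage: count records per sentiment label."""
    counts = Counter()
    for mention in stream:
        counts[mention["sentiment"]] += 1
    return counts


# A list stands in for an existing queue or database feed.
raw_queue = ["I love this phone", "Shipping took a week", "Love the new update"]
totals = counting_bolt(tagging_bolt(mention_spout(raw_queue)))
print(dict(totals))
```

In a real Storm topology each stage would run as parallel tasks across a cluster, with the framework handling distribution and replaying failed tuples. The generator chain only illustrates how records flow from a source through successive processing steps.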