2. •Real-time retail intelligence
•Tracking 100s of eCommerce sites
•Organise into a high-quality market view
•Competitive intelligence for retailers and manufacturers (SaaS)
3. How it Works
•Tracking millions of products around the world, daily
•Distributed Processing Pipeline
•3K/s peak
[Diagram: Agents (HTML, etc.) → Raw Data → Processors → Cleaned & Validated → Master Product DB]
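The pipeline stages named on this slide can be sketched as a chain of generators. This is only a toy illustration, not the production system: the stage functions (`agent`, `process`, `validate`, `run_pipeline`), the regex-based parsing, the sample pages and the dict standing in for the master product DB are all assumptions made for the example.

```python
import re

def agent(pages):
    """Stand-in for a crawling agent: yields (url, raw_html) pairs."""
    for url, html in pages.items():
        yield url, html

def process(raw):
    """Turn raw HTML into a product record (toy regex parser, hypothetical markup)."""
    for url, html in raw:
        name = re.search(r"<h1>(.*?)</h1>", html)
        price = re.search(r'data-price="([\d.]+)"', html)
        yield {
            "url": url,
            "name": name.group(1).strip() if name else None,
            "price": float(price.group(1)) if price else None,
        }

def validate(records):
    """Keep only records that have a name and a positive price."""
    for rec in records:
        if rec["name"] and rec["price"] is not None and rec["price"] > 0:
            yield rec

def run_pipeline(pages):
    """Run agents -> processors -> validation and store results keyed by URL."""
    master_db = {}  # stand-in for the master product DB
    for rec in validate(process(agent(pages))):
        master_db[rec["url"]] = rec
    return master_db

# Hypothetical sample input: one well-formed page, one with no price.
PAGES = {
    "https://shop.example/p/1": '<h1>Kettle</h1><span data-price="19.99"></span>',
    "https://shop.example/p/2": "<h1>Broken page</h1>",
}
```

Calling `run_pipeline(PAGES)` keeps the first page and drops the second, since it fails validation. In a distributed setup each stage would presumably run on separate workers connected by queues rather than in one process.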
4. Challenges
•Batch indexing
• SQL boxes can’t get bigger
• Don’t ask for that
[Diagram labels: Master DB; Copy; Snapshot DB; Batch Sync; Indexing; Index; Load Balancers; Web Server Pool in the DMZ (WebUK1, WebUK2, WebUK3)]
5. Elasticsearch in the Stack
• Batch indexing
[Diagram labels: Master DB; Copy; Snapshot DB; Batch Sync; Indexing; Index; Load Balancers; Web Server Pool in the DMZ (WebUK1, WebUK2, WebUK3)]
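A minimal sketch of what batch indexing from a DB snapshot could look like: read rows from the snapshot and serialise them into the newline-delimited body that Elasticsearch's `_bulk` endpoint accepts. The `products` table, its columns, the index name and the use of SQLite as the snapshot are assumptions for the example; the slides do not say which SQL database or schema was used.

```python
import json
import sqlite3

def snapshot_rows(conn):
    """Yield product rows from the snapshot DB as dicts."""
    cur = conn.execute("SELECT id, name, price FROM products ORDER BY id")
    for pid, name, price in cur:
        yield {"id": pid, "name": name, "price": price}

def bulk_body(rows, index="products"):
    """Build an NDJSON _bulk body: one action line, then one source line, per row."""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index, "_id": str(row["id"])}}))
        lines.append(json.dumps({"name": row["name"], "price": row["price"]}))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)

def demo_snapshot():
    """Create a small in-memory snapshot with hypothetical data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [(1, "Kettle", 19.99), (2, "Toaster", 24.5)],
    )
    conn.commit()
    return conn
```

The string returned by `bulk_body(snapshot_rows(demo_snapshot()))` could then be sent as the body of a `POST /_bulk` request with content type `application/x-ndjson`. For a large snapshot the rows would normally be sent in chunks rather than as one body.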
6. Elasticsearch in the Stack
• Real-time indexing
• Simultaneous writes to SQL + ES
[Diagram labels: Master DB; Real-Time Indexing; API Calls; Index; Load Balancers; Web Server Pool in the DMZ (WebUK1, WebUK2, WebUK3, … WebUKn); Elasticsearch Cluster (Elastic1, Elastic2, Elastic3, … Elasticn)]
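One way to picture simultaneous writes to SQL + ES is a small writer that performs both in a single call. This is a sketch under stated assumptions: `DualWriter`, the `products` schema and the injected `index_doc` callable (standing in for an Elasticsearch index request) are hypothetical, and rolling back the SQL write when indexing fails is one possible consistency policy, not necessarily the one used in the talk.

```python
import sqlite3

class DualWriter:
    """Write each product to the SQL master DB and to the search index together."""

    def __init__(self, conn, index_doc):
        self.conn = conn
        self.index_doc = index_doc  # callable(doc_id, doc); stand-in for an ES index call

    def save(self, product_id, name, price):
        doc = {"name": name, "price": price}
        # The connection context manager commits on success and rolls back
        # if anything inside the block raises, including the index call.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO products (id, name, price) VALUES (?, ?, ?)",
                (product_id, name, price),
            )
            self.index_doc(str(product_id), doc)
        return doc

def make_db():
    """Create an empty in-memory master DB with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    return conn
```

For a quick local check, a plain dict can play the role of the index: `DualWriter(make_db(), fake_index.__setitem__).save(1, "Kettle", 19.99)`. The opposite failure (index write succeeds, SQL commit fails) is not handled here and would need a retry or reconciliation step.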
7. Benefits
•Scalability, in both directions
•Flexibility
•High availability
•Cheaper to run
•Powerful features
8. Use Cases
•Logging (Graylog2)
•Internal Analytics
•Text Processing & IR
•Driving Web Apps: filtering, aggregations, scripting
•Reporting
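To illustrate the filtering and aggregations mentioned for web apps, here is a sketch of a search request body built as a plain Python dict. The field names (`category`, `price`, `retailer`) and the function `build_query` are assumptions for the example, and the body follows the query DSL of recent Elasticsearch versions, which may differ from the version used at the time of the talk.

```python
import json

def build_query(category, max_price):
    """Filter products by category and price, then average price per retailer."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"category": category}},
                    {"range": {"price": {"lte": max_price}}},
                ]
            }
        },
        "aggs": {
            "by_retailer": {
                "terms": {"field": "retailer"},
                "aggs": {"avg_price": {"avg": {"field": "price"}}},
            }
        },
    }

# Serialised form, as it would be sent in the body of a _search request.
QUERY_JSON = json.dumps(build_query("kettles", 50))
```

Filters in a `bool` query do not affect scoring, which suits this kind of reporting-style query where only the aggregated numbers matter.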
9. THANKS
Read more elasticsearch.org/case-study/cogenta/
Get in touch @YannCluchey