The evolution of Viadeo's BI infrastructure, by François Le Lay
1. The evolution of Business Intelligence at Viadeo
Techdays 22/11/2012
2. Agenda
What is Business Intelligence?
Key Roles
Viadeo Data
Technical Solutions : a short history
3. What is Business Intelligence ?
[Diagram: the BI stack, from bottom to top, with a Feedback loop]
Data Warehouse & ETL : plumbing of structured and unstructured data, logic to persist data
Application Stack : Meta Data, KPIs, Visual Templates, Security, Information Dissemination, Scheduling
Awareness : Reports, Dashboards
Insights : Forecasting, Predicting, Statistics, Competitor Information, Analysis
Actions : Marketing Actions, Business Strategies, Operations
4. Key Roles : the Business Analyst
[Diagram: Business Analyst responsibilities. Labels on the slide: Functional, Technical; Simple (Metrics), Complex (Data viz); BI Dashboards Specifications, Web Dashboard Specifications (Scalars); Followup Analysis; Information Access; Proactive; Direct (SQL, Datameer); Challenge Product PO; Enforce data quality]
5. Key Roles : the Big Data Engineer
[Diagram: Big Data Engineer responsibilities. Labels on the slide: Data plumbing (Real Time, Batch); Expose to Apps (REST/Scala/Java APIs, JDBC/ODBC); Implement Awareness (Data Visualization); Enforce data quality]
6. Viadeo data : The Dynamics
• 45 million members
• Worldwide presence
• China, India, Russia, Mexico, ...
• Mobile App, Web, API
• B2B / B2C
[Diagram labels: Mining, User, Usage, Engagement]
8. Technical solutions : The Beginnings
Phase 1 : 2006-2008
MySQL
Server name : Peach
Phase 2 : 2008-2010
MySQL
Server name : Lakitu
Internal tool to allow C-Level, Sales,… to access data
9. Technical solutions : A better architecture
Phase 3 : 2010-2012
MySQL
Server name : « Unified ODS »
MySQL
Server names : ODS Live Cluster 1, ODS Live Cluster 2, ODS Live Cluster 3, ODS Live Cluster 5
10. Technical solutions : 2 new internal products
Scala-centric, Play! framework
Cross-channel messaging system
Email, Mobile, Social
Flexible content management
Flexible targeting of recipients
Content testing strategies : A/B, multivariate
Event-driven : web app events, mobile events, ad hoc events
Automation, scheduling, frequency capping
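As a rough illustration of two of the features listed above, content testing and frequency capping, here is a minimal Python sketch. It is not Viadeo's implementation (the products are described as Scala-centric); the names `assign_variant` and `FrequencyCap`, the hashing scheme and the sliding-window policy are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict, deque


def assign_variant(recipient_id, test_name, variants):
    """Deterministically map a recipient to one variant of a content test.

    Hypothetical helper: hashing (test_name, recipient_id) keeps the
    assignment stable across sends without storing any state.
    """
    digest = hashlib.sha256(f"{test_name}:{recipient_id}".encode("utf-8")).digest()
    return variants[int.from_bytes(digest[:8], "big") % len(variants)]


class FrequencyCap:
    """Allow at most `max_messages` per recipient in a sliding time window."""

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # recipient id -> timestamps of recent sends

    def try_send(self, recipient_id, now):
        """Record a send at time `now` (in seconds) and return True,
        or return False without recording if the recipient is capped."""
        recent = self._sent[recipient_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_messages:
            return False
        recent.append(now)
        return True
```

A scheduler would call `try_send` before each email, mobile or social message and skip the send when it returns False.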
Analytics
Data visualization : based on JavaScript libraries (D3.js, Processing.js, etc.)
Tabular Reports, OLAP navigation
Pluggable alerts : business activity monitoring
A common requirement : scalability!!!
Viadeo data is Big
Processing performance is not an option, it is mandatory
12. Technical solutions : a new architecture
• Master dataset :
• Historical data stored in HBase
• Provided as a service by architects team
• Datamarts :
• Built on HDFS using MapReduce jobs
• MapReduce eased by use of Cascading library and Scala DSL (Scalding)
• Pushed to in-memory distributed storage
• Elasticsearch, Riak
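The datamart pipeline above (a batch job over raw events, whose result is pushed to a serving store) could be sketched as follows. This is plain Python imitating the map, shuffle and reduce steps, not Scalding or Hadoop code; the event fields (`type`, `country`, `day`), the function names and the dict standing in for the serving store are hypothetical.

```python
from collections import defaultdict


def map_phase(events):
    """Emit one ((country, day), 1) pair per profile-view event."""
    for event in events:
        if event["type"] == "profile_view":
            yield (event["country"], event["day"]), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values of each key to obtain one count per (country, day)."""
    return {key: sum(values) for key, values in grouped.items()}


def build_datamart(events):
    """Run the three steps in sequence over an iterable of event dicts."""
    return reduce_phase(shuffle(map_phase(events)))


def push_to_serving_store(datamart, store):
    """Write each aggregate under a flat string key; `store` is any dict-like
    object standing in for the serving layer."""
    for (country, day), count in datamart.items():
        store[f"{country}:{day}"] = count
```

In a DSL such as Scalding the same job is typically expressed as a short group-and-count pipeline, which is presumably the easing the slide refers to.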
14. Conclusion
• Many scalable data storage solutions
• Rapid application development frameworks and low-risk programming languages on the JVM
• Custom analytics = what we implement is what we use
• Analytical needs are very well identified
• Blend data stream and batch processing to answer different needs
• Pluggable Data mining R&D
• Analytics for Viadeo members/recruiters/companies : Social Media Monitoring as a Complex Event Processing topic
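One common way to blend stream and batch processing, as the conclusion suggests, is to serve each query from a periodically recomputed batch view plus a small real-time view covering events seen since the last batch run. The sketch below assumes that design; the slides do not say it is the one Viadeo used, and the class and method names are hypothetical.

```python
from collections import Counter


class BlendedCounts:
    """Serve counts as batch view + real-time increments since the last batch run."""

    def __init__(self):
        self.batch_view = {}            # recomputed periodically from the master dataset
        self.realtime_view = Counter()  # incremented per event since the last batch run

    def on_event(self, key):
        """Stream path: count the event immediately."""
        self.realtime_view[key] += 1

    def on_batch_complete(self, new_batch_view):
        """Batch path: swap in the fresh view and reset the real-time counts.

        Assumes the batch run covered every event received before this call.
        """
        self.batch_view = dict(new_batch_view)
        self.realtime_view.clear()

    def query(self, key):
        """Answer with the sum of both views; unknown keys count as zero."""
        return self.batch_view.get(key, 0) + self.realtime_view[key]
```

The batch path gives complete, recomputable results, while the stream path keeps answers fresh between runs.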