GoodData: Customer Analytics Through Cloud BI
Point of View: End-User Case Study
Session Track: Enterprise Ready Clouds: Realistic Strategies
Towards a cloud-based BI Platform as a Service
Burton Group Catalyst Conference Europe 2010, Prague, June 21 – 24 2010
3. GoodData’s Founding Vision: Customer Analytics
• The center of gravity is gradually
shifting from ERP to CRM
• The BI activities should be centered
around Customer Analytics as
opposed to General Ledger
• Customer Analytics use cases and
data are fundamentally different
than ERP data and use cases.
4. Customer Analytics: Flexibility, Versatility and People
Customer Analytics use cases and data are
fundamentally different ...
• constant innovation
– the dynamics of ever-changing needs
– ad hoc analysis, hypothesis testing
• decentralization
– driven by line of business or department
– self-service, broad user base
• disparate external data sources
– impossible to enforce strict data quality
– cross-source analysis is the key use case
• variable lifespan
– from perpetual to single purpose
– low risk, time-to-value
5. Customer Analytics vs. Cloud Architecture
• “KPIs” of Customer Analytics:
– time to value
– risk level (initial price + operating costs)
– agility, flexibility, usability
• Computing cloud
– an ideal environment for a BI deployment
because of the low-utilization vs.
high-peak-performance-demand nature of BI
– allows higher HW resource utilization
• BI Platform as a Service
– tools and APIs reducing time-to-value from
months and weeks to days or hours
– takes care of all IT operations aspects
– takes care of customer support
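The utilization argument above can be illustrated with a small worked example. This is a minimal sketch: the tenant names and hourly load figures are invented for illustration and are not GoodData measurements. It compares provisioning each tenant for its own peak against provisioning one shared pool for the peak of the combined load.

```python
# Hypothetical hourly load profiles (CPU cores needed) for three tenants.
# Each tenant is mostly idle but has one short peak at a different hour.
tenant_loads = {
    "tenant_a": [1, 1, 16, 1, 1, 1, 1, 1],
    "tenant_b": [1, 1, 1, 1, 16, 1, 1, 1],
    "tenant_c": [1, 1, 1, 1, 1, 1, 16, 1],
}


def dedicated_capacity(loads):
    """Capacity needed when every tenant provisions for its own peak."""
    return sum(max(profile) for profile in loads.values())


def shared_capacity(loads):
    """Capacity needed when one pool serves the peak of the combined load."""
    combined = [sum(hour) for hour in zip(*loads.values())]
    return max(combined)


def utilization(loads, capacity):
    """Average fraction of the provisioned capacity that is actually used."""
    hours = len(next(iter(loads.values())))
    total_used = sum(sum(profile) for profile in loads.values())
    return total_used / (capacity * hours)


dedicated = dedicated_capacity(tenant_loads)
shared = shared_capacity(tenant_loads)

print(f"dedicated: {dedicated} cores, {utilization(tenant_loads, dedicated):.0%} used")
print(f"shared:    {shared} cores, {utilization(tenant_loads, shared):.0%} used")
```

With these invented profiles the shared pool needs 18 cores instead of 48, because the peaks do not coincide. If all tenants peaked in the same hour, the benefit would disappear.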
6. Building the BI Platform as a Service in the Cloud
“Computing cloud is an ideal environment for a multi-tenant BI deployment because of the
low-utilization vs. high-peak-performance-demand nature of BI”
• The traditional BI tools are not suitable for cloud deployments
– they are too complex on the upstream side
– they are not multi-tenant
• Developing a complete generic cloud-ready BI stack from scratch is a
substantial challenge due to the enormous breadth and depth of the BI domain
– ETL, modeling, metrics, reports, dashboards, collaboration, security
– Large data volumes, unpredictable peak loads
7. GoodData Cloud BI Platform
• Open standards-based APIs
– HTTP, REST, FTP
• Rich user experience
– JavaScript, AJAX, interactive
charts
• Flexible application layer
– a new release every two weeks
• Robust ROLAP engine
– MAQL (Multi-dimensional
Analytical Query Language)
– Fluid data model (Attributes, Facts,
Metrics, Hierarchies)
– highly efficient MAQL-to-SQL
decomposition and caching
– suits both operational reporting
and ad hoc analysis
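The idea of decomposing a metric query into SQL and caching the result can be sketched as follows. This is not MAQL and not GoodData's engine: the `METRICS` table, `to_sql`, and `CachingEngine` are hypothetical names, and the translation covers only the simplest case (one aggregate over one fact, sliced by attributes), run against an in-memory SQLite table.

```python
import sqlite3

# Hypothetical logical model: metric name -> (aggregate function, fact column).
METRICS = {"revenue": ("SUM", "amount"), "orders": ("COUNT", "order_id")}


def to_sql(metric, attributes, table="sales"):
    """Translate 'metric sliced by attributes' into a single GROUP BY query."""
    func, fact = METRICS[metric]
    cols = ", ".join(attributes)
    return (f"SELECT {cols}, {func}({fact}) AS {metric} "
            f"FROM {table} GROUP BY {cols} ORDER BY {cols}")


class CachingEngine:
    """Runs generated SQL once per (metric, attributes) pair and caches rows."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.sql_runs = 0  # counts how often the database was actually queried

    def report(self, metric, attributes):
        key = (metric, tuple(attributes))
        if key not in self.cache:
            self.sql_runs += 1
            sql = to_sql(metric, attributes)
            self.cache[key] = self.conn.execute(sql).fetchall()
        return self.cache[key]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, "EU", 100.0), (2, "EU", 50.0), (3, "US", 70.0)])

engine = CachingEngine(conn)
first = engine.report("revenue", ["region"])   # executes SQL
second = engine.report("revenue", ["region"])  # served from the cache
print(first, engine.sql_runs)
```

A real engine would also have to invalidate cached results when new data is loaded. The sketch omits that step.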
8. GoodData Cloud BI Platform – Core Concepts
• Project = data mart
– a unit of management and distribution
– deployment: as easy as “New File”
• User Information
– security boundary – a “walled garden”
• Project Data
– raw data: numbers and classifications
• Project Metadata
– metrics, filters, reports, dashboards
– LDM, PDM, operational state
– event trace, audit log
• Cached Data
– pre-aggregated data
– materialized slices and dices
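One way to picture these concepts is as a single container type with four parts. The sketch below is an assumption made for illustration: the `Project` class, its field names, and the `read_metadata` method are hypothetical and do not describe the platform's actual object model.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A project as a self-contained data mart (hypothetical structure)."""
    name: str
    users: set = field(default_factory=set)       # User Information
    data: dict = field(default_factory=dict)      # Project Data (raw rows)
    metadata: dict = field(default_factory=dict)  # metrics, reports, dashboards
    cache: dict = field(default_factory=dict)     # Cached Data (pre-aggregates)
    audit_log: list = field(default_factory=list)

    def read_metadata(self, user, key):
        """Return a metadata object, enforcing the project as security boundary."""
        if user not in self.users:
            self.audit_log.append(("denied", user, key))
            raise PermissionError(f"{user} is not a member of {self.name}")
        self.audit_log.append(("read", user, key))
        return self.metadata[key]


# Creating a project is a single cheap operation, in the spirit of "New File".
project = Project(name="customer-analytics")
project.users.add("alice")
project.metadata["revenue"] = "sum of the amount fact"

print(project.read_metadata("alice", "revenue"))
```

Because every field uses its own default factory, two projects never share users, data, metadata, or cache, which is the property the "walled garden" bullet describes.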
9. GoodData Cloud BI Platform – Multi-Tenant
• Multi-Tenant Platform
– born on Amazon Web Services
– stateless web application layer
– session-less processing layer
– redundant storage
• Horizontal Scaling
– a pre-configured node type for each role
– shared-nothing architecture between
nodes of the same type
– nodes of each type can be provisioned
on demand independently of others
• Horizontal Partitioning
– first-level driven by project separation
– with columnar storage, second-level
partitioning is not needed up to ~100M rows
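First-level partitioning by project can be sketched as a function that maps a project to one storage node. The slides do not say how the assignment is made; hashing the project identifier, the `node_for_project` function, and the node names below are assumptions chosen only to show the principle that all data of one project lives on one node.

```python
import hashlib


def node_for_project(project_id: str, nodes: list) -> str:
    """Map a project to one node using a stable hash of its identifier."""
    digest = hashlib.sha256(project_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical storage nodes of the same type (shared-nothing between them).
storage_nodes = ["storage-1", "storage-2", "storage-3"]

for project_id in ["acme-sales", "acme-support", "globex-marketing"]:
    print(project_id, "->", node_for_project(project_id, storage_nodes))
```

Modulo hashing reassigns many projects when the node list changes, so a deployment that adds nodes on demand would more likely use an explicit lookup table or consistent hashing.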
10. Operating Cloud BI Platform = Continuous Innovation
Statistics as of June 2010 (www.gooddata.com/trust):
• 2,713 projects, 1,344 dashboards
– 19,086 reports, 41,213 metrics
• 3.5K+ reports run per business day
– report calculations, incl. dashboards
• 5M platform events a day
– in the audit events trail
While continuously innovating:
• a production release roughly every 2 weeks
– 10 releases so far in 2010
• without adverse impacts on uptime
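A few derived ratios make the June 2010 figures easier to read. The sketch below uses only the numbers quoted on the slide; the one assumption is that the 5M daily events are spread evenly over 24 hours, which gives an average rate rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slide (June 2010).
projects = 2_713
reports = 19_086
metrics = 41_213
events_per_day = 5_000_000

reports_per_project = reports / projects
metrics_per_report = metrics / reports
events_per_second = events_per_day / SECONDS_PER_DAY  # average, not peak

print(f"{reports_per_project:.1f} reports per project")
print(f"{metrics_per_report:.1f} metrics per report")
print(f"{events_per_second:.0f} audit events per second on average")
```

This works out to about 7 reports per project and roughly 58 audit events per second on average.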