Self-Service Analytics – For Enterprise Audience
• Sreejith Madhavan
– msreejith@yahoo.com
– https://www.linkedin.com/in/msreejith
Enterprise Analytics Portfolio – Lay of the Land
Data Analytics – Basic Concepts
• Business Intelligence
o Using the available data to make factual business decisions
o “WHAT” is happening to your business right now?
• Business Analytics
o Steps that lead up to a business decision
o Data Mining - the process of looking for trends, patterns, or other useful information within a dataset
o Diagnostic analytics - “WHY” something is happening right now
o Predictive analytics - “WHAT Will” happen in the future
o Prescriptive analytics - “WHAT Should be Done next”
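The questions above can be illustrated on a toy dataset. The following is a minimal sketch, assuming invented monthly revenue figures split by a hypothetical region; none of the numbers or names come from the deck. It answers "what" with an aggregate, "why" with a per-region breakdown, "what will" with a least-squares trend line, and "what should be done" with a deliberately naive rule.

```python
# Toy illustration of the analytics categories; all data below is invented.
monthly_revenue = {  # month index -> revenue per region (hypothetical)
    1: {"east": 100, "west": 80},
    2: {"east": 110, "west": 70},
    3: {"east": 120, "west": 60},
    4: {"east": 130, "west": 50},
}


def totals(data):
    """Business Intelligence: WHAT is happening (total revenue per month)."""
    return {month: sum(regions.values()) for month, regions in data.items()}


def change_by_region(data, first, last):
    """Diagnostic: WHY the total behaves as it does (change per region)."""
    return {region: data[last][region] - data[first][region] for region in data[first]}


def linear_forecast(series, next_x):
    """Predictive: WHAT WILL happen, using an ordinary least-squares line."""
    xs, ys = list(series.keys()), list(series.values())
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y + slope * (next_x - mean_x)


def recommend(changes):
    """Prescriptive: WHAT SHOULD be done next (a naive rule, for illustration)."""
    worst = min(changes, key=changes.get)
    return f"investigate region '{worst}'" if changes[worst] < 0 else "no action"


what = totals(monthly_revenue)
why = change_by_region(monthly_revenue, 1, 4)
west_series = {month: regions["west"] for month, regions in monthly_revenue.items()}
what_will = linear_forecast(west_series, 5)
what_next = recommend(why)
```

In this toy data the monthly total is flat, so the BI view alone shows nothing unusual; only the diagnostic breakdown reveals that one region is declining while the other grows.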
Enterprise Analytics Landscape
• Enterprises typically have users in two broad categories:
o Business users – most interested in current metrics, fiscal trends, dashboards
o Engineering users – most interested in diagnostics (finding the needle in the haystack) and deep analytics
o An enterprise analytics solution stack should cover the self-service needs of both groups
• Existing Data-stores Have Varying Use-cases
o Representing specialized (application-specific) data
o Organizational units having independent solutions (IT, Engineering, Support, etc.)
o Data architecture demands (BI tool backend, Datamarts, OLTP/OLAP, etc.)
• Enter Hadoop Datalake…
o Answering “why” you need a Hadoop Datalake in your analytics landscape is critical
o Which short- and long-term goals need to be met
o Not meant to be a one-stop-shop solution that replaces existing databases and workflows
o An enterprise has several types of users (by broad skill level) – a self-service solution stack should cater to this broad user base with a mix of several tools
Understanding Existing Data-Stores
(Listed from lower granularity and a subset of source data to a highly granular and complete dataset)
• Analytical Cubes
o Data characteristic: structured data of pre-computed measures
o Access interface/tool: currently SQL Server Business Analytics system
o Good for standard Biz Metrics of current and fiscal trend
• Datamart
o Data characteristic: structured data as a star schema with Dims and Facts
o Access interface/tool: currently Oracle Decision Support system/Datamart
o Good for interactive Adhoc reporting
• Big Data system (Datalake)
o Data characteristic: structured and semi-structured data at per-event granularity
o Access interface/tool: Hive, M/R, Datameer
o Good for diagnostic mining and general Adhoc reports at scale
• Raw Data
o Data characteristic: original data persisted in its incoming form
o Access interface/tool: HDFS (M/R), NFS (scripts), REST
o Useful to do ELT to feed into other data sources
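The granularity differences between these data stores can be made concrete with a small example. The following is a sketch only, using Python's built-in sqlite3 module in place of the Oracle, SQL Server and Hive systems named above; the table names, columns and values are hypothetical. It derives a star-schema fact table from event-level rows and then a cube-like table of pre-computed measures, and counts rows at each level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Event-granularity data, as it might sit in the datalake (hypothetical columns).
    CREATE TABLE events (system_id TEXT, event_date TEXT, errors INTEGER);
    INSERT INTO events VALUES
        ('sys-a', '2016-01-01', 2), ('sys-a', '2016-01-01', 1),
        ('sys-a', '2016-01-02', 0), ('sys-b', '2016-01-01', 5),
        ('sys-b', '2016-01-02', 3);

    -- Datamart: star schema with one dimension table and one fact table.
    CREATE TABLE dim_system (system_key INTEGER PRIMARY KEY, system_id TEXT, model TEXT);
    INSERT INTO dim_system VALUES (1, 'sys-a', 'model-x'), (2, 'sys-b', 'model-y');
    CREATE TABLE fact_daily_errors AS
        SELECT d.system_key, e.event_date, SUM(e.errors) AS errors
        FROM events e JOIN dim_system d ON d.system_id = e.system_id
        GROUP BY d.system_key, e.event_date;

    -- Cube-like table: measures pre-computed at the coarsest grain.
    CREATE TABLE cube_errors_by_model AS
        SELECT d.model, SUM(f.errors) AS total_errors
        FROM fact_daily_errors f JOIN dim_system d ON d.system_key = f.system_key
        GROUP BY d.model;
""")


def row_count(table):
    """Number of rows in one of the tables created above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Row counts shrink as granularity drops from events to facts to cube.
grain = {t: row_count(t) for t in ("events", "fact_daily_errors", "cube_errors_by_model")}
by_model = dict(conn.execute("SELECT model, total_errors FROM cube_errors_by_model"))
```

The cube answers a standard metric question with a two-row lookup, but a question such as "which individual event raised the error count" can only be answered from the event-level table, which is the trade-off the list above describes.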
End User Categories and Expectations
• Advanced Users (Data Engineers/Scientists)
o Usage characteristics: enhance and persist the data-model, develop deep-insights workflows
o Interface characteristics: frameworks, APIs
o Sample tools: Map-reduce, Hive, Pig, Spark, R, programmatic access (JDBC..)
• Technical Analysts
o Usage characteristics: generate Adhoc and canned reports
o Interface characteristics: SQL and transformation-workflow based tools
o Sample tools: Oracle, SQL-Server, Hive, R, Vertica, Teradata, Datameer, Tableau, PDI
• Exec-users (Non-Technical)
o Usage characteristics: consume predefined metrics, dashboards, drag-n-drop what-if analysis
o Interface characteristics: visual, natural language based tools
o Sample tools: Tableau, OBIEE, PBA, Excel, Microstrategy, Search UI
User and Use-case Requirement Considerations
• Demarcate target Users – Provision the right Tool to the right Users/Use-cases
– Not all users can or should be given a Hadoop Datalake interface in a self-service model
– No one tool can fit all Use-cases
• Get to a consolidated view of existing Data Sources that covers the most common domain objects, to target a “BI” based self-service model
• Data architecture - Data-layout and Data-model for the above “Consolidated view”
– Star-schema vs Analytic Cube vs Flat OLTP schema
– MPP Analytic Database vs OLAP Cube vs DSS
– Traversing and Finding Metadata - Search interface to find entities, attributes and data
– Documentation covering data-model and data-dictionary
• Performance considerations
– High-performance, high-concurrency backend for interfacing with BI Tools
– Scalable environment for batch and mining use-cases
– Interactive programmatic platform for data engineering
• Miscellaneous Operational Considerations (slide7)
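The "Traversing and Finding Metadata" item above can be sketched as a small searchable data dictionary. This is an illustration only: the Attribute record, the catalog entries and the search function are all hypothetical, and a real deployment would more likely index metadata in a search engine than scan a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    store: str        # which data store holds the entity
    entity: str       # table or domain object
    name: str         # attribute (column) name
    description: str  # data-dictionary text


# Hypothetical catalog entries; a real catalog would be generated from the stores' metadata.
CATALOG = [
    Attribute("datamart", "fact_daily_errors", "errors", "Error count per system per day"),
    Attribute("datamart", "dim_system", "model", "Hardware model of the system"),
    Attribute("datalake", "events", "payload", "Raw event body as received"),
]


def search(catalog, term):
    """Return entries whose entity, attribute name, or description contains the term."""
    needle = term.lower()
    return [
        a for a in catalog
        if needle in a.entity.lower()
        or needle in a.name.lower()
        or needle in a.description.lower()
    ]


hits = search(CATALOG, "system")
```

Searching entity names, attribute names and descriptions together lets a user find data by business vocabulary without knowing which store or table holds it.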
Holistic View For Building E2E Analytics Platform
Objectives For Holistic Analytics Platform
• Establish a self-service Analytics platform to cover BI and
Analytics use-cases for Internal users
• Support 3Vs of User types and Access patterns
o Volume of data
o Variety of Users (Programmatic and Non-technical)
o Variety of Queries (Adhoc, Not pre-defined)
o Velocity (Interactive query response, Dashboarding)
• Design Principles
o Embrace ideology of “one-tool doesn’t fit all use-cases and user preferences”
o Ease of Use (Front-end interface and Backend Data-model)
o Improved query response times
Datalake Analytics Platform – Conceptual View
[Diagram with three layers: DataStore Layer, Processing Engine Layer, Viz. and Data Access Layer. Components shown: HDFS, Hive, PUAT Datamart, MPP/Analytic Database, Extended Datamodel, Cloudera Search, Spark, Hue UI (Hive, Search), Spark CLI and Hive JDBC (Programmatic Access), Datameer (Non-Programmatic), BI Tool Front-End, Search Front-End]
• Engineering focused Self-serve Reporting (Analysts & Data engineers, Data scientists)
o Focus on Data Processing & Integration frameworks
o Adhoc Data mining, complex data transformations, Machine learning
o 25-50 Concurrent users
• Business focused Self-serve Reporting (Analysts, Execs, non-technical Audience)
o Focus on Visualization & Metrics (not Data Processing)
o Support Adhoc and Canned Self-service Reports
o 100+ Concurrent users
Datalake Analytics Platform – Technology View
[Data-flow diagram; the labels recoverable from the figure are listed below, grouped by apparent role]
• Sources: HDFS (Orig Source), Structured Config Feed, Other Sources…
• Processing workflows: M/R Daily HDFS Transforms, Spark Data Prep FW, Indexing Prep FW, Data-Prep/Transform (SnapLogic/Datameer), Data-Prep/Filter & Import (SnapLogic), Raw Data Export
• Data stores: HDFS (Transformed), Hive/Impala, Vertica MPP Analytic DB (12 month window), Datamart, SSAS, Cloudera Search
• Layouts and content: Time based SeqFile Layout, System based PARQUET Layout, On-demand Parsed content, Latest System Snapshot raw, Latest Week Raw & Structured, Flattened Star-schema, Published Extended schema
• Access tools: Adhoc Query via Hue UI/Edge Node CLI, Cloudera Search Hue UI, Tableau/Pentaho BA, Spark CLI/MLLib, DistributedR, ZoomData
• Use-cases served: Text search & Search Analytics, Self-serve BI Reporting, Statistical Analytics, Adhoc SQL Queries, On-demand Data Transformations
• Legend: Existing Components, New Components, Processing Workflows, Other
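Two ideas in the technology view lend themselves to a small worked example: regrouping data from a time-based layout into a system-based layout, and keeping only a 12-month window in the analytic database. The sketch below shows both in plain Python on invented records. It illustrates the idea only; it is not the deck's M/R or Spark transform jobs, and the function names and the calendar-month window rule are assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records as they might arrive in a time-based layout: one bucket per day.
time_based = {
    date(2015, 1, 10): [("sys-a", "boot"), ("sys-b", "boot")],
    date(2015, 9, 5): [("sys-a", "disk_warn")],
    date(2016, 1, 1): [("sys-b", "fan_fail"), ("sys-a", "reboot")],
}


def to_system_based(buckets):
    """Regroup day buckets so that all events of one system sit together."""
    by_system = defaultdict(list)
    for day in sorted(buckets):
        for system_id, event in buckets[day]:
            by_system[system_id].append((day, event))
    return dict(by_system)


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def within_window(by_system, as_of, months=12):
    """Keep only events less than `months` calendar months old as of `as_of`."""
    return {
        system_id: [(d, e) for d, e in events if 0 <= months_between(d, as_of) < months]
        for system_id, events in by_system.items()
    }


system_based = to_system_based(time_based)
recent = within_window(system_based, as_of=date(2016, 1, 15))
```

A time-based layout suits daily ingestion, while a system-based layout makes a per-system diagnostic query read one group instead of scanning every day; the window keeps the interactive store small while the full history stays in HDFS.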
Evolving Other Operational Requirements
Agility and Productivity for End users
Monitoring and Governance
- Monitor and recover from failures of user and system jobs/services
- Analytics on Analytics – user and system behaviour
- Data quality, security, etc.
Ease of access to Data
- Abstracting data complexities, provisioning prepared data to cover standard use-cases
- Query response times, data mobility (transfer) issues
Understanding the Dataset
- Documentation, Catalog, Data Dictionary, Data Exploration
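"Analytics on Analytics" and query response times can be tracked from a query log. The following is a minimal sketch, assuming a hypothetical log of (user, tool, seconds, succeeded) records; the field layout, sample values and the nearest-rank percentile are choices made for this illustration, not something the deck specifies.

```python
from collections import Counter
import math

# Hypothetical query-log records: (user, tool, seconds, succeeded).
QUERY_LOG = [
    ("alice", "hive", 42.0, True),
    ("alice", "hive", 310.0, False),
    ("bob", "tableau", 1.2, True),
    ("bob", "tableau", 0.8, True),
    ("carol", "spark", 95.0, True),
]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(log):
    """Query counts per tool, overall failure rate, and median / p95 response times."""
    durations = [seconds for _, _, seconds, _ in log]
    failures = sum(1 for *_, ok in log if not ok)
    return {
        "queries_by_tool": dict(Counter(tool for _, tool, _, _ in log)),
        "failure_rate": failures / len(log),
        "p50_seconds": percentile(durations, 50),
        "p95_seconds": percentile(durations, 95),
    }


summary = summarize(QUERY_LOG)
```

Reporting a high percentile alongside the median matters here because a few slow batch-style queries can hide behind a healthy-looking average.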
External References
• https://www.vertica.com/2014/04/18/facebook-and-vertica-a-case-for-mpp-databases/
• https://practicalanalytics.wordpress.com/2015/06/11/databianalytics-evolution-netflix/
• http://www.thebigdatainsightgroup.com/site/sites/default/files/Teradata's%20-%20Big%20Data%20Architecture%20-%20Putting%20all%20your%20eggs%20in%20one%20basket.pdf
• http://www.slideshare.net/Dataconomy/hp-vertica-dataconomy
• http://www.bryanbrandow.com/2014/05/microstrategy-vs-tableau.html
• http://www.experfy.com/blog/pentaho-vs-tableau-comparison-visualization-dashboards/

Discover Why Less is More in B2B Research
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 

Self Service Reporting & Analytics For an Enterprise

  • 1. Self-Service Analytics – For Enterprise Audience • Sreejith Madhavan – msreejith@yahoo.com – https://www.linkedin.com/in/msreejith
  • 2. Enterprise Analytics Portfolio – Lay of The Land
  • 3. Data Analytics – Basic Concepts • Business Intelligence o Using the available data to make factual business decisions o “WHAT” is happening to your business right now? • Business Analytics o Steps that lead up to business decision o Data Mining - process of looking for trends, patterns, or other useful information within dataset o Diagnostic analytics - “WHY” something is happening right now o Predictive analytics - “WHAT Will” happen in future o Prescriptive analytics - “WHAT Should be Done next”
  • 4. Enterprise Analytics Landscape • Enterprises typically have Users categorized broadly as - o Business users – most interested in current metrics, fiscal trends, dashboards o Engineering users – most interested in diagnostics (find needle-in-haystack), deep-analytics o An enterprise analytics solution stack should cover self-service needs to above broad user-base • Existing Data-stores Have Varying Use-cases o Representing specialized data (application specific) o Organizational units having independent solutions (IT, Engineering, Support etc..) o Data architecture demands (BI tool backend, Datamarts, OLTP/OLAP etc) • Enter Hadoop Datalake… o Answering “Why” you need Hadoop Datalake in your Analytics landscape is critical o What short, long term goals need to be met o Not meant to be a one-stop-shop solution to replace existing Databases and workflows o Enterprise has several types of Users (by broad skill level) – A self service solution stack should cater to broad User base by having mix-of several tools
  • 5. Understanding Existing Data-Stores (columns: Data Characteristic, Access Interface/Tool)
      o Business Analytics system (currently SQL Server): structured data of pre-computed measures; access via Analytical Cubes; good for standard Biz Metrics of current and fiscal trend
      o Decision Support system/Datamart (currently Oracle): structured data as Star schema with Dims and Facts; access via Datamart; good for interactive Adhoc reporting
      o Big Data system (Datalake): structured, semi-structured data per Event granularity; access via Hive, M/R, Datameer; good for diagnostic mining and general Adhoc reports at scale
      o Raw Data: original data persisted in its incoming form; access via HDFS (M/R), NFS (Scripts), REST; useful to do ELT to feed into other data sources
      o Granularity scale across the stores: "Highly granular and complete dataset" to "Lower granularity and subset of source data"
  • 6. End User Categories and Expectations (columns: Usage Characteristics, Interface Characteristics, Sample Tools In each Vertical)
      o Advanced Users (Data Engineers/Scientists): enhance and persist data-model, develop deep insights workflows; Frameworks, APIs; Map-reduce, Hive, Pig, Spark, R, Programmatic (JDBC..)
      o Technical Analysts: generate Adhoc and canned reports; SQL and Transformation-workflow based Tools; Oracle, SQL-Server, Hive, R, Vertica, Teradata, Datameer, Tableau, PDI
      o Exec-users (Non-Technical): consume predefined metrics, Dashboards, drag-n-drop what-if analysis; Visual, Natural language based tools; Tableau, OBIEE, PBA, Excel, Microstrategy, Search UI
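
The three user categories on this slide can be sketched as a small lookup table, which makes it easy to check which audiences a given tool serves. This is only an illustration: the `UserCategory`, `Profile`, `PROFILES`, `tools_for` and `categories_using` names are hypothetical and not part of the deck; the tool lists are copied from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class UserCategory(Enum):
    # Category labels taken from the slide.
    ADVANCED = "Advanced Users (Data Engineers/Scientists)"
    ANALYST = "Technical Analysts"
    EXEC = "Exec-users (Non-Technical)"


@dataclass(frozen=True)
class Profile:
    usage: str          # Usage Characteristics column
    interface: str      # Interface Characteristics column
    sample_tools: tuple  # Sample Tools column


# Hypothetical encoding of the slide's table.
PROFILES = {
    UserCategory.ADVANCED: Profile(
        usage="Enhance and persist data-model, develop deep insights workflows",
        interface="Frameworks, APIs",
        sample_tools=("Map-reduce", "Hive", "Pig", "Spark", "R", "JDBC"),
    ),
    UserCategory.ANALYST: Profile(
        usage="Generate adhoc and canned reports",
        interface="SQL and transformation-workflow based tools",
        sample_tools=("Oracle", "SQL-Server", "Hive", "R", "Vertica",
                      "Teradata", "Datameer", "Tableau", "PDI"),
    ),
    UserCategory.EXEC: Profile(
        usage="Consume predefined metrics, dashboards, what-if analysis",
        interface="Visual, natural language based tools",
        sample_tools=("Tableau", "OBIEE", "PBA", "Excel",
                      "Microstrategy", "Search UI"),
    ),
}


def tools_for(category):
    """Return the sample tools listed for one user category."""
    return PROFILES[category].sample_tools


def categories_using(tool):
    """Return every user category whose sample tools include `tool`."""
    return [c for c, p in PROFILES.items() if tool in p.sample_tools]
```

For example, `categories_using("Tableau")` returns both the analyst and the exec category, which reflects the overlap between adjacent user groups in the slide's tool lists.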
  • 7. User and Use-case Requirement Considerations • Demarcate target Users – Provision right Tool to right Users/Use-cases – Not all users can or should be given a Hadoop Datalake interface in self-service model – Not one tool can fit all Use-cases • Get to a Consolidated view of existing Data Sources to cover most common domain objects to target “BI” based self-service model • Data architecture - Data-layout and Data-model for the above “Consolidated view” – Star-schema vs Analytic Cube vs Flat OLTP schema – MPP Analytic Database vs OLAP Cube vs DSS – Traversing and Finding Metadata - Search interface to find entities, attributes and data – Documentation covering data-model and data-dictionary • Performance considerations – High Performance and Concurrency support backend for interfacing BI Tools – Scalable environment for batch, mining use-cases – Interactive programmatic platform for data engineering • Miscellaneous Operational Considerations (slide7)
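
To make the star-schema option concrete, here is a minimal sketch using Python's built-in `sqlite3` module: one fact table joined to two dimension tables and aggregated the way a BI tool typically would. The table and column names (`dim_system`, `dim_date`, `fact_events`, `error_count`) and the sample rows are invented for illustration and do not come from the deck.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table with sample rows."""
    conn.executescript("""
        CREATE TABLE dim_system (system_id INTEGER PRIMARY KEY, model TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, fiscal_quarter TEXT);
        CREATE TABLE fact_events (
            system_id INTEGER REFERENCES dim_system(system_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            error_count INTEGER
        );
        INSERT INTO dim_system VALUES (1, 'model-a'), (2, 'model-b');
        INSERT INTO dim_date VALUES (10, 'Q1'), (11, 'Q2');
        INSERT INTO fact_events VALUES
            (1, 10, 3), (1, 11, 5), (2, 10, 7), (2, 11, 1);
    """)


def errors_by_model_and_quarter(conn):
    """Aggregate the fact measure along both dimensions."""
    return conn.execute("""
        SELECT s.model, d.fiscal_quarter, SUM(f.error_count)
        FROM fact_events AS f
        JOIN dim_system AS s ON s.system_id = f.system_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY s.model, d.fiscal_quarter
        ORDER BY s.model, d.fiscal_quarter
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = errors_by_model_and_quarter(conn)
```

The facts hold only keys and measures, while descriptive attributes live in the dimensions; an analytic cube would instead precompute aggregates like this one, and a flat OLTP schema would leave the joins and grouping to each report.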
  • 8. Holistic View For Building E2E Analytics Platform
  • 9. Objectives For Holistic Analytics Platform • Establish a self-service Analytics platform to cover BI and Analytics use-cases for Internal users • Support 3Vs of User types and Access patterns o Volume of data o Variety of Users (Programmatic and Non-technical) o Variety of Queries (Adhoc, Not pre-defined) o Velocity (Interactive query response, Dashboarding) • Design Principles o Embrace ideology of “one-tool doesn’t fit all use-cases and user preferences” o Ease of Use (Front-end interface and Backend Data-model) o Improved Performance to query response times
  • 10. Datalake Analytics Platform – Conceptual View MPP/Analytic Database PUAT Datamart Hive HDFS BI Tool Front-End Spark Hue UI (Hive, Search) DataStore Layer Processing Engine Layer Viz.and Data Access Layer • Focus on Data Processing & Integration frameworks • Adhoc Data mining, complex data transformations, Machine learning • 25-50 Concurrent users • Focus on Visualization & Metrics (not Data Processing) • Support Adhoc and Canned Self-service Reports • 100+ Concurrent users Extended Datamodel Cloudera Search Spark CLI, Hive Jdbc (Programmatic Access) Datameer (Non- Programmatic) Engineering focused Self-serve Reporting (Analysts & Data engineers, Data scientists) Business focused Self-serve Reporting (Analysts, Execs, non-technical Audience) Search Front-End
  • 11. Datalake Analytics Platform – Technology View HDFS (Orig Source) Spark Data Prep FW M/R Daily HDFS Transforms HDFS (Transformed) Hive/Impala Time based SeqFile Layout System based PARQUET Layout Adhoc Query Hue UI/Edge Node CLI Vertica MPP Analytic DB (12 month window) On-demand Parsed content Datamart Structured Config Feed Cloudera Search Indexing Prep FW SSAS Latest System Snapshot raw Latest Week Raw & Structured Data-Prep/Transform (SnapLogic/Datameer) Cloudera Search Hue UI Tableau/Pentaho BA Spark CLI/MLLib Data-Prep/Filter & Import (SnapLogic) DistributedR Flattened Star-schema ZoomData Raw Data Export Published Extended schema Text search & Search Analytics Self-serve BI Reporting Statistical Analytics Adhoc SQL Queries On-demand Data Transformations Other Sources… Existing Components Processing Workflows New Components Other Legend
  • 12. Evolving Other Operational Requirements Agility and Productivity for End users Monitoring and Governance - Monitor & recover user, system jobs/service failures - Analytics on Analytics – user and system behaviour - Data quality, security etc Ease of access to Data - Abstracting data complexities, Provisioning prep’ed data to cover standard use-cases - Query response times, Data mobility(transfer) issues Understanding the Dataset - Documentation, Catalog, Data Dictionary, Data Exploration
  • 13. External References • https://www.vertica.com/2014/04/18/facebook-and-vertica-a-case-for-mpp-databases/ • https://practicalanalytics.wordpress.com/2015/06/11/databianalytics-evolution-netflix/ • http://www.thebigdatainsightgroup.com/site/sites/default/files/Teradata's%20- %20Big%20Data%20Architecture%20-%20Putting%20all%20your%20eggs%20in%20one%20basket.pdf • http://www.slideshare.net/Dataconomy/hp-vertica-dataconomy • http://www.bryanbrandow.com/2014/05/microstrategy-vs-tableau.html • http://www.experfy.com/blog/pentaho-vs-tableau-comparison-visualization-dashboards/

Editor's notes

  1. Business users (typically from Sales, Product management, Other execs); Engineering users (Developers, QA, Technical support engineers, Analysts, Data scientists)
  2. User Types: - Semi/non-technical users – easy to use drag-n-drop interface - Advanced users – Programmatic and SQL based interfaces Improved Performance considerations - High Performance and Concurrent platform for user interactions via BI Tools - Scalable environment for batch, mining use-cases - Interactive programmatic platform for data engineering
  3. Business users workflows: - Self-service - Answer “What” questions - Analytic Database – consolidated data model supporting quick visualization, Performance and lower learning curve Engineering users workflows: - Self-service – Answer “Why” and “What next” questions
  4. CLI – Command-line Interface; MLLib – Machine learning Lib; Data Prep FW – Data Preparation framework; MPP – Massively Parallel Processing; BI – Business Intelligence