It refers to the capability of managing data that are growing along three dimensions (volume, velocity and variety) while preserving the simplicity of the user interface. The talk describes SpagoBI's approach to the “big data” scenario and presents the SpagoBI suite roadmap, which is two-fold. It aims to address existing and emerging analytical areas and domains, providing the suite with new capabilities (including big data and open data support, in-memory analysis, real-time and mobile BI), and it follows a research path towards a new generation of the SpagoBI suite.
1. November 27 - 29, 2012
Orange Labs, Paris-Issy-les-Moulineaux, Paris
SpagoBI and Big Data:
Next Open Source Information Management suite
Monica Franceschini
SpagoBI Architect - SpagoBI Competency Center -
Engineering Group
www.spagobi.org 1
2. SpagoBI Suite now
The business intelligence comprehensive suite
All BI capabilities, many analytical areas, various engines
Unique solutions
Location Intelligence, Real-time BI and Mobile BI
Suitable for any requirement
Just-in-time BI, agile and sustainable developments
100% open source, forever
Pure open source model, no software lock-in
A complete range of support services
Pay-as-you-go support services
www.spagobi.org Creative Commons Attribution-NonCommercial-ShareAlike 3.0 2
3. What Big Data are
4. SpagoBI Basic concepts
Data Source: Data storage (DWH, Database)
Data Set: query over the Data Source to get a result that can be used across SpagoBI documents
Engines: produce different documents
Documents: different Analysis that can use the same data sets
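The relationship between these concepts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `DataSource`, `DataSet` and `Document`, the `build_demo` helper and the `sales` table are all hypothetical and are not SpagoBI's API, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3
from dataclasses import dataclass

# All names below are hypothetical illustrations, not SpagoBI classes.


@dataclass
class DataSource:
    """A connection to a data storage (SQLite stands in for a DWH)."""
    connection: sqlite3.Connection


@dataclass
class DataSet:
    """A named query over a data source; its result is reusable."""
    name: str
    source: DataSource
    query: str

    def rows(self):
        return self.source.connection.execute(self.query).fetchall()


@dataclass
class Document:
    """One analysis (report, chart, ...) rendered from a data set."""
    title: str
    data_set: DataSet

    def render(self):
        lines = [self.title]
        lines += [f"{region}: {total}" for region, total in self.data_set.rows()]
        return "\n".join(lines)


def build_demo():
    """Create one data source, one data set and two documents sharing it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 5), ("north", 7)],
    )
    totals = DataSet(
        "sales_by_region",
        DataSource(conn),
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    )
    # Two different documents use the same data set.
    return Document("Sales report", totals), Document("Sales chart", totals)
```

Calling `build_demo()` returns two documents backed by the same data set object, which mirrors the idea that a data set is defined once and reused across documents.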
5. SpagoBI and Big Data now
Talend for Big Data offers a BI ETL approach
Load, extract and improve data located in these diverse data sources and govern big data projects centrally
Talend integration enables NoSQL data management
Diagram labels: Behavioural model, Cross services, DATA MART, BIG DATA
6. Data Set and Data Sources evolution
Use of Data Source/Data Sets as connections to new Data Storages
Data Sets as PLUGINS: Analytical DBMS, NoSQL, Hadoop
Diagram labels: Behavioural model, Cross services
7. SpagoBI - Big Data Scenario #1
Scenario #1
Data Sources: distributed data storages for Big Data such as Hadoop HDFS, HBase, Hive
Data Sets: with query languages depending on the data source, such as HQL, Cloudera Impala, Pig
Not Real Time Analysis
Diagram labels: Behavioural model, Cross services, Cloudera Impala, HiveQL, Pig Latin, Apache Drill, BIG DATA
8. SpagoBI - Big Data Scenario #2
Scenario #2
Complex Event Processing (CEP) Engine
Data Source populated with Storm/S4/Druid
Data Set with Esper (EPL query language)
Real Time Analysis
Console document: advanced platform for the monitoring of heterogeneous services and applications and the historical analysis of data
Diagram labels: BYTES STREAM
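To make the real-time scenario more concrete, the sketch below shows roughly what a time-window aggregate over an event stream computes, which is the kind of result a CEP data set could feed to a monitoring document. It is plain Python, not Esper or EPL; the class name `TimeWindowAverage` and the choice of an average are assumptions made for illustration.

```python
from collections import deque


class TimeWindowAverage:
    """Running average of event values over the last `window_seconds`.

    Hypothetical helper for illustration; timestamps must be non-decreasing
    and `window_seconds` must be positive.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        """Record one event and return the average over the current window."""
        self._events.append((timestamp, value))
        self._total += value
        # Evict events that are at least `window_seconds` older than this one.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total / len(self._events)
```

Each incoming event updates the result immediately, so a console-style document can refresh as the stream arrives instead of waiting for a batch query to finish.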
9. SpagoBI - Big Data Scenario #3
Scenario #3
Data Sources: Storages for RDF-semantic format
Data Sets: Interrogation with SPARQL syntax or others, depending on the data source (HBase, Hadoop...)
Not Real Time/Real Time depending on the storage
Linked Data Analysis
Diagram labels: SEMANTIC
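The core idea behind querying RDF-style data is matching triple patterns with variables against stored (subject, predicate, object) triples. The sketch below illustrates only that idea in plain Python: it is not a SPARQL engine, and the `match` function, the `ex:` identifiers and the sample triples are hypothetical.

```python
# Tiny in-memory triple list; a stand-in for an RDF storage.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:paris"),
]


def match(pattern, triples=TRIPLES):
    """Return one binding dict per triple that matches the pattern.

    Pattern terms starting with '?' are variables; other terms must be equal
    to the corresponding triple term.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results
```

For example, `match(("?who", "ex:worksFor", "ex:acme"))` yields one binding for each subject that works for `ex:acme`, which is the same shape of answer a single-pattern SPARQL `SELECT` would return.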
10. SpagoBI & Big Data – the complete view
Diagram labels: SEMANTIC, BIG DATA, BYTES STREAM
11. Contacts
Visit SpagoBI website:
www.spagobi.org
Contact us:
spagobi@eng.it
Download SpagoBI from OW2 Forge:
http://forge.ow2.org/
or visit us at Engineering Group's stand