www.infocepts.com
BI Reporting SF Bay User Group
08 July 2014
BI Reporting for Bay Area Start-ups
Presented by: Scott Mitchell
DWApplications
Presenter – Scott Mitchell Background
 Currently based in San Francisco Bay Area
 Consultant – Working for DWApplications
 Partnering with Infocepts for their off-shore blended staffing capacity
 BI and DW Experience
 Started working with BI/DW tools in 1997 (17yrs)
 Worked on all sides of the fence
 Reporting, DBA, ETL, Solution Architect
 Significant experience in Agile BI Application Integration
 Previous Implementations
 Start-ups - ePredix, Telephia/Nielsen Mobile, Quantros, mFoundry/FIS, TradePulse, iQ-ity
 Enterprise - Victoria's Secret, eBay, Ross, Safeway, Bank of America, VISA
BIG Data
Big Data Agenda
• BIG Data
 Standard BIG Data Reference Architecture
 5/7 Vs of BIG Data
 Hadoop Ecosystem
 Connecting Hadoop
 Components of Hadoop Ecosystem
• BIG Data Questions
 What Hadoop can do Vs What Hadoop can’t do?
 When to use Hadoop
 When not to use Hadoop
 When BIG Data over RDBMS
 Can Big Data and traditional RDBMS co-exist?
 RDBMS or BIG Data or both?
 Real time Analytics using Big Data
• BIG Data Platform Comparison
• BI Tool Comparison
Standard BIG Data Reference Architecture
http://thinkbiganalytics.com/leading_big_data_technologies/big-data-reference-architecture/
5 Vs of BIG Data
 Volume: This is the aspect that comes to most people’s minds when they think of
Big Data. Volumes of data have increased exponentially in recent times. It is not
uncommon for businesses to deal with petabytes of data, and typically analysis is
performed over the entire data set, not just a sample
 Velocity: Big Data is not just about the volume though. Just as important is the rate
of change of the data. For a large volume of data which doesn’t change very often,
analysis that takes a number of hours or days to complete may be acceptable, but if
the dataset is growing by terabytes per day, or the data is changing at a high rate of
speed, the processing time of analysis becomes much more important
 Variety: Big Data is not always structured, and it is not always easy to put it into
a relational database. Big Data includes data types such as videos, music files,
emails, unstructured Word documents and social media feeds. Dealing with a
variety of structured and unstructured data greatly increases the complexity of
both storing and analyzing Big Data.
 Veracity: When we are dealing with a high volume, velocity and variety of data, it
is inevitable that not all of the data is going to be 100% correct – there will be
dirty data. The question is, how clean is good enough for the analysis to be
performed? Often the data does not need to be perfect, but does need to be close
enough to gain relevant insight. Depending on the application, the veracity, or
verification, of the data may be essential or simply “nice to have”.
 Value: This is the most important aspect of big data. It costs a lot of money to
implement IT infrastructure systems to store big data, and businesses are going to
require a return on investment. At the end of the day, if you can’t extract value
from your data, there is no point in building the capability to store and manage it.
Additional Vs – Part of 7Vs of BIG Data
Additionally, some experts also add:
 Validity: Interpreted data should have a sound basis in logic or fact, resulting
from logical inferences drawn from matching data. One of the most common errors is
confusing correlation with causation, so the context of the data becomes
very important.
 Visibility: The state of being able to see or be seen. Data from
disparate sources needs to be stitched together so that it is visible to the
technology stack making up Big Data. Critical data that is available but
not visible to the processes of Big Data may be one of the Achilles' heels of the Big
Data paradigm. Conversely, unauthorized visibility is a risk.
Hadoop Ecosystem
[Figure: Hadoop ecosystem diagram. Legend: components that can directly use YARN; components using the MapReduce framework; SQL-based database tools]
Connecting Hadoop
[Figure: BI tools, ETL tools and databases connecting to Hadoop. Connection labels: JDBC/ODBC, JDBC/ODBC/Native]
Modules of Hadoop Ecosystem
• Hadoop Distributed File System (HDFS): A distributed file system that provides
high-throughput access to application data
• Hadoop YARN: A framework for job scheduling and cluster resource
management
• Hadoop Common: The common utilities that support the other Hadoop
modules
• Hadoop MapReduce: A YARN-based system for parallel processing of large data
sets
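To make the MapReduce module more concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps using a word count. It only mimics the programming model in plain Python; it is not Hadoop code, and the function names are hypothetical choices for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big insight", "data at rest"]
    print(word_count(sample))
```

On a real cluster the map and reduce calls run in parallel on different nodes, which this sketch does not attempt to show.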
Other Components
• HBase: A scalable, distributed database that supports structured data storage
for large tables
• Hive: A data warehouse infrastructure that provides data summarization and ad
hoc querying
• Pig: A high-level data-flow language and execution framework for parallel
computation. Used for constructing extract, transform, load (ETL) data flows
• ZooKeeper: A high-performance coordination service for distributed applications
• Ambari: A web-based tool for provisioning, managing, and monitoring Apache
Hadoop clusters, which includes support for Hadoop HDFS, Hadoop MapReduce,
Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a
dashboard for viewing cluster health (for example, heatmaps) and the ability to view
MapReduce, Pig and Hive applications visually, along with features to diagnose
their performance characteristics in a user-friendly manner
Gartner’s 12 Dimensions of Big Data – Extreme Information
There are three tiers of Information management in the model with four
dimensions in each tier
Quantification
Physical data characteristics with reference to:
 Variety
 Various structures of data from different data sources: unstructured (from
websites, sensors, social media, etc.), semi-structured (from XML, web services, etc.)
and structured (from transactional systems)
 Velocity
 Speed of data collection, processing and access: real-time, near real-time,
historical and older. When data grows by terabytes per day or changes at a high
rate, the processing time of analysis becomes much more important
 Volume
 High volume of data generated during different timeframes
 Complexity
 Individual data sets with different standards, business domain rules and storage formats
for each asset type
Access Enablement and Control
Information control based on the nature of the data and the information it
provides (for example, confidential HR, Finance and Sales data, customer details,
negative tweets, etc.)
 Classification
 Classification of data into classes depending on the information it contains (sensitive,
non-sensitive, private, public, etc.)
 Contracts
 Rules from the enterprise data governance framework that allow access to specific data (for
example, agreements on who will share what information and how)
 Pervasiveness
 Spread and availability of data across levels of the organization, depending on
organizational requirements and the details of the information in the data (for example, how
long data remains active, how long an aggregation of data is valid for summary reports, when
data refreshes, etc.)
 Technology-enablement (specifications for tools and technology)
 Controlling which functions of tools and technologies users are empowered to use to
get information from data (for example, security roles in MicroStrategy)
Qualification and Assurance
 Fidelity
 Reliability of data source and authenticity of data.
 Linked data
 Association of data with its context (Affiliation)
 Validation of data
 Validity of data for its business use case and rules.
 Perishability
 Longevity, i.e. how long data is relevant to its context and analysis?
 Aging of data while retaining its state and originality
BIG Data Questions
When to use Hadoop?
 Your Data Sets Are Really Big – If data is in GBs, use Excel, a SQL BI tool on Postgres,
or some similar combination, but if data is in TBs or Petabytes, Hadoop’s superior
scalability will save you a considerable amount of time and money
 You Celebrate Data Diversity – It doesn’t matter whether your raw data is structured
(like out of an ERP system), semi-structured (like XML and log files), unstructured
(like video files) or all three–Hadoop and its forgiving schema will gobble it up
 You Have Strong Programming Skills – Hadoop is written in Java, and therefore
requires Java programming skills to master. That is changing with new tools in the
Hadoop ecosystem, but right now it largely remains a venue for excellent Java skills
 You Are Building an ‘Enterprise Data Hub’ for the Future – If you work for a large
enterprise, you might sign up for Hadoop even if your data isn’t particularly massive
or diverse or fast at this point in time. It might make sense to start experimenting
with Hadoop now, to be ready to take advantage when the elephant really takes off
and goes mainstream in a few years.
 You Find Yourself Throwing Away Perfectly Good Data – Hadoop can store petabytes
of data. If you find that you are throwing away potentially valuable data because it
costs too much to archive, you may find that setting up a Hadoop cluster allows you
to retain this data, and gives you the time to figure out how best to make use of that
data
When not to use Hadoop?
 You Want to Store Sensitive Data – One of the things that Hadoop is not particularly
good at today is storing sensitive data. Hadoop today has only basic data and user access
security. While these features are improving by the month, the risk of
accidentally losing personally identifiable information due to Hadoop’s
less-than-stellar security capabilities is probably not worth taking
 You Want to Replace Your Data Warehouse – A majority of data professionals still say that
Hadoop is complementary to a traditional data warehouse, not a replacement for it.
The superior economics of Hadoop-based storage make it an excellent place to land
raw data and pre-process it before siphoning it over to a traditional data warehouse
to run analytic workloads
 You Want to Delete or Update Data Frequently – As of this writing, Hive does not support
DELETE and UPDATE commands, so if there is a business need where frequently deleting or
updating data is paramount, Hadoop is not the way to go
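A rough illustration of why frequent deletes are costly on write-once storage: without an in-place DELETE, removing a few rows means reading the whole partition and writing the survivors back, similar in spirit to rewriting a Hive table or partition with a filtered INSERT OVERWRITE. The sketch below simulates that pattern in plain Python; the function name and the sample rows are hypothetical, and no Hive API is involved.

```python
def delete_by_rewrite(partition_rows, should_delete):
    """Return the rewritten partition and the number of rows that were scanned.

    Write-once storage cannot modify a row in place, so a "delete" reads
    every row and writes the surviving rows to a new copy of the partition.
    """
    survivors = [row for row in partition_rows if not should_delete(row)]
    rows_scanned = len(partition_rows)
    return survivors, rows_scanned


if __name__ == "__main__":
    # Hypothetical partition with five rows; we delete exactly one of them.
    partition = [{"id": i, "status": "active"} for i in range(5)]
    survivors, scanned = delete_by_rewrite(partition, lambda row: row["id"] == 3)
    print(len(survivors), "rows kept after scanning", scanned)
```

The cost of the rewrite grows with the size of the partition, not with the number of rows deleted, which is why workloads with frequent small updates fit an RDBMS better.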
When to use BIG Data technologies over RDBMS? When you
can no longer achieve the desired results with your RDBMS
• When data is highly unstructured. Ex: scanner data, social media data, streaming
data, videos, documents, tweets, photos, etc.
• When data is huge in volume and complexity (greater than 1 TB and complex
data)
• Customers adopt BIG Data for specific roles – especially exploratory data-science
sandboxes and unstructured data staging
And for some more technical, issue-oriented reasons:
• Count Distinct Queries: A count distinct query by definition has to process every
record, including sorting and counting. This becomes a difficult problem when
the volume of data is huge. Mixing one or more such distinct aggregates with
non-distinct aggregates in the same select list, or mixing two or more distinct aggregates,
causes more performance issues, as it leads to spooling and re-reading of
intermediate results.
• Cursors: A cursor steps through a table row by row in a
database. If you are doing analysis with some kind of CASE statement applied through a
cursor to each row, and the table is of any significant size, this is a
very bad situation. Cursors are good for iterating through small metadata tables.
RDBMSs are not optimized for stepping through large datasets one entry at a
time
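The cursor point can be illustrated with a small sketch that applies the same CASE-style classification two ways: once row by row from application code, and once as a single set-based statement. It uses Python's built-in sqlite3 module with a made-up `orders` table; the table, column names and the 250 threshold are assumptions for illustration only.

```python
import sqlite3


def build_db(n_rows=1000):
    """Create an in-memory table of hypothetical orders with an empty tier column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, tier TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, amount) VALUES (?, ?)",
        [(i, float(i % 500)) for i in range(n_rows)],
    )
    return conn


def classify_row_by_row(conn):
    """Cursor style: fetch each row, decide in application code, update one row at a time."""
    rows = conn.execute("SELECT id, amount FROM orders").fetchall()
    for row_id, amount in rows:
        tier = "high" if amount >= 250 else "low"
        conn.execute("UPDATE orders SET tier = ? WHERE id = ?", (tier, row_id))


def classify_set_based(conn):
    """Set-based style: one statement with CASE lets the engine process all rows."""
    conn.execute(
        "UPDATE orders SET tier = CASE WHEN amount >= 250 THEN 'high' ELSE 'low' END"
    )


def tier_counts(conn):
    """Summarize how many rows landed in each tier."""
    return dict(conn.execute("SELECT tier, COUNT(*) FROM orders GROUP BY tier"))


if __name__ == "__main__":
    a, b = build_db(), build_db()
    classify_row_by_row(a)
    classify_set_based(b)
    print(tier_counts(a), tier_counts(b))
```

Both functions produce the same result; the row-by-row version issues one statement per row, which is the pattern that becomes painful as the table grows.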
• Alter Table: You have a big customer data warehouse with a table X.
This table X is so big and so important, with so many columns, that if you want to
alter it by adding a column, changing a column data type or running any large DML
operation, it would require a long time to complete. Such operations need to be
planned and done very carefully, as they can lock the table for the whole
operation until the statement completes. In addition, if the column that you are
adding has a NOT NULL clause, it would be very painful, as the DBMS has to insert
default values into all of the existing rows, which may overburden your transaction
logs.
• Data Merge and Mashup (Structured meeting Unstructured): Most retailers today
have both an online and an in-store presence. Consider a scenario where you have
customers' online product search data (search logs) from the retailer’s website for the
last 15 days, their past in-store purchase history (RDBMS), their in-store charge card
transaction data and their daily commute pattern data that you have from their
cellphone provider. If you want to build an analytical model that
sends custom discount offers that are valid in a
specific store located along the customer’s daily commute path, then you would need
to combine all of these sources of data. It’s difficult to deal with
unstructured data using an RDBMS, let alone combine unstructured data with
structured data.
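As a toy version of the retailer mashup described above, the sketch below parses free-text search log lines, joins them with structured purchase rows, and keeps only customers with a store on their commute. The log format, field names and sample values are all hypothetical; a real pipeline would run this kind of join on a cluster rather than in one process.

```python
import re
from collections import defaultdict

# Hypothetical log format: "<date> customer=<id> search=<free text>"
LOG_PATTERN = re.compile(r"customer=(?P<cid>\d+)\s+search=(?P<term>.+)$")


def parse_search_logs(lines):
    """Turn raw log lines into {customer_id: [search terms]}, skipping bad lines."""
    searches = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.search(line.strip())
        if match:
            searches[int(match.group("cid"))].append(match.group("term").lower())
    return searches


def build_offers(purchases, searches, stores_on_commute):
    """Join the three sources on customer id and propose one offer per customer."""
    offers = []
    for cid, terms in searches.items():
        bought = {p["product"] for p in purchases if p["customer_id"] == cid}
        wanted = [term for term in terms if term not in bought]
        store = stores_on_commute.get(cid)
        if wanted and store:
            offers.append({"customer_id": cid, "product": wanted[0], "store": store})
    return offers


if __name__ == "__main__":
    logs = [
        "2014-07-01 customer=7 search=Running Shoes",
        "unparseable line",
        "2014-07-02 customer=7 search=rain jacket",
        "2014-07-02 customer=9 search=tent",
    ]
    purchases = [{"customer_id": 7, "product": "running shoes"}]
    stores = {7: "Store 12"}
    print(build_offers(purchases, parse_search_logs(logs), stores))
```

The unstructured source needs a parsing step before it can be joined at all, which is the part that is awkward to express inside an RDBMS.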
When using Big Data technologies like Hadoop and Hive, do
we still need a standard RDBMS to perform analytics? No
Hive is essentially a data warehouse infrastructure that provides data
summarization and ad hoc querying. It performs the role of a data warehouse
platform for using the organization’s structured data in the Hadoop ecosystem. The
long-term Hadoop vision is that an organization can rely completely on the Hadoop
ecosystem for analytics, even in the absence of an RDBMS.
However, right now:
- Hadoop is IT-heavy and business users need IT hand-holding
- It lacks highly accessible self-service tools for business users
- Hadoop does not have extensive pre-existing adapters for ERP systems
- It would require significant investment to rewrite the advanced ETL feeding the DW
Do I need an RDBMS or a BIG Data database or both? It varies
from one organization to another
As organizations become aware of their data and their needs, they will be in a
better position to decide which technology fits their requirements. As covered
earlier, structured vs. unstructured data and the volume and complexity of data are
major attributes that can help in deciding.
How close can we get to real-time analytics using BIG Data
technologies (rather than having to move data through ETL
processes)? Truly real-time or streaming
analytics is possible with BIG Data
The Hadoop ecosystem already has many customer examples where the
analytics is truly real-time or streaming.
This recently concluded Hadoop Summit keynote shows how a large trucking
company tracks events such as starting and stopping, and violations such as speeding,
excessive braking and unsafe tail distance, while trucks are on the road
delivering goods.
The system also gives interactive access to historical data, for example to see how
other routes have performed on violations.
http://hadoopsummit.org/san-jose/keynote-day2/
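The truck-tracking scenario can be sketched as a small streaming loop that raises an alert the moment a reading crosses a limit, instead of waiting for a batch load. This is a single-process illustration only: the `TruckEvent` fields, the 65 mph limit and the function name are assumptions, not details from the keynote.

```python
from dataclasses import dataclass


@dataclass
class TruckEvent:
    truck_id: str
    timestamp: int      # seconds since the start of the trip (assumed unit)
    speed_mph: float


def detect_speeding(events, limit_mph=65.0):
    """Yield (truck_id, timestamp, running_count) for each speeding reading.

    Events are consumed one at a time, so alerts are produced while the
    stream is still arriving rather than after a batch load.
    """
    violations = {}
    for event in events:
        if event.speed_mph > limit_mph:
            violations[event.truck_id] = violations.get(event.truck_id, 0) + 1
            yield event.truck_id, event.timestamp, violations[event.truck_id]


if __name__ == "__main__":
    stream = [
        TruckEvent("T1", 0, 60.0),
        TruckEvent("T1", 10, 70.0),
        TruckEvent("T2", 12, 80.0),
        TruckEvent("T1", 20, 72.0),
    ]
    for alert in detect_speeding(stream):
        print(alert)
```

The running count per truck stands in for the historical view mentioned above; a production system would keep that state in a store that survives restarts.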
Can we replace RDBMS with BIG Data databases some
day? Yes and No
Why Yes?
• BIG Data ecosystems like Hadoop already have components that can handle
unstructured as well as traditional structured data.
• RDBMS is expensive, even with a terabyte or two of data. The license fees and
hardware needed to run even a 2-3 TB DWH and BI solution can be massive for an
RDBMS-based system. BIG Data technologies are quickly filling this gap, offering
stable ecosystems without hurting performance or budget.
Why No?
• RDBMS has been around for ages, is mature and has a lot of helpful tools.
“Transactional applications” are still the one thing that RDBMS handles best, and
we don’t see anything yet from the BIG Data technologies that tackles them as well.
• Hadoop’s inventor Doug Cutting agrees. He recently opined that Hadoop is
"augmenting and not replacing". He mentions things like doing payroll: the real
nuts-and-bolts things for which people have been using RDBMS will not be a
good fit for Hadoop or other BIG Data platforms
Augment your EDW with Hadoop adding new capabilities/insight
- Continue to store summary structured data from your OLTP and back
office systems into the EDW.
- Store unstructured data that does not fit nicely into “Tables” in Hadoop.
This means all the communication with your customers, from phone logs,
customer feedback, GPS locations, photos, tweets, emails, text messages,
etc., can be stored in Hadoop. You can store this a lot more cost-effectively in
Hadoop.
- Correlate data in your EDW with the data in your Hadoop cluster to get
better insight about your customers, products, equipment, etc. You can now
use this data for analytics that are computation-intensive, such as clustering
and targeting. Run ad hoc analytics and models against your data in
Hadoop while you are still transforming and loading your EDW.
- Do not build Hadoop capabilities within your enterprise in a silo. Hadoop
and other big data technologies should work in tandem with and extend
the value of your existing data warehouse and analytics technologies.
- Data warehouse vendors are adding capabilities of Hadoop and
MapReduce into their offerings while Hadoop is trying to take on more
traditional DW activities
Big Data Tool Comparison
Big Data Technologies Comparison
Features | Cassandra | HBase | Hive | MongoDB
Description | Wide-column store based on ideas of BigTable and DynamoDB | Wide-column store based on Apache Hadoop and on concepts of BigTable | Data warehouse software for querying and managing large distributed datasets, built on Hadoop | One of the most popular document stores
Developer | Apache Software Foundation | Apache Software Foundation | Apache Software Foundation | MongoDB, Inc
Initial release | 2008 | 2008 | 2012 | 2009
License | Open Source | Open Source | Open Source | Open Source
Implementation language | Java | Java | Java | C++
Server operating systems | BSD | Linux, Unix, Windows | All OS with a Java VM | Linux, OSX, Solaris, Windows
Database model | Wide column store | Wide column store | Relational DBMS | Document store
Data scheme | schema-free | schema-free | Yes | schema-free
Transaction concepts | No | No | No | No
Name | Cassandra | HBase | Hive | MongoDB
Typing | Yes | No | Yes | Yes
Secondary indexes | restricted | No | Yes | Yes
SQL | No | No | No | No
APIs and other access methods | Proprietary protocol | Java API, RESTful HTTP API, Thrift | JDBC, ODBC, Thrift | Proprietary protocol using JSON
Partitioning methods | Sharding | Sharding | Sharding | Sharding
Durability | Yes | Yes | Yes | Yes
Server-side scripts | No | Yes | Yes | JavaScript
Triggers | Yes | Yes | No | No
Replication methods | Selectable replication factor | Selectable replication factor | Selectable replication factor | Master-slave replication
MapReduce | Yes | Yes | Yes | Yes
Features | Cassandra | HBase | Hive | MongoDB
Supported programming languages | C#, C++, Clojure, Erlang, Go, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala | C, C#, C++, Groovy, Java, PHP, Python, Scala | C++, Java, PHP, Python | Actionscript, C, C#, C++, Clojure, ColdFusion, D, Dart, Delphi, Erlang, Go, Groovy, Haskell, Java, JavaScript, Lisp, Lua, MatLab, Perl, PHP, PowerShell, Prolog, Python, R, Ruby, Scala, Smalltalk
Consistency concepts | Eventual Consistency, Immediate Consistency | Immediate Consistency | Eventual Consistency | Eventual Consistency, Immediate Consistency
Foreign keys | No | No | No | No
Concurrency | Yes | Yes | Yes | Yes
User concepts | Access rights for users can be defined per object | Access Control Lists (ACL) | Access rights for users, groups and roles | Users can be defined with full access or read-only access
BI Tool Comparison
BI Landscape
Vendor Category | Vendor Products
Megavendors | IBM, Microsoft, Oracle, SAP
Large Independent Vendors | Information Builders, MicroStrategy, SAS
Data Discovery Vendors | Qlik, Tableau, Tibco Spotfire
Open Source | Actuate, Jaspersoft, Pentaho
SaaS | Birst
Small Independent Vendors | Bitam, Salient, Panorama, Logi Analytics, Targit, GoodData, arcplan, Infor, Alteryx, Pyramid Analytics, Board International, Prognoz, Yellowfin
Gartner’s 17 Categories
Information Delivery
1. Reporting – Ability to create print-ready and interactive reports
2. Dashboards – Multi-object, linked reports in an intuitive and interactive display.
3. Ad hoc report/query – Ability for end users to create their own reports
4. Microsoft Office Integration – How the tool integrates with the Office suite
5. Mobile BI – Ability to deliver to mobile devices using the native features of
the device
Analysis
6. Interactive Visualization – Exploring the data that goes beyond pie/bar charts.
Includes heat maps, geographic maps, scatter plots, etc.
7. Search-based Data Discovery – Easily search structured and unstructured data
sources.
8. Geospatial and Location Intelligence – Ability to show relationships on
interactive maps using geographic, spatial and time information.
9. Embedded Advanced Analytics – Leverages statistical function libraries,
Predictive Model Markup Language (PMML) and R-based models.
10. OLAP – Fast, multidimensional access and manipulation of the data.
Integration
11. BI Infrastructure and Administration – Shared security, metadata,
administration, object model, query engine and scheduling/distribution.
12. Metadata Management (MDM) – Centralized and robust way to
administer/manage dimensions, facts, performance, report layouts, etc.
13. Business User Data Mashup and Modeling – Code-free, drag-and-drop,
user-driven ability to mix and match different data.
14. Development Tools – Programmatic and visual tools for developing reports,
dashboards and analysis.
15. Embeddable Analytics – Includes software development kit (SDK) for truly
customizing, porting and embedding analysis both within and outside the
platform.
16. Collaboration – Ability to share and discuss.
17. Support for Big Data – Ability to query hybrid, columnar and array-based data
sources – MapReduce and NoSQL databases.
BI Platforms Comparison - Gartner
Actuate
Strengths:
• Release of Birt iHub 3 – consistent, streamlined interface with better integration across product line
• Expanded big data connectivity and mashup capabilities
• Functionality and ease of use rated high
Weaknesses:
• Deterioration of market understanding, user experience and contract experience
• Overall product capability score below average
• Not highly used for dashboarding, ad hoc analysis and interactive visualization/discovery

Jaspersoft
Strengths:
• End-to-end BI
• First pay-as-you-go BI server on AWS
• Low cost of ownership
Weaknesses:
• Capabilities scored below average
• Used narrowly in organizations
• Below average data volumes
• Embeddable analytics and advanced analytics

Pentaho
Strengths:
• Low cost of ownership
• Ranked high for development tools
• Investing and launching emerging analytic application capabilities – Big Data Layer, Instaview, Storm and Splunk
Weaknesses:
• Customer experience, product quality and support below average
• Difficult to use and implement

Qlik
Strengths:
• Launch of redesigned visualization experience – Natural Analytics (Q3/Q4 2014)
• Ease of use for analysis and development
• Associative search eliminates some complex SQL
• Strong on dashboards, visualizations, mashups, collaboration, mobile and big data support
Weaknesses:
• Not enterprise-ready – lacks MDM, infrastructure and embeddability
• Limited compared to other stand-alone data vendors in visual-based interactive exploration and analysis
• Major rearchitecting poses risks to current customers – could lose market traction

Tableau
Strengths:
• Highly intuitive, visual-based data discovery, dashboarding, and data mashup capabilities
• High customer satisfaction and experience
• Reusability, scalability and embeddability
• Wide range of support for data access
Weaknesses:
• Used as a complement, not the standard
• Inflexible in negotiations / high maintenance fees
• Ability to address governance and broader BI functionality a work in progress

MicroStrategy
Strengths:
• Go-to platform to handle the most complex deployments
• Organic integration and superior product quality
• Choice where mobile is a strategic requirement
• Big Data integration
• Visual data discovery and multi-TB, in-memory engine (in dev)
Weaknesses:
• Steep initial learning curve (Mobile/VI combating that)
• Cost of software
• Longest to develop reports (along w/SAP)
• Blurred marketing message

SAP BusinessObjects
Strengths:
• Large deployments and enterprise BI standards – integration key
• Heavy investing in visual data discovery/embeddable analytics
• Expansion of BI Customer Success initiative
Weaknesses:
• Hard to use and do complex analysis
• Software quality/difficult to migrate
• High cost and hard sale
• Integration concerns/questions on BI commitment

IBM Cognos
Strengths:
• Handles some of the largest deployments
• Watson Analytics (2014) – smart data discovery
• Simplified licensing modeling
Weaknesses:
• Unrecognizable differentiation in market
• Cost, poor performance, lack of ease-of-use and support quality all customer concerns
• Scores low/not reaching business benefits

Oracle BI EE
Strengths:
• Leader in information management
• Integration, pre-built solutions and large scale deployments
• Large network of partners
Weaknesses:
• Unavailability of complex types/advanced analytics
• Requires sophisticated BI-related competencies
• Scores low in quality and late with mobile

Tibco
Strengths:
• Aims to stay ahead of the curve with aggressive development/acquisition
• Quality, functionality and ease of use rated high
• Used for complex analyses
Weaknesses:
• Large, complex reports take a long time to develop
• Dashboards rated average
• Administration, development and MDM rated below average
• Support staff coverage not always adequate

Microsoft
Strengths:
• Ubiquitous BI across products - it is already there and being used
• Attractive packaging and pricing
• Investing heavily in cloud
• Excel widely used and accelerated investments in feature releases
Weaknesses:
• Mobile BI, interactive visualization and MDM are product weaknesses
• Multiproduct complexity = on-premises or hybrid deployments
• Do-it-yourself approach – onus is on customer

Weitere ähnliche Inhalte

Was ist angesagt?

SqlSaturday#699 Power BI - Create a dashboard from zero to hero
SqlSaturday#699 Power BI - Create a dashboard from zero to heroSqlSaturday#699 Power BI - Create a dashboard from zero to hero
SqlSaturday#699 Power BI - Create a dashboard from zero to heroVishal Pawar
 
Learn why Microsoft Power BI is an Undisputed Market Leader?
Learn why Microsoft Power BI is an Undisputed Market Leader?Learn why Microsoft Power BI is an Undisputed Market Leader?
Learn why Microsoft Power BI is an Undisputed Market Leader?Visual_BI
 
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Igor De Souza
 
Business Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureBusiness Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureInfosys
 
Ibm machine learning for z os
Ibm machine learning for z osIbm machine learning for z os
Ibm machine learning for z osCuneyt Goksu
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Bi presentation to bkk
Bi presentation to bkkBi presentation to bkk
Bi presentation to bkkguest4e975e2
 
Traditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overviewTraditional Data-warehousing / BI overview
Traditional Data-warehousing / BI overviewNagaraj Yerram
 
20100430 introduction to business objects data services
20100430 introduction to business objects data services20100430 introduction to business objects data services
20100430 introduction to business objects data servicesJunhyun Song
 
Business objects data services in an sap landscape
Business objects data services in an sap landscapeBusiness objects data services in an sap landscape
Business objects data services in an sap landscapePradeep Ketoli
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Modernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data PipelinesModernize & Automate Analytics Data Pipelines
Modernize & Automate Analytics Data PipelinesCarole Gunst
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameCloudera, Inc.
 
Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher   Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher Tamir Dresher
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataHortonworks
 
Chug building a data lake in azure with spark and databricks
Chug   building a data lake in azure with spark and databricksChug   building a data lake in azure with spark and databricks
Chug building a data lake in azure with spark and databricksBrandon Berlinrut
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesJames Serra
 
Data warehouse 101-fundamentals-
Data warehouse 101-fundamentals-Data warehouse 101-fundamentals-
Data warehouse 101-fundamentals-AshishGuleria
 

Big Data and BI Tools - BI Reporting for Bay Area Startups User Group

  • 1. BI Reporting SF Bay User Group, 08 July 2014. BI Reporting for Bay Area Start-ups. Presented by: Scott Mitchell, DWApplications
  • 2. Presenter – Scott Mitchell Background
      - Currently based in the San Francisco Bay Area
      - Consultant, working for DWApplications
          - Partnering with Infocepts for their off-shore blended staffing capacity
      - BI and DW experience
          - Started working with BI/DW tools in 1997 (17 years)
          - Worked on all sides of the fence: reporting, DBA, ETL, solution architect
          - Significant experience in Agile BI application integration
      - Previous implementations
          - Start-ups: ePredix, Telephia/Nielsen Mobile, Quantros, mFoundry/FIS, TradePulse, iQ-ity
          - Enterprise: Victoria's Secret, eBay, Ross, Safeway, Bank of America, VISA
  • 4. Big Data Agenda
      - BIG Data
          - Standard BIG Data Reference Architecture
          - 5/7 Vs of BIG Data
          - Hadoop Ecosystem
          - Connecting Hadoop
          - Components of Hadoop Ecosystem
      - BIG Data Questions
          - What Hadoop can do vs. what Hadoop can’t do
          - When to use Hadoop
          - When not to use Hadoop
          - When BIG Data over RDBMS
          - Can Big Data and traditional RDBMS co-exist?
          - RDBMS or BIG Data or both?
          - Real-time analytics using Big Data
      - BIG Data Platform Comparison
      - BI Tool Comparison
  • 5. Standard BIG Data Reference Architecture (diagram; source: http://thinkbiganalytics.com/leading_big_data_technologies/big-data-reference-architecture/)
  • 6. 5 Vs of BIG Data
      - Volume: This is the aspect that comes to most people’s minds when they think of Big Data. Volumes of data have increased exponentially in recent times. It is not uncommon for businesses to deal with petabytes of data, and analysis is typically performed over the entire data set, not just a sample.
      - Velocity: Big Data is not just about volume. Just as important is the rate of change of the data. For a large volume of data that doesn’t change very often, analysis that takes hours or days to complete may be acceptable; but if the dataset is growing by terabytes per day, or the data is changing at a high rate, the processing time of the analysis becomes much more important.
      - Variety: Big Data is not always structured, and it is not always easy to put into a relational database. It includes data types such as videos, music files, emails, unstructured Word documents and social media feeds. Dealing with a variety of structured and unstructured data greatly increases the complexity of both storing and analyzing Big Data.
  • 7. 5 Vs of BIG Data (continued)
      - Veracity: When we are dealing with a high volume, velocity and variety of data, it is inevitable that not all of the data will be 100% correct; there will be dirty data. The question is how clean is good enough for the analysis to be performed. Often the data does not need to be perfect, but it does need to be close enough to yield relevant insight. Depending on the application, the veracity, or verification, of the data may be essential or simply “nice to have”.
      - Value: This is the most important aspect of Big Data. It costs a lot of money to implement the IT infrastructure to store Big Data, and businesses are going to require a return on investment. At the end of the day, if you can’t extract value from your data, there is no point in building the capability to store and manage it.
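The veracity question ("how clean is good enough?") can be made concrete as a threshold on the fraction of valid records. The following is a minimal sketch, assuming a per-record validity rule and an acceptance threshold chosen by the analyst; the function names, the `amount` field and the thresholds are hypothetical and do not come from the slides.

```python
from typing import Callable, Iterable, Mapping, Optional

Record = Mapping[str, Optional[float]]


def clean_fraction(records: Iterable[Record],
                   is_valid: Callable[[Record], bool]) -> float:
    """Return the share of records that pass the validity rule (0.0 if empty)."""
    rows = list(records)
    if not rows:
        return 0.0
    return sum(1 for row in rows if is_valid(row)) / len(rows)


def good_enough(records: Iterable[Record],
                is_valid: Callable[[Record], bool],
                threshold: float) -> bool:
    """Decide whether the data is clean enough for the analysis at hand."""
    return clean_fraction(records, is_valid) >= threshold


def has_non_negative_amount(row: Record) -> bool:
    """Hypothetical rule: the 'amount' field must be present and non-negative."""
    amount = row.get("amount")
    return amount is not None and amount >= 0


if __name__ == "__main__":
    sample = [{"amount": 10.0}, {"amount": None}, {"amount": 5.5}, {"amount": 0.0}]
    print(clean_fraction(sample, has_non_negative_amount))
    print(good_enough(sample, has_non_negative_amount, threshold=0.7))
```

An exploratory analysis might accept a lower threshold than a financial report would; the point of the sketch is only that the cut-off is an explicit, per-application choice.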
  • 8. Additional Vs (part of the 7 Vs of BIG Data)
      Some experts also add:
      - Validity: Whether the interpreted data has a sound basis in logic or fact; it results from the logical inferences drawn from matching data. One of the most common errors is confusing correlation with causation. The context of the data becomes very important.
      - Visibility: The state of being able to see or be seen. Data from disparate sources needs to be stitched together so that it is visible to the technology stack making up Big Data. Critical data that is otherwise available, but not visible to the processes of Big Data, may be one of the Achilles’ heels of the Big Data paradigm. Conversely, unauthorized visibility is a risk.
  • 9. Hadoop Ecosystem (diagram; legend: components that can directly use YARN, components using the MapReduce framework, SQL-based database tools)
  • 10. Connecting Hadoop (diagram; labels: BI Tools, ETL Tools, JDBC/ODBC, JDBC/ODBC/Native, Databases)
  • 11. Modules of Hadoop Ecosystem
      - Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data
      - Hadoop YARN: a framework for job scheduling and cluster resource management
      - Hadoop Common: the common utilities that support the other Hadoop modules
      - Hadoop MapReduce: a YARN-based system for parallel processing of large data sets
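To make the MapReduce module listed above more tangible, here is a minimal single-process sketch of the map, shuffle and reduce phases, using the classic word-count example. It is plain Python for illustration only: it does not use the Hadoop API, and the function names are hypothetical. On a real cluster the framework distributes these phases across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

Because each call to `map_phase` depends on only one line, and each call to `reduce_phase` on only one key, both phases can run in parallel on separate machines, which is what makes the model scale.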
  • 12. Other Components
      - HBase: a scalable, distributed database that supports structured data storage for large tables
      - Hive: a data warehouse infrastructure that provides data summarization and ad hoc querying
      - Pig: a high-level data-flow language and execution framework for parallel computation, used for constructing extract, transform, load (ETL) data flows
      - ZooKeeper: a high-performance coordination service for distributed applications
      - Ambari: a web-based tool for provisioning, managing and monitoring Apache Hadoop clusters, with support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health (such as heatmaps) and for viewing MapReduce, Pig and Hive applications visually, along with features to diagnose their performance characteristics in a user-friendly manner
  • 13. Gartner’s 12 Dimensions of Big Data – Extreme Information: there are three tiers of information management in the model, with four dimensions in each tier
  • 14. Quantification
      Physical data characteristics with reference to:
      - Variety: various structures of data from different data sources, such as unstructured (from websites, sensors, social media, etc.), semi-structured (from XML, web services, etc.) and structured (from transactional systems)
      - Velocity: speed of data collection, processing and access (real-time, near real-time, historical and older). When data is growing by terabytes per day, or changing at a high rate, the processing time of analysis becomes much more important
      - Volume: high volume of data generated during different timeframes
      - Complexity: individual data sets with different standards, business domain rules and storage formats for each asset type
  • 15. Access Enablement and Control
      Information control based on the nature of the data and the information it provides (for example confidential HR, finance and sales data, customer details, negative tweets).
      - Classification: classifying data according to the information it contains (sensitive, non-sensitive, private, public, etc.)
      - Contracts: rules of the enterprise data governance framework that allow access to specific data (for example agreements on who will share what information and how)
      - Pervasiveness: spread and availability of data across levels of the organization, depending on organizational requirements and the detail of the information in the data (for example how long data remains active, how long an aggregation of data is valid for summary reports, when data refreshes)
      - Technology enablement (specifications for tools and technology): controlling which functions of tools and technologies users are empowered to use to get information from data (for example security roles in MicroStrategy)
  • 16. Qualification and Assurance
      - Fidelity: reliability of the data source and authenticity of the data
      - Linked data: association of data with its context (affiliation)
      - Validation of data: validity of data for its business use case and rules
      - Perishability: longevity, i.e. how long data remains relevant to its context and analysis; aging of data while retaining its state and originality
  • 18. When to use Hadoop?
      - Your data sets are really big: If data is in GBs, use Excel, a SQL BI tool on Postgres, or some similar combination; but if data is in TBs or petabytes, Hadoop’s superior scalability will save you a considerable amount of time and money.
      - You celebrate data diversity: It doesn’t matter whether your raw data is structured (like output from an ERP system), semi-structured (like XML and log files), unstructured (like video files) or all three; Hadoop and its forgiving schema will gobble it up.
      - You have strong programming skills: Hadoop is written in Java, and therefore requires Java programming skills to master. That is changing with new tools in the Hadoop ecosystem, but right now it largely remains a venue for excellent Java skills.
      - You are building an “Enterprise Data Hub” for the future: If you work for a large enterprise, you might sign up for Hadoop even if your data isn’t particularly massive, diverse or fast at this point in time. It might make sense to start experimenting with Hadoop to be ready to take advantage when the elephant really starts sizzling and goes mainstream in a few years.
      - You find yourself throwing away perfectly good data: Hadoop can store petabytes of data. If you are throwing away potentially valuable data because it costs too much to archive, setting up a Hadoop cluster may allow you to retain this data and give you time to figure out how best to make use of it.
  • 19. When not to use Hadoop?
      • You Want to Store Sensitive Data – One of the things that Hadoop is not particularly good at today is storing sensitive data. Hadoop today has only basic data and user access security. While these features are improving by the month, the risk of accidentally exposing personally identifiable information because of Hadoop’s less-than-stellar security capabilities is probably not worth taking.
      • You Want to Replace Your Data Warehouse – Most data professionals still say that Hadoop is complementary to a traditional data warehouse, not a replacement for it. The superior economics of Hadoop-based storage make it an excellent place to land raw data and pre-process it before siphoning it over to a traditional data warehouse to run analytic workloads.
      • You Want to Delete or Update Data Frequently – Hive does not support DELETE and UPDATE commands, so if there is a business need where frequent deletion or updating of data is paramount, Hadoop is not the way to go.
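To make the last point on this slide concrete: when a store offers no in-place UPDATE or DELETE, a change generally has to be applied by rewriting the affected data in full. The Python sketch below only illustrates that cost. It is not Hive code, and `Row`, `rewrite_partition` and the sample data are hypothetical names invented for the example.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    """Hypothetical immutable record standing in for one table row."""
    order_id: int
    status: str


def rewrite_partition(
    rows: List[Row],
    keep: Callable[[Row], bool],
    transform: Callable[[Row], Row],
) -> List[Row]:
    """Emulate UPDATE/DELETE without in-place edits.

    Every row is scanned; rows failing `keep` are dropped ("deleted"),
    the rest pass through `transform` ("updated"), and a brand-new
    partition is returned to replace the old one.
    """
    return [transform(r) for r in rows if keep(r)]


# Assumed sample data: a tiny "partition" of three orders.
partition = [Row(1, "open"), Row(2, "open"), Row(3, "cancelled")]

# Equivalent of: DELETE WHERE status = 'cancelled'
#           and: UPDATE SET status = 'shipped' WHERE order_id = 1
new_partition = rewrite_partition(
    partition,
    keep=lambda r: r.status != "cancelled",
    transform=lambda r: r._replace(status="shipped") if r.order_id == 1 else r,
)

# All rows had to be read, although only two of them changed.
rows_scanned = len(partition)
```

With three rows the full rewrite is trivial; on a table holding terabytes, repeating it for every small change is what makes frequent updates and deletes impractical.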
  • 20. When to use BIG Data technologies over RDBMS? When you can no longer achieve the desired results with your RDBMS
      • When data is highly unstructured, e.g. scanner data, social media data, streaming data, videos, documents, tweets, photos, etc.
      • When data is huge in volume and complexity (greater than 1 TB and complex data)
      • Customers adopt BIG Data for specific roles – especially exploratory data-science sandboxes and unstructured data staging
      And for some more technical, issue-oriented reasons:
      • Count Distinct Queries: A count distinct query by definition has to process every record, including sorting and counting, and this becomes a difficult problem when the volume of data is huge. Mixing one or more distinct aggregates with non-distinct aggregates in the same select list, or mixing two or more distinct aggregates, causes further performance issues because it leads to spooling and re-reading of intermediate results.
      • Cursors: A cursor steps through a database table row by row. If you run an analysis that applies some kind of CASE statement through a cursor to each row, and the table is of any significant size, you are in a very bad situation. Cursors are good for iterating through small metadata tables; RDBMSs are not optimized for stepping through large datasets one entry at a time.
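The Cursors point can be seen in miniature with Python's built-in `sqlite3` module. This is a minimal sketch under assumed names (the `orders` table, its columns and the tier thresholds are hypothetical): the first variant walks the table row by row and issues one UPDATE per row, the second expresses the same CASE logic as a single set-based statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, tier TEXT)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(5.0,), (50.0,), (500.0,)])


def tier_for(amount: float) -> str:
    """Application-side version of the CASE logic (hypothetical thresholds)."""
    if amount >= 100:
        return "high"
    return "mid" if amount >= 10 else "low"


# Row-by-row ("cursor") style: fetch each row, decide in application code,
# then send one UPDATE statement per row.
for row_id, amount in conn.execute("SELECT id, amount FROM orders").fetchall():
    conn.execute("UPDATE orders SET tier = ? WHERE id = ?", (tier_for(amount), row_id))
row_by_row = conn.execute("SELECT tier FROM orders ORDER BY id").fetchall()

# Set-based style: reset the column, then let the engine apply the same
# CASE expression to the whole table in one statement.
conn.execute("UPDATE orders SET tier = NULL")
conn.execute(
    "UPDATE orders SET tier = CASE "
    "WHEN amount >= 100 THEN 'high' "
    "WHEN amount >= 10 THEN 'mid' "
    "ELSE 'low' END"
)
set_based = conn.execute("SELECT tier FROM orders ORDER BY id").fetchall()
```

Both variants produce the same tiers; the difference is that the row-by-row version issues one statement per row, which is the pattern the slide warns against on large tables.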
  • 21. • Alter Table: Suppose a customer has a big data warehouse containing a table X. Table X is so big, so important and has so many columns that altering it, whether by adding a column, changing a column’s data type or running any other DDL operation, takes a long time to complete. Such operations need to be planned and carried out very carefully because they lock the table until the statement completes. In addition, if the column you are adding has a NOT NULL clause, the operation is very painful: the DBMS has to insert default values into all of the existing rows, which may overburden your transaction logs.
      • Data Merge and Mashup (Structured Meeting Unstructured): Most retailers today have both an online and an in-store presence. Consider a scenario where you have customers’ online product search data (search logs) from the retailer’s website for the last 15 days, their past in-store purchase history (RDBMS), their in-store charge card transaction data, and their daily commute pattern data obtained from their cellphone provider. If you want to build an analytical model that sends custom discount offers valid in a specific store located along the customer’s daily commute path, you need to combine all of these sources of data. It is difficult to deal with unstructured data in an RDBMS, let alone to combine unstructured data with structured data.
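The mashup scenario can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the log line format, the customer IDs, the purchase rows and the commute lookup are all hypothetical. The point is only the shape of the work, which is to parse free-text log lines into keyed records and then combine them with already-structured data on the customer key.

```python
import re
from collections import defaultdict

# Hypothetical raw search-log lines (unstructured text from the website).
search_log = [
    "2014-07-01 10:02 user=c1 query='running shoes'",
    "2014-07-01 10:05 user=c2 query='rain jacket'",
    "2014-07-02 09:40 user=c1 query='trail shoes'",
]

# Hypothetical structured rows, as they might come from an RDBMS:
# (customer_id, purchased_product)
purchases = [("c1", "socks"), ("c2", "umbrella"), ("c3", "hat")]

# Hypothetical lookup derived from commute data: customer -> store on their route.
commute_store = {"c1": "Store 12", "c2": "Store 7"}

LINE = re.compile(r"user=(\w+) query='([^']*)'")

# Step 1: impose structure on the raw text by extracting (customer, query) pairs.
searches = defaultdict(list)
for line in search_log:
    match = LINE.search(line)
    if match:
        searches[match.group(1)].append(match.group(2))

# Step 2: join the parsed searches with the structured sources on customer ID.
profiles = {
    cust: {
        "searched": queries,
        "bought": [product for c, product in purchases if c == cust],
        "store": commute_store[cust],
    }
    for cust, queries in searches.items()
    if cust in commute_store
}
```

Each resulting profile holds what a customer searched for, what they bought and which store lies on their commute, which is the combined view the offer model in the scenario would need.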
  • 22. When using Big Data technologies like Hadoop and Hive, do we still need a standard RDBMS to perform analytics? No
      Hive is essentially a data warehouse infrastructure that provides data summarization and ad hoc querying. It plays the role of a data warehouse platform for the organization’s structured data in the Hadoop ecosystem. The long-term Hadoop vision is that an organization can rely completely on the Hadoop ecosystem for analytics, even in the absence of an RDBMS.
      However, right now:
      - Hadoop is IT heavy, and business users need IT hand-holding
      - It lacks highly accessible self-service tools for business users
      - Hadoop does not have extensive pre-existing adapters for ERP systems
      - It would require significant investment to rewrite the advanced ETL feeding the DW
      Do I need an RDBMS, a BIG Data database, or both? It varies from one organization to another.
      As organizations become aware of their data and their needs, they will be in a better position to decide which technology fits their requirements. As covered earlier, structured vs unstructured data and the volume and complexity of the data are the major attributes that help in deciding.
  • 23. How close can we get to real-time analytics using BIG Data technologies (rather than having to move data through ETL processes)? Truly real-time, streaming analytics is possible with BIG Data
      The Hadoop ecosystem already has many customer examples in which real-time analytics is truly real time, i.e. streaming. A keynote from the recently concluded Hadoop Summit shows how a large trucking company tracks events such as starting, stopping and traffic violations (speeding, excessive braking and unsafe tail distance) while its trucks are on the road delivering goods. The system also answers interactive queries on historical data, for example to see how other routes have performed on violations.
      http://hadoopsummit.org/san-jose/keynote-day2/
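As a rough illustration of what "streaming" means in the truck example, the Python sketch below checks each event as it arrives instead of loading a batch first. It is not taken from the keynote: the event fields, the thresholds and the function name `violations` are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

SPEED_LIMIT_MPH = 65   # hypothetical threshold
MIN_GAP_FT = 100       # hypothetical minimum following distance


def violations(events: Iterable[Dict]) -> Iterator[Tuple[str, str]]:
    """Yield (truck_id, violation) as soon as each event is seen."""
    for event in events:
        if event["speed_mph"] > SPEED_LIMIT_MPH:
            yield event["truck"], "speeding"
        if event["gap_ft"] < MIN_GAP_FT:
            yield event["truck"], "unsafe tail distance"


# Assumed sample events; in a real system these would arrive continuously.
stream = iter([
    {"truck": "T1", "speed_mph": 72, "gap_ft": 150},
    {"truck": "T2", "speed_mph": 60, "gap_ft": 40},
    {"truck": "T1", "speed_mph": 70, "gap_ft": 80},
    {"truck": "T2", "speed_mph": 55, "gap_ft": 200},
])

alerts = []          # what would be pushed to a live dashboard
counts = Counter()   # running totals, usable later for historical comparison
for truck, kind in violations(stream):
    alerts.append((truck, kind))
    counts[(truck, kind)] += 1
```

The generator never needs the full data set in memory, and the running `counts` hint at how the same events can feed the historical, per-route comparison mentioned on the slide.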
  • 24. Can we replace RDBMS with BIG Data databases some day? Yes and No
      Why Yes?
      • BIG Data ecosystems like Hadoop already have components that can handle unstructured data as well as traditional structured data.
      • An RDBMS is expensive, even with a terabyte or two of data. The license fees and hardware needed to run even a 2-3 TB DWH and BI solution on an RDBMS-based system are massive. BIG Data technologies are quickly filling this gap, offering stable ecosystems without compromising performance or budget.
      Why No?
      • The RDBMS has been around for ages, is mature and has a lot of helpful tools. “Transactional applications” are still the one thing that an RDBMS handles best, and we don’t yet see anything from the BIG Data technologies that tackles them as well.
      • Hadoop’s inventor, Doug Cutting, thinks so too. He recently said that Hadoop is “augmenting and not replacing”. He mentions things like doing payroll: the real nuts-and-bolts tasks for which people have been using RDBMSs will not be a good fit for Hadoop or other BIG Data platforms.
  • 25. Augment your EDW with Hadoop, adding new capabilities/insight
      - Continue to store summary structured data from your OLTP and back-office systems in the EDW.
      - Store unstructured data that does not fit nicely into “tables” in Hadoop. This means all the communication with your customers (phone logs, customer feedback, GPS locations, photos, tweets, emails, text messages, etc.) can be stored in Hadoop, and far more cost-effectively.
      - Correlate data in your EDW with the data in your Hadoop cluster to get better insight into your customers, products, equipment, etc. You can now use this data for computation-intensive analytics such as clustering and targeting. Run ad hoc analytics and models against your data in Hadoop while you are still transforming and loading your EDW.
      - Do not build Hadoop capabilities within your enterprise in a silo. Hadoop and other big data technologies should work in tandem with, and extend the value of, your existing data warehouse and analytics technologies.
      - Data warehouse vendors are adding Hadoop and MapReduce capabilities to their offerings, while Hadoop is trying to take on more traditional DW activities.
  • 27. Big Data Technologies Comparison

| Features | Cassandra | HBase | Hive | MongoDB |
|---|---|---|---|---|
| Description | Wide-column store based on ideas of BigTable and DynamoDB | Wide-column store based on Apache Hadoop and on concepts of BigTable | Data warehouse software for querying and managing large distributed datasets, built on Hadoop | One of the most popular document stores |
| Developer | Apache Software Foundation | Apache Software Foundation | Apache Software Foundation | MongoDB, Inc |
| Initial release | 2008 | 2008 | 2012 | 2009 |
| License | Open Source | Open Source | Open Source | Open Source |
| Implementation language | Java | Java | Java | C++ |
| Server operating systems | BSD | Linux, Unix, Windows | All OS with a Java VM | Linux, OSX, Solaris, Windows |
| Database model | Wide column store | Wide column store | Relational DBMS | Document store |
| Data scheme | schema-free | schema-free | Yes | schema-free |
| Transaction concepts | No | No | No | No |
  • 28. Big Data Technologies Comparison

| Name | Cassandra | HBase | Hive | MongoDB |
|---|---|---|---|---|
| Typing | Yes | No | Yes | Yes |
| Secondary indexes | restricted | No | Yes | Yes |
| SQL | No | No | No | No |
| APIs and other access methods | Proprietary protocol | Java API, RESTful HTTP API, Thrift | JDBC, ODBC, Thrift | Proprietary protocol using JSON |
| Partitioning methods | Sharding | Sharding | Sharding | Sharding |
| Durability | Yes | Yes | Yes | Yes |
| Server-side scripts | No | Yes | Yes | JavaScript |
| Triggers | Yes | Yes | No | No |
| Replication methods | Selectable replication factor | Selectable replication factor | Selectable replication factor | Master-slave replication |
| MapReduce | Yes | Yes | Yes | Yes |
  • 29. Big Data Technologies Comparison

| Features | Cassandra | HBase | Hive | MongoDB |
|---|---|---|---|---|
| Supported programming languages | C#, C++, Clojure, Erlang, Go, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala | C, C#, C++, Groovy, Java, PHP, Python, Scala | C++, Java, PHP, Python | Actionscript, C, C#, C++, Clojure, ColdFusion, D, Dart, Delphi, Erlang, Go, Groovy, Haskell, Java, JavaScript, Lisp, Lua, MatLab, Perl, PHP, PowerShell, Prolog, Python, R, Ruby, Scala, Smalltalk |
| Consistency concepts | Eventual Consistency, Immediate Consistency | Immediate Consistency | Eventual Consistency | Eventual Consistency, Immediate Consistency |
| Foreign keys | No | No | No | No |
| Concurrency | Yes | Yes | Yes | Yes |
| User concepts | Access rights for users can be defined per object | Access Control Lists (ACL) | Access rights for users, groups and roles | Users can be defined with full access or read-only access |
  • 31. BI Landscape

| Vendor Category | Vendor Products |
|---|---|
| Megavendors | IBM, Microsoft, Oracle, SAP |
| Large Independent Vendors | Information Builders, MicroStrategy, SAS |
| Data Discovery Vendors | Qlik, Tableau, Tibco Spotfire |
| Open Source | Actuate, Jaspersoft, Pentaho |
| SaaS | Birst, |
| Small Independent Vendors | Bitam, Salient, Panorama, Logi Analytics, Targit, GoodData, arcplan, Infor, Alteryx, Pyramid Analytics, Board International, Prognoz, Yellowfin |
  • 32. Gartner’s 17 Categories
      Information Delivery
      1. Reporting – Ability to create print-ready and interactive reports.
      2. Dashboards – Multi-object, linked reports in an intuitive and interactive display.
      3. Ad hoc report/query – Ability for end users to create their own reports.
      4. Microsoft Office Integration – How the tool integrates with the Office suite.
      5. Mobile BI – Ability to deliver to mobile devices using the native features of mobile.
      Analysis
      6. Interactive Visualization – Exploring the data in ways that go beyond pie/bar charts. Includes heat maps, geographic maps, scatter plots, etc.
      7. Search-based Data Discovery – Easily search structured and unstructured data sources.
      8. Geospatial and Location Intelligence – Ability to show relationships on interactive maps using geographic, spatial and time information.
      9. Embedded Advanced Analytics – Leverages statistical function libraries, Predictive Model Markup Language (PMML) and R-based models.
      10. OLAP – Fast, multidimensional access to and manipulation of the data.
  • 33. Gartner’s 17 Categories
      Integration
      11. BI Infrastructure and Administration – Shared security, metadata, administration, object model, query engine and scheduling/distribution.
      12. Metadata Management (MDM) – Centralized and robust way to administer/manage dimensions, facts, performance, report layouts, etc.
      13. Business User Data Mashup and Modeling – Code-free, drag-and-drop, user-driven ability to mix and match different data.
      14. Development Tools – Programmatic and visual tools for developing reports, dashboards and analysis.
      15. Embeddable Analytics – Includes a software development kit (SDK) for truly customizing, porting and embedding analysis both within and outside the platform.
      16. Collaboration – Ability to share and discuss.
      17. Support for Big Data – Ability to query hybrid, columnar and array-based data sources – MapReduce and NoSQL databases.
  • 34. BI Platforms Comparison - Gartner

| Tool | Strengths | Weaknesses |
|---|---|---|
| Actuate | Release of Birt iHub 3 – consistent, streamlined interface with better integration across product line; Expanded big data connectivity and mashup capabilities; Functionality and ease of use rated high | Deterioration of market understanding, user experience and contract experience; Overall product capability score below average; Not highly used for dashboarding, ad hoc analysis and interactive visualization/discovery |
| Jaspersoft | End-to-end BI; First pay-as-you-go BI server on AWS; Low cost of ownership | Capabilities scored below average; Used narrowly in organizations; Below-average data volumes; Embeddable analytics and advanced analytics |
| Pentaho | Low cost of ownership; Ranked high for development tools; Investing in and launching emerging analytic application capabilities – Big Data Layer, Instaview, Storm and Splunk | Customer experience, product quality and support below average; Difficult to use and implement |
| Qlik | Launch of redesigned visualization experience – Natural Analytics (Q3/Q4 2014); Ease of use for analysis and development; Associative search eliminates some complex SQL; Strong on dashboards, visualizations, mashups, collaboration, mobile and big data support | Not enterprise-ready – lacks MDM, infrastructure and embeddability; Limited compared to other stand-alone data vendors in visual-based interactive exploration and analysis; Major rearchitecting poses risks to current customers – could lose market traction |
| Tableau | Highly intuitive, visual-based data discovery, dashboarding and data mashup capabilities; High customer satisfaction and experience; Reusability, scalability and embeddability; Wide range of support for data access | Used as a complement, not the standard; Inflexible in negotiations / high maintenance fees; Ability to address governance and broader BI functionality a work in progress |
  • 35. BI Platforms Comparison - Gartner

| Tool | Strengths | Weaknesses |
|---|---|---|
| MicroStrategy | Go-to platform to handle the most complex deployments; Organic integration and superior product quality; Choice where mobile is a strategic requirement; Big Data integration; Visual data discovery and multi-TB, in-memory engine (in dev) | Steep initial learning curve (Mobile/VI combating that); Cost of software; Longest to develop reports (along w/SAP); Blurred marketing message |
| SAP BusinessObjects | Large deployments and enterprise BI standards – integration key; Heavy investing in visual data discovery/embeddable analytics; Expansion of BI Customer Success initiative | Hard to use and do complex analysis; Software quality/difficult to migrate; High cost and hard sale; Integration concerns/questions on BI commitment |
| IBM Cognos | Handles some of the largest deployments; Watson Analytics (2014) – smart data discovery; Simplified licensing model | Unrecognizable differentiation in market; Cost, poor performance, lack of ease of use and support quality all customer concerns; Scores low/not reaching business benefits |
| Oracle BI EE | Leader in information management; Integration, pre-built solutions and large-scale deployments; Large network of partners | Unavailability of complex types/advanced analytics; Requires sophisticated BI-related competencies; Scores low in quality and late with mobile |
| Tibco | Aims to stay ahead of the curve with aggressive development/acquisition; Quality, functionality and ease of use rated high; Used for complex analyses | Large, complex reports take a long time to develop; Dashboards rated average; Administration, development and MDM rated below average; Support staff coverage not always adequate |
| Microsoft | Ubiquitous BI across products - it is already there and being used; Attractive packaging and pricing; Investing heavily in cloud; Excel widely used and accelerated investments in feature releases | Mobile BI, interactive visualization and MDM are product weaknesses; Multiproduct complexity = on-premises or hybrid deployments; Do-it-yourself approach – onus is on customer |