Building a Big Data Analytics Platform:
Going beyond the Traditional
Enterprise Data Warehouse
White Paper
Abstract
In this white paper, Impetus Technologies focuses on the need
for building a Big Data analytics platform for better business
insights.
It also looks at why organizations need to design an Enterprise
Data Warehouse (EDW) to support the business analytics derived
from Big Data.
Additionally, it discusses the options and challenges of building a
successful EDW architecture to meet the new Big Data business
requirements. It explains why such an architecture may need to
integrate closely with semi-structured and unstructured data sources,
which can be very large or arrive as streaming data, and which are
accessed through Hadoop as well as massively parallel
databases.
Impetus Technologies, Inc.
www.impetus.com
Table of Contents
Introduction
Limitations of traditional EDWs
The key features of a Big Data Analytics platform
Options available for building the Big Data platform
Using Open Source to build Big Data solutions
Opting for a Hybrid solution
Harnessing existing investments in building a Big Data Analytics platform
Summary
Introduction
In the post-recession world, organizations are under pressure to maximize
profits and reduce expenditure. Business owners need to find the right target
users, figure out the distribution channels, and successfully sell their
offerings, while keeping all the stakeholders happy.
Moreover, every time the business comes up with new products or campaigns,
or wishes to evaluate its existing business performance, it has to deal with the
following questions: What kind of products are my customers interested in?
Where should I open my new store next year? What is the most effective
distribution channel?
Traditionally, businesses have used Enterprise Data Warehouse (EDW)
solutions to provide analytics and gain deeper insights that address their
business requirements and expansion plans.
An EDW can play a pivotal role in an enterprise IT strategy. A comprehensive
EDW plan provides companies the following benefits:
• Enables disciplined data integration within a large enterprise
• Generates output and facilitates effective representations of all
business processes
It is important to examine how the traditional EDW works. Traditional data
sources include an operational database, old archived data, flat or XML files,
and ERP systems. The data is extracted, cleaned and transformed into the desired
format, and then loaded into the data warehouse storage system. This data can
be further divided into data marts. Once the data is available in the central EDW,
query or reporting tools are used for analytics. For deeper or forecast-based
analysis, data mining tools are used.
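The extract, clean/transform and load flow described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the warehouse; the `RAW_ORDERS` rows, the `orders` table and the function names are hypothetical and are not taken from any particular EDW product.

```python
import sqlite3

# Hypothetical rows as they might arrive from an operational source.
RAW_ORDERS = [
    {"order_id": "1", "region": " north ", "amount": "120.50"},
    {"order_id": "2", "region": "South", "amount": "80"},
    {"order_id": "3", "region": "north", "amount": "not-a-number"},  # dirty row
]


def extract():
    """Extract: read rows from the source (here, an in-memory list)."""
    return list(RAW_ORDERS)


def transform(rows):
    """Clean and transform: normalize regions, parse amounts, drop bad rows."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # discard rows that fail cleaning
        region = row["region"].strip().lower()
        cleaned.append((int(row["order_id"]), region, amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    # A simple reporting query of the kind a query tool would issue.
    return conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
```

Note that the data is cleaned before it ever reaches the warehouse, so only rows that fit the predefined table structure are stored.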
The question, however, is whether such data warehouses are ready to deal with
Big Data and, more importantly, what Big Data is.
The term Big Data describes data sets that cannot be managed or
processed by traditionally used software tools within an acceptable elapsed time.
Big Data sets keep growing in size, and can range from a few terabytes to
many petabytes. Overall data volume, however, is expected to reach around
35 zettabytes by the year 2020.
Traditional Enterprise Data Warehouses have fallen short of expectations when
it comes to handling Big Data, for the following reasons:
• Inability to handle large data sizes
• Difficulty in storing and managing Big Data
• Difficulty in gaining insights from this data
• High costs involved in dealing with Big Data
Limitations of traditional EDWs
Let us examine the limitations of traditional EDWs.
Traditionally, Enterprise Data Warehouses focused only on transactional or
archived data. However, in the last few years, the need to capture additional
data for deeper insights has emerged. This includes real-time data, which may
be low-latency operational data, and customer behavior data, which captures
sub-transactional processes. At the same time, additional data sources such
as devices and sensors have also emerged.
Social media also provides valuable information on product preferences and
user sentiments. The large volume of unstructured data generated by Web
applications is extremely useful for generating business intelligence.
It is clear that traditional EDWs cannot gain meaningful insights from Big Data.
One likely reason is that traditional EDWs were not meant to handle terabytes
and petabytes of data. Most of these systems were designed in the 1990s using
the database technologies of that time.
Another difference is that in place of Extract-Transform-Load (ETL), Big Data
warehouses need ELTL, which is Extract-Load-Transform-Load. The new system
needs a staging area where data is uploaded before the
cleansing and transformation operations.
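The Extract-Load-Transform-Load sequence can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database plays the role of both the staging area and the warehouse, and the `staging` and `page_views` tables, the JSON event format and the `eltl` function are hypothetical.

```python
import json
import sqlite3


def eltl(raw_lines):
    conn = sqlite3.connect(":memory:")
    # Staging area: raw records are stored untouched, one string per row.
    conn.execute("CREATE TABLE staging (payload TEXT)")
    conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")

    # Extract + Load: copy the raw data into staging without cleansing it.
    conn.executemany(
        "INSERT INTO staging VALUES (?)", [(line,) for line in raw_lines]
    )

    # Transform: parse and cleanse the staged records, then aggregate them.
    counts = {}
    for (payload,) in conn.execute("SELECT payload FROM staging").fetchall():
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # malformed records remain in staging for inspection
        if not isinstance(event, dict):
            continue
        page = event.get("page")
        if page:
            counts[page] = counts.get(page, 0) + 1

    # Load: write the transformed result into the warehouse table.
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?)", sorted(counts.items())
    )
    conn.commit()
    return conn.execute(
        "SELECT page, views FROM page_views ORDER BY page"
    ).fetchall()
```

Because the raw records are kept in staging, the transformation step can be changed and re-run later without going back to the source systems.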
Traditional relational database solutions are not suitable for a majority of
these data sets. The data is too unstructured and/or too voluminous for a
traditional RDBMS to handle, and it cannot readily be analyzed with SQL on
such systems. In fact, a fixed database schema does not allow complex
unstructured formats to be defined and managed in these data warehouses.
Moreover, the costs involved in handling these new data sets by using
traditional technologies are also very high.
Clearly, existing EDW environments, which were designed decades ago, lack the
ability to capture and process the new forms of data within reasonable
processing times. Moreover, these traditional EDWs have limited capabilities
when it comes to analyzing user behavioral data.
Cost is another important factor. Currently, organizations are spending
hundreds of thousands of dollars per terabyte per year for producing and
replacing data in their existing environments. Additionally, the
models in use tend to require specialized hardware, which in turn results in a
high cost per terabyte, making large-scale deployments expensive. It is also
hard to predict the infrastructure workload for managing this Big Data.
The key features of a Big Data Analytics
platform
To manage the Big Data trend, a new breed of Open Source and
proprietary Big Data technologies has emerged that leverages commodity
hardware. A Big Data Analytics platform helps capture and analyze these new
data sets.
The ideal Big Data Analytics platform needs to match up to these key
characteristics:
• It should have the ability to scale easily to support large data, which will
typically be in terabytes or petabytes.
• The system should ideally be distributed across processors, regardless
of their geographic location.
• It should enable quick response to highly complex queries as well as
support a wide variety of data types.
• It should be able to incorporate machine learning, provide
recommendations, and execute analytics on real-time incoming data
such as logs, as well as provide domain-specific canned reports.
• It should be able to handle data from heterogeneous data sources,
while providing a high rate for loading and analysis, as well as the ability
to handle failover.
Options available for building the Big Data
platform
It is important to understand that for building a Big Data analytics platform, any
single vendor technology may not be sufficient. The platform should have
certain capabilities to address specific sets of requirements.
There are two different approaches that are being used to address Big Data
analytics.
The first uses Massively Parallel Processing (MPP) and columnar databases. This
solution can help address scaling, distribution, load management, response time
and failover management issues. Additionally, it may also have some domain
specific capabilities to provide a ready-made solution.
The second option is using MapReduce implementations. This programming model
was introduced by Google, which used it for large-scale processing tasks such
as building its Web search index. It is now readily available as
the Open Source Apache project called Hadoop.
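To make the MapReduce model concrete, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It does not use Hadoop or any Hadoop API; the function names are illustrative, and in a real cluster the map and reduce calls would run in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each map call looks at one record and each reduce call looks at one key, the work can be split across machines without the pieces needing to coordinate with each other.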
Companies therefore have the option to choose between Open Source
solutions and commercial options. However, they can also build a hybrid
solution, which has a mix of different capabilities that handle the Big Data
paradigm.
The commercial tools of today have strong analytical proficiencies as well as
sophisticated reporting and OLAP cube capabilities. A large number of
vendors in the market offer solutions for the main components of
an EDW, which are ETL, query tools and BI.
Some of the commercial options for MPP are Greenplum and Teradata.
Informatica is an example of a commercial ETL tool. A few commercial solutions
for BI and Analytics are Pentaho, Business Objects and MicroStrategy, among
others. It is possible to build a Big Data warehouse solution using these
commercial products together.
Using Open Source to build Big Data solutions
Every organization, big or small, is now focused on cutting IT expenditure.
Despite this, business analytics remains a major business driver for these
companies. Scaling commercial solutions to really huge volumes and
deeper BI can result in exorbitant licensing costs.
This is clearly not a viable proposition. Companies can instead choose from the
numerous Open Source implementations that are available. Lower costs,
extensibility, and integration are some of the benefits that organizations realize
from Open Source solutions. The good news is that the community is
continuously making efforts to enhance these features and add new
functionalities to these solutions.
Some of the Open Source solution stacks in the analytics world are Jaspersoft
and Pentaho Reporting, while the ETL tools include CloverETL and Talend. Pentaho
also provides commercial extensions of its solution. Apache Hadoop provides an
Open Source implementation of the MapReduce framework, and Apache Cassandra
provides a distributed data store that can be used alongside it. These
products solve huge data storage issues and provide ETL and analytics support.
Opting for a Hybrid solution
In this scenario, it is possible to use an Open Source solution for ETL or BI and a
commercial solution for Analytics, or vice versa. Hadoop and MPP solutions, for
instance, can work together as ETL pipes along with a commercial Analytics tool.
Alternatively, MPP and columnar databases can be combined with MapReduce to
provide another hybrid solution.
When there are larger volumes of data to be analyzed, organizations are better off
using Open Source solutions. Hadoop is one of the best available Open Source
solutions that can help them handle their Big Data in a cost-effective manner. It
also makes sense to use parallel processing or other fast mechanisms when
importing from the source system or exporting to the destination system.
Incidentally, true 'real time' is a myth in Big Data. The data warehouse system has to be
carefully designed so that real-time data is bounded either by size or by time. It is possible
to re-use some of the existing EDW investments in building a Big Data platform.
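One way to picture limiting real-time data by size or by time is a window that keeps only the most recent events. The sketch below is an assumption about how such a limit could be implemented, not a description of any specific product; the `BoundedWindow` class and its parameters are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class BoundedWindow:
    """Keeps only recent events, capped by count and by age in seconds."""

    def __init__(self, max_events, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        # Size limit: once full, appending drops the oldest entry.
        self.events = deque(maxlen=max_events)

    def add(self, timestamp, value):
        self.events.append((timestamp, value))
        self._expire(timestamp)

    def _expire(self, now):
        # Time limit: discard events older than the allowed age.
        while self.events and now - self.events[0][0] > self.max_age_seconds:
            self.events.popleft()

    def values(self):
        return [value for _, value in self.events]
```

Analytics that must respond quickly would run only over the contents of such a window, while the full history is processed in batch.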
The Impetus solution
Based on its project experiences, Impetus Technologies has built a Big Data
Analytics platform for its clients that can help them roll out their Big Data
Analytics initiatives. The platform is called iLaDaP, which is short for Impetus
Large Data Analytics Platform.
The core of the iLaDaP platform is built using SOA, and incorporates all the key
characteristics of an ideal Big Data Analytics platform discussed earlier. iLaDaP is
designed to derive intelligence and operate on huge datasets collected from
numerous data sources in multiple data formats. It is powered by Hadoop, and
can therefore scale linearly up to thousands of nodes using commodity
hardware. This spells a significant cost advantage in the long run. iLaDaP also
comes with a set of pre-canned and customized reports.
Recognizing that it is important for businesses to track down and take advantage of an
opportunity, as it happens, Impetus’ platform enables them to react to the events as
they occur. iLaDaP is also capable of collecting data from a range of disparate sources.
This unstructured data can be transformed and utilized for strategic business
decisions.
iLaDaP can be seamlessly integrated with current platforms, without the need for
major changes. The core iLaDaP platform is built using Open Source technologies,
where the components can be replaced with other commercial technologies, in
accordance with requirements.
Harnessing existing investments in building a
Big Data Analytics platform
It is possible to reuse investments made in the traditional data warehouse to
build a Big Data Analytics platform. Most of the hardware can be reused,
since Big Data solutions can run on commodity-grade hardware. Therefore,
an existing RDBMS-based infrastructure can be reutilized. The existing code
logic and algorithms can also be used after minor modifications that enable them
to run in a stateless architectural environment. In this scenario, tools like
MATLAB can be integrated with Hadoop-like technologies.
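As an illustration of the kind of minor modification involved, the sketch below contrasts a stateful routine with a stateless rewrite of the same logic. Both versions are hypothetical examples, not code from any existing warehouse; the point is only that the stateless form depends on nothing but its inputs, so chunks of data can be processed on different nodes and the partial results merged afterwards.

```python
from functools import reduce

# Stateful original (hypothetical): the result depends on hidden
# module-level state, so it cannot safely run on several nodes at once.
_total = 0.0
_count = 0


def add_sample_stateful(x):
    global _total, _count
    _total += x
    _count += 1
    return _total / _count


# Stateless rewrite: each call depends only on its arguments.
def summarize(chunk):
    """Return a partial (sum, count) for one chunk of data."""
    return sum(chunk), len(chunk)


def merge(a, b):
    """Combine two partial results; merge order does not matter."""
    return a[0] + b[0], a[1] + b[1]


def mean_of_chunks(chunks):
    total, count = reduce(merge, map(summarize, chunks), (0.0, 0))
    return total / count if count else 0.0
```

The averaging logic itself is unchanged; only the way intermediate results are carried between calls has been altered.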
Another way of utilizing the data warehouse investments is by extending or
enhancing their capacity by plugging them together with a Big Data warehouse
solution. Hadoop, for example, is a cost-effective option for storing archival data,
performing deeper analytics, and providing summarized reporting data to an
existing data warehouse. This strategy can also help in reusing the reporting
tools. Similarly, ETL tools can be modified to use the Big Data warehouse as
a sink. Tools like Talend or Informatica provide connectors for using Hadoop and
commercial MPPs as data sinks.
The development and testing strategy can also be reused. Most of the new Big
Data warehouse solutions support SQL or Java or scripting languages and allow
the re-use of existing development and testing investments.
Organizations can deploy iLaDaP on-premise, as well as in a Cloud-supported
deployment setup.
Summary
In conclusion, traditional Enterprise Data Warehouses do not have
the ability to keep up with the growing demands of Big Data. The need of the hour is
to effectively strategize and build a Big Data analytics platform to manage, store and
derive insights from this digital data.
Also, no single vendor technology will be sufficient. It is recommended that
organizations go for a hybrid solution that combines commercial and Open Source
options to build their Big Data analytics platform.
When there is a large volume of data to be analyzed, it is suggested that an Open
Source solution be used, and Hadoop is one of the best options. The success of a Big Data
platform depends heavily on the tools that are chosen. Therefore, the most
appropriate tools must be selected from the available options. Companies can
additionally re-use existing EDW investments for their Big Data analytics platform.
About Impetus
Impetus Technologies is a leading provider of Big Data solutions for the
Fortune 500®. We help customers effectively manage the “3-Vs” of Big Data
and create new business insights across their enterprises.
Website: www.bigdata.impetus.com | Email: bigdata@impetus.com
© 2013 Impetus Technologies, Inc. All rights reserved. Product and company
names mentioned herein may be trademarks of their respective companies.
May 2013

Big Data Tools: A Deep Dive into Essential Tools
 
Big Data at a Glance
Big Data at a GlanceBig Data at a Glance
Big Data at a Glance
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoption
 

Mehr von Impetus Technologies

Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Impetus Technologies
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 
Building Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus WebinarBuilding Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus WebinarImpetus Technologies
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Impetus Technologies
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in ElasticsearchImpetus Technologies
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarImpetus Technologies
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarImpetus Technologies
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Impetus Technologies
 
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...Impetus Technologies
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Impetus Technologies
 
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...Impetus Technologies
 
Enterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus WebcastEnterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus WebcastImpetus Technologies
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Impetus Technologies
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Impetus Technologies
 
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...Impetus Technologies
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabImpetus Technologies
 
Webinar maturity of mobile test automation- approaches and future trends
Webinar  maturity of mobile test automation- approaches and future trendsWebinar  maturity of mobile test automation- approaches and future trends
Webinar maturity of mobile test automation- approaches and future trendsImpetus Technologies
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labImpetus Technologies
 
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...Impetus Technologies
 
Performance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus WebcastPerformance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus WebcastImpetus Technologies
 

Mehr von Impetus Technologies (20)

Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
Data Warehouse Modernization Webinar Series- Critical Trends, Implementation ...
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 
Building Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus WebinarBuilding Real-time Streaming Apps in Minutes- Impetus Webinar
Building Real-time Streaming Apps in Minutes- Impetus Webinar
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise- StreamAna...
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in Elasticsearch
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
 
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix WebinarReal-world Applications of Streaming Analytics- StreamAnalytix Webinar
Real-world Applications of Streaming Analytics- StreamAnalytix Webinar
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
 
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
Accelerating Hadoop Solution Lifecycle and Improving ROI- Impetus On-demand W...
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
 
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...SPARK USE CASE-  Distributed Reinforcement Learning for Electricity Market Bi...
SPARK USE CASE- Distributed Reinforcement Learning for Electricity Market Bi...
 
Enterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus WebcastEnterprise Ready Android and Manageability- Impetus Webcast
Enterprise Ready Android and Manageability- Impetus Webcast
 
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
Real-time Streaming Analytics: Business Value, Use Cases and Architectural Co...
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
 
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
Maturity of Mobile Test Automation: Approaches and Future Trends- Impetus Web...
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
 
Webinar maturity of mobile test automation- approaches and future trends
Webinar  maturity of mobile test automation- approaches and future trendsWebinar  maturity of mobile test automation- approaches and future trends
Webinar maturity of mobile test automation- approaches and future trends
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
 
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
 
Performance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus WebcastPerformance Testing of Big Data Applications - Impetus Webcast
Performance Testing of Big Data Applications - Impetus Webcast
 

Kürzlich hochgeladen

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Kürzlich hochgeladen (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Building a Big Data Analytics Platform- Impetus White Paper

Building a Big Data Analytics Platform: Going beyond the Traditional Enterprise Data Warehouse

White Paper

Abstract

In this white paper, Impetus Technologies focuses on the need for building a Big Data analytics platform for better business insights.

It also looks at why organizations need to design an Enterprise Data Warehouse (EDW) to support the business analytics derived from Big Data.

Additionally, it discusses the options and challenges of building a successful EDW architecture to meet the new Big Data business requirements. It explains why such an architecture may include extreme integration with semi-structured and unstructured data sources, which could be very large or could be streaming data, accessed through Hadoop as well as through massively parallel databases.

Impetus Technologies, Inc.
www.impetus.com
Introduction

In the post-recession world, organizations are under pressure to maximize profits and reduce expenditure. Business owners need to find the right target users, work out the distribution channels, and successfully sell their offerings, while keeping all the stakeholders happy.

Moreover, every time the business comes up with new products or campaigns, or wishes to evaluate its existing business performance, it has to deal with the following questions:

• What kind of products are my customers interested in?
• Where should I open my new store next year?
• What is the most effective distribution channel?

Traditionally, businesses have used Enterprise Data Warehouse (EDW) solutions to provide analytics and gain deeper insights that address their business requirements and expansion plans.

An EDW can play a pivotal role in an enterprise IT strategy. A comprehensive EDW plan provides companies with the following benefits:

• Enables disciplined data integration within a large enterprise
• Generates output and facilitates effective representations of all business processes

It is important to examine how the traditional EDW works. Traditional data sources include an operational database, old archived data, flat or XML files, and ERP systems. The data is extracted, cleaned and transformed into the desired format, and then loaded into the data warehouse storage system. This data can be further divided into marts. Once the data is available in the central EDW, query or reporting tools are used for analytics. For deeper or forecast-based analysis, data mining tools are used.

The question, however, is whether such data warehouses are ready to deal with Big Data and, more importantly, what Big Data is.

The term Big Data is used to describe data sets that cannot be managed or processed by traditionally used software tools within an agreed elapsed time.

Big Data sizes are constantly increasing and can range from a few terabytes to many petabytes. Overall, the volume is expected to reach around 35 zettabytes by the year 2020.
Traditional Enterprise Data Warehouses have fallen short of expectations when it comes to handling Big Data, for the following reasons:

• Inability to handle large data sizes
• Difficulty in storing and managing Big Data
• Difficulty in gaining insights from this data
• Costs involved in dealing with Big Data

Limitations of traditional EDWs

Let us examine the limitations of traditional EDWs.

Traditionally, Enterprise Data Warehouses focused only on transactional or archived data. However, in the last few years, the need to capture additional data for deeper insights has emerged. This includes real-time data, which may be low-latency operational data, and customer behavior data, which captures sub-transactional processes. At the same time, additional data sources such as devices and sensors have also emerged.

Social media also provides valuable information on product preferences and user sentiments. The large volume of unstructured data generated by Web applications is extremely useful for generating business intelligence.
It is clear that traditional EDWs cannot gain meaningful insights from Big Data. This is largely because traditional EDWs were not meant to handle terabytes and petabytes of data. Most of these systems were designed in the 1990s using the database technologies of that time.

Another difference is that in place of Extract-Transform-Load (ETL), Big Data warehouses need ELTL, which is Extract-Load-Transform-Load. The new system needs a staging area where data is uploaded before the cleansing and transformation operations.

Traditional relational database solutions are not suitable for a majority of these data sets. The data is too unstructured and/or too voluminous for a traditional RDBMS to handle, and much of it cannot readily be analyzed with SQL or similar technologies. In fact, a fixed database schema does not allow complex unstructured formats to be defined and managed in these data warehouses. Moreover, the cost of handling these new data sets with traditional technologies is also very high.

Clearly, existing EDW environments, which were designed decades ago, lack the ability to capture and process the new forms of data within reasonable processing times. Moreover, these traditional EDWs have limited capabilities when it comes to analyzing user behavioral data.

Cost is another important factor. Currently, organizations are spending hundreds of thousands of dollars per terabyte per year for producing and replacing data in their existing environments. Additionally, the models in use tend to require specialized hardware, which in turn results in a high cost per terabyte and makes large-scale deployments expensive.

It is also very hard to predict the infrastructure workload for managing this Big Data.
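As an illustration of the Extract-Load-Transform-Load flow described in this section, the following is a minimal Python sketch, not a reference implementation. The function names, the in-memory lists standing in for the staging area and the warehouse, and the JSON event format with `user` and `action` fields are all assumptions made for the example; a real platform would use distributed storage for both stores.

```python
import json


def extract(raw_lines):
    """Extract: pull records from the source without altering them."""
    return list(raw_lines)


def load_to_staging(records, staging):
    """Load (first L): land the raw records in staging before any cleansing."""
    staging.extend(records)
    return staging


def transform(staging):
    """Transform: parse and cleanse staged records, skipping unusable ones."""
    cleaned = []
    for line in staging:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed input remains available in staging only
        if not isinstance(event, dict):
            continue
        if "user" in event and "action" in event:
            cleaned.append({
                "user": str(event["user"]).strip().lower(),
                "action": event["action"],
            })
    return cleaned


def load_to_warehouse(rows, warehouse):
    """Load (second L): store the cleansed rows for querying and reporting."""
    warehouse.extend(rows)
    return warehouse


def run_eltl(raw_lines):
    """Run the four ELTL steps in order over hypothetical in-memory stores."""
    staging, warehouse = [], []
    load_to_staging(extract(raw_lines), staging)
    load_to_warehouse(transform(staging), warehouse)
    return staging, warehouse


if __name__ == "__main__":
    raw = ['{"user": " Alice ", "action": "view"}', "not json"]
    staged, stored = run_eltl(raw)
    print(len(staged), stored)
```

The point of the design is that nothing is discarded before it reaches staging: the raw, possibly malformed records remain available for later reprocessing, while only cleansed rows reach the warehouse.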
The key features of a Big Data Analytics platform
To manage the Big Data trend, a new breed of Open Source and proprietary Big Data technologies that leverage commodity hardware has come up. A Big Data Analytics platform helps capture and analyze these new data sets. The ideal Big Data Analytics platform needs the following key characteristics:
    • It should have the ability to scale easily to support large data volumes, which will typically be in terabytes or petabytes.
  • 6.
    • The system should ideally be distributed across geographically unaware processors.
    • It should enable quick responses to highly complex queries and support a wide variety of data types.
    • It should be able to incorporate machine learning, provide recommendations, and execute analytics on real-time incoming data such as logs, as well as provide domain-specific canned reports.
    • It should be able to handle data from heterogeneous data sources, while providing a high rate of loading and analysis, as well as the ability to handle failover.
Options available for building the Big Data platform
It is important to understand that no single vendor technology may be sufficient for building a Big Data analytics platform. The platform should have certain capabilities to address specific sets of requirements. Two different approaches are being used to address Big Data analytics. The first is to use Massively Parallel Processing (MPP) and columnar databases. This solution can help address scaling, distribution, load management, response time
  • 7. and failover management issues. Additionally, it may also have some domain-specific capabilities to provide a ready-made solution. The second option is to use MapReduce implementations. This framework was introduced by Google for large-scale data processing, such as building its Web search index, and an Open Source implementation is now readily available as the Apache project called Hadoop.
Companies, therefore, have the option to choose between Open Source solutions and commercial options. However, they can also build a hybrid solution, which has a mix of different capabilities that handle the Big Data paradigm.
The commercial tools of today have strong analytical capabilities as well as sophisticated reporting and OLAP cube capabilities. A large number of vendors in the market offer solutions for the main components of EDWs, which are ETL, query tools and BI. Some of the commercial options for MPP are Greenplum and Teradata. Informatica is an example of a commercial ETL tool. A few commercial solutions for BI and analytics are Pentaho, Business Objects and MicroStrategy, among others. It is possible to build a Big Data warehouse solution using these commercial products together.
Using Open Source to build Big Data solutions
Every organization, big or small, is now focused on cutting IT expenditure. Despite this, business analytics remains a major business driver for these companies. If commercial solutions are scaled to really huge volumes and deeper BI, the result can be exorbitant licensing costs. This is clearly not a viable proposition. Companies can instead choose from the numerous Open Source implementations that are available. Lower costs, extensibility, and integration are some of the benefits that organizations realize from Open Source solutions. The good news is that the community is continuously making efforts to enhance these features and add new functionality to these solutions.
Some of the Open Source solution stacks in the analytics world are Jaspersoft and Pentaho Reporting, while the ETL tools include Clover ETL and Talend. Pentaho also provides commercial extensions of its solution. Apache Hadoop provides an implementation of the MapReduce framework, and Apache Cassandra is a distributed data store that can be used alongside it. These products address huge data storage issues and provide ETL and analytics support.
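For readers unfamiliar with the MapReduce model mentioned above, the following single-process sketch shows its three phases on a word-count task. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative assumptions, and a real framework would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence within one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = map_reduce(["big data", "Big Data analytics"])
```

Because each map call looks at one record and each reduce call at one key, the work can be split across commodity machines, which is what makes the model attractive for very large data sets.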
  • 8. Opting for a Hybrid solution
In this scenario, it is possible to use an Open Source solution for ETL or BI and a commercial solution for analytics, or vice versa. Hadoop and MPP solutions, for instance, can work together as ETL pipes along with a commercial analytics tool. Alternatively, MPP and columnar databases can be chosen, along with MapReduce, to provide another effective hybrid solution.
When there are larger volumes of data to be analyzed, organizations are better off using Open Source solutions. Hadoop is one of the best available Open Source solutions that can help them handle their Big Data in a cost-effective manner. It also makes sense to use parallel processing or other fast mechanisms when importing from the source system or exporting to the destination system.
Incidentally, ‘real time’ is a myth in Big Data. The data warehouse system has to be carefully designed so that the real-time data it processes is bounded by size or by time. It is possible to re-use some of the existing EDW investments in building a Big Data platform.
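One way to read the advice about limiting real time data by size or by time is as micro-batching: incoming events are buffered and handed on as a batch once the buffer reaches a maximum size or a maximum age. The sketch below is an illustration under that assumption; the class name `BoundedBatcher` and its parameters are hypothetical and do not come from the white paper.

```python
import time


class BoundedBatcher:
    """Buffer events and release them in batches bounded by size or age."""

    def __init__(self, max_size, max_age_seconds, clock=time.monotonic):
        self.max_size = max_size
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so the time bound can be tested
        self._batch = []
        self._opened_at = None

    def add(self, event):
        """Buffer one event; return a finished batch if a bound is hit, else None."""
        now = self.clock()
        if not self._batch:
            self._opened_at = now  # the first event opens a new batch
        self._batch.append(event)
        too_big = len(self._batch) >= self.max_size
        too_old = now - self._opened_at >= self.max_age_seconds
        if too_big or too_old:
            batch, self._batch = self._batch, []
            return batch
        return None
```

In this simplified version the age bound is only checked when a new event arrives; a production design would also need a timer to flush a batch when the stream goes quiet.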
  • 9. The Impetus solution
Based on its project experience, Impetus Technologies has built a Big Data Analytics platform that can help its clients roll out their Big Data Analytics initiatives. The platform is called iLaDaP, which is short for Impetus Large Data Analytics Platform. The core of the iLaDaP platform is built using SOA, and incorporates all the key characteristics of an ideal Big Data Analytics platform discussed earlier. iLaDaP is designed to derive intelligence from, and operate on, huge datasets collected from numerous data sources in multiple data formats. It is powered by Hadoop and can therefore scale linearly to thousands of nodes using commodity hardware. This spells a significant cost advantage in the long run. iLaDaP also comes with a set of pre-canned and customized reports.
  • 10. Recognizing that it is important for businesses to spot and take advantage of an opportunity as it happens, Impetus’ platform enables them to react to events as they occur. iLaDaP is also capable of collecting data from a range of disparate sources. This unstructured data can be transformed and utilized for strategic business decisions. iLaDaP can be seamlessly integrated with current platforms, without the need for major changes. The core iLaDaP platform is built using Open Source technologies, and its components can be replaced with commercial technologies in accordance with requirements.
Harnessing existing investments in building a Big Data Analytics platform
It is possible to reuse investments made in the traditional data warehouse to build a Big Data Analytics platform. Most of the hardware can be reused, since Big Data solutions can run on commodity-grade hardware. Therefore, an existing RDBMS-based infrastructure can be reutilized. The existing code logic and algorithms can also be used after minor modifications that enable them to run in a stateless architectural environment. In this scenario, tools like MATLAB can be integrated with Hadoop-like technologies. Another way of utilizing the data warehouse investments is by extending or enhancing their capacity by plugging them together with a Big Data warehouse
  • 11. solution. Hadoop, for example, is a cost-effective option for storing archival data, performing deeper analytics and providing summarized reporting data to an existing data warehouse. This strategy can also help in reusing the reporting tools. Similarly, ETL tools can be modified to use the Big Data warehouse as a sink. Tools like Talend or Informatica provide connectors for using Hadoop and commercial MPPs as data sinks. The development and testing strategy can also be reused. Most of the new Big Data warehouse solutions support SQL, Java or scripting languages, and so allow the re-use of existing development and testing investments. Organizations can deploy iLaDaP on-premise, as well as in a Cloud-supported deployment setup.
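The idea of a Big Data store feeding summarized reporting data to an existing data warehouse can be sketched as follows. This is an illustration only: the event layout, the table name `daily_product_views` and the use of SQLite as a stand-in for the existing warehouse are assumptions, and in practice the aggregation step would run as a job on the Big Data platform.

```python
import sqlite3
from collections import Counter


def summarize(events):
    """Collapse raw (day, product) events into per-day, per-product counts."""
    counts = Counter(events)
    return sorted((day, product, n) for (day, product), n in counts.items())


def load_summary(conn, summary_rows):
    """Load the small summary into the warehouse table used for reporting."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_product_views "
        "(day TEXT, product TEXT, views INTEGER)"
    )
    conn.executemany(
        "INSERT INTO daily_product_views VALUES (?, ?, ?)", summary_rows
    )
    conn.commit()


# Hypothetical raw events, standing in for large archived data.
events = [
    ("2013-05-01", "tv"),
    ("2013-05-01", "tv"),
    ("2013-05-01", "phone"),
    ("2013-05-02", "tv"),
]
warehouse = sqlite3.connect(":memory:")  # stand-in for the existing EDW
load_summary(warehouse, summarize(events))
report = warehouse.execute(
    "SELECT day, SUM(views) FROM daily_product_views GROUP BY day ORDER BY day"
).fetchall()
```

Only the aggregated rows reach the warehouse, so existing SQL-based reporting tools keep working against a table of manageable size.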
  • 12. Summary
In conclusion, traditional Enterprise Data Warehouses do not have the ability to keep up with the growing demands of Big Data. The need of the hour is to strategize effectively and build a Big Data analytics platform to manage, store and derive insights from this digital data. Also, no single vendor technology will be sufficient. It is recommended that organizations go for a hybrid solution made up of commercial and Open Source options to build their Big Data analytics platform. When there is a large volume of data to be analyzed, it is suggested that an Open Source solution be used, and Hadoop is one of the best options. The success of a Big Data platform depends largely on the tools that are chosen. Therefore, the most appropriate tools must be selected from the available options. Companies can additionally re-use existing EDW investments for their Big Data analytics platform.
About Impetus
Impetus Technologies is a leading provider of Big Data solutions for the Fortune 500®. We help customers effectively manage the “3-Vs” of Big Data and create new business insights across their enterprises.
Website: www.bigdata.impetus.com | Email: bigdata@impetus.com
© 2013 Impetus Technologies, Inc. All rights reserved. Product and company names mentioned herein may be trademarks of their respective companies. May 2013