Hadoop and the new BI:
The Modern Data Architecture
…for in memory Big Data Analytics

10 December 2013
Quick Housekeeping
Q&A box is available for your questions

Webinar will be recorded for future viewing

Thank You for joining!

© Hortonworks Inc. 2013
Modern Data Architecture
…for in memory Big Data Analytics

Your Presenters
• Paul Groom (@datagroom)
– Chief Innovation Officer
– 28 years buried in the big data of the data
guiding business users to value
– Two wheels are more fun than four

• John Kreisa (@marked_man)
– VP Strategic Marketing, Hortonworks
– Over 20 years in data management as a
developer and a marketer
– Avid camper

Today’s Topics
• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop in the MDA
• Kognitio’s role in the MDA
• Q&A

Existing Data Architecture
[Diagram: three layers]
• Applications: Business Analytics, Custom Applications, Packaged Applications
• Data system: repositories (RDBMS, EDW, MPP)
• Sources: existing sources (CRM, ERP, Clickstream, Logs)
Data growth (Source: IDC)
• 2.8 ZB in 2012
• 85% from new data types
• 15x machine data by 2020
• 40 ZB by 2020

Modern Data Architecture Enabled
[Diagram: architecture layers]
• Applications: Business Analytics, Custom Applications, Packaged Applications
• Dev & data tools: build & test
• Operational tools: manage & monitor
• Data system: repositories (RDBMS, EDW, MPP)
• Sources: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)

Hadoop Powers Modern Data Architecture
[Diagram: Hadoop cluster made up of many nodes, each providing compute & storage]

Hadoop clusters provide
scale-out storage and
distributed data processing
on commodity hardware

Apache Hadoop is an open source project
governed by the Apache Software Foundation
(ASF) that allows you to gain insight from massive
amounts of structured and unstructured data
quickly and without significant investment.
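
To make "distributed data processing" concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps that a Hadoop MapReduce job spreads across cluster nodes. It is an illustration only and does not use any Hadoop API. The record format (user, tab, URL) and the function names are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (url, 1) for one page view; record format 'user<TAB>url' is assumed."""
    _user, url = record.split("\t")
    yield url, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(splits):
    """Run map over every input split, then shuffle, then reduce per key."""
    pairs = (kv for split in splits for record in split for kv in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Two input splits stand in for data blocks stored on two different nodes.
    splits = [["u1\t/home", "u2\t/cart"], ["u1\t/cart"]]
    print(run_job(splits))  # {'/home': 1, '/cart': 2}
```

On a real cluster each split would be mapped on the node that stores it, and only the grouped intermediate pairs would move over the network.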

Drivers of Hadoop Adoption

New Business
Applications
From NEW types of
Data (or existing
types for longer)

Most Common NEW TYPES OF DATA
1. Sentiment
Understand how your customers feel about your brand and
products – right now

2. Clickstream
Capture and analyze website visitors’ data trails and
optimize your website

3. Sensor/Machine
Discover patterns in data streaming automatically from
remote sensors and machines

4. Geographic
Analyze location-based data to manage operations where
they occur

5. Server Logs
Research logs to diagnose process failures and prevent
security breaches

6. Unstructured (text, video, pictures, etc.)
Understand patterns in files across millions of web pages,
emails, and documents

Keep Existing Data Around Longer
• Online archive
– Data that was once moved to tape can
now be queried to understand long term trends

• Compliance retention
– Industry specific requirements for retention
of data

• Combine with external historical data sources
– Weather, survey, research, purchased, etc.

Drivers of Hadoop Adoption
Architectural
A Modern Data
Architecture
Complement your existing data
systems: the right workload in the
right place

New Business
Applications

Requirements for Hadoop Adoption
Requirements for Hadoop’s Role in the Modern Data Architecture
• Integrated
– Interoperable with existing data center investments
• Key Services
– Platform, operational and data services essential for the enterprise
• Skills
– Leverage your existing skills: development, operations, analytics

Requirements for Enterprise Hadoop
1. Key Services
Platform, operational and data services essential for the enterprise
2. Skills
Leverage your existing skills: development, analytics, operations
3. Integrated
Engineered with existing data center investments

[Diagram: Hortonworks Data Platform (HDP) stack]
• Service groups: operational services, data services, core platform services
• Components shown: Ambari, HBase, Pig, Sqoop, Hive & HCatalog, load & extract, NFS, WebHDFS, Knox*, MapReduce, Tez, YARN, HDFS, Flume, Falcon*, Oozie
• Enterprise readiness: high availability, disaster recovery, rolling upgrades, security and snapshots
• Deployment options: OS/VM, cloud, appliance
Requirements for Enterprise Hadoop
1. Key Services
Platform, operational and data services essential for the enterprise
2. Skills
Leverage your existing skills: development, analytics, operations
• Develop: collect, process, build
• Analyze: explore, query, deliver
• Operate: provision, manage, monitor
3. Integration
Engineered with existing data center investments

Familiar and Existing Tools
[Diagram: the three requirements (1. Key Services, 2. Skills, 3. Integration) with the Develop, Analyze and Operate skill areas: collect, process, build; explore, query, deliver; provision, manage, monitor]

Requirements for Enterprise Hadoop
3. Integration
Engineered with existing data center investments

Integrated with:
• Applications: Business Intelligence, Developer IDEs, Data Integration
• Systems: Data Systems & Storage, Systems Management
• Platforms: Operating Systems, Virtualization, Cloud, Appliances

[Diagram: Modern Data Architecture layers: applications (Business Analytics, Custom Applications, Packaged Applications), dev & data tools (build & test), operational tools (manage & monitor), repositories (RDBMS, EDW, MPP), existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)]

A Modern Data Architecture Applied
• Complement data systems
• Right workload, right place
[Diagram: applications (Business Analytics, Custom Applications, Packaged Applications), repositories (RDBMS, EDW, MPP), existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)]

Kognitio in the Modern Data Architecture
[Diagram labels]
• Applications: Business Analytics, Business Intelligence Tools, OLAP Clients
• Dev & data tools: build & test
• Operational tools: manage & monitor
• Data system: In-memory MPP Accelerator; repositories (RDBMS, EDW, MPP)
• Sources: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)

Kognitio in the Modern Data Architecture
[Diagram labels]
• Applications: BusinessObjects BI
• Dev & data tools
• Operational tools
• Data system: In-memory MPP Accelerator; RDBMS, HANA, EDW, MPP
• Infrastructure
• Sources: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)

Today’s Topics
• Introduction
• Drivers for the Modern Data Architecture (MDA)
• Apache Hadoop’s role in the MDA
• Kognitio’s role in the MDA
• Q&A

Hadoop and the new BI
Requirements for Hadoop’s Role in the Modern Data Architecture
• Integrated
– Interoperable with existing data center investments
• Skills
– Leverage your existing skills: development, operations, analytics
• Key Services
– Platform, operational and data services essential for the enterprise

Motivation
• Historical architecture = Existing investment
[Logo: Cognos]
• Must plug-and-play with MDA
– Do not disrupt, enhance!
• Performance and behavior expectations
– Dynamic ad-hoc access
– Drill unlimited
– Report on-demand
(1. Key Services: platform, operational and data services essential for the enterprise)

Business [Intelligence] Desires

More timely
Lower latency
Richer data model
More granularity
Better concurrency
Self service

BI Activity

Insulate the Hadoop cluster

In-memory analytical platform
• Software only
– Easy to deploy alongside HDP
– Simple two stage install

• Commodity Hardware
– x86-64 Linux platform with 10GbE network – same as HDP
– Biased to more RAM and less disk
(3. Integration: engineered with existing data center investments)

• Scale-out MPP
– Same compute model as Hadoop
– Strong focus on 100% effective CPU utilization for any given query

• Exploits features of underlying persistent store
– Simple ‘Pull data’ access methods
– Parallelism – all HDP nodes intercommunicating with all Kognitio nodes

• ANSI SQL:2011
– Mature, fully featured
– Transaction processing capable

• Not-only-SQL
– Any script or binaries executed in-line within SQL queries
(2. Skills: leverage your existing skills: development, analytics, operations)
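
The "Scale-out MPP" point can be sketched as scatter-gather aggregation: rows are spread across nodes, each node aggregates only its own in-memory partition, and the partial results are merged. This is a generic illustration, not Kognitio's implementation or API. The function names, the round-robin distribution and the use of threads as stand-in "nodes" are all assumptions made for this example, and Python threads show the shape of the computation rather than real CPU parallelism.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_nodes):
    """Distribute rows round-robin across n_nodes in-memory partitions."""
    parts = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        parts[i % n_nodes].append(row)
    return parts


def local_aggregate(rows):
    """One 'node' sums revenue per region over its own partition only."""
    totals = Counter()
    for region, revenue in rows:
        totals[region] += revenue
    return totals


def mpp_sum_by_region(rows, n_nodes=4):
    """Scatter rows to nodes, aggregate locally in parallel, then merge partials."""
    parts = partition(rows, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_aggregate, parts))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts key by key
    return dict(merged)


if __name__ == "__main__":
    rows = [("EU", 10), ("US", 5), ("EU", 7), ("US", 1), ("APAC", 3)]
    print(mpp_sum_by_region(rows, n_nodes=3))
```

The equivalent SQL would be a `SUM ... GROUP BY region`; the point of the sketch is that every node works on the query at the same time and only small partial results are combined.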
Tight Integration
• Map-reduce Connector
– Filtered access
• HDFS Connector
– Low-latency access
(3. Integration: engineered with existing data center investments)
So why In-memory?

INSTANT WAIT

• Exploit the ‘Dynamic’ access element of ‘D’-RAM
– Data placed in memory in structures best suited for CPUs, not for disks
In-memory – getting work done

Building Data Models
• Hadoop is a great repository
• Perfect to handle volume and variability without effort
• Perfect to ‘triage’ the data, to reshape, filter and project into…
• Data Virtualisation / Logical Data Warehouse
… but with the associated horsepower to dynamically analyse the data
• Plug standard tools straight in – not a Java programmer in sight!
• Central control and security
• Data model shelf life getting shorter – sandboxes and workbenches
– Build on-demand to meet today’s needs – just pull data from your HDP
– Lots of project based discovery and analytics
– World is changing rapidly
– Ever tighter feedback loops
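
As a rough illustration of the "triage" idea above (filter and project raw data before it is loaded into an in-memory model), the sketch below keeps only the rows and columns a short-lived data model needs. The sample log layout, the column names and the `triage` helper are hypothetical, and no Hadoop or Kognitio connector is involved.

```python
import csv
import io


def triage(lines, keep, columns):
    """Yield only the requested columns of the rows that pass the `keep` predicate."""
    for row in csv.DictReader(lines):
        if keep(row):
            yield {column: row[column] for column in columns}


# Hypothetical raw web log with a header line; in practice this would be a file in the repository.
RAW = """ts,user,url,status,bytes
2013-12-10T09:00:01,u1,/home,200,512
2013-12-10T09:00:02,u2,/cart,500,0
2013-12-10T09:00:03,u1,/cart,200,734
"""

if __name__ == "__main__":
    # Filter to server errors and project to the two columns the analysis needs.
    errors = triage(io.StringIO(RAW), lambda row: row["status"] == "500", ["ts", "url"])
    for record in errors:
        print(record)
```

Because the filter and projection run before loading, only the reduced data set has to fit in memory, which is what makes building models on demand practical.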

Analytical Complexity
[Chart: analytical workloads plotted by Increasing Computation against Technology/Automation. Workloads shown: machine learning algorithms, behaviour modelling, statistical analysis, dynamic simulation, clustering, dynamic interaction, reporting & BPM, campaign management, fraud detection]
The Analytical Enterprise

[Diagram: roles shown are Data Scientist, Systems Admin and Business Analyst]

Key: “Graduation”
• Projects will need to easily Graduate
from the Data Science Lab and
become part of Business as Usual
Mature SQL atop Hadoop
Kognitio is an in-memory analytical platform that is tightly
integrated with Hadoop for high-performance advanced analytics
that make Big Data more consumable for enterprises, especially
those with mature BI environments or engrained tools.

• Powering advanced analytics at 
organizations worldwide, such as: 

• Privately held
• Invented the in‐memory analytical platform
• Labs in the UK ‐ HQ in New York, NY 

Kognitio in the Modern Data Architecture
[Diagram labels]
• Applications: Business Analytics, Business Intelligence Tools, OLAP Clients
• Dev & data tools: build & test
• Operational tools: manage & monitor
• Data system: In-memory MPP Accelerator; repositories (RDBMS, EDW, MPP)
• Sources: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured)

Forrester Wave: a “strong performer”
• Kognitio’s EDW is a strong, cost-effective alternative to SAP HANA.
• Kognitio…was designed from the start as an MPP (distributed) in-memory RDBMS, making extensive use of RAM-based processing for maximum performance.
• Kognitio’s entirely in-memory, distributed EDW is appealing for customers looking for fast performance on commodity hardware
• Download a complimentary copy of the full report at www.kognitio.com/wave
© Forrester Corp. Used with permission.

The Modern Data Architecture
…for in memory Big Data Analytics
More about Kognitio and Hortonworks
http://hortonworks.com/partner/kognitio

Get started with Hortonworks Sandbox
http://hortonworks.com/hadoop-tutorial/

Follow us:
@hortonworks @kognitio

Question & Answer session will be conducted electronically,
using the panel to the right of your screen
Today’s Slides available at: www.slideshare.net/kognitio

Weitere ähnliche Inhalte

Was ist angesagt?

Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paperSupratim Ray
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitecturePerficient, Inc.
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata Hortonworks
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesDataWorks Summit
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...NoSQLmatters
 
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...DATAVERSITY
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake ArchitectureDATAVERSITY
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersDataWorks Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data LakeMetroStar
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesDenodo
 
Designing Fast Data Architecture for Big Data using Logical Data Warehouse a...
Designing Fast Data Architecture for Big Data  using Logical Data Warehouse a...Designing Fast Data Architecture for Big Data  using Logical Data Warehouse a...
Designing Fast Data Architecture for Big Data using Logical Data Warehouse a...Denodo
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
 
IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?Hortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationHortonworks
 
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductDell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductManuel "Manny" Rodriguez-Perez
 
Pervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityPervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityCloudera, Inc.
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Data Con LA
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digitalsambiswal
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overviewjdijcks
 

Was ist angesagt? (20)

Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paper
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data Architectures
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
 
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...
Webinar: Data Modeling and Shortcuts to Success in Scaling Time Series Applic...
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake Architecture
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service Providers
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Big Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data LakesBig Data: Architecture and Performance Considerations in Logical Data Lakes
Big Data: Architecture and Performance Considerations in Logical Data Lakes
 
Designing Fast Data Architecture for Big Data using Logical Data Warehouse a...
Designing Fast Data Architecture for Big Data  using Logical Data Warehouse a...Designing Fast Data Architecture for Big Data  using Logical Data Warehouse a...
Designing Fast Data Architecture for Big Data using Logical Data Warehouse a...
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?IDC Retail Insights - What's Possible with a Modern Data Architecture?
IDC Retail Insights - What's Possible with a Modern Data Architecture?
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your ProductDell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
Dell Technology World - IT as a Business - Multi-Cloud Strategy is your Product
 
Pervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricityPervasive analytics through data & analytic centricity
Pervasive analytics through data & analytic centricity
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
 

Ähnlich wie Hadoop and the new BI: The Modern Data Architecture for in memory Big Data Analytics

Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Hortonworks
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...MapR Technologies
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudHortonworks
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantagePrecisely
 
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy:  A Simple, Scalable Solution for Getting Started with HadoopBig Data Made Easy:  A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with HadoopPrecisely
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
Hortonworks Big Data & Hadoop
Hortonworks Big Data & HadoopHortonworks Big Data & Hadoop
Hortonworks Big Data & HadoopMark Ginnebaugh
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupMats Johansson
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 

Ähnlich wie Hadoop and the new BI: The Modern Data Architecture for in memory Big Data Analytics (20)

Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open Cloud
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
 
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy:  A Simple, Scalable Solution for Getting Started with HadoopBig Data Made Easy:  A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Hortonworks Big Data & Hadoop
Hortonworks Big Data & HadoopHortonworks Big Data & Hadoop
Hortonworks Big Data & Hadoop
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User Group
 
Meetup oslo hortonworks HDP
Meetup oslo hortonworks HDPMeetup oslo hortonworks HDP
Meetup oslo hortonworks HDP
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 


Hadoop and the new BI: The Modern Data Architecture for in memory Big Data Analytics

  • 1. Hadoop and the new BI: The Modern Data Architecture …for in memory Big Data Analytics 10 December 2013
  • 2. Quick Housekeeping Q&A box is available for your questions Webinar will be recorded for future viewing Thank You for joining! © Hortonworks Inc. 2013
  • 3. Modern Data Architecture …for in memory Big Data Analytics
  • 4. Your Presenters. Paul Groom (@datagroom): Chief Innovation Officer; 28 years buried in the big data of the data, guiding business users to value; two wheels are more fun than four. John Kreisa (@marked_man): VP Strategic Marketing, Hortonworks; over 20 years in data management as a developer and a marketer; avid camper.
  • 5. Today’s Topics: Introduction; Drivers for the Modern Data Architecture (MDA); Apache Hadoop in the MDA; Kognitio’s role in the MDA; Q&A
  • 6. Existing Data Architecture. Applications: Business Analytics, Custom Applications, Packaged Applications. Data System: RDBMS, EDW, MPP repositories. Sources: Existing Sources (CRM, ERP, Clickstream, Logs). Data growth (source: IDC): 2.8 ZB in 2012, 85% from new data types; 15x machine data by 2020; 40 ZB by 2020.
  • 7. Modern Data Architecture Enabled. Applications: Business Analytics, Custom Applications, Packaged Applications. Dev & Data Tools: Build & Test. Operational Tools: Manage & Monitor. Data System: RDBMS, EDW, MPP repositories. Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 8. Hadoop Powers Modern Data Architecture. Hadoop clusters (compute & storage nodes) provide scale-out storage and distributed data processing on commodity hardware. Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment.
  • 9. Drivers of Hadoop Adoption. New Business Applications: from NEW types of data (or existing types for longer).
  • 10. Most Common NEW TYPES OF DATA. 1. Sentiment: understand how your customers feel about your brand and products, right now. 2. Clickstream: capture and analyze website visitors’ data trails and optimize your website. 3. Sensor/Machine: discover patterns in data streaming automatically from remote sensors and machines. 4. Geographic: analyze location-based data to manage operations where they occur. 5. Server Logs: research logs to diagnose process failures and prevent security breaches. 6. Unstructured (text, video, pictures, etc.): understand patterns in files across millions of web pages, emails, and documents.
  • 11. Keep Existing Data Around Longer. Online archive: data that was once moved to tape can now be queried to understand long-term trends. Compliance retention: industry-specific requirements for retention of data. Combine with external historical data sources: weather, survey, research, purchased, etc.
  • 12. Drivers of Hadoop Adoption. Architectural: a Modern Data Architecture. Complement your existing data systems: the right workload in the right place. New Business Applications.
  • 13. Requirements for Hadoop Adoption. Requirements for Hadoop’s Role in the Modern Data Architecture. Integrated: interoperable with existing data center investments. Key Services: platform, operational and data services essential for the enterprise. Skills: leverage your existing skills: development, operations, analytics.
  • 14. Requirements for Enterprise Hadoop. 1 Key Services: platform, operational and data services essential for the enterprise. 2 Skills: leverage your existing skills: development, analytics, operations. 3 Integrated: engineered with existing data center investments. Hortonworks Data Platform (HDP) service layers: Operational Services, Data Services, Core Platform Services. Components and labels shown: Ambari, Oozie, Falcon*, Flume, Sqoop, Load & Extract, NFS, WebHDFS, Pig, Hive & HCatalog, HBase, MapReduce, Tez, YARN, HDFS, Knox*. Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots. OS/VM, Cloud, Appliance.
  • 15. Requirements for Enterprise Hadoop. 1 Key Services: platform, operational and data services essential for the enterprise. 2 Skills: leverage your existing skills: development, analytics, operations. 3 Integration: engineered with existing data center investments. DEVELOP: Collect, Process, Build. ANALYZE: Explore, Query, Deliver. OPERATE: Provision, Manage, Monitor.
  • 16. Familiar and Existing Tools. 1 Key Services: platform, operational and data services essential for the enterprise. 2 Skills: leverage your existing skills: development, analytics, operations. 3 Integration: engineered with existing data center investments. DEVELOP: Collect, Process, Build. ANALYZE: Explore, Query, Deliver. OPERATE: Provision, Manage, Monitor.
  • 17. Requirements for Enterprise Hadoop. 3 Integration: engineered with existing data center investments. Integrated with Applications (Business Intelligence, Developer IDEs, Data Integration), Systems (Data Systems & Storage, Systems Management) and Platforms (Operating Systems, Virtualization, Cloud, Appliances). Diagram layers: Applications (Business Analytics, Custom Applications, Packaged Applications); Dev & Data Tools (Build & Test); Operational Tools (Manage & Monitor); Data System (RDBMS, EDW, MPP repositories); Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 18. A Modern Data Architecture Applied. Complement data systems: right workload, right place. Applications: Business Analytics, Custom Applications, Packaged Applications. Data System: RDBMS, EDW, MPP repositories. Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 19. Kognitio in the Modern Data Architecture. Applications: Business Analytics, Business Intelligence Tools, OLAP Clients. Dev & Data Tools: Build & Test. Operational Tools: Manage & Monitor. Data System: In-memory MPP Accelerator, with RDBMS, EDW, MPP repositories. Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 20. Kognitio in the Modern Data Architecture. Labels shown: Applications (BusinessObjects BI); Dev & Data Tools; Operational Tools; Data System (In-memory MPP Accelerator, RDBMS, HANA, EDW, MPP); Infrastructure; Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 21. Today’s Topics: Introduction; Drivers for the Modern Data Architecture (MDA); Apache Hadoop’s role in the MDA; Kognitio’s role in the MDA; Q&A
  • 22. Hadoop and the new BI. Requirements for Hadoop’s Role in the Modern Data Architecture. 1 Integrated: interoperable with existing data center investments. 2 Skills: leverage your existing skills: development, operations, analytics. 3 Key Services: platform, operational and data services essential for the enterprise.
  • 23. Motivation. Historical architecture = existing investment. Must plug-and-play with MDA: do not disrupt, enhance! Performance and behavior expectations: dynamic ad-hoc access; drill unlimited; report on-demand. Callout: 1 Key Services: platform, operational and data services essential for the enterprise. Also shown: Cognos.
  • 24. Business [Intelligence] Desires: more timely; lower latency; richer data model; more granularity; better concurrency; self service.
  • 25. BI Activity. Insulate the Hadoop cluster.
  • 26. In-memory analytical platform. Software only: easy to deploy alongside HDP; simple two-stage install. Commodity hardware: x86-64 Linux platform with 10GbE network, same as HDP; biased to more RAM and less disk. Scale-out MPP: same compute model as Hadoop; strong focus on 100% effective CPU utilization for any given query. Exploits features of underlying persistent store: simple ‘pull data’ access methods; parallelism, with all HDP nodes intercommunicating with all Kognitio nodes. ANSI 2011 SQL: mature, fully featured; transaction processing capable. Not-only-SQL: any script or binaries executed in-line within SQL queries. Callouts: 3 Integration: engineered with existing data center investments. 2 Skills: leverage your existing skills: development, analytics, operations.
  • 27. Tight Integration. Map-reduce Connector: filtered access. HDFS Connector: low-latency access. Callout: 3 Integration: engineered with existing data center investments.
  • 28. So why In-memory? INSTANT / WAIT. Exploit the ‘Dynamic’ access element of ‘D’-RAM: data placed in memory in structures best suited for CPUs, not for disks.
  • 29. In-memory – getting work done
  • 30. Building Data Models. Hadoop is a great repository. Perfect to handle volume and variability without effort. Perfect to ‘triage’ the data, to reshape, filter and project into… Data Virtualisation / Logical Data Warehouse … but with the associated horsepower to dynamically analyse the data. Plug standard tools straight in: not a Java programmer in sight! Central control and security. Data model shelf life getting shorter (sandboxes and workbenches): build on-demand to meet today’s needs, just pull data from your HDP; lots of project-based discovery and analytics; world is changing rapidly; ever tighter feedback loops.
  • 31. Analytical Complexity Increasing. Axis labels: Computation; Technology/Automation. Workloads shown: Machine learning algorithms, Behaviour modelling, Statistical Analysis, Dynamic Simulation, Clustering, Dynamic Interaction, Reporting & BPM, Campaign Management, Fraud detection.
  • 32. The Analytical Enterprise. Roles shown: Data Scientist, Systems Admin, Business Analyst. Key: “Graduation”. Projects will need to easily graduate from the Data Science Lab and become part of Business as Usual.
  • 33. Mature SQL atop Hadoop. Kognitio is an in-memory analytical platform that is tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or engrained tools. Powering advanced analytics at organizations worldwide. Privately held. Invented the in-memory analytical platform. Labs in the UK, HQ in New York, NY.
  • 34. Kognitio in the Modern Data Architecture. Applications: Business Analytics, Business Intelligence Tools, OLAP Clients. Dev & Data Tools: Build & Test. Operational Tools: Manage & Monitor. Data System: In-memory MPP Accelerator, with RDBMS, EDW, MPP repositories. Sources: Existing Sources (CRM, ERP, Clickstream, Logs) and Emerging Sources (Sensor, Sentiment, Geo, Unstructured).
  • 35. Forrester Wave: a “strong performer”. Kognitio’s entirely in-memory, distributed EDW is appealing for customers looking for fast performance on commodity hardware. Kognitio’s EDW is a strong, cost-effective alternative to SAP HANA. Kognitio…was designed from the start as an MPP (distributed) in-memory RDBMS, making extensive use of RAM-based processing for maximum performance. © Forrester Corp. Used with permission. Download a complimentary copy of the full report at www.kognitio.com/wave
  • 36. The Modern Data Architecture …for in memory Big Data Analytics. More about Kognitio and Hortonworks (http://hortonworks.com/partner/kognitio). Get started with Hortonworks Sandbox (http://hortonworks.com/hadoop-tutorial/). Follow us: @hortonworks @kognitio. Question & Answer session will be conducted electronically, using the panel to the right of your screen. Today’s slides available at: www.slideshare.net/kognitio