SlideShare ist ein Scribd-Unternehmen logo
1 von 8
Overview of RedPoint Data Management
for Hortonworks Hadoop
2014
1 RedPoint Global Inc.28 May 2014© Confidential
What is Hadoop/Hadoop 2.0?
Hadoop 1.0
• All operations based on Map Reduce
• Intrinsic inconsistency of code based
solutions
• Highly skilled and expensive resources
needed
• 3rd party applications constrained by the
need to generate code
Lower
cost
scaling
No need
for
structure
Ease of
data
capture
Hadoop 2.0
• Introduction of the YARN:
“a general-purpose, distributed, application
management framework that supersedes the classic
Apache Hadoop MapReduce framework for
processing data in Hadoop clusters.”
• Mature applications can now operate
directly on Hadoop
• Reduce skill requirements and increased
consistency
2 RedPoint Global Inc.28 May 2014© Confidential
Challenges to Hadoop Adoption
• Severe shortage of MR
skilled resources
• Very expensive resources
and hard to retain
• Inconsistent skills lead to
inconsistent results
• Under utilizes existing
resources
• Prevents broad leverage
of investments across
enterprise
Skills Gap
• A nascent technology
ecosystem around
Hadoop
• Emerging technologies
only address narrow
slivers of functionality
• New applications are not
enterprise class
• Legacy applications have
built short term
capabilities
Maturity & Governance
• Data is not useful in its
raw state, it must be
turned into information
• Benefit of Hadoop is that
same data can be used
from many perspectives
• Analysts must now do
the structuring of the
data based on intended
use of the data
Data Into Information
3 RedPoint Global Inc.28 May 2014© Confidential
How RedPoint Helps
First YARN compliant ETL/data quality
toolset on the market – brings together
both Big Data and traditional data to create
Big Information!
• Customer or Party Data
• Processing Speed
• Match Quality
• Ease of Use
by in:
RANKED
#1 The power to make
your data the biggest
asset your organization
has
4 RedPoint Global Inc.28 May 2014© Confidential
RedPoint in a Hortonworks environment
APPLICATIONSDATASYSTEMSOURCES
OLTP, ERP,
CRM Systems
Documents,
Emails
Web Logs,
Click Streams
Social
Networks
Machine
Generated
Sensor
Data
Geolocation
Data
Repositories
Governance
&Integration
Security
Operations
Data Access
Data Management
RDBMS
EDW
MPP
Data Quality
Data Integration
One application, one graphical user interface for traditional and Big Data
ELT  ETL  Cleanse  Match  De-dupe  Merge/Purge  Household
Partition  Parse  Append  Standardize  Key  Automate  Monitor  Notify
Pre-built adapters
and ODBC drivers.
Pure YARN application
No MapReduce needed
No in-cluster installation
5 RedPoint Global Inc.28 May 2014© Confidential
Monitoring and Management Tools
Typical Hadoop architecture without RedPoint
AMBARI
MAPREDUCE
REST
DATA REFINEMENT
HIVEPIG
HTTP
STREAM
STRUCTURE
HCATALOG
(metadata services)
Query/Visualization/
Reporting/Analytical
Tools and Apps
SOURCE
DATA
- Sensor Logs
- Clickstream
- Flat Files
- Unstructured
- Sentiment
- Customer
- Inventory
DBs
JMS
Queue’s
Fil
es
Fil
esFiles
Data Sources
RDBMS
EDW
INTERACTIVE
HIVE Server2
LOAD
SQOOP
FLUME
WebHDFS
NFS
LOAD
SQOOP/Hive
Web HDFS
YARN
         
          
          
 
 
 n
HDFS
1            

           
           
            
6 RedPoint Global Inc.28 May 2014© Confidential
Monitoring and Management Tools
Typical Hadoop architecture with RedPoint
AMBARI
MAPREDUCE
REST
DATA REFINEMENT
HIVEPIG
HTTP
STREAM
STRUCTURE
HCATALOG
(metadata services)
Query/Visualization/
Reporting/Analytical
Tools and Apps
SOURCE
DATA
- Sensor Logs
- Clickstream
- Flat Files
- Unstructured
- Sentiment
- Customer
- Inventory
DBs
JMS
Queue’s
Fil
es
Fil
esFiles
Data Sources
RDBMS
EDW
INTERACTIVE
HIVE Server2
LOAD
SQOOP
WebHDFS
Flume
NFS
LOAD
SQOOP/Hive
Web HDFS
YARN
         
          
          
 
 
 n
HDFS
1            

           
           
            
7 RedPoint Global Inc.28 May 2014© Confidential

Weitere ähnliche Inhalte

Andere mochten auch

Wedia Social Media presentation at DigitalDays
Wedia Social Media presentation at DigitalDaysWedia Social Media presentation at DigitalDays
Wedia Social Media presentation at DigitalDaysPanos Kontopoulos
 
Innovation In The Workplace Andrew James
Innovation In The Workplace   Andrew JamesInnovation In The Workplace   Andrew James
Innovation In The Workplace Andrew JamesKonica Minolta
 
Presence Agent y Presence Scripting para personas con limitaciones visuales
Presence Agent y Presence Scripting para personas con limitaciones visualesPresence Agent y Presence Scripting para personas con limitaciones visuales
Presence Agent y Presence Scripting para personas con limitaciones visualesPresence Technology
 
Running SagePFW in a Private Cloud
Running SagePFW in a Private CloudRunning SagePFW in a Private Cloud
Running SagePFW in a Private CloudVertical Solutions
 
New Research: Cloud, Cost & Complexity Impact IAM & IT
New Research: Cloud, Cost & Complexity Impact IAM & ITNew Research: Cloud, Cost & Complexity Impact IAM & IT
New Research: Cloud, Cost & Complexity Impact IAM & ITSymplified
 
Pramata Tech Dinosaurs ePaper - Social Sharing
Pramata Tech Dinosaurs ePaper - Social SharingPramata Tech Dinosaurs ePaper - Social Sharing
Pramata Tech Dinosaurs ePaper - Social SharingTidemark Systems Inc.
 
Visual Studio 2013 - Recursos da IDE
Visual Studio 2013 - Recursos da IDEVisual Studio 2013 - Recursos da IDE
Visual Studio 2013 - Recursos da IDEStefanini
 
Getting started with performance testing
Getting started with performance testingGetting started with performance testing
Getting started with performance testingTestplant
 

Andere mochten auch (10)

Wedia Social Media presentation at DigitalDays
Wedia Social Media presentation at DigitalDaysWedia Social Media presentation at DigitalDays
Wedia Social Media presentation at DigitalDays
 
Innovation In The Workplace Andrew James
Innovation In The Workplace   Andrew JamesInnovation In The Workplace   Andrew James
Innovation In The Workplace Andrew James
 
Presence Agent y Presence Scripting para personas con limitaciones visuales
Presence Agent y Presence Scripting para personas con limitaciones visualesPresence Agent y Presence Scripting para personas con limitaciones visuales
Presence Agent y Presence Scripting para personas con limitaciones visuales
 
Running SagePFW in a Private Cloud
Running SagePFW in a Private CloudRunning SagePFW in a Private Cloud
Running SagePFW in a Private Cloud
 
Dr Ravi Gupta
Dr Ravi GuptaDr Ravi Gupta
Dr Ravi Gupta
 
New Research: Cloud, Cost & Complexity Impact IAM & IT
New Research: Cloud, Cost & Complexity Impact IAM & ITNew Research: Cloud, Cost & Complexity Impact IAM & IT
New Research: Cloud, Cost & Complexity Impact IAM & IT
 
Pramata Tech Dinosaurs ePaper - Social Sharing
Pramata Tech Dinosaurs ePaper - Social SharingPramata Tech Dinosaurs ePaper - Social Sharing
Pramata Tech Dinosaurs ePaper - Social Sharing
 
Visual Studio 2013 - Recursos da IDE
Visual Studio 2013 - Recursos da IDEVisual Studio 2013 - Recursos da IDE
Visual Studio 2013 - Recursos da IDE
 
TXT Next Presentation
TXT Next Presentation TXT Next Presentation
TXT Next Presentation
 
Getting started with performance testing
Getting started with performance testingGetting started with performance testing
Getting started with performance testing
 

Kürzlich hochgeladen

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

Overview of RedPoint Data Management for Hortonworks Hadoop

  • 1. Overview of RedPoint Data Management for Hortonworks Hadoop 2014
  • 2. 1 RedPoint Global Inc.28 May 2014© Confidential What is Hadoop/Hadoop 2.0? Hadoop 1.0 • All operations based on Map Reduce • Intrinsic inconsistency of code based solutions • Highly skilled and expensive resources needed • 3rd party applications constrained by the need to generate code Lower cost scaling No need for structure Ease of data capture Hadoop 2.0 • Introduction of the YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.” • Mature applications can now operate directly on Hadoop • Reduce skill requirements and increased consistency
  • 3. 2 RedPoint Global Inc.28 May 2014© Confidential Challenges to Hadoop Adoption • Severe shortage of MR skilled resources • Very expensive resources and hard to retain • Inconsistent skills lead to inconsistent results • Under utilizes existing resources • Prevents broad leverage of investments across enterprise Skills Gap • A nascent technology ecosystem around Hadoop • Emerging technologies only address narrow slivers of functionality • New applications are not enterprise class • Legacy applications have built short term capabilities Maturity & Governance • Data is not useful in its raw state, it must be turned into information • Benefit of Hadoop is that same data can be used from many perspectives • Analysts must now do the structuring of the data based on intended use of the data Data Into Information
  • 4. 3 RedPoint Global Inc.28 May 2014© Confidential How RedPoint Helps First YARN compliant ETL/data quality toolset on the market – brings together both Big Data and traditional data to create Big Information! • Customer or Party Data • Processing Speed • Match Quality • Ease of Use by in: RANKED #1 The power to make your data the biggest asset your organization has
  • 5. 4 RedPoint Global Inc.28 May 2014© Confidential RedPoint in a Hortonworks environment APPLICATIONSDATASYSTEMSOURCES OLTP, ERP, CRM Systems Documents, Emails Web Logs, Click Streams Social Networks Machine Generated Sensor Data Geolocation Data Repositories Governance &Integration Security Operations Data Access Data Management RDBMS EDW MPP Data Quality Data Integration One application, one graphical user interface for traditional and Big Data ELT  ETL  Cleanse  Match  De-dupe  Merge/Purge  Household Partition  Parse  Append  Standardize  Key  Automate  Monitor  Notify Pre-built adapters and ODBC drivers. Pure YARN application No MapReduce needed No in-cluster installation
  • 6. 5 RedPoint Global Inc.28 May 2014© Confidential Monitoring and Management Tools Typical Hadoop architecture without RedPoint AMBARI MAPREDUCE REST DATA REFINEMENT HIVEPIG HTTP STREAM STRUCTURE HCATALOG (metadata services) Query/Visualization/ Reporting/Analytical Tools and Apps SOURCE DATA - Sensor Logs - Clickstream - Flat Files - Unstructured - Sentiment - Customer - Inventory DBs JMS Queue’s Fil es Fil esFiles Data Sources RDBMS EDW INTERACTIVE HIVE Server2 LOAD SQOOP FLUME WebHDFS NFS LOAD SQOOP/Hive Web HDFS YARN                                      n HDFS 1                                                  
  • 7. 6 RedPoint Global Inc.28 May 2014© Confidential Monitoring and Management Tools Typical Hadoop architecture with RedPoint AMBARI MAPREDUCE REST DATA REFINEMENT HIVEPIG HTTP STREAM STRUCTURE HCATALOG (metadata services) Query/Visualization/ Reporting/Analytical Tools and Apps SOURCE DATA - Sensor Logs - Clickstream - Flat Files - Unstructured - Sentiment - Customer - Inventory DBs JMS Queue’s Fil es Fil esFiles Data Sources RDBMS EDW INTERACTIVE HIVE Server2 LOAD SQOOP WebHDFS Flume NFS LOAD SQOOP/Hive Web HDFS YARN                                      n HDFS 1                                                  
  • 8. 7 RedPoint Global Inc.28 May 2014© Confidential