Hadoop
(Shanghai Developer Meetup, Sept 15, 2011)

余家昌 (Andrew Yu)
EMC Greenplum




© Copyright 2011 EMC Corporation. All rights reserved.                             1
The Elephant Chase




Yahoo! Hadoop use cases
• Personalized Yahoo! Homepage
• Yahoo! Mail anti-spam
• Search and Ad pipelines
• Ad inventory prediction
• Data analytics
• etc




Enterprise Use Case: “Big ETL”
Challenge: Transform Massive Data Flows Containing Data Needed for Complex Analysis
Solution: Hadoop/MapReduce as ETL fabric to load to Analytic Database
• Examples:
         – Web Traffic Reduction
         – Network Traffic & Performance Analysis
         – Location Analytics for People and Goods
         – Smart Electric Power Grid
         – Genome Analysis
         – Clinical Outcome Research & Analysis
• Data Sources:
         – Web server & app server logs
         – CDR / xDRs
         – Router & Switching Subsystem Logs
         – Sensor networks
• Components:
         – Hadoop: Massively-parallel ingest, storage and analysis
         – MapReduce: Runs multiple cascaded custom analysis / extraction on captured data
         – Connectors move structured data to Analytics DB
• Hadoop’s Roles:
         – Capture TBs/day of machine-generated data
         – Quality: Run data quality tasks in MapReduce
         – Execute MapReduce flows
         – Extract/Combine data/metadata
         – Move processed data to analytic DB
• Limitations & Cautions:
         – Software development, More parts (Cascading/Flow), Maintainability
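To make the "data quality tasks in MapReduce" role more concrete, here is a minimal sketch of a map step and a reduce step in plain Python, run in-process rather than on a Hadoop cluster. The tab-separated log format and the names `map_record`, `reduce_counts` and `run_job` are assumptions made for illustration; they are not from the talk.

```python
from collections import defaultdict


def map_record(line):
    """Map step: classify one raw log line and emit (key, 1) pairs.

    Assumed (hypothetical) format: 'timestamp<TAB>host<TAB>status'.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3 or not fields[2].isdigit():
        yield ("invalid", 1)
    else:
        yield ("valid", 1)
        yield ("status_" + fields[2], 1)


def reduce_counts(key, values):
    """Reduce step: sum all counts emitted for one key."""
    return key, sum(values)


def run_job(lines):
    """Tiny in-process stand-in for the shuffle between map and reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return dict(reduce_counts(k, v) for k, v in groups.items())


sample = [
    "2011-09-15T10:00:00\tweb01\t200",
    "2011-09-15T10:00:01\tweb02\t500",
    "corrupted line without tabs",
    "2011-09-15T10:00:02\tweb01\t200",
]
report = run_job(sample)
print(report)
```

On a real cluster the framework performs the grouping that `run_job` imitates here, and the resulting counts could feed the decision on whether a batch is clean enough to load into the analytic database.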



Enterprise Use Case: Fraud Detection
Challenge: Identify & alert on fraudulent activity patterns
Solution: Hadoop/MapReduce to filter & correlate communications
• Examples:
         – ESPs: Email Fraud
         – Finance/Banking: Bank Fraud
         – Advertising: Click Fraud
         – Telecom: Network Fraud
• Data Sources:
         – Web & app server logs
         – IP/Call Records
         – Email Traffic
         – Customer Transaction Data
         – Banking/Credit Data
• Components:
         – Hadoop: Massively-parallel ingest, storage and analysis
         – Mahout: Machine learning tool for building fraud algorithms
         – MapReduce: Rapid analysis & algorithm deployment
• Hadoop’s Role(s):
         – Massive ingest of historical/real-time data
         – Build/Validate model for fraud detection manually or using Mahout
         – Parallel MapReduce jobs for near real-time fraud detection
• Limitations & Cautions:
         – Software development, Partial Solution (not Real-time, not Interactive)


Enterprise Use Case: Cluster Analysis
Challenge: Grouping a collection of data according to common similarities
Solution: Process and Refine in Hadoop and load into Analytical DB
• Examples:
         – Customer segmentation
         – Financial cost/risk analysis
         – Patient-centric healthcare
         – Financial stock classification
         – Social network analysis
• Data Sources:
         – Health records
         – Sales data
         – Human genome sequences
         – Financial trading data
         – Facebook/Twitter/LinkedIn
• Components:
         – Hadoop: Flexible data storage as volume increases and structures vary
         – MapReduce: Cascading allows data processing with minimal adjustments
         – Optional: Connectors to move results to Analytic DB
• Hadoop’s Role(s):
         – Flexible: Allow agile implementation and unit testing of algorithms
         – Large scale analysis in Hadoop creates more accurate groupings
         – Rapid, parallel processing in MapReduce
• Limitations & Cautions:
         – Software development, Complex Integration with Sources



Greenplum HD: Community Edition Stack
• 100% Apache
• Currently supported components:
         – Pig
         – Hive
         – HBase
         – Zookeeper
         – MapReduce Framework (MapRed)
         – Hadoop Distributed File System (HDFS)
• Future releases may include support for Oozie and Mahout
Greenplum HD: Enterprise Edition Stack
• 100% Apache interface
• Currently supported components:
         – Enhanced Monitoring
         – Pig
         – Hive
         – HBase
         – Zookeeper
         – MapReduce Framework (MapRed)
         – Hadoop Distributed File System (HDFS)
• Future releases may include support for Oozie and Mahout
Greenplum HD: Enterprise Edition
Enterprise-Ready Hadoop Platform for Unstructured Data



• Faster
         – 2 to 5x faster than Apache Hadoop
• Reliable
         – High Availability
         – Mirroring
• Easier to Use
         – NFS mountable
         – System Management




Greenplum Enterprise HD is Faster than
Other Distributions

[Figure: two benchmark charts. DFSIO: read and write throughput in MB/sec (higher is better). Terasort: elapsed time in minutes for 3.5 TB (lower is better).]

Test configuration: 10 node cluster, 2x Quad-Core, 24G DRAM, 12 x 1TB SATA Drives @ 7200 rpm, Quad NICs




Greenplum Enterprise HD
Distributed Name Node
• Fully distributed service running on all Hadoop nodes
• Automatic and transparent failover
• Persistent metadata
• Highly scalable in number of files

[Diagram: a grid of Hadoop nodes, each one running a name node (NN) service]




Greenplum Enterprise HD
Job Tracker High Availability
• Assures business continuity
• Designed for mission critical use
         – Automatic stateful restart
         – Task Tracker reconnects without task loss
         – Persistent completed task state

[Diagram: Greenplum Enterprise HD Distribution for Apache Hadoop, showing Enterprise HD MapReduce, Job Tracker HA, Distributed Name Node and Enterprise HD Lockless Storage Services]




Greenplum Enterprise HD
Snapshots
• Intelligent Snapshots
         – Automatic data deduplication
         – Block sharing for space savings
• Fast and flexible
         – Zero performance loss when writing to the original
• Easy to manage
         – Scheduled or on-demand
         – Drag and drop recovery

[Diagram: Hadoop / HBase applications and NFS applications read and write through Enterprise HD Lockless Storage Services. Blocks A, B, C, C’ and D are shown with "redirect on write for snapshot", referenced by Snapshots 1, 2 and 3.]
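The redirect-on-write idea on this slide can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `Volume` class and its methods are hypothetical names, and nothing here reflects how Enterprise HD actually implements snapshots. It shows why a snapshot can share unchanged blocks (A, B, D) while a later write to C lands in a new block C’.

```python
class Volume:
    """Toy volume: immutable block storage plus a live block map."""

    def __init__(self):
        self.store = {}      # block id -> bytes; blocks are never overwritten
        self.live = {}       # logical block name -> current block id
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self._next_id = 0

    def write(self, name, data):
        # Redirect on write: new data goes to a fresh block and the live
        # map is repointed, so blocks referenced by snapshots stay intact.
        block_id = self._next_id
        self._next_id += 1
        self.store[block_id] = data
        self.live[name] = block_id

    def snapshot(self, snap_name):
        # A snapshot copies only the block map, not the block contents.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name, snap_name=None):
        block_map = self.live if snap_name is None else self.snapshots[snap_name]
        return self.store[block_map[name]]


vol = Volume()
for name in "ABCD":
    vol.write(name, ("data-" + name).encode())

vol.snapshot("snap1")
vol.write("C", b"data-C-prime")  # C is redirected to a new block (C')

# Blocks still shared between the live volume and the snapshot.
shared = [n for n in vol.live if vol.live[n] == vol.snapshots["snap1"][n]]
print(shared, vol.read("C"), vol.read("C", "snap1"))
```

In this model taking a snapshot costs one map copy, and writing to the original afterwards is an ordinary block write, which is the intuition behind the "block sharing" and "zero performance loss" bullets.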




Greenplum Enterprise HD
Mirroring
• Business Continuity
         – Efficient design
         – Differential deltas are updated
         – Data is compressed and check-summed
• Easy to manage
         – Scheduled or on-demand
         – Consistent point-in-time

[Diagram: Production in Datacenter 1 mirrored over WAN to Research in Datacenter 2; Production mirrored over WAN to Cloud]
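The bullets about differential deltas, compression and checksums can be sketched as follows. This is an illustrative model only, not the product's mirroring protocol: the fixed block size, the SHA-256 checksum, zlib compression and the function names `build_delta` and `apply_delta` are all assumptions chosen for the example. It also ignores the case where the source shrinks.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum(block):
    return hashlib.sha256(block).hexdigest()


def build_delta(source, mirror_checksums):
    """Return {block index: (checksum, compressed block)} for changed blocks."""
    delta = {}
    for index, block in enumerate(split_blocks(source)):
        digest = checksum(block)
        if index >= len(mirror_checksums) or mirror_checksums[index] != digest:
            delta[index] = (digest, zlib.compress(block))
    return delta


def apply_delta(mirror_blocks, delta):
    """Apply a delta on the mirror side, verifying each block's checksum."""
    blocks = list(mirror_blocks)
    for index in sorted(delta):
        digest, payload = delta[index]
        block = zlib.decompress(payload)
        if checksum(block) != digest:
            raise ValueError("checksum mismatch for block %d" % index)
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return blocks


old = b"aaaabbbbccccdddd"
new = b"aaaaBBBBccccddddeeee"

mirror = split_blocks(old)
delta = build_delta(new, [checksum(b) for b in mirror])
mirror = apply_delta(mirror, delta)
print(sorted(delta), b"".join(mirror))
```

Only the changed block and the appended block cross the (simulated) WAN; the three unchanged blocks are never sent.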




Greenplum Enterprise HD
   Direct Access Using NFS

• Simple application integration
         – Leverage NFS for random read/write access
• Direct access for standard Hadoop tools
         – Command line utilities
         – File browsers
         – Desktop utilities

[Diagram: Greenplum Enterprise HD Distribution for Apache Hadoop, showing Enterprise HD MapReduce, Job Tracker HA, Distributed Name Node and Enterprise HD Lockless Storage Services]


Greenplum Enterprise HD
 Simple Management

• Intuitive
• Insightful
• Complete
• One node or thousands




Greenplum HD: Software Distributions

Features                      Community Edition          Enterprise Edition
Apache Compatibility          100% Apache Open Source    100% API Compatible
Name Node High Availability   Reference Implementation   Distributed and High Availability
Job Tracker HA                Reference Implementation   JT High Availability
Name Node Scalability         NN Metadata in Memory      Distributed Name Node
Premium Support               Yes                        Yes
Performance                   -                          2 - 5x faster than Community Edition
Snapshots                     No                         Yes
Mirrors                       No                         Yes
NFS Mounts                    No                         Yes
System Management             No                         Yes
Available for Ordering        May 9th 2011               Q3
Pricing                       Per Node Pricing           Per Node Pricing




Greenplum HD on
Data Computing Appliance
• Introducing the world’s first:
         – High-performance
         – Purpose-built
         – Data co-processing Hadoop appliance
• Combining Greenplum Database and Greenplum Hadoop in one appliance




GPDB and GPHD Interoperability

[Diagram: GPDB external tables move GPHD data in and out within a GPDB query; on the GPHD side the data is a file on HD]




Greenplum Database
External Tables for Hadoop

• Bring GPDB relational expressive power to HDFS
         – HDFS data presented as external tables
         – HDFS data supporting full SQL syntax
• Have ALL, PART or NONE of your data in HDFS
• Leverage full parallelism of both Hadoop and GPDB
         – GPDB can read from/write to HDFS

Example:
         SELECT count(*)
         FROM HDFS_data h, GPDB_data g
         WHERE h.key = g.key;

         INSERT INTO HDFS_data
         SELECT * FROM GPDB_data;




Greenplum Enterprise HD
HDFS Integration – Parallelized Flow
• Reading:
         – Each GPDB segment reads a portion of the file
                   • Segment i of n reads the i/n-th portion
         – Access offset from HDFS namenode
         – Read data directly from HDFS datanode
• Writing:
         – Each GPDB segment writes a file
         – HDFS balancing distributes the load evenly
           across datanodes
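
The read-side split ("segment i of n reads the i/n-th portion") comes down to simple range arithmetic, sketched below. The function name `segment_range`, the zero-based segment index and the half-open byte ranges are assumptions made for this example; a real reader would also have to align each range to record boundaries, which is not shown.

```python
def segment_range(file_size, segment_index, segment_count):
    """Return the half-open byte range [start, end) read by one segment.

    segment_index is zero-based here: 0 <= segment_index < segment_count.
    Integer arithmetic keeps the ranges contiguous and non-overlapping
    even when file_size is not divisible by segment_count.
    """
    if not 0 <= segment_index < segment_count:
        raise ValueError("segment_index out of range")
    start = file_size * segment_index // segment_count
    end = file_size * (segment_index + 1) // segment_count
    return start, end


# Three segments sharing a 1000-byte file.
ranges = [segment_range(1000, i, 3) for i in range(3)]
print(ranges)
```

Because each segment can derive its own range from its index alone, no coordination between segments is needed before they start reading from the datanodes.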




Big Data Analytics “Stack”
[Diagram: layered stack, top to bottom]
• Analytic Toolsets (Business Analytics, BI, Statistics, etc.)
• Greenplum Chorus: Enterprise Collaboration Platform for Data
• Side by side:
         – Greenplum Database: World’s Most Scalable MPP Database Platform
         – Greenplum HD: Enterprise Analytics Platform for Unstructured Data
• Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics




THANK YOU



© Copyright 2011 EMC Corporation. All rights reserved.   25

Weitere ähnliche Inhalte

Was ist angesagt?

Hadoop Data Reservoir Webinar
Hadoop Data Reservoir WebinarHadoop Data Reservoir Webinar
Hadoop Data Reservoir WebinarPlatfora
 
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...Krishnan Parasuraman
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Jonathan Seidman
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousingDataWorks Summit
 
Agile analytics applications on hadoop
Agile analytics applications on hadoopAgile analytics applications on hadoop
Agile analytics applications on hadoopHortonworks
 
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)Eric Baldeschwieler
 
Hw09 Data Processing In The Enterprise
Hw09   Data Processing In The EnterpriseHw09   Data Processing In The Enterprise
Hw09 Data Processing In The EnterpriseCloudera, Inc.
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsHortonworks
 
Linking Data and Actions on the Web
Linking Data and Actions on the WebLinking Data and Actions on the Web
Linking Data and Actions on the WebStuart Charlton
 
I'll See You On the Write Side of the Web
I'll See You On the Write Side of the WebI'll See You On the Write Side of the Web
I'll See You On the Write Side of the WebStuart Charlton
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsDataWorks Summit
 
Big data, map reduce and beyond
Big data, map reduce and beyondBig data, map reduce and beyond
Big data, map reduce and beyonddatasalt
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storagehybrid cloud
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120Hyoungjun Kim
 
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceTed Dunning
 
Demonstrating the Future of Data Science
Demonstrating the Future of Data ScienceDemonstrating the Future of Data Science
Demonstrating the Future of Data Sciencegreenplum
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBasedarach
 

Was ist angesagt? (20)

Hadoop Data Reservoir Webinar
Hadoop Data Reservoir WebinarHadoop Data Reservoir Webinar
Hadoop Data Reservoir Webinar
 
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...
Hadoop World 2011: Building Scalable Data Platforms ; Hadoop & Netezza Deploy...
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousing
 
Agile analytics applications on hadoop
Agile analytics applications on hadoopAgile analytics applications on hadoop
Agile analytics applications on hadoop
 
hadoop @ Ibmbigdata
hadoop @ Ibmbigdatahadoop @ Ibmbigdata
hadoop @ Ibmbigdata
 
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)
Hadoop - Where did it come from and what's next? (Pasadena Sept 2014)
 
Hw09 Data Processing In The Enterprise
Hw09   Data Processing In The EnterpriseHw09   Data Processing In The Enterprise
Hw09 Data Processing In The Enterprise
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for Windows
 
Linking Data and Actions on the Web
Linking Data and Actions on the WebLinking Data and Actions on the Web
Linking Data and Actions on the Web
 
I'll See You On the Write Side of the Web
I'll See You On the Write Side of the WebI'll See You On the Write Side of the Web
I'll See You On the Write Side of the Web
 
Greenplum hadoop
Greenplum hadoopGreenplum hadoop
Greenplum hadoop
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI Tools
 
Big data, map reduce and beyond
Big data, map reduce and beyondBig data, map reduce and beyond
Big data, map reduce and beyond
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storage
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120
 
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
 
Demonstrating the Future of Data Science
Demonstrating the Future of Data ScienceDemonstrating the Future of Data Science
Demonstrating the Future of Data Science
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBase
 

Andere mochten auch

WANA GROUP agence full services de talents digitaux + de 180 collaborateurs
WANA GROUP agence full services de talents digitaux + de 180 collaborateursWANA GROUP agence full services de talents digitaux + de 180 collaborateurs
WANA GROUP agence full services de talents digitaux + de 180 collaborateursAurélien Malo
 
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...WANA GROUP
 
JSS2014 – Le grand tour de Power BI
JSS2014 – Le grand tour de Power BIJSS2014 – Le grand tour de Power BI
JSS2014 – Le grand tour de Power BIGUSS
 
La Data, levier pour personnaliser sa relation client
La Data, levier pour personnaliser sa relation clientLa Data, levier pour personnaliser sa relation client
La Data, levier pour personnaliser sa relation clientHassan Lâasri
 
10 minutes : Tableaux de bord
10 minutes : Tableaux de bord10 minutes : Tableaux de bord
10 minutes : Tableaux de bordConverteo
 

Andere mochten auch (6)

WANA GROUP agence full services de talents digitaux + de 180 collaborateurs
WANA GROUP agence full services de talents digitaux + de 180 collaborateursWANA GROUP agence full services de talents digitaux + de 180 collaborateurs
WANA GROUP agence full services de talents digitaux + de 180 collaborateurs
 
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...
Conférence "le big data en entreprise" de René Lefébure lors de l'évènement ...
 
JSS2014 – Le grand tour de Power BI
JSS2014 – Le grand tour de Power BIJSS2014 – Le grand tour de Power BI
JSS2014 – Le grand tour de Power BI
 
La Data, levier pour personnaliser sa relation client
La Data, levier pour personnaliser sa relation clientLa Data, levier pour personnaliser sa relation client
La Data, levier pour personnaliser sa relation client
 
Les secrets d'un bon tableau de bord excel
Les secrets d'un bon tableau de bord excelLes secrets d'un bon tableau de bord excel
Les secrets d'un bon tableau de bord excel
 
10 minutes : Tableaux de bord
10 minutes : Tableaux de bord10 minutes : Tableaux de bord
10 minutes : Tableaux de bord
 

Ähnlich wie Hadoop for shanghai dev meetup

Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 
Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011Hortonworks
 
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephantsOvidiu Dimulescu
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Cloudera, Inc.
 
Introduction To Big Data & Hadoop
Introduction To Big Data & HadoopIntroduction To Big Data & Hadoop
Introduction To Big Data & HadoopBlackvard
 
Big Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure ConsiderationsBig Data/Hadoop Infrastructure Considerations
Big Data/Hadoop Infrastructure ConsiderationsRichard McDougall
 
Bi with apache hadoop(en)
Bi with apache hadoop(en)Bi with apache hadoop(en)
Bi with apache hadoop(en)Alexander Alten
 
Analytics on Hadoop
Analytics on HadoopAnalytics on Hadoop
Analytics on HadoopEMC
 
Hadoop Overview
Hadoop Overview Hadoop Overview
Hadoop Overview EMC
 
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics ApplicationsHortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applicationsrussell_jurney
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystemnallagangus
 
Présentation on radoop
Présentation on radoop   Présentation on radoop
Présentation on radoop siliconsudipt
 

Hadoop for shanghai dev meetup

  • 1. Hadoop (Shanghai Developer Meetup – Sept 15, 2011) 余家昌 (Andrew Yu) EMC Greenplum © Copyright 2011 EMC Corporation. All rights reserved. 1
  • 2. The Elephant Chase
  • 3. [Image-only slide]
  • 4. Yahoo! Hadoop use cases
      • Personalized Yahoo! Homepage
      • Yahoo! Mail anti-spam
      • Search and Ad pipelines
      • Ad inventory prediction
      • Data analytics
      • etc.
  • 5. Enterprise Use Case: “Big ETL”
      Challenge: Transform Massive Data Flows Containing Data Needed for Complex Analysis
      • Examples:
          – Web Traffic Reduction
          – Network Traffic & Performance Analysis
          – Location Analytics for People and Goods
          – Smart Electric Power Grid
          – Genome Analysis
          – Clinical Outcome Research & Analysis
      • Data Sources:
          – Web server & app server logs
          – CDR / xDRs
          – Router & Switching Subsystem Logs
          – Sensor networks
      Solution: Hadoop/MapReduce as ETL fabric to load to Analytic Database
      • Components:
          – Hadoop: Massively-parallel ingest, storage and analysis
          – MapReduce: Runs multiple cascaded custom analysis / extraction on capture data
          – Connectors move structured data to Analytics DB
      • Hadoop’s Roles:
          – Capture TBs/day of machine-generated data
          – Quality: Run data quality tasks in MapReduce
          – Execute MapReduce flows
          – Extract/Combine data/metadata
          – Move processed data to analytic DB
      • Limitations & Cautions:
          – Software development, More parts (Cascading/Flow), Maintainability
  • 6. Enterprise Use Case: Fraud Detection
      Challenge: Identify & alert fraudulent activity patterns
      • Examples:
          – ESP’s - Email Fraud
          – Finance/Banking - Bank Fraud
          – Advertising - Click Fraud
          – Telecom - Network fraud
      • Data Sources:
          – Web & app server logs
          – IP/Call Records
          – Email Traffic
          – Customer Transaction Data
          – Banking/Credit Data
      Solution: Hadoop/MapReduce to filter & correlate communications
      • Components:
          – Hadoop: Massively-parallel ingest, storage and analysis
          – Mahout: Machine learning tool for building fraud algorithms
          – MapReduce: Rapid analysis & algorithm deployment
      • Hadoop’s Role(s):
          – Massive ingest of historical/real-time data
          – Build/Validate model for fraud detection manually or using Mahout
          – Parallel MapReduce jobs for near real-time fraud detection
      • Limitations & Cautions:
          – Software development, Partial Solution (not Real-time, not Interactive)
  • 7. Enterprise Use Case: Cluster Analysis
      Challenge: Grouping a collection of data according to common similarities
      • Examples:
          – Customer segmentation
          – Financial cost/risk analysis
          – Patient-centric healthcare
          – Financial stock classification
          – Social network analysis
      • Data Sources:
          – Health records
          – Sales data
          – Human genome sequences
          – Financial trading data
          – Facebook/Twitter/LinkedIn
      Solution: Process and Refine in Hadoop and load into Analytical DB
      • Components:
          – Hadoop: Flexible data storage as volume increases and structures vary
          – MapReduce: Cascading allows data processing with minimal adjustments
          – Optional: Connectors to move results to Analytic DB
      • Hadoop’s Role(s):
          – Flexible: Allow agile implementation of and unit testing of algorithms
          – Large scale analysis in Hadoop creates more accurate groupings
          – Rapid, parallel processing in MapReduce
      • Limitations & Cautions:
          – Software development, Complex Integration with Sources
  • 8. Greenplum HD: Community Edition Stack (100% Apache)
      [Stack diagram]
      • Hive, Pig, HBase, Zookeeper
      • MapReduce Framework (MapRed)
      • Hadoop Distributed File System (HDFS)
      Legend: Currently supported; Future releases may include support for Oozie and Mahout
  • 9. Greenplum HD: Enterprise Edition Stack (100% Apache interface)
      [Stack diagram]
      • Enhanced Monitoring
      • Hive, Pig, HBase, Zookeeper
      • MapReduce Framework (MapRed)
      • Hadoop Distributed File System (HDFS)
      Legend: Currently supported; Future releases may include support for Oozie and Mahout
  • 10. Greenplum HD: Enterprise Edition
      Enterprise-Ready Hadoop Platform for Unstructured Data
      Faster:
      • 2 – 5x Faster than Apache Hadoop
      Reliable:
      • High Availability
      • Mirroring
      Easier to Use:
      • NFS mountable
      • System Management
  • 11. Greenplum Enterprise HD is Faster than Other Distributions
      [Chart: DFSIO, Read and Write throughput in MB/sec (higher is better)]
      [Chart: Terasort, 3.5 TB, elapsed time in minutes (lower is better)]
      Test setup: 10 node cluster, 2x Quad-Core, 24G DRAM, 12 x 1TB SATA Drives @ 7200 rpm, Quad NICs
  • 12. Greenplum Enterprise HD Distributed Name Node
      • Fully distributed service running on all Hadoop nodes
      • Automatic and transparent failover
      • Persistent metadata
      • Highly scalable in number of files
      [Diagram: a grid of Hadoop nodes, each labeled NN]
  • 13. Greenplum Enterprise HD Job Tracker High Availability
      • Assures business continuity
      • Designed for mission critical use
          – Automatic stateful restart
          – Task Tracker reconnects without task loss
          – Persistent completed task state
      [Diagram: Greenplum Enterprise HD Distribution for Apache Hadoop, comprising Enterprise HD MapReduce, Job Tracker HA, Distributed Name Node, and Enterprise HD Lockless Storage Services]
  • 14. Greenplum Enterprise HD Snapshots
      • Intelligent Snapshots
          – Automatic data deduplication
          – Block sharing for space savings
      • Fast and flexible
          – Zero performance loss when writing to the original
      • Easy to manage
          – Scheduled or on-demand
          – Drag and drop recovery
      [Diagram: Hadoop / HBase applications and NFS applications read and write through Enterprise HD Lockless Storage Services; redirect on write for snapshot, with blocks A, B, C, C’, D and Snapshots 1, 2, 3]
  • 15. Greenplum Enterprise HD Mirroring
      • Business Continuity
          – Efficient design
          – Differential deltas are updated
          – Data is compressed and check-summed
      • Easy to manage
          – Scheduled or on-demand
          – Consistent point-in-time
      [Diagram: Production (Datacenter 1) linked over WAN to Research (Datacenter 2); Production linked over WAN to Cloud]
  • 16. Greenplum Enterprise HD Direct Access Using NFS
      • Simple application integration
          – Leverage NFS for random read/write access
      • Direct access for standard Hadoop tools
          – Command line utilities
          – File browsers
          – Desktop utilities
      [Diagram: Greenplum Enterprise HD Distribution for Apache Hadoop, comprising Enterprise HD MapReduce, Job Tracker HA, Distributed Name Node, and Enterprise HD Lockless Storage Services]
  • 17. Greenplum Enterprise HD Simple Management
      • Intuitive
      • Insightful
      • Complete
      • One node or thousands
  • 18. Greenplum HD: Software Distributions
      Features                      Community Edition           Enterprise Edition
      Apache Compatibility          100% Apache Open Source     100% API Compatible
      Name Node High Availability   Reference Implementation    Distributed and High Availability
      Job Tracker HA                Reference Implementation    HT High Availability
      Name Node Scalability         NN Metadata in Memory       Distributed Name Node
      Premium Support               Yes                         Yes
      Performance                                               2 - 5x faster than Community Edition
      Snapshots                     No                          Yes
      Mirrors                       No                          Yes
      NFS Mounts                    No                          Yes
      System Management             No                          Yes
      Available for Ordering        May 9th 2011                Q3
      Pricing                       Per Node Pricing            Per Node Pricing
  • 19. Greenplum HD on Data Computing Appliance
      • Introducing the world’s first:
          – High-performance
          – Purpose-built
          – Data co-processing Hadoop appliance
      • Combining Greenplum Database and Greenplum Hadoop in one appliance
  • 20. GPDB <-> GPHD Interoperability
      [Diagram labels: GPHD data in/out; GPHD in GPDB; Query File on HD; GPDB External Tables]
  • 21. Greenplum Database External Tables for Hadoop
      • Bring GPDB relational expressive power to HDFS
          – HDFS data presented as external tables
          – HDFS data supporting full SQL syntax
      • Have ALL, PART or NONE of your data in HDFS
      • Leverage full parallelism of both Hadoop and GPDB
          – GPDB can read from/write to HDFS,
      Example:
          Select count(*) from HDFS_data h, GPDB_data g
          where h.key = g.key;

          Insert into HDFS_data
          select * from GPDB_data;
  • 22. Greenplum Enterprise HD HDFS Integration – Parallelized Flow
      • Reading:
          – Each GPDB segment reads a portion of the file
              • Segment i of n reads the i/n-th portion
          – Access offset from HDFS namenode
          – Read data directly from HDFS datanode
      • Writing:
          – Each GPDB segment writes a file
          – HDFS balancing distributes the load evenly across datanodes
  • 23. Big Data Analytics “Stack”
      • Analytic Toolsets (Business Analytics, BI, Statistics, etc.)
      • Greenplum Chorus: Enterprise Collaboration Platform for Data
      • Greenplum Database: World’s Most Scalable MPP Database Platform
      • Greenplum HD: Enterprise Analytics Platform for Unstructured Data
      • Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics
  • 24. THANK YOU
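
Slide 22 states that segment i of n reads the i/n-th portion of an HDFS file. The sketch below illustrates one way such a split could be computed. It is not Greenplum's implementation: the function names, the 0-based segment index, and the plain byte-range split are assumptions made for illustration, and a real reader would also have to align each range to record boundaries.

```python
from typing import List, Tuple


def segment_byte_range(file_size: int, segment_index: int, num_segments: int) -> Tuple[int, int]:
    """Return the half-open byte range [start, end) that one segment would read.

    Hypothetical helper: segments are numbered 0 .. num_segments - 1 here,
    which is an assumption, not something the slide specifies.
    """
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if not 0 <= segment_index < num_segments:
        raise ValueError("segment_index out of range")
    # Integer arithmetic keeps the ranges contiguous and non-overlapping,
    # and the last range always ends exactly at file_size.
    start = file_size * segment_index // num_segments
    end = file_size * (segment_index + 1) // num_segments
    return start, end


def plan_parallel_read(file_size: int, num_segments: int) -> List[Tuple[int, int]]:
    """List the byte range assigned to every segment, in segment order."""
    return [segment_byte_range(file_size, i, num_segments) for i in range(num_segments)]


if __name__ == "__main__":
    # A 10-byte file split across 3 segments.
    for index, (start, end) in enumerate(plan_parallel_read(10, 3)):
        print(f"segment {index}: bytes [{start}, {end})")
```

Because each range depends only on the file size, the segment index, and the segment count, every segment can compute its own offset independently and then read directly from the datanodes, which matches the parallel flow the slide describes.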