Open source Big Data case study: Building a
platform for remote device support at NetApp
(Part II – Technical)
Topics



                                                     Big Data Perspective

                                                     Case Study: NetApp AutoSupport

                                                     Technology Primer

                                                     Design Overview




Copyright © 2012 Accenture All rights reserved.                                        2
Big Data

         The concept is disruptive. The technology is disruptive. And, markets and
         clients are being impacted.




[Figure: Wordle for Credit Suisse, “Does Size Matter Only?”, September 2011]


Shifts in Data and Analytics
The changing landscape and required winning strategies are creating shifts
within Big Data collection and analytics.

Data Explosion
• Unstructured data is doubling every 3 months
• 2011 saw 47% growth overall
• By 2015, the number of networked devices will be 2x the global population

Monetization
• Growth of enterprise data monetization services
• Large retailers monetizing their own data to provide insights to suppliers

Data-led Innovation
• De-coupling data from applications
• Disparate external data shaping context
• Cost-effective mobilization of massive-scale data

Social Media
• Growing market for scrubbed, aggregate data from social media and blogs
• Greater focus on data that provides insight into a customer’s digital persona

Technology
• Commodity-priced storage and compute
• Emergence of open source and big data technologies solving production problems at scale

Data Mobilization
• Novel approaches to analyzing unstructured data, creating shorter time from data to insight
• Shift towards data consumption in multiple environments (business apps, mobile, social)

The Big Data Approach

Treat data as a strategic asset, seek to maximize its value to the organization

Invest in common services, data platforms and tools

Rapidly prototype, deliver, and measure value-added data services, evolve over time

Culture
• Data-driven decision making
• Experimentation and continuous improvement with academic rigor
• End-to-end ownership of services
• Sharing of platform, tools and code
Client Context

                      NetApp, Inc.
                      • Industry: Data storage, data management
                      • 77% of Fortune 500 companies are customers
                      • Creator of Data ONTAP: industry leading storage OS




AutoSupport

                                                                •   Secure automated “call-home” service
                                                                •   Catch issues before they become critical
                                                                •   System monitoring and alerting
                                                                •   RMA requests without customer action
                                                                •   Faster incident management


[Diagram: Storage Devices → AutoSupport Messages → AutoSupport Data Warehouse]




Business Challenges
   • Increase in response times / lower availability for services
   • Incoming data volume doubling every 16 months
   • Proliferation of ad hoc datamarts and point solutions
   • Unable to analyze full AutoSupport contents efficiently

[Figure: existing AutoSupport architecture, drawn in layers from bottom to top: Source, Extract, Stage, Transform, Integrate, Interface, Presentation. Labels recoverable from the diagram include ASUP messages and file storage at the source; multiple parsers and loaders (Xterra, Light, Core, ad hoc); several staging databases and warehouses (Xterra DB, PWillows, ODS, DSS, DW 2, DW 3, ad hoc DBs); custom ETL jobs; repeated rules modules; and the consuming applications SAP CRM, MyASUP, eBI, STOR, ASUP Tools, and Analytics & Mining.]

[Chart: AutoSupport Flat-File Storage Requirement. Series: Total Usage (TB) and Projected Total Usage (TB). Y-axis: 0 to 3,500 TB. X-axis: Jan-05 to Jan-16. Annotation: “Doubles”.]
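To put the "doubling every 16 months" figure from the Business Challenges slide in concrete terms, here is a minimal sketch of the arithmetic. It assumes steady exponential growth, which the slide implies but does not state; the function name is illustrative.

```python
DOUBLING_MONTHS = 16  # from the slide: incoming data volume doubles every 16 months


def growth_factor(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Multiplier on data volume after `months`, assuming steady exponential growth."""
    return 2 ** (months / doubling_months)


if __name__ == "__main__":
    for years in (1, 2, 4):
        print(f"{years} year(s): x{growth_factor(12 * years):.2f}")
```

Under this assumption, volume grows by roughly 1.68x per year and 8x over four years.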
Solution Design Goals
Improve data access and technology cost effectiveness and performance.

 •    Improve system response times
      and data availability
 •    Expose common data services for
      consumption across business units
 •    Standardize key business metrics
      into common rules repository
 •    Lower operational costs as
      ecosystem continues to scale
 •    Provide more granular analytical
      capabilities


Role of Open Source
                      Platform is composed of open source technologies purpose-built for large-scale
                      storage, processing and analysis




[Figure: Actual Big Data Solution Blueprint for a hybrid deployment]




Technology Primer – Hadoop
Hadoop Distributed Filesystem (HDFS)
• Divides files into smaller “blocks”, stored across machines
• Automated replication, fault tolerance

Hadoop MapReduce
• Parallel processing for large datasets across machines
• Breaks job into tasks, using a simple map() and reduce() paradigm for data flows




Technology Primer – MapReduce

MapReduce (Simple Example – Word Count)

Functions:
   Map(key, value)
   Reduce(key, List<value> values)

Input:
   One fish,
   two fish,
   red fish,
   blue fish.

Map Phase (output of each mapper, m):
   m: <one,1> <fish,1>
   m: <two,1> <fish,1>
   m: <red,1> <fish,1>
   m: <blue,1> <fish,1>

Shuffle Phase (output of each reducer, r):
   r: <one,1> <two,1>
   r: <red,1> <blue,1>
   r: <fish,4>
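
The word count example above can be sketched in plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop code; the function names are chosen for this sketch, and the tokenization rule (lowercase, letters only) is an assumption made to match the slide's output.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_fn(key: int, value: str) -> List[Tuple[str, int]]:
    """Map(key, value): emit <word, 1> for every word in one input line."""
    return [(word, 1) for word in re.findall(r"[a-z]+", value.lower())]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_fn(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce(key, values): sum the counts for one word."""
    return key, sum(values)


def word_count(lines: List[str]) -> Dict[str, int]:
    """Run map over every line, shuffle by word, then reduce each group."""
    mapped = [pair for i, line in enumerate(lines) for pair in map_fn(i, line)]
    return dict(reduce_fn(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    text = ["One fish,", "two fish,", "red fish,", "blue fish."]
    print(word_count(text))
    # prints {'one': 1, 'fish': 4, 'two': 1, 'red': 1, 'blue': 1}
```

In a real cluster the map calls run in parallel on different machines and the shuffle moves data across the network; the logic of each step is the same.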
Technology Primer – NoSQL

• “Not only” SQL
   • Catch-all term for various non-relational database systems

• Typical areas of differentiation
   • Data model semantics
      • e.g. Database, Document, Key-Value
   • CAP trade-offs
      • Consistency, Availability, Partition-Tolerance
   • Scale-out architecture
      • e.g. Sharding, Distributed hash
   • Query language

Examples: HBase, Cassandra, MongoDB, Neo4j, etc.
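
To make the "Sharding, Distributed hash" bullet concrete, here is a minimal sketch of hash-based sharding over in-memory dictionaries. The class and function names are hypothetical, and this is not how any of the systems listed above implement partitioning; it only shows the idea of deriving a shard from a key's hash.

```python
import hashlib
from typing import Dict, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    MD5 is used here only for even distribution, not for security.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads keys across in-memory shards."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, str]] = [{} for _ in range(num_shards)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> str:
        return self.shards[shard_for(key, len(self.shards))][key]


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    store.put("system-001", "healthy")
    print(shard_for("system-001", 4), store.get("system-001"))
```

A plain modulo scheme like this remaps most keys whenever the shard count changes; consistent hashing is the usual way to limit that movement.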
Data Pipeline Overview



[Diagram components, left to right: Incoming Messages, Ingestion, Core Data Processing, Ad hoc analytics. Data Service Interface sits above the processing stages and ETL sits below them.]




Data Ingestion
    Technologies
    • Apache Flume, Apache Hadoop, Drools BRMS, JMS
    Capabilities
    • Handle dynamic data volumes
    • Normalization of disparate file formats
    • Real-time aggregation of documents
    • JMS alerts for critical messages

[Diagram: documents from the front-end HTTP/SMTP gateway enter a routing tier (Flume client with a rules engine), pass through a parsing tier of Flume agents and an aggregation & sink tier of Flume agents, and are written as aggregated files to HDFS. Notifications are sent via JMS.]
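
The tiered flow described above can be sketched as three small steps: route, parse, then aggregate and sink. This is an illustrative model only. It does not use Flume, Drools or JMS; the `notify` callable stands in for a JMS publisher, the `sink` list stands in for HDFS, and the "PANIC" rule, the format tags and the batch size are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    system_id: str
    fmt: str   # hypothetical format tag, e.g. "text" or "xml"
    body: str


@dataclass
class Pipeline:
    notify: Callable[[str], None]   # stand-in for a JMS publisher
    batch_size: int = 2             # hypothetical aggregation threshold
    sink: List[List[Dict[str, str]]] = field(default_factory=list)  # stand-in for HDFS
    _buffer: List[Dict[str, str]] = field(default_factory=list)

    def route(self, doc: Document) -> None:
        """Routing tier: apply a rule, then hand off to the parsing tier."""
        if "PANIC" in doc.body:     # hypothetical rule for a critical message
            self.notify(f"critical message from {doc.system_id}")
        self.aggregate(self.parse(doc))

    def parse(self, doc: Document) -> Dict[str, str]:
        """Parsing tier: normalize different formats into one record shape."""
        body = doc.body.strip()
        if doc.fmt == "xml":
            body = body.replace("<msg>", "").replace("</msg>", "")
        return {"system_id": doc.system_id, "body": body}

    def aggregate(self, record: Dict[str, str]) -> None:
        """Aggregation and sink tier: batch records, then write one file."""
        self._buffer.append(record)
        if len(self._buffer) >= self.batch_size:
            self.sink.append(self._buffer)
            self._buffer = []


if __name__ == "__main__":
    alerts: List[str] = []
    pipeline = Pipeline(notify=alerts.append)
    pipeline.route(Document("sys1", "text", "all good"))
    pipeline.route(Document("sys2", "xml", "<msg>PANIC disk</msg>"))
    print(alerts, pipeline.sink)
```

Batching many small documents into fewer large files before writing matters for HDFS, which handles a small number of large files better than a large number of small ones.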

Core Data Processing
Technologies
• MapReduce, HBase, Solr, Avro
Capabilities
• Parallel processing for increased throughput
• Efficient storage of complex data objects in Avro

[Diagram: documents gathered from Flume are processed by MapReduce on HDFS. Map tasks parse text contents; Reduce tasks transform and derive data objects and write the derived objects to data stores: Solr (search indexes), HBase (primary storage) and Hive (data warehouse).]
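
The processing flow above (parse in map, derive in reduce, write to several stores) can be sketched as follows. This is a single-process illustration under stated assumptions: the `field=value` document format is invented for the example, a plain dictionary stands in for HBase, and a dictionary of sets stands in for a Solr index. Avro serialization is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]


def map_parse(system_id: str, text: str) -> Iterable[Tuple[str, Tuple[str, str]]]:
    """Map: parse hypothetical 'field=value' lines, keyed by system."""
    for line in text.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            yield system_id, (name.strip(), value.strip())


def reduce_derive(system_id: str, fields: List[Tuple[str, str]]) -> Record:
    """Reduce: fold all parsed fields for one system into a derived object."""
    record: Record = {"system_id": system_id}
    record.update(fields)
    return record


def run(docs: List[Tuple[str, str]]) -> Tuple[Dict[str, Record], Dict[str, Set[str]]]:
    """Run map and reduce, then write each derived object to both stores."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for system_id, text in docs:
        for key, pair in map_parse(system_id, text):
            grouped[key].append(pair)

    primary: Dict[str, Record] = {}                   # stand-in for HBase rows
    index: Dict[str, Set[str]] = defaultdict(set)     # stand-in for a Solr index
    for system_id, fields in grouped.items():
        record = reduce_derive(system_id, fields)
        primary[system_id] = record
        for name, value in record.items():
            index[f"{name}:{value}"].add(system_id)
    return primary, index


if __name__ == "__main__":
    primary, index = run([("sys1", "model=X100\nos=8.1"), ("sys2", "model=X100")])
    print(primary["sys1"], sorted(index["model:X100"]))
```

Writing the same derived object to a keyed store and to a search index is what lets later lookups go either by system ID or by field value.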
Data Services
 Technologies
 • Apache HBase, Solr, Tomcat
 Capabilities
 • Unified web services API for end
   users
 • Support for complex queries and
   searches across multiple dimensions
   with Solr
 • Access both raw and derived content
   for a given system
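
A unified service layer of the kind listed above might look like the sketch below: one class that answers multi-field searches from an index and fetches raw or derived content from a primary store. The class name, method names and data shapes are hypothetical, and the dictionaries stand in for Solr and HBase; the actual web services API is not described in the slides.

```python
from typing import Dict, List, Optional, Set


class DataService:
    """Toy facade: one entry point over a search index and a primary store."""

    def __init__(self, store: Dict[str, Dict[str, str]],
                 index: Dict[str, Set[str]]) -> None:
        self.store = store   # stand-in for HBase: system_id -> {"raw", "derived"}
        self.index = index   # stand-in for Solr: "field:value" -> system ids

    def search(self, **criteria: str) -> List[str]:
        """Return the system ids that match every field=value criterion."""
        matches: Optional[Set[str]] = None
        for name, value in criteria.items():
            hits = self.index.get(f"{name}:{value}", set())
            matches = set(hits) if matches is None else matches & hits
        return sorted(matches or [])

    def get(self, system_id: str, kind: str = "derived") -> Optional[str]:
        """Fetch 'raw' or 'derived' content for one system, if present."""
        return self.store.get(system_id, {}).get(kind)


if __name__ == "__main__":
    service = DataService(
        store={"sys1": {"raw": "model=X100", "derived": "X100 summary"}},
        index={"model:X100": {"sys1", "sys2"}, "site:lab": {"sys1"}},
    )
    print(service.search(model="X100", site="lab"), service.get("sys1", "raw"))
```

Keeping callers behind one interface like this is what allows the storage and indexing behind it to change without breaking user tools.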




Analytics / ETL
 Technologies
 • Apache Hive, Pig, Datameer (Ad hoc analytics)
 • Pentaho (ETL / Data Integration)
 Capabilities
 • Analytical environment for both business analysts and “power users”
    • Hive or Pig as higher-level query languages
    • Datameer for analytics with a spreadsheet UI
 • ETL through Pentaho MapReduce
    • (runs Pentaho ETL server inside of a MapReduce job)



Successes and Challenges
  Successes
  • Web service interface contracts simplified integration with
    user tools, allowed for flexibility in internal implementation
  • Open source core allowed for rapid iteration
  • Met or exceeded all SLAs using commodity hardware,
    significantly driving down costs
  Challenges
  • Monitoring a large distributed system requires discipline and
    a strong operations team
  • Shared storage systems and Big Data technologies don’t
    always play well together
  • “Schemaless” systems can become a headache to
    maintain, especially with complex data models

Thank you

                                                  Jonathan Bender
                                                  Consultant, Accenture Technology Labs
                                                  jonathan.bender@accenture.com





Weitere ähnliche Inhalte

Was ist angesagt?

Big Data - A Real Life Revolution
Big Data - A Real Life RevolutionBig Data - A Real Life Revolution
Big Data - A Real Life RevolutionCapgemini
 
3 Pillars Reworking the Revolution
3 Pillars Reworking the Revolution3 Pillars Reworking the Revolution
3 Pillars Reworking the RevolutionTracey Williamson
 
The Digital Consumer: Know me, Inform me, Make it easy and Get it to me
The Digital Consumer: Know me, Inform me, Make it easy and Get it to meThe Digital Consumer: Know me, Inform me, Make it easy and Get it to me
The Digital Consumer: Know me, Inform me, Make it easy and Get it to meAccenture the Netherlands
 
The Connected Industrial Workforce
The Connected Industrial WorkforceThe Connected Industrial Workforce
The Connected Industrial Workforceaccenture
 
HPMC 2014 - CX and Mobility showcase - Oracle
HPMC 2014 - CX and Mobility showcase - OracleHPMC 2014 - CX and Mobility showcase - Oracle
HPMC 2014 - CX and Mobility showcase - OracleAccenture the Netherlands
 
Delivering applications at the pace of business
Delivering applications at the pace of businessDelivering applications at the pace of business
Delivering applications at the pace of businessAccenture Technology
 
Digital transformation slideshare
Digital transformation   slideshareDigital transformation   slideshare
Digital transformation slideshareShivamPatsariya1
 
The Return on Invest in the Internet of Things. Mastering the Digital Transfo...
The Return on Invest in the Internet of Things. Mastering the Digital Transfo...The Return on Invest in the Internet of Things. Mastering the Digital Transfo...
The Return on Invest in the Internet of Things. Mastering the Digital Transfo...Capgemini
 
CWIN17 san francisco-al liubinskas- api amplification v4
CWIN17 san francisco-al liubinskas- api amplification v4CWIN17 san francisco-al liubinskas- api amplification v4
CWIN17 san francisco-al liubinskas- api amplification v4Capgemini
 
Nextgen invent services slideshare
Nextgen invent services   slideshareNextgen invent services   slideshare
Nextgen invent services slideshareShivamPatsariya1
 
Accenture Technology Vision 2019 for Pega
Accenture Technology Vision 2019 for PegaAccenture Technology Vision 2019 for Pega
Accenture Technology Vision 2019 for PegaAccenture Technology
 
Value-driven Warehouse Automation | Accenture
Value-driven Warehouse Automation | AccentureValue-driven Warehouse Automation | Accenture
Value-driven Warehouse Automation | Accentureaccenture
 
Return on Digital Technologies: Insights for OFES Companies
Return on Digital Technologies: Insights for OFES CompaniesReturn on Digital Technologies: Insights for OFES Companies
Return on Digital Technologies: Insights for OFES Companiesaccenture
 
Technology Vision 2020: The Analytics Angle with SAS
Technology Vision 2020: The Analytics Angle with SASTechnology Vision 2020: The Analytics Angle with SAS
Technology Vision 2020: The Analytics Angle with SASaccenture
 

Was ist angesagt? (20)

Big Data - A Real Life Revolution
Big Data - A Real Life RevolutionBig Data - A Real Life Revolution
Big Data - A Real Life Revolution
 
3 Pillars Reworking the Revolution
3 Pillars Reworking the Revolution3 Pillars Reworking the Revolution
3 Pillars Reworking the Revolution
 
Infinite investor presentation March 2013
Infinite investor presentation   March 2013Infinite investor presentation   March 2013
Infinite investor presentation March 2013
 
The Digital Consumer: Know me, Inform me, Make it easy and Get it to me
The Digital Consumer: Know me, Inform me, Make it easy and Get it to meThe Digital Consumer: Know me, Inform me, Make it easy and Get it to me
The Digital Consumer: Know me, Inform me, Make it easy and Get it to me
 
Case study slideshare
Case study   slideshareCase study   slideshare
Case study slideshare
 
The Connected Industrial Workforce
The Connected Industrial WorkforceThe Connected Industrial Workforce
The Connected Industrial Workforce
 
HPMC 2014 - CX and Mobility showcase - Oracle
NetApp builds remote device support platform with open source Big Data

  • 1. Open source Big Data case study: Building a platform for remote device support at NetApp (Part II – Technical)
  • 2. Topics: Big Data Perspective; Case Study: NetApp AutoSupport; Technology Primer; Design Overview. Copyright © 2012 Accenture All rights reserved.
  • 3. Big Data: The concept is disruptive. The technology is disruptive. And markets and clients are being impacted. [Figure: word cloud. Source: Wordle for Credit Suisse, "Does Size Matter Only?", September 2011]
  • 4. Shifts in Data and Analytics: The changing landscape and required winning strategies are creating shifts within Big Data collection and analytics.
      • Data Explosion: Unstructured data is doubling every 3 months; 2011 saw 47% growth overall; by 2015, the number of networked devices will be 2x the global population.
      • Monetization: Growth of enterprise data monetization services; large retailers monetizing their own data to provide insights to suppliers.
      • Data-led Innovation: De-coupling data from applications; disparate external data shaping context; cost-effective mobilization of massive-scale data.
      • Social Media: Growing market for scrubbed, aggregate data from social media and blogs; greater focus on data that provides insight into a customer's digital persona.
      • Technology: Commodity-priced storage and compute; emergence of open source and big data technologies solving production problems at scale.
      • Data Mobilization: Novel approaches to analyzing unstructured data, creating a shorter time from data to insight; shift towards data consumption in multiple environments (business apps, mobile, social).
  • 5. The Big Data Approach
      • Treat data as a strategic asset; seek to maximize its value to the organization.
      • Invest in common services, data platforms and tools.
      • Rapidly prototype, deliver, and measure value-added data services; evolve over time.
      • Culture: Data-driven decision making; end-to-end ownership of services; experimentation and continuous improvement with academic rigor; sharing of platform, tools and code.
  • 6. Topics: Big Data Perspective; Case Study: NetApp AutoSupport; Technology Primer; Design Overview.
  • 7. Client Context: NetApp, Inc.
      • Industry: Data storage, data management
      • 77% of Fortune 500 companies are customers
      • Creator of Data ONTAP, an industry-leading storage OS
  • 8. AutoSupport
      • Secure automated "call-home" service
      • Catch issues before they become critical
      • System monitoring and alerting
      • RMA requests without customer action
      • Faster incident management
      • [Diagram labels: Storage Devices, AutoSupport Messages, AutoSupport Data Warehouse]
  • 9. Business Challenges
      • Increase in response times / lower availability for services
      • Incoming data volume doubling every 16 months
      • Proliferation of ad hoc datamarts and point solutions
      • Unable to analyze full AutoSupport contents efficiently
      • [Diagram: legacy architecture in layers (Source, Extract, Stage, Transform, Integrate, Presentation), with multiple parsers, custom ETL jobs, data warehouses and ad hoc databases feeding tools such as SAP CRM, MyASUP, eBI and ASUP Tools]
      • [Chart: AutoSupport Flat-File Storage Requirement; total usage and projected total usage (TB), Jan-05 to Jan-16]
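The 16-month doubling period on slide 9 implies exponential growth, V(t) = V0 · 2^(t/16) with t in months. The following is a minimal sketch of that arithmetic only. The starting volume of 100 TB is a hypothetical figure chosen for illustration, since the chart values are not recoverable from the transcript, and the function names are likewise assumptions.

```python
def projected_volume(v0_tb: float, months: float, doubling_months: float = 16.0) -> float:
    """Volume in TB after `months`, assuming a constant doubling period."""
    return v0_tb * 2 ** (months / doubling_months)


def annual_growth_rate(doubling_months: float = 16.0) -> float:
    """Equivalent year-over-year growth rate for a given doubling period."""
    return 2 ** (12 / doubling_months) - 1


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting volume, not taken from the slides
    for years in (1, 2, 4):
        volume = projected_volume(start_tb, 12 * years)
        print(f"after {years} year(s): {volume:.0f} TB")
    print(f"annual growth: {annual_growth_rate():.1%}")
```

Under this assumption, a 16-month doubling period corresponds to roughly 68% growth per year, which is why capacity planning on the legacy flat-file store became a concern.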
  • 10. Solution Design Goals: Improve data access and technology cost effectiveness and performance.
      • Improve system response times and data availability
      • Expose common data services for consumption across business units
      • Standardize key business metrics into a common rules repository
      • Lower operational costs as the ecosystem continues to scale
      • Provide more granular analytical capabilities
  • 11. Role of Open Source: The platform is composed of open source technologies purpose-built for large-scale storage, processing and analysis. [Figure: solution blueprint. Footnote 1: Actual Big Data Solution Blueprint for a hybrid deployment]
  • 12. Topics: Big Data Perspective; Case Study: NetApp AutoSupport; Technology Primer; Design Overview.
  • 13. Technology Primer – Hadoop
      • Hadoop Distributed Filesystem (HDFS): Divides files into smaller "blocks", stored across machines; automated replication and fault tolerance.
      • Hadoop MapReduce: Parallel processing for large datasets across machines; breaks a job into tasks, using a simple map() and reduce() paradigm for data flows.
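The HDFS idea on slide 13 (split a file into blocks, then store several copies of each block on different machines) can be sketched in a few lines. This is a toy model, not the HDFS API: the function names, the node names, and the round-robin placement are all assumptions for illustration. Real HDFS placement is rack-aware and decided by the NameNode.

```python
from typing import Dict, List, Tuple


def split_into_blocks(file_size: int, block_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover a file of `file_size` bytes."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def place_replicas(num_blocks: int, nodes: List[str], replication: int = 3) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(file_size=150, block_size=64)  # toy sizes
    layout = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"])
    for index, (offset, length) in enumerate(blocks):
        print(index, offset, length, layout[index])
```

Because every block lives on several nodes, losing a single machine leaves at least one copy of each block available, which is the fault tolerance the slide refers to.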
  • 14. Technology Primer – MapReduce (Simple Example: Word Count)
      • Functions: Map(key, value) and Reduce(key, List<value> values)
      • Input: "One fish, two fish, red fish, blue fish."
      • Map phase: each mapper (m) emits one pair per word: <one,1>, <fish,1>, <two,1>, <fish,1>, <red,1>, <fish,1>, <blue,1>, <fish,1>
      • Shuffle phase: pairs are grouped by key and routed to the reducers (r)
      • Reduce output: <one,1>, <two,1>, <red,1>, <blue,1>, <fish,4>
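The word-count example on slide 14 can be reproduced in a single process to make the three phases concrete. This is a sketch of the paradigm in plain Python, not Hadoop code; the function names are chosen for illustration, and a real job would run the map and reduce steps on many machines in parallel.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in re.findall(r"[a-z]+", line.lower()):
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle_phase(pairs).items())


if __name__ == "__main__":
    print(word_count(["One fish, two fish,", "red fish, blue fish."]))
```

Running it on the slide's input yields a count of 4 for "fish" and 1 for each of the other words, matching the reduce output shown on the slide.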
  • 15. Technology Primer – NoSQL
      • "Not only" SQL
      • Catch-all term for various non-relational database systems
      • Typical areas of differentiation:
          • Data model semantics (e.g. Database, Document, Key-Value)
          • CAP trade-offs (Consistency, Availability, Partition-Tolerance)
          • Scale-out architecture (e.g. Sharding, Distributed hash)
          • Query language
      • Examples: HBase, Cassandra, MongoDB, Neo4j, etc.
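Slide 15 lists sharding and distributed hashing as scale-out approaches. One common way to realize a distributed hash is a consistent-hash ring, sketched below. This is an illustrative assumption rather than the scheme of any particular store named on the slide; the class name, the node names, and the choice of 64 virtual nodes per server are all hypothetical.

```python
import bisect
import hashlib
from typing import List


def _hash(key: str) -> int:
    """Stable integer hash (Python's built-in hash() varies between runs)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Maps keys to nodes; each node owns several points on the ring."""

    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the owner of the first ring point at or after the key's hash."""
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    for system_id in ("system-001", "system-002", "system-003"):
        print(system_id, "->", ring.node_for(system_id))
```

The design choice worth noting is that removing a node only reassigns the keys that node owned; keys on the remaining nodes stay where they are, which keeps rebalancing cheap as a cluster grows or shrinks.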
  • 16. Topics: Big Data Perspective; Case Study: NetApp AutoSupport; Technology Primer; Design Overview.
  • 17. Data Pipeline Overview [Diagram labels: Incoming Messages, Data Ingestion, Core Processing, Data Service Interface, Ad hoc analytics, ETL]
  • 18. Data Ingestion
      • Technologies: Apache Flume, Apache Hadoop, Drools BRMS, JMS
      • Capabilities: Handle dynamic data volumes; normalization of disparate file formats; real-time aggregation of documents; JMS alerts for critical messages
      • [Diagram labels: Documents from Front End HTTP/SMTP Gateway; Flume client; Routing tier, Parsing tier, Aggregation & sink tier (Flume agents); Rules Engine; JMS Notifications; Aggregated files; HDFS]
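The ingestion flow on slide 18 (parse incoming documents into a common shape, raise an alert for critical messages, and aggregate documents into larger files before writing them out) can be sketched without any of the named products. Everything here is a hypothetical stand-in: the `system_id|severity|body` wire format, the `Message` and `Aggregator` names, and the callbacks that play the role of the HDFS sink and the JMS notification are assumptions, not Flume, Drools or JMS APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Message:
    system_id: str
    severity: str
    body: str


def parse(raw: str) -> Message:
    """Normalize one raw line in the hypothetical 'system_id|severity|body' format."""
    system_id, severity, body = raw.split("|", 2)
    return Message(system_id.strip(), severity.strip().upper(), body.strip())


@dataclass
class Aggregator:
    """Buffers messages and hands them to the sink in batches."""

    batch_size: int
    sink: Callable[[List[Message]], None]
    _buffer: List[Message] = field(default_factory=list)

    def add(self, message: Message) -> None:
        self._buffer.append(message)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self.sink(self._buffer)
            self._buffer = []


def ingest(raw_messages: Iterable[str], aggregator: Aggregator,
           alert: Callable[[Message], None]) -> None:
    """Parse, alert on critical messages, and aggregate everything for storage."""
    for raw in raw_messages:
        message = parse(raw)
        if message.severity == "CRITICAL":  # stand-in for a rules-engine decision
            alert(message)
        aggregator.add(message)
    aggregator.flush()
```

A usage example would pass a list's `append` method as both the sink and the alert callback to observe the batches and alerts produced. Batching before the write matters because HDFS handles a small number of large files far better than many small ones.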
  • 19. Core Data Processing
      • Technologies: MapReduce, HBase, Solr, Avro
      • Capabilities: Parallel processing for increased throughput; efficient storage of complex data objects in Avro
      • [Diagram labels: Documents gathered from Flume (HDFS); Map: parse text contents; Reduce: transform and derive data objects; write derived objects to data stores: Solr (search indexes), HBase (primary storage), Hive (data warehouse)]
  • 20. Data Services
      • Technologies: Apache HBase, Solr, Tomcat
      • Capabilities: Unified web services API for end users; support for complex queries and searches across multiple dimensions with Solr; access to both raw and derived content for a given system
  • 21. Analytics / ETL
      • Technologies: Apache Hive, Pig, Datameer (ad hoc analytics); Pentaho (ETL / data integration)
      • Capabilities: Analytical environment for both business analysts and "power users"; Hive or Pig as higher-level query languages; Datameer for analytics with a spreadsheet UI; ETL through Pentaho MapReduce (runs the Pentaho ETL server inside a MapReduce job)
  • 22. Successes and Challenges
      • Successes:
          • Web service interface contracts simplified integration with user tools and allowed flexibility in the internal implementation
          • Open source core allowed for rapid iteration
          • Met or exceeded all SLAs using commodity hardware, significantly driving down costs
      • Challenges:
          • Monitoring a large distributed system requires discipline and a strong operations team
          • Shared storage systems and Big Data technologies don't always play well together
          • "Schemaless" systems can become a headache to maintain, especially with complex data models
  • 23. Thank you. Jonathan Bender, Consultant, Accenture Technology Labs, jonathan.bender@accenture.com