Hadoop - Validated Network Architecture
and Reference Deployment in Enterprise



                Nimish Desai – nidesai@cisco.com

                Technical Leader, Data Center Group
                Cisco Systems Inc.
Session Objectives & Takeaways

Goal 1: Provide a Reference Network Architecture for Hadoop in the Enterprise

Goal 2: Characterize the Hadoop Application on the Network

Goal 3: Network Validation Results with a Hadoop Workload
Big Data in Enterprise
Validated 96 Node Hadoop Cluster

[Diagram: two validated 96-node topologies. Left, traditional DC design (Nexus 55xx/2248): a pair of Nexus 5548s with 2248TP-E FEX. Right, Nexus 7K-N3K based topology: a pair of Nexus 7000s with Nexus 3000 ToR. In both, the Name Node and Data Nodes 1–48 and 49–96 are Cisco UCS C200 servers with a single NIC.]

§  Hadoop Framework
    Apache 0.20.2
    Linux 6.2
    Slots – 10 Maps & 2 Reducers per node (quantified below)
§  Compute – UCS C200 M2
    Cores: 12
    Processor: 2 x Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
    Disk: 4 x 2TB (7.2K RPM)
    Network: 1G: LOM, 10G: Cisco UCS P81E
§  Network
    Three racks, each with 32 nodes
    Distribution Layer – Nexus 7000 or Nexus 5000
    ToR – FEX or Nexus 3000
    2 FEX per rack
    Each rack with either 32 single- or dual-attached hosts
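The slot counts above fix the cluster's degree of parallelism. A quick back-of-the-envelope check, as a plain Python sketch using only the numbers on this slide:

```python
# Parallelism of the validated cluster, from the slot counts on this slide:
# 96 data nodes, each configured with 10 map slots and 2 reduce slots.
data_nodes = 96
map_slots_per_node = 10
reduce_slots_per_node = 2

print("concurrent map tasks:   ", data_nodes * map_slots_per_node)     # 960
print("concurrent reduce tasks:", data_nodes * reduce_slots_per_node)  # 192
```

The 192 concurrent reduce slots match the largest reducer count exercised in the job-completion tests later in this deck.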
Data Center Infrastructure

[Diagram: classic multi-tier data center. WAN Edge Layer at the top. Core Layer (LAN & SAN): Nexus 7000 10 GE core and MDS 9500 SAN Directors (FC SAN A / SAN B). Aggregation & Services Layer: Nexus 7000 10 GE aggregation with vPC+/FabricPath at the L3/L2 boundary, plus network services. Access Layer options: Nexus 5500 with 2148TP-E (1G bare metal), CBS 31xx blade switches, Nexus 7000 end-of-row, Nexus 5500 FCoE with Nexus 2232 top-of-rack, UCS FCoE, Nexus 3000 10G top-of-rack, and B22 FEX for HP C-class blades. SAN Edge: MDS 9200/9100 and Nexus 5500 FCoE. Legend: Layer 3; Layer 2 - 1GE; Layer 2 - 10GE; 10 GE DCB; 10 GE FCoE/DCB; 4/8 Gb FC. Server access: 1 GbE with 4/8Gb FC via dual HBA (SAN A // SAN B), or 10Gb DCB/FCoE, or 10 GbE with 4/8Gb FC via dual HBA.]
Big Data Application Realm – Web 2.0 & Social/Community Networks

[Diagram: UI -> Service -> Data store pipeline.]

§  Data lives and dies in Internet-only entities
§  Data domain is partially private
§  Homogeneous data life cycle
    Mostly unstructured
    Web-centric, user-driven
    Unified workload – few processes & owners
    Typically non-virtualized
§  Scaling & integration dynamics
    Purpose-driven apps
    Thousands of nodes
    Hundreds of PB and growing exponentially
Big Data Application Realm - Enterprise

[Diagram: enterprise application landscape – Call Center, Sales Pipeline, ERP Module A/B, Doc Mgmt A/B, Records Mgmt, Data Service, Social Media, Office Apps, Video Conf, Collab, Customer DB (Oracle/SAP), Product Catalog, Catalog Data, VOIP, Exec Reports.]

§  Data lives in a confined zone of the enterprise repository
    §  Long-lived, regulatory and compliance driven
§  Heterogeneous data life cycle
    §  Many data models
    §  Diverse data – structured and unstructured
    §  Diverse data sources – subscriber based
    §  Diverse workloads from many sources/groups/processes/technologies
    §  Virtualized and non-virtualized, with a mostly SAN/NAS base
§  Scaling & integration dynamics are different
    §  Data warehousing (structured) with diverse repositories + unstructured data
    §  Few hundred to a thousand nodes, few PB
    §  Integration, policy & security challenges
§  Each app/group/technology is limited in
    §  Data generation
    §  Consumption
    §  Servicing confined domains
Big Data Framework Application Comparison

Relational Database
•  Structured data – row oriented
•  Optimized for OLTP/OLAP
•  Rigid schema applied to data on insert/update
•  Read and write (insert, update) many times
•  Non-linear scaling
•  Most transactions and queries involve a small subset of the data set
•  Transactional – scaling to thousands of queries
•  GB to TBs in size

Batch-oriented Big Data (Hadoop)
•  Unstructured data – files, logs, web clicks
•  Data format is abstracted to the higher-level application programming
•  Schema-less, flexible for later re-use
•  Write once, read many
•  Data never dies
•  Linear scaling
•  Entire data set at play for a given query
•  Multi PB

Real-time Big Data NoSQL
•  HBase, Cassandra, Oracle
•  Structured and unstructured data
•  Sparse column-family data storage or key-value pairs
•  Not an RDBMS, though with some schema
•  Random read and write
•  Modeled after Google’s BigTable
•  High transaction – real-time scaling to millions
•  Not suited for ad-hoc analysis
•  More suited for ~1 PB
Data Sources

Enterprise Application: Sales, Products, Process, Inventory, Finance, Payroll, Shipping, Tracking, Authorization, Customers, Profile

Big Data: Machine logs, sensor data, call data records, web clickstream data, satellite feeds, GPS data, sales data, blogs, emails, pictures, video

[Diagram: both source types feed a Column Store for Business Intelligence.]
Big Data Building Blocks into the Enterprise

[Diagram: Big Data applications – click streams, event data, social media, mobility trends, logs, sensor data – running virtualized, bare metal and cloud, all over the Cisco Unified Fabric. Building blocks beneath the fabric: “Big Data” NoSQL (real-time capture, read and update operations); the traditional database (RDBMS); storage (SAN and NAS); and “Big Data” Hadoop-style store and analyze.]
Infinite Use Cases

§  Web & E-Commerce
    Faster user response
    Customer behaviors & pricing models
    Ad targeting
§  Retail
    Customer churn & integration of brick-and-mortar with .com business models
    PoS transactional analysis
§  Insurance & Finance
    Risk management
    User behavior & incentive management
    Trade surveillance for financials
§  Network Analytics – Splunk
    Text mining
    Fault prediction
§  Security & Threat Defense
Hadoop Cluster Design & Reference Network Architecture
Hadoop Components and Operations
Hadoop Distributed File System

[Diagram: a file split into Blocks 1–6, distributed across Data Nodes 1–15 under three ToR FEX/switches.]

§  Data is not centrally located; it is stored across all data nodes in the cluster
§  Scalable & fault tolerant
§  Data is divided into multiple large blocks – 64MB default, 128MB typical (see the sketch after this slide)
§  Blocks are not related to disk geometry
§  Data is stored reliably: each block is replicated 3 times
§  Types of functions
    §  Name Node (Master) – manages the cluster
    §  Data Node (Map and Reduce) – carries blocks
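To make the block arithmetic concrete, here is a minimal Python sketch of how a file's size translates into blocks and replicas, assuming the decimal MB/TB convention the deck appears to use:

```python
# Minimal sketch: HDFS splits a file into fixed-size blocks and stores
# `replication` copies of each block across the data nodes.
def hdfs_footprint(file_bytes, block_bytes=64 * 10**6, replication=3):
    blocks = -(-file_bytes // block_bytes)   # ceiling division
    return blocks, blocks * replication, file_bytes * replication

blocks, replicas, raw = hdfs_footprint(10**12)   # a 1 TB file, 64 MB blocks
print(blocks, "blocks,", replicas, "block replicas,",
      raw / 10**12, "TB raw storage")
# -> 15625 blocks, 46875 block replicas, 3.0 TB raw storage
```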
Hadoop Components and Operations

§  Name Node
    Runs a scheduler – the Job Tracker
    Manages all data nodes, in memory
    Secondary Name Node – snapshot of the HDFS cluster's metadata
    Typically all three JVMs can run on a single node
§  Data Node
    Task Tracker receives job info from the Job Tracker (Name Node)
    Map & Reduce tasks are managed by the Task Tracker
    Configurable ratio of Map & Reduce tasks per node/CPU/core for various workloads (see the sketch after this slide)
    Data locality – if the data is not available where the map task is assigned, the missing block will be copied over the network

[Diagram: Name Node above Data Nodes 1–15 under three ToR FEX/switches.]
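As a hedged illustration of that configurable map/reduce slot ratio: the per-node counts used in this deck (10 maps, 2 reducers) correspond to the standard Hadoop 0.20 TaskTracker properties below; rendering them as mapred-site.xml entries from Python is just for convenience.

```python
# The deck's per-node slot ratio expressed as Hadoop 0.20 TaskTracker
# properties (set in mapred-site.xml on each data node).
slots = {
    "mapred.tasktracker.map.tasks.maximum": 10,    # map slots per node
    "mapred.tasktracker.reduce.tasks.maximum": 2,  # reduce slots per node
}
for name, value in slots.items():
    print(f"<property><name>{name}</name><value>{value}</value></property>")
```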
Characteristics that Affect Hadoop Clusters

§  Cluster Size
    Number of data nodes
§  Data Model & Mapper/Reducer Ratio
    MapReduce functions
§  Input Data Size
    Total starting dataset
§  Data Locality in HDFS
    Ability to process data where it is already located
§  Background Activity
    Number of jobs running, type of jobs, importing, exporting
§  Characteristics of the Data Node
    ‒  I/O, CPU, memory, etc.
§  Networking Characteristics
    ‒  Availability
    ‒  Buffering
    ‒  Data node speed (1G vs. 10G)
    ‒  Oversubscription
    ‒  Latency

http://www.cloudera.com/resource/hadoop-world-2011-presentation-video-hadoop-network-and-compute-architecture-considerations/
Hadoop Components and Operations
Hadoop Distributed File System

§  Data ingest & replication
    External connectivity
    East-west traffic (replication of data blocks)
§  Map Phase – raw data is analyzed and converted to name/value pairs
    The workload translates to multiple batches of map tasks
    A reducer can start the reduce phase ONLY after the entire map set is complete
§  Mostly an IO/compute function

[Diagram: unstructured data fans out to batches of Map tasks; the Shuffle phase sorts and groups name/value pairs by key (Key 1 … Key 4) and feeds four Reduce tasks that produce the result/output.]
Hadoop Components and Operations
Hadoop Distributed File System

§  Shuffle Phase – all name/value pairs are sorted and grouped by their keys
§  Mappers send the data to the reducers
§  High network activity
§  Reduce Phase – all values associated with a key are processed for results, in three phases:
    Copy – get intermediate results from each data node's local disk
    Merge – reduce the number of files
    Reduce method
§  Output Replication Phase – the reducer replicates results to multiple nodes
    Highest network activity
§  Network activity is dependent on workload behavior (see the toy walk-through below)

[Diagram: the same Map -> Shuffle -> Reduce -> Result/Output pipeline, highlighting the shuffle and reduce stages.]
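The following toy word count, a self-contained Python sketch rather than Hadoop code, walks through the same three phases end to end; on a real cluster the shuffle step is where intermediate data crosses the network:

```python
# Toy end-to-end MapReduce (word count) to make the phases concrete.
# Everything runs in-process; real Hadoop distributes each step.
from collections import defaultdict
from itertools import chain

docs = ["to be or not to be", "the network is the computer"]

# Map phase: raw input -> (name, value) pairs.
mapped = chain.from_iterable(((w, 1) for w in d.split()) for d in docs)

# Shuffle phase: group pairs by key; on a cluster this is the step that
# moves intermediate data across the network to the reducers.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: all values associated with a key are processed for a result.
result = {key: sum(values) for key, values in groups.items()}
print(result)   # {'to': 2, 'be': 2, 'or': 1, ...}
```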
MapReduce Data Model
ETL & BI Workload Benchmark

The complexity of the functions used in Map and/or Reduce has a large impact on the job completion time and network traffic.

Yahoo TeraSort – ETL Workload – Most Network Intensive
[Timeline chart: Map Start -> Reducers Start -> Map Finish -> Job Finish]
•  Input, shuffle and output data size is the same – e.g. a 10 TB data set in all phases
•  Yahoo TeraSort has more balanced Map vs. Reduce functions – linear compute and IO

Shakespeare WordCount – BI Workload
[Timeline chart: Map Start -> Reducers Start -> Map Finish -> Job Finish]
•  Data set size varies across phases – varying impact on the network, e.g. 1 TB input, 10 MB shuffle, 1 MB output
•  Most of the processing is in the Map functions, with a smaller intermediate and an even smaller final data set
ETL Workload (1TB Yahoo Terasort)
Network Graph of all Traffic Received on a Single Node (80 Node Run)

Shortly after the reducers start, map tasks are finishing and data is being shuffled to the reducers. As the maps finish completely, the network is no longer used, since the reducers have all the data they need to finish the job.

[Chart: traffic received by node hpc064 over the job run. The red line is the total amount of traffic received by hpc064; the other symbols each represent a node sending traffic to hpc064. Markers: Maps Start, Reducers Start, Maps Finish, Job Complete.]
ETL Workload (1TB Yahoo Terasort)
Network Activity of all Traffic Received on a Single Node (80 Node Run)

If output replication is enabled, then at the end of the TeraSort additional copies must be stored. For a 1TB sort, 2TB will need to be replicated across the network.

[Chart: the same per-node traffic view, with Output Data Replication Enabled]
§  Replication of 3 enabled (1 copy stored locally, 2 stored remotely)
§  Each reduce output is now replicated, instead of just stored locally
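The 2TB figure follows directly from the replication factor; a one-line Python sketch:

```python
# With replication = 3, one copy of each reduce output stays on the local
# disk and (replication - 1) copies cross the network.
def replication_network_bytes(output_bytes, replication=3):
    return output_bytes * (replication - 1)

tb = 10**12
print(replication_network_bytes(1 * tb) / tb, "TB over the network")  # 2.0
```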
BI Workload
Network Graph of all Traffic Received on a Single Node (80 Node Run)
Wordcount on 200K Copies of the Complete Works of Shakespeare

Due to the combination of the length of the Map phase and the reduced data set being shuffled, the network is utilized throughout the job, but only by a limited amount.

[Chart: traffic received by node hpc064 over the job run. The red line is the total amount of traffic received by hpc064; the other symbols each represent a node sending traffic to hpc064. Markers: Maps Start, Reducers Start, Maps Finish, Job Complete.]
Data Locality in HDFS

Data Locality – the ability to process data where it is locally stored.

Note: During the Map phase, the JobTracker attempts to use data locality to schedule map tasks where the data is locally stored. This is not perfect and depends on which data nodes hold the data, which is a consideration when choosing the replication factor: more replicas tend to create a higher probability of data locality (a simplified model follows this slide).

[Chart: RX traffic on one node across the job. Markers: Maps Start, Reducers Start, Maps Finish, Job Complete.]

Observations
§  Notice the initial spike in RX traffic before the reducers kick in.
§  It represents data that each map task needs that is not local.
§  Looking at the spike, it is mainly data from only a few nodes.

Map tasks: the initial spike is for non-local data. Sometimes a task may be scheduled on a node that does not have the data available locally.
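A deliberately simplified model of the replication/locality relationship, shown only to illustrate the direction of the effect; the real JobTracker actively prefers local nodes, so actual locality is far higher than random placement would give:

```python
# If a block has r replicas spread across n nodes, a map task dropped on a
# uniformly random node finds the block locally with probability r / n,
# so more replicas raise the locality probability.
def p_local(replicas, nodes):
    return replicas / nodes

for r in (1, 2, 3, 5):
    print(f"replication {r}: {p_local(r, 80):.1%} locality "
          f"under random placement on an 80-node cluster")
```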
Map to Reducer Ratio Impact on Job Completion

§  A 1 TB file with 128 MB blocks == 7,813 map tasks (arithmetic below)
§  The job completion time is directly related to the number of reducers
§  Average network buffer usage falls as the number of reducers gets lower (see hidden slides), and vice versa

[Bar charts: job completion time in seconds vs. number of reducers. Total graph across 192, 96, 48, 24, 12 and 6 reducers (scale to ~30,000 s); detail chart for 192/96/48 reducers (scale to ~800 s); detail chart for 24/12/6 reducers (scale to ~30,000 s).]
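The 7,813 figure is one map task per HDFS block, counting 1 TB and 128 MB in decimal units:

```python
# One map task per HDFS block: 1 TB input over 128 MB blocks.
import math

file_bytes  = 10**12         # 1 TB, decimal
block_bytes = 128 * 10**6    # 128 MB block size
print(math.ceil(file_bytes / block_bytes))   # 7813 map tasks
```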
Job Completion Time with 96 Reducers




Job Completion Time with 48 Reducers




Job Completion Graph with 24 Reducers




Network Characteristics

The relative impact of various network characteristics on Hadoop clusters*

[Chart legend: Availability, Buffering, Oversubscription, Data Node Speed, Latency]

* Not scaled or measured data
Validated Network Reference Architecture
Data Center Access Connectivity

[Diagram: Nexus 7000 core/distribution (LAN) and MDS 9000 (SAN) above a unified access layer. Access options: Nexus 2000 FEX for 1GE and 10GE rack-mount servers; Nexus 5000 for 1 & 10GE blade servers with pass-thru and direct-attach 10GE rack mounts; Nexus 4000 10GE blade switch w/ FCoE (IBM/Dell); Nexus 1000V; Cisco UCS compute, blade & rack.]
Network Reference Architecture

§  Network Attributes
    §  Architecture
    §  Availability
    §  Capacity, Scale & Oversubscription
    §  Flexibility
    §  Management & Visibility

[Diagram: Nexus LAN and SAN core, optimized for the Data Centre, above an edge/access layer of blade chassis (blades 1–8 per chassis).]
Scaling the Data Centre Fabric
Changing the device paradigm

§  De-coupling of the Layer 1 and Layer 2 topologies
§  Simplified management model, plug-and-play provisioning, centralized configuration
§  Line card portability (N2K supported with multiple parent switches – N5K, 6100, N7K)
§  Unified access for any server (100M -> 1GE -> 10GE -> FCoE): scalable Ethernet, HPC, unified fabric or virtualization deployment

[Diagram: many FEX logically forming one virtualized switch.]
Hadoop Network Topologies - Reference
Unified Fabric & ToR DC Design

§  Integration with the enterprise architecture – an essential pathway for data flow
    Integration
    Consistency
    Management
    Risk assurance
    Enterprise-grade features
§  Consistent operational model
    NX-OS, CLI, fault behavior and management
§  Higher east-west bandwidth compared to traditional transactional networks
§  Over time it will carry multi-user, multi-workload behavior
    Needs enterprise-centric features
    Security, SLA, QoS etc.

§  1Gbps Attached Server
    §  Nexus 7000/5000 with 2248TP-E
    §  Nexus 7000 and 3048
§  NIC Teaming - 1Gbps Attached
    §  Nexus 7000/5000 with 2248TP-E
    §  Nexus 7000 and 3048
§  10 Gbps Attached Server
    §  Nexus 7000/5000 with 2232PP
    §  Nexus 7000 and 3064
§  NIC Teaming – 10 Gbps Attached Server
    §  Nexus 7000/5000 with 2232PP
    §  Nexus 7000 & 3064
Validated Reference Network Topology

[Diagram: the same two validated topologies shown earlier – the traditional DC design (Nexus 5548 pair with 2248TP-E FEX) and the Nexus 7K-N3K based topology (Nexus 7000 pair with Nexus 3000 ToR), each with a Name Node and Data Nodes 1–96 on Cisco UCS C200 single NIC.]

§  Hadoop Framework
    Apache 0.20.2
    Linux 6.2
    Slots – 10 Maps & 2 Reducers per node
§  Compute – UCS C200 M2
    Cores: 12
    Processor: 2 x Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
    Disk: 4 x 2TB (7.2K RPM)
    Network: 1G: LOM, 10G: Cisco UCS P81E
§  Network
    Three racks, each with 32 nodes
    Distribution Layer – Nexus 7000 or Nexus 5000
    ToR – FEX or Nexus 3000
    2 FEX per rack
    Each rack with either 32 single- or dual-attached hosts
Network Reference Architecture
Characteristics

§  Network Attributes
    §  Architecture
    §  Availability
    §  Capacity, Scale & Oversubscription
    §  Flexibility
    §  Management & Visibility

[Diagram: Nexus LAN and SAN core, optimized for the Data Centre, above an edge/access layer of blade chassis.]
High Availability Switching Design
Common High Availability Engineering Principles

§  The core high availability design principles are common across all network systems designs
§  Understand the causes of network outages
    Component failures
    Network anomalies
§  Understand the engineering foundations of systems-level availability
    Device and network level MTBF
    Hierarchical and modular design
    The HW and SW interaction in the system
§  Enhanced vPC allows such topologies and is ideally suited for Big Data applications
§  With an Enhanced vPC (EvPC) configuration, any and all server NIC teaming configurations are supported on any port

[Diagram: L3 dual node full mesh over L2 dual node full mesh over dual-node ToR, with servers attached single NIC, dual NIC 802.3ad, or dual NIC active/standby. System high availability is a function of topology and component-level high availability.]
Availability with Single Attached Server
1G or 10G

§  It is important to evaluate the overall availability of the system
    Network failures can span many nodes in the system, causing rebalancing and decreased overall resources
    Typically multiple TB of data transfer occurs for a single ToR or FEX failure
    Load sharing, ease of management and a consistent SLA are important to enterprise operation
§  Failure domain impact on job completion
§  A 1 TB Terasort typically takes ~4.20-4.30 minutes
§  A failure of a SINGLE NODE (either the NIC or a server component) results in roughly a doubling of the job completion time
§  The key observation is that the failure impact depends on the type of workload being run on the cluster
    Short-lived interactive vs. short-lived batch
    Long jobs – ETL, normalization, joins

[Diagram: single-NIC servers, 32 per ToR.]
Single Node Failure Job Completion Time

§  The map tasks execute in parallel, so the unit time for each map task per node remains the same, and the nodes complete their work at more or less the same time.
§  During a failure, however, a set of map tasks remains pending (since the other nodes in the cluster are still completing their own tasks) until ALL the nodes finish their assigned tasks.
§  Once all the nodes finish their map tasks, the leftover map tasks are reassigned by the name node. The unit time to finish those map tasks remains the same (linear) as the time it took to finish the others; they just happen NOT to run in parallel, which can double the job completion time. This is the worst-case scenario with Terasort; other workloads may have variable completion times. A toy model of this effect follows.
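A toy worst-case model of that effect, as a sketch only; in practice re-execution overlaps with healthy nodes, so the actual impact varies with the workload:

```python
# Healthy run: every node works through its map waves in parallel.
# Worst case after one node dies: its share of map tasks is re-run only
# after the survivors finish, adding a serial tail of about the same length.
def completion_time(map_waves, wave_minutes, failed_node=False):
    parallel_phase = map_waves * wave_minutes
    serial_tail = map_waves * wave_minutes if failed_node else 0.0
    return parallel_phase + serial_tail

print(completion_time(5, 0.9))                     # ~4.5 min, healthy run
print(completion_time(5, 0.9, failed_node=True))   # ~9.0 min, worst case
```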
1G Port Traffic & Job Completion Time




1G Port Failure Traffic & Job Completion Time




Availability with Dual Attached Server
1G and 10G
Server NIC Teaming Topologies

§  Dual-homed (active-active) network connections from the server allow
    Reduced replication and data movement during failures
    Optimal load sharing
§  Dual-homing the FEX avoids a single point of failure
§  Enhanced vPC allows such topologies and is ideally suited for Big Data applications
§  With an Enhanced vPC (EvPC) configuration, any and all server NIC teaming configurations are supported on any port
§  Supported with Nexus 5500 only
§  Alternatively, Nexus 3000 vPC allows host-level redundancy with ToR ECMP

[Diagram: server attachment options – single NIC, dual NIC 802.3ad, dual NIC active/standby.]
Availability
Single Attached vs. Dual Attached Node

§  No single point of failure from the network viewpoint; no impact on job completion time
§  NIC bonding configured in Linux – with the LACP mode of bonding
§  Effective load sharing of traffic flows across the two NICs
§  Recommended to change the hashing to src-dst-ip-port (on both the network side and the NIC bonding in Linux) for optimal load sharing, as illustrated below
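The hashing recommendation is easiest to see with a toy flow-placement sketch. The hash below is purely illustrative, not the exact Linux bonding or NX-OS algorithm; on the Linux side the corresponding knob is xmit_hash_policy=layer3+4 with mode=802.3ad bonding:

```python
# A layer-3-only hash maps every flow between the same two hosts to the
# same slave NIC; folding in the TCP ports lets concurrent flows between
# one host pair spread across both NICs.
def pick_nic(src_ip, dst_ip, src_port=0, dst_port=0, nics=2):
    return hash((src_ip, dst_ip, src_port, dst_port)) % nics

flows = [("10.0.0.1", "10.0.0.2", 50000 + i, 50010) for i in range(4)]
print([pick_nic(s, d) for s, d, _, _ in flows])            # L3 only: all one NIC
print([pick_nic(s, d, sp, dp) for s, d, sp, dp in flows])  # L3+L4: both NICs
```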
Availability Network Failure Result – 1TB Terasort - ETL

§  Failure of various components
§  Failures introduced at 33%, 66% and 99% of reducer completion
§  A singly attached NIC server & rack failure has a bigger impact on job completion time than any other failure
§  A FEX failure is a RACK failure for the 1G topology

[Diagram: three racks, 96 nodes, 2 FEX per rack (FEX/ToR A and B).]

Job Completion Time (seconds) with Various Failures

Failure Point            | 1G Single Attached | 2G Dual Attached
Peer Link 5000           | 301                | 258
FEX *                    | 1137               | 259
Rack *                   | 1137               | 1017
1 Port – Single Attached | See previous slide | See previous slide
1 Port – Dual Attached   | See previous slide | See previous slide

* Variance in run time with % of reducers completed
Network Reference Architecture
Characteristics

§  Network Attributes
    §  Architecture
    §  Availability
    §  Capacity, Scale & Oversubscription
    §  Flexibility
    §  Management & Visibility
Cluster Scaling
Nexus 7K/5K & FEX - 2248TP-E or 2232

§  1G Based - Nexus 2248TP-E
    48 1G host ports and up to 4 uplinks bundled into a single port channel
§  10G Based - Nexus 2232
    32 10G host ports and up to 8 uplinks bundled into a single port channel

[Diagram: FEX uplinks and host interfaces, with hosts attached via 802.3ad & vPC or single attached. The Nexus 2248TP-E and 2232 support both local port channels and vPC for distributed port channels.]
Oversubscription Design
§  Hadoop is a parallel, batch-job-oriented framework
§  The primary benefit of Hadoop is the reduction in job completion time for work that
    would otherwise take longer with traditional techniques, e.g. large ETL, log
    analysis, join-only-Map jobs, etc.
§  Oversubscription typically occurs with 10G server access more than with 1G server access
§  A non-blocking network is NOT needed; however, the degree of oversubscription
    matters for
    Job completion time
    Replication of results
    Oversubscription during rack or FEX failure

§  Static vs. actual oversubscription
    How much data a single node can push is often I/O bound and depends on the number and configuration of disks

            Uplinks    Theoretical Oversubscription (16 Servers)    Measured
            8          2:1                                          Next slides
            4          4:1                                          Next slides
            2          8:1                                          Next slides
            1          16:1                                         Next slides
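As a worked example of how the theoretical ratios above are derived (assuming 16 servers at 10G behind a Nexus 2232 with 10G uplinks): offered load = 16 x 10G = 160 Gbps; with 4 uplinks the fabric capacity is 4 x 10G = 40 Gbps, so the oversubscription ratio is 160:40 = 4:1. The same arithmetic yields 2:1, 8:1 and 16:1 for 8, 2 and 1 uplinks respectively.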
                                                                                          45
Network Oversubscriptions
   §  Steady state
   §  Result replication with 1, 2, 4 & 8 uplinks
   §  Rack failure with 1, 2, 4 & 8 uplinks




                                                   46
Data Node Speed Differences
 1G vs. 10G TCPDUMP of Reducers TX




•    Generally 1G is used, largely due to cost/performance trade-offs, though 10GE can
     provide benefits depending on the workload
•    Reduced spikes with 10G and smoother job completion time
•    Multiple 1G or 10G links can be bonded together to not only increase bandwidth, but also
     increase resiliency
                                                                                                    47
1GE vs. 10GE Buffer Usage
 Moving from 1GE to 10GE actually lowers the buffer requirement at the switching layer.




[Chart: buffer cell usage over job completion, comparing 1G buffer used, 10G buffer used, and 1G/10G Map % and Reduce %]




By moving to 10GE, the data node has a wider pipe to receive data, lessening the
need for buffers on the network, as the total aggregate transfer rate and amount
of data does not increase substantially. This is due, in part, to limits of I/O and
compute capabilities.
                                                                                                                                                                                        48
Network Reference Architecture
Characteristics

§ Network Attributes
 § Architecture
 § Capacity
 § Availability
 § Scale & Oversubscription
 § Flexibility
 § Management & Visibility

[Diagram: Nexus LAN and SAN core optimized for the data centre, with the edge/access layer connecting racks of blade servers]
                                                                                  49
Multi-use Cluster Characteristics


   Hadoop clusters are
 generally multi-use. The
effect of background use
can affect any single job's
       completion.

 A given cluster, running many different types of jobs, importing into HDFS, etc.



                                            Importing Data into HDFS




                     Large ETL Job Overlaps with medium and small ETL Jobs and many small BI Jobs
                                  (Blue lines are ETL Jobs and purple lines are BI Jobs)

                                 Example View of 24 Hour Cluster Use
                                                                                                    50
100 Jobs each with 10GB Data Set
Stable, Node & Rack Failure




•    Almost all jobs are impacted by a single node failure
•    With multiple jobs running concurrently, node failure impact is as significant
     as rack failure


                                                                                      51
Network Reference Architecture
Characteristics

§ Network Attributes
 § Architecture
 § Capacity
 § Availability
 § Scale & Oversubscription
 § Flexibility
 § Management & Visibility

[Diagram: Nexus LAN and SAN core optimized for the data centre, with the edge/access layer connecting racks of blade servers]
                                                                                  52
Burst Handling and Queue Depth

•  Several HDFS operations and phases of MapReduce jobs are very bursty in nature
•  The extent of bursts largely depends on the type of job (ETL vs. BI)
•  Bursty phases can include replication of data (either importing into HDFS or
   output replication) and the output of the mappers during the shuffle phase

A network that cannot handle bursts effectively will drop packets, so optimal
buffering is needed in network devices to absorb bursts.

   Optimal Buffering
•  Given a large enough incast, TCP will collapse at some point no matter
   how large the buffer
•  Well studied by multiple universities
•  Alternate solutions (changing TCP behavior) have been proposed rather than
   huge-buffer switches:
   http://simula.stanford.edu/sedcl/files/dctcp-final.pdf

                                                                               53
Nexus 2248TP-E Buffer Monitoring
§  Nexus 2248TP-E utilizes a 32MB shared buffer to handle larger traffic bursts
§  Hadoop, NAS, AVID are examples of bursty applications
§  You can control the queue limit for a specified Fabric Extender for egress
    (network to the host) or ingress (host to network)
§  Extensive Drop Counters
     §  Provides drop counters for both directions: Network to host and Host to Network on a per
         host interface basis
     §  Drop counters for different reasons
       •  Out-of-buffer drop, no-credit drop, queue-limit drop (tail drop), MAC error drop, truncation
          drop, multicast drop
§  Buffer Occupancy Counter
     §  How much buffer is being used. One key indicator of congestion or bursty traffic


    N5548-L3(config-fex)# hardware N2248TPE queue-limit 4000000 rx
    N5548-L3(config-fex)# hardware N2248TPE queue-limit 4194304 tx

    fex-110# show platform software qosctrl asic 0 0
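The queue-limit values above are in bytes (4194304 B = 4 MB). Mapping to the directions described above, rx corresponds to ingress (host to network) and tx to egress (network to the host).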



                                                                                                                        54
Buffer Monitoring
switch# attach fex 110
Attaching to FEX 110 ...
To exit type 'exit', to abort type '$.'


fex-110# show platform software qosctrl asic 0 0
number of arguments 4: show asic 0 0
----------------------------------------
QoSCtrl internal info {mod 0x0 asic 0}
mod 0 asic 0:
port type: CIF [0], total: 1, used: 1
port type: BIF [1], total: 1, used: 0
port type: NIF [2], total: 4, used: 4
port type: HIF [3], total: 48, used: 48

bound NIF ports: 2

N2H cells: 14752
H2N cells: 50784

----Programmed Buffers---------

Fixed Cells : 14752
Shared Cells : 50784              <--  Allocated buffer in terms of cells (512 bytes each)
----Free Buffer Statistics-----
Total Cells : 65374
Fixed Cells : 14590
Shared Cells : 50784              <--  Number of free cells to be monitored
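As a sanity check on these numbers (using the 512-byte cell size noted above): 50784 shared cells x 512 B = ~26 MB, and 65374 total cells x 512 B = ~32 MB, matching the 2248TP-E's 32MB shared buffer.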




                                                                              55
TeraSort FEX (2248TP-E) Buffer Analysis (10TB)

[Chart: FEX buffer usage over time, with buffer usage peaking during the shuffle phase and again during output replication]

§  The buffer utilization is highest during the shuffle and output replication phases
§  Optimized buffer sizes are required to avoid packet loss leading to slower job
    completion times
56
Buffer depth monitoring: interface
              §  Real time command displaying the status of the shared buffer.
              §  XML support will be added in the maintenance release
              §  Counters are displayed in cell count. A cell is approximately 208 bytes

              show hardware internal buffer info pkt-stats [brief|clear|detail]




Output columns: buffer usage, free buffer, total buffer space on the platform, and max buffer usage since clear
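For interpreting the cell counts (using the ~208-byte cell size noted above): a reported usage of, say, 10,000 cells corresponds to roughly 10,000 x 208 B = ~2 MB of buffer; the 10,000-cell figure is purely illustrative.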
  

                                                                                            57
TeraSort (ETL) N3K Buffer Analysis (10TB)

[Chart: N3K buffer usage over time, with buffer usage peaking during the shuffle phase and again during output replication]

•  The buffer utilization is highest during the shuffle and output replication phases
•  Optimized buffer sizes are required to avoid packet loss leading to slower job
   completion times
•  The aggregation switch buffer remained flat as the bursts were absorbed at the
   Top of Rack layer
58
Network Latency
Generally, network latency does not represent a significant factor for Hadoop
clusters, though consistent latency is important.

[Chart: completion time (sec) for 1TB, 5TB and 10TB data sets on an 80-node cluster, N3K topology vs. 5K/2K topology]

Note:
There is a difference in network latency
vs. application latency. Optimization in
the application stack can decrease
application latency, which can potentially
have a significant benefit.




                                                                                                         59
Summary
§  Extensive validation of Hadoop workload
§  Reference architecture
    Makes it easy for enterprises
    Demystifies the network for Hadoop deployment
    Integration with the enterprise, with efficient choices of network topology/devices

§  10G and/or dual attached servers provide consistent job completion
    time & better buffer utilization
§  10G provides reduced burst at the access layer
§  A single attached node failure has considerable impact on job
    completion time
§  Dual attached server is the recommended design – 1G or 10G; 10G for
    future proofing
§  Rack failure has the biggest impact on job completion time
§  Does not require a non-blocking network
§  Degree of oversubscription does impact job completion time
§  Latency does not matter much in Hadoop workloads
                                                                           60
Big Data @ Cisco                                       128 Node/1PB test cluster

Cisco.com Big Data
www.cisco.com/go/bigdata

Certifications and Solutions with UCS C-Series and Nexus 5500+22xx
    •  EMC Greenplum MR Solution
    •  Cloudera Hadoop Certified Technology
         •  Cloudera Hadoop Solution Brief
    •  Oracle NoSQL Validated Solution
         •  Oracle NoSQL Solution Brief

Multi-month network and compute analysis testing
(in conjunction with Cloudera)
    •  Network/Compute Considerations Whitepaper
    •  Presented Analysis at Hadoop World
                                                                                                            61
THANK YOU FOR
  LISTENING




   Nimish Desai – nidesai@cisco.com
   Technical Leader, Data Center Group
   Cisco Systems Inc.
Break!
Break takes place in the Community Showcase (Hall 2)
Sessions will resume at 3:35pm




                                                       Page 63

Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 

Kürzlich hochgeladen (20)

Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 

Reference Architecture: A Validated & Tested Approach to Defining Network Design

  • 1. Hadoop - Validated Network Architecture and Reference Deployment in Enterprise Nimish Desai – nidesai@cisco.com Technical Leader, Data Center Group Cisco Systems Inc.
  • 2. Session Objectives & Takeaways – Goal 1: Provide a reference network architecture for Hadoop in the enterprise. Goal 2: Characterize the Hadoop application on the network. Goal 3: Network validation results with the Hadoop workload.
  • 4. Validated 96 Node Hadoop Cluster – two topologies: a traditional DC design (Nexus 55xx/2248: two Nexus 5548 with 2248TP-E FEX) and a Nexus 7K-N3K based topology (two Nexus 7000 with Nexus 3000 ToR); each with a Name Node (Cisco UCS C200, single NIC) and Data Nodes 1-48 / 49-96 (Cisco UCS C200, single NIC).
§ Hadoop framework: Apache 0.20.2; Linux 6.2; slots – 10 maps & 2 reducers per node
§ Compute – UCS C200 M2: 12 cores; processor: 2 x Intel Xeon X5670 @ 2.93 GHz; disk: 4 x 2 TB (7.2K RPM); network: 1G LOM, 10G Cisco UCS P81E
§ Network: three racks, each with 32 nodes; distribution layer – Nexus 7000 or Nexus 5000; ToR – FEX or Nexus 3000; 2 FEX per rack; each rack with either 32 single- or dual-attached hosts
  • 5. Data Center Infrastructure – topology diagram: WAN edge layer; Nexus 7000 10 GE core (Layer 3); MDS 9500 SAN directors (FC SAN A / SAN B); Nexus 7000 aggregation and services layer (10 GE aggregation, vPC+, L3/L2 boundary, FabricPath, network services); access layer – Nexus 5500 with FEX (B22, 2148TP-E, 2232), Nexus 3000 ToR, CBS 31xx and HP blade switches, UCS blade and C-class rack-mount servers; 1 GbE server access with 4/8 Gb FC via dual HBA (SAN A // SAN B), 10 Gb DCB/FCoE server access, or 10 GbE server access with 4/8 Gb FC via dual HBA.
  • 6. Big Data Application Realm – Web 2.0 & Social/Community Networks
§ Data lives/dies in Internet-only entities; the data domain is partially private (data store / UI / service)
§ Homogeneous data life cycle: mostly unstructured; web-centric, user-driven; unified workload – few processes & owners; typically non-virtualized
§ Scaling & integration dynamics: purpose-driven apps; thousands of nodes; hundreds of PB and growing exponentially
  • 7. Big Data Application Realm – Enterprise
§ Data lives in a confined zone of the enterprise repository; long-lived, regulatory- and compliance-driven (call center, sales pipeline, ERP modules, doc mgmt, records mgmt, data services, social media, office apps, video conf, collab, product catalog, VOIP, exec reports, customer DB (Oracle/SAP))
§ Heterogeneous data life cycle; many data models; diverse data – structured and unstructured; diverse data sources
§ Diverse workloads from many sources/groups/processes/technologies; virtualized and non-virtualized, mostly SAN/NAS based
§ Scaling & integration dynamics are different: data warehousing (structured) with diverse repositories plus unstructured data; few hundred to a thousand nodes, few PB
§ Integration, policy & security challenges: each app/group/technology is limited in data generation, consumption, and servicing of confined domains
  • 8. Big Data Framework Application Comparison
§ Relational Database: structured data – row oriented; optimized for OLTP/OLAP; rigid schema applied to data on insert/update; read and write (insert, update) many times; non-linear scaling; most transactions and queries involve a small subset of the data set; transactional – scaling to thousands of queries; GB to TB in size; not suited for ad-hoc analysis
§ Batch-oriented Big Data (Hadoop): unstructured data – files, logs, web clicks; data format is abstracted to higher-level application programming; schema-less, flexible for later re-use; write once, read many; data never dies; linear scaling; entire data set at play for a given query; multi-PB
§ Real-time Big Data NoSQL: HBase, Cassandra, Oracle; structured and unstructured data; sparse column-family data storage or key-value pairs; not an RDBMS, though with some schema; random read and write; modeled after Google's BigTable; high transaction rate – real-time scaling to millions; more suited for ~1 PB
  • 9. Data Sources for the Big Data Enterprise Application – machine logs, sensor data, call data records, web click-stream data, satellite feeds, GPS data, blogs, emails, pictures and video feed business functions (sales, products, process, inventory, finance, payroll, shipping, tracking, authorization, customer profiles), with column stores and business intelligence on top.
  • 10. Big Data Building Blocks into the Enterprise – big data application inputs (click streams, social media, mobility trends, event data, sensor logs) on virtualized, bare-metal and cloud infrastructure over Cisco Unified Fabric, feeding: traditional storage (SAN and NAS); "Big Data" batch (Hadoop – capture, store and analyze); "Big Data" real-time NoSQL (read and update operations); and the RDBMS database.
  • 11. Infinite Use Cases
§ Web & e-commerce: faster user response; customer behaviors & pricing models; ad targeting
§ Retail: customer churn & integration of brick-and-mortar with .com business models; PoS transactional analysis
§ Insurance & finance: risk management; user behavior & incentive management; trade surveillance for financials
§ Network analytics (Splunk): text mining; fault prediction
§ Security & threat defense
  • 12. Hadoop Cluster Design & Reference Network Architecture
  • 13. Hadoop Components and Operations – Hadoop Distributed File System
§ Data is not centrally located; it is stored across all data nodes in the cluster (Block 1 ... Block 6 spread over data nodes 1-15 behind ToR FEX/switches)
§ Scalable & fault tolerant
§ Data is divided into multiple large blocks – 64 MB default, 128 MB typical; blocks are not related to disk geometry
§ Data is stored reliably: each block is replicated 3 times
§ Types of functions: Name Node (master) – manages the cluster; Data Node (map and reduce) – carries blocks
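As a quick illustration of these HDFS properties, stock Hadoop shell commands can show block placement and change a file's replication factor; a minimal sketch (the path and values are illustrative, not from the deck):

    # Report how a file is split into blocks and where each replica lives
    hadoop fsck /data/input/logs.txt -files -blocks -locations

    # Raise the replication factor of an existing path to 3 and wait for it
    hadoop fs -setrep -w 3 /data/input/logs.txt

    # Cluster-wide defaults live in hdfs-site.xml:
    #   dfs.block.size  = 134217728   (128 MB blocks, vs. the 64 MB default)
    #   dfs.replication = 3           (three copies of every block)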
  • 14. Hadoop Components and Operations
§ Name Node: runs a scheduler – the Job Tracker; manages all data nodes, in memory; a Secondary Name Node keeps a snapshot of the HDFS cluster metadata; typically all three JVMs can run on a single node
§ Data Node: the Task Tracker receives job info from the Job Tracker (Name Node); map & reduce tasks are managed by the Task Tracker; configurable ratio of map & reduce tasks per node/CPU/core for various workloads
§ Data locality: if data is not available where the map task is assigned, the missing block is copied over the network
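The deck's "10 maps & 2 reducers per node" slot ratio corresponds to two TaskTracker properties in Hadoop 0.20.x; a minimal mapred-site.xml fragment (placed inside the <configuration> element of a default install) would be:

    <!-- Per-node task slots on each TaskTracker (data node) -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>10</value>  <!-- 10 concurrent map slots per node -->
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>2</value>   <!-- 2 concurrent reduce slots per node -->
    </property>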
  • 15. Characteristics that Affect Hadoop Clusters
§ Cluster size: number of data nodes
§ Data model & mapper/reducer ratio: MapReduce functions
§ Input data size: total starting dataset
§ Data locality in HDFS: the ability to process data where it is already located
§ Background activity: number of jobs running, type of jobs, importing and exporting
§ Characteristics of the data node: I/O, CPU, memory, etc.
§ Networking characteristics: availability, buffering, data node speed (1G vs. 10G), oversubscription, latency
(See: http://www.cloudera.com/resource/hadoop-world-2011-presentation-video-hadoop-network-and-compute-architecture-considerations/)
  • 16. Hadoop Components and Operations – Hadoop Distributed File System, unstructured data
§ Data ingest & replication: external connectivity; east-west traffic (replication of data blocks)
§ Map phase: raw data is analyzed and converted to name/value pairs; the workload translates to multiple batches of map tasks; mostly an IO/compute function
§ Reducers can start the reduce phase ONLY after the entire map set is complete
  • 17. Hadoop Components and Operations – Hadoop Distributed File System, unstructured data
§ Shuffle phase: all name/value pairs are sorted and grouped by their keys; mappers send the data to reducers; high network activity
§ Reduce phase: all values associated with a key are processed for results, in three phases – copy (get intermediate results from each data node's local disk), merge (to reduce the number of files), and the reduce method
§ Output replication phase: reducers replicate results to multiple nodes – the highest network activity
§ Network activity is dependent on workload behavior
  • 18. MapReduce Data Model – ETL & BI Workload Benchmarks. The complexity of the functions used in map and/or reduce has a large impact on job completion time and network traffic.
§ Yahoo TeraSort – ETL workload – most network intensive: input, shuffle and output data size is the same (e.g., a 10 TB data set in all phases); TeraSort has more balanced map vs. reduce functions – linear compute and IO. (Timeline: map start, reducers start, map finish, job finish.)
§ Shakespeare WordCount – BI workload: data set size varies by phase, with varying impact on the network (e.g., 1 TB input, 10 MB shuffle, 1 MB output); most of the processing is in the map functions, with smaller intermediate and even smaller final data.
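For reference, a run like the deck's TeraSort benchmark can be reproduced with the stock examples jar shipped with Apache Hadoop 0.20.2 (the HDFS paths are placeholders):

    # Generate 1 TB of input: teragen takes the number of 100-byte rows
    hadoop jar hadoop-0.20.2-examples.jar teragen 10000000000 /terasort/in

    # Sort it: input, shuffle and output are all ~1 TB, the most
    # network-intensive of the two workload profiles above
    hadoop jar hadoop-0.20.2-examples.jar terasort /terasort/in /terasort/out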
  • 19. ETL Workload (1 TB Yahoo TeraSort) – network graph of all traffic received on a single node (80-node run). Shortly after the reducers start, map tasks are finishing and data is being shuffled to the reducers; as maps completely finish, the network is no longer used, as the reducers have all the data they need to finish the job. (In the graph, the red line is the total traffic received by node hpc064; the other symbols represent individual nodes sending traffic to hpc064. Timeline: job start, maps start, reducers start, maps complete, finish.)
  • 20. ETL Workload (1 TB Yahoo TeraSort) – network activity of all traffic received on a single node (80-node run), with output data replication enabled. If output replication is enabled, the end of the TeraSort must store additional copies: for a 1 TB sort, 2 TB must be replicated across the network.
§ Replication of 3 enabled (1 copy stored locally, 2 stored remotely)
§ Each reduce output is now replicated instead of just stored locally
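A sketch of how to trigger this output-replication phase (hedged: some TeraSort builds pin the output replication internally, so verify the behavior on your distribution):

    # dfs.replication is a client-side setting; passing it to the job asks
    # each reducer to write its output with 3 replicas (1 local + 2 remote)
    hadoop jar hadoop-0.20.2-examples.jar terasort \
      -D dfs.replication=3 /terasort/in /terasort/out

    # Equivalent after the fact: re-replicate the finished output
    hadoop fs -setrep -R -w 3 /terasort/out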
  • 21. BI Workload – network graph of all traffic received on a single node (80-node run): WordCount on 200K copies of the complete works of Shakespeare. Due to the combination of the length of the map phase and the reduced data set being shuffled, the network is utilized throughout the job, but by a limited amount. (Red line: total traffic received by hpc064; symbols: nodes sending traffic to hpc064. Timeline: job start, maps start, reducers start, maps complete, finish.)
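The corresponding BI-style run uses the same examples jar (the input path stands in for the replicated Shakespeare corpus):

    # Map-heavy, shuffle-light job: large input, tiny intermediate/final data
    hadoop jar hadoop-0.20.2-examples.jar wordcount /shakespeare /wordcount/out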
  • 22. Data Locality in HDFS – the ability to process data where it is locally stored.
§ Observations: notice the initial spike in RX traffic before the reducers kick in – it represents data each map task needs that is not local; looking at the spike, it is mainly data from only a few nodes
§ Note: during the map phase, the JobTracker attempts to use data locality to schedule map tasks where the data is locally stored; this is not perfect and depends on which data nodes hold the data – a consideration when choosing the replication factor
§ Map tasks: initial spike for non-local data – sometimes a task is scheduled on a node that does not have the data available locally
§ More replicas tend to create a higher probability of data locality
  • 23. Map to Reducer Ratio Impact on Job Completion
§ A 1 TB file with 128 MB blocks == 7,813 map tasks
§ Job completion time is directly related to the number of reducers (measured at 192, 96, 48, 24, 12 and 6 reducers)
§ Average network buffer usage lowers as the number of reducers gets lower (see hidden slides), and vice versa
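With the 0.20.x examples jar, the reducer count in such a sweep is one generic option per run; a minimal sketch with illustrative paths:

    # Re-run the same 1 TB sort with each reducer count from the study
    for r in 192 96 48 24 12 6; do
      hadoop jar hadoop-0.20.2-examples.jar terasort \
        -D mapred.reduce.tasks=$r /terasort/in /terasort/out-$r
    done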
  • 24. Job Completion Time with 96 Reducers (graph)
  • 25. Job Completion Time with 48 Reducers (graph)
  • 26. Job Completion Graph with 24 Reducers
  • 27. Network Characteristics – the relative impact of various network characteristics on Hadoop clusters*: availability, buffering, oversubscription, data node speed, latency. (*Not scaled or measured data.)
  • 29. Data Center Access Connectivity – core/distribution: Nexus 7000 (LAN) and MDS 9000 (SAN); unified access layer: Nexus 5000, Nexus 1000V, Nexus 4000, Nexus 2000 FEX; attachment options: UCS compute (blade & rack), UCS blade switch with FCoE, blade w/ pass-thru (IBM/Dell), 1 GE/10 GE rack-mount servers, direct-attach 10GE.
  • 30. Network Reference Architecture – network attributes: architecture; availability; capacity, scale & oversubscription; flexibility; management & visibility. (Nexus LAN and SAN core optimized for the data centre, with an edge/access layer of blade-chassis racks.)
  • 31. Scaling the Data Centre Fabric – changing the device paradigm
§ De-coupling of the Layer 1 and Layer 2 topologies
§ Simplified management model, plug-and-play provisioning, centralized configuration
§ Line card portability (N2K supported with multiple parent switches – N5K, 6100, N7K)
§ Unified access for any server (100M -> 1GE -> 10GE -> FCoE): scalable Ethernet, HPC, unified fabric or virtualization deployment ... virtualized switch
  • 32. Hadoop Network Topologies – Reference Unified Fabric & ToR DC Designs
§ 1 Gbps attached server: Nexus 7000/5000 with 2248TP-E, or Nexus 7000 and 3048
§ NIC teaming, 1 Gbps attached: Nexus 7000/5000 with 2248TP-E, or Nexus 7000 and 3048
§ 10 Gbps attached server: Nexus 7000/5000 with 2232PP, or Nexus 7000 and 3064
§ NIC teaming, 10 Gbps attached server: Nexus 7000/5000 with 2232PP, or Nexus 7000 & 3064
§ Integration with the enterprise architecture – the essential pathway for data flow; consistency, management, risk-assurance, enterprise-grade features
§ Consistent operational model: NX-OS, CLI, fault behavior and management
§ Though there is higher east-west bandwidth than in traditional transactional networks, over time the cluster will take on multi-user, multi-workload behavior and needs enterprise-centric features: security, SLA, QoS, etc.
  • 33. Validated Reference Network Topology – as in the validated 96-node cluster: a traditional DC design (Nexus 55xx/2248) and a Nexus 7K-N3K based topology; Apache 0.20.2 on Linux 6.2 with 10 map & 2 reduce slots per node; UCS C200 M2 nodes (12 cores, 2 x Intel Xeon X5670 @ 2.93 GHz, 4 x 2 TB 7.2K RPM disks, 1G LOM / 10G Cisco UCS P81E); three racks of 32 nodes, Nexus 7000 or 5000 distribution, FEX or Nexus 3000 ToR, 2 FEX per rack, and each rack with either 32 single- or dual-attached hosts.
  • 34. Network Reference Architecture Characteristics – network attributes: architecture; availability; capacity, scale & oversubscription; flexibility; management & visibility.
  • 35. High Availability Switching Design – common high-availability engineering principles
§ The core high-availability design principles are common across all network systems designs: dual node, full mesh at L3 and L2, dual-node ToR, and NIC teaming (dual NIC 802.3ad, dual NIC active/standby, single NIC)
§ Understand the causes of network outages: component failures, network anomalies
§ Understand the engineering foundations of systems-level availability: device- and network-level MTBF, hierarchical and modular design, and the HW/SW interaction in the system
§ Enhanced vPC allows such topologies and is ideally suited for Big Data applications; in an Enhanced vPC (EvPC) configuration, any and all server NIC teaming configurations are supported on any port
§ System high availability is a function of topology and component-level high availability
  • 36. Availability with Single-Attached Servers – 1G or 10G
§ It is important to evaluate the overall availability of the system: network failures can span many nodes, causing rebalancing and decreased overall resources; typically multi-TB data transfers occur for a single ToR or FEX failure; load sharing, ease of management and a consistent SLA are important to enterprise operation
§ Failure domain impact on job completion: a 1 TB TeraSort typically takes ~4.20-4.30 minutes; a failure of a SINGLE NODE (either NIC or a server component) results in roughly doubling the job completion time
§ Key observation: the failure impact depends on the type of workload being run on the cluster – short-lived interactive vs. short-lived batch vs. long jobs (ETL, normalization, joins). (Single NIC, 32 per ToR.)
  • 37. Single Node Failure – Job Completion Time
§ Map tasks are executed in parallel, so the unit time for each map task/node remains the same and the nodes complete the job at roughly the same time.
§ During a failure, however, a set of map tasks remains pending (since the other nodes in the cluster are still completing their tasks) until ALL nodes finish their assigned tasks.
§ Once all nodes finish their map tasks, the leftover map tasks are reassigned by the name node; the unit time to finish those map tasks remains the same (linear) as the time it took to finish the others – they just happen NOT to be done in parallel, which can double the job completion time. This is the worst case with TeraSort; other workloads may have variable completion times.
  • 38. 1G Port Traffic & Job Completion Time (graph)
  • 39. 1G Port Failure Traffic & Job Completion Time (graph)
  • 40. Availability with Dual-Attached Servers – 1G and 10G server NIC teaming topologies
§ Dual-homing (active-active) the network connection from the server reduces replication and data movement during failures and allows optimal load sharing
§ Dual-homing the FEX avoids a single point of failure
§ Enhanced vPC allows such topologies and is ideally suited for Big Data applications; in an Enhanced vPC (EvPC) configuration, any and all server NIC teaming configurations (dual NIC 802.3ad, dual NIC active/standby, single NIC) are supported on any port; supported with Nexus 5500 only
§ Alternatively, Nexus 3000 vPC allows host-level redundancy with ToR ECMP
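A minimal switch-side sketch of such a dual-homed EvPC design (interface and ID numbering is illustrative, and exact syntax varies by NX-OS release):

    ! On each Nexus 5548 of the vPC pair: dual-home the FEX itself
    interface ethernet1/1-4
      channel-group 101
    interface port-channel101
      switchport mode fex-fabric
      fex associate 101
      vpc 101
    ! Host-facing port-channel for a dual-NIC data node, one leg per FEX
    interface ethernet101/1/1, ethernet102/1/1
      channel-group 1001 mode active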
  • 41. Availability: Single-Attached vs. Dual-Attached Nodes
§ No single point of failure from the network viewpoint; no impact on job completion time
§ NIC bonding configured in Linux – with the LACP mode of bonding
§ Effective load sharing of traffic flows across the two NICs
§ Recommended to change the hashing to src-dst-ip-port (both on the network and in the Linux NIC bonding) for optimal load sharing
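On the host side, a sketch of the matching LACP bond for a RHEL 6-style install (interface names and file locations are assumptions, not from the deck):

    # /etc/modprobe.d/bonding.conf: 802.3ad (LACP) with layer3+4 hashing,
    # the Linux analogue of the recommended src-dst-ip-port load sharing
    options bond0 mode=802.3ad xmit_hash_policy=layer3+4 miimon=100

    # Each slave (e.g. /etc/sysconfig/network-scripts/ifcfg-eth0) carries:
    #   MASTER=bond0  SLAVE=yes  ONBOOT=yes

    # Verify the aggregator and per-slave LACP state
    cat /proc/net/bonding/bond0

    # Switch side (keyword varies by platform/release; verify before use):
    #   N5548(config)# port-channel load-balance ethernet source-dest-port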
  • 42. Availability – Network Failure Results, 1 TB TeraSort (ETL)
§ Failure of various components (FEX/ToR A/B, peer link, ports); failures introduced at 33%, 66% and 99% of reducer completion; 96 nodes across 3 racks, 2 FEX per rack
§ A singly-attached NIC server & rack failure has a bigger impact on job completion time than any other failure; a FEX failure is a rack failure for the 1G topology
§ Job completion time (seconds) with various failures*:
    Failure Point              | 1G Single Attached | 2G Dual Attached
    5000                       | 301                | 258
    FEX*                       | 1137               | 259
    Rack*                      | 1137               | 1017
    A port – single attached   | see previous slide | see previous slide
    1 port – dual attached     | see previous slide | see previous slide
    (*Variance in run time with % of reducers completed)
  • 43. Network Reference Architecture Characteristics – network attributes: architecture, availability, capacity, scale & oversubscription, flexibility, management & visibility.
  • 44. Cluster Scaling – Nexus 7K/5K & FEX (2248TP-E or 2232)
§ 1G based – Nexus 2248TP-E: 48 1G host ports and up to 4 uplinks bundled into a single port channel
§ 10G based – Nexus 2232: 32 10G host ports and up to 8 uplinks bundled into a single port channel
§ Nexus 2248TP-E and 2232 support both local port channels and vPC for distributed port channels (802.3ad & vPC, or single attached)
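Bundling the FEX fabric uplinks into a single port channel looks roughly like this on the parent Nexus 5500 (FEX number and interfaces are illustrative):

    ! Bundle four 10G fabric uplinks to a 2248TP-E as FEX 110
    interface ethernet1/9-12
      channel-group 110
    interface port-channel110
      switchport mode fex-fabric
      fex associate 110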
  • 45. Oversubscription Design
§ Hadoop is a parallel, batch-job-oriented framework; its primary benefit is the reduction in job completion time that would otherwise take longer with traditional techniques (e.g., large ETL, log analysis, join-only-map jobs)
§ Typically oversubscription occurs with 10G server access rather than with 1G
§ A non-blocking network is NOT needed; however, the degree of oversubscription matters for job completion time, replication of results, and oversubscription during a rack or FEX failure
§ Static vs. actual oversubscription: how much data a single node can push is often IO-bound and a function of the number of disks configured
§ Uplinks vs. oversubscription (16 servers): 8 uplinks – 2:1; 4 uplinks – 4:1; 2 uplinks – 8:1; 1 uplink – 16:1 (measured results on the next slides)
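The static ratios in that table are simply host-facing bandwidth divided by fabric-facing bandwidth; worked out for 16 servers 10G-attached to a 2232-class ToR (assumptions as in the validated topology):

    oversubscription = (hosts x host BW) / (uplinks x uplink BW)
    8 uplinks: (16 x 10 GE) / (8 x 10 GE) = 2:1
    4 uplinks: (16 x 10 GE) / (4 x 10 GE) = 4:1
    2 uplinks: (16 x 10 GE) / (2 x 10 GE) = 8:1
    1 uplink:  (16 x 10 GE) / (1 x 10 GE) = 16:1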
  • 46. Network Oversubscription (graphs): steady state; result replication with 1, 2, 4 & 8 uplinks; rack failure with 1, 2, 4 & 8 uplinks.
  • 47. Data Node Speed Differences – 1G vs. 10G (tcpdump of reducer TX)
§ Generally 1G is used, largely due to cost/performance trade-offs, though 10GE can provide benefits depending on the workload
§ Reduced spikes with 10G and smoother job completion time
§ Multiple 1G or 10G links can be bonded together to not only increase bandwidth but also increase resiliency
  • 48. 1GE vs. 10GE Buffer Usage – moving from 1GE to 10GE actually lowers the buffer requirement at the switching layer. (Graph: job completion and buffer cell usage over time; series – 1G buffer used, 10G buffer used, 1G map %, 1G reduce %, 10G map %, 10G reduce %.) By moving to 10GE, the data node has a wider pipe to receive data, lessening the need for buffers on the network, as the total aggregate transfer rate and amount of data do not increase substantially. This is due, in part, to limits of I/O and compute capabilities.
  • 49. Network Reference Architecture Characteristics – network attributes: architecture, availability, capacity, scale & oversubscription, flexibility, management & visibility.
  • 50. Multi-Use Cluster Characteristics – Hadoop clusters are generally multi-use, and the effect of background use can affect any single job's completion. Example view of 24-hour cluster use: a given cluster running many different types of jobs, importing into HDFS, etc.; a large ETL job overlaps with medium and small ETL jobs and many small BI jobs (blue lines are ETL jobs, purple lines are BI jobs).
  • 51. 100 Jobs, Each with a 10 GB Data Set – Stable, Node & Rack Failure
§ Almost all jobs are impacted by a single node failure
§ With multiple jobs running concurrently, the impact of a node failure is as significant as a rack failure
  • 52. Network Reference Architecture Characteristics – network attributes: architecture, availability, capacity, scale & oversubscription, flexibility, management & visibility.
  • 53. Burst Handling and Queue Depth
§ Several HDFS operations and phases of MapReduce jobs are very bursty in nature; the extent of the bursts largely depends on the type of job (ETL vs. BI)
§ Bursty phases can include replication of data (either importing into HDFS or output replication) and the output of the mappers during the shuffle phase
§ A network that cannot handle bursts effectively will drop packets, so optimal buffering is needed in network devices to absorb bursts
§ Optimal buffering: given a large enough incast, TCP will collapse at some point no matter how large the buffer; this is well studied by multiple universities, and alternate solutions (changing TCP behavior) have been proposed rather than huge-buffer switches (http://simula.stanford.edu/sedcl/files/dctcp-final.pdf)
  • 54. Nexus 2248TP-E Buffer Monitoring
§ The Nexus 2248TP-E utilizes a 32 MB shared buffer to handle larger traffic bursts; Hadoop, NAS and AVID are examples of bursty applications
§ You can control the queue limit for a specified Fabric Extender for egress (network to host) or ingress (host to network)
§ Extensive drop counters: provided for both directions (network-to-host and host-to-network) on a per-host-interface basis, with counters per drop reason – out-of-buffer drop, no-credit drop, queue-limit drop (tail drop), MAC error drop, truncation drop, multicast drop
§ Buffer occupancy counter: how much buffer is being used – one key indicator of congestion or bursty traffic
N5548-L3(config-fex)# hardware N2248TPE queue-limit 4000000 rx
N5548-L3(config-fex)# hardware N2248TPE queue-limit 4194304 tx
fex-110# show platform software qosctrl asic 0 0
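In context, the queue-limit tuning above sits under the FEX scope on the parent switch; a sketch reusing the deck's own values (FEX ID 110 assumed; direction naming per the slide, so verify on your NX-OS release):

    ! Enter the FEX configuration scope on the parent Nexus 5548
    N5548-L3(config)# fex 110
    ! rx = host-to-network (ingress), tx = network-to-host (egress);
    ! values are in bytes, taken from the slide above
    N5548-L3(config-fex)# hardware N2248TPE queue-limit 4000000 rx
    N5548-L3(config-fex)# hardware N2248TPE queue-limit 4194304 tx
    ! Then watch buffer occupancy and drop counters from the FEX itself
    N5548-L3# attach fex 110
    fex-110# show platform software qosctrl asic 0 0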
  • 55. Buffer Monitoring
switch# attach fex 110
Attaching to FEX 110 ... To exit type 'exit', to abort type '$.'
fex-110# show platform software qosctrl asic 0 0
number of arguments 4: show asic 0 0
----------------------------------------
QoSCtrl internal info {mod 0x0 asic 0}
mod 0 asic 0:
port type: CIF [0], total: 1, used: 1
port type: BIF [1], total: 1, used: 0
port type: NIF [2], total: 4, used: 4
port type: HIF [3], total: 48, used: 48
bound NIF ports: 2
N2H cells: 14752
H2N cells: 50784
----Programmed Buffers---------
Fixed Cells : 14752
Shared Cells : 50784   <-- allocated buffer in terms of cells (512 bytes)
----Free Buffer Statistics-----
Total Cells : 65374
Fixed Cells : 14590
Shared Cells : 50784   <-- number of free cells to be monitored
  • 56. TeraSort FEX (2248TP-E) Buffer Analysis (10 TB) – graph of shared-buffer cell usage over the job timeline (series: FEX #1, FEX #2, Map %, Reduce %), with buffer usage peaks during the shuffle phase and during output replication.
§ Buffer utilization is highest during the shuffle and output replication phases
§ Optimized buffer sizes are required to avoid packet loss leading to slower job completion times
  • 57. Buffer Depth Monitoring: Interface
§ A real-time command displays the status of the shared buffer; XML support will be added in the maintenance release
§ Counters are displayed in cell count; a cell is approximately 208 bytes
show hardware internal buffer info pkt-stats [brief|clear|detail]
(Output shows buffer usage, free buffer, total buffer space on the platform, and max buffer usage since clear.)
  • 58. TeraSort (ETL) N3K Buffer Analysis (10 TB) – graph of buffer usage over the job timeline (series for the top-of-rack Nexus 3000 switches plus Map % and Reduce %), with buffer usage peaks during the shuffle phase and during output replication.
§ Buffer utilization is highest during the shuffle and output replication phases; optimized buffer sizes are required to avoid packet loss leading to slower job completion times
§ The aggregation switch buffer remained flat, as the bursts were absorbed at the top-of-rack layer
  • 59. Network Latency – generally, network latency does not represent a significant factor for Hadoop clusters, although consistent latency is important. (Graph: completion time in seconds for 1 TB, 5 TB and 10 TB data sets on an 80-node cluster, N3K topology vs. 5K/2K topology.) Note: there is a difference between network latency and application latency; optimization in the application stack can decrease application latency, which can potentially have a significant benefit.
  • 60. Summary
§ Extensive validation of the Hadoop workload; the reference architecture makes it easy for the enterprise, demystifies the network for Hadoop deployment, and integrates with enterprise design using efficient choices of network topology/devices
§ 10G and/or dual-attached servers provide consistent job completion time & better buffer utilization; 10G reduces bursts at the access layer
§ A single-attached node failure has a considerable impact on job completion time; a dual-attached server design is recommended – 1G or 10G, with 10G for future-proofing
§ A rack failure has the biggest impact on job completion time
§ Hadoop does not require a non-blocking network; however, the degree of oversubscription does impact job completion time
§ Latency does not matter much for Hadoop workloads
  • 61. Big Data @ Cisco – 128-node / 1 PB test cluster; Cisco.com Big Data: www.cisco.com/go/bigdata
§ Certifications and solutions with UCS C-Series and Nexus 5500+22xx: EMC Greenplum MR solution; Cloudera Hadoop certified technology; Cloudera Hadoop solution brief; Oracle NoSQL validated solution; Oracle NoSQL solution brief
§ Multi-month network and compute analysis testing (in conjunction with Cloudera): network/compute considerations whitepaper; analysis presented at Hadoop World
  • 62. THANK YOU FOR LISTENING Nimish Desai – nidesai@cisco.com Technical Leader, Data Center Group Cisco Systems Inc.
  • 63. Break! Break takes place in the Community Showcase (Hall 2). Sessions will resume at 3:35pm.