© 2013 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
Hadoop-as-a-Service for Lifecycle Management Simplicity
Chris Mutchler | Adobe Compute Platform Engineer | @chrismutchler
Andrew Nelson | VMware Staff Systems Engineer | @vmwnelson | virtual-hiking.blogspot.com
Operational Approach to Virtualizing Hadoop
Why even bother?
These are the four reasons why I wanted to tackle this problem.
» Excited about the idea of developing an internal Platform-as-a-Service offering.
» Solves a common “shadow IT” problem in infrastructure organizations and saves $$$.
» Adobe is a Big Data company, so it makes sense for us to have a Hadoop offering.
» It’s bleeding edge. Innovate quickly and scale.
Benefits of Virtualizing Hadoop
Reference Architecture
[Diagram labels: VMs (OS + App) on ESXi with Tier-0 Flash; Private Cloud Resource Pools; VMware vCenter; VMware vCloud Automation Center; VMware Big Data Extensions]
Adobe Use-Cases
Two unique use-cases
[Diagram labels: Service A… …Service X; Production A, Production X; Experimentation, Test/Dev, Test, Production]
» Engineering Pre-Production Environment: Multiple teams with zero Hadoop experience and a desire to investigate Hadoop.
» Production Environment: Digital Marketing products looking to take advantage of existing data managed by Ops.
3rd Party Integrated Deployment
Big Questions
What are the two questions ops teams must answer?
» Where is my data?
» How do I access it?
Local Storage / Shared Storage
Solution
• Integration with Adobe DMBU Private Cloud
• HDFS Storage Integration
• Service Blueprints in vCAC
Data Layer – Hadoop on Isilon
Elastic Virtual Compute Layer
VMware vSphere BDE and Hadoop Resources
• VMware vSphere BDE web site
  http://www.vmware.com/bde
• Virtualized Hadoop Performance with VMware vSphere 5.1
  http://www.vmware.com/resources/techresources/10220
• Benchmarking Case Study of Virtualized Hadoop Performance on vSphere 5
  http://vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf
• Hadoop Virtualization Extensions (HVE)
  http://www.vmware.com/files/pdf/Hadoop-Virtualization-Extensions-on-VMware-vSphere-5.pdf
• Apache Hadoop High Availability Solution on VMware vSphere 5.1
  http://vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf
• Hadoop-as-a-Service workflows for vCloud Automation Center
  https://solutionexchange.vmware.com/store/products/hadoop-as-a-service-vmware-vcloud-automation-center-and-big-data-extension
• Project Serengeti website
  http://www.projectserengeti.org
  https://github.com/vmware-serengeti
Hadoop Virtualization Extensions (HVE)
Topology Extensions:
• Enable Hadoop to recognize the additional virtualization layer during read, write, and balancing operations so that replicas are placed properly
• Enable compute/data node separation without losing locality
Elasticity Extensions:
• Ability to dynamically adjust the resources (CPU, memory, map/reduce slots) allocated to compute nodes
• Enables runtime elasticity of Hadoop nodes
© 2013 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
HVE Adds a New Layer in Hadoop Network Topology
• D = data center
• R = rack
• NG = node group
• N = node
[Diagram: topology tree, root to leaves]
/
D1 D2
R1 R2 R3 R4
NG1 NG2 NG3 NG4 NG5 NG6 NG7 NG8
N1 N2 N3 N4 N5 N6 N7 N8 N9 N10 N11 N12 N13
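To make the extra node-group layer concrete, here is a minimal Python sketch of tree distance over four-level paths of the form /D/R/NG/N. It is an illustration only, not Hadoop's or HVE's actual code: the function names and the assignment of nodes to node groups, racks, and data centers are hypothetical, since the slide's tree edges are not recoverable from the text. The idea is that two nodes are "closer" the deeper their lowest common ancestor sits, so two VMs in the same node group (typically the same physical host) are closer than two VMs that only share a rack.

```python
from typing import List


def topology_distance(path_a: str, path_b: str) -> int:
    """Sum of hops from each node up to their lowest common ancestor."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def closest_first(reader: str, replicas: List[str]) -> List[str]:
    """Order replica locations by distance from the reader (stable sort)."""
    return sorted(replicas, key=lambda r: topology_distance(reader, r))


# Hypothetical placement: which node sits under which group is an assumption.
reader = "/D1/R1/NG1/N1"
replicas = ["/D2/R3/NG5/N8", "/D1/R1/NG2/N3", "/D1/R1/NG1/N2"]

for replica in closest_first(reader, replicas):
    print(replica, topology_distance(reader, replica))
```

With these paths, the replica in the same node group is at distance 2, the one in the same rack but a different node group is at distance 4, and the one in the other data center is at distance 8. Without the node-group layer, the first two would be indistinguishable, which is why the extra layer matters for replica placement on virtualized hosts.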
vSphere Big Data Extensions Architecture
vCloud Automation Center
Big Data Extensions
vCenter Operations Manager
[Diagram: Virtual Hadoop Manager (VHM)]
• Components: Serengeti, Job Tracker, Task Trackers (×3), vCenter Server, vCenter DB, Resource Management Module
• Flow labels: State, stats (slots used, pending work); Commands (decommission, recommission); Stats and VM configuration; Manual/Auto; Power on/off
• Inside VHM: Serengeti configuration, cluster configuration, VC state and stats, Hadoop state and stats, Algorithms, VC actions, Hadoop actions
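The VHM diagram labels suggest a control loop: Hadoop state (slots used, pending work) and vCenter state come in, and commands (decommission, recommission) and power actions go out. The Python sketch below illustrates that shape only. It is not VHM's real algorithm: the `ClusterStats` fields, the `decide` function, and the 0.8 / 0.3 utilization thresholds are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    slots_total: int    # map/reduce slots on powered-on compute nodes
    slots_used: int     # slots currently running tasks
    pending_tasks: int  # work waiting for a free slot
    powered_on: int     # compute nodes currently powered on
    min_nodes: int      # hypothetical floor for the cluster
    max_nodes: int      # hypothetical ceiling for the cluster


def decide(stats: ClusterStats, high: float = 0.8, low: float = 0.3) -> str:
    """Return 'recommission', 'decommission', or 'hold' (thresholds are assumed)."""
    utilization = stats.slots_used / stats.slots_total if stats.slots_total else 0.0
    busy = stats.pending_tasks > 0 or utilization >= high
    idle = stats.pending_tasks == 0 and utilization <= low
    if busy and stats.powered_on < stats.max_nodes:
        return "recommission"  # bring another compute node back into service
    if idle and stats.powered_on > stats.min_nodes:
        return "decommission"  # drain a compute node before powering it off
    return "hold"


print(decide(ClusterStats(20, 19, 5, 5, 2, 10)))
print(decide(ClusterStats(20, 2, 0, 5, 2, 10)))
print(decide(ClusterStats(20, 10, 0, 5, 2, 10)))
```

The three calls cover a saturated cluster with queued work, a mostly idle cluster, and a cluster in the middle band where no action is taken. A real controller would also weigh vCenter-side stats such as host contention, which this sketch leaves out.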

Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Hadoop-as-a-Service for Lifecycle Management Simplicity

  • 1. © 2013 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. Hadoop-as-a-Service for Lifecycle Management Simplicity. Chris Mutchler | Adobe Compute Platform Engineer | @chrismutchler. Andrew Nelson | VMware Staff Systems Engineer | @vmwnelson | virtual-hiking.blogspot.com
  • 2. Operational Approach to Virtualizing Hadoop. Why even bother? These are the four reasons why I wanted to tackle this problem:
      » Excited about the idea of developing an internal Platform-as-a-Service offering.
      » Solves a common “shadow IT” problem in infrastructure organizations and saves $$$.
      » Adobe is a Big Data company; it makes sense for us to have a Hadoop offering.
      » It’s bleeding edge. Innovate quickly and scale.
  • 3. Benefits of Virtualizing Hadoop
  • 4. Reference Architecture. Diagram labels: ESXi; Tier-0 Flash; OS/App (three VMs); Private Cloud Resource Pools; VMware vCenter; VMware vCloud Automation Center; VMware Big Data Extensions.
  • 5. Adobe Use-Cases. Two unique use-cases. Diagram labels: Experimentation; Production A; Production X; Test/Dev; Production; Test; Service A… …Service X.
      » Engineering Pre-Production Environment: Multiple teams with zero Hadoop experience and a desire to investigate Hadoop.
      » Production Environment: Digital Marketing products looking to take advantage of existing data managed by Ops.
  • 6. 3rd Party Integrated Deployment
  • 7. Big Questions. What are the two questions ops teams must answer?
      » Where is my data?
      » How do I access it?
     Diagram labels: Local Storage; Shared Storage.
  • 8. Solution
      » Integration with Adobe DMBU Private Cloud
      » HDFS Storage Integration
      » Service Blueprints in vCAC
     Diagram labels: Data Layer – Hadoop on Isilon; Elastic Virtual Compute Layer.
  • 9. (No slide text.)
  • 10. VMware vSphere BDE and Hadoop Resources
      » VMware vSphere BDE web site: http://www.vmware.com/bde
      » Virtualized Hadoop Performance with VMware vSphere 5.1: http://www.vmware.com/resources/techresources/10220
      » Benchmarking Case Study of Virtualized Hadoop Performance on vSphere 5: http://vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf
      » Hadoop Virtualization Extensions (HVE): http://www.vmware.com/files/pdf/Hadoop-Virtualization-Extensions-on-VMware-vSphere-5.pdf
      » Apache Hadoop High Availability Solution on VMware vSphere 5.1: http://vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf
      » Hadoop-as-a-Service workflows for vCloud Automation Center: https://solutionexchange.vmware.com/store/products/hadoop-as-a-service-vmware-vcloud-automation-center-and-big-data-extension
      » Project Serengeti website: http://www.projectserengeti.org and https://github.com/vmware-serengeti
  • 11. Hadoop Virtualization Extensions (HVE)
      » Topology Extensions:
          - Enable Hadoop to recognize the additional virtualization layer for read/write/balancing, so that replicas are placed properly.
          - Enable compute/data node separation without losing locality.
      » Elasticity Extensions:
          - Ability to dynamically adjust the resources (CPU, memory, map/reduce slots) allocated to compute nodes.
          - Enables runtime elasticity of Hadoop nodes.
  • 12. HVE Adds a New Layer in Hadoop Network Topology. Legend: D = data center; R = rack; NG = node group; N = node. Diagram levels, top to bottom: / (root); D1, D2; R1 to R4; NG1 to NG8; N1 to N13.
  • 13. vSphere Big Data Extensions Architecture. Diagram labels: vCloud Automation Center; Big Data Extensions; vCenter Operations Manager.
  • 14. Virtual Hadoop Manager (VHM). Diagram labels:
      » Components: Serengeti; Job Tracker; Task Tracker (three); vCenter Server; vCenter DB; Resource Management Module; Algorithms; Cluster Configuration; Serengeti Configuration.
      » Flows: State, stats (Slots used, Pending work); Commands (Decommission, Recommission); Stats and VM configuration; Manual/Auto Power on/off; VC state and stats; Hadoop state and stats; VC actions; Hadoop actions.
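
The node-group layer on slide 12 can be illustrated with a small sketch. It assumes the usual tree-distance idea (hops up to the nearest common ancestor, then down again) and assumes a node group stands for the VMs sharing one hypervisor host. The function name and the node placements below are hypothetical; the flattened slide does not show which node sits in which group.

```python
def topology_distance(path_a: str, path_b: str) -> int:
    """Count tree hops between two nodes given as /datacenter/rack/nodegroup/node paths."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    shared = 0
    # Walk down from the root while both paths agree.
    for left, right in zip(a, b):
        if left != right:
            break
        shared += 1
    # Hops up from a to the common ancestor plus hops down to b.
    return (len(a) - shared) + (len(b) - shared)


# Hypothetical placements (assumption: not taken from the slide).
N1 = "/D1/R1/NG1/N1"
N2 = "/D1/R1/NG1/N2"  # same node group as N1
N3 = "/D1/R1/NG2/N3"  # same rack, different node group
N5 = "/D1/R2/NG3/N5"  # same data center, different rack
N9 = "/D2/R3/NG5/N9"  # different data center

if __name__ == "__main__":
    for other in (N1, N2, N3, N5, N9):
        print(other, topology_distance(N1, other))
```

With the extra layer, two VMs in the same node group come out closer than two VMs that merely share a rack, which is the distinction a replica-placement policy needs in order to avoid putting every copy of a block on one physical host.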

Editor's Notes

  1. Introduction and agenda: Ops benefits; Tech benefits; Architecture; Use cases; Demo video; Hybrid data model; Current directions; Q&A; Supplementals.
  2. Adobe is a Big Data company. Adopting a virtualized approach to Hadoop has both business and technical justifications and allows competitive differentiation. Analytics is a core competency of DMBU.
  3. Benefits of virtualizing Hadoop:
      » Rapid provisioning: Much of the cluster deployment process can be automated using existing tools.
      » High availability: HA protection can be provided through the virtualization platform to protect the single points of failure in the Hadoop system.
      » Elasticity: Hadoop capacity can be scaled up and down on demand in a virtual environment.
      » Multi-tenancy: Different tenants running Hadoop can be isolated in separate VMs, providing stronger VM-grade resource and security isolation.
     Slide keywords: Operational Simplicity; Rapid Deployment; Self service tools; Performance; Maximize Resource Utilization; True multi-tenancy; Elastic scaling; Avoid dedicated hardware; VM-based isolation; Increase resource utilization; Architect Scalable Platform; Deployment choice; Maintain management flexibility at scale; Control Costs; Leverage toolsets; Security.
  4. Expecting a lot of questions on this one and halfway through, so good time for intermediate Q&A if Chris wants to discuss some of the physical design. We can defer questions on use-cases and workflows since those will be immediately following.
  5. Prod and dev review
  6. Video walkthrough of vCAC deployment and auto-discovery via Cloudera Manager
  7. Hybrid storage model to get the best of both worlds, or for flexibility.
      » Master nodes (NameNode, JobTracker) on shared storage: leverage vSphere vMotion, HA and FT.
      » Slave nodes (TaskTracker, DataNode) on local storage: lower cost, scalable bandwidth.
  8. Identify acronyms, DMBU and vCAC first.
      » Integration with Adobe DMBU Private Cloud: IaaS environment leveraging the VMware stack (vCAC + vCOPs + vCenter).
      » HDFS Storage Integration: The storage team is currently managing >10PB of data on Isilon. This layer is presented, via HDFS, to multiple product teams from a single view.
      » Service Blueprints in vCAC: Offering multiple blueprints for various cluster types and sizes within vCAC. These blueprints are presented to the Service Catalog and our internal self-provisioning portal.
  9. Q&A slide
  10. Supplementary links
  11. Contributed back to Hadoop community
  12. HVE Supplemental
  13. BDE Components Supplemental
  14. BDE supplementary
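
The hybrid storage model in note 7 amounts to a simple placement rule, sketched below. The role names come from the notes; the function name, the "shared"/"local" labels, and the choice to treat a mixed master/worker VM as shared are assumptions for illustration, not the Big Data Extensions configuration format.

```python
# Role names as listed in note 7.
MASTER_ROLES = {"NameNode", "JobTracker"}
WORKER_ROLES = {"TaskTracker", "DataNode"}


def storage_for(roles):
    """Pick a storage tier for a VM from the Hadoop roles it runs (hypothetical helper)."""
    roles = set(roles)
    if not roles:
        raise ValueError("a VM needs at least one role")
    unknown = roles - MASTER_ROLES - WORKER_ROLES
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    # Masters go on shared storage so vMotion, HA and FT can protect them.
    # Assumption: a VM mixing master and worker roles is treated as a master.
    if roles & MASTER_ROLES:
        return "shared"
    # Workers stay on local disks for lower cost and scalable bandwidth.
    return "local"


if __name__ == "__main__":
    print(storage_for(["NameNode"]))
    print(storage_for(["TaskTracker", "DataNode"]))
```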