Hadoop Everywhere
Hortonworks. We do Hadoop.
$ whoami
Sean Roberts
Partner Solutions Engineer
London, EMEA &
everywhere
@seano
linkedin.com/in/seanorama
MacGyver. Data Freak. Cook.
Autodidact. Volunteer. Ancestral
Health. Fito. Couchsurfer. Nomad
- HDP 2.3
- http://hortonworks.com/
- Hadoop Summit recordings:
- http://2015.hadoopsummit.org/san-jose/
- http://2015.hadoopsummit.org/brussels/
- Past & Future workshops:
- http://hortonworks.com/partners/learn/
What’s New!
Agenda
● Hadoop Everywhere
● Deployment challenges & requirements
● Cloudbreak & our Docker approach
● Workshop: Your own CloudBreak
○ And auto-scaling with Periscope
● Cloud best practices
Reminder:
● Attendee phone lines are muted
● Please ask questions in the chat
Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Disclaimer
This document may contain product features and technology directions that are under
development, may be under development in the future or may ultimately not be
developed.
Project capabilities are based on information that is publicly available within the Apache
Software Foundation project websites ("Apache"). Progress of the project capabilities
can be tracked from inception to release through Apache; however, technical feasibility,
market demand, user feedback and the overarching Apache Software Foundation
community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not
represent a contractual commitment, promise or obligation from Hortonworks to deliver
these features in any generally available product.
Product features and technology directions are subject to change, and must not be
included in contracts, purchase orders, or sales agreements of any kind.
Since this document contains an outline of general product development plans,
customers should not rely upon it when making purchasing decisions.
Hadoop Everywhere
Any application
Batch, interactive, and real-time
Any data
Existing and new datasets
Anywhere
Complete range of deployment options
Commodity Appliance Cloud
YARN: data operating system
Existing applications
New analytics
Partner applications
Data access: batch, interactive, real-time
Hadoop Everywhere
Hybrid Deployment Choice
Windows, Linux, On-Premise or Cloud
Data “gravity” guides choice
Compatible Clusters
Run applications and data processing
workloads wherever and whenever
needed
Replicated Datasets
Democratize Hadoop data access via
automated sharing of datasets using
Apache Falcon
Hadoop Up There, Down Here...Everywhere!
Dev / Test BI / ML
IoT Apps
On-Premises
Use case | Where?
Active Archive / Compliance Reporting | Sensitive data = “down here”; “up there” valid for many scenarios
ETL / Data Warehouse Optimization | Usually has “down here” gravity; DW in cloud is changing that
Smart Meter Analysis | Data typically flows “up there”
Single View of Customer | May have “down here” gravity, unless you’re using SaaS apps
Supply Chain Optimization | May have heavy “down here” gravity
New Data for Product Management | “Up there” could be considered for many scenarios
Vehicle Data for Transportation/Logistics | Why not “up there”?
Vehicle Data for Insurance | May have “down here” gravity (e.g. join with existing risk data)
Anywhere? Up There or Down Here?
Deployment
Challenges & Requirements
Deployment challenges
● Infrastructure is different everywhere
○ e.g. Each cloud provider has their own API
○ e.g. Each provider has different networking methods
● OS/images are different everywhere
● How to do service discovery?
● How to dynamically scale/manage?
See prior operations workshops
- Infrastructure
- Operating System
- Environment Prepared (see docs)
- Ambari Agent/Server installed & registered
- Deploy HDP Cluster
- Ambari Blueprints or Cluster Wizard
- Ongoing configuration/management
Deployment requirements
Options for Automation
- Many combinations of tools
- e.g. Foreman, Ansible, Chef, Puppet, docker-ambari,
shell scripts, CloudFormation, …
- Provider specific
- Cisco UCS, Teradata, HP, Google’s bdutil, …
- Docker with Cloudbreak
Using Ambari with all of the above!
https://github.com/seanorama/ambari-bootstrap/
Demo: Basic script-based example
https://github.com/seanorama/ambari-bootstrap
Requirements:
● Infrastructure prepped (see HDP docs)
● Nodes with RedHat EL or CentOS 6 systems
● HDFS paths mounted (see HDP docs)
● sudo or root access
ambari-bootstrap
After Ambari deployment
● (optional) Configure local YUM/APT repos
● Deploy HDP with Ambari Wizard or Blueprint
● Ongoing configuration/management
Using Ansible
https://github.com/rackerlabs/ansible-hadoop
Build once. Deploy anywhere.
Docker
Docker is a “Shipping Container” System for Code
Multiplicity of stacks: Static website, Web frontend, User DB, Queue, Analytics DB
Multiplicity of hardware environments: Development VM, QA server, Public Cloud, Contributor’s laptop, Production Cluster, Customer Data Center
An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container
Docker
• Container based virtualization
• Lightweight and portable
• Build once, run anywhere
• Ease of packaging applications
• Automated and scripted
• Isolated
Why Is Docker So Exciting?
For Developers:
Build once…run anywhere
• A clean, safe, and portable runtime
environment for your app.
• No missing dependencies, packages etc.
• Run each app in its own isolated container
• Automate testing, integration, packaging
• Reduce/eliminate concerns about
compatibility on different platforms
• Cheap, zero-penalty containers to deploy
services
For DevOps:
Configure once…run anything
• Make the entire lifecycle more efficient,
consistent, and repeatable
• Eliminate inconsistencies between SDLC
stages
• Support segregation of duties
• Significantly improves the speed and reliability of CI/CD
• Much more lightweight than VMs
More Technical Explanation
WHY
• Run on any Linux
• Regardless of kernel version (2.6.32+)
• Regardless of host distro
• Physical or virtual, cloud or not
• Container and host architecture must match
• Run anything
• If it can run on the host, it can run in the container
• i.e. if it can run on a Linux kernel, it can run
WHAT
• High Level: it’s a lightweight VM
• Own process space
• Own network interface
• Can run stuff as root
• Low Level: it’s chroot on steroids
• Container = isolated processes
• Share kernel with host
• No device emulation (neither HVM nor PV) from host
Docker - How it works
[Diagram: a VM stack (Server, Host OS, Type 2 Hypervisor, then one Guest OS plus Bins/Libs per app: App A, App A’, App B) next to a container stack (Server, Host OS kernel, Docker, then App A and App B containers sharing bins/libs)]
Containers are isolated but share the OS and bins/libraries
…result is significantly faster deployment, much less overhead, easier migration, faster restart
Cloudbreak
Tool for Provisioning and Managing Hadoop Clusters in the Cloud
Cloudbreak
• Developed by SequenceIQ
• Open source with Apache 2.0 license [ Apache project soon ]
• Cloud and infrastructure agnostic, cost effective Hadoop As-a-Service platform API.
• Elastic – can spin up any number of nodes, add/remove on the fly
• Provides full cloud lifecycle management post-deployment
Key Features of Cloudbreak
Elastic
• Enables provisioning a cluster with an arbitrary number of nodes
• Enables (de)commissioning nodes from the cluster
• Policy and time based scaling of the cluster
Flexible
• Declarative and flexible Hadoop cluster creation using blueprints
• Provision to multiple public cloud providers or an OpenStack based private cloud using the same common API
• Access all of this functionality through a rich UI, secured REST API or automatable shell
Enterprise-ready
• Supports basic, token based and OAuth2 authentication models
• The cluster is provisioned in a logically isolated network
• Tracking usage and cluster metrics
BI / Analytics
(Hive)
IoT Apps
(Storm, HBase, Hive)
Launch HDP on Any Cloud for Any Application
Dev / Test
(all HDP services)
Data Science
(Spark)
Cloudbreak
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!
Example Ambari Blueprints:
IoT Apps, BI / Analytics, Data Science, Dev /
Test
Cloudbreak Approach
• Use Ambari for heavy lifting
• Provisioning of Hadoop services
• Monitoring
• Use Ambari Blueprints
• Assign Host groups to physical instance types
• Public/Private Cloud provider API abstracted
• Azure/Google/Amazon/Openstack
• Run Ambari agent/server in Docker container
• Networking: docker run --net=host
• Service discovery: consul (previously serf)
Workshop: Your own Cloudbreak
cloudbreak-deployer
● https://github.com/sequenceiq/cloudbreak-deployer
Requirements:
● A Docker host (laptop, server or Cloud infrastructure)
● Resources:
○ Very little. Tested with 2GB of RAM.
Workshop: Your Own CloudBreak
Requirement: a Docker host
● OSX or Windows: http://boot2docker.io/
○ boot2docker init
○ boot2docker up
○ eval "$(boot2docker shellinit)"
○ boot2docker ssh
● Linux: Install the docker daemon
● Anywhere: docker-machine “lets you create Docker hosts on your
computer, on cloud providers, and inside your own data center”
○ Example on Rackspace:
■ docker-machine create --driver rackspace \
--rackspace-api-key $OS_PASSWORD \
--rackspace-username $OS_USERNAME \
--rackspace-region DFW docker-rax
■ docker-machine ssh docker-rax
Install cloudbreak-deployer
https://github.com/sequenceiq/cloudbreak-deployer
● curl https://raw.githubusercontent.com/sequenceiq/cloudbreak-deployer/master/install | sh && cbd --version
● cbd init
● cbd start
You’ll then have your own CloudBreak & Periscope server
with API and Web UI
Done: Your own Cloudbreak
Deploy a cluster with your CloudBreak
Documentation:
http://sequenceiq.com/cloudbreak/#cloudbreak-credentials
1. Add Credentials
2. Create Cluster
3. Use your Cluster
Ambari available as expected
To reach your Hadoop hosts:
● SSH to Docker Host
○ Hosts are listed in “Cloud stack description”
○ ssh cloudbreak@IPofHost
● Shell to the “ambari-agent” container
○ sudo docker ps | grep ambari-agent
■ note the CONTAINER ID
○ sudo docker exec -it CONTAINERID bash
● Use the hosts as usual. e.g.:
○ hadoop fs -ls /
Cloudbreak internals
Cloudbreak Internals
[Diagram of components:]
• Browser
• Uluwatu (Cloudbreak UI)
• Sultans (user management UI)
• Cloudbreak shell
• OAuth2 (UAA) with uaa-db (psql)
• Cloudbreak (REST API) with cb-db (psql)
• Periscope (autoscaling) with ps-db (psql)
• consul, registrator, ambassador
• docker
Docker
Swarm
• Native clustering for Docker
• Distributed container orchestration
• Same API as Docker
Swarm – How it works
• Swarm managers/agents
• Discovery services
• Advanced scheduling
Consul
• Service discovery/registry
• Health checking
• Key/Value store
• DNS
• Multi datacenter aware
Consul – How it works
• Consul servers/agents
• Consistency through a quorum (RAFT)
• Scalability due to gossip based protocol (SWIM)
• Decentralized and fault tolerant
• Highly available
• Consistency over availability (CP)
• Multiple interfaces - HTTP and DNS
• Support for watches
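The HTTP and DNS interfaces listed above can be sketched as follows. This is a minimal illustration, not taken from the slides: it assumes Consul's default local HTTP port (8500) and default DNS domain (`consul`), and the service name `ambari-8080` is hypothetical.

```python
from urllib.parse import quote


def consul_catalog_url(service, host="127.0.0.1", port=8500):
    """Build the HTTP catalog URL that lists the nodes providing a service.

    host and port are assumptions (the default address of a local agent).
    """
    return f"http://{host}:{port}/v1/catalog/service/{quote(service, safe='')}"


def consul_dns_name(service, domain="consul"):
    """Build the DNS name under which Consul answers for a service."""
    return f"{service}.service.{domain}"


if __name__ == "__main__":
    # "ambari-8080" is a hypothetical service name, used only for illustration.
    print(consul_catalog_url("ambari-8080"))
    print(consul_dns_name("ambari-8080"))
```

Both functions only build names; querying them still requires a running Consul agent.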
Apache Ambari
• Easy Hadoop cluster provisioning
• Management and monitoring
• Key feature - Blueprints
• REST API, CLI shell
• Extensible
• Stacks
• Services
• Views
Apache Ambari – How it works
• Ambari server/agents
• Define a blueprint (blueprint.json)
• Define a host mapping (hostmapping.json)
• Post the cluster create
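The three steps above can be sketched as two JSON documents. This is a hedged illustration, not the presenter's files: the blueprint name, host names, password placeholder and the very small component list are all hypothetical, and a real HDP blueprint needs many more components. The code only builds the documents; in practice they would be POSTed to Ambari's REST API (`/api/v1/blueprints/<name>` and `/api/v1/clusters/<name>`).

```python
import json


def make_blueprint(name, stack_version, host_groups):
    """Return a blueprint document (the content of blueprint.json).

    host_groups maps a host group name to the component names
    Ambari should install on every host of that group.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {"name": group, "components": [{"name": c} for c in components]}
            for group, components in host_groups.items()
        ],
    }


def make_host_mapping(blueprint_name, hosts_by_group, default_password):
    """Return a cluster creation document (the content of hostmapping.json)."""
    return {
        "blueprint": blueprint_name,
        "default_password": default_password,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": h} for h in hosts]}
            for group, hosts in hosts_by_group.items()
        ],
    }


if __name__ == "__main__":
    # All names below are hypothetical and chosen only for illustration.
    blueprint = make_blueprint(
        "demo-blueprint",
        "2.3",
        {
            "master": ["NAMENODE", "ZOOKEEPER_SERVER"],
            "worker": ["DATANODE"],
        },
    )
    mapping = make_host_mapping(
        "demo-blueprint",
        {"master": ["master1.example.com"], "worker": ["worker1.example.com"]},
        default_password="change-me",
    )
    print(json.dumps(blueprint, indent=2))
    print(json.dumps(mapping, indent=2))
```

The host group names are what tie the two documents together: every group in the host mapping must exist in the blueprint it references.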
Run Hadoop as Docker containers
HDP as Docker
Containers
via Cloudbreak
• Fully Automated Ambari Cluster installation
• Avoid GUI, use rest API only (ambari-shell)
• Fully Automated HDP installation with blueprints
• Quick installation (pre-pulled rpms)
• Same process/images for dev/qa/prod
• Same process for single/multinode
[Diagram: Cloudbreak, Ambari and HDP running in Docker on VMs or Linux hosts, on a cloud provider or bare metal]
Cloudbreak:
• Provisions VMs from cloud providers
• Installs Ambari on the VMs
• Instructs Ambari to build the HDP cluster
Provisioning – How it works
1. Start VMs with a running Docker daemon
2. Cloudbreak Bootstrap
• Start Consul cluster
• Start Swarm cluster (Consul for discovery)
3. Start Ambari servers/agents (Swarm API)
4. Ambari services registered in Consul (Registrator)
5. Post Blueprint
Cloudbreak
Run Hadoop as Docker containers
[Diagram, built up over four slides: six Docker hosts; one runs the Ambari server container (amb-ser) and five run Ambari agent containers (amb-agn); once the Blueprint is posted, the agents run the following services]
• amb-agn: hdfs, hbase
• amb-agn: hdfs, hive
• amb-agn: hdfs, yarn
• amb-agn: hdfs, zookpr
• amb-agn: nmnode, hdfs
Workshop: Auto-Scale your Cluster
with Periscope
Optimize Cloud Usage via Elastic HDP Clusters
Dev / Test
Auto-scaling
Policy
• Policies based on any Ambari metrics
• Dynamically scale to achieve physical elasticity
• Coordinates with YARN to achieve elasticity based on
the policies.
Scaling for Static and Dynamic Clusters
[Diagram: for both static and dynamic clusters, Ambari metrics and Ambari alerts feed Cloudbreak/Periscope, which enforces the auto-scale policies and scales the cluster and YARN apps; Cloudbreak handles provisioning]
Scale by Ambari Monitoring Metric
1. Ambari: review metric
2. CloudBreak: set alert
3. Cloudbreak: set scaling policy
Scale up/down by time
1. Set time-based alert
2. Set scaling policy
Repeat with an alert
and policy which
scales down
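The alert-plus-policy pairing described above can be sketched as follows. This is an illustrative model only: the class names, fields and thresholds are hypothetical and are not Periscope's actual API or data model.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Fires when a metric crosses a threshold (hypothetical model)."""
    name: str
    threshold: float
    above: bool  # True: fire when the metric exceeds the threshold

    def fires(self, value):
        return value > self.threshold if self.above else value < self.threshold


@dataclass
class ScalingPolicy:
    """Adjusts the node count when its alert fires, within fixed bounds."""
    alert: Alert
    adjustment: int  # nodes to add (positive) or remove (negative)
    min_nodes: int
    max_nodes: int

    def target_size(self, current_nodes, metric_value):
        if not self.alert.fires(metric_value):
            return current_nodes
        wanted = current_nodes + self.adjustment
        # Clamp so the cluster never leaves the allowed size range.
        return max(self.min_nodes, min(self.max_nodes, wanted))


if __name__ == "__main__":
    # One policy scales up, a second one scales down, as the slide suggests.
    scale_up = ScalingPolicy(Alert("pending containers high", 10, True), 2, 3, 10)
    scale_down = ScalingPolicy(Alert("pending containers low", 1, False), -1, 3, 10)
    print(scale_up.target_size(4, 25))
    print(scale_down.target_size(5, 0))
```

Keeping the scale-up and scale-down rules as separate policies mirrors the "repeat with an alert and policy which scales down" step, and the min/max bounds stop either policy from running away.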
Roadmap
Release Summary
Cloudbreak
● Its own project (separate from Ambari)
● Supported on Linux flavors which support Docker
Periscope
● Feature of Cloudbreak 1.0
● Will be embedded in Ambari later in 2015
Release Timeline
Cloudbreak 1.0 GA: June/July 2015
Cloudbreak 1.1: August 2015 (est)
Cloudbreak 2.0 GA: 2H2015
Cloudbreak Incubator Proposal: July/August 2015 (est)
Ambari 2.1.0: HDP “Dal” / 2.3
Ambari 2.1.1: HDP “Dal-M10”
Ambari 2.2: HDP “Erie” / 2.4
Supported Cloud Environments
Cloudbreak HDP 2.3
Microsoft Azure: GA
AWS: GA
Google Compute: GA
Cloudbreak HDP 2.3 / Cloudbreak HDP 2.4
OpenStack Community: Tech Preview / Tech Preview
Red Hat OSP: TBD
HP Helion: GA (Tentative)
Mirantis OpenStack
HDP as a Service
Hortonworks Data Platform On Azure
Rackspace
Cloud Big Data Platform
● Rapidly spin up on-demand HDP clusters
● Integrated with Cloud Files (OpenStack Swift)
● Opt-in for Managed Services by Rackspace
Managed Big Data Platform
● Fully Managed HDP on Dedicated and/or Cloud
● Leverage Fanatical Support and Industry Leading SLAs
● Supported by Rackspace with escalation to Hortonworks
CSC
HDP on IaaS - Best Practices
Microsoft Azure
● Deployment
○ Deploy using CloudBreak
○ Deploy using HWX Azure Gallery Image
● Integrated with Azure Blob Storage
● Supported directly by Hortonworks
● Other offerings
○ Microsoft HDInsight
○ HDP Sandbox
Azure Deployment Guideline
● All in same Region
● Instance Types
○ Typical: A7
○ Performance: D14
○ 8x1TB Standard LRS x3 Virtual Hard Disk per server
● Multiple Storage Accounts are recommended
○ Recommend no more than 40 Virtual Hard Disks per Storage Account
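As a worked example of the disk limits above, the sketch below estimates how many Storage Accounts a cluster needs. It assumes that only the 8 data disks per server count toward the 40-disk recommendation (OS disks are ignored) and that disks can be packed freely across accounts; both are simplifying assumptions, not guidance from the slides.

```python
import math

DISKS_PER_SERVER = 8         # from the guideline: 8 x 1 TB VHDs per server
MAX_DISKS_PER_ACCOUNT = 40   # from the guideline: at most 40 VHDs per account


def storage_accounts_needed(servers,
                            disks_per_server=DISKS_PER_SERVER,
                            max_disks_per_account=MAX_DISKS_PER_ACCOUNT):
    """Return the minimum number of Storage Accounts for the given servers."""
    if servers < 0:
        raise ValueError("servers must not be negative")
    total_disks = servers * disks_per_server
    return math.ceil(total_disks / max_disks_per_account)


if __name__ == "__main__":
    for n in (5, 6, 20):
        print(n, "servers ->", storage_accounts_needed(n), "storage account(s)")
```

With these numbers one account covers five servers, so a 20-node cluster would spread its disks over four accounts.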
Azure Blob Store
Azure Blob Store (Object Storage)
● wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>
Can be used as a replacement for HDFS
● Thoroughly tested in HDP release test suites
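The URI pattern above can be assembled with a small helper. This is a sketch only; the container name, account name and path in the example are hypothetical.

```python
def wasb_uri(container, account, path="", secure=True):
    """Build a wasb:// or wasbs:// URI for Azure Blob Storage.

    secure=True selects the wasbs scheme (the [s] in the pattern).
    """
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # "data" and "myaccount" are hypothetical names.
    print(wasb_uri("data", "myaccount", "/logs/2015"))
```

The resulting string can be passed wherever a Hadoop path is accepted, for example as the argument to `hadoop fs -ls`.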
Amazon Web Services
● Deploy using CloudBreak
● Integrated with AWS S3 (object storage)
● Supported directly by Hortonworks
Amazon Deployment Guideline
● All in same Region/AZ
● Instances with Enhanced Networking
Master Nodes:
● Choose EBS Optimized
● Boot: 100GB on EBS
● Data: 4+ 1TB on EBS
Worker Nodes:
● Boot: 100GB on EBS
● Data: Instance Storage
○ EBS can be used, but local is preferred
Instance Types:
● Typical: d2.
● Performance: i2.
https://aws.amazon.com/ec2/instance-types/
AWS RDS
● Some services rely on MySQL, Oracle or PostgreSQL:
○ Apache Ambari
○ Apache Hive
○ Apache Oozie
○ Apache Ranger
● Use RDS for these instead of managing yourself.
AWS S3 (Object Storage)
● s3n:// with HDP 2.2 (Hadoop 2.6)
● s3a:// with HDP 2.3 (Hadoop 2.7)
Not currently a direct replacement for HDFS
Recommended to configure access with IAM Role/Policy
● https://docs.aws.amazon.com/IAM/latest/UserGuide/policies_examples.html#iam-policy-example-s3
● Example: http://git.io/vLoGY
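To make the IAM recommendation concrete, here is a minimal sketch of a policy document granting a cluster access to a single bucket. It is not the content of the linked examples: the bucket name is hypothetical and the set of actions is an assumption that should be narrowed or widened to fit the workload.

```python
import json


def s3_bucket_policy(bucket):
    """Return an IAM policy document allowing read/write access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object operations apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-hdp-bucket" is a hypothetical bucket name.
    print(json.dumps(s3_bucket_policy("example-hdp-bucket"), indent=2))
```

Attaching such a policy to an IAM role on the instances avoids placing access keys in the Hadoop configuration.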
Google Cloud
● Deploy using
○ CloudBreak
○ Google bdutil with Apache Ambari plug-in
● Integrated with Google Cloud Storage
● Supported directly by Hortonworks
Google Deployment Guideline
● Instance Types
○ Typical: n1-standard-4 with a single 1.5 TB persistent disk
○ Performance: n1-standard-8 with 1TB SSD
● Google GCS (Object Storage)
● gs://<CONFIGBUCKET>/dir/file
● Not currently a replacement for HDFS
S3 & GCS as Secondary storage system
The connectors are currently eventually consistent so do not replace HDFS
Backup
● Falcon, distCP, hadoop fs, HBase ExportSnapshot
● Kafka+Storm bolt sends messages to S3/GCS, providing a backup & point-in-time recovery source
Input/Output
● Convenient & broadly used upload/download method
○ As a middleware to ease integration with Hadoop & limit access
● Publishing static content (optionally with CloudFront)
○ Removes need to manage any web services
● Storage for temporary/ephemeral clusters
Questions
$ shutdown -h now

Weitere ähnliche Inhalte

Was ist angesagt?

A First-Hand Look at What's New in HDP 2.3
A First-Hand Look at What's New in HDP 2.3 A First-Hand Look at What's New in HDP 2.3
A First-Hand Look at What's New in HDP 2.3 DataWorks Summit
 
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...DataWorks Summit
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3Hortonworks
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...DataWorks Summit
 
20151027 sahara + manila final
20151027 sahara + manila final20151027 sahara + manila final
20151027 sahara + manila finalWei Ting Chen
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit
 
How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHortonworks
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformInMobi Technology
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationDataWorks Summit
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopHortonworks
 
How to deploy Apache Spark in a multi-tenant, on-premises environment
How to deploy Apache Spark in a multi-tenant, on-premises environmentHow to deploy Apache Spark in a multi-tenant, on-premises environment
How to deploy Apache Spark in a multi-tenant, on-premises environmentBlueData, Inc.
 
Bare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containersBare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containersBlueData, Inc.
 
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...DataWorks Summit
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopDataWorks Summit/Hadoop Summit
 
What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014InMobi Technology
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiDataWorks Summit
 

Was ist angesagt? (20)

A First-Hand Look at What's New in HDP 2.3
A First-Hand Look at What's New in HDP 2.3 A First-Hand Look at What's New in HDP 2.3
A First-Hand Look at What's New in HDP 2.3
 
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
Driving in the Desert - Running Your HDP Cluster with Helion, Openstack, and ...
 
Hybrid is the New Normal
Hybrid is the New NormalHybrid is the New Normal
Hybrid is the New Normal
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
 
20151027 sahara + manila final
20151027 sahara + manila final20151027 sahara + manila final
20151027 sahara + manila final
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
How to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDBHow to Use Apache Zeppelin with HWX HDB
How to Use Apache Zeppelin with HWX HDB
 
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
 
Machine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to ImplementationMachine Learning Model Deployment: Strategy to Implementation
Machine Learning Model Deployment: Strategy to Implementation
 
Containers and Big Data
Containers and Big DataContainers and Big Data
Containers and Big Data
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
 
How to deploy Apache Spark in a multi-tenant, on-premises environment
How to deploy Apache Spark in a multi-tenant, on-premises environmentHow to deploy Apache Spark in a multi-tenant, on-premises environment
How to deploy Apache Spark in a multi-tenant, on-premises environment
 
Bare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containersBare-metal performance for Big Data workloads on Docker containers
Bare-metal performance for Big Data workloads on Docker containers
 
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi...
 
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to HadoopSuccesses, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
Successes, Challenges, and Pitfalls Migrating a SAAS business to Hadoop
 
What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014What's new in Hadoop Yarn- Dec 2014
What's new in Hadoop Yarn- Dec 2014
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
 

Andere mochten auch

Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labsApache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labsViswanath Gangavaram
 
Pivotal HD 3.0 설치가이드
Pivotal HD 3.0 설치가이드Pivotal HD 3.0 설치가이드
Pivotal HD 3.0 설치가이드seungdon Choi
 
Introduction to Apache Pig
Introduction to Apache PigIntroduction to Apache Pig
Introduction to Apache PigAvkash Chauhan
 
Zookeeper In Simple Words
Zookeeper In Simple WordsZookeeper In Simple Words
Zookeeper In Simple WordsFuqiang Wang
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperknowbigdata
 
[233]멀티테넌트하둡클러스터 남경완
[233]멀티테넌트하둡클러스터 남경완[233]멀티테넌트하둡클러스터 남경완
[233]멀티테넌트하둡클러스터 남경완NAVER D2
 
Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperAlex Ehrnschwender
 
Introduction to Apache Pig
Introduction to Apache PigIntroduction to Apache Pig
Introduction to Apache PigJason Shao
 
[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영[236] 카카오의데이터파이프라인 윤도영
[236] 카카오의데이터파이프라인 윤도영NAVER D2
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민NAVER D2
 
[112]rest에서 graph ql과 relay로 갈아타기 이정우
[112]rest에서 graph ql과 relay로 갈아타기 이정우[112]rest에서 graph ql과 relay로 갈아타기 이정우
[112]rest에서 graph ql과 relay로 갈아타기 이정우NAVER D2
 
[123] electron 김성훈
[123] electron 김성훈[123] electron 김성훈
[123] electron 김성훈NAVER D2
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 
[1A6]Docker로 보는 서버 운영의 미래
[1A6]Docker로 보는 서버 운영의 미래[1A6]Docker로 보는 서버 운영의 미래
[1A6]Docker로 보는 서버 운영의 미래NAVER D2
 
[221] docker orchestration
[221] docker orchestration[221] docker orchestration
[221] docker orchestrationNAVER D2
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Sparkdatamantra
 

Hadoop Everywhere & Cloudbreak

  • 2. $ whoami Sean Roberts Partner Solutions Engineer London, EMEA & everywhere @seano linkedin.com/in/seanorama MacGyver. Data Freak. Cook. Autodidact. Volunteer. Ancestral Health. Fito. Couchsurfer. Nomad
  • 3. What’s New! HDP 2.3: http://hortonworks.com/; Hadoop Summit recordings: http://2015.hadoopsummit.org/san-jose/ and http://2015.hadoopsummit.org/brussels/; Past & future workshops: http://hortonworks.com/partners/learn/
  • 4. Agenda ● Hadoop Everywhere ● Deployment challenges & requirements ● Cloudbreak & our Docker approach ● Workshop: Your own CloudBreak ○ And auto-scaling with Periscope ● Cloud best practices Reminder: ● Attendee phone lines are muted ● Please ask questions in the chat
  • 5. Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Disclaimer This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately not be developed. Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery. This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product. Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
  • 7. Hadoop Everywhere. Any application: batch, interactive, and real-time. Any data: existing and new datasets. Anywhere: complete range of deployment options (commodity, appliance, cloud). YARN: data operating system. Existing applications, new analytics, partner applications. Data access: batch, interactive, real-time.
  • 8. Hadoop Up There, Down Here...Everywhere! Hybrid Deployment Choice: Windows, Linux, on-premise or cloud; data “gravity” guides choice. Compatible Clusters: run applications and data processing workloads wherever and whenever needed. Replicated Datasets: democratize Hadoop data access via automated sharing of datasets using Apache Falcon. Diagram labels: Dev / Test, BI / ML, IoT Apps, On-Premises.
  • 9. Anywhere? Up There or Down Here? Use cases and where they tend to run:
      ○ Active Archive / Compliance Reporting: sensitive data = “down here”; “up there” valid for many scenarios
      ○ ETL / Data Warehouse Optimization: usually has “down here” gravity; DW in cloud is changing that
      ○ Smart Meter Analysis: data typically flows “up there”
      ○ Single View of Customer: may have “down here” gravity, unless you’re using SaaS apps
      ○ Supply Chain Optimization: may have heavy “down here” gravity
      ○ New Data for Product Management: “up there” could be considered for many scenarios
      ○ Vehicle Data for Transportation/Logistics: why not “up there”?
      ○ Vehicle Data for Insurance: may have “down here” gravity (e.g. join with existing risk data)
  • 11. Deployment challenges ● Infrastructure is different everywhere ○ e.g. Each cloud provider has their own API ○ e.g. Each provider has different networking methods ● OS/images are different everywhere ● How to do service discovery? ● How to dynamically scale/manage? See prior operations workshops
  • 12. Deployment requirements - Infrastructure - Operating System - Environment prepared (see docs) - Ambari Agent/Server installed & registered - Deploy HDP cluster: Ambari Blueprints or Cluster Wizard - Ongoing configuration/management
  • 13. Options for Automation - Many combinations of tools - e.g. Foreman, Ansible, Chef, Puppet, docker-ambari, shell scripts, CloudFormation, … - Provider specific - Cisco UCS, Teradata, HP, Google’s bdutil, … - Docker with Cloudbreak Using Ambari with all of the above!
  • 15. ambari-bootstrap: https://github.com/seanorama/ambari-bootstrap Requirements: ● Infrastructure prepped (see HDP docs) ● Nodes with RedHat EL or CentOS 6 systems ● HDFS paths mounted (see HDP docs) ● sudo or root access
  • 16. After Ambari deployment ● (optional) Configure local YUM/APT repos ● Deploy HDP with Ambari Wizard or Blueprint ● Ongoing configuration/management
  • 18. Build once. Deploy anywhere. Docker
  • 20. Docker is a “Shipping Container” System for Code. Multiplicity of stacks: static website, web frontend, user DB, queue, analytics DB. Multiplicity of hardware environments: development VM, QA server, public cloud, contributor’s laptop, production cluster, customer data center. An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container.
  • 21. Docker • Container based virtualization • Lightweight and portable • Build once, run anywhere • Ease of packaging applications • Automated and scripted • Isolated
  • 22. Why Is Docker So Exciting? For Developers: Build once…run anywhere • A clean, safe, and portable runtime environment for your app. • No missing dependencies, packages etc. • Run each app in its own isolated container • Automate testing, integration, packaging • Reduce/eliminate concerns about compatibility on different platforms • Cheap, zero-penalty containers to deploy services For DevOps: Configure once…run anything • Make the entire lifecycle more efficient, consistent, and repeatable • Eliminate inconsistencies between SDLC stages • Support segregation of duties • Significantly improves the speed and reliability of CI/CD • Significantly more lightweight than VMs
  • 23. More Technical Explanation. WHY: • Run on any Linux • Regardless of kernel version (2.6.32+) • Regardless of host distro • Physical or virtual, cloud or not • Container and host architecture must match • Run anything • If it can run on the host, it can run in the container • i.e. if it can run on a Linux kernel, it can run. WHAT: • High level: it’s a lightweight VM • Own process space • Own network interface • Can run stuff as root • Low level: it’s chroot on steroids • Container = isolated processes • Share kernel with host • No device emulation (neither HVM nor PV)
  • 24. Docker - How it works. [Diagram: VM vs. container. VM stack: Server, Host OS, Hypervisor (Type 2), then one Guest OS with its own Bins/Libs per app (App A, App A’, App B). Container stack: Server, Host OS kernel, Docker, then App A and App B containers with their bins/libs.] Containers are isolated. Share OS and bins/libraries. …result is significantly faster deployment, much less overhead, easier migration, faster restart
  • 25. Cloudbreak: Tool for Provisioning and Managing Hadoop Clusters in the Cloud
  • 26. Cloudbreak • Developed by SequenceIQ • Open source with Apache 2.0 license [ Apache project soon ] • Cloud and infrastructure agnostic, cost effective Hadoop As-a-Service platform API. • Elastic: can spin up any number of nodes, add/remove on the fly • Provides full cloud lifecycle management post-deployment
  • 27. Key Features of Cloudbreak. Elastic • Enables provisioning a cluster with an arbitrary number of nodes • Enables (de)commissioning nodes from a cluster • Policy and time based scaling of the cluster. Flexible • Declarative and flexible Hadoop cluster creation using blueprints • Provision to multiple public cloud providers or an OpenStack based private cloud using the same common API • Access all of this functionality through a rich UI, secured REST API or automatable shell. Enterprise-ready • Supports basic, token based and OAuth2 authentication models • The cluster is provisioned in a logically isolated network • Tracking usage and cluster metrics
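The "policy based scaling" feature above can be pictured with a small threshold rule. The following is only a sketch under stated assumptions: the `ScalingPolicy` fields, the `desired_nodes` function and the choice of metric are hypothetical names invented for illustration, and none of this is Periscope's actual API or policy model.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy: scale when a metric exceeds a threshold."""
    metric_threshold: float  # e.g. number of pending YARN containers (assumed metric)
    adjustment: int          # nodes to add (positive) or remove (negative)
    min_nodes: int           # never shrink below this size
    max_nodes: int           # never grow beyond this size
    cooldown_s: int          # minimum seconds between two scaling actions


def desired_nodes(policy, current_nodes, metric_value, seconds_since_last_scale):
    """Return the node count the cluster should have after evaluating the policy."""
    # Still cooling down from the previous scaling action: do nothing.
    if seconds_since_last_scale < policy.cooldown_s:
        return current_nodes
    # Metric is within bounds: do nothing.
    if metric_value <= policy.metric_threshold:
        return current_nodes
    # Apply the adjustment, clamped to the allowed cluster size.
    target = current_nodes + policy.adjustment
    return max(policy.min_nodes, min(policy.max_nodes, target))
```

A time based policy could reuse the same clamp and cooldown logic, with a schedule check in place of the metric comparison.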
  • 28. Launch HDP on Any Cloud for Any Application. Cloudbreak: 1. Pick a Blueprint 2. Choose a Cloud 3. Launch HDP! Example Ambari Blueprints: IoT Apps (Storm, HBase, Hive), BI / Analytics (Hive), Data Science (Spark), Dev / Test (all HDP services)
  • 29. Cloudbreak Approach • Use Ambari for heavy lifting • Provisioning of Hadoop services • Monitoring • Use Ambari Blueprints • Assign host groups to physical instance types • Public/private cloud provider API abstracted • Azure/Google/Amazon/OpenStack • Run Ambari agent/server in Docker container • Networking: docker run --net=host • Service discovery: consul (previously serf)
  • 30. Workshop: Your own Cloudbreak
  • 31. Workshop: Your Own CloudBreak. cloudbreak-deployer ● https://github.com/sequenceiq/cloudbreak-deployer Requirements: ● A Docker host (laptop, server or Cloud infrastructure) ● Resources: ○ Very little. Tested with 2GB of RAM.
  • 32. Requirement: a Docker host ● OSX or Windows: http://boot2docker.io/ ○ boot2docker init ○ boot2docker up ○ eval "$(boot2docker shellinit)" ○ boot2docker ssh ● Linux: Install the docker daemon ● Anywhere: docker-machine “lets you create Docker hosts on your computer, on cloud providers, and inside your own data center” ○ Example on Rackspace: ■ docker-machine create --driver rackspace --rackspace-api-key $OS_PASSWORD --rackspace-username $OS_USERNAME --rackspace-region DFW docker-rax ■ docker-machine ssh docker-rax
  • 33. Install cloudbreak-deployer https://github.com/sequenceiq/cloudbreak-deployer ● curl https://raw.githubusercontent.com/sequenceiq/cloudbreak-deployer/master/install | sh && cbd --version ● cbd init ● cbd start You’ll then have your own CloudBreak & Periscope server with API and Web UI
  • 34. Done: Your own Cloudbreak
  • 35. Deploy a cluster with your CloudBreak
  • 38. 3. Use your Cluster Ambari available as expected To reach your Hadoop hosts: ● SSH to Docker Host ○ Hosts are listed in “Cloud stack description” ○ ssh cloudbreak@IPofHost ● Shell to the “ambari-agent” container ○ sudo docker ps | grep ambari-agent ■ note the CONTAINER ID ○ sudo docker exec -it CONTAINERID bash ● Use the hosts as usual. e.g.: ○ hadoop fs -ls /
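The "find the ambari-agent container, then open a shell in it" step can be sketched as a small helper that parses `docker ps` output and builds the `docker exec` command line. This is a minimal sketch: the function names are hypothetical, and the sample `docker ps` output (IDs, image names, container names) is made up for illustration, not taken from a real Cloudbreak host.

```python
# Fabricated sample of `sudo docker ps` output, for illustration only.
SAMPLE_PS_OUTPUT = """\
CONTAINER ID   IMAGE                   COMMAND          CREATED       STATUS       PORTS   NAMES
3f2a9c1b7d4e   example/ambari:latest   "/start-agent"   2 hours ago   Up 2 hours           ambari-agent-1
9b8c7d6e5f4a   example/consul:latest   "/bin/start"     2 hours ago   Up 2 hours           consul-1
"""


def find_container_ids(ps_output, needle):
    """Return the CONTAINER ID of every `docker ps` row that mentions `needle`."""
    ids = []
    for line in ps_output.splitlines():
        # Skip the header row and blank lines.
        if not line.strip() or line.startswith("CONTAINER ID"):
            continue
        if needle in line:
            # The container ID is the first whitespace-separated column.
            ids.append(line.split()[0])
    return ids


def exec_command(container_id):
    """Build the argument list for an interactive bash shell in the container."""
    return ["sudo", "docker", "exec", "-it", container_id, "bash"]
```

On a real Docker host, the argument list from `exec_command` could be passed to `subprocess.call` in place of typing the command by hand.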
  • 40. Cloudbreak Internals. [Diagram components: Browser; Cloudbreak shell; Uluwatu (Cloudbreak UI); Sultans (user management UI); OAuth2 (UAA) with uaa-db (psql); Cloudbreak (REST API) with cb-db (psql); Periscope (autoscaling) with ps-db (psql); consul; registrator; ambassador; docker]
  • 42. Swarm • Native clustering for Docker • Distributed container orchestration • Same API as Docker
  • 43. Swarm – How it works • Swarm managers/agents • Discovery services • Advanced scheduling
  • 44. Consul • Service discovery/registry • Health checking • Key/Value store • DNS • Multi datacenter aware
  • 45. Consul – How it works • Consul servers/agents • Consistency through a quorum (RAFT) • Scalability due to gossip based protocol (SWIM) • Decentralized and fault tolerant • Highly available • Consistency over availability (CP) • Multiple interfaces - HTTP and DNS • Support for watches
  • 46. Page 46 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Apache Ambari • Easy Hadoop cluster provisioning • Management and monitoring • Key feature - Blueprints • REST API, CLI shell • Extensible • Stacks • Services • Views
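A note on the Consul "consistency through a quorum" point: a Raft quorum is a strict majority of the server nodes, which is why odd server counts are the usual choice. The arithmetic can be sketched as follows (function names are illustrative, not part of any Consul API):

```python
def quorum_size(servers: int) -> int:
    """Smallest strict majority of a Raft server group."""
    if servers < 1:
        raise ValueError("need at least one server")
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can be lost while a quorum is still reachable."""
    return servers - quorum_size(servers)


for n in (1, 3, 4, 5):
    print(n, "servers: quorum", quorum_size(n), "tolerates", failures_tolerated(n), "failure(s)")
```

Going from 3 to 4 servers raises the quorum without tolerating any additional failure.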
  • 47. Apache Ambari – How it works • Ambari server/agents • Define a blueprint (blueprint.json) • Define a host mapping (hostmapping.json) • Post the cluster create
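The blueprint and host-mapping steps can be sketched as two JSON documents. This is a minimal illustration that follows the general shape of Ambari's Blueprint REST API as I understand it; the host group names, component selection, and hostnames are assumptions, and a working HDP blueprint needs many more components than are shown here.

```python
import json


def make_blueprint(stack_name, stack_version, host_groups):
    """host_groups maps group name -> (cardinality, [component names])."""
    return {
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
        "host_groups": [
            {
                "name": name,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for name, (cardinality, components) in host_groups.items()
        ],
    }


def make_host_mapping(blueprint_name, mapping):
    """mapping maps group name -> [fully qualified host names]."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": h} for h in hosts]}
            for group, hosts in mapping.items()
        ],
    }


# Illustrative, deliberately incomplete layout (hypothetical names).
blueprint = make_blueprint("HDP", "2.3", {
    "master": (1, ["NAMENODE", "ZOOKEEPER_SERVER"]),
    "worker": (2, ["DATANODE"]),
})
hostmapping = make_host_mapping("demo-blueprint", {
    "master": ["master1.example.com"],
    "worker": ["worker1.example.com", "worker2.example.com"],
})

print(json.dumps(blueprint, indent=2))
print(json.dumps(hostmapping, indent=2))
```

With a live Ambari server, the first document would be posted to register the blueprint and the second to create the cluster; the sketch only builds the payloads and sends nothing.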
  • 48. Run Hadoop as Docker containers: HDP as Docker Containers via Cloudbreak • Fully automated Ambari cluster installation • Avoid GUI, use REST API only (ambari-shell) • Fully automated HDP installation with blueprints • Quick installation (pre-pulled rpms) • Same process/images for dev/qa/prod • Same process for single/multinode [Diagram: Cloudbreak provisions VMs from cloud providers (Cloud Provider/Bare Metal, Linux, Docker), installs Ambari on the VMs, and instructs Ambari to build the HDP cluster]
  • 49. Provisioning – How it works ● Start VMs with a running Docker daemon ● Cloudbreak Bootstrap ○ Start Consul Cluster ○ Start Swarm Cluster (Consul for discovery) ● Start Ambari servers/agents (Swarm API) ● Ambari services registered in Consul (Registrator) ● Post Blueprint
  • 50. Run Hadoop as Docker containers [Diagram: Cloudbreak and six Docker hosts]
  • 51. Run Hadoop as Docker containers [Diagram: one amb-ser container and five amb-agn containers on the Docker hosts]
  • 52. Run Hadoop as Docker containers [Diagram: as on the previous slide, with a Blueprint added]
  • 53. Run Hadoop as Docker containers [Diagram: amb-ser plus five amb-agn containers running hdfs + hbase, hdfs + hive, hdfs + yarn, hdfs + zookpr, and nmnode + hdfs]
  • 54. Workshop: Auto-Scale your Cluster with Periscope
  • 55. Optimize Cloud Usage via Elastic HDP Clusters Dev / Test Auto-scaling Policy • Policies based on any Ambari metrics • Dynamically scale to achieve physical elasticity • Coordinates with YARN to achieve elasticity based on the policies.
  • 56. Scaling for Static and Dynamic Clusters [Diagram labels: Auto-scale Policy; YARN; Ambari Alerts; Ambari Metrics; Ambari; Provisioning; Cloudbreak; Static; Dynamic; Enforces Policies; Scales Cluster/YARN Apps; Metrics and Alerts Feed Cloudbreak/Periscope]
  • 57. Scale by Ambari Monitoring Metric 1. Ambari: review metric 2. Cloudbreak: set alert 3. Cloudbreak: set scaling policy
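To make the alert-plus-policy idea concrete, here is a minimal sketch of a threshold rule. It is not Periscope's API or its actual algorithm: the parameter names, the threshold semantics, and the node limits are all assumptions made for illustration.

```python
def desired_node_count(current_nodes, metric_value, threshold, adjustment,
                       min_nodes, max_nodes):
    """Apply a hypothetical scaling policy.

    If the metric exceeds the threshold (the "alert"), change the node count
    by `adjustment` (positive to scale up, negative to scale down), clamped
    to the [min_nodes, max_nodes] range. Otherwise keep the current size.
    """
    if metric_value <= threshold:
        return current_nodes
    target = current_nodes + adjustment
    return max(min_nodes, min(max_nodes, target))


# Example: an assumed utilisation metric of 0.9 against a 0.8 threshold.
print(desired_node_count(4, 0.9, threshold=0.8, adjustment=2, min_nodes=3, max_nodes=10))
```

A scale-down rule would be a second policy with a negative adjustment, triggered by its own alert.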
  • 58. Scale up/down by time 1. Set time-based alert 2. Set scaling policy Repeat with an alert and policy which scales down
  • 60. Release Summary Cloudbreak: ● Its own project (separate from Ambari) ● Supported on Linux flavors which support Docker Periscope: ● Feature of Cloudbreak 1.0 ● Will be embedded in Ambari later in 2015
  • 61. Release Timeline ● Cloudbreak 1.0 GA: June/July 2015 (Ambari 2.1.0, HDP “Dal” / 2.3) ● Cloudbreak 1.1: August 2015 (est) (Ambari 2.1.1, HDP “Dal-M10”) ● Cloudbreak 2.0 GA: 2H2015 (Ambari 2.2, HDP “Erie” / 2.4) ● Cloudbreak Incubator Proposal: July/August 2015 (est)
  • 62. Supported Cloud Environments Cloudbreak HDP 2.3 Microsoft Azure GA AWS GA Google Compute GA Cloudbreak HDP 2.3 Cloudbreak HDP 2.4 Openstack Community Tech Preview Tech Preview Red Hat OSP TBD HP Helion GA (Tentative) Mirantis OpenStack
  • 63. HDP as a Service
  • 65. Rackspace Cloud Big Data Platform: ● Rapidly spin up on-demand HDP clusters ● Integrated with Cloud Files (OpenStack Swift) ● Opt-in for Managed Services by Rackspace Managed Big Data Platform: ● Fully Managed HDP on Dedicated and/or Cloud ● Leverage Fanatical Support and Industry Leading SLAs ● Supported by Rackspace with escalation to Hortonworks
  • 66. CSC
  • 67. HDP on IaaS - Best Practices
  • 68. Microsoft Azure ● Deployment ○ Deploy using CloudBreak ○ Deploy using HWX Azure Gallery Image ● Integrated with Azure Blob Storage ● Supported directly by Hortonworks ● Other offerings ○ Microsoft HDInsight ○ HDP Sandbox
  • 69. Azure Deployment Guideline ● All in same Region ● Instance Types ○ Typical: A7 ○ Performance: D14 ○ 8x1TB Standard LRS x3 Virtual Hard Disk per server ● Multiple Storage Accounts are recommended ○ Recommend no more than 40 Virtual Hard Disks per Storage Account
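The storage-account guideline turns into simple arithmetic: with 8 data disks per server and at most 40 disks per account, one account covers 5 servers. A small sketch follows (the function name is illustrative; the defaults come from the guideline above):

```python
import math


def storage_accounts_needed(servers, disks_per_server=8, max_disks_per_account=40):
    """Minimum number of storage accounts so no account holds more than the limit."""
    total_disks = servers * disks_per_server
    return math.ceil(total_disks / max_disks_per_account)


for servers in (5, 6, 20):
    print(servers, "servers ->", storage_accounts_needed(servers), "storage account(s)")
```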
  • 70. Azure Blob Store (Object Storage) ● wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path> ● Can be used as a replacement for HDFS ● Thoroughly tested in HDP release test suites
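The URI pattern can be assembled programmatically. A minimal sketch, in which the container, account, and path values are placeholders:

```python
def wasb_uri(container, account, path="", secure=True):
    """Build a wasb:// or wasbs:// URI for an Azure Blob Storage container."""
    scheme = "wasbs" if secure else "wasb"
    return "{}://{}@{}.blob.core.windows.net/{}".format(
        scheme, container, account, path.lstrip("/")
    )


print(wasb_uri("data", "myaccount", "/logs/2015"))
```

A URI built this way can be used wherever a Hadoop filesystem path is accepted, for example as the argument to hadoop fs -ls.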
  • 71. Amazon Web Services ● Deploy using CloudBreak ● Integrated with AWS S3 (object storage) ● Supported directly by Hortonworks
  • 72. Amazon Deployment Guideline ● All in same Region/AZ ● Instances with Enhanced Networking Master Nodes: ● Choose EBS Optimized ● Boot: 100GB on EBS ● Data: 4+ 1TB on EBS Worker Nodes: ● Boot: 100GB on EBS ● Data: Instance Storage ○ EBS can be used, but local is preferred Instance Types: ● Typical: d2. ● Performance: i2. https://aws.amazon.com/ec2/instance-types/
  • 73. AWS RDS ● Some services rely on MySQL, Oracle or PostgreSQL: ○ Apache Ambari ○ Apache Hive ○ Apache Oozie ○ Apache Ranger ● Use RDS for these instead of managing yourself.
  • 74. AWS S3 (Object Storage) ● s3n:// with HDP 2.2 (Hadoop 2.6) ● s3a:// with HDP 2.3 (Hadoop 2.7) Not currently a direct replacement for HDFS Recommended to configure access with IAM Role/Policy ● https://docs.aws.amazon.com/IAM/latest/UserGuide/policies_examples.html#iam-policy-example-s3 ● Example: http://git.io/vLoGY
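For the IAM Role/Policy recommendation, a policy document scoped to one bucket might look like the sketch below. The bucket name is a placeholder and the chosen actions are an assumption about what a cluster needs; this is not a reproduction of the example linked above.

```python
import json


def s3_bucket_policy(bucket):
    """IAM policy document granting list access to a bucket and read/write to its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": "arn:aws:s3:::{}".format(bucket),
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": "arn:aws:s3:::{}/*".format(bucket),
            },
        ],
    }


policy = s3_bucket_policy("my-hdp-bucket")  # hypothetical bucket name
print(json.dumps(policy, indent=2))
```

Attaching such a policy to an instance role avoids storing access keys in the Hadoop configuration.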
  • 76. Google Cloud ● Deploy using ○ CloudBreak ○ Google bdutil with Apache Ambari plug-in ● Integrated with Google Cloud Storage ● Supported directly by Hortonworks
  • 77. Google Deployment Guideline ● Instance Types ○ Typical: n1-standard-4 with a single 1.5 TB persistent disk ○ Performance: n1-standard-8 with 1 TB SSD ● Google GCS (Object Storage) ○ gs://<CONFIGBUCKET>/dir/file ○ Not currently a replacement for HDFS
  • 78. S3 & GCS as Secondary Storage System The connectors are currently eventually consistent, so they do not replace HDFS. Backup ● Falcon, DistCp, hadoop fs, HBase ExportSnapshot ● Kafka+Storm bolt sends messages to S3/GCS, providing a backup & point-in-time recovery source Input/Output ● Convenient & broadly used upload/download method ○ As middleware to ease integration with Hadoop & limit access ● Publishing static content (optionally with CloudFront) ○ Removes the need to manage any web services ● Storage for temporary/ephemeral clusters
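As an illustration of the DistCp backup option, the sketch below assembles a hadoop distcp command that copies an HDFS directory to an object store. The source path, bucket, and prefix are placeholders, and the command is only printed, not executed.

```python
def distcp_backup_command(source, bucket, prefix, scheme="s3a"):
    """Build a `hadoop distcp <source> <destination>` argument list."""
    destination = "{}://{}/{}".format(scheme, bucket, prefix.strip("/"))
    return ["hadoop", "distcp", source, destination]


cmd = distcp_backup_command(
    "hdfs:///apps/hive/warehouse",      # hypothetical source directory
    "my-backup-bucket",                 # hypothetical bucket
    "/warehouse/2015-07-01/",
)
print(" ".join(cmd))
```

Passing scheme="gs" would target Google Cloud Storage instead, assuming the GCS connector is configured on the cluster.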
  • 80. $ shutdown -h now - HDP 2.3 - http://hortonworks.com/ - Hadoop Summit recordings: - http://2015.hadoopsummit.org/san-jose/ - http://2015.hadoopsummit.org/brussels/ - Past & Future workshops: - http://hortonworks.com/partners/learn/