From PuppetCamp Southeast Asia 2012 in Kuala Lumpur, Malaysia. Hadoop in a box - from playground to production Desc: How Vagrant, Puppet and other tools can be used to move your manifest from test bed to production.
Using Vagrant, Puppet, Testing & Hadoop
1. Hadoop in a Box: From Playground to Production. Using Vagrant, Puppet, Testing and Hadoop.
2. Who am I?
• Dennis Matotek
  Technical Lead, Platforms, Experian Hitwise
  Co-Author: Pro Linux System Administration: Turnbull, Lieverdink, Matotek, Apress 2009
  Technical Reviewer: Pulling Strings with Puppet: Turnbull, Apress 2008
3. What are we solving?
• We have a group of developers...
7. How can we help?
• Don’t put implementation plans at the end of a project.
• Everyone gets involved in writing infrastructure code.
• Infrastructure code should be included in the development build pipeline and have to pass tests.
• Push infrastructure code from playground to production.
Design, test and deploy your infrastructure code like your application code.
8. How can we do it?
• As administrators we can help build the development environment for projects.
• Infrastructure on the desktop
  – A lot of the conception phase coding work can be done on the desktop.
• What packages are needed for the project?
• What configuration should they be in?
• How can you share your ideas?
9. Choices
• Virtualization technologies to choose from
  – VirtualBox
  – LXC
  – KVM/Xen
• Configuration management tools
  – Puppet
  – Chef
  – SaltStack
• Testing tools
  – Cucumber-puppet
  – RSpec
• CI tools
  – Jenkins
11. What’s it about?
• A project on GitHub written in Ruby to manage Oracle’s VirtualBox virtual machines (originators: Mitchell Hashimoto and John Bender, 2010).
• You can build and distribute projects amongst teams or colleagues.
• Download ‘boxes’ and build project environments that are the same.
• Boxes are reusable testbeds. When you are ready, push your development code environment to others.
• Take those environments and run them against Jenkins or other CI tools.
• Sandbox, develop, test and push your infrastructure code into production.
• How easy is it?
    $ vagrant box add base http://files.vagrantup.com/lucid32.box
    $ vagrant init
    $ vagrant up
12. Vagrant boxes
• What’s a box?
• Boxes come from standard VirtualBox instances, with specific configurations that Vagrant requires.
• Whatever VirtualBox supports, so does Vagrant.
• Boxes are basically a tar of an exported VirtualBox machine.
• Configured hard disks, CPU, RAM, networks.
• You can create them yourself or use ones that others have created and distributed.
• How to build a box is documented here: http://vagrantup.com/docs/base_boxes.html
13. Launching a box
Install VirtualBox, install Ruby, install Vagrant.
Create your own box or find one that is already distributed.
    $ mkdir project ; cd project
    $ vagrant box add <box_name> <url or file_path>
This adds the box and makes it available to the vagrant init command.
    $ vagrant box list
    hadoop_in_a_box
    $ vagrant init <box_name>
You will now have the default Vagrantfile created in your directory.
    $ ls
    Vagrantfile
    $ vagrant up
    $ vagrant ssh
14. Vagrantfile
    Vagrant::Config.run do |config|
      # All Vagrant configuration is done here. The most common configuration
      # options are documented and commented below. For a complete reference,
      # please see the online documentation at vagrantup.com.

      # Every Vagrant virtual environment requires a box to build off of.
      config.vm.box = "hadoop_in_a_box"
      config.ssh.private_key_path = "./.ssh/vagrant.key"

      # shared_folders - this folder must exist in your project directory
      config.vm.share_folder("shared_folder", "/shared", "./shared_folder")
    end
15. Vagrantfile cont’d
    Vagrant::Config.run do |config|
      # general settings:
      # config.vm.boot_mode = :gui
      config.vm.customize [ "modifyvm", :id, "--memory", "512" ]

      # ssh settings:
      # Set the following to point to your ssh key
      config.ssh.private_key_path = "./.ssh/vagrant.key"
      # Change these to suit, sometimes it takes a while for the virtual box to respond
      config.ssh.max_tries = 25
      config.ssh.timeout = 3

      # shared_folders - this folder must exist in your project directory
      config.vm.share_folder("shared_folder", "/shared", "./shared_folder")

      # Below is an example of a multiple VM
      config.vm.define :node1 do |base_config|
        base_config.vm.box = "my_base"
        base_config.vm.forward_port 22, 2102
        base_config.vm.network :hostonly, "192.168.222.10"
      end
      # config.vm.define :node2 do |base_config|
      #   base_config.vm.box = "my_base"
      #   base_config.vm.forward_port 22, 2103
      #   base_config.vm.network :hostonly, "192.168.222.11"
      # end
    end
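The multi-VM example above follows a simple pattern: each node gets the next forwarded SSH port and the next host-only address. A minimal sketch of that pattern in plain Ruby follows. The helper name node_settings and its keyword parameters are hypothetical, not part of Vagrant; the defaults are taken from the node1 and node2 values on the slide.

```ruby
# Hypothetical helper (not a Vagrant API): derives per-node settings that
# follow the pattern on the slide:
#   node1 => SSH forwarded to 2102, host-only IP 192.168.222.10
#   node2 => SSH forwarded to 2103, host-only IP 192.168.222.11
def node_settings(count, base_port: 2102, subnet: '192.168.222', first_host: 10)
  (1..count).map do |i|
    {
      name: :"node#{i}",
      ssh_port: base_port + (i - 1),
      ip: "#{subnet}.#{first_host + (i - 1)}"
    }
  end
end

# Print the settings that would be passed to each config.vm.define block.
node_settings(2).each do |node|
  puts "#{node[:name]}: forward 22 -> #{node[:ssh_port]}, hostonly #{node[:ip]}"
end
```

Inside a Vagrantfile, one could presumably loop over these hashes and call config.vm.define once per entry instead of copying the block by hand.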
16. Provisioning Your Box
• Ruby plugins for Vagrant
  – Build your own specific plugins that make provisioning easy for you
• Chef Solo/Chef Server
    config.vm.provision :chef_solo do |chef|
      chef.add_recipe("apache")
      chef.add_recipe("php")
    end
• Shell provisioning
  – Bash shell scripts or commands
    base_config.vm.provision :shell do |shell|
      shell.inline = "hostname $1"
      shell.args = "node1"
    end
• Puppet/Puppet Server
    config.vm.provision :puppet do |puppet|
      puppet.manifests_path = "manifests"
      puppet.manifest_file = "default.pp"
    end
18. Vagrant and Puppet
• You can use a Puppet Master or locally apply Puppet modules and manifests to provision your Vagrant nodes.
  – Locally applied Puppet modules and manifests:
    base_config.vm.provision :puppet,
      :module_path => ["puppet_modules", "puppet_modules_private"],
      :options => "--verbose" do |basepuppet|
      basepuppet.manifests_path = "puppet_manifests"
      basepuppet.manifest_file = "default.pp"
      basepuppet.pp_path = "/tmp/vagrant-puppet"
    end
19. Vagrant and Puppet Cont’d
  – Using Puppet Master to provision:
  – Point your configuration at your local Puppet Master
    Vagrant::Config.run do |config|
      .... <snip> ....
      base_config.vm.provision :puppet_server do |puppet|
        puppet.puppet_server = "puppet.yourdomain.com"
      end
    end
20. Puppet Manifest design
• The basic manifest is made up of the following components:
    /etc/puppet
      - manifests/site.pp
      - manifests/nodes.pp
      - modules/<module_name>/manifests
      - modules/<module_name>/files
      - modules/<module_name>/templates
      - modules/<module_name>/lib
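The layout above can be sketched as a small Ruby script that builds the same tree in a scratch directory, which may help when starting a new repository. The helper name create_puppet_skeleton and the module names hadoop and httpd are assumptions for illustration; only the directory and file names come from the slide.

```ruby
require 'fileutils'
require 'tmpdir'

# Subdirectories the slide lists under modules/<module_name>/.
MODULE_SUBDIRS = %w[manifests files templates lib].freeze

# Hypothetical helper: creates manifests/site.pp, manifests/nodes.pp and the
# per-module subdirectories beneath the given root directory.
def create_puppet_skeleton(root, module_names)
  FileUtils.mkdir_p(File.join(root, 'manifests'))
  %w[site.pp nodes.pp].each do |manifest|
    FileUtils.touch(File.join(root, 'manifests', manifest))
  end
  module_names.each do |name|
    MODULE_SUBDIRS.each do |subdir|
      FileUtils.mkdir_p(File.join(root, 'modules', name, subdir))
    end
  end
  root
end

# Build the tree in a temporary directory (removed afterwards) and list it.
Dir.mktmpdir('puppet-skeleton') do |dir|
  create_puppet_skeleton(dir, %w[hadoop httpd])
  puts Dir.glob(File.join(dir, '**', '*')).map { |path| path.sub("#{dir}/", '') }.sort
end
```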
21. Think about using ‘environments’
• Puppet allows you to use environments. Environments are separate namespaces where you can run and test your code on the same Puppet master.
  – Namespaces like production, staging, testing, etc.
• The puppet.conf file needs the following:
    modulepath = /etc/puppet/environments/$environment
22. Environments cont’d
• Allows you to check out code under /etc/puppet/environments/<checkout> and then pass the following to the client:
    $ puppet agent --test --noop --environment <checkout>
• Test changes against systems before pushing code to production.
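To make the environment mechanism concrete, here is a minimal sketch of how a modulepath containing a placeholder resolves to a different checkout per environment. It assumes the placeholder is Puppet's $environment setting. The helper modulepath_for is hypothetical and only imitates the substitution; it is not Puppet's own code.

```ruby
# Template as it would appear in puppet.conf (assumption: the placeholder is
# Puppet's $environment setting). Single quotes keep the "$" literal in Ruby.
MODULEPATH_TEMPLATE = '/etc/puppet/environments/$environment'

# Hypothetical helper: imitates the placeholder substitution for one
# environment name.
def modulepath_for(environment, template = MODULEPATH_TEMPLATE)
  template.gsub('$environment', environment)
end

# Each environment name maps to its own checkout directory.
%w[production staging testing].each do |env|
  puts "#{env}: #{modulepath_for(env)}"
end
```

A client run with --environment staging would then, under this assumption, be served code from the staging checkout while production stays untouched.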
23. Things to think about in Module Design
• Puppet modules are a collection of resources to install, configure and manage a specific application or perform some kind of function.
  – E.g., install and configure the httpd service for your applications.
• Keep modules separate. Don’t have httpd resources being managed from your postgresql module.
• Keep data separate from code.
  – Have a separate class that contains your data ( class modname::data { } )
  – Use an external node classifier (ENC). That is a CMDB-like service that Puppet can extract and build configurations from.
• Keep an ear on the Puppet User list as many design questions are asked and answered there.
24. Manage nodes?
• Nodes are tedious to manage.
    nodes.pp
    node base {
      include yum
    }
    node node1 inherits base {
      include httpd
    }
• Just have this:
    node default {
      include roles
    }
• Group nodes based on Facts or other data.
25. Roles, everything has a role
• If it doesn’t have a role, it has a default role.
• Roles decide what the node has.
  – Easier to manage than node definitions and doesn’t rely on ‘inheritance’.
• Commonly, Puppet inheritance does not behave like inheritance in programming languages.
  – Roles with Hiera.
    class roles {
      $my_role = hiera('my_role')
      if $my_role == 'webservice' {
        include roles::webservices
      }
    }
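The role lookup can be sketched in plain Ruby to show the decision it encodes: look up the node's role in external data, fall back to a default, then map the role to classes. The NODE_DATA hash is a hypothetical stand-in for Hiera. The node names and the class names roles::hadoop_namenode and roles::default are assumptions for illustration; only my_role, webservice and roles::webservices come from the slide.

```ruby
# Hypothetical stand-in for Hiera data: node name => settings.
NODE_DATA = {
  'node1' => { 'my_role' => 'webservice' },
  'node2' => { 'my_role' => 'hadoop_namenode' },
  'node3' => {} # no role defined
}.freeze

# Hypothetical mapping from role name to the classes that role includes.
ROLE_CLASSES = {
  'webservice'      => ['roles::webservices'],
  'hadoop_namenode' => ['roles::hadoop_namenode'],
  'default'         => ['roles::default']
}.freeze

# Returns the node's role, or 'default' when no role is set.
def role_for(node)
  NODE_DATA.fetch(node, {}).fetch('my_role', 'default')
end

# Returns the classes for the node's role; unknown roles fall back to default.
def classes_for(node)
  ROLE_CLASSES.fetch(role_for(node), ROLE_CLASSES['default'])
end

%w[node1 node2 node3].each do |node|
  puts "#{node}: role=#{role_for(node)} classes=#{classes_for(node).join(', ')}"
end
```

The point of the design is that nodes.pp never changes when a node is added; only the data does.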
27. Testing, phhhuu!
• Why test?
  – As your module complexity grows you need to make sure that it will work.
  – Puppet is CONSTANTLY changing
    • Ensure your code is keeping up with new Puppet versions
  – Your infrastructure code is code, so why not test it?
  – Test-driven code is better code; it helps you think about what the outcome should be.
28. Introducing the tools
• RSpec
  – http://rspec-puppet.com/
• Cucumber-Puppet
  – https://github.com/nistude/cucumber-puppet
• Both tools do the same thing and are based on common testing frameworks.
• Both tools support Behaviour Driven Development (BDD)
• How do I use it?
  – RSpec tests the modules
  – Cucumber tests the manifests as a whole
29. RSpec
    class hadoop::namenode::config {
      require hadoop::config
      include hadoop::install::namenode
      include hadoop::namenode::cluster_config_files

      # realise the user and group and the configfiles
      Group <| tag == 'hadoop_node' |> ->
      <snip>
      file { '/usr/lib/hadoop-0.20/logs/SecurityAuth.audit':
        ensure  => present,
        <snip>
        require => Package['hadoop-0.20-namenode']
      } ->
      Exec <| tag == 'common_execs' |> ->
      hadoop::namenode::create_namenode_dirs { $hadoop::config::hadoop_default_dirs: } ->
      class { "hadoop::namenode::namenode_format": }
    } #end class
30. RSpec
    require 'spec_helper'

    describe 'hadoop::namenode::config' do
      let(:facts) { {:hostname => 'node2', :hadoop_node => 'namenode', :role => 'hadoop_namenode' } }
      let(:title) { 'config' }

      it { should include_class('hadoop::install::namenode') }
      it { should contain_file('/usr/lib/hadoop-0.20/logs/SecurityAuth.audit') }
      it { should contain_file('/etc/hadoop-0.20/conf.default/core-site.xml') }
      it { should contain_service('hadoop-0.20-namenode').with_ensure('present') }
    end
31. Cucumber-Puppet
• Does the catalog compile for your nodes?
  – Tests run on the master (or alternative)
  – When nodes check in, Puppet creates a yaml file in /var/lib/puppet/yaml/node
  – cucumber-puppet uses the output, the node cache file, from the last puppet run
  – In Puppet v3 this changes somewhat as you can use the puppet node find interface to retrieve the same information.
32. Cucumber Basics
    Feature: General policy for all catalogs
      In order to ensure applicability of a host's catalog
      As a manifest developer
      I want all catalogs to obey some general rules

      Scenario Outline: Compile and verify catalog
        Given a node specified by "features/yaml/<hostname>.mylocal.yaml"
        When I compile its catalog
        Then compilation should succeed
        And all resource dependencies should resolve

        Examples:
          | hostname |
          | puppet   |
          | node2    |
          | node3    |
          | node4    |
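A Scenario Outline runs once per row of the Examples table, substituting each placeholder in angle brackets with that row's value. The plain-Ruby sketch below imitates that substitution for the node file path in the feature above. The helper expand_outline is hypothetical and is not part of Cucumber; it only illustrates which files the four scenarios would read.

```ruby
# Step text and Examples rows taken from the feature above.
NODE_FILE_TEMPLATE = 'features/yaml/<hostname>.mylocal.yaml'
EXAMPLE_ROWS = %w[puppet node2 node3 node4].map { |host| { 'hostname' => host } }.freeze

# Hypothetical helper: replaces each <placeholder> with the value from the
# row's matching column, once per row, as a Scenario Outline would.
def expand_outline(template, rows)
  rows.map do |row|
    template.gsub(/<(\w+)>/) { row.fetch(Regexp.last_match(1)) }
  end
end

puts expand_outline(NODE_FILE_TEMPLATE, EXAMPLE_ROWS)
```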
33. Cucumber-Puppet
    Then /^service "([^"]*)" should be "([^"]*)"$/ do |name, state|
      steps %Q{
        Then there should be a resource "Service[#{name}]"
      }
      if state == "disabled"
        steps %Q{
          Then the service should have "enable" set to "false"
        }
      elsif state == "running"
        steps %Q{
          Then the state should be "#{state}"
        }
      end
    end
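The step definition above hinges on its regular expression: the two capture groups become the name and state block arguments. The sketch below exercises that same pattern outside Cucumber, on the assumption that the step text is matched without its leading keyword. The helper parse_service_step and the sample service name httpd are hypothetical.

```ruby
# The pattern from the step definition: group 1 is the service name,
# group 2 is the desired state.
STEP_PATTERN = /^service "([^"]*)" should be "([^"]*)"$/

# Hypothetical helper: returns [name, state] when the line matches the
# pattern, or nil when it does not.
def parse_service_step(line)
  match = STEP_PATTERN.match(line)
  match && match.captures
end

name, state = parse_service_step('service "httpd" should be "running"')
puts "#{name} -> #{state}"
```

Both values must be quoted in the feature file for the step to match; an unquoted line simply finds no step definition.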
35. Jenkins
• Helps maintain build pipelines
• Push your infrastructure into the software project pipelines.
• Continuous integration is used mainly by software projects, not often by infrastructure
• Get greater certainty of your infrastructure deployments.