Containers: Portable, repeatable user-oriented application delivery II
HPC Saudi 2018
#dockerbday
Christian Kniep @CQnib
Walid Shaari @walidshaari
AGENDA : Good Morning Containers
https://events.docker.com/events/details/docker-az-zahran-presents-docker-birthday-5-celebration-dammam-edition-day-1#/
$id Christian
With a journey of over ten years rooted in industrial and automotive HPC in Germany, Christian started his career at Bull R&D supporting CAE
applications and VR installations, then later Dyna.
He co-founded the container and cloud workshop at the ISC HPC conference after being told at a meeting that HPC cannot learn anything from
the emerging Cloud and BigData companies.
Since then, he has been curious about DevOps and containerization and has led such efforts wherever he goes.
Just before Docker, he worked on the cloud-stack team at Sony PlayStation.
Christian joined Docker Inc in 2017 to help push the adoption forward and be part of the innovation instead of an external bystander.
During the day he helps Docker customers in the EMEA region to fully utilise the power of containers; at night he likes to explore new
emerging trends by containerising them first and seeking applications in the nebulous world of DevOps.
@kniepbert
christian.kniep@docker.com
https://www.linkedin.com/in/christian-kniep-3004b053/
$id walid
Passionate about Openness, Open Source, DevOps, Infosec
Team member of the Expec Computer Center systems division
Red Hat Certified Architect RHCA V
Certified Kubernetes Administrator CKA
SANS GIAC Incident handler, Forensics and Web security certified.
Dhahran Docker & Ansible meetup organizer “Community Leader”
@walidshaari
walid.shaari@linux.com
https://www.linkedin.com/in/walidshaari/
https://github.com/walidshaari
Join the Docker Student
Community! Sign up here:
http://dockr.ly/students (with your school email) for
access to our free Docker Student Developer Kit and
more!
Become a Docker
Campus Ambassador!
For leaders on campus who want to help their
peers learn Docker! Learn more and apply here:
http://dockr.ly/campus-ambassador
Are you a student?
Let's get to know each other
▪ Assuming everyone knows a bit of
▪ Linux
▪ Unix
▪ Mac OSX CLI ?
▪ Development, Operations, Security, Research, Business, Others?
▪ DevOps
▪ Containers
▪ Schedulers
▪ Containers ecosystem
▪ Clusters, Load balancers, Orchestration
Goal
Up and running with containers ecosystem
informal interactive workshop format
Docker Momentum
Thank You for 5 Amazing Years!
450+ Docker EE commercial customers
15K job listings on LinkedIn
37B container downloads
3.5M Dockerized apps
200+ active Docker user groups
Containers are the “Fastest Growing Cloud Enabling Technology”
By 2020, more than 50% of global
organizations will be running
containers in production.
-Gartner
Title source: 451 Research
2017
24B
PULLS
Lab Instructions
STEP 1: Visit
https://training.play-with-docker.com/alacart/
Or https://goo.gl/hLA1VN
Create Docker hub/store account: https://hub.docker.com/
Join the Docker Community - dockr.ly/community
Join the slack channel: #5th-bday #dockerbday
HPC or Scientific Computing?
▪HPC workloads mostly
▪ Runs on Linux
▪ Preferably on bare-metal for maximum performance, lower overhead
▪HPC Application
▪ Broken into smaller parallel distributed problems across a cluster of nodes.
▪ Utilizes interprocess communications heavily via shared memory, or across the
network.
HPC Status Quo
▪ HPC is dominated by academic research and discovery
▪ Industry interest in business HPC has increased over the last 5-10 years
(Automotive, Finance, O&E)
▪ Possible constraints:
▪ Snowflake deployments: each HPC cluster/supercomputer is built with
specific use cases in mind
▪ Long-lived nodes.
▪ Bloated, drifted, or unclean nodes; maybe diskless reboots
▪ Reboot time or app launch time could be long due to system/memory checks and bootstrapping
▪ Old Linux distribution
▪ Fixed installation based on single enterprise distro (Scientific, RHEL)
▪ Old kernel features
Which workloads and frameworks are running on OpenStack?
Source: https://www.openstack.org/assets/survey/Public-User-Survey-Report.pdf
> 38%
scientific/technical
computing already
happening on
Openstack
Namespaces
Process Isolation
● host sees all processes with their real PIDs from the kernel's perspective
● first process within PID namespace gets PID=1
Host
cnt0
ps -ef
cnt1
start.sh
java -jar ..
cnt2
start.sh
java -jar ..
health.sh
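The two views of a PID described above can be read from `/proc/<pid>/status`: on Linux kernels 4.1 and newer, the `NSpid` line lists a process's PID in each nested PID namespace, outermost first. A minimal sketch, assuming that field is present; the sample text and the PID 12345 are made up for illustration.

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line, outermost PID namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line (e.g. an older kernel)


# Made-up excerpt of /proc/12345/status, read on the host, for the first
# process (start.sh) of a container.
sample = "Name:\tstart.sh\nPid:\t12345\nNSpid:\t12345\t1\n"

pids = parse_nspid(sample)
host_pid, container_pid = pids[0], pids[-1]
print("host sees PID", host_pid, "and the container sees PID", container_pid)
```

On a real host the same function could be fed the contents of `/proc/<pid>/status` for any container process.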
Resource Isolation of Process Groups
7 as of Kernel 4.10
1. MNT: Controls mount points
2. PID: Individual process table
3. NET: Network resources (IPs, routing,...)
4. IPC: Isolates inter-process communication resources (e.g. System V shared memory), so processes in different IPC namespaces cannot share them
5. UTS: Individual host- and domain name
6. USR: Maps container UID to a different UID of the host
7. CGRP: Hides system cgroup hierarchy from container
Other (incomplete list):
● RDMA
● Syslog
● Time
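On Linux, the namespaces a process belongs to are visible as symbolic links under `/proc/<pid>/ns`; two processes share a namespace when the link targets match. A minimal sketch of that comparison follows. The helper names `read_namespaces` and `shared_namespaces` are hypothetical, and the identifiers in the example dictionaries are made up.

```python
import os


def read_namespaces(ns_dir):
    """Map namespace name to its identifier, e.g. 'pid' -> 'pid:[4026531836]'."""
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(ns_a, ns_b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in ns_a if ns_b.get(name) == ns_a[name])


# Made-up identifiers: a container started with --net=host shares only NET.
host = {"pid": "pid:[4026531836]", "net": "net:[4026531993]", "uts": "uts:[4026531838]"}
cnt0 = {"pid": "pid:[4026532201]", "net": "net:[4026531993]", "uts": "uts:[4026532199]"}
print(shared_namespaces(host, cnt0))

if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    # On a Linux machine, list the namespaces of the current process.
    for name, ident in read_namespaces("/proc/self/ns").items():
        print(name, ident)
```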
Container Namespaces
A starting container gets its own namespaces.
Host
PID MNT IPC NET USR UTS CGRP
cnt0 cnt1 cnt2
But can share namespaces with other containers or even the host
Host
All In
When using all host namespaces - we are on the host (almost like ssh).
PID MNT IPC NET USR UTS CGRP
cnt0
$ docker run -ti --rm \
    --privileged \
    --security-opt=seccomp=unconfined \
    --pid=host \
    --uts=host \
    --ipc=host \
    --net=host \
    -v /:/host \
    ubuntu bash
root@linuxkit-025000000001:/# chroot /host
/ # ash
/ #
Overlay Filesystem
Compose a FS from multiple pieces
ubuntu:16.04
openjre:9-b114
appA.jar:1.1 appB.jar
# appB image, built on top of the openjre image
FROM openjre:9-b114
COPY appB.jar /usr/local/bin/
CMD ["java", "-jar", "/usr/local/bin/appB.jar"]
# appA image, built on top of the openjre image
FROM openjre:9-b114
COPY appA.jar /usr/local/bin/
CMD ["java", "-jar", "/usr/local/bin/appA.jar"]
# openjre image, built on top of ubuntu:16.04
FROM ubuntu:16.04
ARG JRE_VER=9~b114-0ubuntu1
RUN apt-get update \
 && apt-get install -y openjdk-9-jre-headless=${JRE_VER} \
 && java -version
openjre:9-b117
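The idea of composing one filesystem from several pieces can be modelled as a stack of layers in which upper layers shadow lower ones. The sketch below is a toy model only, not how the overlay driver is implemented: each layer is a plain dictionary, the file contents are invented, and `WHITEOUT` is a hypothetical marker for a file deleted in an upper layer.

```python
WHITEOUT = object()  # hypothetical marker: the file was removed in this layer


def merged_view(layers):
    """Merge layers (lowest first) into the view a container would see."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is WHITEOUT:
                view.pop(path, None)  # an upper layer hides the lower file
            else:
                view[path] = content  # an upper layer adds or shadows a file
    return view


# Layers named after the images on the slide; contents are invented.
ubuntu = {"/etc/os-release": "Ubuntu 16.04", "/bin/sh": "dash"}
openjre = {"/usr/bin/java": "OpenJDK 9 b114"}
app_a = {"/usr/local/bin/appA.jar": "appA 1.1"}
app_b = {"/usr/local/bin/appB.jar": "appB"}

image_a = merged_view([ubuntu, openjre, app_a])
image_b = merged_view([ubuntu, openjre, app_b])
print(sorted(image_a))
print(sorted(image_b))
```

Both application images reuse the same `ubuntu` and `openjre` layers; only the top layer differs.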
First Step, toward a container definition?
• What matters most? The application or data
• The application can be a process or a set of processes
• The use case might not be a running app
• Set of tools to develop an app
• Set of scripts "apps" that are part of a pipeline
• complete appliance
• Isolated contained environment "Encapsulation"
• Technical synonyms
• chroot
• jail
• partition
• namespace
• zone
chroot/jail
A chroot on Unix operating systems is an operation that
changes the apparent root directory for the current running
process and its children. A program that is run in such a
modified environment cannot name (and therefore normally
cannot access) files outside the designated directory tree.
The term "chroot" may refer to the chroot(2) system call or
the chroot(8) wrapper program. The modified environment
is called a chroot jail.
https://en.wikipedia.org/wiki/Chroot
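The "cannot name files outside the tree" property can be illustrated without root privileges by modelling how a path requested inside the jail maps to a path on the host. This is a simplified sketch: the helper `jail_path` and the directory `/srv/jail` are hypothetical, and symbolic links and already-open file descriptors are not modelled.

```python
import posixpath


def jail_path(jail_root, requested):
    """Map a path as named inside the jail to the corresponding host path."""
    # Inside the jail "/" is the jail root, and ".." at the root stays there.
    inside = posixpath.normpath("/" + requested.lstrip("/"))
    return posixpath.normpath(posixpath.join(jail_root, inside.lstrip("/")))


print(jail_path("/srv/jail", "/etc/passwd"))
print(jail_path("/srv/jail", "../../etc/passwd"))
```

Both calls resolve to a file below `/srv/jail`, which is why the host's own `/etc/passwd` cannot be named from inside the jail.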
Scott McCarty Twitter: @fatherlinux Blog: bit.ly/fatherlinux
THE HISTORY OF CONTAINERS
2000: Jails added to FreeBSD
2001: Linux-VServer project
2003: SELinux added to Linux mainline
2005: Full release of Solaris Zones
2006: Process confinement
2007: GPC renamed cgroups
2008: Kernel & user namespaces
2008: Linux Container project (LXC)
2013: dotCloud PyCon lightning talk
2013: dotCloud becomes Docker
2013: Red Hat Enterprise Linux
2014: Google Kubernetes
2015: Red Hat Container Platform
2015: Standards via OCI and CNCF
Docker provides simple user tools and images. Containers go mainstream.
CONTAINERS?
WHAT ARE THEY REALLY?
Linux features? Namespace, cgroups, LXC, Union file systems
Configuration management?
Virtualization technology?
Packaging? npm, jar, rpm, deb, tar.gz
Virtual/environment management?
Sandboxing? chroot, BSD jail, Solaris zones, IBM VM/370 (1972), seccomp
IT DEPENDS
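[Chart: approaches placed by portability (Less Portable to Most Portable) and repeatability (Non-Repeatable to Repeatable), annotated with Minimal overhead and Lots of overhead]
Approaches shown: Manual Configuration, Configuration Management tools, Traditional VMs, Containers
Container tools shown: Docker, Intel Clear Containers, Singularity, LXC/LXD, rkt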
Container
Containment, isolation, or encapsulation of an environment.
Machine container:
Encapsulates a complete system image. e.g. Ubuntu, RHEL, Scientific Linux.
Application container:
Encapsulates a service/software. e.g. Django, ROR, Gitlab, redis, Openfoam, kafka, spark.
what is the smallest application container?
Possible HPC Caveats/Constraints
1. Memory/storage deduplication
2. Code Optimization for specific architecture
3. Limited take on HPC specific orchestration and scheduling
4. Hardware topology assumptions (e.g. GPU brand, interconnect)
5. Chroot based containers have none/limited tooling (e.g. introspection )
6. Chroot based containers might be hard to scan for security vulnerabilities,
hardening, and composition.
KUBERNETES SEEING THE MOST DEVELOPER
TRACTION
https://www.slideshare.net/dberkholz/cloud-native-in-the-enterprise-realworld-data-on-container-and-microservice-adoption
Use Cases: Packaging
Agnostic packaging
Captures
○ Dependencies
○ Environment
○ Configurations
○ Executables
○ How about data?
○ What Else?
■ hint: m*
Pack once, Run everywhere
http://hpcbios.readthedocs.io/en/latest/HPCBIOS_2012-92.html
#EasyBuild #lmod #GUIX #NYU-Environment
Use Case: Portability
Portable/Scalable across
● platforms
● Distributions
● Environments
Separation of concerns, e.g. development packs and ships, operations scales and deploys.
Development ensures the app is resilient; operations ensures the infrastructure is highly available, resilient and scalable.
Use Case: Reproducible
Paolo Di Tommaso from the Center for Genomic Regulation presented: Manage Reproducibility of Computational Workflows with Docker Containers and Nextflow.
https://www.slideshare.net/insideHPC/reproducible-computational-pipelines-with-docker-and-nextflow
https://youtu.be/Doo9H2-gBAk
Cloud use Case
- Transport
- Security CIA
- at rest encrypted signed image
- at runtime:
- platform specific
- scalability issues
- PMIx to the rescue?!
Data Center current state
[Diagram: one Data Center with three separate stacks (Cluster Management A, Cluster Management B, Cluster Management C), each with its own Scheduler running its own Jobs]
Secure Allocation of Resources
[Diagram: a DataCenter Scheduler allocates resources to VC1 (Infra), VC2 (HPC) and VC3 (BigData), each with its own Scheduler and Jobs]
2nd Generation Cluster Management
Mesos
▪ Mature, Open Source Apache Project
▪ Cluster Resource Manager
▪ Scalable to over 10,000s of nodes
▪ Fault tolerant, no single point of failure
▪ Multi-tenancy with strong resource isolation
▪ Improved resource utilization
MPI batch jobs
● use ssh inside container
● dssh
● Capitalize on Open MPI
○ Open MPI/PBS/TORQUE
○ Process Management Interfaces PMIx
● Singularity examples use Open MPI/Slurm
● Mesos
● Commercial Univa support
● Research, and contribute ideas and pull requests to Swarm,
Kubernetes, Slurm, PBS Pro
● Join the HPC-SIG
DISCLAIMER
@kelseyhightower :
The problem with most blog posts attempting to compare two different systems is
the author not having the sufficient experience to do so.
https://twitter.com/kelseyhightower/status/826974374536187905
What is Docker?
The leading open source platform to pack, ship and run
apps as lightweight containers.
Developers: use Docker to eliminate “works on my machine” problems when
collaborating on code with co-workers.
Operators: use Docker to run and manage apps side-by-side in isolated
containers to get better compute density.
Enterprises: use Docker to build agile software delivery pipelines to ship new
features faster, more securely and with confidence for both
Linux and Windows Server apps.
What are Docker containers?
• Standardized packaging for software and dependencies
• Isolate apps from each other
• Share the same OS kernel
• Works for all major Linux distributions
• Containers native to Windows Server 2016
Containers and VMs together
Containers and VMs together provide a tremendous amount of
flexibility for IT to optimally deploy and manage apps.
Architecture on Linux
Operating System
Control Groups
(cgroups)
Namespaces
(mnt,pid,ipc,...)
Layer Capabilities
AUFS,overlay,...
Other OS
Functionality
Docker Engine
REST interface
libcontainerd libnetwork storage plugins
containerd + runc
Docker Client Docker Compose Docker Registry Docker Swarm/K8s
Runtime
runc + containerd
● containerd
An industry-standard container runtime with an emphasis on simplicity, robustness and portability.
● runc
CLI tool for spawning and running containers according to the OCI specification
rootfs
config.json
runc executed container
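A runc-executed container is described by an OCI bundle: a `rootfs` directory plus a `config.json`. The sketch below builds a small `config.json`-style document to show the shape of that file. The field names follow my reading of the OCI runtime specification, but the values are illustrative assumptions; a real bundle usually needs more (for example mounts), so compare against the output of `runc spec` before relying on it.

```python
import json

# Namespace type names as used by the OCI runtime specification.
NAMESPACE_TYPES = ["pid", "network", "mount", "ipc", "uts"]


def minimal_oci_config(args, rootfs="rootfs", hostname="cnt0"):
    """Build a small config.json-like dict for an OCI bundle (illustrative only)."""
    return {
        "ociVersion": "1.0.0",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # command to run inside the container
            "env": ["PATH=/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        "root": {"path": rootfs, "readonly": True},  # relative to the bundle
        "hostname": hostname,
        "linux": {"namespaces": [{"type": t} for t in NAMESPACE_TYPES]},
    }


config = minimal_oci_config(["java", "-jar", "/usr/local/bin/appA.jar"])
print(json.dumps(config, indent=2))
```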
libnetwork
Provide IP connectivity
The goal of libnetwork is to deliver a robust Container Network
Model that provides a consistent programming interface and the
required network abstractions for applications.
Plugins
Extend Functionality of the Engine
Framework to ‘intercept’ certain API calls and act on them.
Currently supported drivers:
- VolumeDriver
- NetworkDriver
- IPAMDriver
- LogDriver
- MetricsCollector
- Authorization (authz)
// VolumeDriver
type Driver interface {
    Create(Request) Response
    List(Request) Response
    Get(Request) Response
    Path(Request) Response
    Mount(Request) Response
    Unmount(Request) Response
    Capabilities(Request) Response
}
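To make the request/response pattern of the interface above concrete, here is a toy in-memory volume driver that dispatches on endpoint names and exchanges JSON bodies. It is a sketch under assumptions: the class `InMemoryVolumeDriver`, the base directory `/mnt/volumes`, and the exact JSON field names are illustrative and should be checked against the Docker volume plugin documentation. Nothing is mounted, and only the methods listed on the slide are implemented.

```python
import json


class InMemoryVolumeDriver:
    """Toy model of a volume plugin; it only keeps names in memory."""

    ROUTES = {
        "/VolumeDriver.Create": "create",
        "/VolumeDriver.List": "list",
        "/VolumeDriver.Get": "get",
        "/VolumeDriver.Path": "path",
        "/VolumeDriver.Mount": "mount",
        "/VolumeDriver.Unmount": "unmount",
        "/VolumeDriver.Capabilities": "capabilities",
    }

    def __init__(self, base_dir="/mnt/volumes"):  # hypothetical location
        self.base_dir = base_dir
        self.volumes = {}

    def _volume(self, name):
        return {"Name": name, "Mountpoint": self.base_dir + "/" + name}

    def create(self, req):
        self.volumes[req["Name"]] = req.get("Opts") or {}
        return {"Err": ""}

    def list(self, req):
        return {"Volumes": [self._volume(n) for n in sorted(self.volumes)], "Err": ""}

    def get(self, req):
        if req["Name"] not in self.volumes:
            return {"Err": "no such volume: " + req["Name"]}
        return {"Volume": self._volume(req["Name"]), "Err": ""}

    def path(self, req):
        found = self.get(req)
        if found["Err"]:
            return found
        return {"Mountpoint": found["Volume"]["Mountpoint"], "Err": ""}

    mount = path  # toy behaviour: report the path without mounting anything

    def unmount(self, req):
        return {"Err": ""}

    def capabilities(self, req):
        return {"Capabilities": {"Scope": "local"}}

    def handle(self, endpoint, body=""):
        """Decode a JSON request, call the matching method, encode the reply."""
        method = self.ROUTES.get(endpoint)
        if method is None:
            return json.dumps({"Err": "unsupported endpoint: " + endpoint})
        request = json.loads(body) if body else {}
        return json.dumps(getattr(self, method)(request))


driver = InMemoryVolumeDriver()
print(driver.handle("/VolumeDriver.Create", '{"Name": "data"}'))
print(driver.handle("/VolumeDriver.Mount", '{"Name": "data", "ID": "abc"}'))
```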
Architecture on Windows
Operating System
Other OS
Functionality
Docker Engine
REST interface
libcontainer libnetwork storage plugins
Docker Client Docker Compose Docker Registry Docker Swarm/K8s
Host Compute Service
Control Groups: Job Objects
Namespaces: Object Namespace, Process Table, Networking
Layer Capabilities: Registry, Union-like filesystem extension
Docker is the only Containers-as-a-Service platform for IT that manages and secures
diverse applications across disparate infrastructure, both on-premises and in the cloud
Multi-Architecture
Operations
Infrastructure Independence
Secure Software
Supply Chain
COST SAVINGS
Linux Mainframe AWS Azure Other Public
Clouds
Windows
ENGINE FOR INNOVATION
DOCKER ENTERPRISE EDITION
Docker Enterprise Edition Capabilities
Enterprise Edition
Optimized Container Engine
Integrated App and Cluster
Management
Certification and Support
Policy Management
Image Scanning and
Monitoring
Secure Access and
User Management
Content Trust and
Verification
Application and
Cluster Management
Image Management
Security
Distributed State
Network
Container Runtime
Volumes
Orchestration
Application Composition, Deployment and Reliability
Certified Containers Certified Plugins
Certified Infrastructure
What is rkt?
From the rkt GitHub page: "rkt (pronounced "rock-it") is a CLI for running app
containers on Linux. rkt is designed to be secure, composable, and
standards-based."
#ACI
Why rkt?
● Don’t want to run dockerd daemon.
● Don’t require Docker’s rich feature set/ecosystem.
● Can’t trust Docker security yet.
rkt
# rkt run --insecure-options=image --interactive docker://ubuntu