Covers an overview of CoreOS and the current status of CoreOS projects. Presented at the Open Source Meetup, Bangalore (http://www.meetup.com/Bangalore-Open-Source-Meetup/events/229763724/).
1. COREOS OVERVIEW AND
CURRENT STATUS
Presenter Name: Sreenivas Makam
Presented at: Open source Meetup Bangalore
Presentation Date: April 16, 2016
2. About me
• Senior Engineering Manager at Cisco Systems, Data Center group
• Personal blog can be found at https://sreeninet.wordpress.com/ and my hacky code at https://github.com/smakam
• Author of “Mastering CoreOS” book, published in Feb 2016 (https://www.packtpub.com/networking-and-servers/mastering-coreos)
• You can reach me on LinkedIn at https://in.linkedin.com/in/sreenivasmakam, Twitter handle - @srmakam
3. Container Optimized OS - Characteristics
• All applications run as Containers on top of the OS.
• The base OS should be small and fast to boot.
• Services are managed by an init system, e.g. Systemd.
• The OS needs to have an auto-update strategy, like web browsers do.
• Systems need to be managed as a cluster rather than as individual nodes.
• Built-in support should be present for Service discovery, Container networking, and Orchestration.
• Examples – CoreOS, RancherOS, Red Hat Atomic, Ubuntu Snappy, Mesos DCOS, VMware Photon
4. CoreOS
• First Container-optimized OS; the first release was in July 2013.
• Linux based and built on concepts from ChromeOS.
• The OS is security focused.
• Auto-updates the OS using an A/B partition scheme.
• The OS is open source. Along with the OS, CoreOS has the following components:
– Systemd as the init system
– Etcd as a distributed key-value store
– Fleet as the cluster service scheduler
– Flannel for Container networking
– Docker and Rkt for Containers
• Has commercial components like Tectonic and Quay.
• CoreOS integrates well with Kubernetes.
6. CoreOS Automatic Update
• The CoreOS update mechanism is based on Google's open source Omaha protocol (https://code.google.com/p/omaha/), which is also used in the Chrome browser.
• A dual-partition scheme is used to achieve automatic updates and to handle upgrade-related failures.
• The critical CoreOS services that handle upgrades are update-engine and locksmithd.
• Etcd is used to serialize upgrades across a CoreOS cluster.
• Users can control upgrades using different reboot strategies such as etcd-lock, reboot, and best-effort.
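As a minimal sketch (assuming the standard CoreOS cloud-config keys), the reboot strategy is selected in the node's cloud-config; with etcd-lock, a node takes a cluster-wide lock in etcd before rebooting into the new version:

```yaml
#cloud-config
coreos:
  update:
    # one of: etcd-lock, reboot, best-effort
    reboot-strategy: etcd-lock
```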
7. Cluster Architecture - Development
• A single cluster runs critical services and application Containers.
• An odd cluster size gives better fault tolerance.
• Used for development purposes.
• Alternatively, etcd runs on a separate node and application Containers run on Worker nodes.
• Etcd on the Worker nodes proxies to the main etcd node.
Cluster 1
Cluster 2
Picture from https://coreos.com/os/docs/latest/cluster-architectures.html
8. Cluster architecture - Production
Cluster 3
Fault Tolerance
• This cluster architecture is suited for production.
• Redundancy is available for core services as well as for application Containers.
• Services are scheduled based on node metadata (e.g., services, worker).
Picture from https://coreos.com/os/docs/latest/cluster-architectures.html
9. CoreOS Release cycle
• Alpha, Beta, and Stable are the release channels within CoreOS.
• CoreOS releases progress through the channels in this order: Alpha -> Beta -> Stable.
• All releases start as Alpha; promotion to Beta and Stable happens on the basis of testing.
• The major version number (for example, 1000 in 1000.0.0) is the number of days since July 13, 2013, the CoreOS epoch.
core@core-01 ~ $ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=1000.0.0
VERSION_ID=1000.0.0
BUILD_ID=2016-03-27-0652
PRETTY_NAME="CoreOS 1000.0.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
core@core-01 ~ $ uname -a
Linux core-01 4.5.0-coreos-r1 #2 SMP Thu Apr 7 03:52:11 UTC 2016 x86_64 Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz GenuineIntel GNU/Linux
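The version arithmetic can be checked with GNU date (a sketch assuming a GNU coreutils environment): 1000 days after the CoreOS epoch lands in early April 2016, which matches the 1000.0.0 release timeframe shown above.

```shell
# Major version = days since the CoreOS epoch (July 13, 2013).
# Compute the calendar date that is 1000 days after the epoch:
date -u -d "2013-07-13 +1000 days" +%Y-%m-%d   # prints 2016-04-08
```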
10. Systemd
• Systemd is the init system used by CoreOS to start, stop, and manage processes.
• Dependencies between services can be specified easily inside unit files.
• All processes inside a single Systemd unit, including forked processes, run in one cgroup.
• Systemd can monitor the health of a service and restart it if it dies.
• Common Systemd unit types: service, socket, device, and mount.
11. Systemd unit file example
Fleet.service:
[Unit]
Description=fleet daemon
After=etcd.service
After=etcd2.service
Wants=fleet.socket
After=fleet.socket
[Service]
User=fleet
Environment=GOMAXPROCS=1
ExecStart=/usr/bin/fleetd
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
Slide annotations: After= – ordering; Wants= – dependency; Restart=/RestartSec= – service restartability; WantedBy= – unit grouping.
12. Systemd unit execution
Hello1.service:
[Unit]
Description=My Service
After=docker.service
[Service]
TimeoutStartSec=0
KillMode=none
ExecStartPre=-/usr/bin/docker kill hello1
ExecStartPre=-/usr/bin/docker rm hello1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name hello1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
Restart=always
RestartSec=30s
ExecStop=/usr/bin/docker stop hello1
[Install]
WantedBy=multi-user.target
Execute service:
1. Copy hello1.service to /etc/systemd/system (using sudo).
2. Enable the service:
sudo systemctl enable hello1.service
3. Start hello1.service:
sudo systemctl start hello1.service
Hello1.service Status:
core@core-01 /etc/systemd/system $ systemctl status hello1.service
● hello1.service - My Service
Loaded: loaded (/etc/systemd/system/hello1.service; disabled;
vendor preset: disabled)
Active: active (running) since Mon 2016-04-04 15:38:29 UTC;
2min 38s ago
Process: 1069 ExecStartPre=/usr/bin/docker pull busybox
(code=exited, status=0/SUCCESS)
Process: 1060 ExecStartPre=/usr/bin/docker rm hello1
(code=exited, status=1/FAILURE)
Process: 961 ExecStartPre=/usr/bin/docker kill hello1
(code=exited, status=1/FAILURE)
Main PID: 1093 (docker)
Memory: 33.1M
CPU: 634ms
CGroup: /system.slice/hello1.service
└─1093 /usr/bin/docker run --name hello1 busybox /bin/sh
-c while true; do echo Hello World;...
Apr 04 15:40:59 core-01 docker[1093]: Hello World
13. etcd
• Distributed key-value store used by CoreOS nodes.
• Uses the Raft consensus algorithm to maintain a highly available cluster.
• Etcd is used outside CoreOS by companies like Cloud Foundry.
• Fleet, Flannel, and Kubernetes use etcd to communicate configuration and monitoring data.
• Etcd can be used for Service discovery.
• Etcd exposes a REST API and also provides etcdctl for CLI access.
14. Etcd examples
Member list:
core@core-01 ~ $ etcdctl member list
156d6ef020d72caa: name=a4562fd1d8d346b79e80ca467d82b7ee peerURLs=http://172.17.8.101:2380 clientURLs=http://172.17.8.101:2379
2ba42fe58ee8dc90: name=d29b150786c04239937794f87985cd76 peerURLs=http://172.17.8.102:2380 clientURLs=http://172.17.8.102:2379
72ab4a0a6ba49052: name=7a895214d7274c0f8c362188c3bf8eb8 peerURLs=http://172.17.8.103:2380 clientURLs=http://172.17.8.103:2379
Set, get, delete example:
core@core-01 ~ $ etcdctl set /message hello
hello
core@core-02 ~ $ etcdctl get /message
hello
core@core-01 ~ $ etcdctl rm /message
PrevNode.Value: hello
Get all keys:
core@core-01 ~ $ etcdctl ls / --recursive
/coreos.com
/coreos.com/updateengine
/coreos.com/updateengine/rebootlock
/coreos.com/updateengine/rebootlock/semaphore
/message
TTL expiry:
core@core-01 ~ $ etcdctl set /expirymessage "I will expire soon"
--ttl 30
I will expire soon
core@core-02 ~ $ etcdctl get /expirymessage
I will expire soon
core@core-02 ~ $ etcdctl get /expirymessage
Error: 100: Key not found (/expirymessage) [51952]
Watch:
Node 1:
core@core-01 ~ $ etcdctl set /didichange "not changed"
not changed
core@core-01 ~ $ etcdctl get /didichange
not changed
core@core-01 ~ $ etcdctl watch /didichange --recursive
----
Node 2:
core@core-02 ~ $ etcdctl set /didichange "changed"
changed
----
Node 1:
[set] /didichange
changed
15. Fleet
• Init system for the CoreOS cluster.
• Fleet uses etcd for communication across the cluster.
• Fleet was originally planned as an Orchestrator. After Kubernetes took the role of Orchestrator, Fleet is used only for scheduling system services; for example, Kubernetes services are scheduled using Fleet.
• Fleet is not under active development and is mostly in maintenance mode.
• The Fleet service exposes a REST API, and the CLI can be accessed using fleetctl.
16. Fleet Architecture
• One master Fleet engine is elected in the CoreOS cluster using etcd.
• The Fleet master talks to the Fleet agents on each node and schedules the jobs.
• Fleet also takes care of Service high availability at the cluster level.
• Fleet global units get scheduled on all nodes in the cluster.
• Fleet can schedule units onto the appropriate nodes in the cluster based on metadata.
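As a sketch of metadata-based scheduling (the unit name and metadata value here are hypothetical), an [X-Fleet] section can restrict a unit to nodes whose Fleet metadata matches:

```ini
# hellometadata.service (hypothetical unit) – runs only on nodes
# whose fleet metadata includes role=worker
[Unit]
Description=Metadata-scheduled service
After=docker.service

[Service]
ExecStart=/usr/bin/docker run --name hellometa busybox /bin/sh -c "while true; do echo Hello; sleep 1; done"
ExecStop=/usr/bin/docker stop hellometa

[X-Fleet]
MachineMetadata=role=worker
```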
17. Fleet global unit example
helloglobal.service:
[Unit]
Description=My Service
After=docker.service
[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill hello
ExecStartPre=-/usr/bin/docker rm hello
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name hello busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop hello
[X-Fleet]
Global=true
Start Unit:
fleetctl start helloglobal.service
Fleet Status:
core@core-01 ~ $ fleetctl list-units
UNIT MACHINE ACTIVE SUB
helloglobal.service 7a895214.../172.17.8.103 active running
helloglobal.service a4562fd1.../172.17.8.101 active running
helloglobal.service d29b1507.../172.17.8.102 active running
Systemd status:
core@core-01 ~ $ systemctl status helloglobal.service
● helloglobal.service - My Service
Loaded: loaded (/run/fleet/units/helloglobal.service; linked-
runtime; vendor preset: disabled)
Active: active (running) since Mon 2016-04-04 17:28:17 UTC; 1min
44s ago
Process: 7702 ExecStartPre=/usr/bin/docker pull busybox
(code=exited, status=0/SUCCESS)
Process: 7693 ExecStartPre=/usr/bin/docker rm hello (code=exited,
status=1/FAILURE)
Process: 7686 ExecStartPre=/usr/bin/docker kill hello (code=exited,
status=1/FAILURE)
Main PID: 7716 (docker)
Memory: 9.4M
CPU: 444ms
CGroup: /system.slice/helloglobal.service
└─7716 /usr/bin/docker run --name hello busybox /bin/sh -c
while true; do echo Hello World; ...
18. Flannel
• Flannel uses an Overlay network to allow Containers across different hosts to talk to each other.
• Both Docker and Rkt Containers in CoreOS use Flannel for Container networking.
• Projects like Kubernetes use Flannel for Container networking.
• Flannel can be implemented as a Container Network Interface (CNI) plugin, as shown in the diagram.
19. Flannel Architecture
Flannel Control path
Flannel Data path
• The Flannel agents on each node communicate with each other using etcd.
• The Flannel agent on each node gets an individual subnet, and the Containers on that node get IP addresses from that subnet.
• Inter-host Container traffic gets encapsulated in either UDP or VXLAN by the Flannel bridge on each node.
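As a sketch (the address range and backend are illustrative), the cluster-wide Flannel configuration is a JSON document stored in etcd under the key /coreos.com/network/config (e.g. set with etcdctl); Flannel carves each node's subnet out of this range:

```json
{
  "Network": "10.1.0.0/16",
  "Backend": { "Type": "vxlan" }
}
```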
20. Rkt
• Rkt is a Container runtime developed by CoreOS.
• CoreOS created Rkt to have a better security model for Containers and to drive an industry standard.
• Rkt uses the Application Container Image (ACI) format, which follows the APPC specification.
• Rkt does not have a daemon (unlike Docker) and is managed by Systemd.
• Rkt uses multiple stages to run Containers, so that individual stages can be swapped with other implementations:
– The first stage does Container image discovery and retrieval.
– The second stage sets up the Container runtime environment (filesystem, cgroups).
– The third stage runs the Container.
21. Rkt Examples
Run Docker image as Rkt Container:
core@core-02 ~ $ sudo rkt run --insecure-options=image --interactive docker://busybox
image: using image from local store for image name coreos.com/rkt/stage1-coreos:1.2.1
image: remote fetching from URL "docker://busybox"
Downloading sha256:385e281300c: [==============================] 676 KB/676 KB
Downloading sha256:a3ed95caeb0: [==============================] 32 B/32 B
networking: loading networks from /etc/rkt/net.d
networking: loading network default with type ptp
/ #
List pods:
core@core-02 ~ $ rkt list
UUID      APP      IMAGE NAME                                   STATE    CREATED        STARTED        NETWORKS
6bafd510  busybox  registry-1.docker.io/library/busybox:latest  running  6 minutes ago  6 minutes ago  default:ip4=172.16.28.2
List images:
core@core-02 ~ $ rkt image list
ID                   NAME                                         IMPORT TIME     LAST USED       SIZE    LATEST
sha512-c0ab44d87499  coreos.com/rkt/stage1-coreos:1.2.1           10 minutes ago  10 minutes ago  152MiB  false
sha512-cdb74a334f97  registry-1.docker.io/library/busybox:latest  10 minutes ago  10 minutes ago  2.4MiB  true
Rkt Container in Systemd:
[Unit]
Description=nginx
[Service]
# Resource limits
CPUShares=512
MemoryLimit=1G
# Prefetch the image
ExecStartPre=/usr/bin/rkt fetch --insecure-skip-verify docker://nginx
ExecStart=/usr/bin/rkt run --insecure-skip-verify --private-net --port=80-tcp:8080 --volume volume-var-cache-nginx,kind=host,source=/home/core docker://nginx
KillMode=mixed
Restart=always
22. CoreOS Installation
• CoreOS can be installed on bare metal using an ISO image or PXE boot.
• Works with the majority of Cloud providers – AWS, Google Cloud, DigitalOcean, Azure.
• Works with bare-metal clouds like Packet.
• CoreOS can be deployed on an Openstack cloud.
• A Vagrant-based installation can be used for development purposes.
• The base configuration of a CoreOS node can be provided using a cloud-config YAML file. The cloud-init program in the CoreOS system takes care of applying the configuration.
• For Cloud providers, the system cloud-config is embedded in the base image, and the user cloud-config is provided by the user.
23. Cloud-config sample
#cloud-config
coreos:
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    discovery: https://discovery.etcd.io/99ce5d499df6a4a9c004e8596687ccc7
    # multi-region and multi-cloud deployments need to use $public_ipv4
    advertise-client-urls: http://$public_ipv4:2379
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on both the official ports and the legacy ports
    # legacy ports can be omitted if your application doesn't depend on them
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380,http://$private_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
Slide annotations: discovery – dynamic node discovery; etcd2 settings – Etcd environment variables; fleet public-ip – Fleet environment variable; units – services to be started automatically.
24. CoreOS Ecosystem
• Supports Docker for running Containers.
• Integrates nicely with Kubernetes.
• Mesos can run on top of CoreOS.
• Kubernetes can be used to orchestrate Docker and Rkt Containers.
• CoreOS can be used with Openstack.
25. Tectonic – GIFEE (Google Infrastructure For Everyone Else)
Tectonic Distributed Trusted Computing
• At the firmware level, a customer key can be embedded, which allows customers to verify all the software running on the system.
• Secure keys embedded in the firmware can verify the bootloader as well as CoreOS.
• Containers such as Rkt can be verified with their image signatures.
• Logs can be made tamper-proof using the TPM hardware module embedded on the CPU motherboard.
• Tectonic integrates CoreOS’s open source components and Kubernetes, along with some integration software, to provide an Enterprise-ready microservices infrastructure platform.
• With Tectonic, CoreOS is integrating its other commercial offerings, such as CoreUpdate, the Quay repository, and Enterprise CoreOS.
• Tectonic will support both the Docker and Rkt Container runtimes.
Distributed Trusted Computing (Image from https://tectonic.com/)
Tectonic Software Components
26. Newer projects
• Ignition – a better cloud-config.
• Clair – Container image scanning service for vulnerabilities.
• DEX – identity management.
• Openstack deployment on Tectonic (https://tectonic.com/blog/openstack-and-kubernetes-come-together.html)