Enabling Security via Container Runtimes

A talk given at the Google-hosted Container Security Summit on Wednesday, February 12th, 2020 in Seattle, Washington. The talk covered how work done in the low-level runtime layer, and in the layers above it such as cri-o, containerd, and Docker, brings specific security features to platforms like Kubernetes.

  1. 1. @estesp Enabling security via runtimes: A container runtime perspective on security
      Phil Estes
      Distinguished Engineer & CTO, IBM Cloud Platform
      CNCF containerd project maintainer
      OCI Technical Oversight Board chair
  2. 2. @estesp Runtimes circa 2014
      ● No seccomp
      ● No pids limit
      ● No user namespaces
      ● No rootless
      ● No content-addressable image format
      ● No sandboxes
      ● ...
  3. 3. @estesp Runtime Reorganization
      ● Open Container Initiative (OCI) and CNCF formed: Summer 2015
      ● libcontainer project donated to become “runc”
      ● Docker architecturally splits into the engine, containerd, and runc
      ● Later, containerd and cri-o emerge as standalone runtimes without Docker engine
  4. 4. @estesp Runtime Security Progress
      [Timeline figure; axis labels: 2015, Early 2016, Late 2016, 2017]
      ● PIDS controller
      ● SECCOMP support
      ● Docker 1.10
        ○ user namespaces, seccomp profiles, auth plugins
        ○ New content-addressable image format (v2.2)
      ● No new privs support
      ● User NS nesting
      ● Ambient caps
      ● readonly+userns
      ● rootless runc
  5. 5. @estesp Secure image layer hashes, new manifest
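A content-addressable image format identifies each layer by a cryptographic hash of its bytes, so a runtime can check what it pulled against what the manifest promised. The following is a minimal sketch of that idea using SHA-256 from the Python standard library; the `layer` bytes and the `manifest_entry` dictionary are hypothetical stand-ins, not the actual manifest schema.

```python
import hashlib


def digest(blob: bytes) -> str:
    """Return a digest string of the form 'sha256:<hex>' for a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def verify(blob: bytes, expected: str) -> bool:
    """Check fetched bytes against the digest recorded for them."""
    algorithm, _, hex_value = expected.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(blob).hexdigest() == hex_value


# Hypothetical layer content; a real layer would be a (compressed) tarball.
layer = b"example layer bytes"

# Simplified, assumed shape of a manifest entry pointing at the layer.
manifest_entry = {"digest": digest(layer), "size": len(layer)}

if __name__ == "__main__":
    print(manifest_entry["digest"])
    print("intact layer verifies:", verify(layer, manifest_entry["digest"]))
    print("tampered layer verifies:", verify(layer + b"!", manifest_entry["digest"]))
```

Because the digest is derived from the content itself, any change to the layer bytes changes the identifier, which is what lets tampering be detected at pull time.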
  6. 6. @estesp User namespaces, Phase 1
  7. 7. @estesp Resources, Attack Surface, ...Privilege!
      Rootless containers refers to the ability for an unprivileged user to create, run and otherwise manage containers.
      ● Docker “Shocker” 2014
      ● Docker CVE-2014-9357
      ● Containerd #2001 (2018)
      ● Runc #1962 (2019)
      ● K8s CVE-2017-1002101, CVE-2017-1002102
      ● K8s CVE-2018-1002105
      ● …?
  8. 8. @estesp Rootless Containers
      ● Built on Linux kernel user namespaces functionality
        ○ An unprivileged user manages a user and group range in which containers will run
      ● Limitations for privileged actions handled via userspace functionality
        ○ slirp4netns - required for network creation/interactions
        ○ fuse-overlayfs - required for root filesystem interactions and handles UID/GID shifts
        ○ cgroups - cannot be managed from rootless containers; cgroup v2 is the solution
      ● Upstream kernel and related projects (overlayfs) continue work to remove limitations/performance impacts
      ● Available at all levels (runc, container runtimes, builders, Kubernetes***)
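The "user and group range" above can be illustrated with a small sketch. It assumes the common layout where the unprivileged user's own UID becomes root (UID 0) inside the container and the subordinate range from an `/etc/subuid`-style entry (`name:start:count`) covers the remaining container UIDs. The user name `alice`, UID 1000, and the range shown are hypothetical values; the helper functions are illustrative and not taken from any runtime.

```python
from typing import List, Optional, Tuple

# (first container UID, first host UID, length), as in /proc/<pid>/uid_map
MapEntry = Tuple[int, int, int]


def parse_subid(text: str, user: str) -> List[Tuple[int, int]]:
    """Collect (start, count) ranges for `user` from subuid-style text."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, start, count = line.split(":")
        if name == user:
            ranges.append((int(start), int(count)))
    return ranges


def build_uid_map(host_uid: int, ranges: List[Tuple[int, int]]) -> List[MapEntry]:
    """Map container UID 0 to the user, then lay the subordinate ranges after it."""
    mapping: List[MapEntry] = [(0, host_uid, 1)]
    next_container_uid = 1
    for start, count in ranges:
        mapping.append((next_container_uid, start, count))
        next_container_uid += count
    return mapping


def to_host_uid(mapping: List[MapEntry], container_uid: int) -> Optional[int]:
    """Translate a container UID to the host UID, or None if unmapped."""
    for inside, outside, length in mapping:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Hypothetical subordinate ID allocation for an unprivileged user.
SUBUID_TEXT = "alice:100000:65536\n"
uid_map = build_uid_map(1000, parse_subid(SUBUID_TEXT, "alice"))

if __name__ == "__main__":
    for entry in uid_map:
        print("%d %d %d" % entry)
    print("container root is host uid", to_host_uid(uid_map, 0))
```

Under these assumptions, a process that is root inside the container is only UID 1000 on the host, which is the privilege reduction rootless mode is after.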
  10. 10. @estesp Sandboxes: Linux Containers++ What if we could consolidate the developer experience of containers with the isolation guarantees of virtual machines?
  11. 11. @estesp Runtime shim v2 API
      ● Minimal and scoped to the execution lifecycle of a container
      ● Binary naming convention
        ○ Type io.containerd.runsc.v1 -> Binary containerd-shim-runsc-v1
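The naming convention on this slide can be sketched as a small function: take the last two dot-separated components of the runtime type (name and version) and prefix them with `containerd-shim-`. This is an illustration of the convention as stated on the slide, not containerd's own code; the function name `shim_binary_name` is hypothetical.

```python
def shim_binary_name(runtime_type: str) -> str:
    """Derive the shim binary name from a runtime type string.

    Example: 'io.containerd.runsc.v1' -> 'containerd-shim-runsc-v1'
    """
    parts = runtime_type.split(".")
    # Need at least '<name>.<version>' and no empty components.
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"invalid runtime type: {runtime_type!r}")
    name, version = parts[-2], parts[-1]
    return f"containerd-shim-{name}-{version}"


if __name__ == "__main__":
    print(shim_binary_name("io.containerd.runsc.v1"))
```

With a convention like this, a sandbox runtime can be plugged in by placing a suitably named binary on the path, without changes to the daemon itself.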
  12. 12. @estesp Runtime Plugins - Task Service
      service Task {
        rpc State(StateRequest) returns (StateResponse);
        rpc Create(CreateTaskRequest) returns (CreateTaskResponse);
        rpc Start(StartRequest) returns (StartResponse);
        rpc Delete(DeleteRequest) returns (DeleteResponse);
        rpc Pids(PidsRequest) returns (PidsResponse);
        rpc Pause(PauseRequest) returns (google.protobuf.Empty);
        rpc Resume(ResumeRequest) returns (google.protobuf.Empty);
        rpc Checkpoint(CheckpointTaskRequest) returns (google.protobuf.Empty);
        rpc Kill(KillRequest) returns (google.protobuf.Empty);
        rpc Exec(ExecProcessRequest) returns (google.protobuf.Empty);
        rpc ResizePty(ResizePtyRequest) returns (google.protobuf.Empty);
        rpc CloseIO(CloseIORequest) returns (google.protobuf.Empty);
        rpc Update(UpdateTaskRequest) returns (google.protobuf.Empty);
        rpc Wait(WaitRequest) returns (WaitResponse);
        rpc Stats(StatsRequest) returns (StatsResponse);
        rpc Connect(ConnectRequest) returns (ConnectResponse);
        rpc Shutdown(ShutdownRequest) returns (google.protobuf.Empty);
      }
  13. 13. @estesp Using Sandboxes ● RuntimeClass + CRI provide a way to utilize custom sandboxes
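As a rough illustration of how RuntimeClass ties a pod to a sandbox, the sketch below builds two Kubernetes objects as plain dictionaries and prints them as JSON (which `kubectl apply -f` accepts alongside YAML). The names `sandboxed`, `runsc`, `sandboxed-pod`, and the `nginx` image are hypothetical, and the `handler` value is assumed to match a runtime handler configured in the node's CRI implementation. The `node.k8s.io/v1` API version is also an assumption; clusters contemporary with this talk served RuntimeClass under a beta version.

```python
import json

# RuntimeClass: a cluster-level name that points at a CRI runtime handler.
runtime_class = {
    "apiVersion": "node.k8s.io/v1",  # assumption: older clusters use a beta version
    "kind": "RuntimeClass",
    "metadata": {"name": "sandboxed"},  # hypothetical class name
    "handler": "runsc",  # assumed to match a handler configured on the node
}

# Pod: opts into the sandbox by naming the RuntimeClass.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "sandboxed-pod"},  # hypothetical pod name
    "spec": {
        "runtimeClassName": runtime_class["metadata"]["name"],
        "containers": [{"name": "app", "image": "nginx"}],  # hypothetical workload
    },
}

if __name__ == "__main__":
    for obj in (runtime_class, pod):
        print(json.dumps(obj, indent=2))
```

The pod never names a shim binary directly; the indirection through `runtimeClassName` and `handler` is what lets cluster operators swap sandbox implementations per node.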
  14. 14. @estesp Encrypted Container Layers
      ● Provide a method to encrypt specific image layers with target keys for secure delivery to approved runtime nodes
      ● Implemented components available/underway in:
        ○ containers/image, cri-o, buildah, containerd, generic library implementation
        ○ Kubernetes KEP under review (https://github.com/kubernetes/enhancements/pull/1066)
        ○ Proposed update to OCI image-spec (https://github.com/opencontainers/image-spec/pull/775)
  15. 15. @estesp What’s Left?
  16. 16. @estesp Thank You!