
OpenCensus with Prometheus and Kubernetes


Korea DevOps MeetUp '19
Date: Saturday, June 8, 2019
Venue: Microsoft, Gwanghwamun
Title: OpenCensus with Prometheus and Kubernetes
Speaker: 김진웅 (ddiiwoong), SK C&C


  1. 1. OpenCensus with Prometheus and Kubernetes Korea DevOps MeetUp '19 김진웅 (ddiiwoong)
  2. 2. About Me 김진웅 @ddiiwoong Cloud Native Platform Architect @SK C&C Interested in Kubernetes and Serverless(FaaS), Dev(Data)Ops, SRE, ML/DL
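  3. 3. Who am I and Where am I? DevOps? Evolution: Data Center, Virtual Machine, Container, Serverless (Weeks, Minutes, Seconds, Milliseconds)
      Stage 1: Self-managed. OS installation and operation, development platform patching, backups, and so on are managed directly.
      Stage 2: Managed. Still server-based, but delivered as a managed service (configuration and scale management).
      Stage 3: Fully managed. A service with no server management (No-Ops).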
  4. 4. Complexity is inevitable Microservices Containerization Orchestration Service Mesh Bare Metal Kernel Network Stack Cloud Stack Libraries Frameworks Your Codes
  5. 5. Monitoring and Troubleshooting with Prometheus
  6. 6. Monitoring and Troubleshooting
      • Cluster (APIs, Etcd, Nodes, VMs or BMs)
      • Network (Service, Ingress, NetworkPolicy, DNS, TLS)
      • Storage (Volumes, PV, PVC, CSI)
      • Code Instrumentation
      • Metrics, Tracing (Cloud Providers, APM Solution, OpenSource)
  7. 7. Metrics: quantitative data that helps measure the performance and quality of applications and services
      • Database, API latency
      • Request content length
      • Open file descriptors
      • Cache hit/miss
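To make the metric types listed on this slide concrete, here is a minimal Go sketch that records a cache hit/miss ratio and times a call. It uses only the standard library; `cacheStats`, `record`, `hitRatio`, and `timed` are hypothetical names for illustration and are not part of OpenCensus or Prometheus.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheStats counts cache lookups. Hypothetical type for illustration only.
type cacheStats struct {
	mu     sync.Mutex
	hits   int
	misses int
}

// record registers one cache lookup as a hit or a miss.
func (s *cacheStats) record(hit bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if hit {
		s.hits++
	} else {
		s.misses++
	}
}

// hitRatio returns hits / (hits + misses), or 0 when nothing was recorded.
func (s *cacheStats) hitRatio() float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	total := s.hits + s.misses
	if total == 0 {
		return 0
	}
	return float64(s.hits) / float64(total)
}

// timed runs fn and returns how long it took (a latency measurement).
func timed(fn func()) time.Duration {
	start := time.Now()
	fn()
	return time.Since(start)
}

func main() {
	stats := &cacheStats{}
	for _, hit := range []bool{true, true, true, false} {
		stats.record(hit)
	}
	// The sleep stands in for a database or API call.
	latency := timed(func() { time.Sleep(10 * time.Millisecond) })

	fmt.Printf("cache hit ratio: %.2f\n", stats.hitRatio())
	fmt.Printf("latency: %v\n", latency)
}
```

In a real service these values would be recorded through an instrumentation library and exported to a backend rather than printed.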
  8. 8. Tracing: shows the application or service structure behind a service request. Visualizes the data flow between all services to identify bottlenecks in the architecture.
  9. 9. OpenCensus A Stats Collection and Distributed Tracing Framework backed by Google and Microsoft since Jan. 2018 A single distribution of libraries that automatically collect traces and metrics from your app, Display them locally, and send them to any backend. (Prometheus, Stackdriver, Zipkin, Jaeger...)
  10. 10. OpenCensus Libraries [Diagram: the Auth., Catalog, Search, and FrontEnd services each embed an oc lib and run in a VM or Kube Pod alongside an oc agent; the agents connect to an oc collector and the metrics + tracing backends.]
  11. 11. OpenCensus: supports a variety of languages and backend applications
  12. 12. OpenCensus Agent: a daemon that enables a centralized exporter implementation for polyglot development and deployment
      • Agent
      • Sidecar
      • Kubernetes DaemonSet
  13. 13. OpenCensus Agent Benefit
      • Single exporter to manage
      • Democratizes deployment: the choice of backend is up to the developer
      • After the initial instrumentation, the backend can be changed at any time (Prometheus to Stackdriver, Zipkin to Jaeger)
      • Reduced overhead: restart only ocagent, without restarting the application
      • Unified observability signals (pass-through): routing of Zipkin, Jaeger, and Prometheus data; easier polyglot and poly-backend management
      • Minimal management ports: TCP 55678
  14. 14. OpenCensus Collector: located close to the application (e.g., same VPC, Availability Zone)
  15. 15. OpenCensus Collector Benefit
      • Single exporter to manage
      • Democratizes deployment: the choice of backend is up to the developer
      • After the initial instrumentation, the backend can be changed at any time (Prometheus to Stackdriver, Zipkin to Jaeger)
      • Limited egress points: centralizes management of the many API keys and TLS settings held inside applications
      • Data delivery to the backend is safeguarded: built-in buffering and retry capabilities
      • Intelligent sampling (percentile)
      • Annotation: metadata can be added while spans are being collected
      • Tagging: tags contained in a span can be overridden or removed
  16. 16. OpenCensus Collector Performance
      • 1 collector (HA configuration) - 24 cores, 48 GB
      • 17.3 billion spans/day (200K spans/second)
      • 57 TB of data a day (3.3 KB per span)
      • 0 dropped spans
      • https://cloud.withgoogle.com/next/sf/sessions?session=268946-130602
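The throughput figures on this slide are consistent with each other, which a short calculation can confirm. The sketch below assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes); the function names are illustrative only.

```go
package main

import "fmt"

const secondsPerDay = 24 * 60 * 60

// spansPerDay converts a sustained span rate into a daily span count.
func spansPerDay(spansPerSecond float64) float64 {
	return spansPerSecond * secondsPerDay
}

// bytesPerDay estimates the daily data volume from the daily span count
// and the average span size in bytes.
func bytesPerDay(spans, bytesPerSpan float64) float64 {
	return spans * bytesPerSpan
}

func main() {
	spans := spansPerDay(200000)         // 200K spans/second
	volume := bytesPerDay(spans, 3300.0) // 3.3 KB per span, assuming 1 KB = 1000 bytes

	fmt.Printf("%.2f billion spans/day\n", spans/1e9) // prints "17.28 billion spans/day"
	fmt.Printf("%.1f TB/day\n", volume/1e12)          // prints "57.0 TB/day"
}
```

So 200K spans/second gives 17.28 billion spans/day, which the slide rounds to 17.3 billion, and at 3.3 KB per span that is about 57 TB a day.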
  17. 17. Demo - Hipster Shop: Cloud-Native Microservices Demo Application
      • A web-based e-commerce application where users can search for and purchase products
  18. 18. Demo - Hipster Shop: Cloud-Native Microservices Demo Application
      • All internal communication uses gRPC; only external communication uses HTTP
      • Polyglot: Go, C#, Node.js, Python, Java
      • Can be configured with Istio
      • Deployed with Skaffold (https://skaffold.dev/)
      • Embedded backends:
        Stackdriver - https://github.com/GoogleCloudPlatform/microservices-demo
        Prometheus - https://github.com/census-ecosystem/opencensus-microservices-demo
      • A load generator (Locust) continuously calls the services
      • Latency is introduced in the GetProduct function of one service (productcatalog)
      • The cause is identified with the backend tools (Prometheus/Jaeger), then the code is fixed and redeployed
  19. 19. Demo - Tracing (Frontend, Go): adding the exporter libraries and initializing the HTTP handler
      import (
          …
          "go.opencensus.io/exporter/jaeger"
          "go.opencensus.io/exporter/prometheus"
          "go.opencensus.io/plugin/ochttp"
          "go.opencensus.io/plugin/ochttp/propagation/b3"
          ...
      )

      func main() {
          …
          var handler http.Handler = r
          handler = &logHandler{log: log, next: handler}
          handler = ensureSessionID(handler)
          // Wrap the handler chain so that incoming requests are traced
          // and trace context is propagated in B3 headers.
          handler = &ochttp.Handler{
              Handler:     handler,
              Propagation: &b3.HTTPFormat{},
          }
          log.Infof("starting server on " + addr + ":" + srvPort)
          log.Fatal(http.ListenAndServe(addr+":"+srvPort, handler))
      }
      https://godoc.org/go.opencensus.io/plugin/ochttp
  20. 20. Demo - Tracing (Frontend, Go): registering the exporter and configuring sampling (https://opencensus.io/stats/sampling/)
      func initJaegerTracing(log logrus.FieldLogger) {
          // Send spans to the Jaeger collector endpoint.
          exporter, err := jaeger.NewExporter(jaeger.Options{
              Endpoint: "http://jaeger:14268",
              Process: jaeger.Process{
                  ServiceName: "frontend",
              },
          })
          if err != nil {
              log.Fatal(err)
          }
          trace.RegisterExporter(exporter)
      }

      ...
          // Fragment of the tracing setup function (its header is not shown on the slide):
          // sample every trace, then register the Jaeger exporter.
          trace.ApplyConfig(trace.Config{
              DefaultSampler: trace.AlwaysSample(),
          })
          initJaegerTracing(log)
          ...
      }
      Supported Sampling Bit
      • AlwaysSample
      • NeverSample
      • Probability
      • RateLimiting
      https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Sampling.md#ratelimiting-sampler-implementation-details
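The demo samples every trace, which suits a low-traffic demo. To illustrate the idea behind the Probability option in the list above, here is a minimal standard-library sketch of a sampler that derives its decision from the trace ID. It is loosely modeled on the common "compare the trace ID against a threshold" approach and is not the OpenCensus implementation; `probabilitySampler` is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// probabilitySampler returns a function that decides from the trace ID alone
// whether a trace is sampled. Hypothetical helper, not the OpenCensus API.
func probabilitySampler(fraction float64) func(traceID [16]byte) bool {
	if fraction <= 0 {
		return func([16]byte) bool { return false }
	}
	if fraction >= 1 {
		return func([16]byte) bool { return true }
	}
	// Threshold within the 63-bit range; fraction < 1 keeps it below 2^63.
	upperBound := uint64(fraction * (1 << 63))
	return func(traceID [16]byte) bool {
		// Use the upper 63 bits of the first 8 bytes of the trace ID.
		x := binary.BigEndian.Uint64(traceID[0:8]) >> 1
		return x < upperBound
	}
}

func main() {
	sample := probabilitySampler(0.5)

	var low [16]byte // all zero bytes: falls below the threshold
	var high [16]byte
	for i := range high {
		high[i] = 0xFF // all 0xFF bytes: falls above the threshold
	}

	fmt.Println(sample(low))  // prints "true"
	fmt.Println(sample(high)) // prints "false"
}
```

Because the decision depends only on the trace ID, every service that applies the same fraction reaches the same decision for a given trace.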
  21. 21. Demo - Tracing (AdService, Java): registering the exporter
      import io.opencensus.exporter.trace.jaeger.JaegerTraceExporter;

      public static void main(String[] args) throws IOException, InterruptedException {
          ...
          // Register Jaeger Tracing.
          JaegerTraceExporter.createAndRegister("http://jaeger:14268/api/traces", "adservice");
          ...
          final AdService service = AdService.getInstance();
          service.start();
          service.blockUntilShutdown();
      }
      There is no trace.AlwaysSample() call here because the sampling decision is propagated from the Frontend.
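The reason a downstream service needs no sampler of its own can be sketched with the B3 headers that the frontend propagates. The following standard-library Go sketch shows a service inheriting the caller's decision from `X-B3-Sampled`; `parseSampled` and `shouldSample` are hypothetical helpers, and real propagation is handled by the instrumentation library, not by hand-written code like this.

```go
package main

import (
	"fmt"
	"net/http"
)

// Header names defined by the B3 propagation format.
const (
	headerTraceID = "X-B3-TraceId"
	headerSpanID  = "X-B3-SpanId"
	headerSampled = "X-B3-Sampled"
)

// parseSampled interprets an X-B3-Sampled value.
// ok is false when the caller did not send a decision.
func parseSampled(value string) (sampled, ok bool) {
	switch value {
	case "1":
		return true, true
	case "0":
		return false, true
	default:
		return false, false
	}
}

// shouldSample inherits the caller's decision when one is present and
// falls back to the local default otherwise.
func shouldSample(h http.Header, localDefault bool) bool {
	if sampled, ok := parseSampled(h.Get(headerSampled)); ok {
		return sampled
	}
	return localDefault
}

func main() {
	// Headers as an upstream service might send them (example IDs).
	incoming := http.Header{}
	incoming.Set(headerTraceID, "463ac35c9f6413ad48485a3953bb6124")
	incoming.Set(headerSpanID, "a2fb4a1d1a96d312")
	incoming.Set(headerSampled, "1")

	fmt.Println(shouldSample(incoming, false))      // prints "true": decision inherited
	fmt.Println(shouldSample(http.Header{}, false)) // prints "false": local default
}
```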
  22. 22. Demo - Metrics (Frontend, Go): registering the exporter and the gRPC views
      func initPrometheusStatsExporter(log logrus.FieldLogger) *prometheus.Exporter {
          exporter, err := prometheus.NewExporter(prometheus.Options{})
          if err != nil {
              log.Fatal("error registering prometheus exporter")
              return nil
          }
          view.RegisterExporter(exporter)
          return exporter
      }

      // Serve the collected stats on :9090/metrics for Prometheus to scrape.
      func startPrometheusExporter(log logrus.FieldLogger, exporter *prometheus.Exporter) {
          addr := ":9090"
          log.Infof("starting prometheus server at %s", addr)
          http.Handle("/metrics", exporter)
          log.Fatal(http.ListenAndServe(addr, nil))
      }

      func initStats(log logrus.FieldLogger) {
          // Start prometheus exporter
          exporter := initPrometheusStatsExporter(log)
          go startPrometheusExporter(log, exporter)

          // Register the default HTTP server views and gRPC client views.
          if err := view.Register(ochttp.DefaultServerViews...); err != nil {
              log.Fatal("error registering default http server views")
          }
          if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
              log.Fatal("error registering default grpc client views")
          }
      }
      https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
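The exporter above serves metrics at `/metrics` for Prometheus to scrape. As a rough illustration of what such an endpoint returns, here is a standard-library sketch that renders one counter in the Prometheus text exposition format. The metric name `demo_requests_total` and the helper names are made up for this example; the real exporter derives its output from the registered views.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// requestsTotal is a hypothetical counter standing in for a real view.
var requestsTotal uint64

// renderMetrics formats a single counter in the Prometheus text exposition format.
func renderMetrics(count uint64) string {
	return fmt.Sprintf(
		"# HELP demo_requests_total Requests handled.\n"+
			"# TYPE demo_requests_total counter\n"+
			"demo_requests_total %d\n", count)
}

// metricsHandler serves the current counter value.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics(atomic.LoadUint64(&requestsTotal)))
}

func main() {
	atomic.AddUint64(&requestsTotal, 3)

	// Exercise the handler in-process instead of opening a port.
	rec := httptest.NewRecorder()
	metricsHandler(rec, httptest.NewRequest("GET", "/metrics", nil))
	fmt.Print(rec.Body.String())
}
```

In a running service the handler would be mounted with `http.HandleFunc("/metrics", metricsHandler)` and served on the scrape port, as the slide does with `:9090`.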
  23. 23. Demo - Metrics (AdService, Java): registering the exporter
      import io.opencensus.exporter.stats.prometheus.PrometheusStatsCollector;
      import io.prometheus.client.exporter.HTTPServer;

      public static void main(String[] args) throws IOException, InterruptedException {
          ...
          // Register Prometheus exporters and export metrics to a Prometheus HTTPServer.
          PrometheusStatsCollector.createAndRegister();
          HTTPServer prometheusServer = new HTTPServer(9090, true);
          ...
          final AdService service = AdService.getInstance();
          service.start();
          service.blockUntilShutdown();
      }
      https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
  24. 24. Demo - Metrics (AdService, Java): gRPC views
      /** Main launches the server from the command line. */
      public static void main(String[] args) throws IOException, InterruptedException {
          ...
          // Registers all RPC views.
          RpcViews.registerAllViews();
          ...
      }
      https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
  25. 25. Demo - Trace Monitoring (the problem)
  26. 26. Demo - Stats Monitoring (the problem)
  27. 27. Demo - Code Tuning: replace the parseCatalog() calls with the products variable to cut time across the whole logic, then redeploy (skaffold)
      $ skaffold run --default-repo=gcr.io/cloudrun-237814 -n default
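The tuning on this slide amounts to parsing the catalog once and reusing the result instead of parsing it on every request. The sketch below shows one way to express that in Go with `sync.Once`. It is an assumption-laden illustration, not the demo's actual code: the `product` type, the sample data, and the `cachedProducts` and `getProduct` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// product is a simplified, hypothetical catalog entry.
type product struct {
	ID   string
	Name string
}

var (
	parseCalls int // counts how often the expensive parse runs
	once       sync.Once
	products   []product
)

// parseCatalog stands in for the expensive step (reading and decoding the catalog).
func parseCatalog() []product {
	parseCalls++
	return []product{{ID: "p1", Name: "example product"}}
}

// cachedProducts parses the catalog on first use and reuses the result afterwards.
func cachedProducts() []product {
	once.Do(func() { products = parseCatalog() })
	return products
}

// getProduct looks up a product without re-parsing the catalog.
func getProduct(id string) (product, bool) {
	for _, p := range cachedProducts() {
		if p.ID == id {
			return p, true
		}
	}
	return product{}, false
}

func main() {
	for i := 0; i < 3; i++ {
		getProduct("p1")
	}
	fmt.Println(parseCalls) // prints "1": the catalog was parsed only once
}
```

A cache like this trades freshness for latency, so it fits data that changes rarely or that is reloaded explicitly.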
  28. 28. Demo - Trace Monitoring
  29. 29. Demo - Stats Monitoring
  30. 30. Wrap Up
      • Consider how to make use of the OpenCensus Agent and Collector
      • App. SRE - SLI (Service Level Indicator), SLO (Service Level Objective)
      • Extend with application custom metrics
      • Extend to Istio https://github.com/census-instrumentation/opencensus-service/blob/master/DESIGN.md#implementation-details-of-agent-server
      • OpenTracing + OpenCensus = OpenTelemetry https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
  31. 31. OpenTracing Vendor-neutral APIs and instrumentation for distributed tracing. Open standard for distributed tracing. Libraries available in 9 languages (Go, JavaScript, Java, Python, Ruby, PHP, Objective-C, C++, C#)
  32. 32. OpenTelemetry: The next major version of OpenTracing and OpenCensus (OpenTracing + OpenCensus = OpenTelemetry)
  33. 33. Q&A @ddiiwoong @ddiiwoong ddiiwoong@gmail.com https://ddii.dev
