OpenCensus with Prometheus and Kubernetes
April 2019
김진웅
About Me
김진웅 @ddiiwoong
Cloud Platform, Data Lake Architect @SK C&C
Interested in Kubernetes and Serverless(FaaS), DevOps, SRE, ML/DL
Who am I and Where am I?
Ops
Data Center Virtual Machine Container Serverless
Weeks Minutes Seconds Milliseconds
Dev
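Stage 1: Self-managed
OS installation and operation, development platform patching, backups, and so on are managed directly
Stage 2: Managed
Server-based, but provided as a managed service (configuration, scale management)
Stage 3: Fully managed
A service with no server management (No-Ops)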
Complexity is inevitable
Microservices Containerization Orchestration Service Mesh
Bare Metal
Kernel
Network Stack
Cloud Stack
Libraries
Frameworks
Your Code
Monitoring and Troubleshooting with Prometheus
Monitoring and Troubleshooting
● Cluster (APIs, Etcd, Nodes, VMs or BMs)
● Network (Service, Ingress, NetworkPolicy, DNS, TLS)
● Storage (Volumes, PV, PVC, CSI)
● Code
○ Instrumentation
○ Tracing, Metrics
(Cloud Providers, APM Solution, OpenSource)
OpenCensus
A single distribution of libraries that
automatically collect traces and metrics
from your app, display them locally, and
send them to any backend.
A Stats Collection and Distributed Tracing Framework
backed by Google and Microsoft in Jan. 2018
[Diagram: OpenCensus Libraries. Four services (Auth, Catalog, Search, FrontEnd) run in VMs or Kube Pods, each linked with an oc lib; oc agents and an oc collector sit between the services and the metrics + tracing backends.]
OpenCensus
Supports a wide range of languages and backend applications
Tracing
● Shows the application or service structure involved in serving a request
● Visualizes the data flow between all services to pinpoint architectural bottlenecks
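As a rough illustration of how trace data exposes a bottleneck, the sketch below models spans as plain structs and reports the span with the largest self time (its own duration minus that of its direct children). The `Span` type, its fields, and the sample span names and durations are hypothetical; this is not the OpenCensus data model, and it assumes child spans run sequentially.

```go
package main

import (
	"fmt"
	"time"
)

// Span is a hypothetical, simplified span record (not the OpenCensus type).
type Span struct {
	ID       string
	ParentID string // empty for the root span
	Name     string
	Duration time.Duration
}

// SlowestSelfTime returns the span with the largest self time, where self time
// is the span's duration minus the summed duration of its direct children.
// Assumption: children of one span do not overlap in time.
func SlowestSelfTime(spans []Span) (Span, time.Duration) {
	childTotal := make(map[string]time.Duration)
	for _, s := range spans {
		if s.ParentID != "" {
			childTotal[s.ParentID] += s.Duration
		}
	}
	var worst Span
	var worstSelf time.Duration = -1
	for _, s := range spans {
		self := s.Duration - childTotal[s.ID]
		if self > worstSelf {
			worst, worstSelf = s, self
		}
	}
	return worst, worstSelf
}

func main() {
	// Invented sample data for one request.
	spans := []Span{
		{ID: "1", Name: "frontend", Duration: 900 * time.Millisecond},
		{ID: "2", ParentID: "1", Name: "checkout", Duration: 850 * time.Millisecond},
		{ID: "3", ParentID: "2", Name: "catalog lookup", Duration: 700 * time.Millisecond},
		{ID: "4", ParentID: "2", Name: "payment", Duration: 50 * time.Millisecond},
	}
	s, self := SlowestSelfTime(spans)
	fmt.Printf("bottleneck: %s (self time %v)\n", s.Name, self)
}
```

A tracing backend such as Jaeger presents the same information visually as a waterfall of nested spans.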
Metrics
Quantitative data that helps measure the performance and quality of applications and services
● Latency of databases and APIs
● Request content length
● Number of open file descriptors
● Number of cache hits/misses
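To make the metric types above concrete, here is a minimal in-memory sketch of a counter (cache hits/misses) and a latency distribution with a nearest-rank percentile. `Registry` and all of its methods are hypothetical stand-ins written for this example, not the OpenCensus stats API.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Registry is a hypothetical in-memory stand-in for a metrics library.
type Registry struct {
	mu        sync.Mutex
	counters  map[string]int64
	latencies map[string][]time.Duration
}

func NewRegistry() *Registry {
	return &Registry{
		counters:  make(map[string]int64),
		latencies: make(map[string][]time.Duration),
	}
}

// Inc adds one to a named counter, for example "cache_hit".
func (r *Registry) Inc(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counters[name]++
}

// Count returns the current value of a named counter.
func (r *Registry) Count(name string) int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counters[name]
}

// Observe records one latency sample for a named operation.
func (r *Registry) Observe(name string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.latencies[name] = append(r.latencies[name], d)
}

// Percentile returns the nearest-rank percentile p (1..100) of the recorded
// samples, or zero when nothing has been recorded.
func (r *Registry) Percentile(name string, p int) time.Duration {
	r.mu.Lock()
	samples := append([]time.Duration(nil), r.latencies[name]...)
	r.mu.Unlock()
	if len(samples) == 0 {
		return 0
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	rank := (p*len(samples) + 99) / 100 // integer ceiling of p% of n
	if rank < 1 {
		rank = 1
	}
	if rank > len(samples) {
		rank = len(samples)
	}
	return samples[rank-1]
}

func main() {
	reg := NewRegistry()
	for _, hit := range []bool{true, true, false, true} {
		if hit {
			reg.Inc("cache_hit")
		} else {
			reg.Inc("cache_miss")
		}
	}
	for _, ms := range []int{10, 20, 30, 40} {
		reg.Observe("db_query", time.Duration(ms)*time.Millisecond)
	}
	fmt.Println("hits:", reg.Count("cache_hit"), "misses:", reg.Count("cache_miss"))
	fmt.Println("p50:", reg.Percentile("db_query", 50))
}
```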
OpenCensus Agent
A daemon that enables a centralized exporter implementation for polyglot development and deployment
● Agent
● Sidecar
● Kubernetes DaemonSet
OpenCensus Agent Benefit
● Manage a single exporter
● Democratizes deployment
○ Choosing which backend to send to is up to the developer
● After the initial instrumentation, you can switch to any backend at any time
○ From Prometheus to Stackdriver, from Zipkin to Jaeger
● Reduced overhead
○ Restart only ocagent, without restarting the application
● Unifies observability signals (pass-through)
○ Routing - Zipkin, Jaeger, Prometheus data
○ Easier management of polyglot and poly-backend setups
● Minimizes the ports to manage
○ TCP 55678
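The "instrument once, switch backends later" benefit comes down to the application depending on a single export target. The sketch below shows that idea with a hypothetical `Exporter` interface and `Agent` type; the names are invented for illustration, the backends are in-memory stubs, and the code is not safe for concurrent use.

```go
package main

import "fmt"

// Exporter is a hypothetical interface for anything that can receive spans.
type Exporter interface {
	Export(span string) error
}

// memoryBackend stands in for a real backend such as Jaeger or Zipkin.
type memoryBackend struct {
	name  string
	spans []string
}

func (b *memoryBackend) Export(span string) error {
	b.spans = append(b.spans, span)
	return nil
}

// Agent is the single export target that application code talks to.
type Agent struct {
	backends []Exporter
}

// SetBackends swaps the destinations without touching application code.
func (a *Agent) SetBackends(b ...Exporter) { a.backends = b }

// Export fans a span out to every configured backend and returns the first
// error it encounters.
func (a *Agent) Export(span string) error {
	for _, b := range a.backends {
		if err := b.Export(span); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	zipkin := &memoryBackend{name: "zipkin"}
	jaeger := &memoryBackend{name: "jaeger"}

	agent := &Agent{}
	agent.SetBackends(zipkin)
	_ = agent.Export("span-1") // the application only knows the agent

	agent.SetBackends(jaeger) // backend switched; application unchanged
	_ = agent.Export("span-2")

	fmt.Println(zipkin.name, zipkin.spans)
	fmt.Println(jaeger.name, jaeger.spans)
}
```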
OpenCensus Collector
Located close to the application (for example, in the same VPC or AZ)
OpenCensus Collector Benefit
● Manage a single exporter
● Democratizes deployment
○ Choosing which backend to send to is up to the developer
● After the initial instrumentation, you can switch to any backend at any time
○ From Prometheus to Stackdriver, from Zipkin to Jaeger
● Limits egress points
○ Centralizes management of the multiple API keys and TLS settings otherwise held inside applications
● Ensures data reaches the backend
○ built-in buffering and retry capabilities
● Makes intelligent (tail-based) sampling available (percentile)
● Annotation
○ Metadata can be added while spans are being collected
● Tagging
○ Tags contained in a span can be overridden or removed
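Tail-based sampling means deciding which traces to keep after they have completed, for example keeping only the slowest ones. A minimal sketch of a percentile-based decision over a buffered batch follows; `Trace`, `TailSample`, and the sample latencies are hypothetical and do not reflect how the OpenCensus Collector implements it.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Trace is a hypothetical completed trace: an ID plus end-to-end latency.
type Trace struct {
	ID      string
	Latency time.Duration
}

// TailSample keeps only the traces whose latency is at or above the given
// nearest-rank percentile (1..100) of the buffered batch.
func TailSample(batch []Trace, percentile int) []Trace {
	if len(batch) == 0 {
		return nil
	}
	sorted := make([]time.Duration, len(batch))
	for i, t := range batch {
		sorted[i] = t.Latency
	}
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	rank := (percentile*len(sorted) + 99) / 100 // integer ceiling
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	threshold := sorted[rank-1]

	var kept []Trace
	for _, t := range batch {
		if t.Latency >= threshold {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	// Invented sample batch: one slow outlier among fast requests.
	batch := []Trace{
		{ID: "t1", Latency: 20 * time.Millisecond},
		{ID: "t2", Latency: 35 * time.Millisecond},
		{ID: "t3", Latency: 900 * time.Millisecond},
		{ID: "t4", Latency: 25 * time.Millisecond},
		{ID: "t5", Latency: 30 * time.Millisecond},
	}
	for _, t := range TailSample(batch, 90) {
		fmt.Println("keep", t.ID, t.Latency)
	}
}
```

Because the decision needs the whole trace, it has to happen in a component that sees every span, which is why it is listed as a collector capability rather than a library one.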
Demo - Hipster Shop
Hipster Shop: Cloud-Native Microservices Demo Application
● A web-based e-commerce application where users can browse and purchase products
Demo - Hipster Shop
Hipster Shop: Cloud-Native Microservices Demo Application
● All internal communication uses gRPC; only external traffic uses HTTP
● Polyglot: Go, C#, Node.js, Python, Java
● Can be configured with Istio
● Deployed with Skaffold (https://skaffold.dev/)
● Backend Embedded
○ Stackdriver - https://github.com/GoogleCloudPlatform/microservices-demo
○ Prometheus - https://github.com/census-ecosystem/opencensus-microservices-demo
● A load generator (Locust) continuously calls the services
● A latency delay occurs in a specific service (CheckoutService/PlaceOrder)
● Identify the cause with the backend tools (Prometheus/Jaeger), then fix the code and redeploy
Demo - Tracing (Frontend, Go)
Add the exporter libraries and initialize the HTTP handler
import (
    ...
    "go.opencensus.io/exporter/jaeger"
    "go.opencensus.io/exporter/prometheus"
    "go.opencensus.io/plugin/ochttp"
    "go.opencensus.io/plugin/ochttp/propagation/b3"
    ...
)
func main() {
    ...
    var handler http.Handler = r
    handler = &logHandler{log: log, next: handler}
    handler = ensureSessionID(handler)
    // Wrap the handler chain with OpenCensus HTTP instrumentation, using B3 header propagation.
    handler = &ochttp.Handler{
        Handler:     handler,
        Propagation: &b3.HTTPFormat{},
    }
    log.Infof("starting server on " + addr + ":" + srvPort)
    log.Fatal(http.ListenAndServe(addr+":"+srvPort, handler))
}
https://godoc.org/go.opencensus.io/plugin/ochttp
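To show what header-based propagation does conceptually, the following standard-library sketch reuses an incoming `X-B3-TraceId` header or generates a new trace ID, then makes it available to the wrapped handler through the request context. `withB3`, `TraceIDFrom`, and `seenTraceID` are hypothetical helpers for this example; the real `ochttp.Handler` does considerably more (span creation, sampling flags, stats).

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerTraceID is the trace ID header name used by the B3 format.
const headerTraceID = "X-B3-TraceId"

type ctxKey struct{}

// newID returns n random bytes encoded as lowercase hex.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// withB3 is a hypothetical middleware: it reuses the caller's trace ID when
// one is present, otherwise it starts a new 128-bit trace.
func withB3(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		traceID := r.Header.Get(headerTraceID)
		if traceID == "" {
			traceID = newID(16)
		}
		ctx := context.WithValue(r.Context(), ctxKey{}, traceID)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// TraceIDFrom returns the trace ID stored by withB3, or "" if none.
func TraceIDFrom(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// seenTraceID sends one in-memory request through the middleware and returns
// the trace ID that the wrapped handler observed.
func seenTraceID(incoming string) string {
	var seen string
	h := withB3(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen = TraceIDFrom(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if incoming != "" {
		req.Header.Set(headerTraceID, incoming)
	}
	h.ServeHTTP(httptest.NewRecorder(), req)
	return seen
}

func main() {
	fmt.Println("propagated:", seenTraceID("463ac35c9f6413ad48485a3953bb6124"))
	fmt.Println("generated: ", seenTraceID(""))
}
```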
Demo - Tracing (Frontend, Go)
Register the exporter and configure sampling (https://opencensus.io/stats/sampling/)
func initJaegerTracing(log logrus.FieldLogger) {
    exporter, err := jaeger.NewExporter(jaeger.Options{
        Endpoint: "http://jaeger:14268",
        Process: jaeger.Process{
            ServiceName: "frontend",
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    trace.RegisterExporter(exporter)
}
// Fragment from the calling function; its opening lines are not shown on the slide.
    // Sample every trace (demo setting), then register the Jaeger exporter.
    trace.ApplyConfig(trace.Config{
        DefaultSampler: trace.AlwaysSample(),
    })
    initJaegerTracing(log)
    ...
}
Supported Samplers
● AlwaysSample
● NeverSample
● Probability
● RateLimiting
https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Sampling.md#ratelimiting-sampler-implementation-details
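A probability sampler is commonly made deterministic by deriving the decision from the trace ID, so every service reaches the same answer for the same trace, and by honouring a decision already made upstream. The sketch below illustrates that approach; `TraceID` and `ShouldSample` are hypothetical names, and this should not be read as the library's exact implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// TraceID is a 16-byte trace identifier.
type TraceID [16]byte

// ShouldSample makes a deterministic probability decision.
// If the parent span was already sampled, the decision is inherited.
// Otherwise the first 8 bytes of the trace ID are compared against a bound
// derived from the requested fraction (0 = never, 1 = always).
func ShouldSample(id TraceID, parentSampled bool, fraction float64) bool {
	if parentSampled {
		return true // honour the upstream decision
	}
	if !(fraction > 0) { // also rejects NaN
		return false
	}
	if fraction >= 1 {
		return true
	}
	bound := uint64(fraction * (1 << 63))
	x := binary.BigEndian.Uint64(id[0:8]) >> 1 // 63-bit value from the ID
	return x < bound
}

func main() {
	low := TraceID{0x00, 0x01}  // numerically small ID
	high := TraceID{0xff, 0xff} // numerically large ID

	fmt.Println("low,  50%:", ShouldSample(low, false, 0.5))
	fmt.Println("high, 50%:", ShouldSample(high, false, 0.5))
	fmt.Println("high, parent sampled:", ShouldSample(high, true, 0.5))
}
```

Inheriting the parent's decision is what lets a downstream service skip its own sampler configuration when the entry point already samples every request.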
Demo - Tracing (AdService, Java)
Register the exporter
import io.opencensus.exporter.trace.jaeger.JaegerTraceExporter;
public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Jaeger Tracing.
    JaegerTraceExporter.createAndRegister("http://jaeger:14268/api/traces", "adservice");
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
There is no trace.AlwaysSample() here because the sampling decision is propagated from the Frontend.
Demo - Metrics (Frontend, Go)
Register the exporter and the gRPC views
func initPrometheusStatsExporter(log logrus.FieldLogger) *prometheus.Exporter {
    exporter, err := prometheus.NewExporter(prometheus.Options{})
    if err != nil {
        log.Fatal("error registering prometheus exporter")
        return nil
    }
    view.RegisterExporter(exporter)
    return exporter
}
func startPrometheusExporter(log logrus.FieldLogger, exporter *prometheus.Exporter) {
    addr := ":9090"
    log.Infof("starting prometheus server at %s", addr)
    // The exporter itself serves the scrape endpoint.
    http.Handle("/metrics", exporter)
    log.Fatal(http.ListenAndServe(addr, nil))
}
func initStats(log logrus.FieldLogger) {
    // Start prometheus exporter
    exporter := initPrometheusStatsExporter(log)
    go startPrometheusExporter(log, exporter)
    if err := view.Register(ochttp.DefaultServerViews...); err != nil {
        log.Fatal("error registering default http server views")
    }
    if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
        log.Fatal("error registering default grpc client views")
    }
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
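For readers unfamiliar with what a `/metrics` endpoint returns, this standard-library sketch serves a single counter in the Prometheus text exposition format. `Counter`, `scrape`, and the metric name `demo_requests_total` are hypothetical; the OpenCensus Prometheus exporter generates equivalent output from registered views.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Counter is a hypothetical request counter that can serve itself over HTTP.
type Counter struct {
	n int64
}

// Inc atomically adds one to the counter.
func (c *Counter) Inc() { atomic.AddInt64(&c.n, 1) }

// ServeHTTP writes the counter in the Prometheus text exposition format:
// a HELP line, a TYPE line, then one sample line.
func (c *Counter) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, "# HELP demo_requests_total Requests handled so far.\n")
	fmt.Fprint(w, "# TYPE demo_requests_total counter\n")
	fmt.Fprintf(w, "demo_requests_total %d\n", atomic.LoadInt64(&c.n))
}

// scrape performs one in-memory GET against the handler, the way a
// Prometheus server would poll it, and returns the response body.
func scrape(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	c := &Counter{}
	for i := 0; i < 3; i++ {
		c.Inc()
	}
	fmt.Print(scrape(c))
}
```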
Demo - Metrics (AdService, Java)
Register the exporter
import io.opencensus.exporter.stats.prometheus.PrometheusStatsCollector;
public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Prometheus exporters and export metrics to a Prometheus HTTPServer.
    PrometheusStatsCollector.createAndRegister();
    HTTPServer prometheusServer = new HTTPServer(9090, true);
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Metrics (AdService, Java)
gRPC views
/** Main launches the server from the command line. */
public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Registers all RPC views.
    RpcViews.registerAllViews();
    ...
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Trace Monitoring (problem scenario)
Demo - Stats Monitoring (problem scenario)
Demo - Code Tuning
Replace calls to parseCatalog() with a products variable to cut time across the overall logic
Redeploy (skaffold)
$ skaffold run --default-repo=gcr.io/cloudrun-237814 -n default
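The tuning step amounts to parsing the catalog once and reusing the result instead of re-parsing on every request. A minimal sketch of that pattern is shown here; the `Product` fields, the inline `catalogJSON` data, and the `Products` accessor are hypothetical and simplified compared with the demo's actual service code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Product is a hypothetical, trimmed-down catalog entry.
type Product struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// catalogJSON stands in for the catalog file read by the real service.
const catalogJSON = `[{"id":"p1","name":"Typewriter"},{"id":"p2","name":"Camera"}]`

var (
	parseCalls int       // counts how often the expensive parse runs
	products   []Product // parsed once, then reused
	loadOnce   sync.Once
)

// parseCatalog is the expensive step that used to run on every request.
func parseCatalog() []Product {
	parseCalls++
	var out []Product
	if err := json.Unmarshal([]byte(catalogJSON), &out); err != nil {
		panic(err)
	}
	return out
}

// Products returns the cached catalog, parsing it on the first call only.
func Products() []Product {
	loadOnce.Do(func() { products = parseCatalog() })
	return products
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(len(Products()), "products")
	}
	fmt.Println("parse calls:", parseCalls)
}
```

The trade-off is that a cached catalog no longer picks up file changes without a reload mechanism, which is acceptable for static demo data.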
Demo - Trace Monitoring
Demo - Stats Monitoring
Summary
● Consider adopting the OpenCensus Agent and Collector
● SRE - SLI (Service Level Indicator), SLO (Service Level Objective)
● Extend with application custom metrics
● Extend to Istio
○ https://github.com/census-instrumentation/opencensus-service/blob/master/DESIGN.md#implementation-details-of-agent-server
● OpenMetric + OpenCensus : ??
○ https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
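As a small worked example of the SLI/SLO point, the sketch below computes a latency SLI as the fraction of requests that finished within a threshold and compares it with an objective. The function names, the 300 ms threshold, the 0.99 objective, and the sample latencies are all assumptions chosen for illustration.

```go
package main

import "fmt"

// LatencySLI returns the fraction of requests whose latency (in milliseconds)
// is at or below the threshold. With no traffic it returns 1, which is an
// assumption made here so that idle periods do not count as violations.
func LatencySLI(latenciesMs []int, thresholdMs int) float64 {
	if len(latenciesMs) == 0 {
		return 1
	}
	good := 0
	for _, ms := range latenciesMs {
		if ms <= thresholdMs {
			good++
		}
	}
	return float64(good) / float64(len(latenciesMs))
}

// MeetsSLO reports whether the measured indicator reaches the objective.
func MeetsSLO(sli, objective float64) bool {
	return sli >= objective
}

func main() {
	latencies := []int{120, 80, 950, 60} // invented sample, in milliseconds
	sli := LatencySLI(latencies, 300)
	fmt.Printf("SLI=%.2f, meets 0.99 objective: %v\n", sli, MeetsSLO(sli, 0.99))
}
```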
Q&A
@ddiiwoong
ddiiwoong@gmail.com
https://ddii.dev
