PINTRACE
DISTRIBUTED TRACING @ PINTEREST
SUMAN KARUMURI
ABOUT ME
• Passionate about distributed tracing, observability and cloud infrastructure.
• Lead on Visibility/Storage team at Pinterest.
• Lead for Zipkin project at Twitter.
• Ex-(Twitter, Facebook, Amazon, Yahoo, Goldman Sachs).
MOTIVATION
SPEED IMPROVES ENGAGEMENT
MICRO-SERVICES BROKE OUR TOOLS
HOW DID THIS REQUEST EXECUTE?
AGGREGATE EVENTS PER SERVICE
UNDERSTAND TRENDS AND ALERTS
CHEAP
SERVICE LEVEL OVERVIEW
NO PER REQUEST OVERVIEW
METRICS
RECORD DISCRETE EVENTS
MANUAL CORRELATION
EXPENSIVE
FLEXIBLE BUT VERY BRITTLE
LOGS
PROJECT PRESTIGE
PINPOINT
MANUAL TRACING
RECORD EVENTS IN A REQUEST WITH CAUSAL ORDERING
What is Distributed Tracing?
STRUCTURED LOGGING ON STEROIDS
ANNOTATION, SPAN, TRACE
What is Distributed Tracing?
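The three terms on this slide can be illustrated with a minimal data-model sketch. It is not Pintrace's or Zipkin's actual schema: the class names, field names and the `render` helper are assumptions made for illustration. An annotation is a timestamped event, a span groups the annotations of one unit of work, and a trace is the set of spans that share a trace id, linked by parent ids that carry the causal ordering.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """A timestamped event recorded inside a span (hypothetical shape)."""
    timestamp_us: int
    value: str


@dataclass
class Span:
    """One unit of work, for example a single RPC, within a trace."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    annotations: List[Annotation] = field(default_factory=list)


def render(spans: List[Span]) -> List[str]:
    """Return an indented outline of one trace, following parent links."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# A made-up trace: a web request calls an API, which calls a cache.
example_trace = [
    Span("t1", "a", None, "web"),
    Span("t1", "b", "a", "api"),
    Span("t1", "c", "b", "cache.get"),
]
```

Calling `render(example_trace)` yields one line per span, indented by its depth in the call tree.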
TRACE REQUESTS: RECORD EVENTS IN A REQUEST WITH CAUSAL ORDERING.
ACROSS MOBILE CLIENTS, BACKEND SERVICES AND DATABASES
ZIPKIN BASED TRACING SOLUTION
MORE EXPENSIVE
PINTRACE
BUILDING PINTRACE: 5 CHALLENGES
BUILD INSTRUMENTATION
CHALLENGE 1
HARD & TEDIOUS
ONE INSTRUMENTATION PER (LANGUAGE, FRAMEWORK, THREAD POOL, PROTOCOL) COMBINATION.
PYTHON TRACER (OPENTRACING API)
FINAGLE ZIPKIN TRACER
CUSTOM ANDROID AND IOS TRACER.
SPAN REPORT AND AGGREGATION
CHALLENGE 2
First company-wide span aggregation pipeline.
DEPLOY INSTRUMENTATION
CHALLENGE 3
3 instrumentations.
100+ services
40 teams
Sampling <1% traffic
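One common way to sample a small fraction of traffic is to derive the decision from the trace id, so that every service handling the same request agrees without coordination. The sketch below assumes that approach; the deck does not say how Pintrace makes its sampling decision, and `SAMPLE_RATE`, `new_trace_id` and `is_sampled` are hypothetical names.

```python
import random

SAMPLE_RATE = 0.01  # assumed value; the slide only says "<1%"
_BUCKETS = 10_000   # resolution of the sampling decision


def new_trace_id(rng: random.Random) -> str:
    """Return a random 64-bit trace id as 16 hex characters."""
    return f"{rng.getrandbits(64):016x}"


def is_sampled(trace_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Decide from the trace id alone, so all services reach the same answer."""
    threshold = round(rate * _BUCKETS)
    return int(trace_id, 16) % _BUCKETS < threshold
```

Because the decision is a pure function of the trace id, a trace is either recorded by every participating service or by none, which avoids partial traces.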
TRACE PROCESSING AND STORAGE
CHALLENGE 4
Open sourced our streaming pipeline:
github.com/openzipkin/zipkin-sparkstreaming
TRACE VISUALIZATION
CHALLENGE 5
Pintrace architecture
TRACES ARE DATA
ZIPKIN UI
APPLICATIONS OF TRACE DATA
UNDERSTAND, DEBUG AND TUNE DISTRIBUTED SYSTEMS.
IDENTIFYING SERVICES INTERACTING WITH A REQUEST
UNDERSTAND REQUEST TIMELINE
IDENTIFYING DUPLICATE COMPUTATION
UNDERSTAND REQUEST TIMELINE
5% lower latency (a 20 ms improvement) while halving the load
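Duplicate computation shows up in a trace as the same call repeated with the same arguments. A minimal sketch of how such repeats could be flagged, assuming each span is a dictionary with hypothetical `service`, `name` and `key` fields (the deck does not describe the actual span schema or the analysis that was used):

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Call = Tuple[str, str, str]  # (service, operation name, argument key)


def duplicate_calls(spans: Iterable[Dict[str, str]]) -> List[Tuple[Call, int]]:
    """Return the calls that occur more than once in a single trace."""
    counts = Counter(
        (span["service"], span["name"], span.get("key", "")) for span in spans
    )
    return sorted((call, n) for call, n in counts.items() if n > 1)


# A made-up trace in which the same cache key is fetched twice.
example_spans = [
    {"service": "cache", "name": "get", "key": "pin:42"},
    {"service": "cache", "name": "get", "key": "pin:42"},
    {"service": "cache", "name": "get", "key": "pin:7"},
    {"service": "db", "name": "query", "key": "user:1"},
]
```

Here `duplicate_calls(example_spans)` reports only the repeated `pin:42` lookup, which is the kind of call a fix would deduplicate.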
WHICH CLUSTER SERVED THIS REQUEST?
DEBUG DISTRIBUTED REQUEST EXECUTION
IDENTIFYING SERIAL EXECUTION
TUNE DISTRIBUTED SYSTEM
Step pattern in a trace signifies serial execution
Parallel get_many after the bug fix.
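The step pattern can also be checked programmatically: sibling spans that never overlap in time were executed one after another. The sketch below assumes each sibling span is reduced to a `(start_us, end_us)` pair; `is_serial` is a hypothetical helper, not part of Zipkin or Pintrace.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_us, end_us) of one sibling span


def is_serial(siblings: List[Interval]) -> bool:
    """True when two or more sibling spans never overlap (the step pattern)."""
    ordered = sorted(siblings)
    if len(ordered) < 2:
        return False
    return all(
        prev_end <= start
        for (_, prev_end), (start, _) in zip(ordered, ordered[1:])
    )


# Made-up timings: three fetches issued back to back, then in parallel.
serial_fetches = [(0, 10), (10, 20), (20, 30)]
parallel_fetches = [(0, 10), (1, 11), (2, 12)]
```

With these inputs, `is_serial(serial_fetches)` is true and `is_serial(parallel_fetches)` is false, matching the before and after shapes described on the slide.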
TRACES AND METRICS
REDUCE TIME TO TRIAGE
SERVICE DEPENDENCY GRAPH
UNDERSTAND DEPENDENCY ACROSS SERVICES
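A service dependency graph can be derived from spans by following each span's parent link and recording an edge whenever the parent and child belong to different services. This is a sketch under assumed field names (`id`, `parent_id`, `service`); it is not the aggregation that Zipkin or Pintrace actually runs.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (caller service, callee service)


def dependency_edges(spans: List[Dict[str, Optional[str]]]) -> Counter:
    """Count caller -> callee service edges across the given spans."""
    service_of = {span["id"]: span["service"] for span in spans}
    edges: Counter = Counter()
    for span in spans:
        parent_service = service_of.get(span["parent_id"])
        # Skip root spans and calls that stay inside one service.
        if parent_service is not None and parent_service != span["service"]:
            edges[(parent_service, span["service"])] += 1
    return edges


# A made-up trace: web calls api, api calls cache twice.
example_spans = [
    {"id": "a", "parent_id": None, "service": "web"},
    {"id": "b", "parent_id": "a", "service": "api"},
    {"id": "c", "parent_id": "b", "service": "cache"},
    {"id": "d", "parent_id": "b", "service": "cache"},
]
```

Summing these counters over many traces gives edge weights that show which dependencies carry the most calls.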
END TO END TRACING
TRACE A USER INTERACTION TO BACKEND CACHE REQUEST
TRACING JS CLIENT
TRACE BROWSER EVENTS
Identified real user latency was worse than our measured latency.
TRACING THROUGH CDN
CDN TRAFFIC ANALYSIS
Clients send trace headers to CDN.
Ingest traced Fastly/Cedexis logs as spans.
Analyze spans to compare CDN latency vs user perceived latency.
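Once CDN logs are ingested as spans, the comparison in the last step amounts to a join on trace id. The sketch below assumes client spans and CDN spans are dictionaries with hypothetical `trace_id` and `duration_ms` fields; the real log format and span schema are not given in the deck.

```python
from typing import Dict, List


def latency_gap_ms(
    client_spans: List[Dict], cdn_spans: List[Dict]
) -> Dict[str, int]:
    """Per trace, user-perceived latency minus the latency the CDN reported."""
    cdn_by_trace = {span["trace_id"]: span["duration_ms"] for span in cdn_spans}
    return {
        span["trace_id"]: span["duration_ms"] - cdn_by_trace[span["trace_id"]]
        for span in client_spans
        if span["trace_id"] in cdn_by_trace  # ignore requests the CDN did not log
    }


# Made-up measurements for two traced requests.
client = [
    {"trace_id": "t1", "duration_ms": 180},
    {"trace_id": "t2", "duration_ms": 95},
]
cdn = [{"trace_id": "t1", "duration_ms": 120}]
```

A large positive gap for a trace suggests time spent outside the CDN, for example on the network between the client and the edge.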
MORE APPLICATIONS OF TRACE DATA
• Improve time to triage.
• Automated root cause analysis.
• Tracking down p99 latencies.
• Identify architectural optimizations.
• Latency pipeline.
• Inter AZ traffic analysis.
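As one illustration of the p99 item above, trace durations can be ranked to find the requests that sit beyond the 99th percentile, which are then worth opening individually. This is a sketch using the nearest-rank percentile definition; the function name and the input shape (trace id mapped to duration) are assumptions.

```python
import math
from typing import Dict, List, Tuple


def traces_above_p99(durations_ms: Dict[str, int]) -> Tuple[int, List[str]]:
    """Return the nearest-rank p99 duration and the trace ids slower than it."""
    if not durations_ms:
        raise ValueError("no traces to analyze")
    ordered = sorted(durations_ms.values())
    rank = math.ceil(len(ordered) * 99 / 100)  # nearest-rank, 1-based
    p99 = ordered[rank - 1]
    slow = sorted(t for t, d in durations_ms.items() if d > p99)
    return p99, slow


# Made-up data: 100 traces whose durations are 1 ms to 100 ms.
example = {f"trace-{i}": i for i in range(1, 101)}
```

For the example data the threshold is 99 ms and only `trace-100` lies above it.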
LESSONS LEARNED
• User awareness and education are very important to make tracing successful.
• Begin with the end in mind.
• Trace most valuable paths in the application.
• Distributed tracing landscape is confusing.
• Quality of traces is more important than quantity.
TEAM
NAOMAN ABBAS
JON PARISE
BRIAN PANE
SAM MEDER
JOE GORDON
VISIBILITY TEAM
SRE TEAM
WEB / IOS / ANDROID
QUESTIONS?
https://tinyurl.com/pintrace-architecture
https://tinyurl.com/pintrace-applications
https://tinyurl.com/pintrace-analysis
skarumuri@pinterest.com
twitter: @mansu
