1) The document discusses the importance of monitoring APIs, applications, databases, and external calls. It highlights the need for metrics, logging, tracing, and performance monitoring.
2) Open source tools like Elasticsearch (ELK stack), Zipkin, and Sleuth are mentioned for logging, tracing, and monitoring. However, it is noted that no single open source project provides an integrated solution for all operational needs.
3) Commercial offerings are able to provide more comprehensive and integrated solutions compared to various open source tools, including out-of-the-box dashboards, method-level insight, host and process metrics, cross-technology tracing, log analytics, and automation to support operations teams.
5. #apisummit @MartinGoodwell
About me
Passionate about life,
technology, and the people
behind both of them.
Trying not to be an A-hole.
• Started with Commodore 8-bit (VC-20 and C-64)
• Built Null-modem connections for playing Doom and WarCraft
• Built IPX/SPX networks between MS-DOS 5.0 and Windows 3.1
• Did DevOps before it was called that (mainly Java and Web)
for about 10 years
• Now at Dynatrace Innovation Lab
• Tech Lead for Microsoft Technologies
and Software Architecture
• Talking, blogging, webinaring, and innovating
• Find me on Twitter: @MartinGoodwell
7. #apisummit @MartinGoodwell
Warm up
• What's your occupation?
• Dev, Ops, Non-technical
• What's your technology stack?
• Java, .NET, Node.js, Go, PHP, Python
• Which of you do
• Cloud
• API
• Application Monitoring
• Level of automation
• Version control (also for stored procedures?)
• Build server
• Automated deployment
• Which of you build your own troubleshooting tools?
10. #apisummit @MartinGoodwell
Two "real" problems of every project
(but with APIs it's even more complicated)
1) it's not working
2) it's too slow
One "developer" problem
1) it's crap. we need to redo it
25. #apisummit @MartinGoodwell
It does not matter,
what we want to get out of our logfiles.
Whatever it is,
we have to filter out lots of noise
26. #apisummit @MartinGoodwell
What do we log?
• Information about exceptions
What do we not log?
• Metrics
How do we log?
• in JSON
• including a correlation id
Where do we log?
• to a central logging server
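A minimal Python sketch of the points above: exceptions logged as JSON, one object per line, each carrying a correlation id. The field names, the logger name `api-demo`, and the helper `handle_request` are assumptions made for illustration, and the in-memory buffer only stands in for a central logging server.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Hypothetical field name; the value is passed in via `extra`.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("api-demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, correlation_id=None):
    """Simulate a failing request and log the exception with its id."""
    correlation_id = correlation_id or str(uuid.uuid4())
    try:
        raise ValueError("downstream call failed")
    except ValueError:
        logger.error(
            "request failed",
            exc_info=True,
            extra={"correlation_id": correlation_id},
        )
    return correlation_id


buffer = io.StringIO()  # stand-in for a central logging server
logger = build_logger(buffer)
handle_request(logger, "abc-123")
print(buffer.getvalue())
```

Because every entry is already JSON and shares the correlation id of its request, a log server can index and filter it without custom parsing rules.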
27. #apisummit @MartinGoodwell
Logging learnings
• Use a logging server (e.g. ELK stack)
• directly log as JSON
• or at least store as JSON
• Using logging for monitoring is expensive
• log analysis is a real resource hog
• works great for troubleshooting
• works great with limited problem scope
• for Java, use Logback via SLF4J
• to local logfiles
• to Logstash
• to syslog
28. #apisummit @MartinGoodwell
Logging vs Monitoring
Monitoring
• numeric only
• Analysis and aggregation are much cheaper
• perfect for charting
• and long-term reporting
Logging
• Text or numeric
• Analysis and aggregation are expensive
• because of lots of noise
• only for a limited timeframe
• Can contain text with detailed
descriptions
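The cost difference between the two columns can be illustrated with a small Python sketch. It is only an illustration under assumptions: the `Metric` class, the `/orders` endpoint, and the log line format are all hypothetical. The metric keeps a constant-size numeric aggregate, while the same numbers have to be parsed back out of every log line.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """Constant-size numeric aggregate: cheap to update and to chart."""

    count: int = 0
    total: float = 0.0
    maximum: float = 0.0

    def record(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


response_time_ms = Metric()
log_lines = []

for duration in (120.0, 80.0, 400.0):
    # Monitoring: update three numbers, no matter how many requests arrive.
    response_time_ms.record(duration)
    # Logging: store one text line per request (hypothetical format).
    log_lines.append(f"INFO request to /orders finished in {duration} ms")

# Getting the same average out of the logs means parsing every line.
parsed = [float(line.split(" in ")[1].split(" ms")[0]) for line in log_lines]
mean_from_logs = sum(parsed) / len(parsed)

print(response_time_ms.mean, mean_from_logs)
```

The log lines still carry detail the metric cannot hold, which is why they remain useful for troubleshooting a limited problem scope.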
38. #apisummit @MartinGoodwell
Monitoring external API calls
• Monitor
• number of calls
• response times
• errors
• Netflix OSS Hystrix
• circuit breaker
• trip count
• If you create a public API, please keep the request headers in the response
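The metrics listed above can be collected by a wrapper around each external call. The following Python sketch is not the Hystrix API (Hystrix is a Java library); `MonitoredBreaker`, its parameters, and its thresholds are assumptions for illustration. It counts calls, errors, and trips, records response times, and rejects calls while the circuit is open.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected."""


class MonitoredBreaker:
    """Hypothetical circuit breaker that also collects call metrics."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.calls = 0              # number of calls attempted
        self.errors = 0             # number of failed calls
        self.trips = 0              # how often the circuit opened
        self.response_times_s = []  # duration of each attempted call
        self._consecutive_failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self.clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit is open")
            # Reset window elapsed: allow one trial call (half-open).
            self._opened_at = None
        self.calls += 1
        started = self.clock()
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.errors += 1
            self._consecutive_failures += 1
            if self._consecutive_failures >= self.failure_threshold:
                self._opened_at = self.clock()
                self.trips += 1
            raise
        finally:
            self.response_times_s.append(self.clock() - started)
        self._consecutive_failures = 0
        return result


def flaky_api():
    """Stand-in for an external API that always fails."""
    raise RuntimeError("upstream unavailable")


breaker = MonitoredBreaker(failure_threshold=2, reset_after_s=60.0)
for _ in range(3):
    try:
        breaker.call(flaky_api)
    except (RuntimeError, CircuitOpenError) as exc:
        print(type(exc).__name__)

print(breaker.calls, breaker.errors, breaker.trips)
```

Rejected calls are deliberately not counted as calls or errors here, so the trip count stays a separate signal that the external API is unhealthy.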
41. #apisummit @MartinGoodwell
What open-source offerings still miss
• No project that takes care of automating Operations use cases
• No single umbrella project for
• Monitoring
• Log-Analysis
• Call-Tracing
• DB-Analysis
53. #apisummit @MartinGoodwell
DevOps is about collaboration.
Collaboration requires documentation.
Automation is implicit documentation.
But there is no automation for
supporting Ops with troubleshooting.