Herding Microservices – the Atlassian Way

Matej Konecny discusses how his team at Atlassian manages its microservices architecture. In the early days, their applications were built as WAR files running on EC2 with autoscaling, and logs were stored in CloudTrail, which made monitoring and incident handling difficult. The team now runs on a unified PaaS used across Atlassian, built on top of Docker Compose, which automatically provisions containers and resources. This common platform enforces best practices and allows sidecars to be reused. Incident detection and communication are now automated through tools like OpsGenie, and changes are easier to track. A weekly TechOps meeting covers alert quality, Service Level Objectives, and code health to improve oversight of services.

Technology

MATEJ KONECNY | SENIOR DEVELOPER | ATLASSIAN
Herding Microservices
The Atlassian Way

IT dev team
~ 15 microservices
3 continents
MY TEAM

My team's mission:
Deliver a great help experience to our
customers

To achieve our goal, we need to
interconnect many services
research.archives.gov/description/1633445

Our early days...
EC2
Applications were built as
WARs and ran on EC2/EBS.
We used autoscaling.
Logs
In CloudTrail. Difficult to
search for clues.

Incidents
were
chaos
https://www.aair.biz

We treated
our services
like pets
https://www.flickr.com/photos/presidioofmonterey/43804939880

We now have a
unified PaaS
across all
Atlassian
PLATFORM

Provisioning
Automatically
provision containers
AND resources.
Technology
Built on top of Docker
Compose.
Integrates with CI/CD.
Reuse
Sidecars can be
reused in other
services, e.g.
monitoring.
Cattle, not pets
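
The sidecar reuse described on this slide can be illustrated with a minimal sketch. It is not Atlassian's platform code: the service names, image names, and the `compose_service` helper are hypothetical, and the output only mimics the general shape of a Docker Compose `services` mapping.

```python
from copy import deepcopy

# Hypothetical shared sidecar definition; the image name is a placeholder.
MONITORING_SIDECAR = {
    "image": "example/monitoring-agent:1.0",
    "restart": "always",
}


def compose_service(name, image, sidecars):
    """Build a Compose-style mapping: one main container plus its sidecars."""
    services = {name: {"image": image}}
    for sidecar_name, definition in sidecars.items():
        # Copy the shared definition so one service cannot mutate it for others.
        services[f"{name}-{sidecar_name}"] = deepcopy(definition)
    return {"services": services}


# Two hypothetical services reuse the same monitoring sidecar definition.
help_search = compose_service(
    "help-search", "example/help-search:2.3", {"monitoring": MONITORING_SIDECAR}
)
help_feedback = compose_service(
    "help-feedback", "example/help-feedback:1.1", {"monitoring": MONITORING_SIDECAR}
)

print(sorted(help_search["services"]))
print(sorted(help_feedback["services"]))
```

Keeping the sidecar in one shared definition means an upgrade of the monitoring agent is made once and picked up by every service that is rebuilt from it.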

INCIDENT HANDLING TODAY
T+0: Detect the problem
T+5: Automatic escalation to service owner
T+10: Check the changes globally
T+X: Rollback, turn off feature, or hotfix
Later: Post Incident Review

Detect problems
automatically
Have an andon
cord as a backup
DETECTION
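
As a rough illustration of the two detection paths on this slide, the sketch below combines an automatic check with a manual override that plays the role of the andon cord. The error-rate rule, the 5% threshold, and all names are assumptions made for the example; the talk does not say which signals are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Request counts observed in one monitoring window (hypothetical shape)."""
    requests: int
    errors: int


def should_raise_incident(window, max_error_rate=0.05, andon_pulled=False):
    """Return True if an incident should be raised for this window."""
    if andon_pulled:
        # Manual backup: a human can always raise the incident.
        return True
    if window.requests == 0:
        # No traffic means no error rate to evaluate.
        return False
    return window.errors / window.requests > max_error_rate


print(should_raise_incident(Window(requests=1000, errors=80)))
print(should_raise_incident(Window(requests=1000, errors=10)))
print(should_raise_incident(Window(requests=1000, errors=10), andon_pulled=True))
```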

Define the service
tiers
Use tools to
communicate
ESCALATION
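
One way to picture service tiers driving escalation is a lookup from tier to policy. This is a sketch under assumptions: the tier names, the acknowledgement deadlines, and the `plan_escalation` helper are invented for illustration, and a real setup would hand the result to a paging tool rather than return a string.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical service tiers; lower number means more critical."""
    CRITICAL = 1
    IMPORTANT = 2
    BEST_EFFORT = 3


@dataclass(frozen=True)
class EscalationPolicy:
    page_owner: bool
    ack_deadline_minutes: int


# Assumed policy values, chosen only to make the example concrete.
POLICIES = {
    Tier.CRITICAL: EscalationPolicy(page_owner=True, ack_deadline_minutes=5),
    Tier.IMPORTANT: EscalationPolicy(page_owner=True, ack_deadline_minutes=30),
    Tier.BEST_EFFORT: EscalationPolicy(page_owner=False, ack_deadline_minutes=480),
}


def plan_escalation(service, tier):
    """Describe what should happen when an incident hits this service."""
    policy = POLICIES[tier]
    if policy.page_owner:
        return f"page owner of {service}, expect ack within {policy.ack_deadline_minutes} min"
    return f"open ticket for {service}, review within {policy.ack_deadline_minutes} min"


print(plan_escalation("help-search", Tier.CRITICAL))
print(plan_escalation("help-archive", Tier.BEST_EFFORT))
```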

INCIDENT COMMUNICATION
DETECTION
HOT ticket raised
OpsGenie auto-page
PAGE ACCEPTED
Investigation
Zoom/Slack/Statuspage
Resolve

Weekly TechOps meeting
Signal vs Noise
We check that the alerts
raised are meaningful.
Check KPIs
Did the service meet all
the defined Service Level
Objectives (SLO)?
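
The weekly KPI question can be reduced to a small check, sketched below. The `Slo` class, the metric names, and the numbers are hypothetical; the talk does not list the team's actual objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """One Service Level Objective with its observed value for the week."""
    name: str
    target: float
    observed: float
    higher_is_better: bool = True

    def is_met(self):
        if self.higher_is_better:
            return self.observed >= self.target
        return self.observed <= self.target


def missed_slos(slos):
    """Return the names of the objectives that were not met."""
    return [slo.name for slo in slos if not slo.is_met()]


# Assumed weekly numbers for one hypothetical service.
weekly = [
    Slo("availability", target=0.999, observed=0.9995),
    Slo("p99_latency_ms", target=500, observed=620, higher_is_better=False),
]

print(missed_slos(weekly))  # prints ['p99_latency_ms']
```

An empty list answers the slide's question with "yes"; anything else becomes an agenda item for the meeting.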

What's next?
Knowledge silos
Automatically measure how
well each team member
knows each service to
reduce knowledge silos.
Service costs
Gain more insight into the total
cost of ownership of each
service and each new feature
built.

Platform
Use a common
platform and define
guiding principles.
Monitoring
Collect metrics and
aggregate the logs in
a central location.
Standards
Enforce standards
when deploying and
developing.
To sum up...

MATEJ KONECNY | SENIOR DEVELOPER | ATLASSIAN
Thank you!

Herding Microservices – the Atlassian Way

1. MATEJ KONECNY | SENIOR DEVELOPER | ATLASSIAN Herding Microservices The Atlassian Way

2. IT dev team ~ 15 microservices 3 continents MY TEAM

3. My team's mission: Deliver a great help experience to our customers

4. To achieve our goal, we need to interconnect many services research.archives.gov/description/1633445

5. Back in the old days...

6. Our early days...

7. Our early days... EC2 Applications were built as WARs and ran on EC2/EBS. We used autoscaling.

8. Our early days... EC2 Applications were built as WARs and ran on EC2/EBS. We used autoscaling. Logs In CloudTrail. Difficult to search for clues.

9. Our early days... EC2 Applications were built as WARs and ran on EC2/EBS. We used autoscaling. Logs In CloudTrail. Difficult to search for clues. Monitoring Only what is built into AWS, such as CloudWatch.

10. Incidents were chaos https://www.aair.biz

11. We treated our services like pets https://www.flickr.com/photos/presidioofmonterey/43804939880

12. We now have a unified PaaS across all Atlassian PLATFORM

13. Cattle, not pets

14. Provisioning Automatically provision containers AND resources. Cattle, not pets

15. Provisioning Automatically provision containers AND resources. Technology Built on top of Docker Compose. Integrates with CI/CD. Cattle, not pets

16. Provisioning Automatically provision containers AND resources. Technology Built on top of Docker Compose. Integrates with CI/CD. Reuse Sidecars can be reused in other services, e.g. monitoring. Cattle, not pets

17. Provisioning Automatically provision containers AND resources. Technology Built on top of Docker Compose. Integrates with CI/CD. Reuse Sidecars can be reused in other services, e.g. monitoring. Best practices Enforce the best practices before deployment. Cattle, not pets

18. INCIDENT HANDLING TODAY Detect the problem Automatic escalation to service owner Check the changes globally Rollback, turn off feature or hotfix Post Incident Review T+0 T+5 T+10 T+X Later

19. Detect problems automatically Have an andon cord as a backup DETECTION

20. INCIDENT HANDLING TODAY Detect the problem Automatic escalation to service owner Check the changes globally Rollback, turn off feature or hotfix Post Incident Review T+0 T+5 T+10 T+X Later

21. Define the service tiers Use tools to communicate ESCALATION

22. INCIDENT COMMUNICATION DETECTION

23. HOT ticket raised INCIDENT COMMUNICATION DETECTION

24. HOT ticket raised INCIDENT COMMUNICATION DETECTION OpsGenie auto-page

25. HOT ticket raised INCIDENT COMMUNICATION DETECTION PAGE ACCEPTED OpsGenie auto-page

26. HOT ticket raised INCIDENT COMMUNICATION DETECTION PAGE ACCEPTED Investigation OpsGenie auto-page

27. HOT ticket raised INCIDENT COMMUNICATION Zoom/Slack/Statuspage DETECTION PAGE ACCEPTED Investigation OpsGenie auto-page

28. HOT ticket raised INCIDENT COMMUNICATION Zoom/Slack/Statuspage DETECTION PAGE ACCEPTED Investigation OpsGenie auto-page Resolve

29. INCIDENT HANDLING TODAY Detect the problem Automatic escalation to service owner Check the changes globally Rollback, turn off feature or hotfix Post Incident Review T+0 T+5 T+10 T+X Later

30. 23% of incidents caused by changes

31. Find the changes easily CHANGES

32.

33. INCIDENT HANDLING TODAY Detect the problem Automatic escalation to service owner Check the changes globally Rollback, turn off feature or hotfix Post Incident Review T+0 T+5 T+10 T+X Later

34. INCIDENT HANDLING TODAY Detect the problem Automatic escalation to service owner Check the changes globally Rollback, turn off feature or hotfix Post Incident Review T+0 T+5 T+10 T+X Later

35. Weekly TechOps meeting

36. Weekly TechOps meeting Signal vs Noise We check that the alerts raised are meaningful.

37. Weekly TechOps meeting Signal vs Noise We check that the alerts raised are meaningful. Check KPIs Did the service meet all the defined Service Level Objectives (SLO)?

38. Weekly TechOps meeting Signal vs Noise We check that the alerts raised are meaningful. Check KPIs Did the service meet all the defined Service Level Objectives (SLO)? Code health Analyze the test coverage and the technical debt backlog.

39. ? Did it help?

40. Did it help? Feels better

41. What's next?

42. What's next? Knowledge silos Automatically measure how well each team member knows each service to reduce knowledge silos.

43. What's next? Knowledge silos Automatically measure how well each team member knows each service to reduce knowledge silos. Service costs Gain more insight into the total cost of ownership of each service and each new feature built.

44. What's next? Knowledge silos Automatically measure how well each team member knows each service to reduce knowledge silos. Service costs Gain more insight into the total cost of ownership of each service and each new feature built. Runbooks Organize the runbooks and make them searchable. Run more war games (fire drills).

45. To sum up...

46. Platform Use a common platform and define guiding principles. To sum up...

47. Platform Use a common platform and define guiding principles. Monitoring Collect metrics and aggregate the logs in a central location. To sum up...

48. Platform Use a common platform and define guiding principles. Monitoring Collect metrics and aggregate the logs in a central location. Standards Enforce standards when deploying and developing. To sum up...

49. Platform Use a common platform and define guiding principles. Monitoring Collect metrics and aggregate the logs in a central location. Standards Enforce standards when deploying and developing. Learn Incorporate learning from the incidents. They are invaluable! To sum up...

50. MATEJ KONECNY | SENIOR DEVELOPER | ATLASSIAN Thank you!