This document introduces Sam Guckenheimer and Ed Blankenship and discusses Microsoft's goal of creating a single engineering system (1ES). The purpose is to enable any developer to access and reuse source code across the company, get rewarded for creating popular components, and have changes instantly visible. In practice, this means scaling Git for enterprise use, promoting a live site culture, and creating a common telemetry pipeline to measure usage and engineering metrics. The goal is to have self-forming teams and determine if 1ES is achieving its goals like supporting 4x user growth.
DOES SFO 2016 - Sam Guckenheimer & Ed Blankenship "Moving to One Engineering System"
2. About Us
Sam Guckenheimer
Product Owner, Visual Studio Cloud Services
13 years Microsoft
30 years software industry
@SamGuckenheimer
https://aka.ms/devops
Ed Blankenship
Product Manager, DevOps, VSTS/TFS
5 years Microsoft
15 years software industry
@EdBlankenship
https://about.me/edblankenship
https://engineering.microsoft.com/
3. Purpose of One Engineering System
If you want to go fast, go alone. If you want to go far, go together.
5. An engineering north star…
…the source across the company is available to anyone
…any dev can offer improvements to anything in the company
…the IP the company has built up over the years is made of re-usable components
…anybody can find and potentially re-use components from anywhere else
…devs are rewarded for creating popular components
…there is zero lag from when a dev makes a change & when the rest of the company sees it
… build and test time is directly proportional to the change made
…devs can move to another team and already know how to work
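The item about build and test time being proportional to the change can be illustrated with a small sketch. This is not Microsoft's build system; the component names and the dependency map are hypothetical, chosen only to show the idea that a change should trigger work only for the components it can affect.

```python
from collections import deque

# Hypothetical dependency map: component -> components it depends on.
DEPENDS_ON = {
    "core": [],
    "net": ["core"],
    "ui": ["core"],
    "app": ["net", "ui"],
    "docs": [],
}

def affected_components(changed, depends_on):
    """Return the changed components plus everything that transitively depends on them."""
    # Invert the graph so we can walk from a component to its dependents.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk outward from the changed components.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen
```

Under these assumptions, a change to "docs" rebuilds one component, while a change to "core" rebuilds everything that depends on it, so the cost tracks the reach of the change.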
8. Live Site Culture and Engineering
Live Site Health
Time to Detect
Time to Communicate
Time to Mitigate
Customer Impact
Incident prevention items
Aging live site problems
Customer support metrics
SLA per customer account
(SLA, MPI, top drivers)
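As a rough illustration of the first three live site health measures, the sketch below derives them from incident timestamps. The slide does not define the measures precisely, so the definitions here are assumptions: time to detect and time to mitigate are counted from the start of customer impact, and time to communicate is counted from detection. The function and field names are hypothetical.

```python
from datetime import datetime

def incident_minutes(started, detected, communicated, mitigated):
    """Compute live site durations in minutes for one incident (assumed definitions)."""
    def minutes(delta):
        return delta.total_seconds() / 60

    return {
        # Assumption: measured from the start of customer impact.
        "time_to_detect": minutes(detected - started),
        # Assumption: measured from the moment of detection.
        "time_to_communicate": minutes(communicated - detected),
        # Assumption: measured from the start of customer impact.
        "time_to_mitigate": minutes(mitigated - started),
    }

example = incident_minutes(
    started=datetime(2016, 11, 1, 10, 0),
    detected=datetime(2016, 11, 1, 10, 5),
    communicated=datetime(2016, 11, 1, 10, 15),
    mitigated=datetime(2016, 11, 1, 10, 45),
)
```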
Engineering
Bug cap per engineer
Aging bugs in important categories
Pass rate & coverage by test level
Velocity
Time to build
Time to self test
Time to deploy
Time to learn
(Telemetry pipe)
Usage
Acquisition
Engagement
Dedication
Churn
Feature usage
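Two of the usage measures can be sketched from lists of active users in consecutive periods. The slide gives only the metric names, so the definitions below are assumptions: acquisition is the count of users active now but not in the previous period, and churn is the share of previously active users who are no longer active. The function name is hypothetical.

```python
def usage_summary(previous_active, current_active):
    """Summarize acquisition and churn between two periods (assumed definitions)."""
    previous_active = set(previous_active)
    current_active = set(current_active)

    acquired = current_active - previous_active   # new this period
    churned = previous_active - current_active    # gone this period
    # Guard against an empty previous period to avoid dividing by zero.
    churn_rate = len(churned) / len(previous_active) if previous_active else 0.0

    return {
        "acquired": len(acquired),
        "churned": len(churned),
        "churn_rate": churn_rate,
    }
```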
Transition to Git
Motivation for Why Git @ Microsoft
Learnings from Adopting Git
Contribute to Open Source Git
Internal Challenges - We have some long-lived code bases
They don’t neatly factor themselves into small repositories (like microservices)
How many of you would like to clone the Windows Git repo to your laptop?
Troubleshooting Daughter’s Internet Access (Sam)
Sequential Migration and Refactor Code Bases
Our first attempt at solving this problem was Git LFS. It didn’t work for us, though we contributed to and participated in the community.
Why did Git LFS not work, for us at least? (Too slow)
New: Git Virtual File System - lazy loading
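The lazy-loading idea can be shown with a toy sketch. This is not how Git Virtual File System is implemented; it only illustrates the principle that the file listing is available up front while contents are fetched the first time they are read. The class name and the `fetch` callback are hypothetical.

```python
class LazyRepo:
    """Toy model: paths are known immediately, contents are fetched on first read."""

    def __init__(self, paths, fetch):
        self._paths = set(paths)
        self._fetch = fetch      # hypothetical callback that downloads one file
        self._cache = {}         # contents already fetched

    def list_files(self):
        # Listing never triggers a download.
        return sorted(self._paths)

    def read(self, path):
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._cache:
            # First access: fetch and remember the contents.
            self._cache[path] = self._fetch(path)
        return self._cache[path]


downloads = []

def fake_fetch(path):
    # Stand-in for a network call; records which files were actually fetched.
    downloads.append(path)
    return f"contents of {path}"

repo = LazyRepo(["a.c", "b.c", "c.c"], fake_fetch)
```

In this model, listing three files costs no downloads, and reading the same file twice costs exactly one.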
Build at Scale - Something similar to Electric Cloud (using the slack in the datacenters)
Using Git allows us to take advantage of the pull request workflow, including branch policies (which support org scale)
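A branch policy can be thought of as a set of conditions a pull request must meet before it merges. The sketch below is a generic illustration, not the VSTS/TFS API; the class names, fields, and default values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BranchPolicy:
    # Hypothetical policy settings for a protected branch.
    min_approvals: int = 2
    require_passing_build: bool = True

@dataclass
class PullRequest:
    approvals: int
    build_passed: bool

def policy_violations(pr, policy):
    """Return a list of reasons the pull request cannot merge yet (empty if it can)."""
    problems = []
    if pr.approvals < policy.min_approvals:
        missing = policy.min_approvals - pr.approvals
        problems.append(f"needs {missing} more approval(s)")
    if policy.require_passing_build and not pr.build_passed:
        problems.append("build must pass")
    return problems
```

Because the policy is data rather than convention, the same check can apply uniformly across many teams and repositories.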
Segue: Takes us back to Team Dashboard (example of team autonomy and enterprise alignment)
Kanban Board - Expedite Lanes lets you handle live site issues
Live site culture requires all of us to be on the same telemetry pipeline.
Azure and services built on Azure
Where is the problem?
(opening the Service Insights dashboard)
Hot, warm, and cold paths
Hot - optimized for speed
Warm - optimized for troubleshooting (data doesn’t need to be kept a long time)
Cold - business analysis (long running)
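One way to picture the three paths is as stores with different retention windows. The sketch below is an assumption-laden illustration, not the actual pipeline: the retention values are hypothetical, and the notes above state only that warm data doesn't need to be kept a long time while cold data serves long-running analysis.

```python
from datetime import datetime, timedelta

# Hypothetical retention windows; None means "keep indefinitely".
RETENTION = {
    "hot": timedelta(hours=1),    # assumed: short-lived, optimized for speed
    "warm": timedelta(days=14),   # assumed: long enough for troubleshooting
    "cold": None,                 # assumed: kept for long-running analysis
}

def paths_holding(event_time, now, retention=RETENTION):
    """Return the paths that would still hold an event recorded at event_time."""
    age = now - event_time
    return [
        path
        for path, keep in retention.items()
        if keep is None or age <= keep
    ]
```

Under these assumed windows, a fresh event is visible on all three paths, and older events gradually remain only on the warm and then the cold path.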
Availability
Root Cause Analysis from our VP
Support Site with Live Status (and updates)