Inmar – do not copy, distribute or use without Inmar written permission, 2018
Deploying to production 100x a day,
with no QA and zero downtime
Deliver at Warp Speed
Rex Morgan
Director Software Engineering
How We Do It, And How You Can Too
Building Failsafe Software
• Event Sourcing
• Idempotency
• Async-First
• Circuit Breakers & Auto-Retry
• Real-Time Architecture
• Continuous Deployment
Testing and SDLC
• First-class automated testing
• Deploy then test (not the other way around)
• Decide what to test using Science™
• Trunk-based development
Operations
• You Build It, You Run It
• Monitoring and Alarms
• Ops Standup
• Shoot things just because you can
What is Inmar?
Ever used a paper or digital coupon? Ever redeemed a rebate? Ever filled a prescription, returned something to a store or online, stayed at a hospital, read a blog, or talked to a chatbot on social media?
If so, you’ve touched Inmar.
• > $1b / week processed
• > 100b consumer touchpoints
• 100s of deployments / day
Client’s TRUST is Paramount
How do we go FAST with CONFIDENCE?
What do we GAIN?
I am an Engineer
• 3 years as Director of Engineering at Inmar
• 5 years as Architect at Qorvo and Volvo
• 12 years as a professional software
developer
• At Inmar, managers, directors – all the way
up to the CTO – are technical – we code!
• I get to work with some of the most talented
and sharp engineers around. Want to talk
about joining us? Hit me up after!
AND YOU?
MY TEAM:
The Game
Our leaders are always asking,
begging us to deliver more, faster.
Why?
Competition is always about trying
to gain an advantage.
THE STAKES:
We win:
• More jobs
• Bigger projects
• Higher pay
We lose:
• Fewer jobs
• Stressful workplace
• No $ for raises
[Chart: business value delivered vs. time elapsed. Phases: Planning, Architecture (ooo fun), Development, QA/UAT, then BAM! RELEASE.]
[Chart: business value delivered vs. time elapsed, across Sprint 1 through Sprint 6 and beyond.]
Agile advantages:
• Greater cumulative value delivered
• Team velocity increases over time
OODA
• Observe: measure our environment. Produces what we know about the environment.
• Orient ourselves: fit what we know about the environment with our strengths and weaknesses. Produces possible paths to go down.
• Decide: choose the best course of action. Produces action plan.
• Act: execute the plan! Produces effects on the environment.
OODA
[Diagram: the Observe, Orient, Decide, Act loop]
• This is how we Learn: by Shipping
• Business leaders are playing chess –
they need to make a move to see its
effect. They need us to deliver.
• Code that isn’t live, being exercised
to deliver real value, is just practice.
• If the time it takes to complete one
full loop is too long, the plan you are
executing is based on out-of-date
observations!
• If we aren’t looping as fast as we
possibly can, we aren’t maximizing
our own potential.
After a year, the team that OODA
loops 100x a day will have
accelerated far beyond the team
that loops once every 3 weeks.
If it’s painful, you should do it more often
Amount of fear:
Unknown > Known
When you start doing things that hurt more
often, they go from “I fear it will fail in some
horrible way” to “I see that it fails in this
very specific way”
Calculated Chaos
• When you go through this cycle of pressing
on your system in ways that hurt, you
discover, describe, quantify, and correct the
fragile parts of your system. Once you
overcome one, you’ll find another, and
another, but they will be more and more rare.
• The really amazing thing that happens is: as
you remove fragilities and they become
more rare, your system becomes more
resilient generally. Because you
proactively found and fixed the specific ways
in which it will break, your system can now
withstand that entire class of failures.
“Your MARGIN is my OPPORTUNITY”
Countless companies out there, large and small, are looking for weakness where they can slip into the margins where you are vulnerable, and take your business.
Our responsibility as professionals is to proactively defend our companies’ turf in our area of expertise: delivering technology.
Event Sourcing
• Traditional CRUD system:
– Model your entities
– Use a DAL for Create, Update, Delete, and Read
– What if I need to know what happened to the entity in the past?
– Oh, right… add an audit table
• Problems with that approach:
– DELETE is destructive (obviously), but so is UPDATE
– The audit table approach is not failsafe. The primary action is guaranteed, but the audit can break.
– So even with audit tables, every UPDATE is potentially destroying data: specifically, the complete answer to the question “what was it before the update?”
Event Sourcing (cont’d)
• Get rid of UPDATEs. In fact, get rid of directly storing your entities!
• Instead, just write the intended change directly to the audit table
• When I want to know what the entity is, I just replay all the audit records in order
• When I want to know what the entity was at any point in time, I just replay up to that point.
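The replay idea can be sketched in a few lines of Python. This is a minimal illustration, not Inmar's implementation: the `Event` shape, its field names, the `replay` helper, and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One immutable audit record: the intended change, never an in-place UPDATE."""
    sequence: int            # position in the stream (hypothetical field)
    changes: Dict[str, Any]  # fields set by this event


def replay(events: List[Event], up_to: Optional[int] = None) -> Dict[str, Any]:
    """Rebuild the entity by applying events in order, optionally stopping at `up_to`."""
    entity: Dict[str, Any] = {}
    for event in sorted(events, key=lambda e: e.sequence):
        if up_to is not None and event.sequence > up_to:
            break
        entity.update(event.changes)
    return entity


# Append-only log for one rebate submission (made-up data).
log = [
    Event(0, {"status": "submitted", "amount_due": 20}),
    Event(1, {"status": "settled"}),
    Event(2, {"status": "corrected", "amount_due": 50}),
]

current = replay(log)           # what the entity is now
as_of_1 = replay(log, up_to=1)  # what the entity was after event 1
```

Nothing is ever overwritten here: the correction to $50 is a new record, so the answer to "what was it before?" is always still in the log.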
Event Sourcing (cont’d)
• Every financial institution in the world uses it,
and they seem to manage transactional
volume decently well
• The most common performance booster is
snapshotting. Every nth record (even every
record) we cache the resulting entity
alongside the event, so we don’t have to
replay from the beginning of time.
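One way snapshotting might look, as a rough Python sketch. The `Stream` class, the `SNAPSHOT_EVERY` interval, and the storage layout are assumptions for illustration; a real store would persist the snapshot next to the event instead of keeping it in memory.

```python
from typing import Any, Dict, List

SNAPSHOT_EVERY = 3  # hypothetical n: cache the entity on every 3rd record


class Stream:
    """Append-only event stream that stores a snapshot alongside every nth event."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []
        self.snapshots: Dict[int, Dict[str, Any]] = {}  # event index -> entity state

    def append(self, changes: Dict[str, Any]) -> None:
        state = self.current()  # state before this event
        state.update(changes)
        self.events.append(changes)
        index = len(self.events) - 1
        if (index + 1) % SNAPSHOT_EVERY == 0:
            self.snapshots[index] = dict(state)

    def current(self) -> Dict[str, Any]:
        """Start from the latest snapshot and replay only the events after it."""
        if self.snapshots:
            start = max(self.snapshots)
            state = dict(self.snapshots[start])
        else:
            start, state = -1, {}
        for changes in self.events[start + 1:]:
            state.update(changes)
        return state


stream = Stream()
for amount in (20, 30, 40, 50):
    stream.append({"amount_due": amount})
```

After four events, reading the entity replays only the one event recorded after the snapshot instead of all four.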
Let’s Build A Rebates System!
[Diagram: a seven-step flow from purchase to correction]
• Shopper makes a purchase (in-store or online)
• Shopper submits a rebate form (paper or online)
• Submission
– Did the consumer buy the right product?
– Do they meet all the criteria?
– How much $ are they due?
• Settlement: “We owe you $20!”
– How do we remit payment?
– Do we have enough $ in the bank?
– What happens if it comes back unclaimed?
• Shopper: “You paid me $20 but I should have gotten $50”
• Correction: “We owe you $50!”
Idempotency
Old-school:
POST creates an object… we use an
incrementing primary key. So if I POST the
exact same object again, it will create a
duplicate record with a higher ID. That is
almost never what we want.
In computer science, the term idempotent is
used more comprehensively to describe an
operation that will produce the same results if
executed once or multiple times.
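A minimal Python sketch of an idempotent create, assuming the caller supplies its own key instead of relying on an incrementing primary key. The `SubmissionStore` class, the `idempotency_key` parameter, and the in-memory dictionary are hypothetical stand-ins for a real API and database.

```python
from typing import Any, Dict


class SubmissionStore:
    """In-memory stand-in for a database table keyed by a client-supplied ID."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def create(self, idempotency_key: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Create the record once; replaying the same key returns the same record."""
        if idempotency_key not in self._rows:
            self._rows[idempotency_key] = {"id": idempotency_key, **payload}
        return self._rows[idempotency_key]

    def count(self) -> int:
        return len(self._rows)


store = SubmissionStore()
first = store.create("rebate-123", {"amount_due": 20})
second = store.create("rebate-123", {"amount_due": 20})  # a retried POST
```

Because the second call finds the existing key, a retry after a timeout cannot create a duplicate record.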
Async-First
• Myth: most business
operations need to block until
the operation is completely
executed.
• Fact: most business operations
only need to acknowledge the
request to execute was
received.
• Being synchronous is an
additional constraint – a
temporal constraint – on any
feature. Constraints limit
implementation options – and
increase cost.
• It must be a business
requirement – with justification
and value attached – for an
operation to be synchronous.
Otherwise, everything is
implemented async by default.
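As a rough illustration of "acknowledge now, execute later", here is a Python sketch using a queue and a background thread. The function names, the payload shape, and the in-process queue are assumptions; a production system would more likely use a durable message bus.

```python
import queue
import threading
import uuid
from typing import Dict

work_queue: queue.Queue = queue.Queue()
results: Dict[str, str] = {}


def handle_request(payload: dict) -> dict:
    """Acknowledge receipt immediately; the real work happens later."""
    request_id = str(uuid.uuid4())
    work_queue.put({"id": request_id, "payload": payload})
    return {"status": "accepted", "id": request_id}


def worker() -> None:
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        job = work_queue.get()
        if job is None:
            work_queue.task_done()
            break
        results[job["id"]] = f"processed {job['payload']['form']}"
        work_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

ack = handle_request({"form": "rebate-123"})  # returns before processing finishes
work_queue.join()    # only for the demo: wait until the worker has caught up
work_queue.put(None) # stop the worker
thread.join()
```

The caller only ever waits for the acknowledgement; how and when the worker does the processing is free to change.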
Combine Event Sourcing with Message Bus
[Diagram: a Submission API and a Settlement API, each with its own event stream (Event 0 through Event 4), linked by a message bus and a worker in five numbered steps]
• Create or update submission
• Publish event to bus
• Worker determines if a settlement is needed, and calls the API
• Publish event to bus
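The flow in the diagram might be sketched like this in Python. Everything here is illustrative: the in-process `Bus`, the `EventSourcedApi` class, the topic names, and the approval rule in the worker are assumptions, not the actual Submission or Settlement APIs.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Bus:
    """Tiny in-process publish/subscribe bus standing in for a real message bus."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


class EventSourcedApi:
    """Appends each event to its own stream, then publishes it to the bus."""

    def __init__(self, topic: str, bus: Bus) -> None:
        self.topic, self.bus = topic, bus
        self.stream: List[dict] = []

    def record(self, event: dict) -> None:
        self.stream.append(event)
        self.bus.publish(self.topic, event)


bus = Bus()
submissions = EventSourcedApi("submission", bus)
settlements = EventSourcedApi("settlement", bus)


def settlement_worker(event: dict) -> None:
    """Decide whether a settlement is needed, and call the Settlement API if so."""
    if event.get("approved") and event["amount_due"] > 0:
        settlements.record({"submission_id": event["id"], "pay": event["amount_due"]})


bus.subscribe("submission", settlement_worker)
submissions.record({"id": "rebate-123", "approved": True, "amount_due": 20})
submissions.record({"id": "rebate-456", "approved": False, "amount_due": 0})
```

Each API owns its own stream, and the only coupling between them is the event on the bus.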
Auto-Retry and Circuit Breakers
Twine
When your solution is made of many processes
(functions, or microservices, or whatever), a lot
more of your regular coding involves Inter-Process
Calls (IPC)
• LINQ was made because working with sets is
very common in business apps – gives us a
standard model for thinking about sets
• Twine (and others) were made because in
microservices, IPCs are very common. Gives us
a standard model for thinking about IPC.
• Service discovery
• Protocol abstraction
• Load-balancing
• Auto-retry and backoff
• Auto-fallback and failover
• Circuit breakers
• Authentication (like JWT)
• Tracing
• Completely pluggable and extensible
• Implementations in .NET/C#, JavaScript, and Go
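To make two of these features concrete, here is a generic Python sketch of auto-retry with exponential backoff wrapped around a circuit breaker. It is not Twine's API: the class and function names, thresholds, and delays are all assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected without being tried."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result


def retry(func: Callable[[], T], attempts: int = 3, base_delay: float = 0.01) -> T:
    """Call func, retrying with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not hammer a dependency the breaker has given up on
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    """Stand-in for an inter-process call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


breaker = CircuitBreaker(failure_threshold=5)
result = retry(lambda: breaker.call(flaky))
```

Retries absorb transient failures, while the breaker stops callers from piling onto a dependency that keeps failing.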
Builds & One-Click Deploy
• Make sure the package can be deployed to any environment (makes no assumptions)
• Separate Build from Deploy
• Setting environment-specific config or values is part of the deploy stage.
• Every previous build is already packaged, so one-click deploy means it’s easy to roll back
[Diagram: build and deploy pipeline]
1. Check in a change to Master
2. Trigger automated build
3. Build produces a deployable package to sit on a shelf forever
4. Deploy:
• Apply environment-specific config
• Remove previous version from target environment
• Add desired version to target environment
• If you don’t have this, start here
• So many tools - already solved for any shop,
any platform
– TFS / VSO / Release Manager
– TeamCity
– Jenkins
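The separation of build from deploy can be sketched in Python as follows. This is a toy model under stated assumptions: the `shelf` and `environments` dictionaries, the `ENV_CONFIG` values, and the version strings are invented for the example and do not reflect any particular build tool.

```python
from typing import Dict

# Every build is kept "on a shelf": version -> environment-agnostic package.
shelf: Dict[str, dict] = {}
# What is currently running in each environment.
environments: Dict[str, dict] = {}

# Hypothetical environment-specific settings, applied only at deploy time.
ENV_CONFIG = {
    "staging": {"db_url": "postgres://staging.example/db"},
    "production": {"db_url": "postgres://prod.example/db"},
}


def build(version: str, code: str) -> None:
    """Build stage: produce a package that makes no assumptions about its target."""
    shelf[version] = {"version": version, "code": code}


def deploy(version: str, env: str) -> dict:
    """Deploy stage: apply config and replace whatever was running in env."""
    package = shelf[version]
    environments[env] = {**package, "config": ENV_CONFIG[env]}
    return environments[env]


build("1.0.0", "print('v1')")
build("1.1.0", "print('v1.1')")
deploy("1.1.0", "production")
deploy("1.0.0", "production")  # rolling back is just deploying an older package
```

Because the package never contains environment settings, the same artifact can go to staging or production, and a rollback needs no rebuild.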
Failsafe software == Fearless engineers
• Combining each of these patterns and
sticking to them religiously, we can
absolutely wreck our system and
everything will be OK
• When you know the chances of
breaking something are small, you can
be fearless
Traditional Testing in an SDLC
[Diagram: “The Loop”]
1. Product owner writes user story and team refines
2a. Engineer implements the story
2b. Tester writes test cases
3. QA tests
4. Release
• Track which engineer owns each story, and track each time it comes back from QA
• Number of cycles went way down
• NEVER came back from QA due to failed ACs
• All failures were non-obvious and unexpected downstream impacts outside the scope of the story
Two Kinds of QA…
Testing
• Exploratory
• Requires real domain expertise
• As well as historical knowledge of the
specific system
• And good judgement
Checking
• Verifying the business requirements are met
• Covers functional and non-functional
requirements
• As long as user stories are decent, anyone
with basic domain knowledge can do it
Don’t do this This is better…
But it’s still slow and unpredictable.
More predictable ► More rigorous ► Even slower
Gates are a Cost
[Diagram: the Traditional Testing loop, from user story through QA tests to Release]
• Engineering accountability drives down QA failures: most of what we submit to QA now passes… but we still test it.
• We pay the cost of testing every time, but we only realize the benefit of testing that one rare time QA catches a problem.
Automated Testing After Deployment – No Gates
1. Product owner writes user story and team refines, including defining each positive and negative test
2. Engineer implements the story, including building the automated tests
3a. Engineer adds the new or updated automated tests to the battery
3b. Engineer deploys the changes (Release)
4. Continuous feedback, 24/7: automated test battery runs continuously against production
Why continuously, and why in production?
• Rules for acceptable and unacceptable
behavior must hold not just when you change
the software, but also when the conditions in
which your software runs change
(different and more data, different loads,
etc.)
• The code changes you’re deploying are
simply one dimension of continuous change
• Fail-safe software won’t do anything
destructive
• Get from concept to effects on the real world
as quickly as possible
You can start to do more interesting things,
such as:
• Combine the one-step deployable artifacts
with automated tests to do automated
rollbacks
• Use live service tracing with automated tests
to create an early-warning system
(catch cascading failures before they happen)
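An automated rollback built from those two pieces might look roughly like this in Python. The `deploy_with_rollback` function, the check signature, and the fake deploy and check used to exercise it are all hypothetical; real checks would be the automated tests running against production.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]  # takes the live version, returns True when healthy


def deploy_with_rollback(
    new_version: str,
    previous_version: str,
    deploy: Callable[[str], None],
    checks: List[Check],
) -> str:
    """Deploy first, then test; redeploy the previous package if any check fails."""
    deploy(new_version)
    if all(check(new_version) for check in checks):
        return new_version
    deploy(previous_version)
    return previous_version


live: Dict[str, str] = {}


def fake_deploy(version: str) -> None:
    """Stand-in for a one-click deploy of an already-built package."""
    live["version"] = version


def rebate_total_is_correct(version: str) -> bool:
    """Stand-in for a real automated test against production."""
    return version != "2.0.0-broken"


kept = deploy_with_rollback("2.0.0-broken", "1.9.0", fake_deploy, [rebate_total_is_correct])
```

This only works because the previous version is already packaged and deployable in one step.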
Engineers have to write test cases?
• Every other kind of engineering (mechanical,
structural, etc.) characterizes and quantifies
the failure modes of the thing they’re
engineering.
• In what ways can it fail?
• What are the stressors and limits?
• What is the risk, impact, and mitigation of
each type of potential failure?
Make Quantifying Testing Part of User Story Requirements
• Describe each potential failure mode (“what is
the worst possible bug I could introduce?”)
• Y-Axis: Impact of Problem
1. Moderate impact
2. Significant but correctable impact
3. Irreparable harm to reputation or integrity, or
unrecoverable loss of cash
• X-Axis: Invisibility of Problem
1. Very unlikely to be missed during normal dev-
test cycle
2. Discoverable in diligent manual testing pass
3. Likely to be missed - subtle or complex behavior
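The two axes can be turned into a simple prioritization, sketched here in Python. The slide defines only the two scales; multiplying them into a score, the threshold of 4, and the example failure modes are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    description: str
    impact: int        # Y-axis, 1 (moderate) to 3 (irreparable)
    invisibility: int  # X-axis, 1 (hard to miss) to 3 (likely to be missed)

    @property
    def score(self) -> int:
        # Assumption: multiply the axes; the slide only defines the two scales.
        return self.impact * self.invisibility


def needs_automated_test(mode: FailureMode, threshold: int = 4) -> bool:
    """Hypothetical rule: automate a test once the score reaches the threshold."""
    return mode.score >= threshold


modes: List[FailureMode] = [
    FailureMode("typo in a label", impact=1, invisibility=1),
    FailureMode("rebate paid twice on retry", impact=3, invisibility=3),
    FailureMode("settlement rounding off by a cent", impact=2, invisibility=3),
]
ranked = sorted(modes, key=lambda m: m.score, reverse=True)
```

Ranking by score puts the high-impact, easy-to-miss failure modes at the top of the list of tests to write.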
Don’t Need Big Fancy Test Frameworks or Platforms
• Postman to craft your test calls
• Save out the Postman call definitions as files
right into your source control alongside the
thing they test
• newman is a CLI to programmatically run
Postman calls
• Use whatever code you want to look at the
response and decide whether it’s good or
bad
• Save the results to a DB or something
• Report on it!
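A rough Python sketch of that workflow follows. It assumes newman is installed and on the PATH, and that its JSON reporter writes assertion counts under `run.stats.assertions`; the function names, file paths, and SQLite table are hypothetical choices for the example.

```python
import json
import sqlite3
import subprocess
from typing import Tuple


def run_collection(collection_path: str, report_path: str) -> dict:
    """Run a saved Postman collection with newman and load its JSON report."""
    subprocess.run(
        ["newman", "run", collection_path,
         "--reporters", "json", "--reporter-json-export", report_path],
        check=False,  # newman exits non-zero on failed assertions; we still want the report
    )
    with open(report_path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(report: dict) -> Tuple[int, int]:
    """Return (total assertions, failed assertions) from the report's run stats."""
    stats = report["run"]["stats"]["assertions"]
    return stats["total"], stats["failed"]


def save_result(db: sqlite3.Connection, name: str, total: int, failed: int) -> None:
    """Append one row per run so results can be reported on later."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS runs "
        "(name TEXT, total INTEGER, failed INTEGER, ran_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    db.execute("INSERT INTO runs (name, total, failed) VALUES (?, ?, ?)", (name, total, failed))
    db.commit()
```

A scheduled job could call `run_collection` on each collection stored in source control, pass the report to `summarize`, and record the outcome with `save_result`.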
Where does QA fit?
Examples
• Mobile apps
• Hardware
Characteristics
• High MTTR* Components
• Qualitative, not quantitative
• User-facing only
• Not “is it broken?” (automated tests validate
that), but “is the experience as good as it
should be?”
* Mean Time To Resolution
Testing Maturity Ladder
1. Manual Testing: QA manually gates deployment
2. Test Engineers: Testers take on responsibility of automating their work
3. Engineers Test: Creating validation is a first-class part of every engineer’s daily work, and computers execute it.
Optimize your SDLC for Speed
When you automate validation during
development, you:
• Know any future change that has
unexpected and non-obvious
downstream breakage will be caught
deterministically
• Free up humans to do higher-value
work than verification
• Have continuous feedback on end-
to-end health of every potential failure
mode
“You Build It, You Run It”
• Make no distinction between
operations and development.
They are literally the same thing.
• “Operations” involves:
– Configuring the runtime environment
– Deploying the software
– Monitoring (and responding to) the
health of the runtime environment
Operations Standup
Operations Standup
• What errors got logged or tests failed in the
last 24 hours?
• Who will own resolving each one today?
Traditional Agile Standup
• What did you do yesterday?
• What do you plan to do today?
• What impediments stand in your way?
Mandate: zero errors unaccounted for!
No “oh yeah I’ve seen that error, it’s not
actually a problem”
Shoot Things, Just Because You Can
Fully Automated Infrastructure
• Servers have no names. We never even log
into them, except for forensics.
• Make changes in the middle of the day,
because the whole team is online and 100%
engaged
• Kill resources daily, just to force them to be
automatically replaced and verify everything
still works
Traditional Ops/Infrastructure
• “Is the error coming from SEGOT-APP-2503
or SEGOT-APP-2505? You know 03 tends to
get a little wobbly sometimes”
• Make changes in the middle of the night on
the weekend to “minimize impact”
• Reboot things very carefully
THANK YOU! @rexm rexm
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Kürzlich hochgeladen (20)

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Codestock 2018 - Deliver at Warp Speed

  • 1. Inmar – do not copy, distribute or use without Inmar written permission, 2018 1
  • 2. Deliver at Warp Speed: Deploying to production 100x a day, with no QA and zero downtime. Rex Morgan, Director, Software Engineering
  • 3. How We Do It, And How You Can Too. Building Failsafe Software: • Event Sourcing • Idempotency • Async-First • Circuit Breakers & Auto-Retry • Real-Time Architecture • Continuous Deployment. Testing and SDLC: • First-class automated testing • Deploy then test (not the other way around) • Decide what to test using Science™ • Trunk-based development. Operations: • You Build It, You Run It • Monitoring and Alarms • Ops Standup • Shoot things just because you can
  • 4. What is Inmar? Ever used a paper or digital coupon? Ever redeemed a rebate? Ever filled a prescription, returned something to a store or online, stayed at a hospital, read a blog, or talked to a chatbot on social media? If so, you’ve touched Inmar. • > $1b / week processed • > 100b consumer touchpoints • 100s of deployments / day
  • 5. Client’s TRUST is Paramount. How do we go FAST with CONFIDENCE? What do we GAIN?
  • 6. I am an Engineer • 3 years as Director of Engineering at Inmar • 5 years as Architect at Qorvo and Volvo • 12 years as a professional software developer • At Inmar, managers, directors – all the way up to the CTO – are technical – we code! • I get to work with some of the most talented and sharp engineers around. Want to talk about joining us? Hit me up after! AND YOU? MY TEAM:
  • 7. The Game. Our leaders are always asking, begging us to deliver more, faster. Why? Competition is always about trying to gain an advantage. THE STAKES: We win: • More jobs • Bigger projects • Higher pay. We lose: • Fewer jobs • Stressful workplace • No $ for raises
  • 8. [Chart: business value delivered vs. time elapsed. Phases in order: Planning, Architecture (ooo fun), Development, QA/UAT, then BAM! RELEASE]
  • 9. [Chart: business value delivered vs. time elapsed, across Sprint 1, Sprint 2, Sprint 3, Sprint 4, Sprint 5, Sprint 6, …] Agile advantages: • Greater cumulative value delivered • Team velocity increases over time
  • 10. OODA • Observe: measure our environment. Produces what we know about the environment. • Orient ourselves: fit what we know about the environment with our strengths and weaknesses. Produces possible paths to go down. • Decide: choose the best course of action. Produces action plan. • Act: execute the plan! Produces effects on the environment.
  • 11. OODA (Observe, Orient ourselves, Decide, Act) • This is how we Learn: by Shipping • Business leaders are playing chess – they need to make a move to see its effect. They need us to deliver. • Code that isn’t live, being exercised to deliver real value, is just practice. • If one full loop takes too long to complete, the plan you are executing is based on out-of-date observations! • If we aren’t looping as fast as we possibly can, we aren’t maximizing our own potential.
  • 12. After a year, the team that OODA loops 100x a day will have accelerated far beyond the team that loops once every 3 weeks.
  • 14. If it’s painful, you should do it more often. Amount of fear: Unknown > Known. When you start doing things that hurt more often, they go from “I fear it will fail in some horrible way” to “I see that it fails in this very specific way”
  • 15. Calculated Chaos • When you go through this cycle of pressing on your system in ways that hurt, you discover, describe, quantify, and correct the fragile parts of your system. Once you overcome one, you’ll find another, and another, but they will become rarer and rarer. • The really amazing thing that happens is: as you remove fragilities and they become rarer, your system becomes more resilient generally. Because you proactively found and fixed the specific ways in which it will break, your system can now withstand that entire class of failures.
  • 16. “Your MARGIN is my OPPORTUNITY” Countless companies out there, large and small, are looking for weakness where they can slip into the margins where you are vulnerable, and take your business. Our responsibility as professionals is to proactively defend our companies’ turf in our area of expertise: delivering technology.
  • 17. How We Do It, And How You Can Too. Building Failsafe Software: • Event Sourcing • Idempotency • Async-First • Circuit Breakers & Auto-Retry • Real-Time Architecture • Continuous Deployment. Testing and SDLC: • First-class automated testing • Deploy then test (not the other way around) • Decide what to test using Science™ • Trunk-based development. Operations: • You Build It, You Run It • Monitoring and Alarms • Ops Standup • Shoot things just because you can
  • 18. Event Sourcing • Traditional CRUD system: – Model your entities – Use a data access layer (DAL) for Create, Read, Update, and Delete – What if I need to know what happened to the entity in the past? – Oh, right… add an audit table • Problems with that approach: – DELETE is destructive (obviously), but so is UPDATE – The audit table approach is not failsafe. The primary action is guaranteed, but the audit can break. – So even with audit tables, every UPDATE is potentially destroying data – specifically, the fully-complete answer to the question “what was it before the update?”
  • 19. Event Sourcing (cont’d) • Get rid of UPDATEs. In fact, get rid of directly storing your entities! • Instead, just write the intended change directly to the audit table • When I want to know what the entity is, I just replay all the audit records in order • When I want to know what the entity was at any point in time, I just replay up to that point.
  • 20. Event Sourcing (cont’d) • Every financial institution in the world uses it, and they seem to manage transactional volume decently well • The most common performance booster is snapshotting. Every nth record (or even every record), we cache the resulting entity alongside the event, so we don’t have to replay from the beginning of time.
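The replay and snapshot ideas on the three Event Sourcing slides can be sketched in a few lines. This is a minimal in-memory illustration, not Inmar's implementation: the names `Event`, `EventStore`, `replay`, and `snapshot`, and the rule that each event's data is merged over the previous state, are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    """One intended change. Events are appended, never updated or deleted."""
    entity_id: str
    kind: str                 # hypothetical event names, e.g. "submitted"
    data: dict[str, Any]


@dataclass
class EventStore:
    events: list[Event] = field(default_factory=list)
    # snapshots[entity_id] = (number of events already folded in, cached state)
    snapshots: dict[str, tuple[int, dict[str, Any]]] = field(default_factory=dict)

    def append(self, event: Event) -> None:
        self.events.append(event)

    def replay(self, entity_id: str, upto: Optional[int] = None) -> dict[str, Any]:
        """Rebuild the entity by folding its events in order.

        `upto` limits the replay to the first N events of the entity, which
        answers "what was it at that point?".
        """
        history = [e for e in self.events if e.entity_id == entity_id]
        if upto is not None:
            history = history[:upto]
        start, state = 0, {}
        snap = self.snapshots.get(entity_id)
        # Only use the snapshot if it does not reach past the requested point.
        if snap is not None and snap[0] <= len(history):
            start, state = snap[0], dict(snap[1])
        for event in history[start:]:
            # Assumed fold rule: later event data overrides earlier state.
            state = {**state, **event.data, "last_event": event.kind}
        return state

    def snapshot(self, entity_id: str) -> None:
        """Cache the current state so later replays can skip older events."""
        count = sum(1 for e in self.events if e.entity_id == entity_id)
        self.snapshots[entity_id] = (count, self.replay(entity_id))
```

With a rebate that is first recorded at $20 and later corrected to $50, `replay("rebate-1")` yields the corrected amount while `replay("rebate-1", upto=1)` still yields the original one, because nothing was overwritten.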
  • 21. Let’s Build A Rebates System! [Flow diagram] • Shopper makes a purchase (in-store or online) • Submits a rebate form (paper or online) • Submission: Did the consumer buy the right product? Do they meet all the criteria? How much $ are they due? • “We owe you $20!” • Settlement: How do we remit payment? Do we have enough $ in the bank? What happens if it comes back unclaimed? • “You paid me $20 but I should have gotten $50” • Correction: “We owe you $50!”
  • 22. Idempotency. Old-school: POST creates an object… we use an incrementing primary key. So if I POST the exact same object again, it will create a duplicate record with a higher ID. That is almost never what we want. In computer science, the term idempotent is used more comprehensively to describe an operation that will produce the same results if executed once or multiple times.
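One common way to make a create operation idempotent is to let the caller supply the identifier instead of relying on an incrementing key. The sketch below assumes that approach; `SubmissionStore`, `create`, and `new_key` are hypothetical names, and the in-memory dictionary stands in for a real table.

```python
import uuid
from typing import Any


class SubmissionStore:
    """In-memory stand-in for a table keyed by a client-supplied identifier."""

    def __init__(self) -> None:
        self._by_key: dict[str, dict[str, Any]] = {}

    def create(self, idempotency_key: str, payload: dict[str, Any]) -> dict[str, Any]:
        """Create the record once; repeats with the same key return it unchanged."""
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing
        record = {"id": idempotency_key, **payload}
        self._by_key[idempotency_key] = record
        return record

    def count(self) -> int:
        return len(self._by_key)


def new_key() -> str:
    """The caller generates the key once and reuses it on every retry."""
    return str(uuid.uuid4())
```

Because a retried call with the same key cannot create a second record, the caller is free to resend a request whenever it is unsure whether the first attempt arrived.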
  • 23. Async-First • Myth: most business operations need to block until the operation is completely executed. • Fact: most business operations only need to acknowledge the request to execute was received. • Being synchronous is an additional constraint – a temporal constraint – on any feature. Constraints limit implementation options – and increase cost. • It must be a business requirement – with justification and value attached – for an operation to be synchronous. Otherwise, everything is implemented async by default.
  • 24. Combine Event Sourcing with Message Bus [Diagram: a request creates or updates a submission through the Submission API, which keeps its own event log (Event 0 to Event 4) and publishes each event to the bus. A worker determines if a settlement is needed, and calls the Settlement API, which keeps its own event log (Event 0 to Event 4) and publishes each event to the bus.]
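The async-first flow in this diagram can be illustrated with a toy, single-process version. Everything here is an assumption for the sake of the sketch: `queue.Queue` stands in for a real message bus, the two lists stand in for each API's event log, and the rule "settle whenever the amount due is positive" is invented.

```python
import queue
from typing import Any

bus: "queue.Queue[dict[str, Any]]" = queue.Queue()   # stand-in for a message bus
submission_events: list[dict[str, Any]] = []          # Submission API event log
settlement_events: list[dict[str, Any]] = []          # Settlement API event log


def submit(submission_id: str, amount_due: int) -> dict[str, Any]:
    """Submission API: record the event, publish it, acknowledge right away."""
    event = {"type": "submission_recorded", "id": submission_id, "amount_due": amount_due}
    submission_events.append(event)
    bus.put(event)
    # The caller only learns that the request was received, not that it settled.
    return {"status": "accepted", "id": submission_id}


def settle(submission_id: str, amount: int) -> None:
    """Settlement API: record the settlement event and publish it."""
    event = {"type": "settlement_recorded", "id": submission_id, "amount": amount}
    settlement_events.append(event)
    bus.put(event)


def run_worker() -> None:
    """Worker: drain the bus and decide whether each submission needs a settlement."""
    while True:
        try:
            event = bus.get_nowait()
        except queue.Empty:
            return
        # Hypothetical rule: any positive amount due triggers a settlement.
        if event["type"] == "submission_recorded" and event["amount_due"] > 0:
            settle(event["id"], event["amount_due"])
```

The point of the split is that `submit` returns before any settlement work happens; the settlement appears only once the worker has processed the published event.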
  • 25. Auto-Retry and Circuit Breakers
  • 26. Twine. When your solution is made of many processes (functions, or microservices, or whatever), a lot more of your regular coding involves Inter-Process Calls (IPC). • LINQ was made because working with sets is very common in business apps – it gives us a standard model for thinking about sets • Twine (and others) were made because in microservices, IPCs are very common – it gives us a standard model for thinking about IPC. Features: • Service discovery • Protocol abstraction • Load-balancing • Auto-retry and backoff • Auto-fallback and failover • Circuit breakers • Authentication (like JWT) • Tracing • Completely pluggable and extensible • Implementations in .NET/C#, JavaScript, and Go
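To make two of the listed features concrete, here is a generic sketch of auto-retry with exponential backoff wrapped around a circuit breaker. It is not Twine's API, which the slides do not show; `CircuitBreaker`, `CircuitOpenError`, `retry`, and all thresholds are assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after    # seconds to stay open before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            self.opened_at = None         # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0                 # any success closes the circuit
        return result


def retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.1,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise                         # do not hammer a dependency that is down
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be at least 1")
```

A caller would combine them as `retry(lambda: breaker.call(fetch))`, where `fetch` is whatever remote call is being protected: transient errors are retried, while a dependency that keeps failing trips the breaker and subsequent calls fail fast.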
  • 27. Builds & One-Click Deploy
  • 28. Builds & One-Click Deploy • Make sure the package can be deployed to any environment (makes no assumptions) • Separate Build from Deploy • Setting environment-specific config or values is part of the deploy stage. • With every previous build already packaged, one-click deploy means it’s easy to roll back. [Pipeline diagram: Check-in change → Master → Trigger automated build → Build produces a deployable package to sit on a shelf forever → Deploy: apply environment-specific config; remove previous version from target environment; add desired version to target environment]
  • 29. Builds & One-Click Deploy [Pipeline diagram: Check-in change → Master → Trigger automated build → Build produces a deployable package to sit on a shelf forever → Deploy: apply environment-specific config; remove previous version from target environment; add desired version to target environment] • If you don’t have this, start here • So many tools – already solved for any shop, any platform: – TFS / VSO / Release Manager – TeamCity – Jenkins
  • 30. Failsafe software == Fearless engineers • By combining each of these patterns and sticking to them religiously, we can absolutely wreck our system and everything will be OK • When you know the chances of breaking something are small, you can be fearless
  • 31. How We Do It, And How You Can Too. Building Failsafe Software: • Event Sourcing • Idempotency • Async-First • Circuit Breakers & Auto-Retry • Real-Time Architecture • Continuous Deployment. Testing and SDLC: • First-class automated testing • Deploy then test (not the other way around) • Decide what to test using Science™ • Trunk-based development. Operations: • You Build It, You Run It • Monitoring and Alarms • Ops Standup • Shoot things just because you can
  • 32. Traditional Testing in an SDLC [Diagram] 1. Product owner writes user story and team refines 2a. Engineer implements the story 2b. Tester writes test cases 3. QA tests 4. Release (“The Loop” marks the cycle between implementation and QA)
  • 33. Traditional Testing in an SDLC [Diagram] 1. Product owner writes user story and team refines 2a. Engineer implements the story 2b. Tester writes test cases 3. QA tests 4. Release (“The Loop” marks the cycle between implementation and QA) • Track which engineer owns each story, and track each time it comes back from QA • Number of cycles went way down • NEVER came back from QA due to failed acceptance criteria (ACs) • All failures were non-obvious and unexpected downstream impacts outside the scope of the story
  • 34. Two Kinds of QA… Testing: • Exploratory • Requires real domain expertise • As well as historical knowledge of the specific system • And good judgement. Checking: • Verifying the business requirements are met • Covers functional and non-functional requirements • As long as user stories are decent, anyone with basic domain knowledge can do it
  • 35. Two Kinds of QA… Testing: • Exploratory • Requires real domain expertise • As well as historical knowledge of the specific system • And good judgement. Checking: • Verifying the business requirements are met • Covers functional and non-functional requirements • As long as user stories are decent, anyone with basic domain knowledge can do it. Annotations: Don’t do this. This is better… But it’s still slow and unpredictable. More predictable ► More rigorous ► Even slower
  • 36. Gates are a Cost [Diagram] 1. Product owner writes user story and team refines 2a. Engineer implements the story 2b. Tester writes test cases 3. QA tests 4. Release (“The Loop” marks the cycle between implementation and QA) • Engineering accountability drives down QA failures – most of what we submit to QA now passes… but we still test it. • We pay the cost of testing every time, but we only realize the benefit of testing that one rare time it catches a problem.
  • 37. Automated Testing After Deployment – No Gates [Diagram] 1. Product owner writes user story and team refines, including defining each positive and negative test 2. Engineer implements the story, including building the automated tests 3a. Engineer adds the new or updated automated tests to the battery 3b. Deploys the changes 4. Release. Automated test battery runs continuously against production: continuous feedback, 24/7
  • 39. Why continuously, and why in production? • Rules for acceptable and unacceptable behavior must hold not just when you change the software, but also when the conditions in which your software runs change (different and more data, different loads, etc.) • The code changes you’re deploying are simply one dimension of continuous change • Fail-safe software won’t do anything destructive • Get from concept to effects on the real world as quickly as possible (OODA: Observe, Orient ourselves, Decide, Act)
  • 40. Why continuously, and why in production? You can start to do more interesting things, such as: • Combine the one-step deployable artifacts with automated tests to do automated rollbacks • Use live service tracing with automated tests to create an early-warning system (cascading failures before they happen) (OODA: Observe, Orient ourselves, Decide, Act)
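The automated-rollback idea, pairing a one-step deployable artifact with the production test battery, can be sketched as a small control function. The `deploy` and `run_tests` callables are placeholders for whatever release tool and test runner a team already has; the function name and version strings are hypothetical.

```python
from typing import Callable


def deploy_then_test(new_version: str, previous_version: str,
                     deploy: Callable[[str], None],
                     run_tests: Callable[[], bool]) -> str:
    """Deploy first, test against the live system, and roll back on failure.

    Returns the version that is live when the function finishes.
    """
    deploy(new_version)
    if run_tests():
        return new_version
    # Every earlier build is still a packaged artifact, so rolling back is
    # just another deploy of the previous version.
    deploy(previous_version)
    return previous_version
```

This only works as a safety net if the previous package is still available unchanged, which is the reason for keeping every build on the shelf.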
  • 41. Engineers have to write test cases? • Every other kind of engineering (mechanical, structural, etc.) characterizes and quantifies the failure modes of the thing being engineered. • In what ways can it fail? • What are the stressors and limits? • What is the risk, impact, and mitigation of each type of potential failure?
  • 42. Make Quantifying Testing Part of User Story Requirements • Describe each potential failure mode (“what is the worst possible bug I could introduce?”) • Y-Axis: Impact of Problem 1. Moderate impact 2. Significant but correctable impact 3. Irreparable harm to reputation or integrity, or unrecoverable loss of cash • X-Axis: Invisibility of Problem 1. Very unlikely to be missed during normal dev-test cycle 2. Discoverable in diligent manual testing pass 3. Likely to be missed – subtle or complex behavior
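The two axes above can be recorded per failure mode and used to rank what to automate first. The slide defines the 1 to 3 scales but does not say how to combine them; multiplying impact by invisibility is an assumption made here, as are the names `FailureMode` and `prioritize`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    impact: int         # Y-axis: 1 moderate, 2 significant but correctable, 3 irreparable
    invisibility: int   # X-axis: 1 unlikely to be missed, 2 found by diligent testing, 3 likely missed

    def __post_init__(self) -> None:
        for name in ("impact", "invisibility"):
            if getattr(self, name) not in (1, 2, 3):
                raise ValueError(f"{name} must be 1, 2, or 3")

    @property
    def score(self) -> int:
        # Assumed combination rule: product of the two axes (range 1 to 9).
        return self.impact * self.invisibility


def prioritize(modes: list[FailureMode]) -> list[FailureMode]:
    """Highest-scoring failure modes first; these deserve automated tests soonest."""
    return sorted(modes, key=lambda m: m.score, reverse=True)
```

A failure mode scored 3 on both axes, such as a subtle bug that loses cash, lands at the top of the list, while an obvious cosmetic defect lands at the bottom.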
  • 43. Don’t Need Big Fancy Test Frameworks or Platforms • Postman to craft your test calls • Save out the Postman call definitions as files right into your source control alongside the thing they test • newman is a CLI to programmatically run Postman calls • Use whatever code you want to look at the response and decide whether it’s good or bad • Save the results to a DB or something • Report on it!
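The steps on this slide can be wired together with very little code. The sketch below shells out to `newman run` with its JSON reporter, reduces the report to a pass/fail summary, and stores it in SQLite. The function names, the `runs` table, and the choice of SQLite are assumptions; the summary reads the `run.stats.assertions` counters from the report, so check that against the newman version in use.

```python
import json
import sqlite3
import subprocess
import tempfile
from pathlib import Path


def run_collection(collection: str) -> dict:
    """Run a Postman collection file with newman and return its JSON report."""
    with tempfile.TemporaryDirectory() as tmp:
        report_path = Path(tmp) / "report.json"
        # newman exits non-zero when assertions fail, so check=True is not used.
        subprocess.run(
            ["newman", "run", collection,
             "--reporters", "json",
             "--reporter-json-export", str(report_path)],
            check=False,
        )
        return json.loads(report_path.read_text(encoding="utf-8"))


def summarize(report: dict) -> dict:
    """Decide good or bad from the assertion counters in the report."""
    stats = report["run"]["stats"]["assertions"]
    return {"total": stats["total"], "failed": stats["failed"], "ok": stats["failed"] == 0}


def save(db: sqlite3.Connection, collection: str, summary: dict) -> None:
    """Append one row per run so results can be reported on later."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "collection TEXT, total INTEGER, failed INTEGER, "
        "ran_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    db.execute(
        "INSERT INTO runs (collection, total, failed) VALUES (?, ?, ?)",
        (collection, summary["total"], summary["failed"]),
    )
    db.commit()
```

Running `save(db, path, summarize(run_collection(path)))` on a schedule against production gives the continuous battery described earlier, with the `runs` table as the source for reporting.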
  • 44. Where does QA fit? Examples: • Mobile apps • Hardware • High MTTR* Components. Characteristics: • Qualitative, not quantitative • User-facing only • Not “is it broken?” (automated tests validate that), but “is the experience as good as it should be?” * Mean Time To Resolution
  • 45. Testing Maturity Ladder 1. Manual Testing: QA manually gates deployment 2. Test Engineers: Testers take on responsibility of automating their work 3. Engineers Test: Creating validation is a first-class part of every engineer’s daily work, and computers execute it.
  • 46. Optimize your SDLC for Speed. When you automate validation during development, you: • Know any future change that has unexpected and non-obvious downstream breakage will be caught deterministically • Free up humans to do higher-value work than verification • Have continuous feedback on end-to-end health of every potential failure mode
  • 47. How We Do It, And How You Can Too. Building Failsafe Software: • Event Sourcing • Idempotency • Async-First • Circuit Breakers & Auto-Retry • Real-Time Architecture • Continuous Deployment. Testing and SDLC: • First-class automated testing • Deploy then test (not the other way around) • Decide what to test using Science™ • Trunk-based development. Operations: • You Build It, You Run It • Monitoring and Alarms • Ops Standup • Shoot things just because you can
  • 48. “You Build It, You Run It” • Make no distinction between operations and development. They are literally the same thing. • “Operations” involves: – Configuring the runtime environment – Deploying the software – Monitoring (and responding to) the health of the runtime environment
  • 51. Operations Standup. Traditional Agile Standup: • What did you do yesterday? • What do you plan to do today? • What impediments stand in your way? Operations Standup: • What errors got logged or tests failed in the last 24 hours? • Who will own resolving each one today? Mandate: zero errors unaccounted for! No “oh yeah I’ve seen that error, it’s not actually a problem”
  • 52. Shoot Things, Just Because You Can. Traditional Ops/Infrastructure: • “Is the error coming from SEGOT-APP-2503 or SEGOT-APP-2505? You know 03 tends to get a little wobbly sometimes” • Make changes in the middle of the night on the weekend to “minimize impact” • Reboot things very carefully. Fully Automated Infrastructure: • Servers have no names. We never even log into them, except for forensics. • Make changes in the middle of the day, because the whole team is online and 100% engaged • Kill resources daily, just to force them to be automatically replaced and verify everything still works
  • 53. How We Do It, And How You Can Too. Building Failsafe Software: • Event Sourcing • Idempotency • Async-First • Circuit Breakers & Auto-Retry • Real-Time Architecture • Continuous Deployment. Testing and SDLC: • First-class automated testing • Deploy then test (not the other way around) • Decide what to test using Science™ • Trunk-based development. Operations: • You Build It, You Run It • Monitoring and Alarms • Ops Standup • Shoot things just because you can
  • 54. THANK YOU! @rexm rexm

Editor's Notes

  1. Continuous delivery – who here is familiar with it? OK keep your hand up if you do it today.
  2. Building Failsafe software: We’re going to walk through SPECIFIC examples of what we do at Inmar to make it work… then circle back and go “OK, what are some common themes”. Testing and SDLC: If we put these things together to make software that is FAIL-SAFE, then where does testing fit in? And what do we mean when we say “No QA”? Operations: When you are changing your software 100 times a day, operations and development merge into one harmony and rhythm. So what does operations look like when it is part of dev, and dev is part of operations? (This is NOT what most people call “devops”)
  3. Every week, over $1b in value flows through our systems. We don’t lose a penny, and yet we deliver new, innovative features directly into production every single day.
  4. Our clients’ TRUST is paramount. They trust us to handle their money and their customers, and to get it right every time. How do we do this? How do we go THAT fast AND be confident we aren’t going to break things and lose that trust? And what do we gain? Why do we even do this – isn’t releasing weekly/monthly/whenever good enough?
  5. Anybody know who this is? (John Boyd) Anybody know what he invented?
  6. No business value… none… look at all that business value!
  7. No business value… none… look at all that business value!
  8. Demo both Swizl and Hopster Emphasize tech and value of all mobile apps – Reference Health-e-Basket
  9. Demo both Swizl and Hopster Emphasize tech and value of all mobile apps – Reference Health-e-Basket
  10. No business value… none… look at all that business value!
  11. When you first start deploying more often than is comfortable, things break. That’s why people are scared of doing it: they know it will break, but they don’t know how! (If you knew how it would break, you would just mitigate it.) So you have this sense that your system isn’t resilient enough to handle all the engineers deploying new versions willy-nilly, and you’re right. And things break.
  12. Adrian Cockcroft (chief architect at Netflix and now VP of Cloud Architecture Strategy at Amazon) likened this to being blindfolded on a hill near a cliff. We are all blindfolded, on a hill near a cliff. If you don’t probe your system for fragility, you wander around, and the longer you wander without falling, the more you assume the cliff is far away and you’re safe. Then one day you get blindsided when it turns out you were one step away, and it’s horrible. You’re not doing this because you’re stupid or an arrogant ass; you’re doing it because you don’t know. It’s ignorance. I’ve been there, and I am sure many of you have been too.
  13. OK, so we are going to build a system that can represent Submissions and Settlements. - A Submission is all about the business rules of should we pay you and if so, how much? - We can have multiple settlements to a submission, and settlements are all about physically executing payments.
  14. Who is familiar with idempotency? OK, keep your hands up if some of your APIs are idempotent. OK, keep them up if ALL of your APIs are idempotent.
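As a minimal sketch of what an idempotent API can look like: the handler below records each request under a caller-supplied key and returns the stored result on any repeat, so sending the same payment request twice has the same effect as sending it once. `SettlementService`, `execute_payment`, and the `idempotency_key` parameter are hypothetical names chosen for illustration; they are not taken from the talk or from Inmar's systems.

```python
from dataclasses import dataclass, field


@dataclass
class SettlementService:
    """Toy in-memory payment executor (hypothetical, for illustration only)."""

    # Results already produced, keyed by the caller-supplied idempotency key.
    _results: dict = field(default_factory=dict)
    # Running total of money actually paid out, in cents.
    total_paid: int = 0

    def execute_payment(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key means the caller is retrying: return the earlier
        # result instead of paying a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.total_paid += amount_cents
        result = {"key": idempotency_key, "paid_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = SettlementService()
first = service.execute_payment("settlement-001", 2500)
repeat = service.execute_payment("settlement-001", 2500)  # retry of the same call
```

A real service would keep the key-to-result mapping in durable storage and guard it against concurrent requests; the in-memory dictionary here only shows the shape of the idea.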
  15. If everything is idempotent, then it’s safe to let the system repeat failing calls automatically until one goes through. We also need to talk about situations like things getting really slow (circuit breaker). And now… a quick detour into Twine.
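The combination of auto-retry and a circuit breaker described in note 15 can be sketched roughly as follows. This is an illustrative toy, not the implementation used at Inmar: `CircuitBreaker`, `call_with_retry`, `flaky_payment`, and all thresholds are assumed names and values. It treats any exception (including a timeout from a slow dependency) as a failure, and it assumes the wrapped call is idempotent, which is what makes blind retries safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call a struggling dependency."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_seconds:
                raise CircuitOpenError("dependency is cooling down")
            # Cool-down elapsed: allow one trial call. One more failure
            # re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result


def call_with_retry(func, breaker, max_attempts=5, delay_seconds=0.0):
    """Repeat an idempotent call until it succeeds or the breaker opens."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise  # stop hammering a dependency the breaker has cut off
        except Exception as error:
            last_error = error
            time.sleep(delay_seconds)
    raise last_error


attempts = {"count": 0}


def flaky_payment():
    # Simulated dependency: times out twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("payment provider timed out")
    return "paid"


breaker = CircuitBreaker(failure_threshold=5)
outcome = call_with_retry(flaky_payment, breaker)
```

The retry loop gives transient failures a chance to clear, while the breaker stops the retries from piling onto a dependency that keeps failing.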
  16. In 2008, started tracking stories and tying them directly to engineers’ performance evaluations. Cycles went down. Learned there are two kinds of QA.
  17. Verifying the requirements are met… if the requirements aren’t met, why did the engineer even send it to QA? It’s not done. We don’t pay adults to follow up on other adults and check that they did just what they were asked to do… it was written down for them, and discussed at length in backlog refinement, and they had the opportunity to say “this is not clear” before committing to it. That’s why it’s called a sprint commitment in agile, because you’re committing to do it. If an engineer can’t complete the requirements without having another adult check their work, that’s not even meeting the basic minimum level of professionalism. Fire them. What remains is testing. But with manual QA we have a problem… discovery of these failure cases is entirely reliant on the exploratory abilities of the individual tester, and to some degree just luck. A great tester will find most, but that’s still very time-consuming. Ideally we want a way to catch unexpected and non-obvious downstream impacts in a deterministic way.
  18. Verifying the requirements are met… if the requirements aren’t met, why did the engineer even send it to QA? It’s not done. We don’t pay adults to follow up on other adults and check that they did just what they were asked to do… it was written down for them, and discussed at length in backlog refinement, and they had the opportunity to say “this is not clear” before committing to it. That’s why it’s called a sprint commitment in agile, because you’re committing to do it. If an engineer can’t complete the requirements without having another adult check their work, that’s not even meeting the basic minimum level of professionalism. Fire them. What remains is testing. But with manual QA we have a problem… discovery of these failure cases is entirely reliant on the exploratory abilities of the individual tester, and to some degree just luck. A great tester will find most, but that’s still very time-consuming. Ideally we want a way to catch unexpected and non-obvious downstream impacts in a deterministic way.
  19. Gates are for babies and dogs. (Show of hands: who has tests that have to pass before they can build/deploy? That’s a gate.) One time somebody broke something and it hurt, so you had a big post mortem and declared THIS SHALL NEVER HAPPEN AGAIN and you put in a gate. Check-in gates and deploy gates slow down the 99% to cover for the 1%. We got distracted by the exception because it was in our face, and then we optimized for the exception and slowed down the rule.
  20. “Moderate impact” is the lowest because if there was some functionality in your app that would have minimal or no impact if it broke, is it even doing anything useful? Why did you build it?
  21. High MTTR – how long does it take to respond to an issue and get a fix into the field? Minutes, seconds? Or hours, days? Qualitative, not quantitative – software can quantify and characterize whether the functional and non-functional requirements are met. Don’t waste humans’ time on this.
  22. Let's stop with the architects already (we fired them all). Hopefully by this point you've started to see the theme here: Freedom & responsibility. - Some people think this means chaos, or that devs get to do whatever they want. WRONG! They stopped listening after the "freedom" part. If you have freedom without responsibility, it is anarchy. But the responsibility part means that as long as you have smart people with clear responsibilities, they will usually make VERY GOOD decisions about how to execute! - When you take away artificial controls like QA gates, and replace them with smart incentives and clear objectives, people will make better decisions than you could.