A collection of workshops on
- CD pipeline architecture and design tactics for the testability quality factor
- Technical practices and tips for team up-skilling
- TDD sources and materials
3. These slides contain a broad introduction to automated testing in CD pipelines.
We’ll look at the types of xUnit tests we write, and how they help us model system specification:
- Unit testing and TDD
- Integration testing and PACT (CDC)
- Acceptance testing and ATDD
We’ll also touch on:
- Team up-skilling tips
- Architectural challenges
- Suggested materials on automated testing
- Debates around TDD are included for reference, but that is another talk in itself!
SCOPE
3
4. Neal Ford describes Continuous Delivery as the ‘authoritative,
repeatable, automated checking of production readiness’ [Ford,
2015]
TESTABILITY QUALITY FACTOR AND CD
4
5. [DZone, 2015] provides a very useful introduction, ‘CD Visualised’, which
covers the testing technologies (both automated and manual)
used within CD pipelines [pdf]
We will concentrate on xUnit tests run within commit and
automated acceptance testing stages in the pipeline.
CD OVERVIEW
5
7. [Cohn, 2010] chapter 16 introduces the test automation pyramid
– We use this metaphor to describe the various approaches to automated testing
– Cohn was drawing a contrast between Agile (test-early) and waterfall (test-late) approaches
– Many projects had become reliant upon large suites of slow, brittle UI automation tests, written after development work was complete
THE COHN TEST AUTOMATION PYRAMID
7
“One reason teams found it difficult to write tests sooner
was because they were automating at the wrong level. An
effective test automation strategy calls for automating
tests at three different levels”
8. Watirmelon – ice cream cone anti-pattern
- Various visual metaphors
- Implies the concepts in xUnit are lost
- Late defects are generally more costly
TEST ANTI-PATTERN METAPHORS
8
10. “Inspection does not improve the quality, nor guarantee quality.
Inspection is too late. The quality, good or bad, is already in the
product. As Harold F. Dodge said, ‘You can not inspect quality
into a product.’”
[Deming, 1986]
BUILD QUALITY IN
10
11. TEST SCOPE - EXAMPLE SYSTEM
Do we test the entire system, fully, end to end?
11
12. PIPELINE STEPS
• Code quality (lint)
• Build / transpile
• Unit tests
• Integration tests (sandbox)
• Deploy to E2E runtime
• Journey tests against E2E
12
14. We prefer unit testing because:
- Reliability - 100% deterministic
- Dependencies - simple components
- Scope - a single class or component
- Isolation - tests for this component do not affect others
- Mocks - use a spy to validate that actions are called
UNIT TESTS
14
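The properties above can be shown in a minimal sketch (Python here for brevity, though the deck’s own examples use NUnit/NSubstitute; `AccountService` and the notifier are hypothetical names):

```python
from unittest.mock import Mock

class AccountService:
    """Hypothetical component under test: notifies when a balance goes negative."""
    def __init__(self, notifier):
        self.notifier = notifier  # dependency is injected, so a test double can stand in

    def apply_debit(self, balance, amount):
        new_balance = balance - amount
        if new_balance < 0:
            self.notifier.send_overdraft_alert(new_balance)
        return new_balance

# The test is 100% deterministic: no network, no clock, no shared state.
notifier_spy = Mock()                 # spy records calls for later verification
service = AccountService(notifier_spy)

assert service.apply_debit(100, 150) == -50
notifier_spy.send_overdraft_alert.assert_called_once_with(-50)  # validate the action was called
```

The spy verifies the interaction without any real notification infrastructure, which is what keeps the test isolated and repeatable.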
16. PACT (CONSUMER DRIVEN CONTRACTS)
The design of the pipeline and PACT repository represents a movement towards ‘architecture for specification’. Specification is a
key architectural concern, critical to pipeline quality factors.
16
17. This is a valid tactic when faced with legacy core systems. Can the interceptor validate its stub content against
the real system? (Probably not in a repeatable manner)
SMART STUBS CAN MODEL COMBINATIONS OF FAILURE
17
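A ‘smart stub’ in this sense is an active fake that tests can put into chosen failure modes. A hypothetical sketch of the idea:

```python
class SmartStub:
    """Stub for a legacy core system that can simulate combinations of failure."""
    def __init__(self):
        self.failure_modes = set()

    def arrange_failure(self, mode):
        self.failure_modes.add(mode)  # e.g. "timeout", "malformed_body"

    def get_account(self, account_id):
        if "timeout" in self.failure_modes:
            return {"status": 504}
        if "malformed_body" in self.failure_modes:
            return {"status": 200, "body": "<not json>"}
        return {"status": 200, "body": {"id": account_id, "balance": 8}}

stub = SmartStub()
assert stub.get_account("17")["status"] == 200

stub.arrange_failure("timeout")   # tests can now exercise the unhappy path on demand
assert stub.get_account("17")["status"] == 504
```

Because the stub’s failure modes are arranged by the test itself, unusual combinations become repeatable rather than waiting for the real system to misbehave.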
18. PACT SPECIFICATION (CONSUMER DRIVEN CONTRACT)
{
  "provider": {
    "name": "Animal Service"
  },
  "consumer": {
    "name": "Zoo App"
  },
  "interactions": [
    {
      "description": "a request for an alligator",
      "provider_state": "there is an alligator named Mary",
      "request": {
        "method": "get",
        "path": "/alligators/Mary",
        "headers": { "Accept": "application/json" }
      },
      "response": {
        "status": 200,
        "headers": { "Content-Type": "application/json;charset=utf-8" },
        "body": { "name": "Mary" }
      }
    },
    {
      "description": "a request for an alligator",
      "provider_state": "there is not an alligator named Mary",
      "request": {
        "method": "get",
        "path": "/alligators/Mary",
        "headers": { "Accept": "application/json" }
      },
      "response": {
        "status": 404
      }
    }
  ]
}
18
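The same pact file serves two roles: a stub for the consumer (‘Zoo App’) and a specification for the provider (‘Animal Service’). A minimal sketch of that idea in Python — this is not the real pact-broker or pact library API, just the shape of it:

```python
# A trimmed version of the pact above, as a Python dict
pact = {
    "interactions": [
        {
            "description": "a request for an alligator",
            "request": {"method": "get", "path": "/alligators/Mary"},
            "response": {"status": 200, "body": {"name": "Mary"}},
        }
    ]
}

def find_interaction(pact, method, path):
    """Consumer side: the pact acts as a stub, replaying recorded responses."""
    for interaction in pact["interactions"]:
        req = interaction["request"]
        if req["method"] == method and req["path"] == path:
            return interaction["response"]
    return {"status": 404}  # no recorded interaction matches

def verify_provider(pact, call_provider):
    """Provider side: the pact acts as a spec, replayed against the real service."""
    for interaction in pact["interactions"]:
        actual = call_provider(interaction["request"])
        assert actual["status"] == interaction["response"]["status"], interaction["description"]

# Consumer tests run against the stub...
assert find_interaction(pact, "get", "/alligators/Mary")["status"] == 200
# ...and the same file later verifies the provider (a fake provider here, for illustration).
verify_provider(pact, lambda request: {"status": 200, "body": {"name": "Mary"}})
```

Recording the contract as plain data is what makes it a stub for A and a spec for B at the same time.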
21. Neal Ford describes Continuous Delivery as the ‘authoritative,
repeatable, automated checking of production readiness’ [Ford,
2015].
In essence, the quality factor of pipelines is that they must be able to
repeatedly and reliably demonstrate that our software is production
ready.
TESTABILITY QUALITY FACTOR AND CD
21
22. DESIGN HEURISTIC
“… for a system to meet its acceptance criteria to the satisfaction of
all parties, it must be architected, designed and built to do so - no
more and no less.”
[Rechtin, 1991]
22
23. Neal Ford argues that -
• Architects should be responsible for constructing the deployment pipeline
• It is an architectural concern to decide the number of stages for the
deployment pipeline
Continuous Delivery for Architects [Ford, 2014]
PIPELINE AS AN ARCHITECTURAL CONCERN
23
24. TESTABILITY PAYOFF
“Industry estimates indicate that between 30 and 50 percent (or in some
cases, even more) of the cost of developing well-engineered systems is
taken up by testing. If the software architect can reduce this cost, the
payoff is large.”
[Bass et al, 2013]
24
25. [IEEE 1990] defines testability as -
• The degree to which a system or component facilitates the establishment of
test criteria and the performance of tests to determine whether those criteria
have been met
• The degree to which a requirement is stated in terms that permit
establishment of test criteria and performance of tests to determine whether
those criteria have been met
TESTABILITY QUALITY FACTOR AND CD
25
28. FOWLER ON NONDETERMINISM
• In order to get tests to run reliably, we must have clear control over the system state at the
beginning of the test
• Some people are firmly against using test doubles in functional tests, believing that you must
test with a real connection in order to ensure end-to-end behaviour
• However, automated tests are useless if they are non-deterministic. Any advantage you
gain by testing against the real system is negated by non-determinism
• Often remote systems don't have test systems we can call, which means hitting a live system. If
there is a test system, it may not be stable enough to provide deterministic responses.
http://martinfowler.com/articles/nonDeterminism.html
28
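One common source of nondeterminism is time; injecting the clock gives the test clear control over system state at the start. A hypothetical sketch:

```python
class SessionChecker:
    """Hypothetical component: decides whether a session has expired."""
    def __init__(self, clock):
        self.clock = clock  # inject the clock instead of calling time.time() directly

    def is_expired(self, started_at, ttl_seconds):
        return self.clock() - started_at > ttl_seconds

# The nondeterministic version would read the real clock; the test pins it instead.
fixed_clock = lambda: 1_000  # test double: always returns the same instant
checker = SessionChecker(fixed_clock)

assert checker.is_expired(started_at=0, ttl_seconds=500) is True
assert checker.is_expired(started_at=900, ttl_seconds=500) is False
```

The same injection tactic applies to random seeds, environment state, and remote calls: the test owns every input, so the result is the same on every run.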
31. Generally, we can think of a test (a specification) as a single state machine sequence
IDEAL TEST (SPECIFICATION)
31
32. The goal is to construct systems from high-quality components.
Coarse-grained tests:
- are more complex
- have more dependencies
- are harder to understand
- are harder to write
- provide poor defect localisation
- must model more states
BUILD QUALITY IN / SOLID
32
33. “Setting and examining a program's internal state is an aspect
of testing that will figure predominantly in our tactics for
testability”
[Bass et al, 2013]
TESTABILITY TACTICS
33
34. BASS ET AL - TESTABILITY QUALITY FACTOR AND TACTICS
34
35. HOLISTIC SYSTEM DESIGN APPROACH
“The test setup for a system is itself a system” [Rechtin, 1991]
35
37. PIPELINE REPEATABILITY AND RELIABILITY TACTICS
- Test in isolation (depend on specification, not an actual system)
- Systems at the boundary should expose canonical stubs or test harnesses
- Most downtime relates to external system outages. Shared (critical) services should
utilise blue/green deploy to minimize downstream impact
- The repeatability of acceptance tests concerns nondeterminism, and the repeatability of
state scenarios at our boundary
- Apply Record and Replay stubbing approaches to make legacy connected systems
repeatable, and to make specifications amenable to source control
37
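The Record and Replay tactic can be sketched as a wrapper that captures a live response once and then serves it deterministically; because the recording is plain data, it can be committed to source control and diffed. All names here are hypothetical (real tools in this space include VCR-style libraries):

```python
import json

class RecordReplayStub:
    """Record mode captures real responses; replay mode serves them deterministically."""
    def __init__(self, real_call=None, recording=None):
        self.real_call = real_call
        self.recording = recording if recording is not None else {}

    def call(self, request_key):
        if request_key in self.recording:
            return self.recording[request_key]        # replay: repeatable, offline
        if self.real_call is None:
            raise KeyError(f"no recording for {request_key}")
        response = self.real_call(request_key)        # record: hit the legacy system once
        self.recording[request_key] = response
        return response

# Record once against the (slow, flaky) legacy system...
legacy = lambda key: {"status": 200, "body": f"live answer for {key}"}
stub = RecordReplayStub(real_call=legacy)
first = stub.call("/accounts/17")

# ...then replay from the saved recording, with no legacy dependency at all.
saved = json.loads(json.dumps(stub.recording))  # round-trip: the recording is just JSON
replayed = RecordReplayStub(recording=saved)
assert replayed.call("/accounts/17") == first
```

The JSON round-trip is the point: the captured specification is diffable text, not a live connection.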
39. PIPELINES AND LEGACY ARCHITECTURE
Legacy architecture presents several challenges -
• APIs and services expose consumers to fully-connected, end-to-end shared
environments; depending on the nature of a service's underlying state, tests were
not repeatable, and availability was low
• Integration tests written against shared fixtures were typically brittle, slow, and
exhibited low levels of verification, limiting their value to a simple litmus test of "the
service is on”
• Shared data was frequently cannibalised by external teams, causing integration
tests within pipelines to fail
39
40. PIPELINES AND LEGACY ARCHITECTURE
• Even though some service teams had paid close attention to architectural guidelines,
those guidelines did not cover testability
• There was a ‘release mismatch’ across silos, where teams had conflicting technical
practices (e.g. it isn’t possible to push into master if another team is using long-lived
branching, rather than feature switching)
• Specification approaches were ad-hoc and fragmented; no-one could say ‘let’s run the new
scenario for all 17 types of account’
• Specification systems were a by-product, not designed, so presented a barrier to the
addition of new types of product
• Some tests used NUnit, some used SoapUI, some relied on intercepting active fake
systems
40
41. • Include developers and QAs
• Pick a code kata
• Provide a free-lunch incentive!
• Break down pairing barriers
• Build deeper understanding by practice
• Cover concepts like test-first, first-gear
TDD, evolutionary design, SOLID, GitHub
Flow and Feature Switching
DOJOS
41
42. • Dojos drive technical practices, but also
break down fear barriers
• Day-to-day, spread knowledge by pairing
more experienced developers with less
experienced developers
• Developers and QAs pair to meet
acceptance criteria, pushing testing down
the pyramid where more extensive
coverage is required
• Collective code ownership
• Deming - focus on quality and
craftsmanship / drive out fear!
TECHNICAL PRACTICES
42
43. [Cohn, 2010] key takeaways:
- Focus on continuous improvement of engineering practices
- Testing should be a whole-team responsibility; it should not be
delegated to ‘experts’ in the testing field
- Use the innate skills within the team to solve quality problems
“… the tester creates automated tests and the programmer programs.
When both are done the results are integrated. Although it may be
correct to still think of there being hand-offs between the programmer
and tester, in this case, the cycle should be so short that the hand-offs
are of insignificant size.”
TEAM EMBEDDING FOR TECHNICAL PRACTICES
43
Story lifecycle (slide diagram): Story Kick Off (shared understanding; writing testable stories) → Unit Testing, Integration Testing, Work with QA to write Acceptance Tests, Drive Acceptance Testing, Exploratory Testing → Story Handover (minimizing bugs; Product Owner signs off)
“Avoid working in a micro-waterfall approach, with distinct analysis,
design, coding and testing phases within a sprint.”
“The hand-offs between programmers and testers (if they exist at all) will
be so small as not to be noticeable.”
“There should be as much test activity on the first day of a sprint as on
the last day.”
“Testers may be specifying test cases and preparing test data on the first
day and then executing automated tests on the last, but are equally busy
throughout.”
44. INVEST
Independent - self contained
Negotiable - can be changed until in play
Valuable - value for the end user
Estimatable* - well enough defined to be estimated
Small - easy to plan / prioritise
Testable - story must provide test criteria
*Not sure this is really a word, they just made it up
STORY WRITING FOR SPECIFICATION
44
45. Feature: Quick Balance
As a Customer
I would like to view my Quick Balance without having to login
So that I can check my available balance more quickly
STORY WRITING FOR SPECIFICATION
45
46. In order to construct an acceptance test, we need to define the criteria
Acceptance Criteria:
Given I am on the Home Screen and not logged in
And I have a valid Account
When I swipe left
Then I can view my Quick Balance
Given I am on the Home Screen and not logged in
And I have cancelled my Account
When I swipe left
Then I see an Error Message “Quick Balance not available.”
STORY WRITING FOR SPECIFICATION
46
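Each Given/When/Then criterion maps directly onto the arrange/act/assert shape of an automated test. A hypothetical sketch in Python (the deck’s real tests use NUnit):

```python
class QuickBalanceScreen:
    """Hypothetical app facade for the Quick Balance feature."""
    def __init__(self, account_status, available_balance):
        self.account_status = account_status
        self.available_balance = available_balance

    def swipe_left(self):
        if self.account_status == "valid":
            return {"quick_balance": self.available_balance}
        return {"error": "Quick Balance not available."}

# Given I am on the Home Screen and I have a valid Account (arrange)
screen = QuickBalanceScreen(account_status="valid", available_balance=4.00)
# When I swipe left (act)
result = screen.swipe_left()
# Then I can view my Quick Balance (assert)
assert result == {"quick_balance": 4.00}

# Given I have cancelled my Account, Then I see the error message
cancelled = QuickBalanceScreen(account_status="cancelled", available_balance=0)
assert cancelled.swipe_left() == {"error": "Quick Balance not available."}
```

Writing the criteria in this form first is what makes them executable later: each clause becomes a line of the test.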
47. LET’S CODE THAT FEATURE IN AN ATDD CYCLE
The cycle of growing software, guided by tests [Freeman-Pryce 2009; 40]. (I’ve added a Git
commit path☺)
47
48. ACCEPTANCE TEST (API TEST)
We add a failing acceptance test, to call the API and validate the response
[SetUp]
public void Setup()
{
    client = new HttpClient();
    client.DefaultRequestHeaders.Add("userToken", UserFixture.Token);
}

[Test(), Description("Given a logged in user, Quick Balance returns AvailableBalance.")]
public void GivenUserLoggedIn_QuickBalance_ReturnsAvailableBalance()
{
    var response = client.PostAsync(UrlHelper.QuickBalance, null).Result;
    Assert.IsNotNull(response);
    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
    var jsonResult = response.Content.ReadAsStringAsync().Result;
    Assert.IsTrue(jsonResult.Contains("AvailableBalance"));
}
48
49. UNIT TEST
[SetUp]
public void Setup()
{
    acctRepo = Substitute.For<IAccountRepository>(); // unit tests stub dependencies
    identity = Substitute.For<IIdentity>();
}

[Test(), Description("Balance returns BadRequest for invalid loginId")]
[TestCase("")]
[TestCase(null)]
public void GetBalance_InvalidLoginId_ReturnsBadResponse(string invalidLoginId)
{
    // ARRANGE
    identity.LoginId.Returns(invalidLoginId);
    // Repeated constructor is a DRY fail? Test is doing work of a container?
    var subjectUnderTest = new QuickBalanceController(acctRepo, identity);
    // ACT
    var response = subjectUnderTest.GetBalance();
    // ASSERT
    Assert.AreEqual(HttpStatusCode.BadRequest, response.StatusCode);
}
49
52. Advanced Unit Testing - Mark Seemann [Seemann, 2013]
- Test readability and “DRY vs. DAMP”
- Red, Green, Refactor and trusting tests
- Simple coding guidelines for test readability
- SUT management and test fixture management patterns
Automated Testing: End to End – Jason Roberts [Roberts, 2013]
Basic economics of testing and the test pyramid
Unit testing (Module 2)
Integration testing (Module 3)
TeamCity pipeline design (Module 5)
TESTING MATERIALS
52
53. xUnit Test Patterns - Refactoring Test Code
Gerard Meszaros, Addison-Wesley, 2007 [xUnit Test Patterns]
Amazon reviews -
- The book is about the patterns used in the design of
software systems… it’s the book all architects and technical
leads should read!
- This book is to xUnit what the ‘Gang of Four’ book is to
object-oriented design
- Its content is available online http://xunitpatterns.com
- As its subtitle suggests – Refactoring Test Code –the book
details simple goals and patterns for incremental
improvements at code level
- BUT it also discusses wider scale architectural anti-
patterns, e.g. testing against shared databases.
TESTING MATERIALS
53
55. Growing Object-Oriented Software, Guided by
Tests
Steve Freeman and Nat Pryce, Addison-Wesley,
2009 [Freeman-Pryce 2009]
- Good reference on all aspects of testing
- Explains mocks and test in isolation
- ATDD
- End-to-end testing of event driven systems
TESTING MATERIALS
55
57. Use visible metrics to drive continuous
improvement of technical practices…
… but, over-reliance on metrics is an anti-
pattern, e.g. James Coplien - Why Most Unit
Testing is Waste
We should be confident that all acceptance
criteria have been met, and the risk of system
failure is low.
An interesting metric is cycle time, as well as
the number of defects found and fixed in dev,
test, staging and production environments.
TEST METRICS ARE USEFUL… BUT DANGEROUS
57
58. UNCLE BOB
Another well-quoted adage of test-driven development is the three rules of TDD –
1. You must write a failing test before you write any production code
2. You must not write more of a test than is sufficient to fail, or fail to
compile
3. You must not write more production code than is sufficient to make the
currently failing test pass
This is often interpreted as ‘first-gear’ TDD, with 100% test coverage. You would write a failing
test before you write any code that introduces new specifications or modifies existing
behaviour. You can use first-gear TDD if you want to… but it should be optional.
58
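A first-gear cycle under the three rules looks like this: a tiny failing test, then just enough production code to pass, then the next test. A hypothetical kata-style sketch:

```python
# Rules 1 and 2: write just enough of a test to fail...
def test_empty_string_scores_zero():
    assert score("") == 0

# Rule 3: ...then just enough production code to make it pass.
def score(rolls):
    return sum(int(r) for r in rolls)

# The next red/green increment: a new failing test drives the next behaviour.
def test_rolls_are_summed():
    assert score("53") == 8

test_empty_string_scores_zero()
test_rolls_are_summed()
```

Each increment is deliberately tiny; whether you stay in first gear or take bigger steps is, as the slide says, optional.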
59. Ian Cooper - Where has TDD Gone Wrong? http://vimeo.com/68375232 [Cooper, 2013]
■ Discussion of Kent Beck, and how TDD has been misinterpreted by the community
■ Key issue is the over-coupling between implementation and tests (accidental over-specification)
■ Discussion of testing anti-patterns (‘ice cream cone’) and test pyramid (see 35.00)
■ Hexagonal architecture and unit testing at port boundaries, to produce tests that verify external
component specification (see 42.20)
■ Gears – suggestion that we may write finer tests to incrementally grow a complex algorithm and
then perhaps throw them away, as they are expensive to maintain [45.30]
■ We should retain only external specification tests [45.00]
■ We must focus on writing tests against behaviours (acceptance criteria), rather than method
level testing, i.e. test at correct level of granularity and focus on component specification
(engineering practice) [48.00]
■ A weakness of ATDD is that it relies on an engaged business stakeholder [51.00]
FIRST GEAR TDD AND ATDD HAVE THEIR DETRACTORS
59
60. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
Do we need to re-interpret Kent Beck?
- The original book on test-first development
- Focus on xUnit tools
- Code examples, and simple patterns and
techniques for test writing and refactoring
Test-Driven Development By Example
Kent Beck, Addison-Wesley, 2002
60
61. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
How large should your test steps be? Beck says…
- You could write tests to encourage a single line of code, or
- You could write tests to underpin hundreds of lines of code
Although test-driven approaches imply small increments, we
should be able to do either.
61
62. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
What don’t you have to test?
“Write tests until fear is transformed into boredom”
So, address risk.
62
63. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
What do you have to test?
- Conditionals
- Loops
- Operations
- Polymorphism
I think you should cover all variants of a concept - it’s up to you whether you nail down
semantics to a microscopic level.
63
64. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
Can you drive development with application-level tests?
Beck says that the risk of small-scale testing (“unit testing”) is that
application behaviour may not be what users expect.
Application level testing has some advantages -
- Tests can be written by users
- Can be mixed with “programmer-level TDD”
64
65. TEST-DRIVEN DEVELOPMENT BY EXAMPLE (BECK)
When should you delete tests?
If a test is redundant, it can be deleted, but…
- Confidence - never delete a test if it reduces your confidence
in the system
- Communication - two tests exercising the same code path,
but covering different scenarios should remain
65
67. REFERENCES
[Bass et al, 2013] Software Architecture in Practice (3rd Edition) (SEI Series in Software Engineering), Len
Bass, Paul Clements, Rick Kazman, Addison-Wesley, 2013.
[Cohn, 2010] Succeeding With Agile, Software Development Using Scrum, Mike Cohn, Addison-Wesley,
2010.
[DZone, 2015] Continuous Delivery, Visualised <https://saucelabs.com/resources/white-papers/dzone-continuous-deliver-guide.pdf>
[Meszaros, 2007] xUnit Test Patterns - Refactoring Test Code, Gerard Meszaros, Addison-Wesley, 2007.
[Ford, 2014] Continuous Delivery for Architects, Neal Ford
[Ford, 2015] Engineering Practices for Continuous Delivery, Neal Ford
[IEEE 1990] IEEE Computer Society. IEEE Standard Computer Dictionary: A Compilation of IEEE Standard
Computer Glossaries, 610. New York, N.Y.: Institute of Electrical and Electronics Engineers, 1990.
[Rechtin, 1991] Systems Architecting, Creating and Building Complex Systems, Eberhardt Rechtin, Prentice
Hall, 1991.
[Uncle Bob, 2014] Bob Martin – The Cycles of TDD <http://blog.cleancoder.com/uncle-bob/2014/12/17/TheCyclesOfTDD.html>
67
Won’t look at BDD or WebDriver, in-production testing, sorry
[Ford, 2015] Engineering Practices for Continuous Delivery, Neal Ford [pdf]
http://nealford.com/downloads/Continuous_Delivery_1of3_Deployment_Pipelines_Neal_Ford.pdf
Won’t look at BDD or WebDriver, or production metrics, sorry
This could be a unit test … or a system test
Setup - how do we get to state A?
We prefer PACT consumer-driven approaches to conventional integration testing. This emphasises early fault detection, as opposed to late, brittle or defensive approaches.
Run system A against system B
Pact is recorded - it becomes a stub for A and a spec for B
It’s not the only way - you could share tests, for example.
Active stubs can simulate unusual combinations of failure
but late testing
compare to PACT (a local stub + specification)
what checks the stub? Integration Contract Tests?
Prior to PACT, one approach would be to have a system generate consumer stubs based upon tests driven by shared specification, and push this into source control (where they can be diffed)
We don’t go into Provider States (“go to state A”)
Journey tests are more brittle
Value in checking E2E is all available
May use stubs
View tests as production code
Ford argues CD pipeline design is an architectural concern
Q. How many architects focus on pipelines, or is it left to devs / test managers?
first point is about system architecture
second point is about canonical specification
If we weigh repeatability and reliability on our system context, how does it fare? (NB. numbers are examples, not actual)
This could be a unit test … or a system test
Setup - how do we get to state A?
How do we check state B?
Do we test the entire system as a whole? The finer the granularity of state and dependencies, the simpler the tests
Decomposition and spec traceability are the challenges here
Bass - Software Architecture in Practice quality factors and tactics
Additional - Simian Army
Canonical specification
View tests as production code
Engineering - MTBF
Can supply a PACT to say ‘this is 8’
Can’t supply a non-existent PACT
Shared persistent fixture
Metrics can drive over-testing
On the job training / Deming’s 14 points
I - Independent: the user story should be self-contained, with no inherent dependency on another user story.
N - Negotiable: user stories, up until they are part of an iteration, can always be changed and rewritten.
V - Valuable: a user story must deliver value to the end user.
E - Estimable: you must always be able to estimate the size of a user story.
S - Small: user stories should not be so big as to become impossible to plan/task/prioritise with a certain level of certainty.
T - Testable: the user story or its related description must provide the necessary information to make test development possible.
These look a lot like sequential state machine assertions
We decompose acceptance criteria down onto unit tests
1. Issues with this validation - this type of loose assertion is indicative of some nondeterminism, e.g. due to the use of a shared persistent fixture.
2. It’s better if we could repeatedly, reliably assert “AvailableBalance exists and its value is $4.00”
Let’s add a failing unit test… first we might define our boundary conditions, to underpin some obvious runtime assertions.
Fence in the problem! Be strict up front, test state should be an example of “canonical production state”
If we add checks last, there’s a chance that many tests have already strayed into invalid states, meaning they will all break if we add those stipulations later.
This means we must grow our fixtures to create valid state for us.
Recommended for .NET
Don’t tie it in to KPIs!?!
Highlights of the talk - it’s well worth checking out!