Automatic testing in DevOps

Automated Testing in DevOps
Benoit Baudry
baudry@kth.se

DevOps
•High degrees of automation in software
development, deployment and operations
•Objectives
•Better quality
•Shorter release cycles
•Continuous feedback from Ops to Dev
Automatic Testing in DevOps - Lorentz Workshop 2

DevOps

DevOps – automatic development
[Figure: DevOps loop labeled with automation techniques: unit, perf., fuzzing, logging, dep. inj., UI, CI, perturbation, fault recov., IDEs, libraries, container, IDS, VMs, cluster, config.]

Automatic testing enables DevOps
Continuous Delivery. Jez Humble and David Farley. 2010.

DevOps – continuous testing
[Figure: DevOps loop labeled with testing techniques: unit, perf., fuzzing, logging, UI, perturbation, fault recov., IDS, cluster, config., static]

Automatic testing and Continuous Integration

Continuous integration
•‘only’ Dev automation

•Many software
companies do not
operate their product
•IDE, test runners and CI
are available for research

- Unit, integration testing
- Coverage, mutation
- Test generation and repair
Automatic test improvement
- Test refactoring
- Test amplification
- Test fixing
IDE
- Linters
- Completion

Continuous feedback
Continuous Delivery. Jez Humble and David Farley. 2010.

The pull request loop
[Figure, built up over several slides: collaboration platform, pull req., code, analyses, feedback]

The pull request loop for research
[Figure, built up over several slides: the pull request loop (collaboration platform, pull req., code, analyses, feedback), annotated first with "Empirical analyses" and then with "Novel analyses"]

Revisit test prioritization in the CI
•Test prioritization orders tests to detect failures faster
•Non-CI approaches reorder test cases according to a change
•Proposal: in the CI, reorder the commits to be executed
Redefining Prioritization: Continuous Prioritization for Continuous Integration. J. Liang, S. Elbaum, G. Rothermel. ICSE 2018.
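
As a rough illustration of what "ordering tests to detect failures faster" can mean, the sketch below runs the tests that failed most often in recent builds first. It is a generic history-based heuristic, not the commit-level approach proposed by Liang et al.; the class name, the method name and the failure counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical history-based prioritizer: tests with more recent failures run first.
class TestPrioritizer {

    // recentFailures maps a test name to its number of failures in recent builds (assumed data).
    static List<String> prioritize(Map<String, Integer> recentFailures) {
        List<String> order = new ArrayList<>(recentFailures.keySet());
        order.sort((a, b) -> {
            // Most failures first; ties are broken by name so the order is deterministic.
            int byFailures = Integer.compare(recentFailures.get(b), recentFailures.get(a));
            return byFailures != 0 ? byFailures : a.compareTo(b);
        });
        return order;
    }

    public static void main(String[] args) {
        Map<String, Integer> history = Map.of("ParserTest", 0, "CacheTest", 5, "HttpTest", 2);
        System.out.println(prioritize(history)); // [CacheTest, HttpTest, ParserTest]
    }
}
```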

Analysis of code coverage evolution
•“Statement coverage … reduces the quality
measure to a single ratio, making
developers potentially miss valuable
information about their test suite and its
limitations”
•7,816 revisions of 47 projects
•http://www.code-coverage.org
A Large-Scale Study of Test Coverage Evolution. M. Hilton, J. Bell, D. Marinov. ASE 2018.

DeFlaker
•Flaky tests are tests whose verdict changes even if the code does not change
•DeFlaker is a new bot that automates the detection of flaky tests
•http://www.deflaker.org
DeFlaker: Automatically Detecting Flaky Tests. J. Bell, O. Legunsen, M. Hilton, L. Eloussi, T. Yung, and D. Marinov. ICSE 2018.
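
The definition of a flaky test can be made concrete with a small sketch: run the same test several times on unchanged code and flag it when the verdicts disagree. This is the plain rerun baseline, shown only to illustrate the definition; it is not DeFlaker's own technique, and all names below are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Rerun-based flakiness check: a test is reported as flaky if its verdict
// changes across repeated runs on the same code.
class FlakyDetector {

    static boolean isFlaky(BooleanSupplier test, int runs) {
        boolean first = test.getAsBoolean();
        for (int i = 1; i < runs; i++) {
            if (test.getAsBoolean() != first) {
                return true; // verdict changed although the code did not
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        BooleanSupplier stable = () -> 1 + 1 == 2;
        // Deterministic stand-in for a nondeterministic test: fails on every third run.
        BooleanSupplier unstable = () -> calls.incrementAndGet() % 3 != 0;
        System.out.println(isFlaky(stable, 10));   // false
        System.out.println(isFlaky(unstable, 10)); // true
    }
}
```

A test that fails rarely can slip through a small number of reruns, which is one reason rerunning alone is costly as a detection strategy.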

Repairnator
•Automatic repair bot to target build failures
•Runs since February 2017
•ICSE SEIP paper reports on
•11 523 test failures over 1 609 open-source software projects hosted on GitHub
•generated patches for 15 different bugs
How to Design a Program Repair Bot? Insights from the Repairnator Project. S. Urli, Z. Yu, L. Seinturier, M. Monperrus. ICSE SEIP 2018.

DSpot
•Amplify existing unit test cases
•Start from developers’ test cases
•Automatically generate variants
•Submit pull requests
Automatic Test Improvement with DSpot: a Study with Ten Mature Open-Source Projects. B. Danglot, O. Luis Vera-Pérez, B. Baudry, M. Monperrus. Submitted to EMSE.

DSpot
@Test
public void html() {
    Attribute attr = new Attribute("key", "value &");
    // html() escapes the attribute value, so '&' is rendered as &amp;
    assertEquals("key=\"value &amp;\"", attr.html());
    assertEquals(attr.html(), attr.toString());
}

DSpot
@Test
public void html() {
    Attribute attr = new Attribute("key", "value &");
    assertEquals("key=\"value &amp;\"", attr.html());
    assertEquals(attr.html(), attr.toString());
}
@Test
public void html_add33() throws Exception {
    Attribute attr = new Attribute("key", "value &");
    Assert.assertEquals("key=\"value &amp;\"", attr.html());
    Assert.assertEquals("key=\"value &amp;\"", attr.toString());
    Assert.assertEquals("key", attr.getKey());
    Assert.assertEquals("value &", attr.getValue());
}

Mutation at Google
•Mutation testing assesses the validity of test
cases
•Inject bugs in code
•Check that test cases detect the bugs
•Traditional mutation does not scale
•Google has developed a mutation approach
for their PR-loop
State of Mutation Testing at Google. G. Petrovic, M. Ivankovic. ICSE SEIP 2018.
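
The two steps of mutation testing (inject a bug, check that a test detects it) can be sketched in a few lines. The example is a hand-written toy with a single mutant and does not reflect Google's tooling; every name in it is hypothetical.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

class MutationDemo {
    // Original code under test.
    static final IntBinaryOperator ORIGINAL = (a, b) -> Math.max(a, b);
    // Mutant: an injected bug, max replaced by min.
    static final IntBinaryOperator MUTANT = (a, b) -> Math.min(a, b);

    // Weak test: equal inputs cannot distinguish max from min.
    static boolean weakTest(IntBinaryOperator max) {
        return max.applyAsInt(3, 3) == 3;
    }

    // Stronger test: distinct inputs expose the mutant.
    static boolean strongTest(IntBinaryOperator max) {
        return max.applyAsInt(2, 7) == 7;
    }

    // A test kills the mutant when it passes on the original and fails on the mutant.
    static boolean kills(Predicate<IntBinaryOperator> test) {
        return test.test(ORIGINAL) && !test.test(MUTANT);
    }

    public static void main(String[] args) {
        System.out.println(kills(MutationDemo::weakTest));   // false: the mutant survives
        System.out.println(kills(MutationDemo::strongTest)); // true: the mutant is killed
    }
}
```

A surviving mutant points at a weakness in the test suite, which is the feedback a developer would receive in the PR loop.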

Sapienz
•Automatic test generation and repair
•Black-box, system level
•Deployed in the Facebook PR-loop
•All tests, bugs and patches are reviewed by
developers
Deploying Search Based Software Engineering with Sapienz at Facebook. N. Alshahwan, X. Gao, M. Harman, Y. Jia, K. Mao, A. Mols, T. Tei, and I. Zorin. SSBSE 2018.
https://code.fb.com/developer-tools/finding-and-fixing-software-bugs-automatically-with-sapfix-and-sapienz/

Sapienz
•In production since September 2017
•75% of bugs reported have been fixed
•SapFix deployed in August 2018
•Some automatic patches accepted

Automatic testing in Ops

Testing in Ops
•Test in production
•Requires production
•Industry-driven
•Test in the feedback
•Little research
Crash analysis
- Reproduction
- Localization
Online experiments
- Chaos
- A/B testing

Online experiments
•Software testing is all about experiments
•Express expected behavior (oracle)
•Run system to check actual vs. expected (test case)
•On the Ops side, one can perform
experiments with the real system
•Express hypothesis
•Run experiments to check validity

A/B testing
https://medium.com/netflix-techblog/a-b-testing-and-beyond-improving-the-netflix-streaming-experience-with-experimentation-and-data-5b0ae9295bdf

A/B testing
•Modular features, feature toggles
•Graphical elements
•Game levels
•Sample and track a population
of users
•Precise metrics about the value of a
feature
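
One common way to "sample and track a population of users" behind a feature toggle is deterministic bucketing: hash the user into one of 100 buckets and enable the feature for a fixed share of them. The sketch below assumes this scheme; the class, the method and the experiment name are hypothetical, and `String.hashCode` stands in for the stronger hash a production system would likely use.

```java
// Deterministic A/B assignment: the same user always sees the same variant
// of a given experiment, so metrics can be tracked per variant over time.
class AbAssignment {

    // treatmentPercent is the share of users (0 to 100) who get variant "B".
    static String variant(String experiment, String userId, int treatmentPercent) {
        // floorMod keeps the bucket in [0, 100) even when hashCode is negative.
        int bucket = Math.floorMod((experiment + ":" + userId).hashCode(), 100);
        return bucket < treatmentPercent ? "B" : "A";
    }

    public static void main(String[] args) {
        int inB = 0;
        for (int i = 0; i < 1000; i++) {
            if (variant("new-player-ui", "user-" + i, 10).equals("B")) {
                inB++;
            }
        }
        System.out.println("users in variant B: " + inB + " of 1000");
    }
}
```

Including the experiment name in the hashed key means that a user's bucket in one experiment does not determine their bucket in another.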

Chaos engineering
•Breaking things on purpose in order to build
more resilient systems!
https://principlesofchaos.org/

Principles of chaos engineering
•Build a Hypothesis around Steady State
Behavior
•Vary Real-world Events
•Run Experiments in Production
•Automate Experiments to Run Continuously
•Minimize Blast Radius
https://principlesofchaos.org/

Netflix’s Simian Army
•Induce failure regularly
• ‘Break’ production code to check the system’s ability to react
• Chaos Monkey: randomly terminates an instance in production
• Chaos Kong: takes an entire region offline
• Latency Monkey: adds artificial delay in RESTful clients
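
The shape of such an experiment (state a steady-state hypothesis, perturb, check the hypothesis again) can be sketched on an in-memory toy. Nothing here calls Netflix's tools or any cloud API; the cluster model, the health criterion and all names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ChaosExperiment {

    // Toy service: a pool of instances, considered healthy while at least one is up.
    static class Cluster {
        final List<String> instances = new ArrayList<>();

        Cluster(int size) {
            for (int i = 0; i < size; i++) {
                instances.add("instance-" + i);
            }
        }

        boolean steadyState() {
            return !instances.isEmpty();
        }
    }

    // Chaos-monkey-style perturbation: terminate one randomly chosen instance.
    static String terminateRandomInstance(Cluster cluster, Random random) {
        return cluster.instances.remove(random.nextInt(cluster.instances.size()));
    }

    // Returns true if the steady-state hypothesis still holds after the perturbation.
    static boolean run(Cluster cluster, Random random) {
        if (!cluster.steadyState()) {
            throw new IllegalStateException("system is not in steady state before the experiment");
        }
        String victim = terminateRandomInstance(cluster, random);
        System.out.println("terminated " + victim);
        return cluster.steadyState();
    }

    public static void main(String[] args) {
        System.out.println(run(new Cluster(3), new Random())); // true: two instances remain
        System.out.println(run(new Cluster(1), new Random())); // false: no redundancy
    }
}
```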

Chaos engineering
•Growing adoption: SDN, CDN, JVM
•Resilience of large distributed systems
•Online perturbation
•Monitor and report
•Loosely coupled architectures
•Macro-level health metric

Crash reproduction
java.util.NoSuchElementException
at org.xwiki.rendering.listener.chaining.EmptyBlockChainingListener.stopContainerBlock(EmptyBlockChainingListener.java:458)
at org.xwiki.rendering.listener.chaining.EmptyBlockChainingListener.endFormat(EmptyBlockChainingListener.java:263)
at org.xwiki.rendering.listener.chaining.AbstractChainingListener.endFormat(AbstractChainingListener.java:290)
at org.xwiki.rendering.listener.chaining.BlockStateChainingListener.endFormat(BlockStateChainingListener.java:439)
at org.xwiki.rendering.listener.chaining.AbstractChainingListener.endFormat(AbstractChainingListener.java:290)
at org.xwiki.rendering.listener.CompositeListener.endFormat(CompositeListener.java:253)
at org.xwiki.rendering.internal.parser.wikimodel.DefaultXWikiGeneratorListener.flushFormat(DefaultXWikiGeneratorListener.java:325)
at org.xwiki.rendering.internal.parser.wikimodel.DefaultXWikiGeneratorListener.onWord(DefaultXWikiGeneratorListener.java:906)
at org.xwiki.rendering.wikimodel.impl.InternalWikiScannerContext.onWord(InternalWikiScannerContext.java:1147)
at org.xwiki.rendering.wikimodel.impl.WikiScannerContext.onWord(WikiScannerContext.java:597)
at org.xwiki.rendering.wikimodel.xhtml.impl.TagStack.flushStack(TagStack.java:204)
at org.xwiki.rendering.wikimodel.xhtml.impl.TagStack.onCharacters(TagStack.java:227)
at org.xwiki.rendering.wikimodel.xhtml.impl.XhtmlHandler.characters(XhtmlHandler.java:180)
at org.xml.sax.helpers.XMLFilterImpl.characters(XMLFilterImpl.java:588)
at org.xwiki.rendering.wikimodel.xhtml.filter.XHTMLWhitespaceXMLFilter.sendCharacters(XHTMLWhitespaceXMLFilter.java:487)
at org.xwiki.rendering.wikimodel.xhtml.filter.XHTMLWhitespaceXMLFilter.sendCharacters(XHTMLWhitespaceXMLFilter.java:480)
at org.xwiki.rendering.wikimodel.xhtml.filter.XHTMLWhitespaceXMLFilter.flushContent(XHTMLWhitespaceXMLFilter.java:357)
at org.xwiki.rendering.wikimodel.xhtml.filter.XHTMLWhitespaceXMLFilter.flushContent(XHTMLWhitespaceXMLFilter.java:335)
at org.xwiki.rendering.wikimodel.xhtml.filter.XHTMLWhitespaceXMLFilter.endElement(XHTMLWhitespaceXMLFilter.java:200)
at org.xml.sax.helpers.XMLFilterImpl.endElement(XMLFilterImpl.java:570)
at org.xwiki.rendering.wikimodel.xhtml.filter.AccumulationXMLFilter.endElement(AccumulationXMLFilter.java:86)
at org.xml.sax.helpers.XMLFilterImpl.endElement(XMLFilterImpl.java:570)
at org.xwiki.rendering.wikimodel.xhtml.filter.DTDXMLFilter.endElement(DTDXMLFilter.java:86)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11NonValidatingConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11NonValidatingConfiguration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.xml.sax.helpers.XMLFilterImpl.parse(XMLFilterImpl.java:357)
at org.xwiki.rendering.wikimodel.xhtml.filter.DefaultXMLFilter.parse(DefaultXMLFilter.java:58)
at org.xwiki.rendering.wikimodel.xhtml.XhtmlParser.parse(XhtmlParser.java:132)

Crash-reproducing Test Case
public void test0() throws Throwable {
    …
    SolrEntityReferenceResolver solrEntityReferenceResolver0 = new …();
    EntityReferenceResolver entityReferenceResolver0 = … mock(…);
    solrDocument0.put("wiki", (Object) entityType0);
    Injector.inject(solrEntityReferenceResolver0, …);
    Injector.validateBean(solrEntityReferenceResolver0, …);
    …
    // Undeclared exception!
    solrEntityReferenceResolver0.resolve(solrDocument0, entityType0, objectArray0);
}
java.lang.ClassCastException: […]
at org…..SolrEntityReferenceResolver.getWikiReference(....java:93)
at org…..SolrEntityReferenceResolver.getEntityReference(….java:70)
at org…..SolrEntityReferenceResolver.resolve(….java:63)

Crash-Guided Genetic Algorithm
• EvoCrash
• Implemented on top of EvoSuite
• Requires
• Stack trace
• Binaries
• .jar files
• Time budget
• Set by the user
[Figure: genetic algorithm loop. Initialize population; evaluate fitness; build the next generation (selection, crossover, mutation, reinsertion); stop when fitness == 0 or the budget is exhausted]
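
The loop in the figure can be illustrated with a toy genetic algorithm. This is not EvoCrash: the fitness function below is a simple distance to a fixed integer vector rather than a distance to a target stack trace, the budget is counted in generations rather than time, and all names and parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class ToyGeneticAlgorithm {
    static final int[] TARGET = {3, 1, 4, 1, 5, 9, 2, 6};

    // Distance to the target; lower is better and 0 means the goal is reached.
    static int fitness(int[] candidate) {
        int distance = 0;
        for (int i = 0; i < TARGET.length; i++) {
            distance += Math.abs(candidate[i] - TARGET[i]);
        }
        return distance;
    }

    static int[] randomIndividual(Random rnd) {
        int[] genes = new int[TARGET.length];
        for (int i = 0; i < genes.length; i++) {
            genes[i] = rnd.nextInt(10);
        }
        return genes;
    }

    // Selection: binary tournament, the fitter of two random individuals wins.
    static int[] select(List<int[]> population, Random rnd) {
        int[] a = population.get(rnd.nextInt(population.size()));
        int[] b = population.get(rnd.nextInt(population.size()));
        return fitness(a) <= fitness(b) ? a : b;
    }

    // Single-point crossover, then mutation of one randomly chosen gene.
    static int[] offspring(int[] p1, int[] p2, Random rnd) {
        int cut = rnd.nextInt(TARGET.length);
        int[] child = new int[TARGET.length];
        for (int i = 0; i < child.length; i++) {
            child[i] = i < cut ? p1[i] : p2[i];
        }
        child[rnd.nextInt(child.length)] = rnd.nextInt(10);
        return child;
    }

    // Returns the best individual found; stops at fitness == 0 or when the budget is exhausted.
    static int[] evolve(int populationSize, int maxGenerations, long seed) {
        Random rnd = new Random(seed);
        List<int[]> population = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            population.add(randomIndividual(rnd));
        }
        population.sort(Comparator.comparingInt(ToyGeneticAlgorithm::fitness));
        for (int gen = 0; gen < maxGenerations && fitness(population.get(0)) > 0; gen++) {
            List<int[]> next = new ArrayList<>();
            next.add(population.get(0)); // reinsertion: the best individual always survives
            while (next.size() < populationSize) {
                next.add(offspring(select(population, rnd), select(population, rnd), rnd));
            }
            next.sort(Comparator.comparingInt(ToyGeneticAlgorithm::fitness));
            population = next;
        }
        return population.get(0);
    }

    public static void main(String[] args) {
        int[] best = evolve(50, 200, 42L);
        System.out.println("best fitness: " + fitness(best));
    }
}
```

Because the best individual is always reinserted, the best fitness never gets worse from one generation to the next.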

Challenges

@Test
public void testDecodeQueryString() {
    Map<String, String> p = newHashMap();
    String uri = "something?test=value";
    RestUtils.decodeQueryString(uri, uri.indexOf('?') + 1, p);
    assertThat(p.size(), equalTo(1));
    assertThat(p.get("test"), equalTo("value"));
}
Flaky tests
Does Refactoring of Test Smells Induce Fixing Flaky Tests? Fabio Palomba and Andy Zaidman. ICSME 2017.

Complex builds

Multi module projects
•Many projects are composed of multiple
modules
[Bar chart, axes unlabeled]
[1000, 50000): 38
[50000, 250000): 38
[250000, 1000000): 10
>= 1000000: 2

[Histogram, axes unlabeled: bins of width 10 from [0,10) to [420,430); bar heights in descending order are 29, 24, 10, 4, 4, 3, 3, 2, 2 and seven bars of 1]
Min 2
Mean 35
Median 16
Max 423

[Histogram, axes unlabeled]
0: 867
1: 669
[2,10): 1126
[10,20): 328
[20,30): 41
[30,40): 25
[40,50): 19
[50,60): 3
[60,70): 1
Min 0
Mean 4
Median 2
Max 67

https://github.com/hcoles/pitest

https://github.com/spring-cloud/spring-cloud-stream

Ecosystems of dependencies
•Only 1% of Maven Central
•31877 artefacts
•57227 dependencies
Collected and visualized by Amine Benelallam and Cesar Soto

Get Ops conditions
•Record production traffic
•Replay traffic
•Shadow traffic
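
A minimal sketch of the record and replay idea, assuming requests and responses are plain strings: a wrapper records every request seen by the production handler, and the recorded requests are later replayed against a candidate version to find behavioral differences. All names are hypothetical, and real traffic capture would happen at the network or proxy level rather than in process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TrafficReplay {
    private final List<String> recorded = new ArrayList<>();

    // Wraps the production handler so that every incoming request is recorded.
    Function<String, String> recording(Function<String, String> handler) {
        return request -> {
            recorded.add(request);
            return handler.apply(request);
        };
    }

    // Replays the recorded requests against both versions and returns
    // the requests for which the responses differ.
    List<String> replayAndDiff(Function<String, String> reference,
                               Function<String, String> candidate) {
        List<String> differences = new ArrayList<>();
        for (String request : recorded) {
            if (!reference.apply(request).equals(candidate.apply(request))) {
                differences.add(request);
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        Function<String, String> v1 = s -> s.toUpperCase();
        Function<String, String> v2 = s -> s.trim().toUpperCase(); // candidate version

        TrafficReplay replay = new TrafficReplay();
        Function<String, String> production = replay.recording(v1);
        production.apply("a");
        production.apply(" b");

        System.out.println(replay.replayAndDiff(v1, v2)); // [ b]
    }
}
```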

Conclusion

Crash analysis
- Reproduction
- Localization
Online experiments
- Chaos
- A/B testing
- Unit, integration testing
- Coverage, mutation
- Test generation and repair
Test improvement
- Test refactoring
- Test amplification
- Test fixing
IDE
- Linters
- Completion

Automatic testing in DevOps
•Vibrant research and development topic
•The PR loop is an opportunity for research
•Incremental analyses and testing
•Developer in the loop
•Challenges
•Going into Ops
•Build complexity

Acknowledgements
•Oscar Vera-Pérez, Benjamin Danglot, César Soto, Amine Benelallam, Nicolas Harrand, Martin Monperrus
•https://stamp.ow2.org
