This is the slide deck for the technical briefing given by Kevin Moran, Mario Linares-Vásquez, and Denys Poshyvanyk at ICSE'17. The presentation covers: 1) the basics of GUI testing and its importance to mobile testing; 2) the current state of the art and practice for testing mobile applications; 3) current challenges facing researchers and practitioners aiming to design and build automated testing approaches; and 4) a new direction for automated mobile testing research, issued as guidance and a challenge for the SE community.
ICSE'17 Tech Briefing - Automated GUI Testing of Android Apps: From Research to Practice
1. AUTOMATED GUI TESTING
OF ANDROID APPS:
FROM RESEARCH TO
PRACTICE
Kevin Moran
Mario Linares-Vásquez
Denys Poshyvanyk
2. Mario Linares-Vásquez
Assistant Professor
Universidad de los Andes
Bogotá, Colombia
m.linaresv@uniandes.edu.co
http://sistemas.uniandes.edu.co/~mlinaresv
Kevin Moran
Ph.D. candidate
College of William and Mary
Williamsburg, VA, USA
kpmoran@cs.wm.edu
http://www.cs.wm.edu/~kpmoran
Denys Poshyvanyk
Associate Professor
College of William and Mary
Williamsburg, VA, USA
denys@cs.wm.edu
http://www.cs.wm.edu/~denys
3. PART 1: CURRENT STATE OF RESEARCH & PRACTICE
PART 2: MOBILE TESTING CHALLENGES
PART 0: BACKGROUND AND CORE CONCEPTS
PART 3: PRELIMINARY SOLUTIONS & RESEARCH VISION
7. The Importance of GUI Testing
Unit Testing Performance Testing
Regression Testing
Integration Testing
Compatibility Testing
• Several different types of testing are important
for ensuring software quality:
• For mobile, GUI-based testing subsumes many other types
of testing
• GUI testing is typically expensive, and test scripts are
difficult to maintain
• There is a clear opportunity for automation to improve
development workflows
13. GUI Testing: Example
Detecting and Localizing Internationalization Presentation Failures in Web Applications. Abdulmajeed Alameer, Sonal Mahajan, William G.J. Halfond. In
Proceedings of the 9th IEEE International Conference on Software Testing, Verification, and Validation (ICST). April 2016.
85. Pros and Cons
Automation
Frameworks
✓ Easy reproduction
✓ High level syntax
✓ Black box testing
- Learning curve
- User-defined oracles
- Expensive maintenance
Record &
Replay ✓ Easy reproduction
- Expensive collection and
maintenance
- Coupled to locations
AIG: Random
Based
✓ Fast execution
✓ Good at finding crashes
- Invalid events
- Lack of expressiveness
97. Monkey
A or B?
Breadth-First (BF) / Depth-First (DF)
Random (Uniform) Random (A-priori distr.)
Other options (online decision)
Systematic Exploration
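The exploration strategies above can be sketched in a few lines. The Python below is a toy illustration, not a real ripper: the GUI model, screen names, and event names are hypothetical, and an actual tool would drive a device through an automation API rather than a dictionary.

```python
import random

# Hypothetical GUI model: screen -> {event: next screen}.
GUI_MODEL = {
    "main":     {"tap_login": "login", "tap_settings": "settings"},
    "login":    {"tap_submit": "main", "tap_back": "main"},
    "settings": {"tap_toggle": "settings", "tap_back": "main"},
}

def dfs_exploration(model, start="main"):
    """Systematic depth-first exploration: fire every (screen, event) pair once."""
    sequence, visited, stack = [], set(), [start]
    while stack:
        screen = stack.pop()
        for event, target in model[screen].items():
            if (screen, event) not in visited:
                visited.add((screen, event))
                sequence.append(event)
                stack.append(target)
    return sequence

def random_exploration(model, steps, start="main", weights=None, seed=0):
    """Random exploration; 'weights' encodes an a-priori event distribution
    (uniform when omitted)."""
    rng = random.Random(seed)
    screen, sequence = start, []
    for _ in range(steps):
        events = list(model[screen])
        w = [weights.get(e, 1) for e in events] if weights else None
        event = rng.choices(events, weights=w)[0]
        sequence.append(event)
        screen = model[screen][event]
    return sequence
```

Systematic traversal covers each transition exactly once, while the weighted random variant illustrates "Random (a-priori distr.)" from the slide: biasing common events without guaranteeing coverage.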
98. Tools: Google Robo Test
https://firebase.google.com/docs/test-lab/robo-ux-test
99. Pros and Cons (cont'd)
AIG:
Systematic
✓ Achieves reasonable coverage
- May miss crashes
- Can be time consuming
- Typically cannot exercise complex features
101. Model-Based Testing (MBT)
Model → Monkey → UI events → AUT/SUT
- Manually generated
- Automatically generated (source code)
- Ripped at runtime (upfront)
- Ripped at runtime (interactive)
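However obtained (manual, code-derived, or ripped), the model can drive test-case generation. A minimal sketch, assuming a hypothetical ripped model represented as a transition map; every enumerated event path up to a length bound becomes one test case.

```python
from collections import deque

# Hypothetical GUI model ripped at runtime: screen -> {event: next screen}.
MODEL = {
    "home": {"open_form": "form", "open_list": "list"},
    "form": {"submit": "home", "cancel": "home"},
    "list": {"back": "home"},
}

def generate_test_cases(model, start, max_length):
    """Breadth-first enumeration of all event sequences up to max_length;
    each sequence is one model-based test case."""
    cases, queue = [], deque([(start, [])])
    while queue:
        screen, path = queue.popleft()
        if path:
            cases.append(path)
        if len(path) < max_length:
            for event, target in model[screen].items():
                queue.append((target, path + [event]))
    return cases
```

The state-explosion con on the next slide shows up directly here: the number of enumerated paths grows exponentially with `max_length`, which is why practical MBT tools prune or sample rather than enumerate exhaustively.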
102. Pros and Cons (cont'd)
AIG: Model
Based
✓ Event sequences
✓ Automatic exploration
- Some invalid sequences
- State explosion
- Incomplete models
103. Other Types of AIG Approaches
• Recently, new approaches have been introduced for AIG:
• Search-Based Approaches1
• Symbolic/Concolic Execution2
• NLP for Text Input Generation - @ICSE17
1Ke Mao, Mark Harman, and Yue Jia. 2016. Sapienz: multi-objective automated testing for Android applications. In Proceedings of
the 25th International Symposium on Software Testing and Analysis (ISSTA 2016)
2Nariman Mirzaei, Joshua Garcia, Hamid Bagheri, Alireza Sadeghi, and Sam Malek. 2016. Reducing combinatorics in GUI testing
of android applications. In Proceedings of the 38th International Conference on Software Engineering (ICSE '16)
110. Bug Reporting Services
• Features of Bug Reporting
Services:
• Video Recording
• App Analytics
• Automated Crash Reporting
112. Pros and Cons
Bug Reporting
Services
✓ Allows for more details
about field failures
✓ App Analytics can help
with UI/UX design
- Can be expensive
- Requires integrating library
- Typically do not report GUI traces
127. Pros and Cons
Crowdsourcing
Services
✓ Low effort required
from developers
✓ Expert tester might
uncover unexpected
bugs
- Can be expensive
- May not fit within Agile
workflows
- Quality of Reports can
vary
128. Cloud Testing & Device Streaming (Device Farms)
Third Party Service
137. Pros and Cons
Device
Streaming
✓ Allows remote users
access to controlled
devices
✓ Allows for collection of
detailed user information
- Can be difficult to configure
- Relies on strong network
connection
- Cannot simulate mobile-specific contexts like sensors
144. Accidental Challenges in Mobile GUI-Testing
Accidental Challenges
Application Data and Cold Starts
Bugs in Framework Utilities
Bugs and Limitations in the SDK Tools
Challenges Scaling Concurrent Virtual Devices
163. Essential Challenges: Test Oracles
Potential Solution: Use a hash of the current screen hierarchy
adb shell /system/bin/uiautomator dump /sdcard/ui_dump.xml
Drawbacks: May miss some fine-grained information about the GUI
(e.g., colors, images)
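The hash-based oracle can be sketched as follows. This is an illustrative Python fingerprint over a uiautomator-style XML dump; the function and sample attributes are assumptions for illustration, not an existing API. Hashing only structure and resource ids also reproduces the drawback noted above: color and image changes go unnoticed.

```python
import hashlib
import xml.etree.ElementTree as ET

def hierarchy_fingerprint(xml_dump):
    """Hash the structural part of a screen-hierarchy dump: widget classes,
    resource ids, and nesting depth, ignoring volatile text and coordinates."""
    root = ET.fromstring(xml_dump)
    parts = [
        f"{depth}:{node.get('class', '')}:{node.get('resource-id', '')}"
        for depth, node in _walk(root, 0)
    ]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def _walk(node, depth):
    """Yield (depth, node) pairs in document order."""
    yield depth, node
    for child in node:
        yield from _walk(child, depth + 1)
```

Two dumps of the same screen with different labels produce the same fingerprint, while a structural change (a different widget class) produces a different one, which is exactly the GUI-state equivalence check a hash oracle needs.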
168. Fully automated mobile testing: Are we there yet?
APP
- Oracles
- Models
- Scripts
- Strategies
- Fuzzers
- Rippers
- MBT
- Automation APIs
169. Key Challenges
1. Open problems without solutions (e.g., oracle generation)
2. Facets of mobile testing not investigated yet (e.g.,
mobile-specific fault models)
3. Specific “characteristics” in mobile testing that
drastically impact the testing process (e.g.,
fragmentation)
174. Key Challenges: Lack of fault models
Fault model (a.k.a., fault profile):
Catalog describing recurrent (or uncommon) bugs
associated with software developed for a particular
platform or domain
Fault model → potential locations/features in source code to be tested
176. Key Challenges: History awareness in test cases
Current automated techniques do not support system/acceptance testing because they are not history aware.
- Fuzzers are totally random
- Rippers follow heuristics for GUI traversal
- Test scripts are hard-coded
178. Key Challenges: Evolution/maintenance of test scripts
Location-coupled scripts
Need different versions for each device and are highly
sensitive to changes in the GUI
Device/Location-agnostic scripts
Are resilient to device fragmentation but need to be updated
when the GUI changes in terms of added/deleted components
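The difference can be illustrated with a toy example. The widget lists below are hypothetical stand-ins for two versions of a screen in which a button moves after a redesign: the coordinate-based tap breaks, while the id-based lookup survives (until the id itself is added or deleted).

```python
# Hypothetical UI snapshots: widget id plus screen bounds (left, top, right, bottom).
UI_V1 = [{"id": "btn_send", "bounds": (100, 200, 300, 260)}]
UI_V2 = [{"id": "btn_send", "bounds": (100, 400, 300, 460)}]  # moved after redesign

def tap_at(ui, x, y):
    """Location-coupled action: succeeds only if some widget covers (x, y)."""
    for w in ui:
        left, top, right, bottom = w["bounds"]
        if left <= x <= right and top <= y <= bottom:
            return w["id"]
    return None  # the hard-coded coordinates no longer hit anything

def tap_id(ui, widget_id):
    """Location-agnostic action: look the widget up by id, derive the tap point."""
    for w in ui:
        if w["id"] == widget_id:
            cx = (w["bounds"][0] + w["bounds"][2]) // 2
            cy = (w["bounds"][1] + w["bounds"][3]) // 2
            return widget_id, (cx, cy)
    return None
```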
180. Key Challenges: Oracle generation
Manually Codified Oracles (MCO)
Usually implemented as assertions or expected
exceptions
Exceptions-As-Oracle (EAO)
Crashes and errors reported during the execution of a
test case
GUI-State-as-Oracle (GAO)
Expected transitions between windows, and GUI
screenshots
182. Key Challenges
Challenge Solution
Fragmentation
Fault models
History-awareness
Multi-Goal testing
Test scripts evolution
Oracles generation
Crowd/cloud based testing
Research is needed
MBT is a promise
Open problem
Open problem
Open problem
183. The CEL Testing Principles
A “fully” automated solution for testing
mobile apps should continually generate
test cases, underlying models, oracles, and
actionable reports of detected errors,
without human intervention.
184. The CEL Testing Principles
Automated testing of mobile apps should
help developers increase software quality
within the following constraints: (i)
restricted time/budget for testing, (ii)
needs for diverse types of testing, and (iii)
pressure from users for continuous
delivery
185. The CEL Testing Principles
Continuous:
- Continuous multi-goal testing under different environmental
conditions
- Any change to the source code or environment should
automatically trigger a testing iteration on the current
version of the app
- Test cases executed during the iteration should cover only
the impact set of the changes that triggered the iteration
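The impact-set constraint above can be sketched as a simple intersection between a change set and per-test coverage data. The coverage map, test names, and file names are hypothetical.

```python
# Hypothetical coverage map from a previous testing iteration:
# test case -> set of source files it exercises.
COVERAGE = {
    "test_login":    {"LoginActivity.java", "AuthService.java"},
    "test_settings": {"SettingsActivity.java"},
    "test_checkout": {"CartActivity.java", "AuthService.java"},
}

def select_tests(coverage, changed_files):
    """Select only the test cases whose coverage intersects the change's impact set."""
    changed = set(changed_files)
    return sorted(t for t, files in coverage.items() if files & changed)
```

A real implementation would compute the impact set with static or dynamic change-impact analysis rather than a file-level map, but the selection step reduces to this intersection.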
186. The CEL Testing Principles
Evolutionary:
- App source code and testing artifacts (i.e., the models, the
test cases, and the oracles) should not evolve independently
of one another
- The testing artifacts should adapt automatically to changes
in (i) the app, (ii) the usage patterns, and (iii) the available
devices/OSes
187. The CEL Testing Principles
Large-scale:
- CEL testing should be supported on infrastructures for
parallel execution of test cases on physical or virtual devices.
- The large-scale engine should be accessible in the context of
both cloud and on-premise hardware
- This engine should enable execution of test cases that
simulate real conditions in the wild.
188. CEL Testing: Proposed Architecture
[Architecture diagram] Recoverable components:
- Inputs: developers, users, user reviews, APIs, source code
- Changes Monitoring Subsystem: source code changes monitor, APIs evolution monitor, on-device usages monitor, markets monitor
- Analyzers: impact analyzer, execution traces + logs analyzer, API changes analyzer, user reviews + ratings analyzer
- Testing Artifacts Generation: multi-model generator, multi-model repository (domain, GUI, usage, contextual, and fault models), models generator, models repository, test cases generator, artifacts repository (test cases + oracles)
- Large Execution Engine: containers manager, test cases runner, reports generator, reports repository
189. Multi-Model
- GUI model: windows, transitions, components
- Usage model: single use cases and combinations; (un)common usages
- Domain model: entities, data types, values, enumerations
- Context model: external events, sensors, connectivity
- Fault model: (un)common programming errors
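One possible encoding of the multi-model as a single composite artifact, sketched as a Python dataclass. The field contents follow the five sub-models listed above; the class and its method are hypothetical, not part of any proposed implementation.

```python
from dataclasses import dataclass, field

@dataclass
class MultiModel:
    """Composite testing artifact combining the five sub-models."""
    gui: dict = field(default_factory=dict)      # windows, transitions, components
    usage: dict = field(default_factory=dict)    # event -> observed frequency
    domain: dict = field(default_factory=dict)   # entity/field -> data types, values
    context: list = field(default_factory=list)  # external events, sensors, connectivity
    faults: list = field(default_factory=list)   # (un)common programming errors

    def event_weight(self, event):
        """Bias test generation toward (or away from) common usages."""
        return self.usage.get(event, 1)
```

Keeping the sub-models in one artifact is what lets a generator combine them, e.g. weighting GUI-model transitions by usage frequency while targeting locations flagged by the fault model.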
195. Discussion Questions
• Potential solutions to challenges we covered?
• Other future research directions?
• How does the rise of native web-programming
frameworks for mobile apps impact testing?