In the world of software-as-a-service, just about anyone with a laptop and an Internet connection can spin up their very own cloud-based web service. Software startups, in particular, are often big on ideas but small on staff. This makes streamlining the traditional develop-test-integrate-deploy-monitor pipeline critically important. Melissa Benua says that an effective way to accomplish this is to reduce the number of different test suites that verify many of the same things at each stage. Melissa explains how teams can achieve this by authoring the right set of tests and using the right frameworks. Drawing on lessons learned in companies both large and small, Melissa shows how teams can drastically slash time spent developing automation, verifying builds for release, and monitoring code in production, all without sacrificing availability or reliability.
2. The challenge: Monitoring SaaS products
Software as a service is exploding, and so is testing complexity:
1. It's not enough to run tests at build time; you also need deploy-time integration tests and continuous network monitoring
2. Every layer of tests adds complexity and maintenance costs
3. There are a limited number of engineer-hours in the day
4. Engineers want to use their time with maximum efficiency
Time spent writing the same tests over again is time that could be spent doing more interesting and important stuff!
4. Cloud Monitoring Services
Providers:
• Keynote
• Gomez
• Pingdom
Pros:
• Lightweight
• Integrated alerting
• Public vs. private status pages
Cons:
• Difficult to manage multiple contributors
• Can’t do complex checks easily (log in a user and verify inventory)
• Can get expensive or require enterprise contracts
5. Hosted Monitoring Services
Providers:
• Sensu
• System Center Operations Manager (SCOM)
• Nagios
Pros:
• Extremely powerful
• Mature, well-established technology
Cons:
• Complex to set up
• Single centralized server
• Overkill for many services hosted in the cloud
6. OUR APPROACH
Do it the PlayFab way!
October 5, 2015 PlayFab Confidential 6
7. Our Solution
1. Author one set of HTTP-level tests
• Same as how clients connect
• Self-contained and self-initializing
• Repeatable and reliable
2. Deploy tests both within the build environment and within the monitoring cloud
3. Collect data from tests into one central location
4. Present data for use by both devops and customers
Pros:
• Efficient use of engineering resources
• VM hosting bill is very small
• Can run complex tests without worrying about maintainability
Cons:
• Pipeline requires some maintenance
• Requires knowing how to use two different clouds
• Must be able to do test setup from within a different ecosystem
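The "one set of HTTP-level tests" idea can be sketched as a small Python script. This is a minimal sketch only: the `/login` and `/inventory` endpoints, field names, and credentials below are hypothetical stand-ins, not PlayFab's actual API.

```python
import json
import urllib.request

# Hypothetical base URL; a real monitoring test would point at the service
# under test, exactly the way clients connect (plain HTTP/JSON).
BASE_URL = "https://api.example.com"

def make_login_request(user, password):
    """Build the HTTP login request (the test creates its own session state)."""
    body = json.dumps({"user": user, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def check_inventory_response(payload):
    """Judge one HTTP-level check: did the inventory call return a sane result?"""
    return payload.get("status") == "OK" and isinstance(payload.get("items"), list)

def run_monitoring_pass():
    """One self-contained, repeatable pass: log in, fetch inventory, report.
    Requires a live service, so it is not invoked here."""
    with urllib.request.urlopen(make_login_request("monitor", "secret")) as resp:
        session = json.load(resp)
    req = urllib.request.Request(
        BASE_URL + "/inventory",
        headers={"Authorization": "Bearer " + session["token"]},
    )
    with urllib.request.urlopen(req) as resp:
        return check_inventory_response(json.load(resp))
```

Because the pass initializes its own state and judges responses with pure helper functions, the same script can run unchanged on the build server and on the monitoring VMs.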
8. Our solution, cont’d
Goals:
• Minimize number of lines of code duplicated per functional piece
• Reliable & trustworthy reporting
• Affordable cost
• Adequate geo-location
• Very low maintenance time cost
• Easy to access
• More free time for engineering!
Limitations:
• Smaller # of monitoring leaf nodes (~10 instead of ~100 or ~1000)
• Vulnerable to gaps in dev logic
• Not as straightforward to set up
• Monitoring is only as good as your testing!
10. Scenario A – RESTful API
Sample characteristics:
• Custom service in Java layered on Apache
• Private hosting
• Tests via JUnit
• Authenticates using private login
• Connects to several different backend services (MongoDB, SQL, analytics, queueing, etc.)
11. Scenario B – MVC Website
Sample characteristics:
• Built on .NET MVC
• Hosted in Azure
• Testing via custom harness
• Authenticates using OAuth and Facebook
• Backends into locally hosted SQL Server
12. Scenario C - PlayFab
Characteristics:
• JSON API built on C# + management website
• https://api.playfab.com/documentation
• Hosted in Windows on AWS
• Tests via VSTest
• Many moving parts
• Game server hosting
• Client versus server authentication
• Third-party purchasing and auth providers
• Various backend data sources
14. Architecture
[Diagram] Developer writes tests and submits code → build server compiles the code and runs the tests → deployment packages go to production. The same tests run on monitoring nodes in Microsoft Azure (Europe) and Amazon Web Services (US-West, US-East, Asia); a web server collects the results and a web site displays the data.
15. Utilized Tech
Test Framework
• VSTest or JUnit or custom executor
• Must output a predictable, machine-readable format
(.TRX from VSTest comes with an XSD for easy parsing)
Execution + Communication Layer
• Consul or custom cross-DC chatter
• Consul API is available in many languages, easy to secure, and simple to configure
• Regularly executes the test executable
• Shares test results as ‘service health checks’ across DCs
Custom Data Bridge
• Transform test framework output into Consul input
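The data bridge above can be sketched in a few lines of Python: parse the test framework's machine-readable output and map each result onto a Consul check status. The TRX fragment below is simplified for illustration (real .trx files carry an XML namespace and many more attributes), and the outcome-to-status mapping is one reasonable choice, not the only one.

```python
import xml.etree.ElementTree as ET

# Map VSTest outcomes onto the three Consul check states.
OUTCOME_TO_CONSUL = {
    "Passed": "pass",        # -> /v1/agent/check/pass/<id>
    "Failed": "fail",        # -> /v1/agent/check/fail/<id>
    "Inconclusive": "warn",  # -> /v1/agent/check/warn/<id>
}

def bridge_trx(trx_xml):
    """Yield (check_id, consul_status) pairs from TRX-style test results.
    Unknown outcomes are treated as failures, erring on the loud side."""
    root = ET.fromstring(trx_xml)
    for result in root.iter("UnitTestResult"):
        name = result.get("testName")
        status = OUTCOME_TO_CONSUL.get(result.get("outcome"), "fail")
        yield name, status

# Simplified .trx-like fragment for demonstration.
sample = """<TestRun><Results>
  <UnitTestResult testName="LoginWorks" outcome="Passed" duration="00:00:01.2"/>
  <UnitTestResult testName="InventoryLoads" outcome="Failed" duration="00:00:03.0"/>
</Results></TestRun>"""

results = dict(bridge_trx(sample))
```

The same shape works for JUnit XML: only the element and attribute names in the parser change, while the Consul side stays identical.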
16. Picking Monitoring Tests
[Diagram] Test suite hierarchy: the full-app integration test suite sits at the top; beneath it, Internal Service A and Internal Service B each have their own test suite and integration suite; library unit test suites form the base.
17. Picking Monitoring Tests, cont'd
Must-haves:
• Happen at the same layer clients access (HTTP, generally)
• Cover key ‘P0’ functionality areas
• Cover areas with lots of ‘moving parts’
Nice-to-haves:
• All exposed APIs
• Third-party integrations
• Full success-testing run
Ideal world:
• Full integration test suite
18. Scenario Must-Have Test Cases
REST API:
• Login/Authenticate
• Logout
• One test per downstream service
• Stretch: one test per API
MVC Website:
• One test per login method (OAuth, Facebook)
• Key pages
• Basic SQL coverage
19. Deployment Pipeline
The fewer manual steps the better!
Sample flow: Submit code to repo → CI runs build → CI runs tests → deployment packages created → tests deployed into monitoring cloud storage → cloud storage distributes to VMs
20. Monitoring Cloud
Any cloud will do!
Number of regions is important
• Azure has https://azure.microsoft.com/en-us/regions/#services
• AWS has http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region
VMs can be teeny – no need for heavy compute or memory usage
21. Test Execution Frequency
How complex is it to run your tests?
• Run a simple executable?
• Have to download a lot of data?
• Long setup phase?
• How long does a full test pass take?
Periodic execution (every N seconds)
Faster is better! Pingdom ‘free’ tier is every 15 minutes per check
Ideal range is between 30 seconds and 5 minutes
Be careful not to drown your ‘real traffic’
• Test traffic hiding problems with real users is a legitimate issue!
• Try to stay under 10% of total traffic if possible
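The 10% budget above is easy to sanity-check with arithmetic. Here is a back-of-envelope helper; all the numbers in the example call are illustrative, not measurements from a real service.

```python
def synthetic_traffic_fraction(requests_per_pass, pass_interval_s,
                               leaf_nodes, real_requests_per_s):
    """Estimate what fraction of total traffic the monitoring tests generate.

    requests_per_pass:  HTTP requests issued by one full test pass
    pass_interval_s:    seconds between passes on each node
    leaf_nodes:         number of monitoring VMs running the pass
    real_requests_per_s: organic traffic rate
    """
    test_rps = requests_per_pass * leaf_nodes / pass_interval_s
    return test_rps / (test_rps + real_requests_per_s)

# e.g. 50 requests per pass, every 60 seconds, from 10 monitoring VMs,
# against 100 real requests/second of organic traffic:
frac = synthetic_traffic_fraction(50, 60, 10, 100.0)
```

If the fraction creeps toward the 10% line, either the pass interval grows or the pass itself gets trimmed; the formula makes that trade-off explicit.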
22. Collecting Results
Execute Tests
Put machine-readable test results into collator
• Consul accepts Datacenter, CaseName, Pass/Warn/Fail, Note (we store latency)
• Agents may be updated using SDK or direct to HTTP interface
• Example: http://localhost:8500/v1/agent/check/pass/mytestcase
• Full HTTP API: https://www.consul.io/docs/agent/http.html
Small adapter program reads test results and outputs to Consul Agent (SDK or HTTP)
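The HTTP side of that adapter is little more than URL construction against the agent endpoint shown above. A minimal sketch (the `note` value here is illustrative; `send_update` needs a live local Consul agent, so only the URL builder is exercised):

```python
import urllib.parse
import urllib.request

# Consul agent is conventionally reached on localhost:8500.
CONSUL_AGENT = "http://localhost:8500"

def check_update_url(check_id, status, note=None):
    """Build /v1/agent/check/{pass|warn|fail}/<id>, optionally with ?note=."""
    assert status in ("pass", "warn", "fail")
    url = "%s/v1/agent/check/%s/%s" % (
        CONSUL_AGENT, status, urllib.parse.quote(check_id))
    if note is not None:
        url += "?note=" + urllib.parse.quote(note)  # e.g. latency details
    return url

def send_update(check_id, status, note=None):
    """Report one test result to the local Consul agent (network call)."""
    urllib.request.urlopen(check_update_url(check_id, status, note)).read()
```

Storing latency in the note, as the slide describes, is just `send_update("mytestcase", "pass", note="latency=120ms")`.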
24. Alerting
Ideal to hear about outages as a push rather than a pull
Determine what ‘failure’ means to you
• Balance between false alarms and missing real alarms
Many options!
• Post alerts into VictorOps for paging
• Send email from monitoring website
• Send push notification through your cloud
28. Consul Commands
Full HTTP API: https://www.consul.io/docs/agent/http.html
Add a health check:
$body = @'
{
  "ID": "mypath",
  "Name": "Path Works",
  "Notes": "Checking uptime and latency",
  "HTTP": "http://my.service.com/path",
  "TTL": "45s"
}
'@
• Invoke-WebRequest -Method PUT -Body $body http://localhost:8500/v1/agent/check/register
List the health checks:
• Invoke-WebRequest http://localhost:8500/v1/health/checks/myservice
[
  {
    "Node": "somenode",
    "CheckID": "mypath",
    "Name": "Path Works",
    "Status": "passing"
  }
]
29. Consul Commands
Update a health check:
• Can add ?note=foo to pass details like latency
• Invoke-WebRequest http://localhost:8500/v1/agent/check/pass/mypath
• Invoke-WebRequest http://localhost:8500/v1/agent/check/warn/mypath
• Invoke-WebRequest http://localhost:8500/v1/agent/check/fail/mypath