The document discusses the evolution of a SPARQL benchmarking framework from version 1.x to 2.x. Version 1.x had several limitations, such as only supporting SPARQL queries and using a hardcoded methodology. Version 2.x addressed these limitations by supporting different types of operations, separating the test methodology, and making the framework more customizable and extensible. Examples are given of how the framework is used internally and how others can further customize it for their needs.
4.
Presentation I gave at this conference in 2012
Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking
Highlighted some issues with SPARQL Benchmarking:
Standard benchmarks all have known deficiencies
Lack of standardized methodology
Best benchmark is the one you run with your data and workload
Introduced the 1.x version of our SPARQL Query Benchmarker tool
Java tool and API for benchmarking
Used a methodology based upon a combination of the BSBM runner and the Revelytix SP2B white paper
Reports various appropriate statistics
Various configuration options to change what exactly is benchmarked, e.g. whether results are fully parsed and counted
5.
The 1.x tool was open sourced shortly after the 2012 conference under a 3-clause BSD License
Available on SourceForge
http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/
Also as Maven artifacts (in Maven Central):
Group ID: net.sf.sparql-query-bm
Artifact IDs:
cmd
core
Latest 1.x Version: 1.1.0
7. The 1.x tool can only benchmark SPARQL queries
SPARQL 1.1 has been standardized since the 1.x version of the tool was written and adds various additional SPARQL features that you may want to test:
SPARQL Updates
SPARQL Graph Store Protocol
Queries are fixed
No parameterization support
Can't pass custom endpoint parameters in
For example enable/disable reasoning
Also no way to test endpoint-specific extensions
e.g. transactions
8.
Requires using HTTP endpoints to access the SPARQL system to be tested
Adds communication overheads to the results
Sometimes this may be desirable
No ability to test SPARQL operations in-memory
i.e. can't test lower level APIs
9. Only supports a single benchmarking methodology
Methodology is hard coded
Can't do things like run a subset of the provided operations on each run
Or repeat an operation within a run
Or retry an operation under specific failure conditions
Configuration of the methodology is tightly coupled to the methodology
Many aspects are actually independent of the methodology
10.
Used a simplistic text-based format
One query file per line
No way to specify additional parameters
No way to assign a friendly name to queries
Each query is simply assigned its filename as its name
11. There is a progress monitoring API but it is limited
E.g. gets called after a query completes but not before it starts
Makes it awkward/impossible to implement some kinds of monitoring
e.g. crash detection, memory usage
12.
In the interests of speed over usability we rolled our own command line argument parser
Means argument parsing is awkward to extend
14.
Earlier this year we found a compelling reason to rewrite the tool and address the various limitations
First 2.x release was made 9th June 2014
Minor bug fix and maintenance releases since
Releases available at:
http://sourceforge.net/projects/sparql-query-bm/files/
Code is now using Git
http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git
Mirrors available on GitHub for those who think that it is the one true source
https://github.com/rvesse/sparql-query-bm
Maven artifacts available through Maven Central as before:
Group ID: net.sf.sparql-query-bm
Artifact IDs: core, cmd and dist
Latest 2.x version: 2.0.1
15. Concept of Queries replaced with the general concept of Operations
Also divorces the definition of an operation from how to run said operation
Makes it easier to change runtime behaviour of operations
20 built-in operations provided
API allows defining and plugging in new operations as desired
http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
16.
Several kinds of query/update:
Fixed
Parameterized
Dataset Size
Variants for both remote endpoints and in-memory datasets
Remote variants have additional NVP (name-value pair) variants
Allows adding custom parameters to the remote request
Accounts for 13 of the built-in operations
17.
One for each graph store protocol operation:
DELETE
GET
HEAD
POST
PUT
Accounts for a further 5 of the built-in operations
18.
Sleep
Do nothing for some period
Useful for simulating quiet periods as part of testing
Mix
Allows grouping a set of operations into a single operation
Lets you compose mixes from other mixes
19.
As already noted, in-memory variants of some operations are now available
These run tests against a Dataset implementation
Part of the Apache Jena ARQ API
Removes SPARQL Protocol and HTTP overhead from testing
Of course, depending on the Dataset implementation there may still be some communication overhead
But this likely uses lower-level native back-end communication protocols instead
20.
Addresses the limitation of hard coded methodology
Separates test running into three components:
Overall runner
Mix runner
Operation runner
Each has its own API and can be customized as desired
Various useful base/abstract implementations provided
Four different test runners are provided:
Benchmark
Smoke
Soak
Stress
21.
Smoke
Runs the mix once and indicates whether it passes/fails
Pass is defined as all operations pass
Soak
Run the mix continuously for some period of time
Test how a system reacts under continuous load
Stress
Run the mix with increasingly high load
Test how a system reacts under increasing load
AbstractRunner provides a basic framework and helper methods to make it easy to add custom runners or customize existing runners
22.
Allows customizing how mixes and individual operations are run
Some alternative implementations built in:
E.g. SamplingOperationMixRunner
Runs a sample of the operations in the mix
May include repeats
E.g. RetryingOperationRunner
Retries an operation if it doesn't succeed
Easy to implement your own
23.
Separates test configuration from the test runner
Interface with all common configuration defined
Endpoints
Timeouts
Progress Listeners
etc
NB - Runners are typically defined such that they restrict their input options to sub-interfaces that add runner-specific configuration, e.g.
Warm-ups for benchmarks
Total runtime for soak testing
Ramp up factor for stress testing
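This options pattern can be sketched as follows; the interface and method names here are invented for illustration and are not the framework's actual ones:

```java
// Hypothetical sketch: common settings live on a base interface,
// runner-specific settings on sub-interfaces, so a runner typed against
// the sub-interface can only accept a suitable configuration.
interface Options {
    String endpoint();
    int timeoutSeconds();
}

interface SoakOptions extends Options {
    long totalRuntimeMinutes(); // soak-specific setting
}

interface StressOptions extends Options {
    double rampUpFactor(); // stress-specific setting
}

public class OptionsSketch {

    // The compiler rejects a plain Options here: soak settings are required.
    static String describeSoak(SoakOptions opts) {
        return opts.endpoint() + " for " + opts.totalRuntimeMinutes() + " minutes";
    }

    public static void main(String[] args) {
        SoakOptions opts = new SoakOptions() {
            public String endpoint() { return "http://localhost:3030/ds/query"; }
            public int timeoutSeconds() { return 300; }
            public long totalRuntimeMinutes() { return 60; }
        };
        System.out.println(describeSoak(opts));
    }
}
```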
24.
Now using TSV as the file format
Still wanted it to be simple enough that someone with zero RDF/SPARQL knowledge can configure it
Each line is a series of parameters separated by a tab character
First parameter is an identifier for the type of the operation
Used to decide how to interpret the remaining parameters
Can define your own mix file format and register a loader for it
Possible to override the loader for a specific operation identifier since this has an API
Means you can do neat tricks like use a mix designed for remote endpoints against an in-memory dataset
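As a hypothetical illustration, a mix file in this format might look like the following; the operation identifiers and file paths are examples rather than the exact identifiers the tool registers (though param-query does appear in the tool's own help output):

```
query	queries/simple-select.rq
param-query	queries/lookup.rq	queries/lookup-params.tsv
update	updates/insert-data.ru
sleep	5000
```

Each column is separated by a tab character, and the first column selects the loader that interprets the remaining ones (e.g. a query file, a parameter file, or a sleep duration in milliseconds).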
26. Now provides notifications before and after operation and mix runs
Improvements to how some of the built-in implementations handle multi-threaded output
Makes it easier to distinguish where errors occurred when running multi-threaded benchmarks
27.
Now based upon the powerful open source Airline library
https://github.com/airlift/airline
Provides a command line interface to each built-in runner
Also provides AbstractCommand with all standard options exposed
Standardized exit codes across all commands
Comprehensive built-in help
Can help you define operation mixes
./operations
./operation --op param-query
These are things we've done (or are currently doing) with the framework that aren't in the open source releases
However the 2.x framework makes these (hopefully) easy to replicate yourself
30.
Many stores have rich REST APIs in addition to their SPARQL APIs
Can be useful to include testing of these in your mixes
Requires implementing two interfaces:
Operation
OperationCallable
Abstract implementations of both are available to give you the boilerplate bits
Internally we have 9 different custom operations defined which test a subset of our REST API:
Database Management
Asynchronous Queries
Import Management
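A custom REST API operation might be shaped roughly as follows. The interface here is a simplified stand-in, and all names and signatures are illustrative rather than the framework's actual Operation/OperationCallable API:

```java
import java.util.concurrent.Callable;

// Simplified stand-in for the framework's Operation extension point
// (hypothetical signature; the real interface differs).
interface Operation {
    String getName();
    Callable<Boolean> createCallable();
}

/** Illustrative custom operation exercising a store's REST API. */
public class RestApiOperation implements Operation {
    private final String url; // e.g. a database management endpoint (illustrative)

    public RestApiOperation(String url) {
        this.url = url;
    }

    @Override
    public String getName() {
        return "rest-list-databases"; // invented operation identifier
    }

    @Override
    public Callable<Boolean> createCallable() {
        // A real implementation would issue an HTTP request to this.url,
        // time it, and report success/failure back to the runner.
        return () -> url != null && url.startsWith("http");
    }

    /** Convenience wrapper: run the callable once, treating errors as failure. */
    public boolean runOnce() {
        try {
            return createCallable().call();
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RestApiOperation op = new RestApiOperation("http://localhost/api/databases");
        System.out.println(op.getName() + " -> " + op.runOnce());
    }
}
```

In the real framework the abstract base implementations mentioned above would supply most of this boilerplate for you.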
31. One thing we're particularly interested in is how operations affect memory usage
We added custom progress listeners that track and monitor memory usage
Reports on min, max and average memory usage
We also have another progress listener that tracks processes to identify when a test run may have been impacted by other activity on the system
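A minimal sketch of the memory-tracking idea, assuming a listener that samples JVM memory from its after-operation hook; the class and method names are invented for illustration and do not come from the framework's progress listener API:

```java
// Hypothetical helper a custom progress listener could delegate to:
// samples used JVM heap after each operation and reports min/max/average.
public class MemoryUsageTracker {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long total = 0;
    private int samples = 0;

    /** Call this from the listener's after-operation hook. */
    public void sample() {
        Runtime rt = Runtime.getRuntime();
        record(rt.totalMemory() - rt.freeMemory());
    }

    // Separated out so the aggregation logic is testable with fixed values.
    void record(long usedBytes) {
        min = Math.min(min, usedBytes);
        max = Math.max(max, usedBytes);
        total += usedBytes;
        samples++;
    }

    public long min() { return min; }
    public long max() { return max; }
    public long average() { return samples == 0 ? 0 : total / samples; }

    public static void main(String[] args) {
        MemoryUsageTracker tracker = new MemoryUsageTracker();
        tracker.sample();
        System.out.printf("used memory: min=%d max=%d avg=%d%n",
                tracker.min(), tracker.max(), tracker.average());
    }
}
```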
32.
public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {
    public RetryOnAuthFailureOperationRunner() {
        this(1);
    }

    public RetryOnAuthFailureOperationRunner(int maxRetries) {
        super(maxRetries);
    }

    @Override
    protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options,
            Operation op, OperationRun run) {
        return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
    }
}
Extends the built-in RetryingOperationRunner
Simply adds a constraint on retries by overriding the shouldRetry() method
34.
Embrace Java 7 features fully
Use ServiceLoader to automatically discover new operations and mix formats
Make it even easier to customize runners
i.e. provide more abstraction of the current implementations