2. Geoprocessing with R @ AGILE 2015 2
Motivation
Go beyond data
Share knowledge and resources
Have fun and develop software
Process design is really really really really really really really really really really really hard.
Why R?
R-sig-geo
geos, rgdal, raster, gstat, …
Use cases
Intuitive apps for air quality data
• Interactive UI
• Pre-configured analysis
• Multiple "apps"
User selects an air quality station on a map and a time frame to start one complex data analysis and explore the result plot.
Reproducible scientific analysis
• Transparency & full control (levels)
• Existing research scripts
• Standards
• Collaboration
Researcher tests different parameters of an algorithm using nonpublic data within his GIS environment, downloads all scripts and recreates full setup in own environment.
SENSORWEBY
User-friendly air quality timeseries analysis with…
http://joaquin.eu
52°North JavaScript Sensor Web Client
https://github.com/52north/js-sensorweb-client/
http://sensorweb.demo.52north.org/jsClient/
Sensorweby
sensorweby
Web app integrating
JS SWC with Shiny
https://github.com/52North/sensorweby
sensorweb4R + sos4R
client libraries to access data
services from R
https://github.com/52North/sensorweb4R
http://cran.r-project.org/web/packages/sos4R/
http://shiny.irceline.be/examples/
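# Install the development versions from GitHub (requires the devtools package)
devtools::install_github("52North/sensorweb4R")
devtools::install_github("52North/sensorweby")
# Start the bundled example app
sensorweby::runExample("basic")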
WPS4R
Standardized scientific geoprocessing for land use analysis using…
Web Processing Service
Describes inputs & outputs of geospatial web processing services
http://en.wikipedia.org/wiki/Web_Processing_Service | http://www.opengeospatial.org/standards/wps
• GetCapabilities: service-level metadata
• DescribeProcess: process description (metadata, inputs, outputs)
• Execute: start and retrieve results of a process instance
– Synchronous
– Asynchronous
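The three operations can be sketched as key-value-pair GET requests built in base R. This is a minimal illustration assuming a WPS 1.0.0 server; the endpoint URL, the process identifier and the inputs a and b are hypothetical placeholders, and the helper wps_url is not part of any package.

```r
# Build a key-value-pair (KVP) GET request URL for a WPS 1.0.0 endpoint.
# wps_url() is a hypothetical helper written for this sketch.
wps_url <- function(endpoint, request, ...) {
  params <- list(service = "WPS", request = request)
  # GetCapabilities negotiates the version; the other operations state it.
  if (request != "GetCapabilities") params$version <- "1.0.0"
  params <- c(params, list(...))
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(endpoint, "?", paste0(names(params), "=", values, collapse = "&"))
}

endpoint <- "http://example.org/wps"   # hypothetical server
process  <- "org.example.calculator"   # hypothetical process identifier

# Service-level metadata
caps <- wps_url(endpoint, "GetCapabilities")

# Process description (metadata, inputs, outputs)
desc <- wps_url(endpoint, "DescribeProcess", identifier = process)

# Synchronous execution: the response carries the result directly
exec_sync <- wps_url(endpoint, "Execute",
                     identifier = process, DataInputs = "a=1;b=2")

# Asynchronous execution: the server stores the response document and
# the client polls the status location it returns
exec_async <- wps_url(endpoint, "Execute",
                      identifier = process, DataInputs = "a=1;b=2",
                      storeExecuteResponse = "true", status = "true")

print(c(caps, desc, exec_sync, exec_async))
```

Larger or complex inputs are usually sent as an XML document via POST instead; the GET form is shown only because it makes the difference between the operations easy to see.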
Open source implementations
OGC WPS
52°North WPS4R
https://wiki.52north.org/bin/view/Geostatistics/WPS4R
Collaboration platform for web devs, IT, and GIS/domain experts
Reproducible geoprocesses in SOA
WPS4R DEMONSTRATOR
Reproducible Global Land Use Classification
http://geoportal-glues.ufz.de/stories/landsystemarchetypes.html
Process Inputs
The following input parameters of the process are available to the user:
• Sample size (number of points spread over the globe)
• Sampling type (strategy such as regular, random, …)
• Standardization method (how to move inputs to a common scale)
• SOM-specific parameters
– Grid topology (rectangular or hexagonal)
– Grid dimensions (influences # of output classes)
Future versions might allow users to select the datasets used, integrate their own datasets, select input layers, …
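How these process inputs could drive an analysis can be sketched in R. This is an illustration only, not the published script: the data frame is synthetic, the helper names draw_sample and standardize are hypothetical, and the use of the kohonen package for the SOM is an assumption, since the slides do not name the implementation.

```r
# Synthetic stand-in for the global input layers: 1000 cells, 3 variables.
set.seed(42)
cells <- data.frame(
  cropland = runif(1000, 0, 100),
  gdp      = rlnorm(1000, meanlog = 8),
  temp     = rnorm(1000, mean = 15, sd = 8)
)

# Sample size and sampling type: "regular" takes evenly spaced rows,
# "random" draws rows without replacement.
draw_sample <- function(data, size, type = c("regular", "random")) {
  type <- match.arg(type)
  idx <- switch(type,
    regular = unique(round(seq(1, nrow(data), length.out = size))),
    random  = sample(nrow(data), size)
  )
  data[idx, , drop = FALSE]
}

# Standardization method: z-score or rescaling each variable to [0, 1].
standardize <- function(data, method = c("zscore", "range")) {
  method <- match.arg(method)
  m <- as.matrix(data)
  switch(method,
    zscore = scale(m),
    range  = apply(m, 2, function(x) (x - min(x)) / (max(x) - min(x)))
  )
}

inputs <- standardize(draw_sample(cells, size = 100, type = "random"), "zscore")

# Grid dimensions fix the number of output classes: 4 x 4 gives 16.
xdim <- 4
ydim <- 4
n_classes <- xdim * ydim

# Train the SOM only if the kohonen package is installed (assumed
# implementation; the source does not say which one was used).
if (requireNamespace("kohonen", quietly = TRUE)) {
  grid  <- kohonen::somgrid(xdim = xdim, ydim = ydim, topo = "hexagonal")
  model <- kohonen::som(inputs, grid = grid)
  classes <- model$unit.classif   # one class id per sampled cell
}
```

Changing xdim and ydim is the lever behind a user choice such as 16 classes, while sample size and sampling type trade run time against coverage.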
http://geoportal-glues.ufz.de/stories/openanalysis.html
Next steps
WPS4R
Spatial output integrated with ArcGIS client
GitHub-based four-eyes-principle!
Generate provenance information from scripts?
Privacy-aware analysis of floating car data?
See https://github.com/52North/WPS/labels/wps4r
sensorweby
Apps, apps, apps
See https://github.com/52North/sensorweby/issues/
wps-js [shameless plug]
WPS XML process description > web form
https://github.com/52North/wps-js/
Beta-stage, please join!
Goal: JSONIX
https://github.com/highsource/jsonix
Conclusions
Shiny
Quick, suitable for teaching, a few lines of plain R create an app, full GIS, extensible within the platform
WPS4R
SDI/SII-ready, simple to complex processes/abstraction levels, various clients, service-oriented architecture, completely customizable
http://en.wikipedia.org/wiki/Law_of_the_instrument
* R is not the only hammer in the toolbox, 52°North WPS (and others) support Python, Java, and Matlab processes as well…
http://pixabay.com/p-35369/?no_redirect
http://pixabay.com/p-96174/?no_redirect
https://home.comcast.net/~tomhorsley/game/styles/tools.jpg
References
http://blog.52north.org/2015/04/22/advanced-time-series-analysis-on-the-web-with-r/
http://geoportal-glues.ufz.de/stories/openanalysis.html
http://52north.org/wps
• Pebesma, E., D. Nüst, R. Bivand, 2012. The R software environment in reproducible geoscientific research. Eos, Transactions American Geophysical Union 93(16), pp. 163–164
• CRAN task views “Spatial” ( http://cran.r-project.org/web/views/Spatial.html ) and “SpatioTemporal” ( http://cran.r-project.org/web/views/SpatioTemporal.html )
• Nüst, D., Stasch, C. and Pebesma, E. J., 2011. Connecting R to the Sensor Web. In: Geertman, S., Reinhardt, W. and Toppen, F. (Eds.) Advancing Geoinformation Science for a Changing World, Springer Lecture Notes in Geoinformation and Cartography, pp. 227–246
• Hinz, M., Nüst, D., Proß, B. and Pebesma, E., 2013. Spatial Statistics on the Geospatial Web. Short paper, AGILE 2013
• Vaclavik, T., Lautenbach, S., Kuemmerle, T. and Seppelt, R., 2013. Mapping global land system archetypes. Global Environmental Change 23(6), pp. 1637–1647. DOI: 10.1016/j.gloenvcha.2013.09.004
Contact: d.nuest@52north.org | @nordholmen
Editor's notes
MOTIVATION: data is there, but analyses are not
Hardest part: design of the process (inputs, outputs), Occam's razor
MIKE JACKSON's COMMENT > what is the generic aspect?
R is the lingua franca of statistical analysis today, thanks in large part to its growing communities and ever-increasing number of extension packages [1].
R today has a strong geospatial community [2, 3] and can very well be labeled a full geographic information system (GIS). There are both R wrappers for widespread libraries for handling geospatial raster and vector data, such as geos [9] or gdal [10], and R implementations of processing functionality, e.g. raster [11], gstat [12], or trajectories [20].
If you want the latest functionality then R is the place to look.
That is why we present two best practices for integrating processing functionality with R into web applications for two use cases.
The first use case requires an intuitive user interface for specific preconfigured analyses in the air quality domain: users should be able to interactively select a specific measurement station and time frame on a map and execute a preconfigured algorithm.
The second use case requires a completely reproducible and transparent analysis execution, flexible parameter configuration, and easy adaptation for scientific research scripts: scientists should be able to publish existing scripts used to conduct analysis and create plots for a scientific publication easily and provide them to other scientists.
Shiny is a web application framework that allows R users to easily publish their analysis as interactive web applications [4].
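A minimal Shiny app for the first use case (pick a station and a time frame, get a plot) might look like the sketch below. The measurement data is synthetic and the station names are invented; a real app would load observations from a sensor web service instead.

```r
library(shiny)

# Synthetic hourly measurements for two invented stations.
set.seed(1)
measurements <- data.frame(
  station = rep(c("Station A", "Station B"), each = 48),
  hour    = rep(1:48, times = 2),
  pm10    = c(rnorm(48, 25, 5), rnorm(48, 40, 8))
)

# User interface: station selector, time frame slider, plot area.
ui <- fluidPage(
  titlePanel("Air quality time series"),
  selectInput("station", "Station", choices = unique(measurements$station)),
  sliderInput("hours", "Time frame (hours)", min = 1, max = 48, value = c(1, 24)),
  plotOutput("series")
)

# Server logic: filter by the selected station and time frame, then plot.
server <- function(input, output) {
  output$series <- renderPlot({
    d <- subset(measurements,
                station == input$station &
                  hour >= input$hours[1] & hour <= input$hours[2])
    plot(d$hour, d$pm10, type = "l", xlab = "Hour", ylab = "PM10")
  })
}

app <- shinyApp(ui, server)

# Launch only in an interactive session.
if (interactive()) runApp(app)
```

The plot re-renders whenever either input changes, which is what makes a preconfigured analysis feel interactive without any JavaScript being written.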
Example: water gauge data >
The R extension package sensorweby [6] provides R functions to deliver the JS client through the Shiny web framework and extends it with R’s advanced analysis functions. It is not possible to connect to the Shiny server component directly, as a specific JavaScript file and session management are only available through the running app. In other words, Shiny is not a generic processing server but an integrated client-server solution.
A screenshot of the user interface integrating a complex windrose plot, which is readily available in R through the openair package [7], is shown below. In this example application, time series data is loaded from both proprietary and standardized sensor web APIs (e.g. OGC Sensor Observation Service [15]) using packages such as sensorweb4R [13] or sos4R [14], but geospatial datasets can also be easily imported into R from web services such as WMS or WFS using rgdal.
WPS4R [16] is a processing backend for the 52°North WPS framework [17], which allows R script developers to publish annotated R scripts as a standardized web service. It is a platform on which web developers, GIS experts and analysts collaborate on scientific and reproducible geoprocesses using a standardized web interface in a service-oriented architecture.
Well-defined source code comments are added to an R analysis script, which is uploaded to the WPS server. The comments are parsed to create the required process metadata and to map the web interface to an R workspace, i.e. populate input data provided to the process and return output data from the R algorithm to the requesting client application.
Process description for simple calculator
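An annotated script for such a calculator could look roughly like this. The annotation lines are modeled on the WPS4R documentation as best recalled and should be checked against the wiki before use; the process id and the input names a and b are made up for the example.

```r
# Annotation syntax below is an approximation of the WPS4R conventions;
# verify keys and separators against the WPS4R wiki.

# wps.des: id = demo.calculator, title = Simple calculator, abstract = Adds two numbers;

# wps.in: id = a, type = double, title = First operand, value = 1;
# wps.in: id = b, type = double, title = Second operand, value = 2;

# On the server the inputs are placed in the R workspace before the
# script runs. The fallbacks keep the script runnable locally.
if (!exists("a")) a <- 1
if (!exists("b")) b <- 2

result <- a + b

# wps.out: id = result, type = double, title = Sum of a and b;
```

Because the annotations are ordinary comments, the same file runs unchanged in a local R session, which is what lets researchers publish existing scripts with little extra work.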
The reproducibility use case implemented with WPS4R deals with a global land use analysis conducted by Vaclavik et al. [18]. The complex analysis consists of several preprocessing steps, a self-organizing map algorithm (SOM) for classification of land use, and a visualisation of the derived land use archetypes. The second step was done in R and published as a reproducible geoprocess.
The following screenshot partially shows a web client [19] that allows users to change specific input parameters and run the SOM based on the global input data, which is available on the server and cannot all be made public. The lower part of the screenshot shows partial outputs of the process, e.g. the classified data and plots. The client is based on wps-js [21], a form generator for WPS process descriptions supporting client-side preconfiguration to simplify complex processes. It is embedded into a so-called “story page” introducing the topic, and the analysis scripts can be downloaded.
User choice: 16 classes
Map of global land use archetypes calculated for 16 classes using a WPS4R-based geoprocess. The original paper’s classes were manually labelled and appropriately named. This is not possible for the automatic classification, which will not give the same name to the same detected pattern in subsequent runs of the algorithm.
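Why class names do not carry over between runs can be shown with a small experiment. The sketch uses k-means from base R as a stand-in for the SOM (a deliberate substitution to avoid extra packages) on synthetic, well-separated clusters: two runs find the same groups but may number them differently.

```r
# Three well-separated synthetic clusters (invented data).
set.seed(7)
centers <- rbind(c(0, 0), c(10, 10), c(20, 0))
pts <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centers[i, 1], 0.5), rnorm(50, centers[i, 2], 0.5))
}))

# Two runs with different random seeds; k-means stands in for the SOM.
set.seed(1)
run1 <- kmeans(pts, centers = 3, nstart = 50)$cluster
set.seed(2)
run2 <- kmeans(pts, centers = 3, nstart = 50)$cluster

# Cross-tabulate the two labelings: when the partitions agree, each row
# has exactly one non-zero cell, even if the class numbers differ.
tab <- table(run1, run2)
print(tab)

# Map run2's labels onto run1's numbering via the largest overlap.
relabel <- apply(tab, 2, which.max)
run2_aligned <- unname(relabel[as.character(run2)])
```

Aligning labels by overlap only makes numberings comparable between runs; meaningful names such as those in the original paper still need manual interpretation.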
Screenshot of some of the output plots, based on a mere 100 sampling points for illustration purposes: (a) classification information (output of the self-organizing map) showing how the input variables (colour-coded) influence each of the output classes (circles at the top); (b) global plot of "property-distance", which is a measure of classification quality; (c) the global land use archetypes map in a very coarse version; (d) statistical background information of the self-organizing map.
Services are nothing without clients
Shiny allows R functionality to be reused quickly and immediately in web applications instead of being reimplemented in more traditional web languages (e.g. Java, JavaScript). While the development and interface are very straightforward and require manageable effort, in its current state the solution does not support service-oriented architectures but remains a standalone solution. Support of classical geospatial analysis and visualisation is demonstrated by examples from the Shiny gallery [8], and of course potential applications go beyond selecting parameters and displaying plots: they can also accept and return spatial data as files.
The two presented open source software stacks demonstrate the variety of web (geo)processing use cases and readily available tools. On the one hand there is an extremely user- and developer-friendly framework, which allows non-experts, such as domain scientists, to analyse data in R and publish the results online for human users. Using an “app” approach, specific analyses can be published as focussed tools with limited functions. On the other hand there is a more complex framework, which is built on standards and allows integration into GIS software and spatial data infrastructures (SDI). wps-js shows that it is also possible to build easy-to-use interfaces on top of a complex implementation standard. The WPS-based solution has the advantages of standardization and supports very different abstraction levels, whereas the Shiny-based application makes it possible to bring analyses to the web with only a few lines of R code.
In this presentation we demonstrate how R can both benefit from SDIs by integrating complex data services via R packages and enhance SDIs by providing processing functionality through a standardized web service. We take two use cases as examples and show that suitable state-of-the-art tools make it possible to implement the diverse requirements for geoprocessing on the web. Future work for WPS4R is expected in the areas of usability (e.g. four-eyes principle) and security (e.g. sandboxing), and for sensorweby regarding spatio-temporal analysis of multiple measurement stations.
http://en.wikipedia.org/wiki/Law_of_the_instrument
The concept known as the law of the instrument, Maslow's hammer, gavel or a golden hammer is an over-reliance on a familiar tool; as Abraham Maslow said in 1966, "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."