B00624300 Alfredo Conetta EGM701 MSc Research Paper
Research Article
A quantitative assessment of the quality of the OpenStreetMap
primary route network across urban areas of Africa.
Alfredo Conetta
MSc Student
University of Ulster
Dr Sally Cook
Department of Environmental Science
University of Ulster
Abstract
The massive increase in the creation, availability, and use of volunteered geographic
information, and in particular OpenStreetMap, has led to a number of studies with the
purpose of assessing its quality. These studies have largely focused on areas where the project
has best taken hold: Germany, France, the United Kingdom and the United States. This study
addresses the less studied continent of Africa and in particular Nigeria, Sierra Leone, and
Kenya, and focuses on a comparison of the main route network for positional accuracy and
completeness against a reference dataset, the Multinational Geospatial Co-production
Programme data. The positional accuracy results for the buffer analysis showed that 95% of
the OpenStreetMap test data lay within a buffer of 8.5m for Maiduguri, 9.5m for Nairobi, and
13m for Freetown. These distances are comparable to studies in the UK (Haklay, 2010) and
France (Girres and Touya, 2010). The positional accuracy for coincident junctions showed an
average distance between junctions of 4.07m for Maiduguri, 4.65m for Nairobi, and 5.55m for
Freetown. These figures for positional accuracy would be suitable for many applications of
spatial data. The relative completeness assessment showed that the OpenStreetMap covered
98% of the Reference dataset for Freetown, 93% for Nairobi, and 64% for Maiduguri. These
completeness results suggest that OpenStreetMap of the urban areas in Africa may provide an
alternative source of road data.
1. INTRODUCTION
Following the decision of President Clinton to remove selective availability from civilian GPS in
May 2000, the positional accuracy of GPS technology increased by up to 10 times overnight
(Clinton, 2000). Investment in research and development by commercial vendors increased
dramatically and the technology began to migrate onto other platforms, and eventually the
mobile phone. Handheld GPS receivers reduced in size, became more affordable, and their
applications to recreation and vehicle navigation became more commercial (Haklay and Weber,
2008). With the conditions set, a viable Location Based Services (LBS) industry flourished and is
still flourishing to this day. Couple this with the ongoing evolution in the internet from Web 1.0,
where users could ‘search and access’ static information (Dodge and Kitchen 2011, p.2), to Web
2.0 where the user has much more interaction, creating, editing, and sharing their own content
(O’Reilly, 2005), and you have huge potential. Web 2.0 provided fertile ground for sites such as
Wikipedia to encourage collaboration of users and the ability for users to upload their own data,
and edit the data of others, and it is in the ‘army of editors and checkers’ that the real value lies
(Carr, 2007).
This user information supplied under a Web 2.0 environment has become known as user
generated content (UGC), or when of a geospatial nature, Volunteered Geographic Information
(VGI) (Goodchild, 2007). Web 2.0 also provides the underlying structure to support
crowdsourcing as a resource for projects. The ‘crowd metaphor signifies the power that can
emerge from a mass of individuals converging to tackle a set of tasks’ (Dodge and Kitchen 2011,
p.2). One such project that utilises this approach, and probably the best known VGI project
(Haklay 2010, Mooney et al 2011, Ramm et al 2010), is Steve Coast’s OpenStreetMap.
From its inception in 2004 the goal of OpenStreetMap has been to provide a free street level
dataset covering the world (Ramm et al, 2011). It has spread from the origin in London across the
UK, through Europe and across the Atlantic to the United States, and now has a volunteer
population of more than 2 million registered users (Apr 2015) and touches every continent on
Earth. This rapid spread has been helped largely by the advent of social media, geo tagging, GPS
enabled smart devices and the exponential growth in recent years of the LBS.
The job of mapping the earth accurately has historically rested in the hands of highly trained
cartographers in the military or National Mapping Agencies (NMA) (Haklay and Weber, 2008).
This type of data came from a recognised authoritative source and had been created by qualified
geospatial professionals, providing a trust in the data that is currently not present in VGI.
OpenStreetMap data is created and contributed to by volunteers who may lack cartographic
training, and only have limited restrictions on the methods of creation. The OpenStreetMap
project provides registered users with the power to create, edit, and add information to the
project using either the online Potlatch2 GUI, or the offline JOSM software to trace information
from images. Alternatively users can upload tracks from various GPS enabled platforms; mobile
phones being one. In the early years of the OpenStreetMap project many of the contributions
came from users uploading tracks from handheld GPS, with volunteers concentrating on ensuring
the road networks were captured (Neis & Zielstra, 2014). This lineage adds increased uncertainty
over accuracy; the positional accuracy of GPS devices varies considerably from vendor to vendor,
as does the ability to digitise accurately. Users must register to be allowed to add data; however,
there is still a lack of any quality assurance procedures prior to entering features into the
database, and there is a reliance on other registered users to check and correct data; this adds to
the lack of trust in the data. Similar projects such as Wikipedia successfully use a self-governing
policy whereby a user who adds spurious or incorrect data will have that data removed or
corrected by other users.
OpenStreetMap uses large numbers of volunteers as a kind of ‘crowdsourcing’ (Tapscott et al,
2007), which may actually improve its quality if Linus’s law is to be believed; the more volunteer
contributors to a project the higher the quality; also known as the ‘many eyes principle’ (Haklay
et al, 2010). ‘Crowdsourcing’ enables areas of the world that are in crisis to be mapped rapidly by
thousands of volunteers (Mullins, 2010), in the case of OpenStreetMap this could be in excess of
2 million. These figures can be a little flattering as OpenStreetMap also exhibits a similar trend to
Wikipedia, and studies have shown that only a small number of registered users actually
contribute. Zipf’s law of distribution (90:9:1) has also been found to hold true for
OpenStreetMap, and a study by Neis and Zipf (2012) found that approximately 5% of registered users
complete 90% of the transactions in the database.
The success of the OpenStreetMap can be attributed to its advantages of responsiveness and
flexibility (Girres and Touya, 2010), which enables it to stay current, outpacing conventional
mapping techniques. Traditionally national mapping is revised on a cycle of 2-5 years, causing an
ingrained lack of currency.
The aim of this study is to fill the knowledge gap concerning the quality of OpenStreetMap data
within the continent of Africa by assessing the positional accuracy and completeness of
OpenStreetMap against a common baseline reference dataset. The study covers three separate
areas from three different countries across the West and East of Africa.
1.1 Assessing the Quality of OpenStreetMap
Cartographers, geographers and geoscientists have struggled with the issue of data quality since
the first maps were made; this has become further compounded with the advent of Geographical
Information Systems and investigations into the propagation of error through geospatially based
models (Chrisman, 1991).
The International Organization for Standardization (ISO) is the overarching body for standards for
geospatial data. The new ISO standard 19157 (formerly 19113, 19114, 19138) provides the
principles by which spatial data quality is measured. The organisation states that the purpose of
describing the quality of geospatial information is ‘..to facilitate the comparison and selection of
the data set best suited to application needs or requirements’ (ISO 19157). These standards
provide guidance for the NMAs, Militaries, and Commercial companies throughout the world.
The reservations that many of these organisations have with projects such as OpenStreetMap
are understandable when we consider that OpenStreetMap is not created by professional
cartographers and does not adhere to ISO 19157. Despite its power, crowdsourcing does worry
some, Keen (2007) articulating a concern that ‘it represents a disturbing trend that increases the
influence of amateurs at the expense of legitimate experts’.
The ISO standard uses a number of principles to define the quality of geospatial data.
• Completeness – The presence of an object in a dataset. This covers the omission as well
as the commission of objects.
• Logical consistency – The absence of conflicts in the dataset.
• Positional Accuracy – The accuracy of the position of the object in relation to its actual
real world position.
• Temporal Accuracy – How the provided temporal information is temporally consistent
with the data.
• Thematic Accuracy – This relates to how accurate the descriptive information for an
object is.
The use or usefulness is often mentioned as one of these quality principles but this is largely
down to the trust that the user has in the information that they are using, and its suitability for the
intended purpose. With the MGCP Reference dataset, its rigorous adherence to standards, and
its published accuracy statements, the trust is fairly inherent in the data.
Despite the obvious advantages of being free there are still barriers to OpenStreetMap, in
particular there are questions that still need answering in regards to its quality, and what this
means for its potential use. There is little doubt that many features in the OpenStreetMap data
are of good quality, as has been borne out by the studies conducted against the OS data for
London (Haklay 2010, Kounadi 2009). The issue is that not everyone who contributes to the data
adds quality data (Ciepluch et al, 2011). The majority of the respondents to a Data Quality
survey (DWG's 2007/2008) identified that companies were ‘willing to pay more for higher quality
data in their projects, if they could just be sure that the quality was there’; quality may be more
important than cost. Despite numerous studies it is hard to determine if the quality measured is
repeatable in different locations.
Many aspects of spatial data quality are hard to quantify; this has led to studies generally focusing
on assessing the more quantifiable aspects of quality, such as positional accuracy and
completeness. How these qualities are assessed has generally varied in nature from comparison
against an authoritative Reference dataset (Goodchild and Hunter 1997, Haklay and Weber 2008,
Kounadi 2009, Ather 2009, Haklay 2010, Koukoletsos et al 2011, Ludwig et al 2011, Zhou et al
2014), to studies focussing on assessing less tangible information from the database, the amount
of times a feature has been edited (Mooney and Corcoran, 2012), the motivation of the
contributors (Coleman et al 2009, Budhathoki and Haythornthwaite 2012, Begin et al 2013), and
how these things affect the quality. The measure of accuracy of the thematic data has received
some attention, but perhaps understandably, not as much as positional accuracy and
completeness.
An early use of the increasing buffer for assessing the positional accuracy was the comparison
study into the positional accuracy of the OpenStreetMap road network data against Ordnance
Survey (OS) data completed by Haklay (2010), who found that it was ‘fairly accurate’ with most of
the data within 6m of the OS data. But the issue of quality did not escape Haklay who also makes
mention of the lack of an integrated quality-assurance mechanism, something that is very much
a continued barrier to even more widespread usage of OpenStreetMap. Other studies continued
in the home of OpenStreetMap with Ather (2009) and then Kounadi (2009) both using buffers to
assess positional accuracy of OpenStreetMap datasets for areas of London. Girres and Touya
(2010) took the work a step further and made a quality assessment of the French OSM data in
which they extended the amount of quality elements that were assessed: Geometric accuracy;
attribute accuracy; Completeness; Logical consistency; Semantic accuracy; Lineage; Usage.
Probably the most densely populated country in the OpenStreetMap database is that of
Germany. Zielstra and Zipf (2010) assessed the growth of the OpenStreetMap data for Germany
against the growth of Tele Atlas data, which showed that the OpenStreetMap had grown at a
significantly faster pace than the Tele Atlas data and that for the five towns under study the
differences in the data were shrinking as a result.
Studies have continued to branch out with one in Tehran (Forghani and Delavar, 2014) which
used a grid to divide both the Test and Reference datasets into 1km² grid cells so that the roads
could be compared for completeness and the results visualised as a grid. There was a flaw in this
methodology: the roads were not matched before the assessment, which could result in a 100%
completeness rating because the road lengths in both datasets are the same, not
necessarily because they are the same roads. That study also used the increasing buffer technique to assess the
positional accuracy of the OpenStreetMap dataset.
For the data to be used by geospatial professionals it requires a quality assessment that will
determine which applications the data can be used for, in other words, its ‘fit for a particular
purpose’ (Goodchild 2008, Haklay and Weber 2008).
2. METHODOLOGY
An assumption that underpins this study and subsequent analysis is that the Reference dataset
representing the Primary roads is of higher quality spatially than the Test dataset, is consistent in
terms of its quality (Haklay 2010), and that the measurement of completeness is a relative
measurement against the data contained in the Reference dataset (Zielstra and Zipf, 2010). Due
to the continued updating of OpenStreetMap it has the ability to stay very current in areas that
are of interest to the contributors, while other areas may be less well mapped and not have a
uniform density of coverage. The outline methodology employed for this study can be seen in
Figure 1.
Figure 1- Employed methodology.
2.1 Project Study Areas
The continent of Africa has received little attention in OpenStreetMap quality studies, perhaps
due to the cost and availability of a reference dataset with sufficient quality for comparison.
Many of the countries of Africa do not have the resources to create and maintain their own
authoritative geospatial data, and the extensive cost of commercial data is often prohibitive. It is
therefore even more important for people wishing to use mapping data for the continent of
Africa to understand the quality of free sources of data such as OpenStreetMap. This study
focused on urban areas in three African countries, enabling not only an assessment of the quality
of data in Africa but also an insight into the quality of OpenStreetMap in different areas of Africa.
The first area to be studied was Maiduguri in Nigeria (Figure 2), Africa’s most populous country
and also boasting the best internet penetration in Africa. The second area was Freetown in Sierra
Leone (Figure 3), chosen as an area that has limited internet penetration, and the last area was
Nairobi Kenya (Figure 4), chosen as a heavily populated African capital city.
Figure 2- Study Area 1, Maiduguri in Nigeria. Figure 3- Study Area 2, Freetown in Sierra Leone.
Figure 4- Study Area 3, Nairobi in Kenya.
2.2 Reference Dataset
Many studies of the quality of OpenStreetMap have been conducted using the comparison
against a Reference dataset (Flanagin and Metzger 2008, Doan et al 2011, Haklay 2010), where
the Reference dataset is from an authoritative source for the country being studied. This study
utilises a Reference dataset that has been created over a number of countries by the same
organisation, UK Defence. This provides the ability to compare OpenStreetMap from different
countries against a common ‘baseline’ Reference dataset; as OpenStreetMap is usually created
by contributors who come from the areas to which they contribute, this study further enables a
comparison of the differences between the contributions from different countries.
The Reference data for this study came from the UK’s contribution to the Multinational
Geospatial Co-production Program (MGCP), and was supplied by the Defence Geographic Centre.
The MGCP is not a dissimilar programme to the OpenStreetMap project, with the aim of the
participating nations being to contribute to the creation of a complete database of the world; however,
this is a dataset that is used to create military topographic mapping. The dataset is created by
the mapping agencies of participating nations, collected to a rigid specification aimed at mapping
126 layers containing features that could be used to produce a 1:50,000 MGCP Derived Graphic for
military operations. The specification states that the positional accuracy of features should be at
least 25m circular error; the extracted feature will be within 25m of its true location on the earth
95% of the time.
Despite being tied to a 1:50,000 product, the scale relates to the density of extracted features
rather than its positional accuracy, which is reflected in the absence of the smaller residential
roads. The reality is that the MGCP dataset is collected from HR satellite imagery and other
sources at a scale of 1:2,000 (See Figure 5). The dataset was released in 2011 but has been
created from a number of sources with dates ranging from 2003-2011.
Figure 5- Data sources for MGCP. (Farkas, 2009)
It is important that the Reference data is of sufficient accuracy to be considered to be a fair
representation of the truth. The dataset is positionally accurate to within metres as can be seen
in Figure 6. An initial assessment of manually digitised junctions from high resolution imagery
was used to ensure the credibility of the Reference data. From a sample of 428 digitised
junctions the MGCP junctions were on average within 3.6m. This image shows a random pair of
junctions with the associated Reference dataset road network with the data within metres of the
actual centre line of the road.
Figure 6- Visual assessment of Reference roads in Maiduguri.
The dataset has been created by cartographically trained professionals to a rigid specification.
There are numerous layers collected during the MGCP production process, amongst which are
the main road network, including motorways, A roads and B roads. The residential roads are not
collected unless they are of strategic importance.
During the creation of the MGCP dataset analysts extract the road data by digitising the centre
line of the roads; this is not the case for the OpenStreetMap data, which captures the road in
both directions. This issue with the different representation of the road network was also
experienced by Ather (2009). An advantage of the MGCP data over datasets used in other studies is
that it has an attribute for the road width that is
mensurated from high resolution imagery by a trained image analyst; this enabled a more
accurate measurement to be used in the buffer operations, as each of the roads can be buffered
to reflect the actual width of the road, as opposed to a suggested norm being used.
2.3 Test Datasets
The Maiduguri Test dataset was downloaded from the OpenStreetMap website using the
OpenStreetMap editor tools for ArcGIS desktop at the start of the project in November 2014. The
OpenStreetMap data for Freetown and Nairobi were downloaded as shapefiles from the website
Geofabrik in January 2015.
2.4 Data Matching
For the OpenStreetMap Test dataset to be compared against the Reference dataset, it required
some form of data matching to be carried out to ensure that the comparison was meaningful
(Ellul et al, 2012). To start this process of ensuring features could be compared both datasets
were stripped down to contain only the motorways, trunk, primary and secondary roads.
Other studies have looked at the possibility of automating the matching process, with varying
degrees of success, the main issue with trying to automatically match features being that
OpenStreetMap test datasets are hampered by their heterogeneity. This was particularly noted
as a problem with the study conducted over France (Girres and Touya, 2010). This is due to
contributors having freedom to create data without adherence to a ‘rigid’ specification and the
differences in methods of creation, either GPS uploads or the digitisation of features at varying
scales and with varied accuracy.
On visual inspection it was evident that there was a good match between features of both
datasets; it was however necessary to remove the residential roads from the Test dataset as they
were not represented in the Reference dataset. To accomplish this a 25m buffer was created
around the Reference dataset, which was then used to select all the features from the Test
dataset that had their centroid within the buffer. This removed a large majority of the residential
roads, and roads that had no partner feature in the Reference dataset. Figure 7 shows the buffer
with the selected roads highlighted in cyan. The dark blue roads represent roads that have no
matching road in the Reference dataset. A large amount of manual editing was needed to
simplify the representation of certain OpenStreetMap junctions at some
locations. This was largely down to the extremely detailed configuration of the junctions in the
OpenStreetMap data and their simplistic representation in the Reference dataset.
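The centroid-based selection described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the ArcGIS workflow used in the study: the function names (point_segment_distance, line_centroid, select_by_centroid) and the toy coordinates are hypothetical, coordinates are assumed to be in a projected system with metre units, and every road is assumed to have a non-zero length.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_line(p, line):
    """Shortest distance from point p to a polyline given as a list of (x, y)."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def line_centroid(line):
    """Length-weighted centroid of a polyline (assumes non-zero length)."""
    total = cx = cy = 0.0
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        seg = math.hypot(bx - ax, by - ay)
        total += seg
        cx += seg * (ax + bx) / 2.0
        cy += seg * (ay + by) / 2.0
    return (cx / total, cy / total)

def select_by_centroid(test_roads, reference_roads, buffer_m=25.0):
    """Keep Test roads whose centroid lies within buffer_m of any Reference road."""
    selected = {}
    for name, line in test_roads.items():
        c = line_centroid(line)
        if any(distance_to_line(c, ref) <= buffer_m for ref in reference_roads):
            selected[name] = line
    return selected

# Toy data (hypothetical): one Reference road and two Test roads.
reference = [[(0.0, 0.0), (1000.0, 0.0)]]
test = {
    "primary": [(0.0, 8.0), (1000.0, 8.0)],         # runs alongside the Reference road
    "residential": [(500.0, 0.0), (500.0, 300.0)],  # side street, centroid 150m away
}
matched = select_by_centroid(test, reference)
print(sorted(matched))
```

In this toy case only the road running alongside the Reference road is kept. The side street touches the Reference road but its centroid lies outside the 25m buffer, which is the same behaviour that made splitting necessary for long mixed features.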
Figure 7- Selection by location with centroid enclosed.
The selection by location still left a number of roads within the dataset that needed manual
attention; some examples are explained below. Figures 8 and 9 show road
features that have been selected by the analysis but have no feature in the Reference dataset to
be compared against, and were therefore manually deleted.
Figure 8- Feature extending. Figure 9- No matching feature.
In Figure 10 the road highlighted by the red arrow clearly fits into the buffered selection but has
not been included in the selection as its centroid is outside of the buffered area. This was
resolved by splitting the feature to enable its subsequent inclusion in the selection. Figure 11
shows the extension of roads that have no element for comparison; again these were manually
clipped to the edge of the buffer.
Figure 10- Feature needs inclusion. Figure 11- Feature needs reducing.
In Figures 12 and 13, the highlighted lines in cyan show two examples where
the digitisation of the road had created a single feature that moved from B Road to residential
road. This was corrected by splitting the feature within the buffer zone ensuring that the
residential part of the feature would have a centroid outside of the buffer, and as a result would
not be selected with the following selection by location.
Figure 12- Feature needs splitting. Figure 13- Feature needs splitting.
Both the quality of position and completeness were assessed on the Test datasets after they had
been data matched with the Reference dataset, and as such the results are compared against the
Reference dataset and not the reality of the ever changing element of real life.
2.5 Assessing Positional Accuracy
The primary analysis was to find out the positional accuracy of the Test dataset using the
increasing buffer method originally developed by Goodchild and Hunter (1997). Buffers were created
around the Reference dataset, in a similar manner to Haklay (2010) and Forghani and Delavar
(2014). Previous studies have concentrated on a standard buffer
distance dependent on the type of road, gradually increasing until the Test roads encapsulated
by the buffer reach the 95th percentile. This study improves this method by using buffer widths
that are derived from imagery analysis of the actual widths of the individual roads (available
from an attribute in the Reference dataset). To provide some consistency with other studies, a
number of buffers were created including one that reached the 95th percentile. In all instances
the buffer was used to clip the Test dataset, providing a measurement of the road length that was
encapsulated by the buffer.
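The increasing buffer idea can be illustrated with a minimal pure-Python sketch. It is an approximation under stated assumptions rather than the clip operation used in the study: the Test line is cut into pieces of about 1m and each piece is counted as inside the buffer if its midpoint lies within the buffer distance of the Reference line. The function names, the fixed list of buffer widths and the toy geometry are hypothetical, and the sketch uses plain distances instead of the per-road widths taken from the Reference attribute.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_line(p, line):
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def length_within(test_line, ref_line, buffer_m, step_m=1.0):
    """Approximate (inside, total) length of test_line within buffer_m of ref_line."""
    inside = total = 0.0
    for (ax, ay), (bx, by) in zip(test_line, test_line[1:]):
        seg = math.hypot(bx - ax, by - ay)
        n = max(1, math.ceil(seg / step_m))
        piece = seg / n
        for i in range(n):
            t = (i + 0.5) / n  # midpoint of the i-th piece
            mid = (ax + t * (bx - ax), ay + t * (by - ay))
            total += piece
            if distance_to_line(mid, ref_line) <= buffer_m:
                inside += piece
    return inside, total

def first_buffer_reaching(test_line, ref_line, widths, target=0.95):
    """Smallest buffer width (from an increasing list) enclosing the target share."""
    for w in widths:
        inside, total = length_within(test_line, ref_line, w)
        if inside / total >= target:
            return w
    return None

# Toy data (hypothetical): 900m of the Test road is offset by 4m and 100m by 12m.
reference = [(0.0, 0.0), (1000.0, 0.0)]
test = [(0.0, 4.0), (900.0, 4.0), (900.0, 12.0), (1000.0, 12.0)]
print(first_buffer_reaching(test, reference, [5, 10, 15]))
```

With these toy values the 5m and 10m buffers enclose just under 90% of the Test length, so the 15m buffer is the first to reach the 95% threshold.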
The second method used to assess the positional accuracy of the Test dataset was the
identification of known points in both datasets (Girres and Touya 2012, Helbich et al 2012), and
measuring the Euclidean distance between them. The junctions were created for both the
Reference and Test datasets using the network analyst extension in ArcGIS 10.1. After a visual
assessment a buffer of 10m was created around the Reference junctions, which was then used to
select the junctions from the Test dataset that were coincident with the Reference dataset. A
near analysis was then used to get the Euclidean distance between both representations of the
junction.
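A minimal sketch of the junction comparison is given here, assuming projected coordinates in metres. It pairs each Reference junction with its nearest Test junction and keeps the pair only if the two lie within the 10m search distance, which approximates the buffer selection followed by the near analysis. The function name and the toy points are hypothetical, and the population standard deviation is used only because the text does not state which form was reported.

```python
import math
import statistics

def coincident_junction_distances(reference_pts, test_pts, search_m=10.0):
    """Distance from each Reference junction to its nearest Test junction,
    kept only when that distance is within search_m."""
    distances = []
    for rx, ry in reference_pts:
        nearest = min(
            (math.hypot(tx - rx, ty - ry) for tx, ty in test_pts),
            default=None,
        )
        if nearest is not None and nearest <= search_m:
            distances.append(nearest)
    return distances

# Toy data (hypothetical): the third Reference junction has no Test partner.
reference_junctions = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0)]
test_junctions = [(3.0, 4.0), (100.0, 6.0), (260.0, 0.0)]

d = coincident_junction_distances(reference_junctions, test_junctions)
summary = {
    "nodes": len(d),
    "max": max(d),
    "mean": statistics.mean(d),
    "std_dev": statistics.pstdev(d),  # population form, an assumption
}
print(summary)
```

Because only pairs within the search distance are kept, the maximum reported distance can never exceed that distance, so the mean and maximum describe coincident junctions only and not junctions that are missing or badly displaced.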
2.6 Assessing Completeness
The measure of completeness of OpenStreetMap is one of the most difficult assessments to
conduct; this is due to the OpenStreetMap dataset existing in a continued state of flux as
contributors are continually adding new features. It is therefore difficult to find a Reference
dataset that is as current or complete as the Test dataset. It is for this reason the Test dataset
must be matched to the Reference dataset and the completeness assessment is therefore a
relative assessment (Zielstra and Zipf, 2010).
Forghani and Delavar (2014) compared completeness by clipping the data into a 1km² grid and
comparing each 1km² for completeness, measuring completeness as a function of road length of
both the Test and Reference datasets within that 1km². This may provide a 100% completeness
rating even if none of the roads in the 1km² coincide with each other. To overcome this issue the
completeness in this study was carried out on the data after it had been matched with the
Reference dataset, removing errors of commission. Commission errors are additional roads
which have been captured in error and are included in the Test dataset.
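The difference between a length-only comparison and a comparison made after matching can be shown with a small worked example. The road identifiers, lengths and the matches lookup are hypothetical and serve only to illustrate the reasoning; they are not taken from either dataset.

```python
def length_ratio(test_lengths, reference_lengths):
    """Completeness as total Test length over total Reference length (no matching)."""
    return 100.0 * sum(test_lengths.values()) / sum(reference_lengths.values())

def matched_ratio(test_lengths, reference_lengths, matches):
    """Completeness counting only Test roads matched to a Reference road.

    matches maps a Test road id to the Reference road id it represents.
    """
    matched = sum(
        length
        for road, length in test_lengths.items()
        if matches.get(road) in reference_lengths
    )
    return 100.0 * matched / sum(reference_lengths.values())

# One hypothetical 1km cell: the Test road is a different road of equal length.
reference_cell = {"ref_A": 600.0}
test_cell = {"osm_X": 600.0}
matches = {}  # no Test road corresponds to a Reference road

unmatched_score = length_ratio(test_cell, reference_cell)
matched_score = matched_ratio(test_cell, reference_cell, matches)
print(unmatched_score, matched_score)
```

The length-only score is 100% even though no road coincides, while the matched score is 0%, which is why matching before the assessment removes the effect of commission errors.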
To begin this process a 20km by 20km area of a UTM grid was downloaded from the internet and
projected to the relevant UTM zone for the town under study. To measure the completeness of
the Test dataset against the Reference dataset both datasets were clipped into 400 separate
1km² groups of features. The model in Figure 14 was used at the start of this process to split the
20km by 20km grid into 400 individual 1km² grids.
Figure 14- Model for splitting the Master Grid into individual 1km².
The model in Figure 15 was used to clip both datasets to the 400 individual
1km² cells. The resultant 1km² chips of Reference data were then dissolved and merged back to
one dataset of 400 features containing a combined road length for each of the 1km². This was
repeated for the Test dataset. Both datasets were then spatially joined with the Master Grid so
they inherited the Object ID for the 1km² within the grid. This Object ID was then used to join
both of the individual Reference and Test datasets to the relevant Cell within the Master Grid,
and with it the road length data allocated to each 1km². A column was then added to the Master
Grid attribute table that enabled a calculation to work out the percentage of the Test dataset
that covered the Reference dataset.
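The per-cell calculation can be sketched in plain Python as follows. This is an approximation under stated assumptions and not the ArcGIS models shown in the figures: lines are cut into pieces of about 1m and each piece is assigned to the 1km cell containing its midpoint, instead of being clipped exactly at cell edges. The function names and the toy coordinates are hypothetical, and the Test data is assumed to have been matched to the Reference data beforehand.

```python
import math
from collections import defaultdict

def length_per_cell(lines, cell_m=1000.0, step_m=1.0):
    """Sum line length per grid cell, keyed by (column, row) index."""
    totals = defaultdict(float)
    for line in lines:
        for (ax, ay), (bx, by) in zip(line, line[1:]):
            seg = math.hypot(bx - ax, by - ay)
            n = max(1, math.ceil(seg / step_m))
            for i in range(n):
                t = (i + 0.5) / n  # midpoint of the i-th piece
                x, y = ax + t * (bx - ax), ay + t * (by - ay)
                cell = (int(x // cell_m), int(y // cell_m))
                totals[cell] += seg / n
    return totals

def completeness_by_cell(test_lines, reference_lines):
    """Test length as a percentage of Reference length for each Reference cell."""
    test = length_per_cell(test_lines)
    ref = length_per_cell(reference_lines)
    return {
        cell: 100.0 * test.get(cell, 0.0) / ref_len
        for cell, ref_len in ref.items()
    }

# Toy data (hypothetical): the Test road stops halfway through the second cell.
reference = [[(0.0, 500.0), (2000.0, 500.0)]]
test = [[(0.0, 505.0), (1500.0, 505.0)]]
result = completeness_by_cell(test, reference)
print(result)
```

Cells that contain no Reference road are left out of the result, since the percentage is defined relative to the Reference length and would otherwise require a division by zero.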
Figure 15- Model for clipping the Reference and Test datasets to the 1km grid cells.
As well as looking at each individual 1km² grid cell it is of interest to look at the tendency of
OpenStreetMap to have good coverage in the urban centres but limited coverage in the more
rural areas. To provide a basic assessment of this phenomenon a buffer was created at a
distance of 5km from a nominal centre of the urban area which provided a figure for road length.
This road length was then compared to a figure created by a buffer 10km from the same nominal
centre. By subtracting the length of roads in the 5km from the same figure for the 10km buffer it
was possible to get the figure for road length in the band between 5km and 10km (Rural). Figure
16 shows the two buffers used.
Figure 16- Model for clipping the Reference and Test datasets to the 1km² grid cells.
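The subtraction used to isolate the band between the two buffers is simple arithmetic, sketched here with hypothetical road lengths in metres. The function names and all figures are illustrative assumptions and are not results from the study.

```python
def band_lengths(within_inner_m, within_outer_m):
    """Split the length inside the outer buffer into the inner zone and the band."""
    return {
        "inner": within_inner_m,
        "band": within_outer_m - within_inner_m,
    }

def coverage_by_band(test, reference):
    """Test length as a percentage of Reference length for each zone."""
    return {zone: 100.0 * test[zone] / reference[zone] for zone in reference}

# Hypothetical lengths inside the 5km and 10km buffers.
test_bands = band_lengths(within_inner_m=30000.0, within_outer_m=45000.0)
ref_bands = band_lengths(within_inner_m=32000.0, within_outer_m=62000.0)

coverage = coverage_by_band(test_bands, ref_bands)
print(coverage)
```

With these illustrative figures the Test data covers 93.75% of the Reference length inside 5km but only 50% in the 5km to 10km band, which is the kind of drop in coverage away from the centre that the assessment is designed to expose.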
3. RESULTS
This section details the results for the data qualities of positional accuracy and completeness of
the individual areas independently starting with Maiduguri in NE Nigeria, followed by Freetown
in Sierra Leone, and finally Nairobi in Kenya. The meaning of the results will be discussed in
the next section.
3.1 Data Matching Maiduguri
Figure 17 below shows the result of the data matching process.
Figure 17- Data Matched full length comparison Maiduguri.
3.2 Full Length Comparison Maiduguri
The result in Table 1 shows the coverage of the Test dataset against the coverage of the
Reference dataset.
Total Reference dataset   791801m   N/A
Total Test dataset        509494m   64%
Table 1- Data Matched total length comparison.
3.3 Positional Accuracy Maiduguri (Buffer- Increasing buffer)
The results in Table 2 show the amount of the Test dataset that is encapsulated by the various
buffer widths. The road width is buffered to a number of distances that represent the actual road
width on the ground. The bracketed column shows the distance where the 95th percentile is reached.
Buffer   Road Width   5m        8m        [8.5m]      9m        10m       15m
Test     301543m      436535m   482057m   [485284m]   487887m   492182m   502523m
%        59.18%       85.68%    94.61%    [95.24%]    95.75%    96.60%    98.63%
Table 2- Positional accuracy of linear features (Roads).
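The percentages in Table 2 can be reproduced from the encapsulated lengths and the total Test length in Table 1 (509494m). The short sketch below does this and picks out the first buffer at which 95% is reached; the variable names are illustrative, while the figures are those reported in the tables.

```python
# Total matched Test length for Maiduguri, from Table 1 (metres).
total_test_m = 509494

# Buffer labels and encapsulated Test lengths, from Table 2 (metres).
buffers = ["Road Width", "5m", "8m", "8.5m", "9m", "10m", "15m"]
lengths_m = [301543, 436535, 482057, 485284, 487887, 492182, 502523]

# Share of the Test dataset enclosed by each buffer, as a percentage.
shares = [100.0 * length / total_test_m for length in lengths_m]

# First buffer whose share reaches the 95% threshold.
first_95 = next(b for b, s in zip(buffers, shares) if s >= 95.0)

for label, share in zip(buffers, shares):
    print(f"{label:>10}: {share:.2f}%")
print("95% first reached at:", first_95)
```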
3.4 Positional Accuracy Maiduguri (Junction Nodes)
Table 3 shows the result of the near analysis between coincident junctions from both datasets.
Nodes 596
Maximum 9.97m
Mean 4.07m
Std Dev 2.25m
Table 3- Positional accuracy of coincident junctions.
3.5 Completeness Assessment Maiduguri – Grid
The map in Figure 18 shows the completeness of the Test dataset as a percentage of the
Reference dataset.
Figure 18- Percentage of Test data coverage per 1km².
3.6 Completeness Assessment Maiduguri – Urban v Rural
Table 4 shows the reduction in the coverage of the test dataset as the distance from the centre of
town increases.
Distance from Town Centre   Within 5km   5km to 10km   Outside 10km
Test                        307913m      186908m       15282m
%Test                       60%          37%           3%
Reference                   315191m      364577m       187967m
%Reference                  40%          46%           14%
Table 4- Completeness from nominal centre of town.
3.7 Data Matching Freetown
Figure 19 shows the result of the data matching process.
Figure 19- Data Matched full length comparison Freetown.
3.8 Full Length Comparison Freetown
The result in Table 5 shows the coverage of the Test dataset against the coverage of the
Reference dataset.
Total Reference dataset 309401m N/A
Total Test dataset 304532m 98%
Table 5- Data Matched full length comparison.
3.9 Positional Accuracy Freetown (Buffer- Increasing Buffer)
The results in Table 6 show the amount of the Test dataset that is encapsulated by the various
buffer widths. The first buffer uses the road width attribute, which represents the actual road
width on the ground; the remaining buffers use fixed distances. The bordered box shows the
distance where the 95th percentile is reached.
Buffer  Road Width  5m       10m      12m      13m      14m      15m
Test    118815m     198357m  275728m  285785m  289548m  292178m  294000m
%       39.02%      65.13%   90.45%   93.84%   95.08%   95.95%   96.54%
Table 6- Positional accuracy of linear features (Roads).
3.10 Positional Accuracy Freetown (Junction Nodes)
Table 7 below shows the result of the near analysis between coincident junctions from both
datasets.
Nodes 252
Minimum 0.31m
Maximum 9.92m
Mean 5.55m
Std Dev 2.56m
Table 7- Positional accuracy of coincident junctions.
3.11 Completeness Assessment Freetown
The map in Figure 20 shows the completeness of the Test dataset as a percentage of the
Reference dataset.
Figure 20- Percentage of Test data coverage per 1km².
3.12 Completeness Assessment Freetown – Urban v Rural
Table 8 shows the reduction in the coverage of the Test dataset as the distance from the centre of
town increases.
Distance from centre of town  Within 5km  5km to 10km  Outside 10km
Test                          176215m     74263m       54810m
%Test                         58%         24%          18%
Reference                     179138m     75435m       54828m
%Reference                    58%         24%          18%
Table 8- Completeness from nominal centre of town.
3.13 Data Matching Nairobi
Figure 21 below shows the result of the data matching process.
Figure 21- Data Matched full length comparison Nairobi.
3.14 Full Length Comparison Nairobi
The result in Table 9 shows the coverage of the Test dataset against the coverage of the
Reference dataset.
Total Reference dataset 983357m N/A
Total Test dataset 919756m 93%
Table 9- Data Matched full length comparison.
3.15 Positional Accuracy Nairobi (Buffer- Increasing buffer)
The results in Table 10 show the amount of the Test dataset that is encapsulated by the various
buffer widths. The first buffer uses the road width attribute, which represents the actual road
width on the ground; the remaining buffers use fixed distances. The bordered box shows the
distance where the 95th percentile is reached.
Buffer  Road Width  5m       9m       9.5m     10m      15m
Test    518625m     739168m  873482m  878371m  882414m  902216m
%       56.39%      80%      94.96%   95.49%   95.94%   98.09%
Table 10- Positional accuracy of linear features (Roads).
3.16 Positional Accuracy Nairobi (Junction Nodes)
Table 11 shows the result of the near analysis between coincident junctions from both datasets.
Nodes 688
Minimum 0.22m
Maximum 9.95m
Mean 4.65m
Std Dev 2.29m
Table 11- Positional accuracy of coincident junctions.
3.17 Completeness Assessment Nairobi
The map in Figure 22 shows the completeness of the Test dataset as a percentage of the
Reference dataset.
Figure 22- Percentage of Test data coverage per 1km².
3.18 Completeness Assessment Nairobi – Urban v Rural
Table 12 shows the change in the coverage of the Test dataset as the distance from
the centre of town increases.
Distance from centre of town  Within 5km  5km to 10km  Outside 10km
Test                          275758m     507531m      136467m
%Test                         30%         55%          15%
Reference                     289076m     540597m      153684m
%Reference                    29%         55%          16%
Table 12- Completeness from nominal centre of town.
4. DISCUSSION
The continued search for a new ‘effective’ method to assess the quality of OpenStreetMap data
is a testament to the difficulties faced when assessing a dataset that is continually being
contributed to by millions of individuals with varying degrees of skill, knowledge, access and
motivation. There is, however, a pattern evident throughout many of the studies
reviewed (Goodchild and Hunter 1997, Haklay and Weber 2008, Haklay 2010, Kounadi 2009,
Ather 2009, Ludwig et al 2011, Zhou et al 2014): the use of a Reference dataset with the
buffer technique for positional accuracy, and the use of a grid system to visualise the results of a
completeness assessment. With this in mind, improvements to this approach would involve
improvements to the Reference dataset.
The initial assessment of positional accuracy used the road width attribute as the buffer
distance and was first applied to the Maiduguri Test dataset. These attribute-based buffers
encapsulated 59% of the Test dataset. To further assess the positional accuracy, the buffer
distance was increased until 95% of the Test dataset was contained by the buffer. Table 2 shows
that the buffer at 8.5m contains 95% of the Test dataset. For Freetown, a mere 39.02% of the Test
dataset was enclosed by the road width attribute buffer (Table 6), and 95% coverage was only
reached with a buffer of 13m. The final area studied, Nairobi, had only 56.39% enclosed by the
road width attribute but reached 95% when the buffer reached 9.5m. Using a method similar to
that employed in other studies allows some form of comparison, although only at a rudimentary
level, as differences in data matching and in the quality of the Reference dataset also
influence the results.
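The 95% distances quoted above can be checked against the tabulated lengths with a small sketch. It assumes that the percentages in Tables 2, 6 and 10 are relative to the total Test lengths in Tables 1, 5 and 9, which is consistent with the published figures. The dictionary layout and function name are illustrative only.

```python
# Total matched Test length (Tables 1, 5, 9) and the length inside each fixed
# buffer distance (Tables 2, 6, 10). All values are in metres.
BUFFER_RESULTS = {
    "Maiduguri": (509494, {5: 436535, 8: 482057, 8.5: 485284, 9: 487887, 10: 492182, 15: 502523}),
    "Freetown": (304532, {5: 198357, 10: 275728, 12: 285785, 13: 289548, 14: 292178, 15: 294000}),
    "Nairobi": (919756, {5: 739168, 9: 873482, 9.5: 878371, 10: 882414, 15: 902216}),
}

def smallest_buffer_reaching(total_m, inside_by_buffer, target=0.95):
    """Smallest tested buffer distance whose enclosed share meets the target."""
    for buffer_m in sorted(inside_by_buffer):
        if inside_by_buffer[buffer_m] / total_m >= target:
            return buffer_m
    return None  # target not reached at any tested distance

thresholds = {
    city: smallest_buffer_reaching(total, inside)
    for city, (total, inside) in BUFFER_RESULTS.items()
}
for city, buffer_m in thresholds.items():
    print(f"{city}: 95% reached at {buffer_m}m")
```

Because only a handful of distances were tested, each result is an upper bound: the distance at which 95% is first reached lies between the reported buffer and the one tested before it.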
These results compare well against a study on OpenStreetMap of France carried out by
Koukoletsos et al (2011) which reached 95.3% of the Test dataset inside a buffer of 15.5m. In
another study, conducted by Zhou et al (2014), buffer sizes of 11.25m and 7.5m resulted in
99.69% and 88.03%, respectively, of the Test dataset being encapsulated; these buffer distances
were again assigned according to road type. A study by Siebritz and Sithole (2014) used a
standard buffer of 10m to assess the positional accuracy of 9 Provinces across South Africa; the
results varied from 64.8% to 94.3% enclosed within the buffer. This would suggest that the
positional accuracy of OpenStreetMap for all of the areas within this study of Africa is similar
to, if not better than, that found in other studies, but also underlines that the assessment is
only valid for the area that is under study.
The second part of the positional accuracy assessment relied on the comparison of coincident
road junctions in both the Reference dataset and Test dataset. A similar comparison carried out
in France (Girres and Touya, 2010) also used pairs of road junctions as points to compare. This
resulted in an average error of 6.65m from a sample of 207 pairs, although it also showed a
concentration at 2.5m and 10m, showing that there is no consistency to the error; again the data
could have been uploaded from devices with different inherent positional error. All of the study
areas used in this study of Africa used significantly more junctions in the comparison, with the
results comparing favourably against the French study. Maiduguri had an average error of 4.07m
from a sample of 596; Nairobi had an average error of 4.65m from a sample of 688; and
Freetown had an average error of 5.55m from a sample of 252. To match the junctions in this
study, the 10m buffer ensured that any outliers did not negatively affect the results of the
analysis. These results are also similar to the distances observed by Haklay
(2010) in his comparison of OpenStreetMap with Ordnance Survey data, which perhaps lends
some credibility to the spatial accuracy of the Reference dataset.
For an assessment of the completeness of a geospatial dataset, the Reference dataset
should be as close to reality as possible. As OpenStreetMap is continuously being updated by
volunteers, it is hard to measure its completeness without comparing it to the situation on the
ground by visual inspection of commercial satellite imagery; this was outside the scope of this
study because of both cost and time constraints. This limits the study to a relative assessment,
as the Reference dataset is not reality; the study was further hindered in the selection of study
locations, as high quality datasets are not abundant for areas in Africa.
Completeness is intrinsically linked with how current the data is, as it is measured by
assessing against the known reality, or another dataset that is close to it. There are not many
datasets that are close to reality for many areas of Africa, which could be a reason that there
are limited studies. Forghani and Delavar (2014) compared completeness for Tehran by clipping
the data into a 1km² grid and comparing each individual grid square for completeness. A problem
with this approach is that without data matching it is possible to have equal lengths of roads
represented in a grid square, which would equate to 100% completeness, even if the roads from
both datasets are not coincident.
All three areas used for this study covered 20km x 20km, but despite this the areas differ
significantly in the amount of primary road present.
Zhou et al (2014) carried out a study on three areas of China, measuring completeness by
comparing lengths for both the Reference dataset and Test dataset. It showed maximum
completeness of 60.77%, 32.06% and 28.62% respectively for the three areas studied. The three
areas examined in this study of Africa have better completeness than that identified in China.
Nairobi had the largest concentration of primary roads, with 983km in the Reference dataset,
of which the Test dataset covered 93%. Maiduguri had 792km of primary roads in the Reference
dataset, with the Test dataset covering 64% of them. Freetown had the fewest primary roads,
and its Test dataset covered 98% of the 309km in the Reference dataset.
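The relative completeness figures quoted above follow directly from the matched lengths in Tables 1, 5 and 9. A minimal sketch, with an illustrative function name:

```python
# Matched lengths in metres from Tables 1, 5 and 9: (Reference, Test).
LENGTHS_M = {
    "Maiduguri": (791801, 509494),
    "Freetown": (309401, 304532),
    "Nairobi": (983357, 919756),
}

def relative_completeness(reference_m, test_m):
    """Matched Test length as a percentage of the Reference length."""
    return 100.0 * test_m / reference_m

completeness = {
    city: relative_completeness(reference_m, test_m)
    for city, (reference_m, test_m) in LENGTHS_M.items()
}
for city, pct in sorted(completeness.items(), key=lambda item: item[1], reverse=True):
    print(f"{city}: {pct:.1f}%")
```

The whole-number parts of these values match the percentages in the tables. The measure is relative: it describes coverage of the Reference dataset, not of the roads that exist on the ground.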
Figure 17 shows the result of the data matching of the roads in Maiduguri. It is evident that as
the distance from the centre of the urban area increases the completeness of the coverage
decreases. This has been reflected in other studies, however the opposite was found to be true
for the US. It is further highlighted in Figure 18, which represents the coverage as a percentage,
showing that only 95 of the 1km² grid squares had 91-100% completeness, with areas of ‘no cover’
or ‘limited cover’ at the extremities of the urban fringe. There are also 132 grid squares which
had road features from the Reference dataset but no road features in the Test dataset. This could
be because the urban fringe may be less important to the local population, or perhaps because
contributing there faces more barriers. These barriers could be a lack of GPS enabled platforms,
lack of Mobile/Internet coverage, or simply a lack of basic computer literacy.
In an attempt to further highlight this trend, the final analysis of completeness involved
finding the percentage of roads contained within a circular band between 5km and 10km from
the centre of town, defined in this study to represent the Rural areas. Table 4 shows that for
Maiduguri 60% of the Test dataset roads are within 5km of the centre of town, dropping
significantly to 37% in the 5km-10km band; this is not reflected in the Reference dataset,
which has 40% within 5km and 46% in the 5km-10km band.
Freetown, despite having the worst of the positional accuracy results (95% within 13m), actually
has 98% of the Reference dataset represented in the Test dataset, as can be seen in Figure 19.
Table 8 shows that for Freetown 58% of the roads are within 5km of the centre of town, dropping
significantly to 24% in the 5km-10km band; this is, however, reflected in the Reference dataset,
which has 58% within 5km and 24% in the 5km-10km band, as would be expected from nearly
100% coherence between the Reference dataset and Test dataset.
Nairobi had by far the most roads in the Reference dataset; despite this, 93% of the
Reference dataset was represented in the Test dataset, as can be seen in Figure 21. Table 12
shows that 30% of the roads are within 5km of the centre of town, rising significantly to 55% in
the 5km-10km band; this is also reflected in the Reference dataset, which has 29% within 5km and
55% in the 5km-10km band. This does not follow the trend, and on inspection of Figure 22 it is
evident that the coverage has no spatial autocorrelation. This difference in the coverage in the
5km-10km band could be because Nairobi is much bigger in size and the 5km-10km band is still in
the urban area. It could also be down to a lack of interest from the general population in this
area.
It is interesting to note, and perhaps not surprising, that the study area with the lowest mobile
phone and internet coverage, Freetown, had the worst figures for positional accuracy. To contribute to
the OpenStreetMap project there is a requirement to have a GPS platform, usually a mobile
phone, and an internet connection (Neis and Zielstra, 2014). This could mean that poorer
nations across Africa may reflect similar results to those identified in this study. It seems ironic
that those nations that could most benefit from free spatial data may have too big a hurdle in the
form of the ‘Digital Divide’, experiencing ‘participation inequality’ (Neis and Zielstra, 2014).
This study does have a few weaknesses, one of which was defining the actual currency of the
Reference dataset, due to the different imagery sources used. Although consistent throughout in
density and positional accuracy, and having a respectable lineage, the Reference data
contained fewer roads than the OpenStreetMap Test dataset. By data matching the
OpenStreetMap Test datasets prior to the completeness assessment, a lot of information was
not studied. This is the case for most reference datasets that exist; as already mentioned, it is
hard to stay as current as a database that never sleeps.
5. SUMMARY
Adoption of the OpenStreetMap dataset will ultimately be a measure of its success, and perhaps
OpenStreetMap will not lead to the death of the NMA (Devillers et al, 2012) but may lead to a
revision of its operations to ensure that it stays relevant in a world that can now be mapped
by volunteers, on digital platforms, every minute of the day. The accuracy identified in this
study may prompt further study of the continent of Africa and lend some weight to the
adoption of OpenStreetMap data to increase the density of the coverage in Government
datasets.
References
Ather, A. (2009). A Quality Analysis of OpenStreetMap Data. MEng. Thesis. University College
London: U.K.
Barron, C., Neis, P., Zipf, A. (2013), ‘A comprehensive Framework for Intrinsic OpenStreetMap
Quality Analysis’, Transactions in GIS, Vol6, p.76-106.
Begin, D., (2012), ‘Towards Integrating VGI and National Mapping Agency Operations – A
Canadian Case Study’, Role of Volunteer Geographic Information in Advancing Science: Quality
and Credibility Workshop, GIScience Conference, September 18, Columbus Ohio.
Begin, D., Devillers, R., Roache, S. (2013), ‘Assessing Volunteered Geographic Information (VGI)
Quality Based on Contributors’ Mapping Behaviours’, International Archive of the
Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XL-2/W1, 8th
International Symposium on Spatial data Quality 30 May – 1Jun 13, Hong Kong.
Budhathoki, N, R., Haythornthwaite, C. (2013), ‘Motivation for open Collaboration: Crowd and
Community Models and the Case of OpenStreetMap ’, American Behavioral Scientist, Vol57,
p.548-575.
Carr, N, G. (2007),‘The ignorance of crowds’ Strategy + Business Magazine, 47: 1-5.
Chrisman, N, R.(1991), ‘The Error Component In Spatial Data’, Geographical Information Systems:
Overview Principles and Applications, Eds D J Maguire, M F Goodchild, D W Rhind, Longman,
Harrow, Essex p. 165-174.
Ciepluch, B., Mooney, P., Jacob, R. (2011), ‘A comparison of the accuracy of OpenStreetMap for
Ireland with Google Maps and Bing Maps’, In the Proceedings of the Ninth International
Symposium on Spatial Data Accuracy in Natural Resources and Environmental Science, Leicester,
UK, 20-23 July 2010.
Ciepluch, B., Mooney, P., Jacob, R. (2011), ‘Sketches of Generic Framework for Quality
Assessment of Volunteered Geographical Data’, IEEE Geoscience and Remote Sensing Society
(GRRS), 1-5.
Coleman, D., Georgiadou, Y., Labonte, J. (2009), ‘Volunteered Geographic Information: the
Nature and Motivation of Producers’, Article under Review for the International Journal of
Spatial data Infrastructures Research, Special Issue GSDI-11, Submitted 2009.
Corcoran, P., Mooney, P. (2012), ‘The Annotation Process in OpenStreetMap’, Transactions in
GIS, 2012, Vol 16(4), p. 561-579.
Corcoran, P., Mooney, P., Bertolotto, M. (2013), ‘Analysing the Growth of OpenStreetMap
Networks’ Spatial Statistics, Vol 3, p.21-32.
Devillers, R., Begin, D., Vandecasteele, A. (2012), ‘Is the rise of Volunteered Geographic
Information (VGI) a sign of the end of National Mapping Agencies as we know them?’, GIScience
2012 Workshop ‘Role of Volunteer Geographic Information in Advancing Science: Quality and
Credibility’, Columbus, OH, September 18, 2012.
Dodge, M., Kitchin, R. (2011), ‘Mapping Experience: Crowdsourced Cartography’ Social Sciences
Research Network, Vol 4, p.55-80.
Farkas, I. (2009), ‘Multinational Geospatial Co-production Program – Production Worldwide and in
Hungary’, Geoscience, Vol 8 No. 1, 151-157.
Flanagin, A., Metzger, M. (2008),‘The Credibility of Volunteered Geographic Information’,
GeoJournal, Vol 72, p.137-148.
Forghani, M., Delavar, M,R. (2014), ‘A Quality Study of the OpenStreetMap Dataset for Tehran’,
ISPRS International Journal of Geo-Information, Vol 3, p. 750-763.
Girres, J., Touya, G. (2010), ‘Quality Assessment of the French OpenStreetMap Dataset’,
Transactions in GIS, 2010, Vol 14(4), p. 435-459
Goodchild, M, F. (2007), ‘Citizens as Sensors: The world of volunteered geography’. GeoJournal,
Vol 69, p.211-221.
Goodchild, M, F. (2008), ‘Assertion and Authority: The science of user-generated geographic
content’. Proceedings of the Colloquium for Andrew U. Frank’s 60th Birthday, Department of
Geoinformation and Cartography, Vienna University of Technology, Vienna, Austria.
Haklay, M., Weber, P. (2008),’OpenStreetMap – User-generated Street Map’, IEEE Pervasive
Computing, Vol 7, p. 12-18.
Haklay, M., (2010), ‘How good is volunteered geographical information? A Comparative Study of
OpenStreetMap and Ordnance Survey datasets’, Environment and Planning B: Planning Design
2010, Vol 37, p. 682-703
Helbich, M., Amelunxen, C., Neis, P., Zipf, A., (2012), ‘Comparative Spatial Analysis of Positional
Accuracy of OpenStreetMap and Proprietary Geodata’, accessed online [13 Dec 2014]
http://koenigstuhl.geog.uni-heidelberg.de/publications/2010/Helbich/Helbich_etal_AGILE2011.pdf
Keen, A. (2007), ‘The Cult of the Amateur: How Todays Internet is Killing Our Culture’, Doubleday,
New York, NY, USA.
Keßler, C., René Theodore, R., de Groot, A. (2013), ‘Trust as a Proxy Measure for the Quality of
Volunteered Geographic Information in the Case of OpenStreetMap’. D. Vandenbroucke et al.
(eds.), Geographic Information Science at the Heart of Europe, Lecture Notes in Geoinformation
and Cartography, DOI: 10.1007/978-3-319-00615-4_2, Springer International Publishing
Switzerland 2013.
Koukoletsos, T., Haklay, M., Ellul, C. (2012), ‘Assessing Data Completeness of VGI through an
Automated Matching Procedure for Linear Data’, Transactions in GIS, 2012, Vol 16(4), p. 477-498
Kounadi, O. (2009). Assessing the Quality of OpenStreetMap Data. MSc. Thesis. University
College London: U.K.
Ludwig, I., Voss, A., Krause-Traudes, M. (2011),’A Comparison of the Street Networks of Navteq
and OSM in Germany’, Advancing Geoinformation Science of a Changing World Lecture Notes in
Geoinformation and Cartography 2011, published by Springer, 65-84.
Mooney, P., Sun, H., Corcoran, P., Yan, L. (2011), ‘Citizen Generated Spatial Data and
Information: Risks and Opportunities’, Proceedings of the 2nd International Conference on
Network Engineering and Computer Science, Xi’an, Shaanxi, China, 23-25 September.
Mooney, P., Corcoran, P. (2012) ‘The annotation process in OpenStreetMap’. Transactions in GIS
16(4):561–579
Mooney, P., Corcoran, P. (2012) ‘Characteristics in Heavily Edited Objects in OpenStreetMap’.
Future Internet, Vol 4, 285–305
Mullins, J. (2010, Jan), Haiti gets help from net effect, NewScientist.
Neis, P., Zielstra, D. (2014), ‘Recent Developments and Future Trends in Volunteered Geographic
Information Research: The Case of OpenStreetMap’, Future Internet, vol6, p.76-106.
Neis, P., Goetz, M, and Zipf, A. (2012), ‘Towards automatic Vandalism detection in
OpenStreetMap’, ISPRS International Journal of Geo-Information, Vol1, p.315-332.
O’Reilly, T (2005), What is Web 2.0: Design Patterns and Business Models for the Next
Generation of Software, O’Reilly Media, Cambridge, MA, USA.
President Clinton (2000), ‘Statement by the President regarding the United States’ decision to
stop degrading Global Positioning System Accuracy’, Office of the Press Secretary, White House.
http://clinton3.nara.gov/WH/EOP/OSTP/html/0053_2.html [accessed 03 Nov 14]
Ramm, F., Topf, J., Chilton, S., (2011),’ OpenStreetMap: Using and Enhancing the Free Map of the
World’, UIT, UK, Cambridge.
Severinsen, J., Reitsma, F. (2013), ‘Finding the Quality in Quantity: Establishing Trust For
Volunteered Geographic Information’, SIRC NX 2013 GIS and Remote Sensing Research
Conference, University of Otago, Dunedin, New Zealand, 29th-30th August 2013.
Siebritz, L., Sithole, G. (2014), ‘Assessing the Quality of OpenStreetMap in South Africa in
Reference to National Mapping Standards’, Proceedings of the 2nd AfricaGEO Conference, South
Africa, Cape Town, 1-3 July 2014.
Sehra, S., Singh, S, J., Rai, H, S. (2014), ‘A Systematic Study of OpenStreetMap Data Quality
Assessment’, 11th International Conference on Information Technology: New generations.
Tapscott, D., Williams, A, D. (2007), ‘Wikinomics: How Mass Collaboration Changes Everything’,
New York, Portfolio Hardcover.
Zielstra, D., Zipf, A. (2010), ‘A Comparative Study of Proprietary Geodata and Volunteered
Geographic Information for Germany’, 13th AGILE International Conference on Geographic
Information Science, Guimarães, Portugal, 2010.
Zhou, P., Huang, W., Jang, J. (2014), ‘Validation analysis of OpenStreetMap Data in Some Areas of
China’, The International Archives of Photogrammetry, Remote Sensing and Spatial information
Sciences, Vol XL-4, 2014 ISPRS Technical Commission IV Symposium, 14-16 May 2014, Suzhou,
China.

 
Citizen science - theory, practice & policy workshop
Citizen science - theory, practice & policy workshopCitizen science - theory, practice & policy workshop
Citizen science - theory, practice & policy workshopMuki Haklay
 
Data and the City workshop 2015
Data and the City workshop 2015Data and the City workshop 2015
Data and the City workshop 2015Muki Haklay
 
Haw GIScience lost its interdisciplinary mojo?
Haw GIScience lost its interdisciplinary mojo?Haw GIScience lost its interdisciplinary mojo?
Haw GIScience lost its interdisciplinary mojo?Muki Haklay
 

Andere mochten auch (20)

B00624300_AlfredoConetta_EGM716_MAUP_Projectc
B00624300_AlfredoConetta_EGM716_MAUP_ProjectcB00624300_AlfredoConetta_EGM716_MAUP_Projectc
B00624300_AlfredoConetta_EGM716_MAUP_Projectc
 
EEA EPA Network - Frameworks for Citizen Science - 2015
EEA EPA Network - Frameworks for Citizen Science - 2015EEA EPA Network - Frameworks for Citizen Science - 2015
EEA EPA Network - Frameworks for Citizen Science - 2015
 
#FuturePub - Citizen Science, Open Science & scientific publications
#FuturePub - Citizen Science, Open Science & scientific publications#FuturePub - Citizen Science, Open Science & scientific publications
#FuturePub - Citizen Science, Open Science & scientific publications
 
Building centre event "mapping for making"
Building centre event "mapping for making" Building centre event "mapping for making"
Building centre event "mapping for making"
 
Extreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen scienceExtreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen science
 
4B_2_A step towards the improvement of spatial quality of web 2.0 geo-applica...
4B_2_A step towards the improvement of spatial quality of web 2.0 geo-applica...4B_2_A step towards the improvement of spatial quality of web 2.0 geo-applica...
4B_2_A step towards the improvement of spatial quality of web 2.0 geo-applica...
 
AlfredoConetta_EGM712_GIS_Project
AlfredoConetta_EGM712_GIS_ProjectAlfredoConetta_EGM712_GIS_Project
AlfredoConetta_EGM712_GIS_Project
 
OpenStreetMap Completeness for England 03/10
OpenStreetMap Completeness for England 03/10OpenStreetMap Completeness for England 03/10
OpenStreetMap Completeness for England 03/10
 
Osm Quality Assessment 2008
Osm Quality Assessment 2008Osm Quality Assessment 2008
Osm Quality Assessment 2008
 
Overview of Citizen Science - Zurich November 2015
Overview of Citizen Science - Zurich November 2015Overview of Citizen Science - Zurich November 2015
Overview of Citizen Science - Zurich November 2015
 
Citizen Observatories: Mapping for Change air quality studies
Citizen Observatories:  Mapping for Change air quality studiesCitizen Observatories:  Mapping for Change air quality studies
Citizen Observatories: Mapping for Change air quality studies
 
Extreme Citizen Science: Current Development
Extreme Citizen Science: Current Development Extreme Citizen Science: Current Development
Extreme Citizen Science: Current Development
 
Beyond good enough? Spatial Data Quality and OpenStreetMap data
Beyond good enough? Spatial Data Quality and OpenStreetMap dataBeyond good enough? Spatial Data Quality and OpenStreetMap data
Beyond good enough? Spatial Data Quality and OpenStreetMap data
 
Oxford Martin School talk - May 2014
Oxford Martin School talk - May 2014Oxford Martin School talk - May 2014
Oxford Martin School talk - May 2014
 
Eye on Earth Summit - Data Revolution plenary
Eye on Earth Summit - Data Revolution plenary Eye on Earth Summit - Data Revolution plenary
Eye on Earth Summit - Data Revolution plenary
 
INSPIRE 2014 conference
INSPIRE 2014 conferenceINSPIRE 2014 conference
INSPIRE 2014 conference
 
Citizen Science & Geographical Technologies: creativity, learning, and engage...
Citizen Science & Geographical Technologies: creativity, learning, and engage...Citizen Science & Geographical Technologies: creativity, learning, and engage...
Citizen Science & Geographical Technologies: creativity, learning, and engage...
 
Citizen science - theory, practice & policy workshop
Citizen science - theory, practice & policy workshopCitizen science - theory, practice & policy workshop
Citizen science - theory, practice & policy workshop
 
Data and the City workshop 2015
Data and the City workshop 2015Data and the City workshop 2015
Data and the City workshop 2015
 
Haw GIScience lost its interdisciplinary mojo?
Haw GIScience lost its interdisciplinary mojo?Haw GIScience lost its interdisciplinary mojo?
Haw GIScience lost its interdisciplinary mojo?
 

Ähnlich wie B00624300_EGM701_MSc ResearchPaper_AlfredoConetta_03-May-15

OpenTransportNet: Stimulating Innovation with Open Geographic Information
OpenTransportNet: Stimulating Innovation with Open Geographic InformationOpenTransportNet: Stimulating Innovation with Open Geographic Information
OpenTransportNet: Stimulating Innovation with Open Geographic Information21cConsultancy_2012
 
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...Kayla Jones
 
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...Future Development of NSDI Based on the European INSPIRE Directive – a Case S...
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...Maksim Sestic
 
Integrating Web Services With Geospatial Data Mining Disaster Management for ...
Integrating Web Services With Geospatial Data Mining Disaster Management for ...Integrating Web Services With Geospatial Data Mining Disaster Management for ...
Integrating Web Services With Geospatial Data Mining Disaster Management for ...Waqas Tariq
 
Crowdsourced mapping for open collaboration: A story of Taiwan so far
Crowdsourced mapping for open collaboration: A story of Taiwan so farCrowdsourced mapping for open collaboration: A story of Taiwan so far
Crowdsourced mapping for open collaboration: A story of Taiwan so farDongpo Deng
 
SDI-Initiatives-in-Nepal (1).pptx
SDI-Initiatives-in-Nepal (1).pptxSDI-Initiatives-in-Nepal (1).pptx
SDI-Initiatives-in-Nepal (1).pptxFareLessmotiVation
 
Experiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open dataExperiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open dataProgCity
 
SC7 Workshop 3: Space-based applications and Big Data
SC7 Workshop 3: Space-based applications and Big DataSC7 Workshop 3: Space-based applications and Big Data
SC7 Workshop 3: Space-based applications and Big DataBigData_Europe
 
The UK Location Programme
The UK Location ProgrammeThe UK Location Programme
The UK Location Programmeuklp
 
Relative value of radar and optical data for land cover/use mapping: Peru exa...
Relative value of radar and optical data for land cover/use mapping: Peru exa...Relative value of radar and optical data for land cover/use mapping: Peru exa...
Relative value of radar and optical data for land cover/use mapping: Peru exa...rsmahabir
 
Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression IJECEIAES
 
Large scale geospatial analysis on mobile application usage
Large scale geospatial analysis on mobile application usageLarge scale geospatial analysis on mobile application usage
Large scale geospatial analysis on mobile application usageEricsson
 

Ähnlich wie B00624300_EGM701_MSc ResearchPaper_AlfredoConetta_03-May-15 (20)

OpenTransportNet: Stimulating Innovation with Open Geographic Information
OpenTransportNet: Stimulating Innovation with Open Geographic InformationOpenTransportNet: Stimulating Innovation with Open Geographic Information
OpenTransportNet: Stimulating Innovation with Open Geographic Information
 
LinkedIn-Presentations
LinkedIn-PresentationsLinkedIn-Presentations
LinkedIn-Presentations
 
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
 
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...Future Development of NSDI Based on the European INSPIRE Directive – a Case S...
Future Development of NSDI Based on the European INSPIRE Directive – a Case S...
 
Integrating Web Services With Geospatial Data Mining Disaster Management for ...
Integrating Web Services With Geospatial Data Mining Disaster Management for ...Integrating Web Services With Geospatial Data Mining Disaster Management for ...
Integrating Web Services With Geospatial Data Mining Disaster Management for ...
 
Crowdsourced mapping for open collaboration: A story of Taiwan so far
Crowdsourced mapping for open collaboration: A story of Taiwan so farCrowdsourced mapping for open collaboration: A story of Taiwan so far
Crowdsourced mapping for open collaboration: A story of Taiwan so far
 
Open Data Technological Citizenship & Imagined Futures
Open DataTechnological Citizenship& Imagined FuturesOpen DataTechnological Citizenship& Imagined Futures
Open Data Technological Citizenship & Imagined Futures
 
SDI-Initiatives-in-Nepal (1).pptx
SDI-Initiatives-in-Nepal (1).pptxSDI-Initiatives-in-Nepal (1).pptx
SDI-Initiatives-in-Nepal (1).pptx
 
Geostor Essay
Geostor EssayGeostor Essay
Geostor Essay
 
RJW ReGIS 1994
RJW ReGIS 1994RJW ReGIS 1994
RJW ReGIS 1994
 
ReGIS 1994
ReGIS 1994ReGIS 1994
ReGIS 1994
 
Experiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open dataExperiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open data
 
SC7 Workshop 3: Space-based applications and Big Data
SC7 Workshop 3: Space-based applications and Big DataSC7 Workshop 3: Space-based applications and Big Data
SC7 Workshop 3: Space-based applications and Big Data
 
The UK Location Programme
The UK Location ProgrammeThe UK Location Programme
The UK Location Programme
 
Relative value of radar and optical data for land cover/use mapping: Peru exa...
Relative value of radar and optical data for land cover/use mapping: Peru exa...Relative value of radar and optical data for land cover/use mapping: Peru exa...
Relative value of radar and optical data for land cover/use mapping: Peru exa...
 
Topic 19
Topic 19Topic 19
Topic 19
 
Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression Predictive geospatial analytics using principal component regression
Predictive geospatial analytics using principal component regression
 
A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...
 
Lesson1 esa summer_school_brovelli
Lesson1 esa summer_school_brovelliLesson1 esa summer_school_brovelli
Lesson1 esa summer_school_brovelli
 
Large scale geospatial analysis on mobile application usage
Large scale geospatial analysis on mobile application usageLarge scale geospatial analysis on mobile application usage
Large scale geospatial analysis on mobile application usage
 

B00624300_EGM701_MSc ResearchPaper_AlfredoConetta_03-May-15

B00624300 Alfredo Conetta EGM701 MSc Research Paper

Research Article
A quantitative assessment of the quality of the OpenStreetMap primary route network across urban areas of Africa.
Alfredo Conetta
MSc Student
University of Ulster
Dr Sally Cook
Department of Environmental Science
University of Ulster

Abstract
The massive increase in the creation, availability, and use of volunteered geographic information, and in particular OpenStreetMap, has led to a number of studies with the purpose of assessing its quality. These studies have largely focused on areas where the project has best taken hold: Germany, France, the United Kingdom and the United States. This study addresses the less studied continent of Africa, in particular Nigeria, Sierra Leone, and Kenya, and focuses on a comparison of the main route network for positional accuracy and completeness against a reference dataset, the Multinational Geospatial Co-production Programme data. The positional accuracy results for the buffer analysis showed that 95% of the OpenStreetMap test data lay within a buffer of 8.5m for Maiduguri, 9.5m for Nairobi, and 13m for Freetown. These distances are comparable to studies in the UK (Haklay, 2010) and France (Girres and Touya, 2010). The positional accuracy for coincident junctions showed an average distance between junctions of 4.07m for Maiduguri, 4.65m for Nairobi, and 5.55m for Freetown. These figures for positional accuracy would be suitable for many applications of spatial data. The relative completeness assessment showed that OpenStreetMap covered 98% of the Reference dataset for Freetown, 93% for Nairobi, and 64% for Maiduguri. These completeness results suggest that OpenStreetMap data for the urban areas of Africa may provide an alternative source of road data.

1. INTRODUCTION
Following the decision of President Clinton to remove selective availability from civilian GPS in May 2000, the positional accuracy of GPS technology increased by up to 10 times overnight (Clinton, 2000). Investment in research and development by commercial vendors increased dramatically and the technology began migrating onto other platforms, eventually reaching the mobile phone. Handheld GPS receivers reduced in size and became more affordable, and their applications to recreation and vehicle navigation became more commercial (Haklay and Weber, 2008). With these conditions set, a viable Location Based Services (LBS) industry flourished and is still flourishing to this day. Coupled with the ongoing evolution of the internet from Web 1.0, where users could ‘search and access’ static information (Dodge and Kitchen 2011, p.2), to Web 2.0, where users have much more interaction, creating, editing, and sharing their own content (O’Reilly, 2005), this created huge potential. Web 2.0 provided fertile ground for sites such as Wikipedia to encourage collaboration between users, who can upload their own data and edit the data of others; it is in the ‘army of editors and checkers’ that the real value lies (Carr, 2007).
Information supplied by users in a Web 2.0 environment has become known as user generated content (UGC) or, when of a geospatial nature, Volunteered Geographic Information (VGI) (Goodchild, 2007). Web 2.0 also provides the underlying structure to support crowdsourcing as a resource for projects. The ‘crowd metaphor signifies the power that can emerge from a mass of individuals converging to tackle a set of tasks’ (Dodge and Kitchen 2011, p.2). One project that utilises this approach, and probably the best known VGI project (Haklay 2010, Mooney et al 2011, Ramm et al 2010), is Steve Coast’s OpenStreetMap. From its inception in 2004 the goal of OpenStreetMap has been to provide a free street level dataset covering the world (Ramm et al, 2011). It has spread from its origin in London across the UK, through Europe and across the Atlantic to the United States; it now has a volunteer population of more than 2 million registered users (Apr 2015) and touches every continent on Earth. This rapid spread has been helped largely by the advent of social media, geotagging, GPS enabled smart devices and the exponential growth of LBS in recent years.

The job of mapping the earth accurately has historically rested in the hands of highly trained cartographers in the military or National Mapping Agencies (NMA) (Haklay and Weber, 2008). This type of data came from a recognised authoritative source and had been created by qualified geospatial professionals, providing a trust in the data that is currently not present in VGI. OpenStreetMap data is created and contributed by volunteers who may lack cartographic training, and there are only limited restrictions on the methods of creation. The OpenStreetMap project gives registered users the power to create, edit, and add information to the project using either the online Potlatch2 GUI or the offline JOSM software to trace information from images.
Alternatively, users can upload tracks from various GPS enabled platforms, mobile phones being one. In the early years of the OpenStreetMap project many of the contributions came from users uploading tracks from handheld GPS, with volunteers concentrating on ensuring the road networks were captured (Neis & Zielstra, 2014). This lineage adds uncertainty over accuracy; the positional accuracy of GPS devices varies considerably from vendor to vendor, as does the ability of contributors to digitise accurately. Users must register to be allowed to add data; however, there are no quality assurance procedures applied before features enter the database, and the project relies on other registered users to check and correct data. This adds to the lack of trust in the data. Similar projects such as Wikipedia successfully use a self-governing policy whereby a user who adds spurious or incorrect data will have that data removed or corrected by other users.

OpenStreetMap uses large numbers of volunteers as a kind of ‘crowdsourcing’ (Tapscott et al, 2007), which may actually improve its quality if Linus’s law is to be believed: the more volunteer contributors to a project, the higher the quality, also known as the ‘many eyes principle’ (Haklay et al, 2010). ‘Crowdsourcing’ enables areas of the world that are in crisis to be mapped rapidly by thousands of volunteers (Mullins, 2010); in the case of OpenStreetMap this could be in excess of 2 million. These figures can be a little flattering, as OpenStreetMap exhibits a similar trend to Wikipedia, and studies have shown that only a small number of registered users actually contribute. Zipf’s law of distribution (90:9:1) has also been found to hold true for OpenStreetMap, and a study by Neis and Zipf (2012) found that approximately 5% of registered users complete 90% of the transactions in the database.
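The concentration of contributions described above can be expressed as the smallest share of users whose edits account for a given fraction of all edits. The following is a minimal sketch of that arithmetic; the function name and the edit counts are hypothetical and invented purely for illustration, and are not taken from Neis and Zipf (2012) or from the OpenStreetMap database.

```python
def share_of_users_for_edit_fraction(edit_counts, target=0.9):
    """Smallest fraction of users whose combined edits reach `target` of all edits."""
    counts = sorted(edit_counts, reverse=True)  # most active users first
    total = sum(counts)
    if total == 0:
        raise ValueError("no edits recorded")
    running = 0
    for n_users, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return n_users / len(counts)
    return 1.0


# Hypothetical edit counts for 20 registered users (made-up numbers):
# two very active users, five occasional users, thirteen who never edited.
hypothetical_counts = [500, 450, 10, 10, 10, 10, 10] + [0] * 13

share = share_of_users_for_edit_fraction(hypothetical_counts, target=0.9)
print(f"{share:.0%} of users account for at least 90% of edits")
```

In this invented example two of the twenty users reach the 90% threshold, which is the kind of skew the studies cited above report at a much larger scale.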
The success of OpenStreetMap can be attributed to its advantages of responsiveness and flexibility (Girres and Touya, 2010), which enable it to stay current, outpacing conventional mapping techniques. Traditionally national mapping is revised on a cycle of 2-5 years, causing an ingrained lack of currency.

The aim of this study is to fill the knowledge gap concerning the quality of OpenStreetMap data within the continent of Africa by assessing the positional accuracy and completeness of OpenStreetMap against a common baseline reference dataset. The study covers three separate areas from three different countries across the West and East of Africa.

1.1 Assessing the Quality of OpenStreetMap
Cartographers, geographers and geoscientists have struggled with the issue of data quality since the first maps were made; this has been further compounded by the advent of Geographical Information Systems and investigations into the propagation of error through geospatially based models (Chrisman, 1991). The International Organization for Standardization (ISO) is the overarching body for standards for geospatial data. The new ISO standard 19157 (formerly 19113, 19114, 19138) provides the principles by which spatial data quality is measured. It gives the purpose of describing the quality of geospatial information as ‘..to facilitate the comparison and selection of the data set best suited to application needs or requirements’ (ISO 19157). These standards provide guidance for NMAs, militaries, and commercial companies throughout the world. The reservations that many of these organisations have about projects such as OpenStreetMap are understandable when we consider that OpenStreetMap is not created by professional cartographers and does not adhere to ISO 19157. Despite its power, crowdsourcing does worry some; Keen (2007) articulates a concern that ‘it represents a disturbing trend that increases the influence of amateurs at the expense of legitimate experts’.

The ISO standard uses a number of principles to define the quality of geospatial data.
- Completeness: the presence of an object in a dataset. This covers the omission as well as the commission of objects.
- Logical consistency: the absence of conflicts in the dataset.
- Positional accuracy: the accuracy of the position of the object in relation to its actual real world position.
- Temporal accuracy: how consistent the provided temporal information is with the data.
- Thematic accuracy: how accurate the descriptive information for an object is.

Use or usefulness is often mentioned as one of these quality principles, but this is largely down to the trust that users have in the information they are using and its suitability for the intended purpose. With the MGCP Reference dataset, its rigorous adherence to standards and its published accuracy statements, the trust is fairly inherent in the data.

Despite the obvious advantage of being free, there are still barriers to the use of OpenStreetMap; in particular, there are questions that still need answering with regard to its quality and what this means for its potential use. There is little doubt that many features in the OpenStreetMap data are of good quality, as has been borne out by the studies conducted against the OS data for
London (Haklay 2010, Kounadi 2009). The issue is that not everyone who contributes to the data adds quality data (Ciepluch et al, 2011). The majority of the respondents to a Data Quality survey (DWG's 2007/2008) identified that companies were ‘willing to pay more for higher quality data in their projects, if they could just be sure that the quality was there’; quality may be more important than cost. Despite numerous studies it is hard to determine whether the quality measured is repeatable in different locations.

Many aspects of spatial data quality are hard to quantify, which has led studies to focus on the more quantifiable aspects of quality, such as positional accuracy and completeness. How these qualities are assessed has varied, from comparison against an authoritative Reference dataset (Goodchild and Hunter 1997, Haklay and Weber 2008, Kounadi 2009, Ather 2009, Haklay 2010, Koukoletsos et al 2011, Ludwig et al 2011, Zhou et al 2014) to studies assessing less tangible information from the database: the number of times a feature has been edited (Mooney and Corcoran, 2012), the motivation of the contributors (Coleman et al 2009, Budhathoki and Haythornthwaite 2012, Begin et al 2013), and how these things affect the quality. The accuracy of the thematic data has received some attention but, perhaps understandably, not as much as positional accuracy and completeness.

An early use of the increasing buffer for assessing positional accuracy was the comparison of the OpenStreetMap road network data against Ordnance Survey (OS) data completed by Haklay (2010), who found that it was ‘fairly accurate’, with most of the data within 6m of the OS data.
But the issue of quality did not escape Haklay, who also mentions the lack of an integrated quality-assurance mechanism, something that remains a barrier to even more widespread usage of OpenStreetMap. Other studies continued in the home of OpenStreetMap, with Ather (2009) and then Kounadi (2009) both using buffers to assess the positional accuracy of OpenStreetMap datasets for areas of London. Girres and Touya (2010) took the work a step further with a quality assessment of the French OSM data in which they extended the number of quality elements assessed: geometric accuracy; attribute accuracy; completeness; logical consistency; semantic accuracy; lineage; usage.

The country most densely populated with data in the OpenStreetMap database is probably Germany. Zielstra and Zipf (2010) assessed the growth of the OpenStreetMap data for Germany against the growth of Tele Atlas data, showing that OpenStreetMap had grown at a significantly faster pace than the Tele Atlas data and that, for the five towns under study, the differences between the datasets were shrinking as a result.

Studies have continued to branch out, with one in Tehran (Forghani and Delavar, 2014) using a grid to divide both the Test and Reference datasets into 1 km² cells so that the roads could be compared for completeness and the results visualised as a grid. There was a flaw in this methodology: the roads were not matched before the assessment, so a cell could receive a 100% completeness rating because the total road lengths in both datasets were the same, even though they were not necessarily the same roads. This study also used the increasing buffer technique to assess the positional accuracy of the OpenStreetMap dataset.

For the data to be used by geospatial professionals it requires a quality assessment that will determine which applications the data can be used for, in other words, whether it is ‘fit for a particular purpose’ (Goodchild 2008, Haklay and Weber 2008).
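The flaw in unmatched, length-based completeness noted above can be illustrated with a small worked example. This is a sketch under stated assumptions, not the method used by Forghani and Delavar (2014) or by this study: the road identifiers, the lengths, and the idea that matched roads share an identifier are all hypothetical (in practice matching is done geometrically).

```python
def length_ratio_completeness(test_lengths, ref_lengths):
    """Completeness of one grid cell as total test length / total reference length."""
    return sum(test_lengths.values()) / sum(ref_lengths.values())


def matched_completeness(test_lengths, ref_lengths):
    """Share of reference road length whose road also appears in the test data.

    Assumption: matched roads carry the same identifier in both datasets.
    """
    matched = sum(
        length for road_id, length in ref_lengths.items() if road_id in test_lengths
    )
    return matched / sum(ref_lengths.values())


# Hypothetical 1 km^2 cell: road lengths in metres, keyed by a made-up road id.
ref_cell = {"road_A": 600.0, "road_B": 400.0}
test_cell = {"road_A": 600.0, "road_C": 400.0}  # road_C is not in the reference

print(length_ratio_completeness(test_cell, ref_cell))  # 1.0, looks 100% complete
print(matched_completeness(test_cell, ref_cell))       # 0.6, only road_A matches
```

The total lengths are equal, so the unmatched ratio reports full completeness, while only 60% of the reference length is actually represented in the test data.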
2. METHODOLOGY
This study and the subsequent analysis rest on three assumptions: that the Reference dataset representing the Primary roads is of higher spatial quality than the Test dataset, that it is consistent in terms of its quality (Haklay 2010), and that completeness is measured relative to the data contained in the Reference dataset (Zielstra and Zipf, 2010). Because OpenStreetMap is continually updated, it can stay very current in areas that are of interest to the contributors, while other areas may be less well mapped and not have a uniform density of coverage. The outline methodology employed for this study can be seen in Figure 1.

Figure 1- Employed methodology.

2.1 Project Study Areas
The continent of Africa has received little attention in OpenStreetMap quality studies, perhaps due to the cost and availability of a reference dataset of sufficient quality for comparison. Many of the countries of Africa do not have the resources to create and maintain their own authoritative geospatial data, and the cost of commercial data is often prohibitive. It is therefore even more important for people wishing to use mapping data for the continent of Africa to understand the quality of free sources of data such as OpenStreetMap. This study focused on urban areas in three African countries, enabling not only an assessment of the quality of data in Africa but also an insight into the quality of OpenStreetMap in different areas of Africa.
The first area to be studied was Maiduguri in Nigeria (Figure 2), Africa’s most populous country and also the one boasting the best internet penetration in Africa. The second area was Freetown in Sierra Leone (Figure 3), chosen as an area that has limited internet penetration, and the last area was Nairobi, Kenya (Figure 4), chosen as a heavily populated African capital city.

Figure 2- Study Area 1, Maiduguri in Nigeria.
Figure 3- Study Area 2, Freetown in Sierra Leone.
Figure 4- Study Area 3, Nairobi in Kenya.
  • 7. B00624300 Alfredo Conetta EGM701 MSc Research Paper 7 2.2 Reference Dataset Many studies of the quality of OpenStreetMap have been conducted using the comparison against a Reference datasets (Flanagin and Metzger 2008, Doan et al 2011, Haklay 2010) where the Reference dataset is from an authoritative source for the country being studied. This study utilises a Reference dataset that has been created over a number of countries by the same organisation, UK Defence. This provides the ability to compare OpenStreetMap from different countries against a common ‘baseline’ Reference dataset; as OpenStreetMap is usually created by contributors who come from the areas that they contribute this study further enables a comparison of the differences between the contributions from different countries. The Reference data for this study came from the UK’s contribution to the Multinational Geospatial Co-production Program (MGCP), and was supplied by the Defence Geographic Centre. The MGCP dataset is not dissimilar program to the OpenStreetMap project, with the aim of the participating nations to contribute to the creation of a complete database of the world; however, this is a dataset that is used to create military topographic mapping. The dataset is created by the mapping agencies of participating nations collected to a rigid specification aimed at mapping 126 layers containing features that could be used produce a 1:50,000 MGCP Derived Graphic for military operations. The specification states that the positional accuracy of features should be at least 25m circular error; the extracted feature will be within 25m of its true location on the earth 95% of the time. Despite being tied to a 1:50,000 products, the scale relates to the density of extracted features rather than its positional accuracy, which is reflected in the absence of the smaller residential roads. 
In practice the MGCP dataset is collected from high resolution (HR) satellite imagery and other sources at a scale of 1:2,000 (see Figure 5). The dataset was released in 2011 but was created from a number of sources with dates ranging from 2003 to 2011.
Figure 5- Data sources for MGCP (Farkas, 2009).
  • 8. It is important that the Reference data is of sufficient accuracy to be considered a fair representation of the truth. The dataset is positionally accurate to within metres, as can be seen in Figure 6. An initial assessment against junctions manually digitised from high resolution imagery was used to establish the credibility of the Reference data. From a sample of 428 digitised junctions, the MGCP junctions were on average within 3.6m. The image in Figure 6 shows a random pair of junctions with the associated Reference dataset road network, with the data within metres of the actual centre line of the road.
Figure 6- Visual assessment of Reference roads in Maiduguri.
The dataset has been created by cartographically trained professionals to a rigid specification. Numerous layers are collected during the MGCP production process, among them the main road network, including motorways, A roads and B roads. Residential roads are not collected unless they are of strategic importance. During the creation of the MGCP dataset, analysts extract the road data by digitising the centre line of each road; this is not the case for the OpenStreetMap data, which captures the road in both directions. This difference in the representation of the road network was also experienced by Ather (2009). An advantage of the MGCP data over the datasets used in other studies is that it has an attribute for road width, mensurated from high resolution imagery by a trained image analyst; this enabled a more accurate measurement to be used in the buffer operations, as each road can be buffered to reflect its actual width rather than a suggested norm.
  • 9. 2.3 Test Datasets
The Maiduguri Test dataset was downloaded from the OpenStreetMap website using the OpenStreetMap editor tools for ArcGIS desktop at the start of the project in November 2014. The OpenStreetMap data for Freetown and Nairobi were downloaded as shapefiles from the Geofabrik website in January 2015.
2.4 Data Matching
For the OpenStreetMap Test dataset to be compared against the Reference dataset, some form of data matching was required to ensure that the comparison was meaningful (Ellul et al, 2012). To start this process, both datasets were stripped down to contain only the motorways, trunk, primary and secondary roads. Other studies have looked at the possibility of automating the matching process, with varying degrees of success; the main obstacle to automatic matching is the heterogeneity of OpenStreetMap test datasets. This was particularly noted as a problem in the study conducted over France (Girres and Touya, 2010). It is due to contributors having freedom to create data without adherence to a ‘rigid’ specification, and to differences in methods of creation: either GPS uploads or the digitisation of features at varying scales and with varied accuracy.
On visual inspection it was evident that there was a good match between features of both datasets; it was however necessary to remove the residential roads from the Test dataset as they were not represented in the Reference dataset. To accomplish this a 25m buffer was created around the Reference dataset, which was then used to select all the features from the Test dataset that had their centroid within the buffer. This removed a large majority of the residential roads, and roads that had no partner feature in the Reference dataset. Figure 7 shows the buffer with the selected roads highlighted in cyan.
The dark blue roads represent roads that have no matching road in the Reference dataset. A large amount of manual editing was also needed to simplify the representation of certain OpenStreetMap junctions. This was largely due to the extremely detailed configuration of the junctions in OpenStreetMap and their simplistic representation in the Reference dataset.
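The centroid-based selection described above can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in the study: the function names, the planar coordinates in metres and the two sample roads are all assumptions made for illustration, and the 25m buffer is approximated by a point-to-segment distance test.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def line_centroid(line):
    """Length-weighted centroid of a polyline given as (x, y) vertices."""
    total = cx = cy = 0.0
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        seg = math.hypot(bx - ax, by - ay)
        total += seg
        cx += seg * (ax + bx) / 2.0
        cy += seg * (ay + by) / 2.0
    if total == 0.0:
        return line[0]
    return (cx / total, cy / total)

def select_by_centroid(test_lines, reference_lines, buffer_m=25.0):
    """Keep Test lines whose centroid is within buffer_m of a Reference road."""
    selected = []
    for line in test_lines:
        c = line_centroid(line)
        if any(
            point_segment_distance(c, a, b) <= buffer_m
            for ref in reference_lines
            for a, b in zip(ref, ref[1:])
        ):
            selected.append(line)
    return selected

# Hypothetical data: one Reference road and two Test roads.
reference = [[(0.0, 0.0), (1000.0, 0.0)]]
test = [
    [(0.0, 5.0), (1000.0, 8.0)],     # follows the Reference road
    [(500.0, 0.0), (500.0, 400.0)],  # side road, centroid 200m away
]
kept = select_by_centroid(test, reference)
print(len(kept))
```

Because only the centroid is tested, a long feature that partly follows a Reference road can still be dropped, which is why some features had to be split by hand before selection.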
  • 10. Figure 7- Selection by location with centroid enclosed.
The selection by location still left a number of roads within the dataset that needed manual attention; some examples are explained below. Figures 8 and 9 show road features that were selected by the analysis but have no feature in the Reference dataset to be compared against, and were therefore manually deleted.
Figure 8- Feature extending.
Figure 9- No matching feature.
  • 11. In Figure 10 the road highlighted by the red arrow clearly fits within the buffered selection but was not selected because its centroid is outside the buffered area. This was resolved by splitting the feature so that it could be included in the selection. Figure 11 shows roads that extend beyond any element available for comparison; these were manually clipped to the edge of the buffer.
Figure 10- Feature needs inclusion.
Figure 11- Feature needs reducing.
In Figures 12 and 13, the lines highlighted in cyan show two examples where the road had been digitised as a single feature that changed from B road to residential road. This was corrected by splitting the feature within the buffer zone, ensuring that the residential part of the feature would have its centroid outside the buffer and so would not be selected by the subsequent selection by location.
Figure 12- Feature needs splitting.
Figure 13- Feature needs splitting.
  • 12. Both positional quality and completeness were assessed on the Test datasets after they had been data matched with the Reference dataset; the results are therefore relative to the Reference dataset and not to the ever-changing reality on the ground.
2.5 Assessing Positional Accuracy
The primary analysis assessed the positional accuracy of the Test dataset using the increasing buffer method originally developed by Goodchild and Hunter (1997). Buffers were created around the Reference dataset, in a similar manner to Haklay (2010) and Forghani and Delavar (2014). Previous studies have used a standard buffer distance dependent on the type of road, gradually increasing it until the Test roads encapsulated by the buffer reach the 95th percentile. This study improves on this method by using buffer widths derived from imagery analysis of the actual widths of the individual roads (available from an attribute in the Reference dataset). To provide some consistency with other studies, a number of buffers were created, including one that reached the 95th percentile. In all instances the buffer was used to clip the Test dataset, providing a measurement of the road length encapsulated by the buffer.
The second method used to assess the positional accuracy of the Test dataset was the identification of known points in both datasets (Girres and Touya 2010, Helbich et al 2012) and the measurement of the Euclidean distance between them. Junctions were created for both the Reference and Test datasets using the Network Analyst extension in ArcGIS 10.1. After a visual assessment a buffer of 10m was created around the Reference junctions, which was then used to select the junctions from the Test dataset that were coincident with the Reference dataset. A near analysis was then used to obtain the Euclidean distance between the two representations of each junction.
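The increasing buffer measurement can be approximated without a GIS, as in the rough sketch below. It is an illustration under stated assumptions rather than the clip operation used in the study: each Test road is cut into short pieces (the `step_m` parameter is an assumption) and a piece counts as inside when its midpoint lies within the buffer distance of any Reference segment. A single buffer distance is applied to all roads, whereas the study also buffered each road by its own width attribute. All names and coordinates are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def coverage_percent(test_lines, reference_lines, buffer_m, step_m=1.0):
    """Percentage of Test road length lying within buffer_m of the Reference."""
    ref_segments = [(a, b) for ref in reference_lines for a, b in zip(ref, ref[1:])]
    inside = total = 0.0
    for line in test_lines:
        for (ax, ay), (bx, by) in zip(line, line[1:]):
            seg_len = math.hypot(bx - ax, by - ay)
            pieces = max(1, math.ceil(seg_len / step_m))
            piece_len = seg_len / pieces
            for i in range(pieces):
                # Test the midpoint of each short piece against the buffer.
                t = (i + 0.5) / pieces
                mid = (ax + t * (bx - ax), ay + t * (by - ay))
                total += piece_len
                if any(point_segment_distance(mid, a, b) <= buffer_m
                       for a, b in ref_segments):
                    inside += piece_len
    return 100.0 * inside / total if total else 0.0

# Hypothetical data: half of the Test road is offset 4m, half is offset 12m.
reference = [[(0.0, 0.0), (100.0, 0.0)]]
test = [[(0.0, 4.0), (50.0, 4.0)], [(50.0, 12.0), (100.0, 12.0)]]

for buffer_m in (5.0, 10.0, 15.0):
    print(buffer_m, round(coverage_percent(test, reference, buffer_m), 1))
```

Widening the buffer in steps and recording the percentage at each step gives the kind of series reported in the results tables, from which the distance that first captures 95% of the Test length can be read off.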
  • 13. 2.6 Assessing Completeness
Completeness is one of the most difficult qualities of OpenStreetMap to assess, because the OpenStreetMap dataset exists in a continual state of flux as contributors add new features. It is therefore difficult to find a Reference dataset that is as current or complete as the Test dataset. For this reason the Test dataset must be matched to the Reference dataset, and the completeness assessment is therefore a relative assessment (Zielstra and Zipf, 2010). Forghani and Delavar (2014) compared completeness by clipping the data into a 1km² grid and measuring completeness for each 1km² cell as a function of the road length of the Test and Reference datasets within that cell. This may give a 100% completeness rating even if none of the roads in the cell coincide with each other. To overcome this issue, completeness in this study was assessed on the data after it had been matched with the Reference dataset, removing errors of commission. Commission errors are additional roads that have been captured in error and are included in the Test dataset.
To begin this process a 20km x 20km area of a UTM grid was downloaded from the internet and projected to the relevant UTM zone for the town under study. To measure the completeness of the Test dataset against the Reference dataset both datasets were clipped into 400 separate 1km² groups of features. The model in Figure 14 was used at the start of this process to split the 20km x 20km grid into 400 individual 1km² cells.
Figure 14- Model for splitting the Master Grid into individual 1km² cells.
The model in Figure 15 was used to clip both datasets to the 400 individual 1km² cells. The resultant 1km² chips of Reference data were then dissolved and merged back into one dataset of 400 features containing a combined road length for each 1km² cell.
This was repeated for the Test dataset. Both datasets were then spatially joined with the Master Grid so that they inherited the Object ID of the 1km² cell in which they fell. This Object ID was then used to join the Reference and Test datasets to the relevant cell within the Master Grid, and with it the road length allocated to each 1km² cell. A column was then added to the Master Grid attribute table to calculate the Test road length as a percentage of the Reference road length in each cell.
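The per-cell calculation amounts to a simple ratio once the road lengths have been summed by grid cell. The sketch below assumes the lengths are already available as dictionaries keyed by the cell Object ID; the function name and the sample lengths are hypothetical and are not taken from the study data.

```python
def cell_completeness(reference_len_m, test_len_m):
    """Test road length as a percentage of Reference road length per cell.

    Cells with no Reference road are skipped, since the assessment is
    relative to the Reference dataset.
    """
    result = {}
    for cell_id, ref_m in reference_len_m.items():
        if ref_m <= 0:
            continue
        result[cell_id] = 100.0 * test_len_m.get(cell_id, 0.0) / ref_m
    return result

# Hypothetical road lengths in metres, keyed by grid cell Object ID.
reference_len_m = {1: 1200.0, 2: 800.0, 3: 500.0, 4: 0.0}
test_len_m = {1: 1200.0, 2: 400.0}

completeness = cell_completeness(reference_len_m, test_len_m)
for cell_id in sorted(completeness):
    print(cell_id, completeness[cell_id])
```

Cell 3 in this example has Reference roads but no Test roads and therefore scores 0%, which corresponds to the ‘no cover’ class on a completeness map.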
  • 14. Figure 15- Model for clipping the Reference and Test datasets to the 1km grid cells.
As well as looking at each individual 1km² grid cell, it is of interest to look at the tendency of OpenStreetMap to have good coverage in urban centres but limited coverage in more rural areas. To provide a basic assessment of this phenomenon a buffer was created at a distance of 5km from a nominal centre of the urban area, which provided a figure for road length. This road length was then compared with the figure from a buffer 10km from the same nominal centre. Subtracting the length of roads within the 5km buffer from the length within the 10km buffer gave the road length in the band between 5km and 10km (Rural). Figure 16 shows the two buffers used.
Figure 16- Model for clipping the Reference and Test datasets to the 1km² grid cells.
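The subtraction step can be written out as a short worked example. The figures used are the Freetown Reference lengths reported later in Table 8; the cumulative length within 10km (254573m) is not stated in the paper and is derived here as the sum of the first two bands, so it should be read as an assumption. The function names are hypothetical.

```python
def band_lengths(within_5km_m, within_10km_m, total_m):
    """Split cumulative buffer lengths into three distance bands."""
    return {
        "within_5km": within_5km_m,
        "5km_to_10km": within_10km_m - within_5km_m,  # the 'Rural' band
        "outside_10km": total_m - within_10km_m,
    }

def band_shares(lengths_m):
    """Each band as a whole-number percentage of the total length."""
    total = sum(lengths_m.values())
    return {band: round(100.0 * length / total)
            for band, length in lengths_m.items()}

# Freetown Reference dataset (Table 8); 254573 is derived, not reported.
bands = band_lengths(within_5km_m=179138, within_10km_m=254573, total_m=309401)
shares = band_shares(bands)
print(bands)
print(shares)
```

The resulting shares of 58%, 24% and 18% agree with the %Reference row of Table 8.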
  • 15. 3. RESULTS
This section details the results for positional accuracy and completeness for each area independently, starting with Maiduguri in NE Nigeria, followed by Freetown in Sierra Leone, and finally Nairobi in Kenya. The meaning of the results is discussed in the next section.
3.1 Data Matching Maiduguri
Figure 17 shows the result of the data matching process.
Figure 17- Data Matched full length comparison Maiduguri.
3.2 Full Length Comparison Maiduguri
Table 1 shows the coverage of the Test dataset against the coverage of the Reference dataset.
                          Length     Coverage
Total Reference dataset   791801m    N/A
Total Test dataset        509494m    64%
Table 1- Data Matched total length comparison.
  • 16. 3.3 Positional Accuracy Maiduguri (Buffer- Increasing buffer)
The results in Table 2 show the amount of the Test dataset that is encapsulated by the various buffer widths. The first column uses a buffer equal to the road width attribute, representing the actual road width on the ground; the remaining columns use fixed buffer distances. The 95th percentile is first reached at 8.5m.
Buffer   Road Width   5m        8m        8.5m      9m        10m       15m
Test     301543m      436535m   482057m   485284m   487887m   492182m   502523m
%        59.18%       85.68%    94.61%    95.24%    95.75%    96.60%    98.63%
Table 2- Positional accuracy of linear features (Roads).
3.4 Positional Accuracy Maiduguri (Junction Nodes)
Table 3 shows the result of the near analysis between coincident junctions from both datasets.
Nodes     596
Maximum   9.97m
Mean      4.07m
Std Dev   2.25m
Table 3- Positional accuracy of coincident junctions.
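The buffer distance at which the 95th percentile is reached can be checked directly from the lengths in Table 2. The short sketch below uses the Maiduguri Test lengths from Table 2 and the total Test length from Table 1; the function name is hypothetical.

```python
TOTAL_TEST_M = 509494  # total Maiduguri Test length (Table 1)

# Test length encapsulated by each buffer (Table 2), in increasing order.
BUFFER_LENGTHS_M = [
    ("road width", 301543),
    ("5m", 436535),
    ("8m", 482057),
    ("8.5m", 485284),
    ("9m", 487887),
    ("10m", 492182),
    ("15m", 502523),
]

def first_buffer_reaching(lengths, total_m, target_pct=95.0):
    """Return the label of the first buffer covering at least target_pct."""
    for label, length_m in lengths:
        if 100.0 * length_m / total_m >= target_pct:
            return label
    return None

threshold = first_buffer_reaching(BUFFER_LENGTHS_M, TOTAL_TEST_M)
print(threshold)
```

The 8m buffer covers just under 95% of the Test length and the 8.5m buffer just over, so the function returns the 8.5m buffer.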
  • 17. 3.5 Completeness Assessment Maiduguri – Grid
The map in Figure 18 shows the completeness of the Test dataset as a percentage of the Reference dataset.
Figure 18- Percentage of Test data coverage per 1km².
3.6 Completeness Assessment Maiduguri – Urban v Rural
Table 4 shows the reduction in the coverage of the Test dataset as the distance from the centre of town increases.
Distance from Town Centre   Within 5km   5km to 10km   Outside 10km
Test                        307913m      186908m       15282m
%Test                       60%          37%           3%
Reference                   315191m      364577m       187967m
%Reference                  40%          46%           14%
Table 4- Completeness from nominal centre of town.
  • 18. 3.7 Data Matching Freetown
Figure 19 shows the result of the data matching process.
Figure 19- Data Matched full length comparison Freetown.
3.8 Full Length Comparison Freetown
Table 5 shows the coverage of the Test dataset against the coverage of the Reference dataset.
                          Length     Coverage
Total Reference dataset   309401m    N/A
Total Test dataset        304532m    98%
Table 5- Data Matched full length comparison.
  • 19. 3.9 Positional Accuracy Freetown (Buffer- Increasing Buffer)
The results in Table 6 show the amount of the Test dataset that is encapsulated by the various buffer widths. The first column uses a buffer equal to the road width attribute, representing the actual road width on the ground; the remaining columns use fixed buffer distances. The 95th percentile is first reached at 13m.
Buffer   Road Width   5m        10m       12m       13m       14m       15m
Test     118815m      198357m   275728m   285785m   289548m   292178m   294000m
%        39.02%       65.13%    90.45%    93.84%    95.08%    95.95%    96.54%
Table 6- Positional accuracy of linear features (Roads).
3.10 Positional Accuracy Freetown (Junction Nodes)
Table 7 shows the result of the near analysis between coincident junctions from both datasets.
Nodes     252
Minimum   0.31m
Maximum   9.92m
Mean      5.55m
Std Dev   2.56m
Table 7- Positional accuracy of coincident junctions.
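The junction statistics in tables such as Table 7 come from a nearest-neighbour comparison with a 10m cut-off. A minimal sketch of that idea follows; it is not the ArcGIS near analysis itself. The junction coordinates are hypothetical, and the use of the population standard deviation is an assumption, as the paper does not say which form was used.

```python
import math
import statistics

def match_junctions(test_pts, reference_pts, max_dist_m=10.0):
    """Distance from each Test junction to its nearest Reference junction.

    Test junctions further than max_dist_m from every Reference junction
    are treated as having no coincident partner and are left out.
    """
    distances = []
    for tx, ty in test_pts:
        nearest = min(
            (math.hypot(tx - rx, ty - ry) for rx, ry in reference_pts),
            default=None,
        )
        if nearest is not None and nearest <= max_dist_m:
            distances.append(nearest)
    return distances

def summarise(distances):
    """Summary statistics in the layout of the junction tables."""
    return {
        "nodes": len(distances),
        "minimum": min(distances),
        "maximum": max(distances),
        "mean": statistics.mean(distances),
        "std_dev": statistics.pstdev(distances),  # assumption: population form
    }

# Hypothetical junction coordinates in metres.
reference_pts = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0)]
test_pts = [(3.0, 4.0), (100.0, 6.0), (230.0, 0.0)]

distances = match_junctions(test_pts, reference_pts)
summary = summarise(distances)
print(summary)
```

The third Test junction is 30m from its nearest Reference junction and is excluded, so the cut-off caps the maximum distance that can appear in the summary at 10m.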
  • 20. 3.11 Completeness Assessment Freetown
The map in Figure 20 shows the completeness of the Test dataset as a percentage of the Reference dataset.
Figure 20- Percentage of Test data coverage per 1km².
3.12 Completeness Assessment Freetown – Urban v Rural
Table 8 shows the reduction in the coverage of the Test dataset as the distance from the centre of town increases.
Distance from centre of town   Within 5km   5km to 10km   Outside 10km
Test                           176215m      74263m        54810m
%Test                          58%          24%           18%
Reference                      179138m      75435m        54828m
%Reference                     58%          24%           18%
Table 8- Completeness from nominal centre of town.
  • 21. 3.13 Data Matching Nairobi
Figure 21 shows the result of the data matching process.
Figure 21- Data Matched full length comparison Nairobi.
3.14 Full Length Comparison Nairobi
Table 9 shows the coverage of the Test dataset against the coverage of the Reference dataset.
                          Length     Coverage
Total Reference dataset   983357m    N/A
Total Test dataset        919756m    93%
Table 9- Data Matched full length comparison.
  • 22. 3.15 Positional Accuracy Nairobi (Buffer- Increasing buffer)
The results in Table 10 show the amount of the Test dataset that is encapsulated by the various buffer widths. The first column uses a buffer equal to the road width attribute, representing the actual road width on the ground; the remaining columns use fixed buffer distances. The 95th percentile is first reached at 9.5m.
Buffer   Road Width   5m        9m        9.5m      10m       15m
Test     518625m      739168m   873482m   878371m   882414m   902216m
%        56.39%       80%       94.96%    95.49%    95.94%    98.09%
Table 10- Positional accuracy of linear features (Roads).
3.16 Positional Accuracy Nairobi (Junction Nodes)
Table 11 shows the result of the near analysis between coincident junctions from both datasets.
Nodes     688
Minimum   0.22m
Maximum   9.95m
Mean      4.65m
Std Dev   2.29m
Table 11- Positional accuracy of coincident junctions.
  • 23. 3.17 Completeness Assessment Nairobi
The map in Figure 22 shows the completeness of the Test dataset as a percentage of the Reference dataset.
Figure 22- Percentage of Test data coverage per 1km².
3.18 Completeness Assessment Nairobi – Urban v Rural
Table 12 shows how the coverage of the Test dataset changes as the distance from the centre of town increases.
Distance from centre of town   Within 5km   5km to 10km   Outside 10km
Test                           275758m      507531m       136467m
%Test                          30%          55%           15%
Reference                      289076m      540597m       153684m
%Reference                     29%          55%           16%
Table 12- Completeness from nominal centre of town.
  • 24. 4. DISCUSSION
The continued search for a new ‘effective’ method of assessing the quality of OpenStreetMap data is testament to the difficulties faced when assessing a dataset that is continually being contributed to by millions of individuals with varying degrees of skill, knowledge, access and motivation. There is however a pattern evident throughout many of the studies reviewed (Goodchild and Hunter 1997, Haklay and Weber 2008, Haklay 2010, Kounadi 2009, Ather 2009, Ludwig et al 2011, Zhou et al 2014): the use of a Reference dataset with the buffer technique for positional accuracy, and the use of a grid system to visualise the results of a completeness assessment. With this in mind, improvements to this approach would involve improvements to the Reference dataset.
The initial assessment of positional accuracy used the road width attribute as the buffer distance on the Maiduguri Test dataset. This created buffers of varying width, which together encapsulated 59% of the Test dataset. To further assess the positional accuracy the buffer was increased until 95% of the Test dataset was contained by the buffer. Table 2 shows that the buffer at 8.5m contains 95% of the Test dataset. For Freetown, the road width buffer enclosed a mere 39.02% of the Test dataset (Table 6); this poorer positional accuracy continues, with 95% coverage reached only with a buffer of 13m. The final area studied, Nairobi, had only 56.39% enclosed by the road width buffer but reached 95% when the buffer reached 9.5m. Using a method similar to that employed in other studies allows some comparison, although only at a rudimentary level, as differences in data matching and in the quality of the Reference dataset also have an influence.
These results compare well against a study of OpenStreetMap in France carried out by Koukoletsos et al (2011), in which 95.3% of the Test dataset lay inside a buffer of 15.5m. In another study, Zhou et al (2014) used buffer sizes of 11.25m and 7.5m, which encapsulated 99.69% and 88.03% of the Test dataset respectively. Again, these buffer distances were assigned according to road type. Siebritz and Sithole (2014) used a standard buffer of 10m to assess positional accuracy in 9 provinces across South Africa; the proportion enclosed within the buffer varied from 64.8% to 94.3%. This would suggest that the positional accuracy of OpenStreetMap for all of the areas within this study of Africa is similar to, if not better than, that found in other studies, but it also underlines that the assessment is only valid for the area under study.
The second part of the positional accuracy assessment relied on the comparison of coincident road junctions in the Reference and Test datasets. A similar comparison carried out in France (Girres and Touya, 2010) also used pairs of road junctions as comparison points. It found an average error of 6.65m from a sample of 207 pairs, although the errors were concentrated around 2.5m and 10m, showing that the error is not consistent; again, the data could have been uploaded from devices with different inherent positional error. All of the study areas in this study of Africa used significantly more junctions in the comparison, and the results compare favourably with the French study. Maiduguri had an average error of 4.07m from a sample of 596; Nairobi had an average error of 4.65m from a sample of 688; and Freetown had an average error of 5.55m from a sample of 252. The 10m buffer used to match the junctions in this study ensured that outliers did not negatively affect the results of the analysis. These results are also similar to the distances observed by Haklay
  • 25. (2010) in his comparison of OpenStreetMap with Ordnance Survey data, which perhaps lends some credibility to the spatial accuracy of the Reference dataset.
For the completeness of a geospatial dataset to be assessed, the Reference dataset should be as close to reality as possible. As OpenStreetMap is continuously being updated by volunteers, it is hard to measure its completeness without comparing it to the situation on the ground by visual inspection of commercial satellite imagery; this was outside the scope of this study because of both cost and time constraints. The study is therefore limited to a relative assessment, as the Reference dataset is not reality; it was further hindered by the selection of study locations, as high quality datasets are not abundant for areas of Africa. Completeness is measured against the known reality, or against another dataset that is close to it, and is therefore intrinsically tied to the currency of that dataset. There are not many datasets that are close to reality for many areas of Africa, which could be one reason why studies are limited.
Forghani and Delavar (2014) compared completeness for Tehran by clipping the data into a 1km² grid and comparing each individual grid cell for completeness. A problem with this approach is that, without data matching, equal lengths of road in a grid square equate to 100% completeness even if the roads from the two datasets are not coincident. All three areas used in this study covered 20km x 20km, but they differ significantly in the length of primary road present within that area. Zhou et al (2014) carried out a study of three areas of China, measuring completeness by comparing the road lengths of the Reference and Test datasets.
It showed maximum completeness of 60.77%, 32.06% and 28.62% respectively for the three areas studied. The three areas examined in this study of Africa have better completeness than that identified in China. Nairobi had the largest concentration of primary roads, with 983km in the Reference dataset, of which the Test dataset covered 93%. Maiduguri had 792km of primary roads in the Reference dataset, with the Test dataset covering 64%. Freetown had the least primary road, and its Test dataset covered 98% of the 309km in the Reference dataset.
Figure 17 shows the result of the data matching of the roads in Maiduguri. It is evident that completeness decreases as the distance from the centre of the urban area increases. This has been reflected in other studies, although the opposite was found to be true for the US. It is further highlighted in Figure 18, which represents the coverage as a percentage: only 95 of the 1km² cells had 91-100% completeness, with areas of ‘no cover’ or ‘limited cover’ at the extremities of the urban fringe. There are also 132 grid cells which had road features from the Reference dataset but no road features in the Test dataset. This could be because the urban fringe is less important to the local population, or because there are more barriers to contributing there. These barriers could be a lack of GPS enabled platforms, a lack of mobile/internet coverage, or simply a lack of basic computer literacy.
In an attempt to further highlight this trend, the final analysis of completeness measured the percentage of roads contained within a circular band between 5km and 10km from the centre, defined in this study to represent the Rural areas. Table 4 shows that for Maiduguri 60% of the Test roads are within 5km of the centre of town, dropping significantly to 37% in the 5km-10km band; this is not reflected in the Reference dataset, which has 40% within 5km and 46% in the 5km-10km band.
  • 26. Freetown, despite having the worst of the positional accuracy results (95% within 13m), has 98% of the Reference dataset represented in the Test dataset, as can be seen in Figure 19. Table 8 shows that for Freetown 58% of the Test roads are within 5km of the centre of town, dropping significantly to 24% in the 5km-10km band; this is however reflected in the Reference dataset, which also has 58% within 5km and 24% in the 5km-10km band, as would be expected from nearly 100% coherence between the Reference and Test datasets.
Nairobi had by far the most roads in the Reference dataset; despite this, 93% of the Reference dataset was represented by the Test dataset, as can be seen in Figure 21. Table 12 shows that 30% of the Test roads are within 5km of the centre of town, rising significantly to 55% in the 5km-10km band; this is also reflected in the Reference dataset, which has 29% within 5km and 55% in the 5km-10km band. This does not follow the trend seen in the other areas, and on inspection of Figure 22 it is evident that the coverage has no spatial autocorrelation. The difference in coverage in the 5km-10km band could arise because Nairobi is much bigger and the 5km-10km band is still within the urban area. It could also be down to a lack of interest from the general population in this area.
It is interesting to note, and perhaps not surprising, that the study area with the lowest mobile phone and internet coverage, Freetown, had the worst figures for positional accuracy. Contributing to the OpenStreetMap project requires a GPS platform, usually a mobile phone, and an internet connection (Neis and Zielstra, 2014). This could mean that poorer nations across Africa may show results similar to those identified in this study.
It seems ironic that those nations that could most benefit from free spatial data may face too big a hurdle in the form of the ‘Digital Divide’, experiencing ‘participation inequality’ (Neis and Zielstra, 2014).
This study does have a few weaknesses, one of which was the difficulty of defining the actual currency of the Reference dataset, owing to the different imagery sources used. Although consistent throughout in density and positional accuracy, and having a respectable lineage, the Reference data contained fewer roads than the OpenStreetMap Test dataset. Because the OpenStreetMap Test datasets were data matched before the completeness assessment, a lot of information was not studied. This is the case for most reference datasets; as already mentioned, it is hard to stay as current as a database that never sleeps.
6. SUMMARY
Adoption of the OpenStreetMap dataset will ultimately be the measure of its success. Perhaps OpenStreetMap will not lead to the death of the NMA (Devillers et al, 2012), but it may lead to a revision of NMA operations to ensure that they stay relevant in a world that can now be mapped by volunteers, on digital platforms, every minute of the day. The accuracy identified in this study may prompt further study of the continent of Africa, and lends some weight to the adoption of OpenStreetMap data to increase the density of coverage in Government datasets.
  • 27. References
Ather, A. (2009). A Quality Analysis of OpenStreetMap Data. MEng. Thesis. University College London: U.K.
Barron, C., Neis, P., Zipf, A. (2013), ‘A comprehensive Framework for Intrinsic OpenStreetMap Quality Analysis’, Transactions in GIS, Vol6, p.76-106.
Begin, D. (2012), ‘Towards Integrating VGI and National Mapping Agency Operations – A Canadian Case Study’, Role of Volunteer Geographic Information in Advancing Science: Quality and Credibility Workshop, GIScience Conference, September 18, Columbus Ohio.
Begin, D., Devillers, R., Roache, S. (2013), ‘Assessing Volunteered Geographic Information (VGI) Quality Based on Contributors’ Mapping Behaviours’, International Archive of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XL-2/W1, 8th International Symposium on Spatial data Quality 30 May – 1 Jun 13, Hong Kong.
Budhathoki, N, R., Haythornthwaite, C. (2013), ‘Motivation for open Collaboration: Crowd and Community Models and the Case of OpenStreetMap’, American Behavioral Scientist, Vol57, p.548-575.
Carr, N, G. (2007), ‘The ignorance of crowds’, Strategy + Business Magazine, 47: 1-5.
Chrisman, N, R. (1991), ‘The Error Component In Spatial Data’, Geographical Information Systems: Overview Principles and Applications, Eds D J Maguire, M F Goodchild, D W Rhind, Longman, Harrow, Essex p. 165-174.
Ciepluch, B., Mooney, P., Jacob, R. (2011), ‘A comparison of the accuracy of OpenStreetMap for Ireland with Google Maps and Bing Maps’, In the Proceedings of the Ninth International Symposium on Spatial Data Accuracy in Natural Resources and Environmental Science, Leicester, UK, 20-23 July 2010.
Ciepluch, B., Mooney, P., Jacob, R. (2011), ‘Sketches of Generic Framework for Quality Assessment of Volunteered Geographical Data’, IEEE Geoscience and Remote Sensing Society (GRRS), 1-5.
Coleman, D., Georgiadou, Y., Labonte, J.
(2009), ‘Volunteered Geographic Information: the Nature and Motivation of Producers’, Article under Review for the International Journal of Spatial data Infrastructures Research, Special Issue GSDI-11, Submitted 2009.
Corcoran, P., Mooney, P. (2012), ‘The Annotation Process in OpenStreetMap’, Transactions in GIS, 2012, Vol 16(4), p. 561-579.
Corcoran, P., Mooney, P., Bertolotto, M. (2013), ‘Analysing the Growth of OpenStreetMap Networks’, Spatial Statistics, Vol 3, p.21-32.
  • 28. Devillers, R., Begin, D., Vandecasteele, A. (2012), ‘Is the rise of Volunteered Geographic Information (VGI) a sign of the end of National Mapping Agencies as we know them?’, GIScience 2012 Workshop ‘Role of Volunteer Geographic Information in Advancing Science: Quality and Credibility’, Columbus, OH, September 18, 2012.
Dodge, M., Kitchin, R. (2011), ‘Mapping Experience: Crowdsourced Cartography’, Social Sciences Research Network, Vol 4, p.55-80.
Farkas, I. (2009), ‘Multinational Geospatial Co-production Program – Production Worldwide and in Hungary’, Geoscience, Vol 8 N⁰1, 151-157.
Flanagin, A., Metzger, M. (2008), ‘The Credibility of Volunteered Geographic Information’, GeoJournal, Vol 72, p.137-148.
Forghani, M., Delavar, M, R. (2014), ‘A Quality Study of the OpenStreetMap Dataset for Tehran’, ISPRS International Journal of Geo-Information, Vol 3, p. 750-763.
Girres, J., Touya, G. (2010), ‘Quality Assessment of the French OpenStreetMap Dataset’, Transactions in GIS, 2010, Vol 14(4), p. 435-459.
Goodchild, M, F. (2007), ‘Citizens as Sensors: The world of volunteered geography’. GeoJournal, Vol 69, p.211-221.
Goodchild, M, F. (2008), ‘Assertion and Authority: The science of user-generated geographic content’. Proceedings of the Colloquium for Andrew U. Frank’s 60th Birthday, Department of Geoinformation and Cartography, Vienna University of Technology, Vienna, Austria.
Haklay, M., Weber, P. (2008), ‘OpenStreetMap – User-generated Street Map’, IEEE Pervasive Computing, Vol 7, p. 12-18.
Haklay, M. (2010), ‘How good is volunteered geographical information? A Comparative Study of OpenStreetMap and Ordnance Survey datasets’, Environment and Planning B: Planning Design 2010, Vol 37, p.
682-703 Helbich, M., Amelunxen, C., Neis, P., Zipf, A., (2012), ‘Comparative Spatial Analysis of Positional Accuracy of OpenStreetMap and Proprietary Geodata’, accessed online [13 Dec 2014] http://koenigstuhl.geog.uni- heidelberg.de/publications/2010/Helbich/Helbich_etal_AGILE2011.pdf Keen, A. (2007), ‘The Cult of the Amateur: How Todays Internet is Killing Our Culture’, Doubleday, New York, NY, USA. Keßler, C., René Theodore , R., de Groot, A. (2013),‘Trust as a Proxy Measure for the Quality of Volunteered Geographic Information in the Case of OpenStreetMap’. D. Vandenbroucke et al. (eds.), Geographic Information Science at the Heart of Europe, Lecture Notes in Geoinformation and Cartography, DOI: 10.1007/978-3-319-00615-4_2, _ Springer International Publishing Switzerland 2013.
Koukoletsos, T., Haklay, M., Ellul, C. (2012), ‘Assessing Data Completeness of VGI through an Automated Matching Procedure for Linear Data’, Transactions in GIS, Vol 16(4), p. 477-498.
Kounadi, O. (2009), ‘Assessing the Quality of OpenStreetMap Data’, MSc Thesis, University College London, UK.
Ludwig, I., Voss, A., Krause-Traudes, M. (2011), ‘A Comparison of the Street Networks of Navteq and OSM in Germany’, Advancing Geoinformation Science for a Changing World, Lecture Notes in Geoinformation and Cartography, Springer, p. 65-84.
Mooney, P., Sun, H., Corcoran, P., Yan, L. (2011), ‘Citizen Generated Spatial Data and Information: Risks and Opportunities’, Proceedings of the 2nd International Conference on Network Engineering and Computer Science, Xi’an, Shaanxi, China, 23-25 September.
Mooney, P., Corcoran, P. (2012), ‘The annotation process in OpenStreetMap’, Transactions in GIS, Vol 16(4), p. 561-579.
Mooney, P., Corcoran, P. (2012), ‘Characteristics in Heavily Edited Objects in OpenStreetMap’, Future Internet, Vol 4, p. 285-305.
Mullins, J. (2010, Jan), ‘Haiti gets help from net effect’, New Scientist.
Neis, P., Zielstra, D. (2014), ‘Recent Developments and Future Trends in Volunteered Geographic Information Research: The Case of OpenStreetMap’, Future Internet, Vol 6, p. 76-106.
Neis, P., Goetz, M., Zipf, A. (2012), ‘Towards automatic Vandalism detection in OpenStreetMap’, ISPRS International Journal of Geo-Information, Vol 1, p. 315-332.
O’Reilly, T. (2005), ‘What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software’, O’Reilly Media, Cambridge, MA, USA.
President Clinton (2000), ‘Statement by the President regarding the United States’ decision to stop degrading Global Positioning System Accuracy’, Office of the Press Secretary, White House, http://clinton3.nara.gov/WH/EOP/OSTP/html/0053_2.html [accessed 03 Nov 14]
Ramm, F., Topf, J., Chilton, S. (2011), ‘OpenStreetMap: Using and Enhancing the Free Map of the World’, UIT, Cambridge, UK.
Severinsen, J., Reitsma, F. (2013), ‘Finding the Quality in Quantity: Establishing Trust For Volunteered Geographic Information’, SIRC NZ 2013 GIS and Remote Sensing Research Conference, University of Otago, Dunedin, New Zealand, 29-30 August 2013.
Siebritz, L., Sithole, G. (2014), ‘Assessing the Quality of OpenStreetMap in South Africa in Reference to National Mapping Standards’, Proceedings of the 2nd AfricaGEO Conference, Cape Town, South Africa, 1-3 July 2014.
Sehra, S., Singh, S. J., Rai, H. S. (2014), ‘A Systematic Study of OpenStreetMap Data Quality Assessment’, 11th International Conference on Information Technology: New Generations.
Tapscott, D., Williams, A. D. (2007), ‘Wikinomics: How Mass Collaboration Changes Everything’, Portfolio Hardcover, New York.
Zielstra, D., Zipf, A. (2010), ‘A Comparative Study of Proprietary Geodata and Volunteered Geographic Information for Germany’, 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal, 2010.
Zhou, P., Huang, W., Jang, J. (2014), ‘Validation analysis of OpenStreetMap Data in Some Areas of China’, The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XL-4, 2014 ISPRS Technical Commission IV Symposium, 14-16 May 2014, Suzhou, China.