Introducing the SQUAD Tool: A tool for identifying anomalies in large spatial data sets
1. SQUAD Tool
Identifying Anomalies to
Improve the Quality of
Spatial Data Sets
John Spencer
Becky Wilkes
Veronica Escamilla
MEASURE Evaluation
May 31, 2018
11. [Map labels: "Real location"; "Coordinate places it here"]
With poor data quality, you can end
up with facilities seeming like they’re
in places where they aren’t, such as
in lakes or oceans…
12. Or on the other side of the world
[Map labels: "Really it is here"; "Likely not in Denmark"]
13. ?
When data is wrong, it can lead to confusion
that can interfere with adequate provision of
services.
15. Spatial
• Is there a coordinate?
• Is it in an appropriate place?
o Not in a lake
o Not outside the country
• Are coordinates duplicated?
16. Attribute
• Are there duplicate names?
• Missing values?
• Out of range values?
22. Six anomalies that may indicate a
data quality issue
1. Missing coordinate
2. Coordinates truncated
3. Duplicate coordinates
4. Duplicate facility names
5. Site is slightly outside of
expected administrative unit
6. Site is far outside of expected
administrative unit
25. Anomaly 1: Missing coordinate
Problem: No coordinate for site or coordinate of 0,0
Possible solutions
• Review the GPS log or other
records
• Recapture the location on the next
site visit
• Use imagery from ArcGIS, Google
Earth, or another source to locate
the site and get the coordinate
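A minimal sketch of how a missing coordinate could be detected, assuming each site has an X and a Y value that may arrive as a number, a string, or nothing at all. The function name is hypothetical and this is not the SQUAD Tool's own code.

```python
import math


def is_missing_coordinate(x, y):
    """Return True when a site has no usable coordinate or sits at 0,0."""
    for value in (x, y):
        if value is None:
            return True
        if isinstance(value, str) and value.strip() == "":
            return True
    try:
        fx, fy = float(x), float(y)
    except (TypeError, ValueError):
        # Text that cannot be read as a number is treated as missing.
        return True
    if math.isnan(fx) or math.isnan(fy):
        return True
    # A coordinate of exactly 0,0 is treated as a placeholder, per the slide.
    return fx == 0.0 and fy == 0.0
```

Only a pair where both values are zero is flagged; a site with a single zero value (for example, on the equator) is left alone.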
26. Anomaly 2: Coordinates truncated
Problem: Coordinate is missing significant digits
Possible solutions
• Review the GPS log or other
records
• Recapture the location on the
next site visit
• Use imagery from ArcGIS, Google
Earth, or another source to locate
the site and get the coordinate
Example: -6.72, 35.43
Coordinate    Approximate precision
23.1          10 kilometers
23.12         1 kilometer
23.123        100 meters
23.1234       10 meters
23.12345      1 meter
23.123456     10 centimeters
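The precision table follows from the fact that one degree of latitude spans roughly 111 kilometers, so each extra decimal place narrows the position by about a factor of ten. A sketch of a truncation check built on that idea follows. The function names and the `min_places=4` cutoff are assumptions for illustration, not values taken from the SQUAD Tool. The check reads the coordinate as text, because trailing digits are lost once a value is stored as a floating-point number.

```python
KM_PER_DEGREE = 111.0  # approximate length of one degree of latitude


def decimal_places(text):
    """Count the digits after the decimal point in a coordinate string."""
    text = text.strip()
    if "." not in text:
        return 0
    return len(text.split(".", 1)[1])


def approx_precision_m(places):
    """Approximate ground precision in meters for a number of decimal places."""
    return KM_PER_DEGREE * 1000 / (10 ** places)


def is_truncated(x_text, y_text, min_places=4):
    """Flag a coordinate when either value has fewer than min_places decimals."""
    return min(decimal_places(x_text), decimal_places(y_text)) < min_places
```

With the assumed cutoff, the slide's example of -6.72, 35.43 is flagged because both values carry only two decimal places, which corresponds to about 1 kilometer.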
27. Anomaly 3: Duplicate coordinates
Problem: Multiple records with identical coordinates
Possible solutions
• Determine if there are, in fact, two
distinct sites at that location. If there
aren’t two sites, then:
o Review the GPS log or other records
o Recapture the location on the next
site visit
o Use imagery from ArcGIS, Google
Earth, or another source to locate
the site and get the coordinate
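One way to find records that share identical coordinates is to group site identifiers by their coordinate pair. This is a sketch only; the `(site_id, x, y)` tuple layout and the function name are assumptions, not the tool's data model.

```python
from collections import defaultdict


def find_duplicate_coordinates(sites):
    """Group site IDs by (x, y) and keep only coordinates used more than once.

    sites: iterable of (site_id, x, y) tuples.
    """
    groups = defaultdict(list)
    for site_id, x, y in sites:
        groups[(x, y)].append(site_id)
    return {coord: ids for coord, ids in groups.items() if len(ids) > 1}
```

Each group returned still needs a human decision, since two distinct sites can legitimately share one location.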
28. Anomaly 4: Duplicate facility names
Problem: Multiple records with identical names
Possible solutions
• Determine if there are, in fact, two
distinct sites with the same name
o Contact the site directly
o Review documents or reports
to determine the name
Mercy Clinic
Mercy Clinic
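Duplicate names can be found the same way as duplicate coordinates, by grouping records on the name. The sketch below also ignores differences in letter case and spacing, which is an assumption that goes beyond the slide's "identical names"; drop the normalization step to match names exactly. All names here are hypothetical.

```python
from collections import defaultdict


def normalize_site_name(name):
    """Lower-case the name and collapse runs of whitespace (an assumed rule)."""
    return " ".join(name.split()).casefold()


def find_duplicate_names(sites):
    """Return {normalized name: [site IDs]} for names used by several records.

    sites: iterable of (site_id, name) tuples.
    """
    groups = defaultdict(list)
    for site_id, name in sites:
        groups[normalize_site_name(name)].append(site_id)
    return {name: ids for name, ids in groups.items() if len(ids) > 1}
```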
29. Anomaly 5: Site is outside expected
location but is within 2 kilometers
Problem: Site is slightly outside its expected administrative unit
Possible solutions
• Try a different administrative unit
boundary file
• Review records to validate the
administrative unit and GPS
coordinate
• Locate the site using imagery
• Revisit the site
[Figure: map of North District and South District with a site marked 1.8 km across the boundary; table row: Facility Name = Mercy, District = North]
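The distinction between "slightly outside" and "far outside" can be sketched with a point-in-polygon test followed by a distance to the boundary. The 2-kilometer threshold comes from the slides; everything else is an assumption for illustration: the function names, the boundary given as a simple ring of (longitude, latitude) vertices, and the flat-earth distance approximation, which is only reasonable over short distances. How the SQUAD Tool computes this internally is not stated in the source.

```python
import math

EARTH_RADIUS_KM = 6371.0
NEAR_THRESHOLD_KM = 2.0  # threshold taken from the slides


def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def _to_km(lon, lat, ref_lat):
    """Project degrees to local kilometers (equirectangular approximation)."""
    x = math.radians(lon) * EARTH_RADIUS_KM * math.cos(math.radians(ref_lat))
    y = math.radians(lat) * EARTH_RADIUS_KM
    return x, y


def distance_to_ring_km(lon, lat, ring):
    """Shortest approximate distance from the point to any boundary segment."""
    px, py = _to_km(lon, lat, lat)
    best = float("inf")
    n = len(ring)
    for i in range(n):
        ax, ay = _to_km(ring[i][0], ring[i][1], lat)
        bx, by = _to_km(ring[(i + 1) % n][0], ring[(i + 1) % n][1], lat)
        dx, dy = bx - ax, by - ay
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0:
            t = 0.0
        else:
            # Clamp the projection so the nearest point stays on the segment.
            t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
        best = min(best, math.hypot(px - (ax + t * dx), py - (ay + t * dy)))
    return best


def classify_site(lon, lat, ring):
    """Return 'inside', 'slightly_outside' (within 2 km), or 'far_outside'."""
    if point_in_polygon(lon, lat, ring):
        return "inside"
    if distance_to_ring_km(lon, lat, ring) <= NEAR_THRESHOLD_KM:
        return "slightly_outside"
    return "far_outside"
```

A "slightly_outside" result often points to a boundary file that is too generalized rather than to a bad coordinate, which is why trying a different boundary file is listed first among the solutions.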
30. Anomaly 6: Site is not at all near its
expected location
Problem: Site is more than 2 kilometers from its expected location
Possible solutions
• Look for obvious issues in the
coordinate (e.g., X/Y transposed;
typos)
• Review the GPS log, if available
• Locate the site using imagery
• Revisit the site
Nairobi General Hospital
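A transposed X/Y pair is one of the "obvious issues" that can be checked automatically: if the coordinate falls outside the expected area but lands inside it once the two values are swapped, transposition is a likely cause. The sketch below assumes the expected area is given as a bounding box; the function name and the rough box around Nairobi used in the usage lines are illustrative, not authoritative.

```python
def looks_transposed(x, y, bbox):
    """Return True if (x, y) is outside bbox but (y, x) is inside it.

    bbox: (min_x, min_y, max_x, max_y) for the expected administrative unit.
    """
    min_x, min_y, max_x, max_y = bbox

    def in_box(px, py):
        return min_x <= px <= max_x and min_y <= py <= max_y

    return not in_box(x, y) and in_box(y, x)


# Illustrative usage with a rough bounding box (longitude, latitude order).
nairobi_bbox = (36.6, -1.5, 37.1, -1.1)
swapped_record = looks_transposed(-1.29, 36.82, nairobi_bbox)
correct_record = looks_transposed(36.82, -1.29, nairobi_bbox)
```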
32. Prerequisites to use SQUAD
Site Location File
• Unique identifier for
each site
• X/Y coordinate
• Name of site
• Field with administrative
unit*
Administrative Units File
• Name field for administrative
unit*
*Both files should rely on standard names for the administrative unit that use consistent naming
conventions and diacritical marks.
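Before running the tool, it can help to confirm that the administrative unit names in the two files actually match. A minimal sketch, assuming the names have already been read into two lists: it normalizes Unicode so that the same accented letter stored in two different ways compares equal, then reports names from the site file that have no counterpart in the boundary file. The function names and the case-insensitive comparison are assumptions.

```python
import unicodedata


def normalize_unit_name(name):
    """Put the name into Unicode NFC form, trim it, and ignore letter case."""
    return unicodedata.normalize("NFC", name).strip().casefold()


def unmatched_admin_units(site_units, boundary_units):
    """Return site-file unit names with no match in the boundary file."""
    known = {normalize_unit_name(n) for n in boundary_units}
    return sorted({n for n in site_units if normalize_unit_name(n) not in known})
```

Any name this check returns is worth correcting at the source, since a site whose administrative unit cannot be matched cannot be tested against its boundary.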
33. Load the relevant files
and the tool into ArcGIS or QGIS
Load the relevant files
• Load the individual feature
class files for sites and
boundary files
Load the SQUAD Tool
• Add the tool to the
ArcToolbox or use the Plugin
Manager in QGIS
34. Open the tool, provide parameters,
and run the tool
The tool will ask you to indicate the relevant files and fields.
35. Run the tool and review results
Fields added for each anomaly
type
• 1 = Anomaly present
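Once the flag fields exist, the records needing follow-up can be listed by collecting every site with at least one flag set to 1. This sketch assumes the output has been exported to a list of dictionaries; the field names `site_id`, `anom_1`, and so on are hypothetical, because the slides do not name the fields the tool adds.

```python
def records_with_anomalies(records, flag_fields):
    """Return (site_id, [flagged fields]) for records with any flag equal to 1."""
    flagged = []
    for record in records:
        present = [field for field in flag_fields if record.get(field) == 1]
        if present:
            flagged.append((record["site_id"], present))
    return flagged
```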
36. The presence of an anomaly
does not automatically
indicate an error.
Anomalous records do require
investigation, though.
37. The SQUAD Tool is suitable for initial
data quality checks in a large spatial
data set and for routine quality checks.