Our technological reliance on data has increased exponentially over the last decade. As the amount of data society generates grows, so does our reliance on the quality of the data we use. All too often, projects are hampered by the effects of bad data sets. Why is the accuracy of data so important? Is there a threshold for data accuracy? How do we collect or generate accurate data? With the myriad of applications that take advantage of the data we collect, what is an acceptable amount of error? As the era of “Big Data” booms, we should ensure it is built upon the best possible spatial data foundation. This talk looks at collection and management strategies within the utilities field.
2. About me:
Studied GIS and Computer Science (M.Sci.)
9 years of GIS experience
Application Development, Programming, Cartography
Thursday, October 10, 13
3. Field Data Collection
Data Model development
Utilities, Forestry, Agriculture
Collect around 200,000 point locations per year.
Average around 6 million pieces of data to populate, process, and quality-check.
4. What is Data?
Why or How is it useful?
Why do we need accurate data?
What is “accurate data”?
How do we get it?
Topics that I will cover
5. What is Data?
Any piece of information
Qualitative or Quantitative
In GIS it can be Raster or Vector
Large collections form a database
Huge collections of databases become “BIG DATA”
Data can be just about anything
Text, numbers, symbols.
6. Big Data?
Large and complex datasets requiring special tools to
manage and process.
Working its way into every corner of our world.
Finance
Government
Environmental Management
Utilities
Big data is here. If you haven't learned how to handle it, it will bury you.
7. It's not what it is but how you use it...
Data is the foundation of the pyramid.
Every inventory and every piece of data we have collected, and there has been a lot, is part of a foundation that we construct.
This foundation is built upon extremely accurate data so that, as your needs evolve, you know that what you have built will work because
the foundation is rock solid. We know the value of having accurate information: it can save enormous amounts of energy, time, and
money.
9. When is a map more than a map?
Google Maps started in 2005.
The “Hello World” project was simply a pannable,
zoomable map.
10. Added Directions and road networks
Integrated satellite imagery into a hybrid view
Added geocoding functionality
Deeper and more immersive content
Google kept working to update and improve upon their maps.
11. In 2008, started crowdsourcing business information.
Refine searches based on user ratings.
Street View was launched in 2007.
Map Maker allowed user changes to be implemented in near real-time.
Made Google Maps the ubiquitous data source.
2008 - Google began using user input to update its maps, leveraging the drive of other
map users to have accurate data.
12. Street View now
lets you tour inside
some buildings.
You can tour CERN, the Burj Khalifa tower in Dubai, the Grand Canyon, Everest Base Camp,
and the Kennedy Space Center... although now all the doors are locked and the lights are off.
13. You can even go to the Moon!
14. “The never-ending quest for the perfect map.”
Brian McClendon, VP of Engineering, Google Maps
15. Why do we need accurate data?
Efficiency.
Cost Savings.
Data driven world.
16. Utility Industry Example
Electrical Networks are
becoming increasingly
complex and automated.
Outage Management Systems
rely on accurate base data to
localize power disruptions.
Bad data means the lights stay
off longer.
17. Good data:
Reduces operating costs.
Speeds repairs.
Improves maintenance.
Enables smarter capital investments.
18. Having an accurate dataset is crucial to avoid
compounding problems.
With more and more data available, the base data
becomes increasingly important.
Can you make informed decisions with spotty data?
19. So What is Accurate Data?
20. Data accuracy is often difficult to quantify.
Some acceptable error rate is usually established.
For some applications, 90% accuracy will work.
For others, accuracy must be 99.5% or better.
21. Take a large data set of 1 million
values.
95% accuracy
You have 50,000 errors in your data.
The more data you have the more it
becomes an issue.
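The arithmetic on this slide can be sketched in a few lines of Python. The function name expected_errors is made up for illustration; the 90%, 95%, and 99.5% accuracy levels and the 1 million value data set are the figures quoted on the slides.

```python
def expected_errors(total_values: int, accuracy: float) -> int:
    """Return the number of erroneous values implied by an accuracy rate.

    accuracy is a fraction between 0 and 1 (0.95 means 95% accurate).
    """
    return round(total_values * (1 - accuracy))


# Compare the accuracy levels mentioned in the talk on the same data set.
for accuracy in (0.90, 0.95, 0.995):
    errors = expected_errors(1_000_000, accuracy)
    print(f"{accuracy:.1%} accurate -> {errors:,} errors")
```

The 95% case reproduces the 50,000 errors on the slide. Because the error count scales linearly with the size of the data set, the same accuracy rate leaves more bad values to find as the data set grows.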
22. Is 100% Possible?
Cost?
Time?
How do we make the most accurate data we can?
We have worked with numerous data sets that would take more time and money to FIX than it would
cost to just recollect the data.
Some organizations use datasets with accuracy closer to 75% and make multi-million dollar capital
expenditures based on this data.
They attempt to run advanced analysis tools that don't work because the data is so bad.
It is a Band-Aid on a bullet wound.
23. How do we get accurate Data?
24. Where does the GIS data we have come from?
25. Legacy Data is from
digitized paper maps
Converted from Paper to
CAD to GIS
You inherit their accuracy
You are limited by their
completeness
27. Field Data Collection
Data Capture Devices
GPS- Record Accuracy
Photos
Quality Control
Multitude of GPS units, Data input devices.
Photos in your system can be used to create your own Google Street View of an area of
interest.
Quality control of data collection is key. It's long, tedious, and repetitive, but it has to be nearly
perfect.
28. Data Capture Devices
Spatial data created with GPS receivers
Spatial Accuracy?
Signal Strength
Multipathing
Data entry accuracy
Speed and ease of use
Concerns about varying levels of location accuracy.
Data entry is set up as pick lists as often as possible to avoid typing errors.
Fields are limited to specific data types (strings, numbers, etc.).
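As a rough illustration of pick lists and typed fields, the sketch below validates one collected record in plain Python. The field names (pole_id, pole_material, phase, pole_height_ft) and the allowed values are hypothetical; they are not taken from the talk or from any particular data collection product.

```python
# Hypothetical pick lists: field name -> the only values a collector may choose.
PICK_LISTS = {
    "pole_material": {"wood", "steel", "concrete"},
    "phase": {"A", "B", "C"},
}

# Hypothetical typed fields: field name -> the Python type the value must have.
FIELD_TYPES = {
    "pole_id": str,
    "pole_height_ft": float,
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if it is clean)."""
    problems = []
    for field, allowed in PICK_LISTS.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not in the pick list")
    for field, expected_type in FIELD_TYPES.items():
        if not isinstance(record.get(field), expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    return problems


# A typed-in material ("wod") is caught; a picked value could not be misspelled.
record = {"pole_id": "P-101", "pole_material": "wod", "phase": "A", "pole_height_ft": 45.0}
print(validate_record(record))
```

In practice the data capture device would enforce these constraints at entry time; a check like this is only useful as a second line of defense on data that has already been collected.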
30. Quality Control
Verify field data collection completeness.
Topology rules do automated error checking.
Field check the data in a second pass through the field.
Topology is where the work gets done.
Check phasing, linkages
Verify all the fields are properly filled out
Can flag spatial errors
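Two of the checks above, field completeness and linkages, can be approximated in plain Python. This is only a sketch standing in for GIS topology rules, not how any topology engine works internally, and the schema (pole_id, phase, gps_accuracy_m, from_id, to_id) is assumed for illustration.

```python
# Hypothetical schema: fields every collected point must have filled in.
REQUIRED_FIELDS = ("pole_id", "phase", "gps_accuracy_m")


def incomplete_records(points: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (index, missing field names) for each point with empty required fields."""
    flagged = []
    for index, point in enumerate(points):
        missing = [f for f in REQUIRED_FIELDS if point.get(f) in (None, "")]
        if missing:
            flagged.append((index, missing))
    return flagged


def dangling_spans(points: list[dict], spans: list[dict]) -> list[dict]:
    """Return spans whose endpoints reference a pole_id that was never collected."""
    known_ids = {p["pole_id"] for p in points if p.get("pole_id")}
    return [
        s for s in spans
        if s["from_id"] not in known_ids or s["to_id"] not in known_ids
    ]


points = [
    {"pole_id": "P1", "phase": "A", "gps_accuracy_m": 0.5},
    {"pole_id": "P2", "phase": "", "gps_accuracy_m": 0.8},  # phase left blank
]
spans = [
    {"from_id": "P1", "to_id": "P2"},
    {"from_id": "P2", "to_id": "P9"},  # P9 was never collected
]
print(incomplete_records(points))
print(dangling_spans(points, spans))
```

Records flagged this way are candidates for the second field pass rather than confirmed errors.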
31. Thank you!
Ben Metcalfe
Global Mapping Solutions
email: bmetcafe@gmappingsolutions.com
phone: (541) 913-5116