1. 8/20/2010
DATA ACQUISITION and PREPARATION
Engr. Ablao GE517 Geographic Information System
On data input
Data input refers to the process of converting both paper and digital geographic data into a format compatible with and useful to a GIS.
Data input is the bottleneck of GIS operations.
data input is slow, expensive and prone to error
cost of data and its conversion is often up to 80 per cent of the
total GIS cost
data conversion requires careful planning and constant
management
the GIS is only as good as the data that it has at its disposal
Issues in data input
Data sources are varied
topographic and cadastral maps
aerial photography and satellite imagery
field sheets and census information
etc.
Source data are at different scales and map projections
GIS involves encoding of spatial and non-spatial data
Automation of data conversion has been only partly successful, but could revolutionize the process
More and more spatial data are becoming available in digital form
Prior to data input
Definition of data requirements
Operational planning and estimates
Data preparation
Data Input
Editing
Data Sources
Census and Survey Data
May be spatial in character if each item has a spatial reference, allowing its location on the Earth to be identified
Usually in tabular format
Examples: population census, employment data, agricultural census data, marketing data
Aerial Photographs and Satellite Images
First method of remote sensing
A ‘snapshot’ of the Earth at a
particular instant in time
May be used as a background or
base map for other data in a GIS
Provides spatial context and aids
in interpretation
Versatile, relatively inexpensive
and detailed source of data for
GIS
Ground/Land Surveying Data
Using tapes, transits, theodolites, total stations, etc.
Used to collect field data such as coordinates, elevations, and distances
Data collected are in analog format (written down on paper), which still need to be transformed to digital format for use in GIS
GPS (Global Positioning System) Data
Relatively new technique of field data collection
Originally designed for real-time navigation
Can store collected coordinates and associated attribute information, which may be downloaded directly into a GIS database
Accuracy ranges from 100 meters to a few centimeters
Categories of Geographic Data Acquisition
Primary – data collected through first-hand observation
Secondary – data collected by another individual
or organization; most are published data
Primary Raster and Vector Data
Raster Data
satellite images
scanned aerial photographs
Vector Data
Land survey points
GPS observation data
Methods of Data Input
1. Raster Data Acquisition
Scanning
Photogrammetry
Remote sensing
2. Vector Data Acquisition
Manual digitizing
Computer-assisted digitizing
Field surveying
GPS surveying
3. Attribute Data Acquisition
Keyboard entry
Methods of data input
Keyboard entry
Manual or operator-assisted digitizing
Scanning
Photogrammetric methods
Satellite remote sensing systems
Field survey
Satellite positioning systems
Other computer systems
1. Keyboard entry
Keyboard entry is primarily used for entering tabular data into the GIS database.
Typical attribute data sets entered may be:
vegetation classes
polygon identifiers
soil types
topographic detail
The form of this data may be:
numeric
alpha-numeric
logical
2.a. Manual digitizing
Conventional digitizing is the manual process of converting geographic map
data into digital form.
The digitizing process is as follows:
Map is placed on a flat digitizing tablet and affixed using tape.
Operator identifies control (tic) points which have known geographic locations.
Usually four or more points are identified.
Operator digitizes the control points by moving the cursor to each location and
then activating the digitizer by pressing a button.
Software then performs calibration to enable any features digitized to be
transformed into true geographic coordinates.
Map features are then digitized by tracing their boundaries and activating the
digitizer as required.
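The calibration step above can be sketched as a least-squares fit from tablet coordinates to ground coordinates. The slides do not say which transformation the software applies; an affine transformation is assumed here, and the function names and tic coordinates are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system a.x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy; inputs are not modified
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_affine(tablet_pts, ground_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f (assumed affine model)."""
    ata = [[0.0] * 3 for _ in range(3)]
    atx = [0.0] * 3
    aty = [0.0] * 3
    for (x, y), (gx, gy) in zip(tablet_pts, ground_pts):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atx[i] += row[i] * gx
            aty[i] += row[i] * gy
    # Normal equations, solved once for X and once for Y
    return solve3(ata, atx), solve3(ata, aty)


def to_ground(coeffs, pt):
    """Transform one digitized tablet point into ground coordinates."""
    (a, b, c), (d, e, f) = coeffs
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical tic points: tablet coordinates in cm, ground coordinates in m
tics_tablet = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
tics_ground = [(500000.0, 1600000.0), (505000.0, 1600000.0),
               (505000.0, 1605000.0), (500000.0, 1605000.0)]
coeffs = fit_affine(tics_tablet, tics_ground)
print(to_ground(coeffs, (5.0, 5.0)))
```

With four or more tic points the fit is overdetermined, so the residuals at the tic points give a rough indication of control point error.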
Digitizing modes
Point mode - the operator activates the button each time they want a
location recorded
Stream mode - a continuous stream of coordinates is recorded during the digitizing process with no need to activate a record button. The rate of sampling can be controlled by time or distance intervals.
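Distance-controlled sampling in stream mode can be illustrated as follows. This is a minimal sketch, not the logic of any particular digitizer: the function name, the cursor positions and the 1.0-unit interval are all hypothetical.

```python
import math


def sample_by_distance(stream, min_dist):
    """Keep a cursor position only when it is at least min_dist from the last kept one."""
    kept = []
    for pt in stream:
        if not kept or math.dist(kept[-1], pt) >= min_dist:
            kept.append(pt)
    return kept


# Hypothetical cursor positions reported by the tablet, in tablet units
cursor_stream = [(0.0, 0.0), (0.2, 0.0), (0.5, 0.0), (1.1, 0.0), (1.3, 0.0), (2.2, 0.0)]
recorded = sample_by_distance(cursor_stream, 1.0)
print(recorded)
```

A smaller interval records more vertices and follows curves more closely, at the cost of larger files.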
Digitizing accuracy
Sources of errors: original map errors, internal digitizer errors, operator errors and control point errors.
Digitizing accuracy = 1 mm x map scale denominator
e.g., at 1:50,000, digitizing accuracy = 1 mm x 50,000 = 50 m
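The rule of thumb above can be applied to other map scales with a short calculation. The function name is hypothetical, and the 1 mm tablet error is the value assumed by the slide.

```python
def digitizing_accuracy_m(scale_denominator, tablet_error_mm=1.0):
    """Ground distance in metres represented by the tablet error at a given map scale."""
    return tablet_error_mm * scale_denominator / 1000.0  # mm on the ground -> m


for scale in (10_000, 50_000, 250_000):
    print(f"1:{scale:,} -> {digitizing_accuracy_m(scale):g} m")
```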
2.b. Operator-assisted digitizing
Also known as heads-up digitizing because the operator works with his head up looking at the screen rather than with his head down following the cursor on a digitizing tablet.
It is said to be 10 times faster than manual digitizing.
The process is as follows:
A map is scanned into a computer system and resides in
raster format.
The map is displayed on the screen for the operator to use as
a reference.
The operator moves the cursor to a position at the start of a
line or contour and activates the computer software.
The computer software takes over and converts the raster
data to vector by following the pixels until there is a break.
When it stops, the operator moves the cursor to a new line.
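The pixel-following step can be sketched on a small binary raster. This is an illustration only: it assumes the scanned line is already one pixel wide, and the function name, grid and seed position are hypothetical. Commercial line-following software is considerably more elaborate.

```python
def follow_line(grid, start):
    """Trace a one-pixel-wide line from start until no unvisited line pixel adjoins it."""
    rows, cols = len(grid), len(grid[0])
    # Edge neighbours are tried before diagonal ones, so straight runs are preferred
    offsets = [(0, 1), (1, 0), (0, -1), (-1, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]
    path = [start]
    visited = {start}
    r, c = start
    while True:
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 1 and (nr, nc) not in visited):
                r, c = nr, nc
                path.append((r, c))
                visited.add((r, c))
                break
        else:
            # No unvisited neighbour: a break in the line, control returns to the operator
            return path


# Hypothetical scanned raster: 1 = line pixel, 0 = background
grid = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(follow_line(grid, (0, 0)))  # (row, col) vertices of the traced line
```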
3. Scanning
Scanning is an automated process of converting from paper-based products to digital formats.
In the scanning process:
a map is passed through a scanning system which has a number of scanning detection units;
the detection units “detect” the light reflected from features on the map;
the reflected light is converted to a reflectance value;
the image can then be converted to vector format and edited
Scanner types:
pass-through
drum (normally used for large maps)
flat bed (normally used for small maps)
aperture card
Characteristics of scanners
data format - gray scale or color
resolution - generally given in dots-per-inch (dpi)
scan speed - a function of scanner memory and transfer rate to
storage device
thresholding - ability to control the scanner’s sensitivity to various
features and colors
Note on scanning
Requires expensive scanners
Requires large hard disk space
Requires powerful workstations
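The disk space requirement follows directly from map size, resolution and data format. The sketch below estimates the uncompressed size of one scan; the function name, the 24 x 36 inch sheet and the 400 dpi resolution are illustrative assumptions, and real file formats may compress the image.

```python
def scan_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scanned sheet: pixel count times bits per pixel, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Hypothetical 24 x 36 inch map sheet scanned at 400 dpi
for label, bits in (("1-bit line art", 1), ("8-bit gray scale", 8), ("24-bit color", 24)):
    size_mb = scan_size_bytes(24, 36, 400, bits) / 1_000_000
    print(f"{label}: {size_mb:.1f} MB")
```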
4. Photogrammetry and remote sensing
Both are concerned with
collecting geographic data using
remote means.
Planimetric and topographic
information are usually derived
from aerial photographs.
Land cover and other information
are usually derived from satellite
imagery.
Scanned aerial photographs and
remotely sensed data are in
digital raster format already.
5. Field surveys
Traditionally, field measurements are made by
surveyors or field staff who use specialized equipment
and procedures for gathering geographic data.
Field measurements usually include:
measurements of distance and direction
measurements in both horizontal and vertical planes
Measurements can be made with:
compasses, transits and theodolites (for direction)
tapes, chains and distance meters (for distance)
levels (for elevation)
GPS (all of the above)
6. Satellite-based positioning systems
GPS (United States), GLONASS (Russia) and Galileo (Europe) are three civilian satellite positioning systems that are operational at present.
Primarily developed for military applications, the American and Russian systems are subject to degradation for civilian use.
Receivers cost anywhere from US$1,000 to US$100,000 and give accuracy from 100 m to 1 cm (using sophisticated GPS data processing techniques).
7. Electronic Data Transfer
Used when data is
already in digital form
Usually followed by data
conversion, particularly
when the transferred
data is in a different
format than what is
required
Some local data that are available include:
Municipal boundaries, 1:250,000 (P70,000)
100-m contours, 1:250,000 (P50,000)
Barangay boundaries (from NSO)
Data Editing
Errors and inaccuracies during data acquisition and
input translate into errors in the GIS
Before further analyses are made, these errors should be corrected to prevent them from propagating to generated information
Errors in data input
Entity error
missing entities
incorrectly-placed entities
disordered entities
Attribute error
using the wrong code for an attribute
misspellings
Entity-attribute agreement (logical consistency) error
correct code is linked to the wrong entity
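Attribute errors such as wrong codes and misspellings can often be caught by checking each entry against a list of valid codes. The sketch below is one way to do this; the soil code table, record layout and function name are hypothetical. It cannot catch entity-attribute agreement errors, because a valid code attached to the wrong entity still passes the check.

```python
# Hypothetical lookup table of valid soil type codes
VALID_SOIL_CODES = {"CL": "clay loam", "SL": "sandy loam", "SC": "silty clay"}


def find_attribute_errors(records, valid_codes):
    """Return (polygon_id, code) for every record whose code is missing or not in the lookup."""
    return [(pid, code) for pid, code in records if code not in valid_codes]


# Hypothetical keyed-in attribute records: (polygon identifier, soil code)
records = [(101, "CL"), (102, "S1"), (103, None), (104, "SC")]
errors = find_attribute_errors(records, VALID_SOIL_CODES)
print(errors)
```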
Entity errors
Digitized data are free of entity errors when:
All entities that should have been entered are present.
No extra entities have been digitized.
The entities are in the right place and are of the correct shape and
size.
All entities that are supposed to be connected to each other are.
All polygons have only a single label point to identify them.
All entities are within the outside boundary identified with
registration marks.
Entity errors
Pseudo-nodes
Dangling nodes (undershoots and overshoots)
Missing labels and too many labels
Sliver polygons
“Weird” polygons
Attribute errors
Missing attributes
Incorrect attribute values
Other problems
Projection changes
Edge matching
Rubber sheeting
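Dangling nodes and pseudo-nodes can be flagged by counting how many line ends meet at each node. This is a minimal sketch under stated assumptions: lines are lists of (x, y) vertices, coincident ends have identical coordinates (already snapped), and the function name and sample lines are hypothetical. A pseudo-node on a closed island boundary is normally acceptable, so flagged nodes still need operator review.

```python
from collections import Counter


def classify_nodes(lines):
    """Count the line ends meeting at each node; 1 end = dangling node, 2 ends = pseudo-node."""
    degree = Counter()
    for line in lines:
        degree[line[0]] += 1    # start node
        degree[line[-1]] += 1   # end node
    dangling = sorted(n for n, d in degree.items() if d == 1)
    pseudo = sorted(n for n, d in degree.items() if d == 2)
    return dangling, pseudo


# Hypothetical digitized lines
lines = [
    [(0, 0), (5, 0)],
    [(5, 0), (5, 5)],
    [(5, 0), (10, 0)],
    [(10, 0), (10, 5)],
    [(5, 5), (5, 9)],
]
dangling, pseudo = classify_nodes(lines)
print("dangling:", dangling)
print("pseudo:", pseudo)
```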
Joining Adjacent Layers
Needed when there are multiple map sheets to be
used
Ensures that all layers form a continuous geographic
database when joined together
Data Conversion
After input and editing of individual datasets, it is usually necessary to process the data before integrating them all into a single GIS
Process of converting data from one form to a more useful format for the specific GIS application
One of the most tedious, time-consuming, and error-
prone processes in GIS
Raster to Vector Conversion
Vectorization
Converting scanned raster images to vector features
(point, line, or polygons)
Results are visually problematic most of the time
Raster Line Thinning
Also called ‘skeletonizing’
Process of reducing raster linear features to unit (one-pixel) width
Line Smoothing
Employed to make the resulting
vectors more visually appealing
During raster to vector conversion, the results are usually jagged/crooked (especially for diagonal lines)
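One simple way to smooth a jagged vector is to replace each interior vertex with the average of itself and its two neighbours. The slides do not name a smoothing method; a three-point moving average is assumed here for illustration, and the function name and stair-step line are hypothetical.

```python
def smooth_line(points, passes=1):
    """Three-point moving average over interior vertices; the end points stay fixed."""
    pts = list(points)
    for _ in range(passes):
        if len(pts) < 3:
            break  # nothing to average
        inner = [
            ((pts[i - 1][0] + pts[i][0] + pts[i + 1][0]) / 3.0,
             (pts[i - 1][1] + pts[i][1] + pts[i + 1][1]) / 3.0)
            for i in range(1, len(pts) - 1)
        ]
        pts = [pts[0]] + inner + [pts[-1]]
    return pts


# Hypothetical stair-step output from vectorizing a diagonal raster line
jagged = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
for x, y in smooth_line(jagged, passes=2):
    print(f"{x:.2f}, {y:.2f}")
```

More passes give a smoother line but pull vertices further from their digitized positions, so smoothing trades positional accuracy for appearance.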
Vectorization Methods
1. Manual – user selects and picks out features to be converted
2. Automatic – entire raster image is converted by the computer software without user intervention
3. Semi-automatic – combination of manual point picking and computerized line tracing – produces best results
Raster to Vector Conversion
Changing raster images into vector
graphics
May be done manually, automatically, or semi-automatically
Major limiting factor is the map quality
Graphical Data Editing
Cleaning graphics by removing data conversion errors
Attribute Data Tagging
Adding attribute data (e.g., feature identifiers, feature codes, and contour labels) to the graphical data
Vector to Raster Conversion
Rasterization
process of converting vector data (points, lines and
polygons) into raster data (series of cells each with a
discrete value)
Produces visually satisfactory results
May be problematic in terms of the attributes assigned
to pixels
Most evident along edges/boundaries (partial cells)
Rasterization of Lines
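A line can be rasterized by selecting the cells that lie closest to it between its two end cells. The sketch below uses Bresenham's line algorithm, a common choice that is assumed here because the slides do not name a method. End points are given as integer (column, row) cell indices, and the function name is hypothetical.

```python
def rasterize_line(c0, r0, c1, r1):
    """Return the (col, row) cells approximating a line, using Bresenham's algorithm."""
    cells = []
    dc, dr = abs(c1 - c0), abs(r1 - r0)
    step_c = 1 if c0 < c1 else -1
    step_r = 1 if r0 < r1 else -1
    err = dc - dr
    c, r = c0, r0
    while True:
        cells.append((c, r))
        if (c, r) == (c1, r1):
            return cells
        e2 = 2 * err
        if e2 > -dr:   # step along columns
            err -= dr
            c += step_c
        if e2 < dc:    # step along rows
            err += dc
            r += step_r


cells = rasterize_line(0, 0, 5, 2)
print(cells)
```

Each cell receives a single value, so the stepped pattern produced for diagonal lines is unavoidable at a given cell size.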
Data Integration
Combining data from various
sources and in various formats to
be able to extract more/better
information
Two types of spatial data integration:
1. Horizontal Integration – ‘tiling’; merging of adjacent data sets
2. Vertical Integration – map overlay; stacking of data sets/layers
Examples of Adjustments Required for Data Integration
Mathematical Transformations – translation, scaling, rotation, or skewing
Rectification – rearrangement of the location of objects to correspond to a specific
(geodetic) reference system
Registration – rearrangement of the location of objects of one set so they correspond with
those of another, without referring to a specific reference system
Rubber Sheeting – data set/layer is differentially ‘stretched’ so that tic points on the layer are moved to approximate the location of the corresponding ground control points or corresponding tic points in another layer
Edge Matching – employed to properly connect or line-up corresponding features in
adjacent map sheets to create a seamless model
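The four mathematical transformations listed above can each be written as a small function on an (x, y) coordinate. This is a minimal sketch: the function names are hypothetical, rotation is counter-clockwise about the origin, and skewing is shown along the x axis only.

```python
import math


def translate(pt, dx, dy):
    """Shift a point by (dx, dy)."""
    return (pt[0] + dx, pt[1] + dy)


def scale(pt, sx, sy):
    """Scale a point about the origin by factors sx and sy."""
    return (pt[0] * sx, pt[1] * sy)


def rotate(pt, degrees):
    """Rotate a point counter-clockwise about the origin."""
    t = math.radians(degrees)
    x, y = pt
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))


def skew_x(pt, k):
    """Shear a point along the x axis in proportion to its y coordinate."""
    return (pt[0] + k * pt[1], pt[1])


p = (1.0, 2.0)
print(translate(p, 10, 20))
print(scale(p, 2, 3))
print(rotate((1.0, 0.0), 90))
print(skew_x(p, 0.5))
```

Rubber sheeting differs from these in that the adjustment varies from place to place across the layer, so it cannot be expressed as one set of parameters applied to every point.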
[Figure: Mathematical Transformations (translation, scaling, rotation, skewing)]
[Figure: Rubber Sheeting (differential scaling; map locations in a GIS file matched to ground control)]