Preparation and Presentation of Geospatial
Data in Maps
AFM Tariqul Islam
Scientific Officer, BARI, Gazipur
E-mail: afmtareq@gmail.com
What is Spatial Data?
Spatial Data – data or information that identifies the
geographic location of features and boundaries on Earth,
such as natural or constructed features, oceans, and more.
Spatial data is usually stored as coordinates and topology,
and it can be mapped.
Types of SPATIAL DATA
• RASTER
• VECTOR
• Real World
Source: Defense Mapping School
National Imagery and Mapping Agency
Raster and Vector Data Models
[Figure: the same real-world scene (river, house, trees) shown as a
Vector Representation on X and Y axes (100 to 600) and as a Raster
Representation on a 10 x 10 grid of coded cells (B, G, BK).]
Source: Defense Mapping School
National Imagery and Mapping Agency
Vector Data
Vector data provide a way to represent real world features
within the GIS environment. A vector feature has its shape
represented using geometry. The geometry is made up of one
or more interconnected vertices. A vertex describes a position
in space using x, y and optionally z coordinates. In the vector data
model, features on the earth are represented as:
• points
• lines / routes
• polygons / regions
• TINs (triangulated irregular networks)
Vector Data
This system of recording features is based on the interaction
between arcs and nodes, represented by points, lines and
polygons. A point is a single node, a line is two nodes with an
arc between them, and a polygon is a closed group of three
or more arcs. With these three elements, it is possible to
record almost all necessary information.
Points Lines Polygons
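A minimal sketch of how points, lines and polygons can be held as lists of vertices, assuming plain (x, y) tuples in map units. The coordinates and the helper names line_length and polygon_area are illustrative and do not come from any GIS package.

```python
from math import hypot

# A vertex is an (x, y) pair; the optional z value is omitted in this sketch.
point = (300.0, 200.0)                                    # e.g. a house
line = [(100.0, 100.0), (250.0, 300.0), (400.0, 350.0)]   # e.g. a river
polygon = [(100.0, 100.0), (500.0, 100.0),
           (500.0, 400.0), (100.0, 400.0)]                # e.g. a field


def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula.

    The ring is closed here by repeating the first vertex at the end.
    """
    closed = vertices + vertices[:1]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    return abs(twice_area) / 2


print(line_length(line))
print(polygon_area(polygon))
```

The point needs one vertex, the line an ordered list of vertices, and the polygon an ordered list that closes back on its first vertex.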
Vector Data
Advantages: Data can be represented at its original resolution
and form without generalization. Graphic output is usually
more aesthetically pleasing (traditional cartographic
representation). Since most data, e.g. hard copy maps, is in
vector form, no data conversion is required. Accurate
geographic location of data is maintained.
Disadvantages: The location of each vertex needs to be stored
explicitly. For effective analysis, vector data must be
converted into a topological structure. This is often processing
intensive and usually requires extensive data cleaning. As
well, topology is static, and any updating or editing of the
vector data requires re-building of the topology.
Raster Data
Raster Data – cell-based data such as aerial imagery and
digital elevation models. Raster data is characterized by pixel
values. Basically, a raster file is a giant table, where each pixel
is assigned a specific value, e.g. from 0 to 255 in an 8-bit raster.
The meaning behind these values is specified by the user – they can
represent elevation, temperature, hydrology, etc.
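The "giant table" idea can be sketched with a nested list, assuming a tiny 4 x 5 raster of 8-bit values. The numbers are made up, and cell_value is a hypothetical helper, not a library function.

```python
# A tiny raster stored as rows of 8-bit cell values (0 to 255).
# What the values mean (elevation, temperature, ...) is up to the user.
raster = [
    [12, 15, 20, 22, 25],
    [14, 18, 30, 35, 40],
    [16, 25, 60, 80, 90],
    [20, 40, 120, 200, 255],
]


def cell_value(grid, row, col):
    """Look up one cell by its row and column index (both start at 0)."""
    return grid[row][col]


n_rows, n_cols = len(raster), len(raster[0])
lowest = min(min(row) for row in raster)
highest = max(max(row) for row in raster)

print(n_rows, n_cols, lowest, highest)
print(cell_value(raster, 2, 3))
```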
Raster Data
Raster data are good at:
• representing continuous data (e.g., slope, elevation)
• representing multiple feature types (e.g., points, lines, and
polygons) as single feature types (cells)
• rapid computations ("map algebra") in which raster layers
are treated as elements in mathematical expressions
• analysis of multi-layer or multivariate data (e.g., satellite
image processing and analysis)
• hogging disk space
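Map algebra, as described above, can be sketched as a cell-by-cell operation on two layers of the same shape. This assumes rasters stored as nested lists; map_algebra is an illustrative name and the rainfall values are invented.

```python
def map_algebra(layer_a, layer_b, op):
    """Apply op(a, b) to every pair of matching cells in two raster layers."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b))
    if not same_shape:
        raise ValueError("layers must have the same number of rows and columns")
    return [[op(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


# Two hypothetical monthly rainfall layers (mm) covering the same cells.
rain_jan = [[10, 20], [30, 40]]
rain_feb = [[5, 15], [25, 35]]

total = map_algebra(rain_jan, rain_feb, lambda a, b: a + b)
mean = map_algebra(rain_jan, rain_feb, lambda a, b: (a + b) / 2)

print(total)
print(mean)
```

Because every cell lines up by position, the layers behave like operands in an ordinary arithmetic expression.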
Raster Data
Advantages: The geographic location of each cell is implied
by its position in the cell matrix. Accordingly, other than an
origin point, e.g. bottom left corner, no geographic
coordinates are stored. Due to the nature of the data storage
technique data analysis is usually easy to program and quick
to perform. The inherent nature of raster maps, e.g. one
attribute maps, is ideally suited for mathematical modeling
and quantitative analysis. Grid-cell systems are very
compatible with raster-based output devices, e.g.
electrostatic plotters, graphic terminals.
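The implied location of a cell can be sketched as follows, assuming a bottom-left origin, square cells, and rows counted upward from the bottom. The function name cell_center and the sample coordinates are hypothetical.

```python
def cell_center(origin_x, origin_y, cell_size, row, col):
    """Return the map coordinates of the center of cell (row, col).

    Assumes the origin is the bottom-left corner of the grid, row 0 is the
    bottom row, and col 0 is the leftmost column.
    """
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y + (row + 0.5) * cell_size
    return x, y


# A 30 m grid whose bottom-left corner sits at (500000, 2600000).
print(cell_center(500000.0, 2600000.0, 30.0, 0, 0))
print(cell_center(500000.0, 2600000.0, 30.0, 2, 3))
```

Only the origin and the cell size are stored; every other coordinate is computed on demand. Many raster formats place the origin at the top-left corner instead, in which case the y term would be subtracted.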
Raster Data
Disadvantages: The cell size determines the resolution at
which the data is represented. It is especially difficult to
adequately represent linear features, depending on the cell
resolution. Accordingly, network linkages are difficult to
establish. Processing of associated attribute data may be
cumbersome if large amounts of data exist. Raster maps
inherently reflect only one attribute or characteristic for an
area. Most output maps from grid-cell systems do not
conform to high-quality cartographic needs.
What is GIS?
• A method to visualize, manipulate, analyze, and display spatial
data
• “Smart Maps” linking a database to the map
GIS data formats (files)
• Shapefiles
• Coverages
• TIN (e.g. elevation can be stored as TIN)
– Triangulated Irregular Network
• Grid (e.g. elevation can be stored as Grid)
• Image (e.g. elevation can be stored as image)
Vector data: shapefiles, coverages, TIN
Raster data: grid, image
Shapefiles
• Nontopological
• Advantages: no overhead to process topology
• Disadvantages: polygons are double digitized,
no topologic data checking
• At least 3 files: .shp, .shx, .dbf
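Since a shapefile is really a set of at least three files, a quick completeness check can be sketched as below. The helper names and the path data/districts.shp are hypothetical; only the standard library pathlib module is used.

```python
from pathlib import Path

# The three mandatory parts named above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def required_parts(shp_path):
    """List the sibling files that must accompany the given .shp path."""
    base = Path(shp_path)
    return [base.with_suffix(ext) for ext in REQUIRED_EXTENSIONS]


def missing_parts(shp_path):
    """Return the required files that do not exist on disk."""
    return [part for part in required_parts(shp_path) if not part.exists()]


# Hypothetical path, used for illustration only.
for part in required_parts("data/districts.shp"):
    print(part.name)
```

An empty list from missing_parts suggests the data set is complete enough to open; it does not validate the file contents.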
Data Collection
• Can be most expensive GIS activity
• Many diverse sources
• Two broad types of collection
– Data capture (direct collection)
– Data transfer
• Two broad capture methods
– Primary (direct measurement)
– Secondary (indirect derivation)
Data Collection Techniques
            Field/Raster                     Object/Vector
Primary     Digital remote sensing images    GPS measurements, including VGI
            Digital aerial photographs       Survey measurements
Secondary   Scanned maps                     Topographic surveys
            DEMs from maps                   Toponymy data sets from atlases
Primary Data Capture
• Capture specifically for GIS use
• Raster – remote sensing
– e.g., SPOT and IKONOS satellites and aerial photography,
echosounding at sea
– Passive and active sensors
• Resolution is key consideration
– Spatial
– Spectral, Acoustic
– Temporal
Vector Primary Data Capture
• Surveying
– Locations of objects determined by angle and distance
measurements from known locations
– Uses expensive field equipment and crews
– Most accurate method for large scale, small areas
• GPS
– Collection of satellites used to fix actual locations on
Earth’s surface
– Differential GPS used to improve accuracy
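The surveying idea of fixing a location by angle and distance from a known point can be sketched as follows. This assumes a flat, projected coordinate system and an azimuth measured clockwise from grid north; the function name locate is hypothetical.

```python
from math import sin, cos, radians


def locate(known_x, known_y, azimuth_deg, distance):
    """Coordinates of a new point from a known point, an angle and a distance.

    azimuth_deg is measured clockwise from grid north; distance is in the
    same units as the coordinates.
    """
    azimuth = radians(azimuth_deg)
    new_x = known_x + distance * sin(azimuth)
    new_y = known_y + distance * cos(azimuth)
    return new_x, new_y


# A point 100 m due east (azimuth 90 degrees) of the station at (1000, 2000).
print(locate(1000.0, 2000.0, 90.0, 100.0))
```

Real survey reductions also correct for slope, instrument height and earth curvature, none of which is modeled here.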
Secondary Geographic Data Capture
• Data collected for other purposes, then
converted for use in GIS
• Raster conversion
– Scanning of maps, aerial photographs, documents,
etc.
– Important scanning parameters are spatial and
spectral (bit depth) resolution
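The two scanning parameters can be related to numbers with a short calculation. This is a sketch under the assumption that the map scale is known and the scanner resolution is given in dots per inch; the function names are illustrative.

```python
METERS_PER_INCH = 0.0254


def ground_pixel_size(scale_denominator, dpi):
    """Ground distance in meters covered by one scanned pixel.

    scale_denominator is the N in a 1:N map scale; dpi is the scanner's
    spatial resolution in dots per inch.
    """
    return scale_denominator * METERS_PER_INCH / dpi


def gray_levels(bit_depth):
    """Number of distinct values a pixel can store at a given bit depth."""
    return 2 ** bit_depth


print(ground_pixel_size(50000, 300))  # 1:50,000 map scanned at 300 dpi
print(gray_levels(8))
```

Doubling the dpi halves the ground size of each pixel and quadruples the number of pixels to store.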
Map Types
• Different demands require different types of maps
– Dependent on the data being used.
• Different maps can have many symbols, or only one
symbol.
– Depends on what you’re trying to show.
• Maps might use
– Nominal data- names or identifies objects
– Categorical data- separates data into groups or classes
– Ordinal data- separates data based on quantitative rank
– Numerical data- data based on numbers with a standard
interval between them
Nominal data
• Data identified or named by some type of label
– Can be text or number
• Maps often have many objects, almost all of
which are points, lines or polygons that are
identified as some unique feature
– Points may be a city or house
– Lines may be rivers, faults, railroads, roads, etc.
– Polygons may be parks, states, counties, countries,
etc.
Ordinal data
• Data are grouped by rank according to some
quantitative measure
– Cities may be small, medium or large
– Students may earn an A, B, C or D in class
– Soils may have I, II, III, IV infiltration
• The data must be represented by unique values
maps, and colors must portray an increasing
sense of value
A geological map is a Unique Values Map based on categorical data
representing different formations, or other geological units
Numerical data
• Numbers that represent continuous phenomena that fall along a
regularly spaced interval
– Rainfall, elevations, populations, chemical concentrations, etc.
• Equal changes in the interval involve equal changes in the thing being
measured
• Ratio vs Interval numerical data
– Ratio measured with respect to some meaningful zero point
• Ex.- Rainfall; if zero, then no rain has fallen
• Can add, subtract, multiply and divide these data
– Interval measured against no meaningful zero point
• Ex.- Temperature in F or C scale; has a regular scale, but zero on the
thermometer does not mean a total lack of temperature.
• Any data that can have a negative value is Interval (e.g. elevation)
• Interval can only support addition and subtraction
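The ratio versus interval distinction can be checked numerically: ratios of ratio data survive a change of units, while ratios of interval data do not. The sketch below uses the standard mm-to-inch and Celsius-to-Fahrenheit conversions; the sample values are invented.

```python
def mm_to_inches(mm):
    return mm / 25.4


def c_to_f(celsius):
    return celsius * 9 / 5 + 32


# Ratio data: 200 mm of rain is twice 100 mm in any unit.
rain_ratio_mm = 200.0 / 100.0
rain_ratio_in = mm_to_inches(200.0) / mm_to_inches(100.0)

# Interval data: 20 C is not "twice as hot" as 10 C, because the
# ratio changes when the same temperatures are expressed in F.
temp_ratio_c = 20.0 / 10.0
temp_ratio_f = c_to_f(20.0) / c_to_f(10.0)

# Differences remain meaningful for interval data: 10 C apart is 18 F apart.
diff_c = 20.0 - 10.0
diff_f = c_to_f(20.0) - c_to_f(10.0)

print(rain_ratio_mm, rain_ratio_in)
print(temp_ratio_c, temp_ratio_f)
print(diff_c, diff_f)
```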
Symbols associated with numerical data
• Points and Lines typically arranged so that the bigger the
numerical attribute number, the larger the point or the
thicker the line
– Graduated Symbols- Points and lines are divided into
classes with a given range of values for each class and
a symbol unique to that class
• A classed map
– Proportional Symbols- numeric value is proportional to
the size of the symbol
• Creates what is referred to as an unclassed map
• Polygons-numeric data are typically represented
by colors
– Can vary by hue, saturation or intensity
– Changes in rainfall are commonly represented this way
with each class a deeper shading of the color
(intensity) for that shape
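The difference between graduated (classed) and proportional (unclassed) symbols can be sketched as two sizing functions. The class breaks, symbol sizes and town populations are made up, and scaling the size linearly with the value is a simplifying assumption.

```python
def graduated_size(value, breaks, sizes):
    """Classed map: pick the symbol size of the class the value falls in.

    breaks holds the ascending upper bound of each class; sizes holds one
    symbol size per class.
    """
    for upper, size in zip(breaks, sizes):
        if value <= upper:
            return size
    return sizes[-1]


def proportional_size(value, max_value, max_size):
    """Unclassed map: symbol size scales directly with the value."""
    return max_size * value / max_value


# Hypothetical town populations.
populations = {"Town A": 12_000, "Town B": 48_000, "Town C": 95_000}
breaks = [25_000, 50_000, 100_000]
sizes = [4, 8, 12]  # symbol sizes in points

for name, pop in populations.items():
    print(name,
          graduated_size(pop, breaks, sizes),
          proportional_size(pop, 100_000, 12))
```

With graduated symbols, Town B and any other town between 25,001 and 50,000 share one size; with proportional symbols, every distinct value gets its own size.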
Symbols associated with numerical data
Two varieties of precipitation maps
using color intensity
The top map uses a
monochromatic intensity ramp to
represent various increasing
amounts of annual rainfall
The bottom map is a two-toned color
ramp of the same data, with yellow
= drier and green = wetter
Graduated color maps or Choropleth maps
Normalized data
• Some features will have larger symbols due to
larger attribute values associated with larger
coverages or areas
– Larger counties will often have more farmland or
larger populations, but it will be spread out over larger
areas.
– Normalizing the population to area (people divided by
square miles) keeps the symbols from being
disproportionately larger and therefore seemingly more
important
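Normalizing population to area is a single division per feature. The sketch below uses two hypothetical counties to show how the ranking can flip once area is taken into account.

```python
# Hypothetical counties: the larger one has more people but a lower density.
counties = {
    "County A": {"population": 500_000, "area_sq_mi": 5_000},
    "County B": {"population": 200_000, "area_sq_mi": 400},
}


def density(record):
    """People per square mile."""
    return record["population"] / record["area_sq_mi"]


densities = {name: density(record) for name, record in counties.items()}

for name, value in densities.items():
    print(name, value)
```

Mapped by raw population, County A would get the larger symbol; mapped by density, County B stands out instead.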
Dot density maps can normalize the data by letting each dot represent,
e.g., 1 million people. The more dots, the more people in that state.
Dots can also be placed at specific locations within the state.
Classifying (grouping) data
• Many methods for grouping numeric data
– Depends what you want to show
• Natural breaks (Jenks)- looks for gaps in data values
• Equal interval-equal size for the intervals
• Defined interval- range of values defined by user
• Quantile- same number of features in each class
– Class defined
• Geometric interval- each class multiplied by a coefficient
to create the next class
• Standard deviation- classes show how far values deviate from
the mean of the data in any attribute field
• Manual (arbitrary) breaks- self explanatory
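Two of the methods above, equal interval and quantile, can be sketched in a few lines to show how differently they split the same skewed data. The function names and sample values are illustrative; GIS packages may handle ties and class boundaries differently.

```python
def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes classes that each span the same value range."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    return [lowest + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper bounds chosen so each class holds about the same number of features."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_classes - 1] for i in range(1, n_classes + 1)]


def class_counts(values, breaks):
    """Count how many values fall at or below each successive upper bound."""
    counts = [0] * len(breaks)
    for value in values:
        for index, upper in enumerate(breaks):
            if value <= upper:
                counts[index] += 1
                break
    return counts


# Made-up, strongly skewed attribute values.
values = [1, 2, 3, 4, 5, 6, 10, 20, 50, 100, 200, 400]

ei = equal_interval_breaks(values, 4)
qu = quantile_breaks(values, 4)
print(ei, class_counts(values, ei))
print(qu, class_counts(values, qu))
```

With skewed data, equal interval crowds most features into the first class, while quantile spreads them evenly but groups very different values together in the top class.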
Raster data
• Two types of rasters
– Thematic Raster and Image raster
• Thematic- 2 categories
– Discrete- coded values identify discrete regions of
similar values
• e.g., geology or land use
– Continuous- values change continuously from one
location to another
• e.g., elevation or precipitation
• Image- from satellites and photos
– Pixels are given lightness/darkness values from 0-255
with 0 being black and 255 being white
Discrete raster
• Best using Unique Values classification
– Each value receives a color
• Geology map example on next slide
Continuous raster
• Classified
– Values divided into classes and classes are
given colors
• Elevation map example on next slide
• Stretched
– Values are scaled to one of 256 color shades
• Elevation map c) on next slide
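A stretched display can be sketched as a linear rescaling of the raster's values onto the 256 shades 0 to 255. This assumes a simple minimum-maximum stretch; the function name and elevation values are hypothetical, and GIS software offers other stretch types as well.

```python
def stretch_to_byte(grid):
    """Linearly rescale all cell values to integer shades from 0 to 255."""
    flat = [value for row in grid for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        # A flat raster has no range to stretch; map everything to shade 0.
        return [[0 for _ in row] for row in grid]
    return [[round((value - lowest) * 255 / (highest - lowest)) for value in row]
            for row in grid]


# Hypothetical elevations in meters.
elevation = [[100, 150], [250, 300]]
shades = stretch_to_byte(elevation)
print(shades)
```

The lowest cell becomes shade 0 and the highest becomes shade 255, with every other cell placed in between according to its value.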
a) Thematic raster discrete unique
values- geology
b) Thematic raster continuous classified
values- elevation