Geoprocessing & Spatial Analysis
                                 GES673
                             at Shady Grove

                            Richard Heimann




                               Richard Heimann © 2013

Thursday, February 21, 13
Course Description
The increased access to spatial data and overall improved application of spatial analytical methods present
certain challenges to social scientific research. This graduate course is designed to focus on substantive social
science research topics while exposing students to the rewards and potential risks involved in applying geographic
information systems (GIS), spatial analysis, and spatial statistics in their own research.

The course will also highlight connections between spatial concepts and data availability. Traditional spatial
science data will be used alongside emerging social media data, which better reflect some of the more recent
developments in Big Data - most notably the socially critical exploration of such data. Substantive foci will include
readings and discussions of spatially explicit theory, leaning toward acknowledgment of a social and spatial turn
in Big Data and an enhanced role for, and extension of, spatial analysis to keep pace with such trends.

Throughout the course, lectures and discussions will be complemented with lab sessions introducing spatial
analysis methods and GIS and spatial analysis software. The lab sessions will use GeoDa and ArcGIS, among other
software, and will introduce many methodological and technical issues relevant
to spatial analysis. Assignments for the course include up to two writing assignments, up to four lab assignments,
and a final project, which will be presented in a short 15-minute presentation and submitted as a term paper.
The writing assignments will include an annotated bibliography/brief literature review within a selected theme
area of spatial thinking/perspectives/methods. The lab assignments will focus on building geospatial databases,
basic spatial analysis, exploratory spatial data analysis, and spatial regression modeling. The course will include
other labs and assignments that will be completed for no grade; these are intended as mechanisms/opportunities
for developing and enhancing familiarity with selected software, data resources, and analytic methods.

Course Objectives:
- Examine methods and literature of geographic information science, spatial analysis and geographic knowledge
discovery.
- Learn about solving problems and answering questions using GIS and quantitative methods.
- Use GIS software to learn some of the analytical tools available - ArcGIS Desktop & GeoDa.
- Gain experience working with traditional and nontraditional social science data (e.g., Flickr, Twitter).



Course Notes
Text:
1. Geospatial Analysis, 3rd edition. By: Michael J. de Smith, Michael Goodchild, and Paul A. Longley. The text is available as
an Adobe readable file for download (uses special secure PDF reader), a version for the Kindle, on-line via a website, and as a
printed book. See http://www.spatialanalysisonline.com/ for further information.

2. Making Spatial Decisions Using GIS: A Workbook. 2nd edition. By: Kathryn Keranen and Robert Kolvoord. Should be
available in the Shady Grove Bookstore, from ESRI Press, or on Amazon:
http://www.amazon.com/Making-Spatial-Decisions-Using-GIS/dp/1589482808

3. GeoDa User Guide 0.9.3 (UG). The documentation is somewhat out of sync with the software, but not so much
that you will be prevented from completing labs.
https://geodacenter.asu.edu/software/documentation

4. Exploring Spatial Data with GeoDa: A Workbook (UGW) http://www.csiss.org/clearinghouse/GeoDa/geodaworkbook.pdf

5. Other readings will be required or suggested. They will be noted in the syllabus and either provided or
cited for you to locate.

Optional Text:
a. The GIS 20: Essential Skills - http://www.amazon.com/GIS-20-Essential-Skills/dp/1589482565



Evaluation
Midterm exam (15%): 20 “T/F with explanation” questions based on lectures and readings (open book)
Lab Assignments (25%): 50 points (5 x 10)
Reading Labs (20%): 40 points (4 x 10)
Paper and Presentation (40%): 60 points for the paper, 20 for the presentation


What will we discuss…?

Methods
-Visual Data Analysis
-Spatial Analysis
-ESDA
-Geographic Knowledge Discovery
-Spatial Econometrics
-Spatial Modeling

Theory
-First Law of Geography
-Spatial Heterogeneity
-Spatially Explicit Theory

Data
-Big Data, Small Data vs. Big Data
-MAUP, Ecological Fallacy, Atomistic Fallacy, etc.
Why GeoDa, Python, and R?
GeoDa is not a GIS, but…
•Complements all major GIS packages.
•Windows based, so familiar interface.
•Relies on the same programming/math as the R package
spdep and extends into Python using PySAL.
•Incorporates more sophisticated statistical routines into
spatial analysis than a GIS (e.g., ArcGIS Desktop).
•Developed by Dr. Luc Anselin, Arizona State U.
•FREE!
•Python is an OS, interpreted, object-oriented, high-level
programming language.
•R is an OS, strongly functional language and
environment for statistically exploring and analyzing
data sets.
What do I mean when I say OS?

  Free and Open Source: you can think of it as “free” as in
        “free speech,” and “free” as in “free beer.”

                                        


      Open GeoDa is a cross-platform,
      open source version.

      PySAL is the underlying open
      source library with extended
      functionality.

Introductions

 Name

 Background

 Experience w/ Spatial Analysis

 Expectations…

 Recently watched movie or book read…




Geoprocessing & Spatial Analysis (GES673)

What will we talk about today?

Just an introduction...but we will be gaining
momentum.

What is GIS? Spatial Analysis?

Why Spatial Analysis, and what are its four
levels?

The Social Turn in Big Data and the neospatial
analysis and mining for knowledge discovery.
What is GIS?
   This is NOT a GIS Class.
   Geographic Information is
   knowledge about “what is where when”
  Geographic/geospatial: synonymous.
  ...spatial subtly different.

  What is the ‘S’ in GIS?
   Systems: the technology.
   Science: the concepts and theory.
   Studies: the societal context.

Defining Geographic Information Systems (GIS)


The common ground between information processing and the
many fields using spatial analysis techniques. (Tomlinson, 1972)

A powerful set of tools for collecting, storing, retrieving,
transforming, and displaying spatial data from the real world.
(Burrough, 1986)

A computerised database management system for the capture,
storage, retrieval, analysis and display of spatial (locationally
defined) data. (NCGIA, 1987)

A decision support system involving the integration of spatially
referenced data in a problem solving environment. (Cowen, 1988)


Geographic Information System:
                                   intuitive description




                  A map with a database behind it; a virtual
                   representation of the real world and its
                              infrastructure.




GI Systems, Science and Studies
                                           Which will we do?
    Systems
   Advanced Seminar in GIS GES670
   Professional Seminar in Geospatial Technologies GES659
   *Geoprocessing and Spatial Analysis GES673
   *Spatial Social Science GES679
    Science
   *Geoprocessing and Spatial Analysis GES673
   GIS Modeling Techniques GES773
   Spatial Social Science GES679
   *Spatial Statistics GES774
   Advanced Visualization and Presentation
    Studies
   *Geoprocessing and Spatial Analysis GES673
   GIS Modeling Techniques GES773
   *Spatial Social Science GES679
   *Combine hands-on technical training with an understanding of the underlying science, and an emphasis on multidisciplinary applications

Where Most UMBC Students Work and Live




The GIS Data Model




The GIS Data Model: Purpose

    Allows geographic features to be digitally
    represented and stored in a database so that
    they can be abstractly presented in map
    (analog) form, and can also be worked with
    and manipulated to address some problem.


                                                     (see associated diagrams)




A layer-cake of information




                            GIS Data Model

Spatial and Attribute Data
   Spatial data (where)
   specifies location; stored in a shapefile, geodatabase or
   similar geographic file.
   Attribute (descriptive) data (what, how much, when)
   specifies characteristics at that location, natural or
   human-created; stored in a database table.

    GIS systems traditionally maintain spatial and
    attribute data separately, then “join” them for display
    or analysis.
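
The “join” can be sketched in a few lines of plain Python. This is only an illustration: the two tables, the `code` key, the rough coordinates, and the `join_tables` helper are all hypothetical, and a real GIS performs the same step against shapefiles or geodatabase tables.

```python
# Hypothetical spatial table: one geometry per feature, keyed by state code.
# The (longitude, latitude) pairs are rough values used only for illustration.
spatial = {
    "MD": (-76.6, 39.0),
    "VA": (-78.7, 37.4),
    "DE": (-75.5, 39.0),
}

# Hypothetical attribute table, stored separately but sharing the same key.
attributes = [
    {"code": "MD", "name": "MARYLAND"},
    {"code": "VA", "name": "VIRGINIA"},
    {"code": "PA", "name": "PENNSYLVANIA"},  # no matching geometry
]


def join_tables(spatial, attributes, key="code"):
    """Attach each attribute row to the geometry that shares its key."""
    joined = []
    for row in attributes:
        geometry = spatial.get(row[key])
        if geometry is not None:  # inner join: unmatched rows are dropped
            joined.append({**row, "geometry": geometry})
    return joined


joined = join_tables(spatial, attributes)
for record in joined:
    print(record["code"], record["name"], record["geometry"])
```

Rows without a matching key on the other side (PA here, and DE on the spatial side) drop out, which is one reason to check key fields before joining.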

   	
Spatial and Attribute Data
Lack of Locational Invariance (Goodchild et al)
• fundamental property of spatial analysis
• results change when location changes
• where matters

ALABAMA                AL
ALASKA                 AK
ARIZONA                AZ
ARKANSAS               AR
CALIFORNIA             CA
COLORADO               CO
CONNECTICUT            CT
DELAWARE               DE
DISTRICT OF COLUMBIA   DC
FLORIDA                FL
GEORGIA                GA
HAWAII                 HI
IDAHO                  ID
ILLINOIS               IL
INDIANA                IN
IOWA                   IA
KANSAS                 KS
KENTUCKY               KY
LOUISIANA              LA
MAINE                  ME
MARYLAND               MD
MASSACHUSETTS          MA
MICHIGAN               MI
MINNESOTA              MN
MISSISSIPPI            MS
MISSOURI               MO
MONTANA                MT
NEBRASKA               NE
NEVADA                 NV
NEW HAMPSHIRE          NH
NEW JERSEY             NJ
NEW MEXICO             NM
NEW YORK               NY
NORTH CAROLINA         NC
NORTH DAKOTA           ND
OHIO                   OH
OKLAHOMA               OK
OREGON                 OR
PENNSYLVANIA           PA
RHODE ISLAND           RI
SOUTH CAROLINA         SC
SOUTH DAKOTA           SD
TENNESSEE              TN
TEXAS                  TX
UTAH                   UT
VERMONT                VT
VIRGINIA               VA
WASHINGTON             WA
WEST VIRGINIA          WV
WISCONSIN              WI
WYOMING                WY




Representing Data with Raster and Vector Models
    Raster Model
    Area is covered by grid with (usually) equal-sized, square cells;
    Regular Lattices.
    Attributes are recorded by assigning each cell a single value based on
    the majority feature (attribute) in the cell, such as land use type.
    Image data is a special case of raster data in which the “attribute” is
    a reflectance value from the electromagnetic spectrum.
    Cells in image data are often called pixels (picture elements).

    Vector Model
    The fundamental concept of vector GIS is that all geographic features
    in the real world can be represented as one of:
    Points or dots (nodes): Cities, human sensors, individual obs.
    Lines (arcs): movement, connectedness, networks
    Areas (polygons): Countries, States, Census Tracts, Cities, Irregular
    Lattices - Multivariate in nature.
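
A minimal sketch of the two models, assuming made-up coordinates and land-use samples. The `rasterize` helper is hypothetical; it bins samples into square cells and keeps the majority category per cell, mirroring the raster rule described above.

```python
from collections import Counter

# Vector model: hypothetical features stored as coordinates.
point = (2.5, 1.5)                                           # one observation
line = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.5)]                  # an ordered path
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # ring, closure implied

# Raster model: hypothetical land-use samples as (x, y, category).
samples = [
    (0.2, 0.3, "forest"), (0.7, 0.8, "forest"), (0.9, 0.1, "urban"),
    (1.5, 0.5, "urban"), (1.2, 0.9, "urban"),
    (0.4, 1.6, "water"),
]


def rasterize(samples, cell_size):
    """Return {(col, row): majority category} for equal-sized square cells."""
    votes = {}
    for x, y, category in samples:
        cell = (int(x // cell_size), int(y // cell_size))
        votes.setdefault(cell, Counter())[category] += 1
    # Each cell stores a single value: the most frequent category inside it.
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


grid = rasterize(samples, cell_size=1.0)
for cell in sorted(grid):
    print(cell, grid[cell])
```

Note the information loss: cell (0, 0) contains one "urban" sample, but only the majority value "forest" survives.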
Lattice Data; Yes or No?




    [Figure: panels labeled Irregular Lattice, Regular Lattice, and Irregular Lattice]
What is spatial analysis?
  From Data to Information
  ...beyond mapping.
  transformations, manipulations and application of
  analytical methods to spatial (geographic) data

  Lack of locational invariance (Goodchild et al)
  Fundamental property of spatial analysis.
  Analyses where the outcome changes when the locations of
  the objects under study change.
  Examples: median center vs. median, standard deviational ellipses,
  autocorrelation vs. spatial autocorrelation.

  Where matters
  In an absolute sense (coordinates)
  In a relative sense (spatial arrangement, distance)
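
Lack of locational invariance can be shown with a small sketch. It assumes six locations along a line, with adjacent locations treated as neighbors, and uses Moran's I as the spatial statistic. That choice is an assumption made for this illustration; the slide only names spatial autocorrelation in general. Rearranging the same values leaves the ordinary mean untouched but changes Moran's I.

```python
from statistics import mean


def morans_i(values, neighbors):
    """Moran's I with binary weights; neighbors maps index -> neighbor indices."""
    n = len(values)
    m = mean(values)
    dev = [v - m for v in values]
    # Sum of cross-products over every (i, j) neighbor pair.
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    s0 = sum(len(js) for js in neighbors.values())  # total weight
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: six locations in a row, each adjacent to the next.
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

clustered = [1, 2, 3, 4, 5, 6]    # similar values sit next to each other
alternating = [1, 6, 2, 5, 3, 4]  # same values, different locations

print(mean(clustered), mean(alternating))   # aspatial statistic: identical
i_clustered = morans_i(clustered, neighbors)
i_alternating = morans_i(alternating, neighbors)
print(i_clustered, i_alternating)           # spatial statistic: differs
```

The mean ignores where the values sit; Moran's I is positive for the first arrangement and negative for the second, which is the sense in which "where matters".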
Spatial analysis as a process
    Problem formulation

    Data gathering

    Exploratory analysis

    Hypothesis formulation

    Modeling and testing

    Consultation and review

    Reporting and implementation

Analytical methodologies
             Mitchell (2005)                  Draper et al (2005)




Analytical methodologies - PPDAC
      Mackay & Oldford (2002)




Components of Spatial Analysis
    Visualization
    Showing interesting patterns

    Exploratory Spatial Data Analysis (ESDA)
    Finding interesting patterns

     Spatial Modeling, Regression
    Explaining interesting patterns


THE PROBLEM … GEOGRAPHICAL LITERACY
    Despite having a highly educated society, Americans are arguably the
    world’s most geographically ignorant people


    By comparison, children throughout much of the world are exposed to
    geographic training in both primary and secondary schools


    Most Americans learn what little geography they know in elementary
    or middle school.


    In the United States, the last time a student hears the word
    “geography” is usually in the third grade
   Discussion of geography at any higher level is hidden under the heading “social studies”


    Concern over geographical illiteracy led President Reagan to declare
    November 15-21, 1987 as the first Geography Awareness Week (a joint
    resolution of the One Hundredth Congress)

GEOGRAPHY TODAY
    The National Geographic Society released the Roper Public
      Affairs 2006 Geographic Literacy Study in May, 2006

    510 interviews were conducted among a sample of 18- to 24-year-old adults in
    the continental United States between December 17, 2005 and January 20,
    2006.
          The sample has a margin of error of +/- 4.4% at the 95% confidence level
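
As a rough check on the reported margin of error, the textbook formula for a proportion can be applied to n = 510. This sketch assumes simple random sampling, the worst-case proportion p = 0.5, and z = 1.96 for 95% confidence; none of these assumptions is stated in the survey summary, and `margin_of_error` is a hypothetical helper.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a confidence interval for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(510)
print(f"{moe:.1%}")
```

The result is roughly 4.3%, close to the reported +/- 4.4%; the small gap may reflect weighting or design details that the summary does not describe.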


  Survey results …
   Over six in ten (63%) of those surveyed could not locate Iraq on a map of the Middle
   East
   Nearly nine in ten (88%) could not identify Afghanistan on a map of Asia
   Seven in ten (70%) could not find North Korea on a map, and 63% did not know its
   border with South Korea is the most heavily fortified in the world
   Sizeable percentages did not know that Sudan and Rwanda are located in Africa
   (54% and 40%, respectively)
GEOGRAPHY TODAY (CONTINUED)
   Three-quarters could not find Indonesia on a world map and were unaware that
   a majority of Indonesia’s population is Muslim, making it the largest Muslim
   country in the world.


   A third or more could not find Louisiana or Mississippi on a map of the United
   States.


   Only 18% could correctly answer a multiple-choice question about the most
   widely spoken native language in the world. (5 Part Questionnaire)


   Although half said map reading skills are “absolutely necessary” in today’s
   world, many Americans lack basic practical skills necessary for safety and
   employment in today’s world.


  One-third (34%) would go in the wrong direction in the event of an evacuation
  One third (32%) would miss a conference call scheduled with colleagues in
  another time zone.
   Recommended Link
   2006 National Geographic – Roper Survey of Geographic Literacy
   http://www.nationalgeographic.com/roper2006/findings.html

Advanced Placement Human Geography

   This college-level course introduces students to the systematic study of
   patterns and processes that have shaped human understanding, use,
   and alteration of Earth's surface. Students employ spatial concepts
   and landscape analyses to analyze human social organization and its
   environmental consequences. They also learn about the methods and
   tools geographers use in their science and practice.

   Score   Percent
   5       11.6%
   4       16.7%
   3       21.9%
   2       16.6%
   1       33.2%

   In the 2009 administration, 50,730 students took the exam and
   the mean score was a 2.57.

Human Geography




                                                          http://www.benjaminbarber.com/bio.html


WHAT IS GEOGRAPHY?
• Geography is the study of the earth’s surface as the space within
    which human populations live
•   Geography combines characteristics of both the natural and
    social sciences and literally bridges the gap between the two -
    more on this later.
•   Geography is a generalized as opposed to a specialized field of
    study
•   Space is the unifying theme for geographers
•   Geography is the science of space and place
•   Geographers are interested in …
    • Where things are located on the earth’s surface
    • Why they are located where they are
    • How places differ from one another
    • How people interact with the environment
•   Geographers were among the first scientists to sound the alarm
    that human-induced changes to the environment are beginning
    to threaten the balance of life
What was wrong with Geography?
    Geography had a number of problems, including:


    1. It was overly descriptive
       Geography followed a set format for the inventory of physical and
       cultural features
    2. It was almost purely educational
       Regions don't really exist
    3. It failed to explain geographic patterns
       Geography was descriptive and did not explain why patterns
       were the way they were
       Where attempts at explanation did exist, they favored historical
       approaches
    4. The biggest problem of geography was the fact that it was
       unscientific
    
    …the Nomothetic & Idiographic debate in geography begins!

Introduction to Spatial Analysis

Topics
•Description versus Analysis
•The concepts of Process, Pattern and
Analysis
•Issues and challenges in spatial data
analysis
•Measuring space

Process, Pattern and Analysis


         Processes operating in space produce
                      patterns

                      Spatial Analysis is aimed at:
      1., 2. Identifying and describing the pattern
        3., 4. Identifying and understanding the
                          process

Complete Spatial Randomness


  Deviations from spatial randomness suggest underlying social processes.

  “Every observable effect has a physical cause” (Thales). Perhaps the most profound insight: causality is a rejection of randomness.

  [Figure: two maps, Randomized Variable – 500 meter cell and Total TTL Count – 500 meter cell]
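
One common way to look for deviations from complete spatial randomness (CSR) is a quadrat count: lay a grid over the study area, count points per cell, and compare the variance of the counts to their mean. The sketch below is an illustration only. The simulated points, the cluster centers, the seed, and the `quadrat_vmr` helper are all hypothetical, and the variance-to-mean ratio is a technique chosen for this example rather than one named on the slide.

```python
import random
from statistics import mean, variance


def quadrat_vmr(points, k):
    """Variance-to-mean ratio of point counts in a k x k grid on the unit square."""
    counts = [0] * (k * k)
    for x, y in points:
        # Clamp so points on or beyond the edge fall into the border cells.
        col = min(max(int(x * k), 0), k - 1)
        row = min(max(int(y * k), 0), k - 1)
        counts[row * k + col] += 1
    return variance(counts) / mean(counts)


rng = random.Random(673)  # fixed seed so the sketch is repeatable

# CSR: every location in the unit square is equally likely.
csr = [(rng.random(), rng.random()) for _ in range(2000)]

# Clustered: points scattered tightly around three hypothetical centers.
centers = [(0.2, 0.2), (0.5, 0.7), (0.8, 0.3)]
clustered = [
    (rng.gauss(cx, 0.03), rng.gauss(cy, 0.03))
    for cx, cy in (rng.choice(centers) for _ in range(2000))
]

vmr_csr = quadrat_vmr(csr, k=10)
vmr_clustered = quadrat_vmr(clustered, k=10)
print(vmr_csr, vmr_clustered)
```

Under CSR the counts behave approximately like Poisson draws, so the ratio sits near 1; a ratio far above 1 indicates clustering, which is the kind of deviation that invites a search for an underlying process.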


Description vs. Analysis
                      Description

    Most GIS systems are used by governments and private
    companies to describe the real world; this helps the
    organization “do its job”.

    For example: manage sewer and water networks, manage land resources.
    Most GIS systems are primarily designed for this purpose.

    They are used to develop spatial databases to
     describe the real world and help manage it.


Description vs. Analysis
             Analysis
     Tries to understand the processes which cause or
     create the patterns in the real world
     Understanding processes:
       Helps the organization do its job better
         Make better decisions, for example
       Helps us understand the phenomenon itself
         This is the role of science

     Example: Are the locations of the software industry
     different from those of the telecommunications industry?
     Here, we are using “centrographic statistics” to
     help answer this question
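
Two basic centrographic statistics are the mean center (average x and y) and the standard distance (spread around that center). The sketch below compares two point sets; the coordinates are hypothetical stand-ins for firm locations, not the data behind the slide's map, and the helper names are illustrative.

```python
import math


def mean_center(points):
    """Average x and average y of a set of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))


# Hypothetical firm locations in arbitrary planar units.
software = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
telecom = [(4.0, 4.0), (8.0, 4.0), (4.0, 8.0), (8.0, 8.0)]

for name, pts in (("software", software), ("telecom", telecom)):
    print(name, mean_center(pts), round(standard_distance(pts), 3))
```

Different mean centers suggest the two industries concentrate in different places, and different standard distances suggest one is more dispersed than the other. Whether such differences are meaningful is a question for the later, inferential levels of analysis.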


Dr. Snow maps cholera in Soho London (1854)




The first example of Spatial Analysis
• John Snow’s maps of cholera in 1850s London




Was it ESDA or hypothesis testing?
• Did he discover the association between water and
  cholera after drawing the map? That would be ESDA.
• Did he draw the map in order to prove the
  association? That would be using a map for hypothesis testing.
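
Either reading of Snow's map rests on relating each death to a water source. A minimal sketch of that step assigns every death location to its nearest pump and tallies the result. The coordinates and pump names are invented for illustration and are not Snow's data; straight-line distance is also an assumption, since walking distance along streets may be more appropriate.

```python
import math
from collections import Counter

# Hypothetical pump and death locations in arbitrary planar units.
pumps = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (5.0, 8.0)}
deaths = [(1.0, 1.0), (0.5, -0.5), (2.0, 0.0), (1.5, 2.0), (9.0, 1.0), (5.0, 7.0)]


def nearest_pump(location, pumps):
    """Name of the pump with the smallest straight-line distance to location."""
    return min(pumps, key=lambda name: math.dist(location, pumps[name]))


deaths_per_pump = Counter(nearest_pump(d, pumps) for d in deaths)
for name in sorted(pumps):
    print(name, deaths_per_pump[name])
```

Used after the fact to spot a concentration, a tally like this is exploratory; computed to check a prior claim about one pump, the same tally serves hypothesis testing.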
Spatial Analysis: successive levels of sophistication
    Four levels of Spatial Analysis:
    	 --Each is more advanced (more difficult!)

    Spatial data description (the primitives)
    Exploratory Spatial Data Analysis (ESDA)
    Spatial statistical analysis and hypothesis testing
    Spatial modeling and prediction

    We will look at all 4 levels in this lecture series



Spatial Analysis: successive levels of sophistication
    1. Spatial data description (primitive):
   Focus is on describing the world,
     and representing it in a digital
     format
   	   --computer map
      --computer database

   Uses classic GIS capabilities
      --buffering, map layer overlay
      --spatial queries & measurement
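
Buffering, measurement, and a spatial query can be sketched together with the standard library. This is a simplified illustration with hypothetical facility names and planar coordinates; a GIS would build the buffer as a polygon and handle map projections, which this sketch skips.

```python
import math

# Hypothetical point features in arbitrary planar units.
facilities = {
    "school": (2.0, 3.0),
    "clinic": (6.0, 1.0),
    "library": (2.5, 3.5),
}


def distance(a, b):
    """Measurement: straight-line distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def within_buffer(center, radius, features):
    """Spatial query: names of features inside a circular buffer."""
    return sorted(
        name for name, loc in features.items() if distance(center, loc) <= radius
    )


site = (2.0, 2.0)
nearby = within_buffer(site, 2.0, facilities)
print(nearby)
print(round(distance(site, facilities["clinic"]), 2))
```

These operations describe what is where; they do not yet say anything about why, which is the concern of the later levels.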


Spatial Analysis: successive levels of sophistication
    2. Exploratory Spatial Data Analysis

      Searching for patterns and possible explanations
      GeoVisualization through calculation and display of centrographic
      statistics and other spatially descriptive statistics




Spatial Analysis: successive levels of sophistication
    2. Exploratory Spatial Data Analysis
       Centrographics - Moments of Data

      [Figure: map showing changes to the mean center of population for the
      United States, 1790–2010 (U.S. Census Bureau)]




                                      http://en.wikipedia.org/wiki/Moment_(mathematics)
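
The two most common centrographic statistics can be sketched in a few lines: the (weighted) mean center is the spatial analogue of the first moment, and the standard distance plays the role of a standard deviation around it. The town coordinates and populations below are hypothetical, and planar coordinates are assumed.

```python
import math

def mean_center(points, weights=None):
    """Weighted mean center of (x, y) points; unweighted if weights is None."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy

def standard_distance(points, weights=None):
    """Weighted root-mean-square distance of the points from their mean center."""
    if weights is None:
        weights = [1.0] * len(points)
    cx, cy = mean_center(points, weights)
    total = sum(weights)
    ss = sum(w * ((x - cx) ** 2 + (y - cy) ** 2)
             for (x, y), w in zip(points, weights))
    return math.sqrt(ss / total)

# Hypothetical towns at the corners of a square, with populations.
towns = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
pops = [3.0, 1.0, 1.0, 3.0]

print(mean_center(towns))          # geometric center of the four towns
print(mean_center(towns, pops))    # pulled toward the more populous towns
print(standard_distance(towns))
```

Recomputing the weighted center for each census year and connecting the results gives the kind of track shown in the population map.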

Spatial Analysis: successive levels of sophistication




      Source: John O'Loughlin, Colin Flint, and Luc Anselin, "The Geography of the Nazi Vote:
      Context, Confession, and Class in the Reichstag Election of 1930,"
      Annals of the Association of American Geographers

Spatial Analysis: successive levels of sophistication
    3. Spatial statistical analysis and hypothesis testing
   Are data “to be expected” or are they “unexpected” relative to some
   statistical model, usually of a random process (pure chance).




   [Figure: standard normal curve centered at 0, with critical values at -1.96 and 1.96
   and 2.5% of the area in each tail]



   We can test whether the spatial pattern of voting behavior in Germany in
   1930 is in fact clustered or random.
      Source: John O'Loughlin, Colin Flint, and Luc Anselin, "The Geography of the Nazi Vote:
      Context, Confession, and Class in the Reichstag Election of 1930,"
      Annals of the Association of American Geographers
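
The logic of the ±1.96 cut-offs can be sketched as a two-sided z-test: standardize the observed statistic against its expectation under randomness and ask how often pure chance would produce something at least that extreme. The observed values, expectation and standard error below are hypothetical numbers used only for illustration.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_unexpected(observed, expected, std_error, alpha=0.05):
    """Return (z, reject) where reject is True if the result is 'unexpected'."""
    z = (observed - expected) / std_error
    return z, two_sided_p_value(z) < alpha

# Hypothetical statistic with expectation 10.0 and standard error 0.8
# under the random (null) model.
z_far, reject_far = is_unexpected(12.0, 10.0, 0.8)    # z = 2.5
z_near, reject_near = is_unexpected(10.5, 10.0, 0.8)  # z = 0.625
print(z_far, reject_far)
print(z_near, reject_near)
```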




Making things even harder...

• Inward and outward asymptotics, i.e., increasing
  spatial extent, increasing temporal lags, finer
  spatial resolution, and finer temporal resolution.
• Increased number of cross sections.
• …visual correlations and visual detection of
  change over space and time do not exist.
• Apophenia is real!
• Spatial Analysis and Geographic Pattern
  Recognition will reduce patternicity (Shermer,
  2008).


Spatially Random or Spatially Clustered?



                            Moran’s I: 0.689                Moran’s I: 0.003
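
A minimal sketch of how a global Moran's I is computed, assuming a regular grid with binary rook-contiguity weights. The two 4 x 4 grids are hypothetical and are not the maps behind the values above; they only show that a clustered surface scores high and an alternating one scores low.

```python
def rook_neighbors(n_rows, n_cols):
    """Binary rook-contiguity weights for a grid, as {cell: [neighbor cells]}."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs

def morans_i(values, nbrs):
    """Global Moran's I with binary weights given as neighbor lists."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(len(js) for js in nbrs.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, js in nbrs.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in dev)

w = rook_neighbors(4, 4)
clustered = [1, 1, 0, 0] * 4                    # high values on the left half
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

print(morans_i(clustered, w))      # positive: similar values sit together
print(morans_i(checkerboard, w))   # negative: neighbors always differ
```

Values near zero, like the 0.003 above, are what a spatially random arrangement tends to produce; significance is then judged against a reference distribution as on the hypothesis-testing slide.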




Spatial Analysis: successive levels of sophistication
    4. Spatial modeling: prediction
   Construct models (of processes) to predict spatial outcomes
   (patterns)
   [Figure panels: Coefficient: % Poverty | Coefficient: % FB | Coefficient: % Elderly | Coefficient: % Black]




Spatial Analysis: successive levels of sophistication
    Statistically significant global variables that exhibit strong regional
    variation inform local policy.

    Statistically significant global variables that exhibit little regional
    variation inform region-wide policy.




Spatial Analysis: successive levels of sophistication

    Local R² informs us where the model is performing well and where it is
    performing poorly.

    The poor results in the south may indicate that an important variable is
    missing from our model.
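
One way to obtain a local R² is to fit a separate distance-weighted regression at every location. The sketch below assumes a single predictor, a Gaussian kernel and a fixed bandwidth, and uses hypothetical data in which the relationship holds in the north but not in the south; it illustrates the idea only and is not the model behind the maps.

```python
import math

def weighted_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b, weighted R^2)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    syy = sum(w * (y - my) ** 2 for w, y in zip(ws, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum(w * (y - (a + b * x)) ** 2 for w, x, y in zip(ws, xs, ys))
    return a, b, 1.0 - sse / syy

def local_fits(coords, xs, ys, bandwidth):
    """Fit one regression per location, weighting observations by proximity."""
    results = []
    for (u, v) in coords:
        ws = [math.exp(-0.5 * (math.hypot(u - p, v - q) / bandwidth) ** 2)
              for (p, q) in coords]
        results.append(weighted_fit(xs, ys, ws))
    return results

# Hypothetical transect from south (index 0) to north (index 9).
coords = [(0.0, float(i)) for i in range(10)]
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 1.0, 4.0, 1.0, 5.0,      # south: y is unrelated to x
      2.0, 4.0, 6.0, 8.0, 10.0]     # north: y = 2x exactly

results = local_fits(coords, xs, ys, bandwidth=1.5)
r2_south = results[0][2]
r2_north = results[9][2]
print(r2_south, r2_north)
```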

Issues/Challenges/Problems in Spatial Analysis
    Summarize these now.
    Talk in greater detail about them throughout this lecture series.

Critical Issues in Spatial Analysis
• Spatial autocorrelation
      – Data from locations near each other are usually more similar than data from
        locations far away from each other
• Modifiable areal unit problem (MAUP, zone effect)
      – Results may depend on the specific geographic unit used in the study
      – Province or county; county or city
• Scale affects representation and results
      – Cities may be represented as points or polygons
      – Results depend on the scale at which the analysis is conducted: province or county
      – MAUP: scale effect
• Ecological fallacy
      – Results obtained from aggregated data (e.g. provinces) cannot be assumed to
        apply to individual people
      – MAUP: individual effect
• Non-uniformity of space
      – Phenomena are not distributed evenly in space
      – Be careful how you interpret results!
• Edge issues
      – Edges of the map, beyond which there are no data, can significantly affect results
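
The MAUP and the ecological fallacy are easy to reproduce with a toy dataset. In the sketch below, eight hypothetical individuals are ordered along a street and aggregated under two different zonings; the values and zone assignments are invented purely to show that the same data yield different correlations depending on the units used.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def aggregate(values, zone_ids):
    """Mean of the values in each zone, ordered by zone id."""
    sums, counts = {}, {}
    for v, z in zip(values, zone_ids):
        sums[z] = sums.get(z, 0.0) + v
        counts[z] = counts.get(z, 0) + 1
    return [sums[z] / counts[z] for z in sorted(sums)]

# Hypothetical individual-level observations, listed in spatial order.
x = [1, 3, 2, 4, 3, 5, 4, 6]
y = [3, 1, 4, 2, 5, 3, 6, 4]

zoning_a = [0, 0, 1, 1, 2, 2, 3, 3]   # four zones of two individuals each
zoning_b = [0, 1, 1, 2, 2, 3, 3, 4]   # same individuals, boundaries shifted

r_individual = pearson(x, y)
r_zoning_a = pearson(aggregate(x, zoning_a), aggregate(y, zoning_a))
r_zoning_b = pearson(aggregate(x, zoning_b), aggregate(y, zoning_b))
print(r_individual, r_zoning_a, r_zoning_b)
```

The gap between the two zonings is the zone effect; the gap between either zoning and the individual-level value is why aggregate results cannot be read back onto individuals.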

The common problems...




                                                     http://www.amazon.com/GIS-20-Essential-Skills/dp/1589482565




Measuring Space



Fundamental Spatial Concepts
    Distance
   The magnitude of spatial separation
   Euclidean (straight-line) distance is often only an
   approximation
    Adjacency or neighborhood
   Nominal or binary (0,1) equivalent of distance
   Levels of adjacency exist: 1st, 2nd, 3rd nearest
   neighbor, etc.
    Interaction
   The strength of the relationship between entities
   An inverse function of distance
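
The three measures can be sketched as follows. Distance and adjacency follow directly from the definitions above; for interaction, the slide only says it is an inverse function of distance, so the gravity-style form (product of two sizes divided by distance to a power `beta`) is an assumption made here for illustration, as are the coordinates.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def adjacency_matrix(points, threshold):
    """Binary (0, 1) adjacency: 1 if two distinct points lie within threshold."""
    n = len(points)
    return [[1 if i != j and euclidean(points[i], points[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]

def k_nearest(points, i, k):
    """Indices of the k nearest neighbors of point i (1st, 2nd, ... nearest)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: euclidean(points[i], points[j]))
    return others[:k]

def interaction(size_i, size_j, d, beta=2.0):
    """Assumed gravity-style interaction: decays as distance grows."""
    return size_i * size_j / d ** beta

# Hypothetical places in planar coordinates.
places = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]

print(euclidean(places[0], places[1]))
print(adjacency_matrix(places, 5.0))
print(k_nearest(places, 0, 2))
print(interaction(100.0, 50.0, 5.0))
```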

Review (Part 1)


    What is Spatial Analysis?

    What are the four levels of Spatial Analysis?

    What are the three measures?




Take a Break!




Nontraditional Spatial Analysis
    Traditional spatial analyses grew up in an era of sparse data and very weak
    computational power. Today, both of those circumstances are reversed, and
    many of the old solutions are no longer suitable for answering today's questions.
    "Spatial Analysis and Data Mining" reflects this change and combines two things
    which, until recently, engaged quite different groups of researchers and
    practitioners. Together, they require particular techniques and a sophisticated
    understanding of the special problems associated with spatial data. This
    geographic data mining, or Geographic Knowledge Discovery (GKD), is not new,
    but it is developing and changing rapidly as more, and different, data become
    available and people see new applications. The days of ‘Big Data’ require fresh
    thinking.

    The aim of geographic data mining (GKD) is to assist in the generation of
    testable hypotheses about interesting or anomalous spatial patterns that may be
    discovered in very large databases. It is important that the patterns discovered
    are not statistical or sampling artifacts, and that they are nontrivial and
    useful. The intent is not to build a system that makes decisions or
    interpretations automatically, but one that supports humans in these tasks. GKD
    is also not synonymous with statistical analysis: such tools have a role in
    testing the hypotheses generated by GKD, but not in GKD itself.

DATA is the new OIL…




Long Tail of Big Data
[Figure: long-tail curve. Head: Big Data. Long Tail: Intelligence Reporting, Science Data – Dark Data]

Head: Big Data. Large continuous datasets coincident over time and space. Ideal for multivariate analysis.
Tail (power law distribution): good for business but suboptimal for governance. Data in the tail are often
individually curated and unmaintained beyond their initially designed use case. As a result, the data are
discontiguous from other research efforts and discontinuous over space and time.
Dark data are suspected to exist, or ought to exist, but are difficult or impossible to find. The problem of
dark data is real and prevalent in the tail. The long tail is an intractably large management problem.

Long Tail of NSF data…




  Power law                 80%                              20%
  Number of Grants          7,478                            1,869
  Dollar Amount             $938,548,595                     $1,199,088,125
  Total Grants (NSF07)      9,347 (Count)                    $2,137,636,716 (Amount)
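
The shares implied by the table can be checked with a few lines of arithmetic. The figures are copied from the table; the variable names are only illustrative, and the "head" is taken here to mean the 20% column.

```python
# Figures copied from the table above (NSF07 grants).
tail_grants, head_grants = 7478, 1869
tail_dollars, head_dollars = 938_548_595, 1_199_088_125

# Share of all grants, and of all dollars, that falls in the 20% column.
head_grant_share = head_grants / (head_grants + tail_grants)
head_dollar_share = head_dollars / (head_dollars + tail_dollars)

# Average award size on each side of the split.
head_mean = head_dollars / head_grants
tail_mean = tail_dollars / tail_grants

print(f"{head_grant_share:.1%} of grants account for {head_dollar_share:.1%} of the dollars")
print(f"mean award: head ${head_mean:,.0f}, tail ${tail_mean:,.0f}")
```

So roughly a fifth of the grants carry a little over half of the funding, while the remaining four fifths form the long tail of smaller awards.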


Long Tail of data science…
  Head                                        Tail
  Homogeneous                                 Heterogeneous

  Centralized curation                        Individual curation

  Maintained                                  Unmaintained

  Continuous over S & T                       Discontinuous over S & T

  Visibly accessible                          DARK Data

  High Velocity                               Slow or NO velocity

  High Volume                                 Low Volume

  Easier Data Integration                     Harder Data Integration

  Unreasonable Effectiveness of Data          Reasonable Effectiveness of Data

  Open Innovation – Integrated Research       Closed Innovation – Vertical Research


The Open Innovation Model
    In the new model of open innovation, a company commercializes both its own
    ideas as well as innovations from other firms and seeks ways to bring its
    in-house ideas to market by deploying pathways outside its current businesses.
    Note that the boundary between the company and its surrounding
    environment is porous (represented by a dashed line), enabling innovations to
    move more easily between the two.

            Henry W. Chesbrough, "The Era of Open Innovation," MIT Sloan Management Review, Spring 2003




“The Unreasonable Effectiveness of Mathematics in the Natural Sciences”
      Eugene Wigner (1960), Nobel Laureate

“The Unreasonable Effectiveness of Data”
      Peter Norvig, Director of Research at Google Inc.
Big Data, Small Theory
                                     Spatial Simpson’s Paradox
        Global standards will always compete with local social phenomena.

      [Figure: two panels. Left panel labels: Violence in the north, Violence, Violence in the south.
      Right panel labels: Violence in the north, Violence in the south.]

      Left: Global models average regionally variant phenomena.
      Right: Local models account for regional variation.
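
A small numeric sketch of the paradox: a single global regression can report a positive slope even though the slope is negative within every region. The predictor and violence values for the two regions are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical (predictor, violence) observations by region.
north_x, north_y = [1.0, 2.0, 3.0], [8.0, 7.0, 6.0]
south_x, south_y = [6.0, 7.0, 8.0], [13.0, 12.0, 11.0]

global_slope = ols_slope(north_x + south_x, north_y + south_y)
north_slope = ols_slope(north_x, north_y)
south_slope = ols_slope(south_x, south_y)

print(global_slope)               # positive when the regions are pooled
print(north_slope, south_slope)   # negative inside each region
```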



New Aged Experimentation

    George Box
    “The only way to understand complex
    systems is to shock those systems and
    observe the way they react”

    New motivation for experimentation,
    especially in quasi-experimental methods.
                                       (...more later)


Nontraditional Datasets
     Twitter – Sampled, ongoing collection of tweets with UserId and
     time. Some also carry precise location data, but this is not the norm. The collection
     pulls roughly 1-2 million tweets per day.

     Example Proxy Problems:
  • Discovery of crowd-sourced phenomena (e.g., people posting to beware of a certain
    neighborhood)
  • Discovery of correlated trends (e.g., finding that people posting about a certain topic in an
    area correlates with higher crime in that area)
  • Tracking sentiment on certain topics and issues
  • Tracking language usage in areas to detect abnormal language presence in an area




What is Geographic Knowledge Discovery?
• How can we infer movement patterns from vast amounts of what
  appears to be just point data collected in time and associated with an
  identifier?
• The technique is applicable to Twitter, FourSquare and MANY others.




                                                        Volume plot of photos binned by area on log scale
                                                                 Paris as seen from Flickr over all time
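
A volume plot of this kind starts by binning points into grid cells and taking the logarithm of each count, so that a few very busy cells do not drown out the rest. The sketch below assumes square cells in planar coordinates; the photo locations and cell size are hypothetical.

```python
import math
from collections import Counter

def bin_counts(points, cell_size):
    """Count points per square grid cell; keys are (col, row) cell indices."""
    return Counter((math.floor(x / cell_size), math.floor(y / cell_size))
                   for x, y in points)

def log_scaled(counts):
    """Base-10 logarithm of each non-empty cell's count."""
    return {cell: math.log10(n) for cell, n in counts.items()}

# Hypothetical photo locations: one hotspot, one moderate cell, one lone photo.
photos = [(0.2, 0.3)] * 100 + [(1.5, 0.5)] * 10 + [(2.1, 2.9)]

counts = bin_counts(photos, cell_size=1.0)
volume = log_scaled(counts)
print(counts)
print(volume)
```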


What is Geographic Knowledge Discovery?
   Aggregate micro-pathing on a world of photo metadata with no speed,
                      time, or distance restrictions
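
One plausible reading of micro-pathing, sketched under stated assumptions: group the point records by identifier, order each group by time, link consecutive locations into segments, and then count how often each segment occurs across all identifiers. No speed, time or distance filter is applied, matching the description above. The record layout `(user_id, timestamp, x, y)` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def micro_paths(records):
    """Build per-user lists of consecutive (from_point, to_point) segments.

    records: iterable of (user_id, timestamp, x, y) tuples in any order.
    """
    by_user = defaultdict(list)
    for user, t, x, y in records:
        by_user[user].append((t, (x, y)))
    paths = {}
    for user, obs in by_user.items():
        obs.sort(key=lambda o: o[0])                 # order each user's points by time
        pts = [p for _, p in obs]
        # Link each point to the next one; skip pairs where the user did not move.
        paths[user] = [(a, b) for a, b in zip(pts, pts[1:]) if a != b]
    return paths

def aggregate_segments(paths):
    """Count how many times each segment appears across all users."""
    return Counter(seg for segs in paths.values() for seg in segs)

# Hypothetical photo metadata, deliberately out of time order.
records = [
    ("u1", 3, 2, 2), ("u1", 1, 0, 0), ("u1", 2, 1, 1),
    ("u2", 5, 0, 0), ("u2", 9, 1, 1),
]

paths = micro_paths(records)
flows = aggregate_segments(paths)
print(paths)
print(flows)
```

Drawing each segment with a width or opacity proportional to its count would give an aggregate picture of movement from what started as isolated points.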




Personal Notes
    Richard Heimann
    Office: UMBC Common Faculty Area 3rd Floor
    Phone: 571-403-0119 (C)
    Office hours:
   Tues. 6:30-7:00 (Virtual);
    or by appointment (send e-mail)
   I promptly respond to emails. Phone calls are another
   matter.
    Email: rheimann@umbc.edu or
    heimann.richard@gmail.com


Thank you…

                   Data Tactics Corporation
              https://www.data-tactics-corp.com/
               http://datatactics.blogspot.com/
                     Twitter: @DataTactics

                              Rich Heimann
                            Twitter: @rheimann


Geographical Information System By Zewde Alemayehu Tilahun.pptx
 

Mehr von Rich Heimann

Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"
Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"
Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"Rich Heimann
 
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...Rich Heimann
 
Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Rich Heimann
 
Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Rich Heimann
 
Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Rich Heimann
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Rich Heimann
 
Data Tactics Analytics Brown Bag (November 2013)
Data Tactics Analytics Brown Bag (November 2013)Data Tactics Analytics Brown Bag (November 2013)
Data Tactics Analytics Brown Bag (November 2013)Rich Heimann
 
A Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics CorporationA Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics CorporationRich Heimann
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Rich Heimann
 
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)Rich Heimann
 
Spatial Analysis; The Primitives at UMBC
Spatial Analysis; The Primitives at UMBCSpatial Analysis; The Primitives at UMBC
Spatial Analysis; The Primitives at UMBCRich Heimann
 
Week 1 Lecture @ UMBC
Week 1 Lecture @ UMBCWeek 1 Lecture @ UMBC
Week 1 Lecture @ UMBCRich Heimann
 

Mehr von Rich Heimann (13)

Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"
Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"
Guest Talk for Data Society's "INTRO TO DATA SCIENCE BOOT CAMP"
 
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...
Big Data Analytics: Discovering Latent Structure in Twitter; A Case Study in ...
 
Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)
 
Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)Human Terrain Analysis at George Mason University (DAY 1)
Human Terrain Analysis at George Mason University (DAY 1)
 
Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?Why L-3 Data Tactics Data Science?
Why L-3 Data Tactics Data Science?
 
DS4G
DS4GDS4G
DS4G
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
Data Tactics Analytics Brown Bag (November 2013)
Data Tactics Analytics Brown Bag (November 2013)Data Tactics Analytics Brown Bag (November 2013)
Data Tactics Analytics Brown Bag (November 2013)
 
A Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics CorporationA Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics Corporation
 
Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)Data Tactics Analytics Brown Bag (Aug 22, 2013)
Data Tactics Analytics Brown Bag (Aug 22, 2013)
 
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)
Big Social Data: The Spatial Turn in Big Data (Video available soon on YouTube)
 
Spatial Analysis; The Primitives at UMBC
Spatial Analysis; The Primitives at UMBCSpatial Analysis; The Primitives at UMBC
Spatial Analysis; The Primitives at UMBC
 
Week 1 Lecture @ UMBC
Week 1 Lecture @ UMBCWeek 1 Lecture @ UMBC
Week 1 Lecture @ UMBC
 

Kürzlich hochgeladen

POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 

Kürzlich hochgeladen (20)

POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 

Spatial Analysis and Geomatics

  • 1. Geoprocessing & Spatial Analysis GES673 at Shady Grove Richard Heimann Richard Heimann © 2013 Thursday, February 21, 13
  • 2. Course Description The increased access to spatial data and the overall improved application of spatial analytical methods present certain challenges to social scientific research. This graduate course focuses on substantive social science research topics while exposing students to the rewards and potential risks involved in applying geographic information systems (GIS), spatial analysis, and spatial statistics in their own research. The course will also highlight connections between spatial concepts and data availability. Traditional spatial science data will be used alongside newly emerging social media data, which better reflect some of the more recent developments in Big Data, most notably the socially critical exploration of such data. Substantive foci will include readings and discussions of spatially explicit theory, leaning toward acknowledgment of a social and spatial turn in Big Data and an enhanced role and extension of spatial analysis to keep pace with such trends. Throughout the course, lectures and discussions will be complemented with lab sessions introducing spatial analysis methods and GIS and spatial analysis software. The lab sessions will use GeoDa and ArcGIS, among other software. These lab sessions will introduce many methodological and technical issues relevant to spatial analysis. Assignments for the course include up to two writing assignments, up to four lab assignments, and a final project, which will be presented as a short 15-minute presentation and submitted as a term paper. The writing assignments will include an annotated bibliography/brief literature review within a selected theme area of spatial thinking/perspectives/methods. The lab assignments will focus on building geospatial databases, basic spatial analysis, exploratory spatial data analysis, and spatial regression modeling.
The course will include other labs and assignments that will be completed for no grade; these are intended as mechanisms/opportunities for developing and enhancing familiarity with selected software, data resources, and analytic methods. Course Objectives: - Examine methods and literature of geographic information science, spatial analysis and geographic knowledge discovery. - Learn about solving problems and answering questions using GIS and quantitative methods. - Use GIS software to learn some of the analytical tools available - ArcGIS Desktop & GeoDa. - Gain experience working with traditional and nontraditional social science data (e.g. Flickr, Twitter).
  • 3. Course Notes Text: 1. Geospatial Analysis, 3rd edition. By: Michael J. de Smith, Michael Goodchild, and Paul A. Longley. The text is available as an Adobe readable file for download (uses special secure PDF reader), a version for the Kindle, on-line via a website, and as a printed book. See http://www.spatialanalysisonline.com/ for further information. 2. Making Spatial Decisions Using GIS: A Workbook. 2nd edition. By: Kathryn Keranen and Robert Kolvoord. Should be available in the Shady Grove Bookstore or from ESRI Press or Amazon: http://www.amazon.com/Making-Spatial-Decisions-Using-GIS/dp/1589482808 3. GeoDa User Guide 0.9.3. (UG) The documentation is somewhat out of sync with the software, but not so much that it will prevent you from completing labs. https://geodacenter.asu.edu/software/documentation 4. Exploring Spatial Data with GeoDa: A Workbook (UGW) http://www.csiss.org/clearinghouse/GeoDa/geodaworkbook.pdf 5. Other readings will be required and further suggested. They will be noted in the syllabus and either provided or cited for your discovery. Optional Text: a. The GIS 20: Essential Skills - http://www.amazon.com/GIS-20-Essential-Skills/dp/1589482565 Evaluation: Midterm exam (15%) (20 “T/F with explanation”), based on lectures and readings (open book); Lab Assignments, 50 points (25%) (5 x 10); Reading Labs, 40 points (20%) (4 x 10); Paper (60 points) & Presentation (20 points) (40%)
  • 4. What will we discuss…? Methods Theory -Visual Data Analysis -First Law of Geography -Spatial Analysis -Spatial Heterogeneity -ESDA -Spatially Explicit Theory -Spatial Analysis -Geographic Knowledge Discovery -Spatial Econometrics -Spatial Modeling Data Big Data, Small Data vs. Big Data MAUP, Ecological Fallacy, Atomistic Fallacy, etc.
  • 5. Why GeoDa, Python, and R? Not a GIS, but… •Complements all major GIS packages. •Windows based, so familiar interface. •Relies on the same programming/math as the R package spdep and extends into Python using PySAL. •Incorporates more sophisticated statistical routines into spatial analysis than a GIS (e.g. ArcGIS Desktop). •Developed by Dr. Luc Anselin, Arizona State U. •FREE! •Python is an OS interpreted, object-oriented, high-level programming language. •R is an OS, strongly functional language and environment for statistically exploring and analyzing data sets.
  • 6. What do I mean when I say OS? Free and Open Source: you can think of it as “free” as in “free speech,” and “free” as in “free beer.” Open GeoDa is a cross-platform, open source version. PySAL is the underlying open source library with extended functionality.
  • 7. Introductions Name Background Experience w/ Spatial Analysis Expectations… Recently watched movie or book read…
  • 8. Geoprocessing & Spatial Analysis (GES673) What will we talk about today? Just an introduction...but we will be gaining momentum. What is GIS? What is spatial analysis? Why spatial analysis, and what are the four levels? The social turn in Big Data, and neospatial analysis and mining for knowledge discovery.
  • 9. What is GIS? This is NOT a GIS class. Geographic Information: information is knowledge about “what is where when”. Geographic/geospatial: synonymous. ...spatial: subtly different. What is the ‘S’ in GIS? Systems: the technology. Science: the concepts and theory. Studies: the societal context.
  • 10. Defining Geographic Information Systems (GIS) The common ground between information processing and the many fields using spatial analysis techniques. (Tomlinson, 1972) A powerful set of tools for collecting, storing, retrieving, transforming, and displaying spatial data from the real world. (Burrough, 1986) A computerised database management system for the capture, storage, retrieval, analysis and display of spatial (locationally defined) data. (NCGIA, 1987) A decision support system involving the integration of spatially referenced data in a problem solving environment. (Cowen, 1988)
  • 11. Geographic Information System: intuitive description A map with a database behind it; a virtual representation of the real world and its infrastructure.
  • 12. GI Systems, Science and Studies Which will we do? Systems: Advanced Seminar in GIS GES670; Professional Seminar in Geospatial Technologies GES659; *Geoprocessing and Spatial Analysis GES673; *Spatial Social Science GES679. Science: *Geoprocessing and Spatial Analysis GES673; GIS Modeling Techniques GES773; Spatial Social Science GES679; *Spatial Statistics GES774; Advanced Visualization and Presentation. Studies: *Geoprocessing and Spatial Analysis GES673; GIS Modeling Techniques GES773; *Spatial Social Science GES679. *Combine hands-on technical training with an understanding of the underlying science, and an emphasis on multidisciplinary applications.
  • 13. Where Most UMBC Students Work and Live
  • 14. The GIS Data Model
  • 15. The GIS Data Model: Purpose Allows geographic features to be digitally represented and stored in a database so that they can be abstractly presented in map (analog) form, and can also be worked with and manipulated to address some problem. (see associated diagrams)
  • 16.
  • 17. A layer-cake of information GIS Data Model
  • 18. Spatial and Attribute Data Spatial data (where) specifies location; stored in a shapefile, geodatabase or similar geographic file. Attribute (descriptive) data (what, how much, when) specifies characteristics at that location, natural or human-created; stored in a database table. GIS systems traditionally maintain spatial and attribute data separately, then “join” them for display or analysis.
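The “join” between separately stored spatial and attribute data can be sketched in a few lines of plain Python. This is only an illustration, not how any particular GIS implements it: the feature ids, the approximate coordinates, and the `join` helper are all hypothetical, and a real workflow would read a shapefile or geodatabase rather than a dictionary.

```python
# Hypothetical spatial records: feature id -> point geometry (lon, lat).
# Coordinates are rough illustrative values, not authoritative data.
spatial = {
    "MD": (-76.6, 39.0),
    "VA": (-78.7, 37.5),
    "DE": (-75.5, 39.0),
}

# Hypothetical attribute table, keyed by the same id.
attributes = [
    {"id": "MD", "name": "MARYLAND"},
    {"id": "VA", "name": "VIRGINIA"},
]


def join(spatial, attributes, key="id"):
    """Left join: keep every spatial feature, attach attributes when the key matches."""
    lookup = {row[key]: row for row in attributes}
    joined = []
    for fid, geom in spatial.items():
        row = dict(lookup.get(fid, {}))  # copy so the attribute table is not mutated
        row[key] = fid
        row["geometry"] = geom
        joined.append(row)
    return joined


features = join(spatial, attributes)
for feature in features:
    print(feature)
```

The feature with no matching attribute row ("DE" here) is kept with its geometry only, which mirrors the usual behavior of a table join on a map layer.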
  • 19. Spatial and Attribute Data Lack of Locational Invariance (Goodchild et al) • fundamental property of spatial analysis • results change when location changes • where matters Attribute table (state name, postal abbreviation): ALABAMA AL; ALASKA AK; ARIZONA AZ; ARKANSAS AR; CALIFORNIA CA; COLORADO CO; CONNECTICUT CT; DELAWARE DE; DISTRICT OF COLUMBIA DC; FLORIDA FL; GEORGIA GA; HAWAII HI; IDAHO ID; ILLINOIS IL; INDIANA IN; IOWA IA; KANSAS KS; KENTUCKY KY; LOUISIANA LA; MAINE ME; MARYLAND MD; MASSACHUSETTS MA; MICHIGAN MI; MINNESOTA MN; MISSISSIPPI MS; MISSOURI MO; MONTANA MT; NEBRASKA NE; NEVADA NV; NEW HAMPSHIRE NH; NEW JERSEY NJ; NEW MEXICO NM; NEW YORK NY; NORTH CAROLINA NC; NORTH DAKOTA ND; OHIO OH; OKLAHOMA OK; OREGON OR; PENNSYLVANIA PA; RHODE ISLAND RI; SOUTH CAROLINA SC; SOUTH DAKOTA SD; TENNESSEE TN; TEXAS TX; UTAH UT; VERMONT VT; VIRGINIA VA; WASHINGTON WA; WEST VIRGINIA WV; WISCONSIN WI; WYOMING WY
  • 20. Representing Data with Raster and Vector Models Raster Model: Area is covered by a grid with (usually) equal-sized, square cells; regular lattices. Attributes are recorded by assigning each cell a single value based on the majority feature (attribute) in the cell, such as land use type. Image data is a special case of raster data in which the “attribute” is a reflectance value from the electromagnetic spectrum. Cells in image data are often called pixels (picture elements). Vector Model: The fundamental concept of vector GIS is that all geographic features in the real world can be represented as: Points or dots (nodes): cities, human sensors, individual obs. Lines (arcs): movement, connectedness, networks. Areas (polygons): countries, states, census tracts, cities; irregular lattices - multivariate in nature.
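The majority rule for raster cells described above can be sketched as follows. This is a minimal illustration under assumptions: the land use labels and the nested-list layout (rows of cells, each cell holding the samples observed inside it) are made up for the example, and the function names are hypothetical.

```python
from collections import Counter


def majority_value(samples):
    """Return the most common value among the samples that fall in one cell."""
    return Counter(samples).most_common(1)[0][0]


def rasterize(samples_by_cell):
    """Collapse a grid of per-cell sample lists into a single value per cell."""
    return [[majority_value(cell) for cell in row] for row in samples_by_cell]


# Illustrative 2 x 2 grid; each cell lists the land use types observed in it.
samples_by_cell = [
    [["forest", "forest", "urban"], ["urban", "urban", "water"]],
    [["water", "water", "water"], ["forest", "urban", "urban"]],
]

grid = rasterize(samples_by_cell)
for row in grid:
    print(row)
```

Note the information loss: the minority features in each cell disappear from the raster, which is one reason cell size matters.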
  • 21. Lattice Data; Yes or No?
  • 22. Lattice Data; Yes or No? Irregular Lattice; Regular Lattice; Irregular Lattice
  • 23. What is spatial analysis? From Data to Information ...beyond mapping. Transformations, manipulations and application of analytical methods to spatial (geographic) data. Lack of locational invariance (Goodchild et al): a fundamental property of spatial analysis. Analyses where the outcome changes when the locations of the objects under study change. Examples: median center vs. median; standard deviational ellipses; autocorrelation vs. spatial autocorrelation. Where matters: in an absolute sense (coordinates) and in a relative sense (spatial arrangement, distance).
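Lack of locational invariance can be shown with a small sketch: an aspatial statistic (the median of the values) stays the same when the same values are assigned to different locations, while a spatial statistic moves. The points and values are invented for illustration, and a weighted mean center is used here in place of the median center named on the slide because it is simpler to compute.

```python
from statistics import median


def weighted_mean_center(points, values):
    """Mean of the coordinates, weighted by the value observed at each point."""
    total = sum(values)
    x = sum(px * v for (px, _), v in zip(points, values)) / total
    y = sum(py * v for (_, py), v in zip(points, values)) / total
    return (x, y)


# Four illustrative locations and the values observed there.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [1, 2, 3, 10]
swapped = [10, 2, 3, 1]  # same values, moved to different locations

# Aspatial statistic: unchanged by the rearrangement.
print(median(values), median(swapped))

# Spatial statistic: changes when the locations of the values change.
center_a = weighted_mean_center(points, values)
center_b = weighted_mean_center(points, swapped)
print(center_a, center_b)
```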
  • 24. Spatial analysis as a process: Problem formulation; Data gathering; Exploratory analysis; Hypothesis formulation; Modeling and testing; Consultation and review; Reporting and implementation
  • 25. Analytical methodologies Mitchell (2005) Draper et al (2005)
  • 26. Analytical methodologies - PPDAC Mackay & Oldford (2002)
  • 27. Components of Spatial Analysis Visualization: showing interesting patterns. Exploratory Spatial Data Analysis (ESDA): finding interesting patterns. Spatial Modeling, Regression: explaining interesting patterns.
  • 28. THE PROBLEM … GEOGRAPHICAL LITERACY Despite having a highly educated society, Americans are arguably the world’s most geographically ignorant people. By comparison, children throughout much of the world are exposed to geographic training in both primary and secondary schools. Most Americans learn what little geography they know in elementary or middle school. In the United States, the last time a student hears the word “geography” is usually in the third grade. Discussion of geography at any higher level is hidden under the heading “social studies”. Concern over geographical illiteracy led President Reagan to declare November 15-21, 1987 as the first Geography Awareness Week (a joint resolution of the One Hundredth Congress).
  • 29. GEOGRAPHY TODAY The National Geographic Society released the Roper Public Affairs 2006 Geographic Literacy Study in May 2006. 510 interviews were conducted among a sample of 18- to 24-year-old adults in the continental United States between December 17, 2005 and January 20, 2006. The sample has a margin of error of +/- 4.4% at the 95% confidence level. Survey results … Over six in ten (63%) of those surveyed could not locate Iraq on a map of the Middle East. Nearly nine in ten (88%) could not identify Afghanistan on a map of Asia. Seven in ten (70%) could not find North Korea on a map, and 63% did not know its border with South Korea is the most heavily fortified in the world. Sizeable percentages did not know that Sudan and Rwanda are located in Africa (54% and 40%, respectively).
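For readers who want to see where a margin of error like the one quoted above comes from, the sketch below applies the textbook formula for a proportion under simple random sampling, z * sqrt(p(1-p)/n), with the worst case p = 0.5. This is an assumption about the method, not the survey's documented procedure: with n = 510 the formula gives roughly 4.3%, slightly below the reported 4.4%, and the sketch does not try to account for that difference.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion under simple random sampling.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: critical value (1.96 for a 95% confidence level).
    """
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(510)
print(f"{moe:.1%}")
```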
  • 30. GEOGRAPHY TODAY (CONTINUED) Three-quarters could not find Indonesia on a world map and were unaware that a majority of Indonesia’s population is Muslim, making it the largest Muslim country in the world. A third or more could not find Louisiana or Mississippi on a map of the United States. Only 18% could correctly answer a multiple-choice question about the most widely spoken native language in the world. (5 Part Questionnaire) Although half said map reading skills are “absolutely necessary” in today’s world, many Americans lack basic practical skills necessary for safety and employment in today’s world. One-third (34%) would go in the wrong direction in the event of an evacuation. One-third (32%) would miss a conference call scheduled with colleagues in another time zone. Recommended Link: 2006 National Geographic – Roper Survey of Geographic Literacy http://www.nationalgeographic.com/roper2006/findings.html
  • 31. Advanced Placement Human Geography This college-level course introduces students to the systematic study of patterns and processes that have shaped human understanding, use, and alteration of Earth's surface. Students employ spatial concepts and landscape analyses to analyze human social organization and its environmental consequences. They also learn about the methods and tools geographers use in their science and practice. Score distribution (score: percent): 5: 11.6%; 4: 16.7%; 3: 21.9%; 2: 16.6%; 1: 33.2%. In the 2009 administration, 50,730 students took the exam and the mean score was 2.57.
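The mean score quoted above can be checked against the score distribution on the same slide with a weighted mean. The short sketch below uses only the percentages from the slide; the variable names are illustrative.

```python
# Share of students at each AP score, as listed on the slide.
score_share = {5: 0.116, 4: 0.167, 3: 0.219, 2: 0.166, 1: 0.332}

# Normalize by the total share in case of rounding in the published percentages.
total_share = sum(score_share.values())
mean_score = sum(score * share for score, share in score_share.items()) / total_share

print(round(mean_score, 2))
```

The weighted mean agrees with the 2.57 reported for the 2009 administration.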
  • 32. Human Geography
  • 33. Human Geography http://www.benjaminbarber.com/bio.html
  • 34. Human Geography
  • 35. WHAT IS GEOGRAPHY? • Geography is the study of the earth’s surface as the space within which human populations live • Geography combines characteristics of both the natural and social sciences and literally bridges the gap between the two - more on this later. • Geography is a generalized, as opposed to a specialized, field of study • Space is the unifying theme for geographers • Geography is the science of space and place • Geographers are interested in … • Where things are located on the earth’s surface • Why they are located where they are • How places differ from one another • How people interact with the environment • Geographers were among the first scientists to sound the alarm that human-induced changes to the environment are beginning to threaten the balance of life
  • 36. What was wrong with Geography? Geography had a number of problems, including: 1. It was overly descriptive: Geography followed a set format for the inventory of physical and cultural features. 2. It was almost purely educational: Regions don't really exist. 3. It failed to explain geographic patterns: Geography was descriptive and did not explain why patterns were the way they were. Where attempts at explanation did exist, they favored historical approaches. 4. The biggest problem of geography was the fact that it was unscientific. …the Nomothetic & Idiographic debate in geography begins!
  • 37. Introduction to Spatial Analysis Topics •Description versus Analysis •The concepts of Process, Pattern and Analysis •Issues and challenges in spatial data analysis •Measuring space Richard Heimann © 2013 Thursday, February 21, 13
  • 38. Process, Pattern and Analysis Processes operating in space produce patterns. Spatial Analysis is aimed at: 1. Identifying and describing the pattern 2. Identifying and understanding the process
  • 39. Complete Spatial Randomness Deviations from spatial randomness suggest underlying social processes. “Every observable effect has a physical cause” (Thales). Perhaps the most profound insight: causality is a rejection of randomness. [Figure panels: Total TTL Count, 500 meter cell; Randomized Variable, 500 meter cell]
  • 40. Complete Spatial Randomness “Every observable effect has a physical cause” (Thales). Perhaps the most profound insight: causality is a rejection of randomness. [Figure panels: Total TTL Count, 500 meter cell; Randomized Variable, 500 meter cell]
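One common way to compare an observed pattern with complete spatial randomness (CSR) is a quadrat count: bin points into equal cells and compare the variance of the counts with their mean. The sketch below is illustrative only. The point sets are simulated, the 10 x 10 grid of 500 m cells is an assumption loosely modelled on the slide's "500 meter cell" labels, and the function names are hypothetical.

```python
import random
from statistics import mean, pvariance

def quadrat_counts(points, n_cells, extent):
    """Count points in an n_cells x n_cells grid covering [0, extent] squared."""
    counts = [0] * (n_cells * n_cells)
    size = extent / n_cells
    for x, y in points:
        col = min(int(x / size), n_cells - 1)
        row = min(int(y / size), n_cells - 1)
        counts[row * n_cells + col] += 1
    return counts

def variance_mean_ratio(counts):
    """Close to 1 under CSR; well above 1 when points cluster."""
    return pvariance(counts) / mean(counts)

rng = random.Random(42)
extent, n_cells, n_points = 5000.0, 10, 2000  # assumed: 10 x 10 grid of 500 m cells

def clamp(v):
    """Keep simulated coordinates inside the study area."""
    return max(0.0, min(extent, v))

# Simulated CSR pattern: coordinates drawn uniformly over the study area.
csr = [(rng.uniform(0, extent), rng.uniform(0, extent)) for _ in range(n_points)]
# Simulated clustered pattern: points concentrated around the centre.
clustered = [(clamp(rng.gauss(2500, 400)), clamp(rng.gauss(2500, 400)))
             for _ in range(n_points)]

vmr_csr = variance_mean_ratio(quadrat_counts(csr, n_cells, extent))
vmr_clustered = variance_mean_ratio(quadrat_counts(clustered, n_cells, extent))
print(vmr_csr, vmr_clustered)
```

A ratio far from 1 is the kind of deviation from randomness that the slide suggests points to an underlying process.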
  • 41. Description vs. Analysis Description: Most GIS systems are used by governments and private companies to describe the real world. This helps the organization “do its job”, for example, manage sewer and water networks or manage land resources. Most GIS systems are primarily designed for this purpose. They are used to develop spatial databases to describe the real world and help manage it.
  • 42. Description vs. Analysis Analysis: Tries to understand the processes which cause or create the patterns in the real world. Understanding processes: Helps the organization do its job better (make better decisions, for example). Helps us understand the phenomenon itself. This is the role of science.
  • 43. Description vs. Analysis Is the location of the software industry different from that of the telecommunications industry? Here, we are using “centrographic statistics” to help answer this question. Analysis: Tries to understand the processes which cause or create the patterns in the real world. Understanding processes: Helps the organization do its job better (make better decisions, for example). Helps us understand the phenomenon itself. This is the role of science.
  • 44. Dr. Snow maps cholera in Soho London (1854) Richard Heimann © 2013 Thursday, February 21, 13
  • 45. The first example of Spatial Analysis Richard Heimann © 2013 Thursday, February 21, 13
  • 46. The first example of Spatial Analysis • John Snow’s maps of cholera in 1850s London Richard Heimann © 2013 Thursday, February 21, 13
  • 55. The first example of Spatial Analysis • John Snow’s maps of cholera in 1850s London. Was it ESDA or hypothesis testing? • Did he discover the association between water and cholera after drawing the map? That would be ESDA. • Did he draw the map in order to prove the association? That would be using a map for hypothesis testing.
  • 56. Spatial Analysis: successive levels of sophistication Four levels of Spatial Analysis (each is more advanced and more difficult): 1. Spatial data description (the primitives) 2. Exploratory Spatial Data Analysis (ESDA) 3. Spatial statistical analysis and hypothesis testing 4. Spatial modeling and prediction. We will look at all 4 levels in this lecture series.
  • 58. Spatial Analysis: successive levels of sophistication 1. Spatial data description (primitive): Focus is on describing the world and representing it in a digital format (computer map, computer database). Uses classic GIS capabilities (buffering, map layer overlay, spatial queries & measurement).
  • 60. Spatial Analysis: successive levels of sophistication 2. Exploratory Spatial Data Analysis: Searching for patterns and possible explanations. GeoVisualization through calculation and display of centrographic statistics and other spatially descriptive statistics.
  • 61. Spatial Analysis: successive levels of sophistication 2. Exploratory Spatial Data Analysis: Centrographics (Moments of Data). [Figure: Map showing changes to the mean center of population for the United States, 1790–2010 (U.S. Census Bureau)] http://en.wikipedia.org/wiki/Moment_(mathematics)
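Two basic centrographic statistics are the mean center (a first moment of the coordinates) and the standard distance (a second moment around that center). The sketch below uses made-up planar coordinates and hypothetical function names, and it ignores weights and map projection, both of which a real population-center calculation would need.

```python
import math

def mean_center(points):
    """Average x and average y of a set of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))

# Hypothetical locations at the corners of a 2 x 2 square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(mean_center(square), standard_distance(square))
```

Comparing the mean centers and standard distances of two point sets (for example, two industries) is one simple way to describe how their locations differ.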
  • 62. Spatial Analysis: successive levels of sophistication Richard Heimann © 2013 Thursday, February 21, 13
  • 63. Spatial Analysis: successive levels of sophistication The Geography of the Nazi Vote: Context, Confession, and Class in the Reichstag Election of 1930 Author(s): John O'Loughlin, Colin Flint, Luc Anselin Source: Annals of the Association of American Geographers Richard Heimann © 2013 Thursday, February 21, 13
  • 64. Spatial Analysis: successive levels of sophistication 3. Spatial statistical analysis and hypothesis testing: Are data “to be expected” or are they “unexpected” relative to some statistical model, usually of a random process (pure chance)? [Figure: standard normal curve centered on 0, with 2.5% rejection regions in each tail beyond -1.96 and 1.96] We can test whether the spatial pattern of voting behavior in Germany in 1930 is in fact clustered or random. The Geography of the Nazi Vote: Context, Confession, and Class in the Reichstag Election of 1930. Author(s): John O'Loughlin, Colin Flint, Luc Anselin. Source: Annals of the Association of American Geographers
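The -1.96 and 1.96 cutoffs on this slide are the critical values of a two-tailed test at the 5% level under a standard normal reference distribution. The sketch below is a generic illustration with hypothetical function names; it is not the procedure used in the cited paper.

```python
import math

def z_score(observed, expected, std_error):
    """Standardize an observed statistic against its expectation under the null."""
    return (observed - expected) / std_error

def two_tailed_p(z):
    """Probability of a value at least as extreme as z in either tail."""
    return math.erfc(abs(z) / math.sqrt(2))

# A statistic exactly at the critical value leaves 2.5% in each tail.
print(two_tailed_p(1.96))
```

A statistic with |z| greater than 1.96 is "unexpected" under the random model at the 5% level, which is the sense in which a pattern is declared clustered (or dispersed) rather than random.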
  • 65. Making things even harder... • Inward and outward asymptotics, i.e. increasing spatial extent, increasing temporal lags, finer spatial resolution, finer temporal resolution. • Increased number of cross sections. • …visual correlations and visual detection of change over space and time do not exist. • Apophenia is real! • Spatial Analysis and Geographic Pattern Recognition will reduce patternicity (Shermer, 2008).
  • 66. Spatially Random or Spatially Clustered? Richard Heimann © 2013 Thursday, February 21, 13
  • 67. Spatially Random or Spatially Clustered? [Two map panels, one value each] Moran’s I: 0.689; Moran’s I: 0.003
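Moran's I summarizes whether neighboring values are similar (positive), dissimilar (negative), or unrelated (near zero). The sketch below computes it on a small hypothetical grid with binary rook-contiguity weights; the grid values and function names are illustrative and unrelated to the maps on the slide.

```python
def rook_neighbors(rows, cols):
    """Map each cell index to the indices of cells that share an edge with it."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            nbrs[r * cols + c] = [
                rr * cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
    return nbrs

def morans_i(values, nbrs):
    """Moran's I with binary weights: w_ij = 1 if j is a neighbor of i, else 0."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    s0 = sum(len(js) for js in nbrs.values())  # sum of all weights
    cross = sum(dev[i] * dev[j] for i, js in nbrs.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in dev)

nbrs = rook_neighbors(4, 4)
# Clustered: left half of the grid is 1, right half is 0.
clustered = [1 if c < 2 else 0 for r in range(4) for c in range(4)]
# Dispersed: checkerboard of alternating 0s and 1s.
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

print(morans_i(clustered, nbrs), morans_i(checkerboard, nbrs))
```

The clustered grid gives a clearly positive value and the checkerboard gives the most negative value; a value near zero, like the 0.003 on the slide, is what a spatially random arrangement tends to produce.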
  • 68. Spatial Analysis: successive levels of sophistication 4. Spatial modeling: prediction. Construct models (of processes) to predict spatial outcomes (patterns). [Map panels: Coefficient: % Poverty; Coefficient: % FB; Coefficient: % Elderly; Coefficient: % Black]
  • 69. Spatial Analysis: successive levels of sophistication Richard Heimann © 2013 Thursday, February 21, 13
  • 70. Spatial Analysis: successive levels of sophistication Statistically significant global variables that exhibit strong regional variation inform local policy. Statistically significant global variables that exhibit little regional variation inform region-wide policy.
  • 71. Spatial Analysis: successive levels of sophistication Local R2 informs us where the model is performing well and where it is performing poorly. The poor results in the south may indicate that an important variable is missing from our model. Richard Heimann © 2013 Thursday, February 21, 13
  • 72. Issues/Challenges/Problems in Spatial Analysis Summarize these now. Talk in greater detail about them throughout this lecture series. Richard Heimann © 2013 Thursday, February 21, 13
  • 73. Critical Issues in Spatial Analysis Richard Heimann © 2013 Thursday, February 21, 13
  • 79. Critical Issues in Spatial Analysis • Spatial autocorrelation: Data from locations near to each other are usually more similar than data from locations far away from each other. • Modifiable areal unit problem (MAUP, zone effect): Results may depend on the specific geographic unit used in the study (province or county; county or city). • Scale affects representation and results: Cities may be represented as points or polygons. Results depend on the scale at which the analysis is conducted: province or county (MAUP, scale effect). • Ecological fallacy: Results obtained from aggregated data (e.g. provinces) cannot be assumed to apply to individual people (MAUP, individual effect). • Non-uniformity of space: Phenomena are not distributed evenly in space. Be careful how you interpret results! • Edge issues: Edges of the map, beyond which there is no data, can significantly affect results.
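The zone and scale effects of the MAUP can be seen with a toy example: the same underlying cells, aggregated under different zonings, produce different pictures. Everything in the sketch below (cell values, zonings, names) is hypothetical and chosen only to make the effect obvious.

```python
from statistics import mean

# Hypothetical values for 8 cells arranged in a row.
cells = [0, 0, 10, 10, 0, 0, 10, 10]

def zone_means(values, zones):
    """Average the cell values inside each zone (a zone is a list of cell indices)."""
    return [mean(values[i] for i in zone) for zone in zones]

pairs = [[0, 1], [2, 3], [4, 5], [6, 7]]       # zone boundaries aligned with the pattern
shifted = [[0], [1, 2], [3, 4], [5, 6], [7]]   # zone effect: boundaries moved by one cell
halves = [[0, 1, 2, 3], [4, 5, 6, 7]]          # scale effect: larger zones

print(zone_means(cells, pairs))
print(zone_means(cells, shifted))
print(zone_means(cells, halves))
```

The aligned zoning shows sharp contrasts, the shifted zoning smooths most of them away, and the coarser zoning hides the variation entirely, even though the cell data never change.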
  • 80. The common problems... http://www.amazon.com/GIS-20-Essential-Skills/dp/1589482565 Richard Heimann © 2013 Thursday, February 21, 13
  • 81. Measuring Space Richard Heimann © 2013 Thursday, February 21, 13
  • 82. Fundamental Spatial Concepts Distance: The magnitude of spatial separation. Euclidean (straight line) distance is often only an approximation. Adjacency or neighborhood: Nominal or binary (0,1) equivalent of distance. Levels of adjacency exist: 1st, 2nd, 3rd nearest neighbor, etc. Interaction: The strength of the relationship between entities. An inverse function of distance.
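The three concepts above can be sketched as three small functions. The distance-threshold rule for adjacency and the power-law form of distance decay (with exponent `beta`) are assumptions made for illustration; the slide only states that adjacency is a binary equivalent of distance and that interaction is an inverse function of distance.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def adjacent(a, b, threshold):
    """Binary adjacency (assumed rule): 1 if the points lie within the threshold distance."""
    return 1 if 0 < euclidean(a, b) <= threshold else 0

def interaction(a, b, beta=2.0):
    """Inverse-distance interaction (assumed form): 1 / d ** beta, for distinct points."""
    return 1.0 / euclidean(a, b) ** beta

origin, place = (0, 0), (3, 4)
print(euclidean(origin, place), adjacent(origin, place, 5), interaction(origin, place))
```

In practice adjacency is often defined by shared borders or k nearest neighbors instead of a distance cutoff, and the decay exponent is estimated from data.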
  • 83. Review (Part 1) What is Spatial Analysis? What are the four levels of Spatial Analysis? What are the three measures? Richard Heimann © 2013 Thursday, February 21, 13
  • 84. Take a Break! Richard Heimann © 2013 Thursday, February 21, 13
  • 85. Nontraditional Spatial Analysis Traditional spatial analyses grew up in an era of sparse data and very weak computational power. Today, both of those circumstances are reversed and many of the old solutions are no longer suitable to answer today's questions. "Spatial Analysis and Data Mining" reflects this change and combines two things which, until recently, engaged quite different groups of researchers and practitioners. Together, they require particular techniques and a sophisticated understanding of the special problems associated with spatial data. This geographic data mining, or Geographic Knowledge Discovery (GKD), is not new, but is developing and changing rapidly as both more, and different, data become available, and people see new applications. The days of ‘Big Data’ require fresh thinking. The aim of geographic data mining (GKD) is to assist in the generation of testable hypotheses about interesting or anomalous spatial patterns which may be discovered in very large databases. It is important that the patterns discovered should not be statistical or sampling artifacts, and should be nontrivial and useful. The intent is not to build a system that makes decisions or interpretations automatically, but one that supports humans in these tasks. Also, GKD is not synonymous with statistical analyses; such tools have a role in the testing of hypotheses generated by GKD but not in GKD itself.
  • 86. DATA is the new OIL… Richard Heimann © 2013 Thursday, February 21, 13
  • 87. Long Tail of Big Data Head: Big Data. Long Tail: Intelligence Reporting, Science Data, Dark Data. Head (Big Data): Large continuous datasets coincident over time and space. Ideal for multivariate analysis. The tail (power law distribution) is good for business but suboptimal for governance. Data in the tail are often individually curated and unmaintained beyond their initially designed use case. As a result, the data are discontiguous from other research efforts and discontinuous over space and time. Dark data is data that is suspected to exist, or ought to exist, but is difficult or impossible to find. The problem of dark data is real and prevalent in the tail. The long tail is an intractably large management problem.
  • 88. Long Tail of NSF data… Power law split: 80% of grants: 7,478 grants, $938,548,595. 20% of grants: 1,869 grants, $1,199,088,125. Total grants (NSF07): 9,347 (count), $2,137,636,716 (amount).
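The skew in these figures is easier to see as shares. The sketch below recomputes the shares from the two group rows on the slide; the dictionary keys are labels invented here, and the dollar total is taken as the sum of the two rows.

```python
# Figures copied from the slide, keyed by illustrative group labels.
grants = {"80% group": 7478, "20% group": 1869}
dollars = {"80% group": 938_548_595, "20% group": 1_199_088_125}

# Share of all grants, and of all dollars, held by the smaller group.
grant_share = grants["20% group"] / sum(grants.values())
dollar_share = dollars["20% group"] / sum(dollars.values())

print(round(grant_share, 3), round(dollar_share, 3))
```

Roughly a fifth of the grants account for more than half of the dollars, which is the head of the distribution; the remaining four fifths form the long tail.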
  • 89. Long Tail of data science… Head vs. Tail: Homogeneous vs. Heterogeneous; Centralized curation vs. Individual curation; Maintained vs. Unmaintained; Continuous over S & T vs. Discontinuous over S & T; Visibly accessible vs. DARK Data; High Velocity vs. Slow or NO velocity; High Volume vs. Low Volume; Easier Data Integration vs. Harder Data Integration; Unreasonable Effectiveness of Data vs. Reasonable Effectiveness of Data; Open Innovation (Integrated Research) vs. Closed Innovation (Vertical Research)
  • 90. The Open Innovation Model In the new model of open innovation, a company commercializes both its own ideas and innovations from other firms, and seeks ways to bring its in-house ideas to market by deploying pathways outside its current businesses. Note that the boundary between the company and its surrounding environment is porous (represented by a dashed line), enabling innovations to move more easily between the two. Henry W. Chesbrough, "The Era of Open Innovation," MIT Sloan Management Review, Spring 2003.
  • 91. “The Unreasonable Effectiveness of Mathematics in the Natural Sciences”, Eugene Wigner (Nobel Laureate), 1960. “The Unreasonable Effectiveness of Data”, Peter Norvig, Director of Research at Google Inc.
  • 92. Big Data, Small Theory Spatial Simpson’s Paradox: Global standards will always compete with local social phenomena. Global models average regionally variant phenomena. Local models account for regional variation. [Figure labels: Violence; Violence in the north; Violence in the south]
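A small numeric example shows how a global model can point in the opposite direction from every regional model. The data below are invented for illustration (two regions labelled north and south, one predictor, one outcome), and ordinary least squares stands in for whatever global and local models an analyst would actually use.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical observations: (predictor values, outcome values) per region.
north = ([1, 2, 3], [10, 9, 8])
south = ([6, 7, 8], [15, 14, 13])

slope_north = ols_slope(*north)
slope_south = ols_slope(*south)
slope_global = ols_slope(north[0] + south[0], north[1] + south[1])

print(slope_north, slope_south, slope_global)
```

Within each region the relationship is negative, yet the pooled fit is positive because the regions differ in their overall levels. A single global coefficient averages over, and here reverses, the regional relationships.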
  • 93. New Aged Experimentation George Box: “The only way to understand complex systems is to shock those systems and observe the way they react.” New motivation for experimentation, especially in quasi-experimental methods. (...more later)
  • 94. New Aged Experimentation Richard Heimann © 2013 Thursday, February 21, 13
  • 95. Nontraditional Datasets Twitter: Sampled ongoing collection of social media tweets with UserId and time. Some even have precise location data, but this is not the norm. Collection pulls roughly 1-2 million tweets per day. Example proxy problems: Discovery of crowd-sourced phenomena (e.g., people posting to beware of a certain neighborhood). Discovery of correlated trends (e.g., finding that people posting about a certain topic in an area correlates with higher crime in that area). Tracking sentiment on certain topics and issues. Tracking language usage in areas to determine abnormal language presence in an area.
  • 96. What is Geographic Knowledge Discovery? • How can we infer movement patterns from vast amounts of what appears to be just point data collected in time and associated with an identifier? • Technique is applicable to Twitter, FourSquare and MANY others. [Figure: Volume plot of photos binned by area on log scale; Paris as seen from Flickr over all time]
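A volume plot like the one described bins point locations into grid cells and shows the count per cell on a log scale, so that a few very busy cells do not wash out everything else. The sketch below uses a handful of invented coordinates near Paris (not real Flickr data), a 0.01 degree cell size chosen arbitrarily, and hypothetical function names.

```python
import math
from collections import Counter

def bin_counts(points, cell_deg):
    """Count (lat, lon) points per grid cell of cell_deg degrees on a side."""
    return Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )

def log_volume(counts):
    """Base-10 log of the count in each occupied cell."""
    return {cell: math.log10(n) for cell, n in counts.items()}

# Invented photo locations: one very busy cell, one moderate, one sparse.
photos = [(48.858, 2.294)] * 100 + [(48.861, 2.336)] * 10 + [(48.887, 2.343)]

counts = bin_counts(photos, cell_deg=0.01)
volumes = log_volume(counts)
print(sorted(volumes.values()))
```

On a linear scale the busiest cell is 100 times the sparsest; on the log scale the three cells are evenly spaced, which is why log binning is a common choice for heavily skewed point data.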
  • 97. What is Geographic Knowledge Discovery?? Aggregate micro-pathing on a world of photo metadata with no speed, time, or distance restrictions Richard Heimann © 2013 Thursday, February 21, 13
  • 98. Personal Notes Richard Heimann Office: UMBC Common Faculty Area 3rd Floor Phone: 571-403-0119 (C) Office hours: Tues. 6:30-7:00 (Virtual); or by appointment (send e-mail) I promptly respond to emails. Phone calls are another matter. Email: rheimann@umbc.edu or heimann.richard@gmail.com Richard Heimann © 2013 Thursday, February 21, 13
  • 99. Thank you… Data Tactics Corporation https://www.data-tactics-corp.com/ http://datatactics.blogspot.com/ Twitter: @DataTactics Rich Heimann Twitter: @rheimann Richard Heimann © 2013 Thursday, February 21, 13