Predicting growth of urban agglomerations through fractal analysis of geo spatial data

14th Esri India User Conference 2013

PREDICTING GROWTH OF URBAN AGGLOMERATIONS THROUGH FRACTAL ANALYSIS OF
GEO-SPATIAL DATA
Ashish Vanmali1, Saket Porwal2, Vikram Gadre3, Anshuman Gupta4, Laveesh Bhandari5
1,2,3 Department of Electrical Engineering, IIT Bombay, Mumbai – 400076, India
4,5 Indicus Analytics, Nehru House, 4 Bahadur Shah Zafar Marg, New Delhi – 110002, India

Abstract:
Location Analytics is one of the fastest emerging fields in the broad area of Business Intelligence/Data Science. By some
industry estimates, almost 80% of all data has a location dimension to it. Consequently, identification of trends and patterns
in spatially distributed information has far reaching applications ranging from urban planning to logistics and supply chain
management, location based marketing, sales territory planning and retail store location. In view of this, we present an
approach based on Fractal Analysis (FA) of highly granular geo-spatial data. Specifically, we use proprietary data available at
approximately 1 square km level for New Delhi, India provided by Indicus Analytics (India’s leading economic data analytics
firm based in New Delhi). We compare and contrast the patterns and insights generated using the FA approach with other more
traditional approaches such as spatial correlation and structural similarity indices. Preliminary results indicate that there
are indeed “self-similar” local patterns that are completely missed by spatial correlation but are accurately captured by the
more sophisticated FA approach. These patterns provide deep insights into the underlying socio-economic and demographic
processes and can be used to predict the spatial distribution of these variables in the future. For example, questions such as
what are the pockets of population growth in a city and how will businesses and government respond to that growth can be
answered using the proposed approach.

About the Author:
Mr. Ashish Vanmali, Ph.D. (Pursuing)
Ashish V. Vanmali received his B.E. (EXTC) in 2001 from
University of Mumbai, India and M. Tech. (Electrical
Engg.) in 2008 from IIT Bombay, India. He is a Ph.D.
candidate with Department of Electrical Engineering, IIT
Bombay, India. He is an Assistant Professor with
Vidyavardhini's C.O.E. & Tech., University of Mumbai,
India. His research interests include image and video
processing, biometrics, and data fusion.
E mail ID: ashish@ee.iitb.ac.in
Contact No.: +91 9890120301
Mr. Saket Porwal, M.Tech.(Pursuing)
Saket Porwal is a final year M.Tech. student at IIT
Bombay. His research interests include time-frequency
aspects of signal processing and their application in data
analysis. For additional details please refer to
in.linkedin.com/pub/saket-porwal/41/959/250/
E mail ID: saketporwal@ee.iitb.ac.in
Dr. Vikram Gadre
Vikram Gadre is Professor with Department of Electrical
Engineering, IIT Bombay, India. For details refer to
http://www.ee.iitb.ac.in/web/faculty/homepage/vmgadre
Dr. Anshuman Gupta
Anshuman Gupta is Vice President with Indicus
Analytics, New Delhi. He has over 15 years of
experience in business analytics in the retail and CPG
domains. For additional information please refer to
in.linkedin.com/pub/anshuman-gupta-phd/9/b56/414/
Dr. Laveesh Bhandari
Laveesh Bhandari is Director with Indicus Analytics. He
completed his Ph.D. in Economics from Boston
University. He has worked at NCAER conducting studies
on Indian Industry and infrastructure, taught at IIT
Delhi, and now heads Indicus Analytics – India’s premier
economics research firm. He has authored and co-authored numerous publications on socio-economic
development, health, education, poverty, inequality,
etc. He writes frequently for newspapers such as
Business Standard, Economic Times, India Today, etc.
Email ID: laveesh@indicus.net

Introduction:
All business operations occur within a context defined by their location [1]. Broadly, decision making requires triangulation
between (1) internal operational and usage data, (2) data on competing or synergistic options and (3) a precise idea of the scale
and character of the economic activity and demography of the area. However, information for many locations across India (and the
world) is either missing, flawed or simply not comparable, for any number of reasons. Consequently, most decision-making
processes end up being based on imperfect or inadequate information and rely heavily on norms, rules of thumb and/or gut instinct.
Thus, most decisions dependent upon location information are constrained and suboptimal. To address this challenge, this paper
describes a novel Fractal Analysis (FA) based approach for analyzing highly granular geo-spatial data to generate deep
location-based insights by leveraging a number of key functionalities of ESRI’s technology platform.
Previously, Lee De Cola [2] presented the use of fractal analysis for classification of remotely sensed images. In [3], Pierre
Frankhauser presented fractals as a tool for urban data analysis and also noted the need for supplementary measures for a
complete analysis. De Keersmaecker et al. [4] presented a comparison of fractal-based parameters calculated by different fractal
methods for characterizing intra-urban diversity. Myint [5] provided a comparative study of various approaches for texture
analysis and classification of remotely sensed data. A similar study of various spatial methods was carried out by Dale et al. [6],
who concluded that no single method can reveal all the important characteristics of spatial data and that the results of different
analyses are not expected to be completely independent of each other. In this paper, we build on this and other research as
described next.

Data and Study Area:
The data under consideration consist of six quantities measured over grid cells of 1 sq. km area in the city of Delhi.
Each “grid cell” is identified by the latitude and longitude of its centroid. Overall, data was taken at 1602 different locations in
Delhi. Specifically, we use proprietary data provided by Indicus Analytics for the following 6 variables/indices: (i) Population,
(ii) Night-time light intensity, (iii) Points-of-Interest (POI), (iv) Road Length, (v) Index of telecom call intensity and (vi) Index of
property tax collections. Table I lists these variables and the acronyms used in this paper. These are chosen as a sample set from
the set of 5000+ socio-economic and demographic variables available with Indicus. The inverse distance weighted interpolation
method [7,8] is employed for interpolation, as it is the preferred choice for geographical data interpolation. In this method,
nearby points contribute more to the interpolation than distant ones. Interpolated values are a weighted sum of the known
values, with each weight decreasing with the distance between the interpolation point and the known point.
The simplest form of inverse distance weighting, known as the Shepard method [7,8], is used in the present work, with the
weight function
$$ w_i = \frac{h_i^{-p}}{\sum_{j=0}^{n} h_j^{-p}} \tag{1} $$

where $p$ is an arbitrary positive real number called the power parameter ($p$ is taken as 2 in the present work) and $h_i$ is the
Euclidean distance given by
$$ h_i = \sqrt{(x - x_i)^2 + (y - y_i)^2} \tag{2} $$

where $(x, y)$ are the coordinates of the interpolation point and $(x_i, y_i)$ are the coordinates of each dispersion point. The weight
function decays from unity to zero as the distance to the dispersion point increases. The weight functions are normalized to
ensure that the weights sum to one. The interpolated value $P(x, y)$ is then calculated as the weighted sum
$$ P(x, y) = \sum_{i=0}^{n} w_i P(x_i, y_i) \tag{3} $$
No extrapolation was used for the points outside the boundary of Delhi and these points were taken to be zero.

Table I
Quantities Under Consideration
Quantity                                 Acronym
Population                               POP
Night-time light intensity               RAD
Index of telecom call intensity          CALLS
Index of number of Points-of-Interest    POI
Index of road length                     ROADS
Index of property tax                    TAX

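The Shepard interpolation of Eqs. (1)-(3) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the ArcGIS-based processing used in this work: the function name `idw_interpolate`, its arguments, and the handling of a query point that coincides exactly with a known point (where the weight would otherwise be undefined) are assumptions made for the example.

```python
import numpy as np

def idw_interpolate(xy_known, values, xy_query, p=2.0):
    """Shepard inverse distance weighting, following Eqs. (1)-(3).

    xy_known: (n, 2) coordinates of the dispersion points
    values:   (n,) known values P(x_i, y_i)
    xy_query: (m, 2) coordinates of the interpolation points
    p:        power parameter (2 in the paper)
    """
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)

    # Euclidean distances h_i of Eq. (2), shape (m, n).
    h = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)

    out = np.empty(len(xy_query))
    for k, row in enumerate(h):
        hit = row == 0.0
        if hit.any():
            # Assumption: a query point lying on a known point takes its value.
            out[k] = values[hit][0]
        else:
            w = row ** -p          # unnormalized weights h_i^(-p)
            w /= w.sum()           # normalization of Eq. (1): weights sum to one
            out[k] = w @ values    # weighted sum of Eq. (3)
    return out
```

For two known points with values 1 and 3 placed at (0, 0) and (2, 0), the midpoint (1, 0) receives equal weights and hence the value 2, while a point closer to (0, 0) is pulled towards 1.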
All the data extraction and processing is carried out using ESRI ArcGIS technology. For example, the night-time lights raster
image is processed using the ‘Zonal Statistics’ and ‘Zonal statistics as table’ tools under the ‘Spatial Analyst’ toolset of ArcGIS.
The ‘Georeferencing’ tools provided in ArcGIS were used for all georeferencing issues. The ‘Extract values to points’ under the
‘Spatial Analyst’ toolset was used for extracting values of individual cells in a raster.

Analysis Methodology:
(a) Spatial Correlation Coefficient
The spatial correlation coefficient is a common parameter by which one can comment on the global similarity between two
signals. The correlation coefficient is computed as
$$ \rho = \frac{\sum_m \sum_n (X_{mn} - \bar{X})(Y_{mn} - \bar{Y})}{\sigma_x \sigma_y} \tag{4} $$
where $X$ and $Y$ are the 2-D signals with means $\bar{X}$ and $\bar{Y}$ respectively, and $\sigma_x$ and $\sigma_y$ are the standard deviations of $X$ and $Y$
respectively, given by
$$ \sigma_x = \sqrt{\sum_m \sum_n (X_{mn} - \bar{X})^2} \tag{5} $$
and
$$ \sigma_y = \sqrt{\sum_m \sum_n (Y_{mn} - \bar{Y})^2} \tag{6} $$
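As a minimal sketch of Eqs. (4)-(6), the coefficient can be computed directly from two equally sized 2-D arrays. The function name `spatial_correlation` is hypothetical, and the sketch assumes that neither input is constant (otherwise the denominator is zero).

```python
import numpy as np

def spatial_correlation(X, Y):
    """Global correlation coefficient between two 2-D signals, Eqs. (4)-(6)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    dx = X - X.mean()                    # X_mn minus the mean of X
    dy = Y - Y.mean()                    # Y_mn minus the mean of Y
    sigma_x = np.sqrt((dx ** 2).sum())   # Eq. (5)
    sigma_y = np.sqrt((dy ** 2).sum())   # Eq. (6)
    return float((dx * dy).sum() / (sigma_x * sigma_y))  # Eq. (4)
```

With these definitions the result lies between -1 and 1 and agrees with the usual Pearson coefficient of the flattened arrays.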

(b) Fractal Characteristic & Fractal Dimension
A quantity is called fractal if its fractal dimension is non-integer [9,10]. A fractal is a mathematical object that is both
self-similar and chaotic: self-similar means that the object looks the same at different scales, and chaotic means that the object
is also complex [9,10]. More formally, a continuous function $f$ is said to be self-similar [9,10] if there exist disjoint subsets
$S_1, S_2, \ldots, S_k$ such that the function $f$ on each $S_i$ is an affine transformation of $f$, i.e. there exist a scale $l_i > 1$, a translation $r_i$,
a constant $c_i$ and a weight $w_i$ such that
$$ f(t) = c_i + w_i f(l_i (t - r_i)), \quad \forall t \in S_i \tag{7} $$
The fractal dimension of a point is zero, of a line segment is one, of a square is two, and of a cube is three. In general, however,
the fractal dimension is not an integer but a fraction. To determine the fractal characteristics of a function, its fractal
dimension can be calculated. There are several definitions of fractal dimension in the literature; the best-known measure is
the box counting dimension [10], given by
$$ F.D. = -\lim_{\varepsilon \to 0} \frac{\log_e N(\varepsilon)}{\log_e \varepsilon} \tag{8} $$
where $N(\varepsilon)$ is the minimum number of boxes of size $\varepsilon$ needed to entirely enclose the object.
In practice the box size cannot be zero, but one can go down to the pixel level, i.e. the smallest usable box size is a single
pixel of the image. In the present work the data sets are 64×64 images. Hence, if the side of the 64×64 image is taken as
unity, the size of a pixel is 1/64. The data is converted to a binary image by thresholding [11]
and the box counting algorithm [10] is then applied.
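A minimal sketch of box counting on a binary image is given below. It is one common way to estimate Eq. (8) and is not necessarily the exact procedure used in this work: the function names are hypothetical, the input is assumed to be an already thresholded, non-empty, square image whose side is a power of two (such as 64×64), and the limit in Eq. (8) is approximated by the slope of a least-squares line fitted to log N(ε) against log ε over box sizes from one pixel up to the whole image.

```python
import numpy as np

def box_count(binary, box):
    """Number of box-by-box pixel blocks that contain at least one set pixel."""
    n = binary.shape[0]
    # View the image as an (n/box) x (n/box) grid of blocks of size box x box.
    blocks = binary.reshape(n // box, box, n // box, box)
    return int(blocks.any(axis=(1, 3)).sum())

def box_counting_dimension(binary):
    """Estimate the box counting dimension of a square binary image."""
    binary = np.asarray(binary, dtype=bool)
    n = binary.shape[0]                      # assumed to be a power of two
    sizes = [2 ** k for k in range(int(np.log2(n)) + 1)]   # 1, 2, ..., n pixels
    eps = np.array(sizes, dtype=float) / n   # image side taken as unity
    counts = np.array([box_count(binary, s) for s in sizes], dtype=float)
    # Eq. (8): F.D. is minus the slope of log N(eps) versus log eps.
    slope, _ = np.polyfit(np.log(eps), np.log(counts), 1)
    return float(-slope)
```

On a completely filled 64×64 image this estimate is 2, and on a single straight row of pixels it is 1, which matches the integer dimensions of a square and a line segment quoted above.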
(c) The Hausdorff Metric
Fractals are defined over the Hausdorff metric space [12], in which there is a notion of distance between two fractal sets.
A metric is a function which measures distance on a space. The standard Euclidean distance between $x$ and $y$ in $R^n$ is denoted
as $d_E(x, y)$. The Hausdorff metric is defined below (see Fig. 1).

Fig. 1 – Distance between a point x and element (set) B
Fig. 2 – Distance between two sets A and B

If $x \in R^n$, the “distance” from $x$ to $B$ is
$$ d(x, B) = \min_{b \in B} d_E(x, b) \tag{9} $$
The “distance” from $A$ to $B$ is
$$ d(A, B) = \max_{x \in A} d(x, B) \tag{10} $$
Note that $d$ is not a metric, since it is not symmetric, i.e. in general $d(A, B) \neq d(B, A)$ (see Fig. 2). The Hausdorff distance
$h(A, B)$ between $A$ and $B$ is then given by
$$ h(A, B) = \max\{ d(A, B), d(B, A) \} \tag{11} $$
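For finite point sets, Eqs. (9)-(11) translate almost literally into NumPy. The sketch below is illustrative only; the function names are hypothetical and the sets are assumed to be given as arrays of point coordinates, one point per row.

```python
import numpy as np

def directed_distance(A, B):
    """d(A, B) of Eq. (10): the largest distance from a point of A to the set B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Pairwise Euclidean distances d_E(x, b), shape (len(A), len(B)).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Eq. (9) takes the minimum over b; Eq. (10) takes the maximum over x.
    return float(d.min(axis=1).max())

def hausdorff_distance(A, B):
    """h(A, B) of Eq. (11): the symmetric maximum of the two directed distances."""
    return max(directed_distance(A, B), directed_distance(B, A))
```

Taking A = {(0, 0)} and B = {(0, 0), (3, 4)} shows the asymmetry: d(A, B) is 0 because the only point of A also lies in B, whereas d(B, A) is 5, so h(A, B) is 5.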

(d) Structural Similarity Index
Wang et al. [13] proposed an algorithm for image quality assessment. They developed a parameter called the structural similarity
index (SSIM), by which one can comment on the structural similarity between two images locally. The similarity measure between
two images has the form
$$ SSIM(x, y) = [l(x, y)]^{\alpha} \, [c(x, y)]^{\beta} \, [s(x, y)]^{\gamma} \tag{12} $$
where $l(x, y)$, $c(x, y)$ and $s(x, y)$ are the luminance, contrast and structure comparisons of the images and $\alpha > 0$, $\beta > 0$, $\gamma > 0$ are
parameters used to adjust the relative importance of the three components. Given two local image patches $x$ and $y$ of the two
images, respectively, the luminance, contrast and structural similarities between them are evaluated as
$$ l(x, y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{13} $$
$$ c(x, y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{14} $$
$$ s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{15} $$
where $C_1$, $C_2$ and $C_3$ are small stabilizing constants and $\mu_x$, $\sigma_x$ and $\sigma_{xy}$ represent the mean, standard deviation and
cross-correlation evaluated over a local window, respectively.
The simplified expression with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$ has the form
$$ SSIM(x, y) = \frac{(2 \mu_x \mu_y + C_1)(2 \sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{16} $$
The quality map so produced exhibits the local isotropic properties of the images under comparison, with values ranging
between 0 (indicating dissimilarity) and 1 (indicating similarity). The overall quality measure is obtained by calculating the mean
SSIM (MSSIM) index. In the present study we have used an 8×8 window with $C_1 = C_2 = 0.01$.
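The simplified index of Eq. (16) can be sketched as follows, using an 8×8 window and $C_1 = C_2 = 0.01$ as stated above. Several details are assumptions made for illustration rather than the settings of this study: the window is slid one pixel at a time, all pixels in a window are weighted equally (Wang et al. [13] use a Gaussian-weighted window), and the local variances and covariance use the 1/N normalization. The function names `ssim_map` and `mean_ssim` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ssim_map(x, y, win=8, c1=0.01, c2=0.01):
    """Local SSIM map of Eq. (16) over all win x win windows of two images."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # All win x win patches; shape (H - win + 1, W - win + 1, win, win).
    px = sliding_window_view(x, (win, win))
    py = sliding_window_view(y, (win, win))
    mu_x = px.mean(axis=(2, 3))                          # local means
    mu_y = py.mean(axis=(2, 3))
    var_x = (px * px).mean(axis=(2, 3)) - mu_x ** 2      # local variances
    var_y = (py * py).mean(axis=(2, 3)) - mu_y ** 2
    cov_xy = (px * py).mean(axis=(2, 3)) - mu_x * mu_y   # local covariance
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def mean_ssim(x, y, **kwargs):
    """Mean SSIM (MSSIM): the average of the local SSIM map."""
    return float(ssim_map(x, y, **kwargs).mean())
```

Comparing an image with itself gives a map that is 1 everywhere, while two unrelated images give a much lower mean; the map itself is what shows where the two quantities agree locally.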

Fig. 3 – SSIM maps when paired with RAD: (a) RAD-POP, (b) RAD-CALLS, (c) RAD-POI, (d) RAD-ROADS, (e) RAD-TAX

Fig. 4 – SSIM maps when paired with POI: (a) POI-POP, (b) POI-RAD, (c) POI-CALLS, (d) POI-ROADS, (e) POI-TAX

Results and Discussion:
The spatial correlation coefficient, Hausdorff distance and mean SSIM values for different pairs of data are calculated and
presented in Tables II, III and IV respectively. All these results are normalized to the range 0 to 1. For the spatial correlation
coefficient and mean SSIM, a higher value indicates similarity, whereas for the Hausdorff distance a lower value indicates
similarity. In each table, the 3 most similar scores are marked in red and the 3 most dissimilar scores are marked in blue. Table
V gives the fractal dimension of the different quantities.
The spatial correlation coefficient and Hausdorff distance help in the global evaluation of the quantities. Tables II and III
indicate that both these parameters produce nearly identical results. In both cases, quantities are most correlated when paired
with RAD and least correlated when paired with POI. Quantities like POP, CALLS and ROADS also exhibit a good amount of
correlation when paired with other quantities.
Structural similarity helps in a global as well as a local evaluation of the quantities. The mean SSIM gives the global trend
whereas the SSIM map gives the local trend of the quantities under consideration. The mean SSIM values in Table IV are found to
be very close to unity. This is an indication that all these quantities are highly structurally similar in a global sense. To analyze
the local trend one needs to compare the SSIM maps of different pairs. Fig. 3 and Fig. 4 show SSIM maps for different pairs when
paired with RAD and POI respectively. Fig. 3 shows that the SSIM maps obtained when pairing with RAD produce a similar image
pattern. This does not hold true for the SSIM maps obtained when pairing with POI, as in Fig. 4: the patterns generated in this
case are quite different for different pairs. This indicates that RAD has good correlation in a local sense, and POI has less
correlation in a local sense, with the other quantities under consideration. A similar trend is also observed for quantities when
paired with POP, CALLS and ROADS. These results are again similar to the results for the spatial correlation coefficient and
Hausdorff distance.
Fractal dimension is an indication of self-similarity from a global to a local sense and vice versa. According to Table V, TAX
exhibits the highest self-similarity, followed by RAD and ROADS, whereas POI exhibits the least self-similarity.
The mean and the standard deviation of the fractal dimension of the six quantities are 1.55 and 0.22 respectively. As a result,
the quantities with fractal dimension in the range 1.33 to 1.77 will be correlated, as opposed to those outside this range.
Accordingly, POP, RAD, CALLS and ROADS form a set of quantities which are correlated, and POI and TAX form the set of outliers.
This is again similar to the earlier results.
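The grouping described above can be checked with a short calculation on the fractal dimensions reported in Table V. This is a sketch for illustration: the variable names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, chosen because it reproduces the reported value of 0.22.

```python
from statistics import mean, stdev

# Fractal dimensions of the six quantities, as reported in Table V.
fd = {"POP": 1.50, "RAD": 1.67, "CALLS": 1.50,
      "POI": 1.18, "ROADS": 1.63, "TAX": 1.82}

values = list(fd.values())
m = mean(values)    # about 1.55
s = stdev(values)   # sample standard deviation, about 0.22

# Quantities within one standard deviation of the mean form the correlated set.
correlated = [name for name, v in fd.items() if m - s <= v <= m + s]
outliers = [name for name in fd if name not in correlated]

print(round(m, 2), round(s, 2))   # 1.55 0.22
print(correlated)                 # ['POP', 'RAD', 'CALLS', 'ROADS']
print(outliers)                   # ['POI', 'TAX']
```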
There is another interpretation of fractal dimension: it describes space occupancy. If the fractal dimension of a 2-D function
is 2, the function occupies the whole 2-D space. If the fractal dimension is less than 2, the function occupies somewhat less
space. For comparison, the fractal dimension of the Delhi map generated from the latitude-longitude combinations of the
measurements is also calculated; it is found to be 1.84 and is given in the last row of Table V. The fractal dimension of TAX is
1.82, i.e. tax is collected from almost all of the Delhi space, which has fractal dimension 1.84. For POI, the fractal dimension is
1.18, indicating that POIs do not follow a uniform distribution over the space of Delhi and are distributed unevenly in this
space. This is the extra information that fractal dimension brings to the analysis and that is not captured by the other
parameters.

Table II
Spatial Correlation Coefficient
         POP     RAD     CALLS   POI     ROADS   TAX
POP      1       0.8     0.75    0.58    0.74    0.58
RAD      0.8     1       0.77    0.56    0.85    0.77
CALLS    0.75    0.77    1       0.66    0.75    0.6
POI      0.58    0.56    0.66    1       0.62    0.39
ROADS    0.74    0.85    0.75    0.62    1       0.7
TAX      0.58    0.77    0.6     0.39    0.7     1

Table III
Hausdorff Distance
         POP     RAD     CALLS   POI     ROADS   TAX
POP      0       0.44    0.52    0.92    0.53    0.55
RAD      0.44    0       0.49    1       0.35    0.39
CALLS    0.52    0.49    0       0.87    0.51    0.56
POI      0.92    1       0.87    0       0.96    1
ROADS    0.53    0.35    0.51    0.96    0       0.44
TAX      0.55    0.39    0.56    1       0.44    0

Table IV
Mean SSIM
         POP     RAD     CALLS   POI     ROADS   TAX
POP      1       0.9925  0.9987  0.9976  0.9981  0.9911
RAD      0.9925  1       0.9909  0.9863  0.9948  0.9953
CALLS    0.9987  0.9909  1       0.9987  0.9980  0.9905
POI      0.9976  0.9863  0.9987  1       0.9964  0.9869
ROADS    0.9981  0.9948  0.9980  0.9964  1       0.9939
TAX      0.9911  0.9953  0.9905  0.9869  0.9939  1

Table V
Fractal Dimension
Quantity     F.D.
POP          1.50
RAD          1.67
CALLS        1.50
POI          1.18
ROADS        1.63
TAX          1.82
Delhi Map    1.84

Conclusion:
In this work, we took the first steps toward analyzing highly granular geo-spatial data using fractal analysis techniques. Since
the data is available at the 1 sq. km level, we are able to generate much deeper insights than is possible with coarser data at the
state, district or even sub-district level. Another key feature of the work is the integration of data from multiple sources, which
helps to create a more complete and dynamic picture of the evolving socio-economic processes. The geo-spatial analytical
capabilities of ESRI’s ArcGIS platform were critical in this work, from both a data extraction/processing and a visualization
perspective. Currently we are working on extending the analysis to predict how the given variables will evolve in the future over
various timeframes. This information would be critical for making key strategic and tactical decisions for both businesses and
government entities.

Acknowledgment:
The authors would like to thank Indicus Analytics, New Delhi, India (www.indicus.net) for their very generous assistance in
providing the datasets for this work.

References:
1. P. A. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind, Geographic Information Systems and Science, Wiley,
2005.
2. L. De Cola, “Fractal analysis of a classified Landsat scene,” Photogrammetric Engineering and Remote Sensing, vol. 55, no.
5, pp. 601–612, 1989.
3. P. Frankhauser, “The fractal approach. A new tool for the spatial analysis of urban agglomerations,” Population, vol. 10,
no. 1, pp. 205–240, 1998.
4. M. L. De Keersmaecker, P. Frankhauser, and I. Thomas, “Using fractal dimensions for characterizing intra-urban
diversity: The example of Brussels,” Geographical Analysis, vol. 35, no. 4, pp. 310–328, 2003.
5. S. W. Myint, “Fractal approaches in texture analysis and classification of remotely sensed data: Comparisons with spatial
autocorrelation techniques and simple descriptive statistics,” International Journal of Remote Sensing, vol. 24, no. 9, pp.
1925–1947, 2003.
6. M. R. T. Dale, P. Dixon, M.-J. Fortin, P. Legendre, D. E. Myers, and M. S. Rosenberg, “Conceptual and mathematical
relationships among methods for spatial analysis,” Ecography, vol. 25, no. 5, pp. 558–577, Oct 2002.
7. D. Shepard, “A two-dimensional interpolation function for irregularly spaced data,” in Proceedings of the 1968 23rd
ACM national conference, ser. ACM ’68, New York, USA, pp. 517–524, 1968.
8. M. A. Azpurua and K. D. Ramos, “A comparison of spatial interpolation methods for estimation of average
electromagnetic field magnitude,” Progress In Electromagnetics Research M, vol. 14, pp. 135–145, 2010.
9. S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way, 3rd ed. Burlington, MA, USA: Academic Press, 2009, ch.
Wavelet Zoom, pp. 242–259.
10. R. Lopes and N. Betrouni, “Fractal and multifractal analysis: A review,” Medical Image Analysis, vol. 13, no. 4, pp. 634–
649, 2009.
11. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man and
Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
12. M. Barnsley, Fractals Everywhere, 2nd ed. Boston, MA, USA: Academic Press, 1993, ch. Metric Spaces, Equivalent
Spaces: Classification of Subsets, and the Space of Fractals, pp. 5–41.
13. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,”
Image Processing, IEEE Transactions on, vol. 13, no. 4, pp. 600–612, 2004.
