3. What is Meteorology and Oceanography?
◦ study of spatial and temporal variations of atmospheric, oceanographic and land parameters over long time periods
◦ helps predict disasters, which helps prevent loss of life and property
What is data mining?
◦ process of extracting implicit, previously unknown and potentially useful information from huge amounts of data
4. Technique | Application
Anomaly detection | Detection of land cover change and of outlier values of precipitation
Association rule mining | Finding associations between oceanographic parameters and cyclone intensification
Pattern mining | Understanding of natural events. For example, eddies sustain energy for weeks or months and therefore can be manifested as a connected group of gradually increasing or decreasing time series
Classification | Detection of water fraction per flood pixel
Regression | Detection of forest cover per pixel
5. Swirls of ocean currents
◦ Play a significant role in the transport of water, heat, salt, and nutrients
[Figure: the green swirl is an ocean eddy]
[Figure: gradually decreasing segments of time series enclosed between the red and green lines are signatures of an eddy]
6. This is challenging for the following reasons:
◦ Not concrete objects: Spatio-temporal phenomena are not concrete objects but patterns that evolve over space and time, whereas in traditional data mining objects are concrete, i.e. they are either present or absent.
Transactions: an item is either present or absent (0 or 1)
Hurricanes: continuous gradual evolution; they do not simply appear and disappear
◦ Uncertainty: Measurements are biased, and some values may be missing, for example because of cloud cover.
◦ Diversity: Data are heterogeneous in space and time, as they may come from different sources at different spatial and temporal resolutions
◦ Variability: Values captured for the same location at short time intervals may vary due to local climatic variations
7. The data retrieved from the remote sensing satellites is in the form
of data products having different data formats.
The standard data format for most of the data products is HDF
format. Some other formats are NetCDF, KML etc.
One data product contains data related to one parameter.
The authenticated users can download the Indian satellite data from
mosdac.gov.in website of ISRO.
MOSDAC disseminates data for around 20 parameters. Some of
these are:
o Normalized Difference Vegetation Index (NDVI)
o Land surface temperature (LST)
o Aerosol Optical Depth (AOD)
o Cloud Liquid Water (CLW)
o Mean Sea Surface (MSS)
8. Container for storing a variety of scientific data
Composed of two primary types of objects :
◦ Groups :
grouping structure containing 1 or more HDF objects together
with supporting metadata
◦ Datasets :
Multidimensional array of data elements together with
supporting metadata
12. Anomaly detection
◦ Land cover change detection
◦ Outlier precipitation detection
◦ Outlier time interval detection
Detection of water fraction per flood pixel
Detection of forest cover per pixel
13. Aim:
◦ To find those locations which undergo significant and sudden change
during a particular time period.
◦ The time at which the change occurs is also determined.
Importance:
◦ Helps in mapping of damages following a natural disaster such as fire,
droughts, floods etc.
14. Land cover change detection methods
◦ Bitemporal methods
◦ Time series data mining techniques
Legend: red marks the techniques focused on
15. Land cover change detection methods
◦ Bitemporal methods
Image differencing
Image ratioing
Principal component analysis
Change vector analysis
◦ Time series data mining techniques
16. Land cover change detection methods
◦ Bitemporal methods
◦ Time series data mining techniques
Predictive model based: Yearly Delta algorithm, Variability Distribution algorithm, Vegetation Independent Variability Distribution algorithm
Segmentation based: top-down approach, bottom-up approach, recursive merging algorithm
17. Bitemporal methods | Time series data mining methods
Two time instants are compared | Vegetation time series is analyzed at each location and changes in the time series are identified
Do not provide information about the time of change | Provide information about the time of change
Lower computational complexity | Higher computational complexity, as a long time series has to be analyzed

Segmentation based approach | Predictive model based approach
Time series is partitioned into homogeneous segments; boundaries between segments may be change points | A model is constructed for a portion of the time series and used to predict future time points; time points that differ sufficiently from the prediction are considered change points
19. Time series data mining methods
Segmentation based
◦ Recursive merging algorithm
Predictive model based
◦ Yearly Delta algorithm
◦ Variability distribution algorithm
◦ Vegetation independent variability distribution algorithm
20. Input : Monthly composited EVI (Enhanced Vegetation Index) dataset for
the state of California for years 2000-2006.
Output : Detection of land cover changes
◦ Forest fires
◦ Conversion to farming
◦ Construction or logging
21. Algorithm : The pixel time series is analyzed as follows:
1. Let {b1,b2,…., bn} correspond to list of annual EVI sum which is the sum of
vegetation index value of all the months.
2. Two consecutive segments with most similar annual EVI sums are merged
• Suppose b1 and b2 are most similar EVI sums, then at the end of this
step, list will be {(b1+b2)/2,b3,…., bn} having one less element
• Merge cost s1= dist {b1,b2}
3. Step 2 is applied recursively until list contains one element
4. List of merge costs will be s1,s2,.......,sn-1.
5. Change score for a location or pixel is the ratio of the largest to the smallest merge cost:
$\text{change score} = \dfrac{\max_{1 \le i \le n-1} s_i}{\min_{1 \le i \le n-1} s_i}$
6. Pixels are ranked on the basis of change score, and the top-ranked pixels are considered changes.
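A minimal Python sketch of the recursive merging steps for a single pixel, under stated assumptions: `dist` is taken to be the absolute difference of two annual EVI sums, a merged segment is represented by the simple average of the pair (as in step 2), and the change score is the largest merge cost divided by the smallest. The function name and the sample values are illustrative and do not come from the original study.

```python
import math


def recursive_merge_change_score(annual_evi):
    """Return (change_score, merge_costs) for one pixel's annual EVI sums."""
    segments = [float(v) for v in annual_evi]
    if len(segments) < 3:
        raise ValueError("need at least three annual values")
    merge_costs = []
    while len(segments) > 1:
        # Cost of merging each pair of consecutive segments (assumed: absolute difference).
        costs = [abs(segments[i] - segments[i + 1]) for i in range(len(segments) - 1)]
        # Merge the most similar pair and record its cost.
        i = min(range(len(costs)), key=costs.__getitem__)
        merge_costs.append(costs[i])
        segments[i:i + 2] = [(segments[i] + segments[i + 1]) / 2]
    smallest = min(merge_costs)
    if smallest == 0:
        return math.inf, merge_costs
    return max(merge_costs) / smallest, merge_costs


# Hypothetical pixels: one stable, one with a sharp drop in the fourth year.
stable_score, stable_costs = recursive_merge_change_score([100, 104, 102, 103, 101])
burned_score, burned_costs = recursive_merge_change_score([100, 104, 102, 40, 44])
```

A series of n annual values always yields n-1 merge costs, and the pixel with the abrupt drop receives a much larger score than the stable one.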
23. Change score is calculated in such a way so as to take into account
the type of vegetation
◦ very small change can be considered as change point for stable
forests
◦ large change may not be change point for high variability regions
such as grasslands
Helps in reducing the detection of false positives
Limitations:
◦ The minimum cost of merging is taken as the variability value due to local climatic changes.
◦ However, the minimum cost may correspond to a very rare event that was captured by chance
24. Time series data mining methods
Segmentation based
◦ Recursive merging algorithm
Predictive model based
◦ Yearly Delta algorithm
◦ Variability distribution algorithm
◦ Vegetation independent variability distribution algorithm
25. Input: MODIS EVI data for California and Yukon
◦ Data for California is at 250m spatial resolution for years 2006-
2008.
◦ Data for Yukon is at 1km spatial resolution for years 2004-2008.
◦ Time series for each pixel is analyzed independently
Output:
◦ Land cover change locations (pixels)
◦ Time at which change occurred
Validation: High quality data for fires generated from independent
source is used for validation
26. Algorithm:
◦ Previous year is considered as a model
◦ Change score is assigned to each time step as the difference between the mean annual EVI of the current year and the previous year:
$\text{change score}_{\text{current year}} = \text{annual EVI}_{\text{current year}} - \text{annual EVI}_{\text{previous year}}$
◦ Maximum change score across all the time steps is considered the YD score for a location:
$\text{YD score} = \max_{i=1}^{n-1}(\text{change score}_i)$
◦ Top ranked pixels according to YD score are called change points.
Limitation:
◦ Does not make use of information about natural variation in EVI.
◦ Only the top change of a time series is considered, although a time series may undergo multiple changes during a given period
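The Yearly Delta score can be sketched in a few lines of Python. This is an illustration only: it follows the sign convention stated on the slide (current year minus previous year), and the function name and sample values are hypothetical. To score drops in EVI instead, the sign of the difference would need to be flipped.

```python
def yearly_delta(annual_evi):
    """Return (yd_score, year_index) for one pixel.

    annual_evi holds one mean annual EVI value per year; year_index is the
    position of the current year in the pair that produced the largest change score.
    """
    if len(annual_evi) < 2:
        raise ValueError("need at least two years")
    # One change score per pair of consecutive years: n years give n-1 scores.
    deltas = [cur - prev for prev, cur in zip(annual_evi, annual_evi[1:])]
    best = max(range(len(deltas)), key=deltas.__getitem__)
    return deltas[best], best + 1


# Hypothetical pixel with a jump between the third and fourth years.
yd_score, year_index = yearly_delta([30, 31, 29, 45, 44])
```

Only the single largest change per pixel survives, which is exactly the second limitation listed above.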
28. Change occurs in year 2005 due to natural variations.
Although the difference in annual EVI is high, it is not very high when compared with the mean variability score.
29. Time series data mining methods
Segmentation based
◦ Recursive merging algorithm
Predictive model based
◦ Yearly Delta algorithm
◦ Variability distribution algorithm
◦ Vegetation independent variability distribution algorithm
30. Algorithm:
Each annual segment in the first k years is considered a model and the remaining k-1 segments are considered the observed values.
For each model, the mean Manhattan distance to the other k-1 years is computed to give the distribution of variability scores for that location.
A modified score called the VD score is used:
$\text{VD score} = \text{YD score} - \mu$
where μ is the mean of the distribution.
The mean is estimated using the Maximum Likelihood Estimation method
Special features:
Makes use of information about natural variation in EVI.
Any year for which annual EVI deviates significantly from the mean annual EVI for the k years should be discarded
Limitation:
Some vegetation types, such as open shrubs, have large variations in the spread of annual variability
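A rough Python sketch of the variability distribution idea, under assumptions that the slide does not spell out: each year is a list of monthly EVI values, every one of the first k years is used in turn as the model, its variability score is the mean Manhattan distance to the other k-1 years, and the VD score is the YD score minus the mean of those variability scores. How the original work scales the two quantities to make them comparable is not stated here, so the numbers are purely illustrative.

```python
from statistics import fmean


def manhattan(a, b):
    """Manhattan distance between two equally long annual segments."""
    return sum(abs(x - y) for x, y in zip(a, b))


def variability_scores(first_k_years):
    """One variability score per model year (mean distance to the other years)."""
    scores = []
    for j, model in enumerate(first_k_years):
        others = [manhattan(model, obs)
                  for i, obs in enumerate(first_k_years) if i != j]
        scores.append(fmean(others))
    return scores


def vd_score(yd, scores):
    """VD score = YD score minus the mean variability score."""
    return yd - fmean(scores)


# Hypothetical pixel: three short "years" of three values each.
years = [[1, 2, 3], [1, 2, 4], [2, 2, 3]]
scores = variability_scores(years)
vd = vd_score(5.0, scores)
```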
31. Scatter plot of mean variability against YD score for forest cover (Courtesy: Mithal et al. [6])
Plot annotations: change point; lines of constant YD score and constant VD score
As only one vegetation type, i.e. forests, is considered, YD also performs well
32. Scatter plot of mean variability against YD score for savannas (Courtesy: Mithal et al. [6])
Plot annotations: lines of constant YD score and constant VD score
Savannas consist of trees, shrubs, grasses etc.
The different vegetation types have different threshold change scores above which a change is considered an actual change.
Therefore, VD performs better than the YD algorithm
33. Scatter plot of mean variability against YD score for shrublands (Courtesy: Mithal et al. [6])
Plot annotations: lines of constant YD score and constant VD score
As open shrublands show a different spread of variability at different locations even though the vegetation type is the same, both YD and VD show a lot of false positives
34. Time series data mining methods
Segmentation based
◦ Recursive Merging Algorithm
Predictive model based
◦ Yearly Delta Algorithm
◦ Variability Distribution Algorithm
◦ Vegetation Independent Variability Distribution Algorithm
35. Algorithm:
Mean and standard deviation of the variability score distribution are estimated as maximum likelihood estimates
A new score called the VID score is calculated as follows:
$\text{VID score} = \dfrac{\text{YD score} - \mu}{\sigma}$
Salient features:
Takes into account the information about the spread of the variability score distribution and therefore reduces false alarm rates
High VID score implies lower false positive rate and vice versa.
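To illustrate why dividing by the spread helps, here is a small Python sketch of the VID score, (YD score - μ) / σ, where μ and σ are the mean and the population standard deviation of a location's variability scores. The function name and the two sample pixels are hypothetical.

```python
from statistics import fmean, pstdev


def vid_score(yd, variability):
    """VID score = (YD score - mean) / standard deviation of variability scores."""
    mu = fmean(variability)
    sigma = pstdev(variability)  # population form, which matches the MLE estimate
    if sigma == 0:
        raise ValueError("variability scores have zero spread")
    return (yd - mu) / sigma


# Two hypothetical pixels with the same YD score and the same mean variability,
# but a different spread of variability scores.
wide_spread = [2, 4, 4, 4, 5, 5, 7, 9]    # mean 5, standard deviation 2
narrow_spread = [4, 5, 5, 5, 5, 5, 5, 6]  # mean 5, standard deviation 0.5

vid_wide = vid_score(11, wide_spread)
vid_narrow = vid_score(11, narrow_spread)
```

Both pixels would receive the same VD score (11 - 5 = 6), while the VID score is far higher for the pixel whose variability is tightly concentrated.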
36. [Figure: curves of variability score against mean annual EVI for pixel 1 and pixel 2; a marked area on each curve shows where the variability score indicates change for that pixel]
Both pixels correspond to the shrub vegetation type, whose spread of variability score varies from location to location and from time to time.
37. Maximum likelihood estimation (MLE)
Every model is specified by its parameters.
MLE is a parameter estimation method that finds the parameter values of a model that best fit the data.
As fluctuations in the variability score for a particular vegetation type are normally distributed at a location, the parameters are calculated for the normal distribution.
The mean and standard deviation are the parameters of the normal distribution.
Calculation of mean and standard deviation using MLE
◦ Let f(y|w) denote the probability density function (PDF) that specifies the probability of observing data vector y given the parameter w.
◦ If the individual observations yi are independent of each other, then the PDF for data y=(y1,…,yn) given the vector w can be expressed as the product of the individual PDFs:
$f(y = (y_1, \ldots, y_n) \mid w) = f_1(y_1 \mid w)\, f_2(y_2 \mid w) \cdots f_n(y_n \mid w)$
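38. The PDF for one observation is
$P(x_i \mid \mu, \sigma) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}$
The PDF for multiple independent observations is
$f(x_1, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^{n} \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} = (2\pi\sigma^2)^{-n/2}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2}$
Taking log on both sides
$\ln(f) = -\dfrac{n}{2}\ln(2\pi) - n\ln(\sigma) - \dfrac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$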
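39. For the model to best fit the data, the value of the parameter vector should maximize the PDF, or equivalently its logarithm ln(f).
The partial derivative of ln(f) with respect to each component parameter of the vector should be zero:
$\dfrac{\partial \ln(f)}{\partial \mu} = \dfrac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0$
$\dfrac{\partial \ln(f)}{\partial \sigma} = -\dfrac{n}{\sigma} + \dfrac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \mu)^2 = 0$
Solving these gives $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$.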
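As a numerical check of the MLE result for a normal distribution, the Python sketch below computes the closed-form estimates (sample mean, and standard deviation with divisor n) and compares the log-likelihood at those values with slightly perturbed parameters. The sample scores and function names are made up for illustration.

```python
import math


def mle_normal(xs):
    """Closed-form maximum likelihood estimates (mu, sigma) for a normal model."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)  # divisor n, not n-1
    return mu, sigma


def log_likelihood(xs, mu, sigma):
    """Log of the joint normal PDF of independent observations xs."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi)
            - n * math.log(sigma)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical variability scores
mu_hat, sigma_hat = mle_normal(scores)
best = log_likelihood(scores, mu_hat, sigma_hat)

# Any other parameter choice should give a lower (or equal) log-likelihood.
perturbed = [log_likelihood(scores, mu_hat + dm, sigma_hat + ds)
             for dm in (-0.1, 0.0, 0.1) for ds in (-0.1, 0.0, 0.1)]
```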
40. Yearly Delta algorithm | Variability Distribution method | Vegetation Independent Variability Distribution algorithm
Does not consider the type of vegetation. The same YD score may be an actual change for forests but not for savannas or shrublands. | Considers the type of vegetation. The same VD score may be an actual change for regions such as savannas (less variation in variability value) but not for shrublands (high variation in variability value). | Considers the type of vegetation. The VID score works for all vegetation types.
Does not consider the average change score value (μ) or the degree of variability in value (σ) | Considers only the average change score value (μ) | Considers both the average change score value (μ) and the degree of variability in value (σ)
YD score = max over i = 1 to n-1 of (annual EVI current year - annual EVI previous year) | VD score = YD score - μ | VID score = (YD score - μ) / σ
41. Where TPn = true positives,
FPn = false positives,
M = total no of pixels considered
42. Green line -> YD score
Red line -> VD score
Black line -> VID score
VD and VID give better results than YD.
Reason: the graph corresponds only to the forest region. The MODIS forest map that was used to detect forest cover pixels is inaccurate and includes some shrubs and agricultural land labeled as forests.
43. Green line -> YD score
Red line -> VD score
Black line -> VID score
VID performs slightly worse than VD.
Reason: the initial few years selected to model variability may contain some noise. The mean variability for that location is then modeled as high, and changes in later years go undetected.
44. Green line -> YD score
Red line -> VD score
Black line -> VID score
Performance of VID is best.
Reason: shrubs form the dominant land cover type for California, and they show high variability in the spread of the variability score due to their higher sensitivity to climatic variations.
45. Green line -> YD score
Red line -> VD score
Black line -> VID score
Performance of YD is exceptionally poor and that of VID is exceptionally good.
Reason: the spread of the variability score varies strongly between locations whose vegetation type is shrubs.
46. Anomaly detection
◦ Land cover change detection
◦ Outlier precipitation detection
◦ Outlier time interval detection
Detection of water fraction per flood pixel
Detection of forest cover per pixel
47. Input:
◦ South American precipitation dataset in the geoscience format known as NetCDF
Variable | Value
Num Year Periods | 10
Year Range | 1995-2004
Grid Size | 2.5º×2.5º
Num Latitudes | 31
Num Longitudes | 23
Total Grids | 713
Output:
◦ The top k=5 outliers are found for every year
◦ A total of 155 outlier sequences were found over a period of 10 years
◦ Running time of the algorithm is 229 s.
48. Aim:
◦ To find and track the position of outliers with time
Method description:
◦ Top k outliers are found for every year using Exact-Grid Top-k algorithm
◦ Outliers are tracked using the OutStretch algorithm
◦ The outlier sequences generated are analyzed
How to find the outlier (Exact-Grid Top-k algorithm)
◦ Concept of discrepancy is used
◦ Discrepancy value is assigned to each rectangular region using
Kulldorff’s scan statistic.
◦ Top-k outliers are selected for further processing as it is necessary in
order to track the outliers
49. How to calculate the discrepancy?
◦ Two parameters are required:
a measurement m (number of incidences of an event)
a baseline b (total population at risk)
◦ The measurement M and baseline B values for the whole dataset (U) are calculated as
$M = \sum_{p \in U} m(p), \qquad B = \sum_{p \in U} b(p)$
◦ The measurement and baseline values for the region (R), as fractions of the totals, are calculated as
$m_R = \dfrac{\sum_{p \in R} m(p)}{M}, \qquad b_R = \dfrac{\sum_{p \in R} b(p)}{B}$
◦ The discrepancy score of the shaded area is calculated by using the given formula:
$d(m_R, b_R) = m_R \log\dfrac{m_R}{b_R} + (1 - m_R)\log\dfrac{1 - m_R}{1 - b_R}$
◦ For the above figure, M=6, B=16, mR= 4/6, and bR = 4/16
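The discrepancy of a region can be computed directly from per-cell measurement and baseline values. The Python sketch below assumes the Kulldorff form d = mR·log(mR/bR) + (1 - mR)·log((1 - mR)/(1 - bR)) with natural logarithms, and rebuilds the slide's example on a hypothetical 4×4 grid with one unit of baseline per cell and six events, four of which fall inside a 2×2 region. The grid layout and function names are illustrative.

```python
import math


def region_fractions(measure, baseline, region):
    """Return (m_R, b_R): the region's share of total measurement and baseline.

    measure and baseline map a cell to its value; region is a set of cells.
    """
    total_m = sum(measure.values())
    total_b = sum(baseline.values())
    m_r = sum(measure[cell] for cell in region) / total_m
    b_r = sum(baseline[cell] for cell in region) / total_b
    return m_r, b_r


def kulldorff_discrepancy(m_r, b_r):
    """Kulldorff scan statistic for a region (natural log is an assumption)."""
    if not (0 < m_r < 1 and 0 < b_r < 1):
        raise ValueError("fractions must lie strictly between 0 and 1")
    return (m_r * math.log(m_r / b_r)
            + (1 - m_r) * math.log((1 - m_r) / (1 - b_r)))


cells = [(row, col) for row in range(4) for col in range(4)]
baseline = {cell: 1 for cell in cells}                      # B = 16
measure = {cell: 0 for cell in cells}
for cell in [(0, 0), (0, 1), (1, 0), (1, 1), (3, 3), (2, 3)]:
    measure[cell] = 1                                       # M = 6
region = {(0, 0), (0, 1), (1, 0), (1, 1)}                   # 2x2 region

m_r, b_r = region_fractions(measure, baseline, region)
d = kulldorff_discrepancy(m_r, b_r)
```

For these values the score is roughly 0.38; a region whose share of events equals its share of the baseline would score 0.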
50. Outstretch algorithm
◦ The outlier region of the previous year is stretched on each side
◦ Each outlier in the current year is examined to see whether it lies in the region consisting of the stretched region and the outlier region of the previous year
◦ If it does, it is added to the child list of the previous year's outlier
RecurseNode algorithm:
◦ All the sequences starting at a root node of the trees and ending at a leaf node are fetched.
[Figure: outlier region of the previous year surrounded by the stretched region]
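A compact Python sketch of the tracking idea, with several assumptions made for illustration: outlier regions are axis-aligned rectangles `(x_min, y_min, x_max, y_max)`, the stretch is a fixed amount `r` on every side, and a current-year outlier "lies in" the stretched region when it is fully contained in it. The function names are hypothetical, and the second function only mimics what RecurseNode is described as doing (collecting root-to-leaf sequences).

```python
def stretch(rect, r):
    """Grow a rectangle by r on every side."""
    x1, y1, x2, y2 = rect
    return (x1 - r, y1 - r, x2 + r, y2 + r)


def contains(outer, inner):
    """True if rectangle inner lies completely inside rectangle outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def outstretch(outliers_by_year, r):
    """Map every node (year, index) to its children in the following year."""
    children = {(y, i): []
                for y, rects in enumerate(outliers_by_year)
                for i in range(len(rects))}
    for y in range(len(outliers_by_year) - 1):
        for i, previous in enumerate(outliers_by_year[y]):
            region = stretch(previous, r)
            for j, current in enumerate(outliers_by_year[y + 1]):
                if contains(region, current):
                    children[(y, i)].append((y + 1, j))
    return children


def root_to_leaf_sequences(children):
    """Collect every path from a root (node without parent) to a leaf."""
    with_parent = {kid for kids in children.values() for kid in kids}
    sequences = []

    def recurse(node, path):
        path = path + [node]
        if not children[node]:
            sequences.append(path)
        for kid in children[node]:
            recurse(kid, path)

    for node in sorted(children):
        if node not in with_parent:
            recurse(node, [])
    return sequences


# Three hypothetical years of outlier rectangles.
years = [[(0, 0, 2, 2)],
         [(1, 1, 3, 3), (10, 10, 12, 12)],
         [(2, 2, 4, 4)]]
forest = outstretch(years, r=1)
tracks = root_to_leaf_sequences(forest)
```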
51. Forest built by applying the Outstretch algorithm recursively
(1,1), (2,2) and (3,2) correspond to one sequence followed by an outlier
52. Anomaly detection
◦ Land cover change detection
◦ Outlier precipitation detection
◦ Outlier time interval detection
Detection of water fraction per flood pixel
Detection of forest cover per pixel
53. Input:
◦ Sea surface temperature (SST) data of Equatorial Pacific Ocean.
◦ The data consisted of measurements of sea surface temperature
for 44 sensors in Pacific Ocean
◦ Each sensor had a time series of 1440 data points.
Output:
◦ Time intervals where spatial neighborhood has shown abnormal
behavior.
54. Terms:
Spatial distance (sd): Distance between 2 locations based on the distance between their spatial coordinates
$sd = \sqrt{(s_{px} - s_{qx})^2 + (s_{py} - s_{qy})^2}$
Measurement distance (md): Distance between 2 points based on the difference between the features of the 2 points
$md = \sqrt{\sum_{m} (s_{pa_m} - s_{qa_m})^2}$
where p and q are 2 locations
Spatial neighborhood: Cluster of locations such that the spatial distance (sd) and measurement distance (md) between every 2 locations are less than the respective threshold values.
Sum of squared error (SSE): Measure of the degree of abnormality of the interval
$SSE = \sum_{bn=1}^{BN} \operatorname{dist}(val_{bn}, \mu_{int})^2$
where val_bn is each temporal reading in the base interval and μ_int is the mean of the temporal readings
55. Aim:
◦ To find time intervals where spatial neighborhoods are likely to show
abnormal behavior.
Algorithm:
◦ Time series is first divided into a set of base equal size temporal
intervals
◦ Spatial neighborhoods are found for every base interval
◦ Each spatial node in every base interval is analyzed and assigned a binary classification of 1 if it shows abnormal behavior and 0 otherwise
◦ The count of spatial nodes with a binary error classification of 1 is found for every base interval; this count is called the vote count.
◦ A threshold mv is then applied: intervals for which votes > mv are classified as 1 and the others as 0.
56. ◦ Consecutive base intervals which have same binary classification are
merged to form the larger intervals.
◦ Mean value for each edge is calculated for every interval.
◦ Spatial neighborhoods are calculated for each interval using the
mean value of edge.
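The voting and merging steps can be sketched in Python as follows. The rule that decides whether a node is abnormal in a base interval is not given on the slides; this sketch assumes a node is abnormal when the sum of squared errors of its readings around their interval mean exceeds a threshold. The names `sse_threshold` and `min_votes` (standing in for mv), and the sample series, are illustrative.

```python
from statistics import fmean


def sse(readings):
    """Sum of squared deviations of the readings from their mean."""
    mu = fmean(readings)
    return sum((value - mu) ** 2 for value in readings)


def agglomerate(series_by_node, base_len, sse_threshold, min_votes):
    """Return (labels, merged) for equally sized base intervals.

    labels[k] is 1 when more than min_votes nodes look abnormal in base
    interval k; merged lists (first_interval, last_interval, label) runs.
    """
    length = min(len(series) for series in series_by_node.values())
    labels = []
    for start in range(0, length - base_len + 1, base_len):
        votes = sum(1 for series in series_by_node.values()
                    if sse(series[start:start + base_len]) > sse_threshold)
        labels.append(1 if votes > min_votes else 0)
    merged = []
    for k, label in enumerate(labels):
        if merged and merged[-1][2] == label:
            merged[-1] = (merged[-1][0], k, label)  # extend the current run
        else:
            merged.append((k, k, label))
    return labels, merged


# Three hypothetical sensors; two of them fluctuate in the middle intervals.
sensors = {
    "a": [1] * 4 + [1, 9, 1, 9] * 2 + [1] * 4,
    "b": [2] * 4 + [2, 8, 2, 8] * 2 + [2] * 4,
    "c": [3] * 16,
}
labels, merged = agglomerate(sensors, base_len=4, sse_threshold=1.0, min_votes=1)
```

The two middle base intervals receive the same label and are merged into one longer interval, which is the agglomeration step described above.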
57. Agglomerative temporal intervals for SST data
Location : 0ºN latitude and 110ºW longitude
Time period : 10 day period from 01/01/2004 to 01/10/2004
No of measurements: approx.1400
58. Neighborhood (a) represents cooler water
Neighborhood (b) represents warmer water
Neighborhoods (c) and (d) represent water of moderate temperature
• Edge clustering is validated by a satellite image of SST.
• Light regions represent cooler temperatures
• Dark regions represent warmer temperatures.
59. Neighborhood quality for each interval
SSE of neighborhood (a) shows an interesting pattern between intervals 16 and 19: SSE goes from high to low and then back from low to high
60. Neighborhood (a) has more spread during the 16th interval than during the 17th interval
61. Input :
◦ Land cover type
◦ 8-day composite surface reflectance for the VIS band (CH1) and the NIR band (CH2)
◦ CH2-CH1
◦ CH2/CH1
◦ NDVI dataset
◦ Data before flooding in Mississippi basin is used as training dataset
◦ Data after flooding in Mississippi basin that occurred on June 17-19,
2008 is used for testing
Output:
◦ The best attribute (R) for classification i.e. CH2-CH1 is found.
◦ The threshold values of the best attribute (R) for pure water and pure
land are found.
Validation data:
◦ 30m spatial resolution Landsat TM imagery for validation purposes
62. Aim:
◦ To find the fraction of water in flood pixels which are usually water
mixed with land cover features for MODIS dataset which has
coarse resolution
Method description:
◦ Decision tree approach is used to find
the best parameter (predictor) to differentiate between land and water
the threshold values of the predictor R for pure water (Rwater) and pure land (Rland)
◦ Water fraction (WF) per pixel can be found by comparing the actual value of the predictor with its value for pure water or pure land. With WF as a fraction between 0 and 1:
$R_{mix} = WF \cdot R_{water} + (1 - WF) \cdot R_{land}$
so that, expressed as a percentage,
$WF = \dfrac{R_{land} - R_{mix}}{R_{land} - R_{water}} \times 100$
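A short Python sketch of the linear mixing formula for water fraction, WF = (Rland - Rmix) / (Rland - Rwater) × 100. Using the decision-tree thresholds 9.17 and 2.91 for CH2-CH1 as the pure-land and pure-water values is an assumption made for illustration, as is clamping the result to the 0 to 100 range.

```python
def water_fraction(r_mix, r_land, r_water):
    """Water fraction of a mixed pixel in percent, from linear mixing.

    r_mix is the predictor value of the pixel; r_land and r_water are the
    predictor values for pure land and pure water.
    """
    if r_land == r_water:
        raise ValueError("pure land and pure water values must differ")
    wf = (r_land - r_mix) / (r_land - r_water) * 100.0
    # Pixels beyond either pure value are treated as fully land or fully water.
    return max(0.0, min(100.0, wf))


R_LAND = 9.17   # assumed pure-land value of CH2-CH1
R_WATER = 2.91  # assumed pure-water value of CH2-CH1

half = water_fraction(6.04, R_LAND, R_WATER)
dry = water_fraction(12.0, R_LAND, R_WATER)
flooded = water_fraction(1.0, R_LAND, R_WATER)
```

A pixel whose predictor value lies midway between the two pure values comes out at about 50 percent water.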
63. Experimental Results:
Some of the rules used for deciding threshold values are :
◦ (CH2-CH1) > 9.17 -> class Land
◦ (CH2-CH1) <= 2.91 -> class Water
Correlation between TM and MODIS water fractions is 0.97 with
bias of 4.47% and standard deviation of 4.4%.
65. Anomaly detection
◦ Land cover change detection
◦ Outlier precipitation detection
◦ Outlier time interval detection
Detection of water fraction per flood pixel
Detection of forest cover per pixel
66. Input :
◦ Land surface temperature 5-monthly composited MOD11C3
product
◦ NDVI and EVI from monthly composited MOD13C2 product
◦ Land cover type from MCD12C1 yearly product
Output:
◦ Fraction of forest cover per pixel
Validation:
◦ Forest cover information from PRODES data at 90 m resolution
available in GeoTiff format is used for validation purposes
67. Aim :
◦ To find the forest cover per pixel for MODIS dataset having coarse
resolution
◦ The data values for parameters like NDVI, EVI, land surface temperature
etc are available per pixel.
◦ Therefore, value is affected by vegetation cover of every point covered in
that pixel.
◦ Same parameter value may correspond to different fraction of forest
cover depending on vegetation type for whole area per pixel.
Algorithm:
◦ Modification of Leeuwen et al. approach.
◦ Leeuwen et al. approach gives the single logistic regression model for all
vegetation types.
◦ But, improved algorithm considers vegetation type and gives
independent logistic regression model for each vegetation type
68. Leeuwen et al. approach
Terms:
pit : Fraction of forest cover for pixel i in year t (generated from the analysis of high-resolution Landsat TM images)
Xit : Vector of MODIS observations for pixel i in year t
β : Vector of model parameters (which are estimated from a set of training data)
The vectors Xit and β each have three components:
◦ the first corresponding to a constant intercept term
◦ the second to an NDVI measurement,
◦ and the third to an LST measurement.
Model :
$\ln\left(\dfrac{p_{it}}{1 - p_{it}}\right) = \beta^{T} X_{it}$
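The logistic model ln(p / (1 - p)) = βᵀX can be inverted to predict the forest fraction of a pixel. The Python sketch below shows that inversion and how one coefficient vector per vegetation type could be looked up; all coefficient and observation values are invented for illustration and are not taken from any of the cited studies.

```python
import math


def forest_fraction(beta, x):
    """Predicted forest fraction p for observation vector x = (1, NDVI, LST)."""
    z = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the logistic function: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficient vectors (intercept, NDVI weight, LST weight),
# one per vegetation type as in the vegetation-specific variant.
models = {
    "forest": (-2.0, 6.0, -0.01),
    "cropland": (-1.0, 3.0, -0.02),
}

observation = (1.0, 0.8, 30.0)  # constant term, NDVI, LST (illustrative)
p_forest = forest_fraction(models["forest"], observation)
p_cropland = forest_fraction(models["cropland"], observation)
```

The same observation maps to different forest fractions depending on which vegetation-specific model is applied, which is the motivation for learning separate models.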
69. Learning independent regression models requires segmentation of the observation space into multiple categories.
Segmentation is done by partitioning the feature space, which is an n-dimensional space with one feature per axis.
Features are selected based on their ability to differentiate between vegetation types.
For example: forests show high inter-annual NDVI and EVI means and a low inter-annual LST mean, while their intra-annual variance of NDVI, EVI and LST is low.
Therefore, mean (μ) and variance (σ2) are selected as features.
70. Vegetation type distribution in feature space (μ, σ2) of NDVI
◦ Forests show high inter-annual mean and low intra-annual variance
◦ Farms show high intra-annual variance due to crop cycles
◦ Grasslands show high intra-annual variance and high inter-annual mean
◦ Water locations show high intra-annual variance and low inter-annual mean
71. Analysis of the partition corresponding to the forest vegetation type
Scatter plot of the residual of the baseline approach against the residual of the vegetation specific approach
The residual of the vegetation specific approach has lower magnitude than that of the baseline model
Therefore, the vegetation specific approach is better than the baseline model
72. Analysis of the partition corresponding to the cropland vegetation type
The residual of the vegetation specific model is lower in magnitude than that of the baseline model
74. Various research works related to anomaly detection and to detection of water fraction or forest cover per pixel have been discussed.
Most of these works are pixel-based and do not consider the spatial neighborhood of a pixel.
Domain knowledge is also required along with data mining techniques.
Future work should address these limitations.
76. [1] Jonathan T. Overpeck, Gerald A. Meehl, Sandrine Bony, David R. Easterling, "Climate data challenges in the 21st century," in Science, 2011.
[2] James H. Faghmous and Vipin Kumar," Spatio-temporal data mining for climate data : Advances,
Challenges and Opportunities," in Data Mining and Knowledge Discovery for Big Data, 2014.
[3] Donglian Sun, Yunyue Yu, Mitchell D. Goldberg ,"Deriving Water Fraction and Flood Maps From
MODIS Images Using Decision Tree Approach," in IEEE Journal of Selected Topics In Applied Earth
Observations And Remote Sensing , 2011.
[4] Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster," Land
Cover Change Detection : A Case Study," in Knowledge Discovery in Databases Proceedings, 2008.
[5] Elizabeth Wu, Wei Liu, Sanjay Chawla ,"Spatio-Temporal Outlier Detection in Precipitation Data,"
in Knowledge Discovery in Databases Proceedings, 2008.
[6] Hong Yeon Cho, Ji Hee Oh, Kyeong Ok Kim, and Jae Seol Shim, "Outlier Detection and missing
data filling methods for coastal water temperature data," in Journal of Coastal Research, 2013.
77. [7] C.T.Dhanya and D.Nagesh Kumar, " Data mining for evolution of association rules for droughts
and floods in India using climate inputs," in Journal of Geophysical Research, 2009.
[8] Ruixin Yang, Jiang Tang, and Donglian Sun, " Association Rule Data Mining Applications for
Atlantic Tropical Cyclone Intensity Changes," in Journal of American Meteorological Society, 2011.
[9] James H.Faghmous, Yashu Chamber, Shyam Boriah, Stefan Liess, Vipin Kumar, "A novel and
scalable spatio-temporal technique for ocean eddy monitoring," in Association for Advancement of
Artificial Intelligence, 2012.
[10] Imran Maqsood, Muhammad Riaz Khan, and Ajith Abraham, "An ensemble of neural networks for weather forecasting," in Neural Computing & Applications, 2004.
[11] Agboola A.H., Gabriel A.J., Aliyu E.O., Alese B.K., "Development of a Fuzzy Logic Based
Rainfall Prediction Model," in International Journal of Engineering and Technology, 2013.
[12] Christopher G.Healey, "On the Use of Perceptual Cues and Data Mining for Effective
Visualization of Scientific Datasets," in Proceedings Graphics Interface, 1998.
78. [13] Wenwen Li, Chaowei Yang, Donglian Sun, "Mining geophysical Parameters through Decision Tree
Analysis to Determine Correlation with Tropical Cyclone Development," in Computers & Geosciences,
2008.
[14]Pinky Saikia Dutta, and Hitesh Tahbilder, "Prediction of Rainfall using Data mining Technique over
Assam," in Indian Journal of Computer Science and Engineering (IJCSE),2014.
[15]Anuj Karpatne, Mace Blank, Michael Lau, Shyam Boriah, Karsten Steinhaeuser, Michael Steinbach
and Vipin Kumar," Importance of Vegetation Type in Forest Cover," in Intelligent Data Understanding,
2012.
[16] James H.Faghmous, Mathew Le, Muhammed Uluyol, Vipin Kumar and Snigdhansu Chatterjee, "A
parameter-free spatio-temporal pattern mining model to catalog ocean dynamics," in IEEE 13th
International Conference on Data Mining, 2013.
[17] Rie Honda and Osamu Konishi, "Temporal rule discovery for Time-Series Satellite Images and
Integration with RDB," in Principles of Data Mining and Knowledge Discovery, Lecture Notes in
Computer Science ,2001.
79. [18] Pol R. Coppin and Marvin E. Bauer, "Change Detection in Forest Ecosystems with Remote Sensing
Digital Imagery ," in Remote Sensing Reviews, 1996.
[19] Varun Mithal, Ashish Garg, Ivan Brugere, Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster, "Incorporating Natural Variation Into Time Series-Based Land Cover Change Identification," in Proceedings of the NASA Conference on Intelligent Data Understanding, 2011.
[20] Varun Mithal, Shyam Boriah, Ashish Garg, Michael Steinbach, Vipin Kumar, "Monitoring Global
Forest Cover Using Data Mining," in Journal of Association for Computing Machinery, Volume V, 2010.
[21] D. Agarwal, A. McGregor, J.M.Phillips, S.Venkatsubramanian, and Z.Zhu, "Spatial Scan Statistics:
Approximations and Performance Study," in Knowledge Discovery in Databases Proceedings, 2006.
[22] Michael P. McGuire, Vandana P. Janeja, Aryya Gangopadhyay, "Spatiotemporal Neighborhood
Discovery for Sensor Data" in Knowledge Discovery in Databases Proceedings, 2008.
[23]Thijs T. van Leeuwen, Andrew J. Frank, Yufang Jin, Padhraic Smyth, Michael L. Goulden, Guido R.
van der Werf and James T. Randerson, "Optimal use of land surface temperature data to detect changes
in tropical forest cover," in Journal of Geophysical Research: Biogeosciences, 2011.
Editor's notes
Example for concrete objects
Uncertainty due to presence of clouds or others
Explanation for the HDF file
Basic idea about these
Residual – difference between actual value of x and predicted value of x.