Amod Aggarwal 
(13535005) 
Guided by: 
Dr. Padam Kumar 
Dr. Dhaval Patel
 Introduction 
 Literature Survey 
 Conclusion 
 References
 What is Meteorology and Oceanography? 
◦ The study of spatial and temporal variations of atmospheric, 
oceanographic and land parameters over long time periods 
◦ Helps in predicting disasters, which helps prevent loss of life and 
property 
 What is data mining? 
◦ The process of extracting 
 implicit, 
 previously unknown 
 and potentially useful information from large amounts of data
Technique | Application 
Anomaly detection | Detection of land cover change; outlier values of precipitation 
Association rule mining | Finding associations between oceanographic parameters and cyclone intensification 
Pattern mining | Understanding of natural events. For example, eddies sustain energy for weeks or months and can therefore appear as a connected group of gradually increasing or decreasing time series 
Classification | Detection of water fraction per flood pixel 
Regression | Detection of forest cover per pixel
Ocean eddies are swirls of ocean currents that play a significant role in the transport of water, heat, salt, and nutrients. 
Figure: the green swirl is an ocean eddy. 
Figure: gradually decreasing segments of the time series enclosed between the red and green lines are signatures of an eddy.
 Mining spatio-temporal climate data is challenging for the following reasons: 
◦ Not concrete objects: Spatio-temporal phenomena are evolving patterns over 
space and time, whereas in traditional data mining objects are concrete, i.e. 
they are either present or absent. 
 Transactions: an item is either present or absent (0 or 1) 
 Hurricanes: continuous gradual evolution; they do not simply appear 
and disappear 
◦ Uncertainty: Measurements are biased, and some values may be missing, 
for example because of cloud cover. 
◦ Diversity: Data is heterogeneous in space and time, as it may come 
from different sources at different spatial and temporal resolutions. 
◦ Variability: Values captured for the same location a short time apart 
may vary due to local climatic variations.
 Data retrieved from remote sensing satellites comes as data products 
in different data formats. 
 The standard format for most data products is HDF (Hierarchical Data 
Format); other formats include NetCDF and KML. 
 One data product contains data related to one parameter. 
 The authenticated users can download the Indian satellite data from 
mosdac.gov.in website of ISRO. 
 MOSDAC disseminates data for around 20 parameters. Some of 
these are: 
o Normalized Difference Vegetation Index (NDVI) 
o Land surface temperature (LST) 
o Aerosol Optical Depth (AOD) 
o Cloud Liquid Water (CLW) 
o Mean Sea Surface (MSS)
 HDF is a container for storing a variety of scientific data 
 It is composed of two primary types of objects: 
◦ Groups: 
 a grouping structure containing zero or more HDF objects, together 
with supporting metadata 
◦ Datasets: 
 a multidimensional array of data elements, together with 
supporting metadata
 Literature Survey
 Anomaly detection 
◦ Land cover change detection 
◦ Outlier precipitation detection 
◦ Outlier time interval detection 
 Detection of water fraction per flood pixel 
 Detection of forest cover per pixel
Aim: 
◦ To find those locations which undergo significant and sudden change 
during a particular time period. 
◦ The time at which the change occurs is also determined. 
Importance: 
◦ Helps in mapping of damages following a natural disaster such as fire, 
droughts, floods etc.
 Land cover change detection methods 
◦ Bitemporal methods 
 Image differencing 
 Image ratioing 
 Principal component analysis 
 Change vector analysis 
◦ Time series data mining techniques 
 Predictive model based: Yearly Delta algorithm, Variability Distribution algorithm, Vegetation Independent Variability Distribution algorithm 
 Segmentation based: top-down approach, bottom-up approach (recursive merging algorithm)
Bitemporal methods | Time series data mining methods 
Two time instants are compared | The vegetation time series is analyzed at each location and changes in the time series are identified 
Do not provide information about the time of change | Provide information about the time of change 
Lower computational complexity | Higher computational complexity, as a long time series has to be analyzed 

Segmentation based approach | Predictive model based approach 
The time series is partitioned into homogeneous segments; boundaries between segments may be change points | A model is constructed from a portion of the time series and used to predict future time points; time points that differ sufficiently from the prediction are considered change points
Time series data mining methods 
 Segmentation based 
◦ Recursive merging algorithm 
 Predictive model based 
◦ Yearly Delta algorithm 
◦ Variability distribution algorithm 
◦ Vegetation independent variability distribution algorithm
 Input : Monthly composited EVI (Enhanced Vegetation Index) dataset for 
the state of California for years 2000-2006. 
 Output : Detection of land cover changes 
◦ Forest fires 
◦ Conversion to farming 
◦ Construction or logging
Algorithm : The pixel time series is analyzed as follows: 
1. Let {b1,b2,…., bn} correspond to list of annual EVI sum which is the sum of 
vegetation index value of all the months. 
2. Two consecutive segments with most similar annual EVI sums are merged 
• Suppose b1 and b2 are most similar EVI sums, then at the end of this 
step, list will be {(b1+b2)/2,b3,…., bn} having one less element 
• Merge cost s1= dist {b1,b2} 
3. Step 2 is applied recursively until list contains one element 
4. List of merge costs will be s1,s2,.......,sn-1. 
5. Change score for a location or pixel will be 
$$\text{change score} = \frac{\max_{1 \le i \le n-1} s_i}{\min_{1 \le i \le n-1} s_i}$$ 
6. Pixels are ranked on basis of change score value and some top ranked 
pixels are considered as changes.
b1 b2 b3 
Time series for one pixel
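The merging steps for one pixel can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes each segment is a single annual EVI sum, that dist is the absolute difference, and that a merged pair is replaced by its plain average as in step 2. The function name and the handling of a zero minimum cost are assumptions made for the example.

```python
def recursive_merge_change_score(annual_evi):
    """Change score for one pixel: max merge cost divided by min merge cost."""
    segments = list(annual_evi)
    costs = []
    while len(segments) > 1:
        # Find the adjacent pair with the most similar values (smallest distance).
        idx = min(range(len(segments) - 1),
                  key=lambda i: abs(segments[i] - segments[i + 1]))
        costs.append(abs(segments[idx] - segments[idx + 1]))
        # Replace the pair by its average, so the list shrinks by one element.
        segments[idx:idx + 2] = [(segments[idx] + segments[idx + 1]) / 2]
    lowest = min(costs)
    if lowest == 0:
        # Assumption: a zero minimum cost is treated as an unbounded score.
        return float("inf")
    return max(costs) / lowest


# A pixel whose EVI drops sharply scores higher than a pixel that only oscillates.
changed = recursive_merge_change_score([10, 11, 10, 4, 5, 4])
stable = recursive_merge_change_score([10, 11, 10, 11, 10, 11])
```

Dividing by the minimum merge cost is what normalizes the score by the natural variability of the location.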
 Change score is calculated in such a way so as to take into account 
the type of vegetation 
◦ very small change can be considered as change point for stable 
forests 
◦ large change may not be change point for high variability regions 
such as grasslands 
 Helps in reducing the detection of false positives 
 Limitations: 
◦ The minimum merge cost is taken as the variability due to local 
climatic changes. 
◦ However, the minimum cost may correspond to a rare event captured by 
chance, and so may not be representative.
Time series data mining methods 
 Predictive model based 
◦ Yearly Delta algorithm
 Input: MODIS EVI data for California and Yukon 
◦ Data for California is at 250m spatial resolution for years 2006- 
2008. 
◦ Data for Yukon is at 1km spatial resolution for years 2004-2008. 
◦ Time series for each pixel is analyzed independently 
 Output: 
◦ Land cover change locations (pixels) 
◦ Time at which change occurred 
 Validation: High quality data for fires generated from independent 
source is used for validation
 Algorithm: 
◦ Previous year is considered as a model 
◦ Change score is assigned to each time step as difference between mean annual 
EVI of current year and previous year 
$$\text{change score}_{\text{current year}} = \text{annual EVI}_{\text{current year}} - \text{annual EVI}_{\text{previous year}}$$ 
◦ Maximum change score across all the time steps is considered the YD score for a 
location 
$$\text{YD score} = \max_{1 \le i \le n-1}(\text{change score}_i)$$ 
◦ Top ranked pixels according to YD score are called change points. 
 Limitation : 
◦ Does not make use of information about natural variation in EVI. 
◦ Only one top change of a time series is considered. 
 There is possibility that one time series may undergo multiple changes during a 
given period
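A minimal Python sketch of the YD score for one pixel, following the sign convention stated above (current year minus previous year). The function name and the choice to also return the position of the change are assumptions for illustration; the position reflects the fact that the method reports the time of change.

```python
def yd_score(annual_evi):
    """Return (YD score, index of the year in which the largest change occurs)."""
    # One change score per pair of consecutive years: n years give n-1 scores.
    deltas = [cur - prev for prev, cur in zip(annual_evi, annual_evi[1:])]
    best = max(range(len(deltas)), key=lambda i: deltas[i])
    # deltas[i] compares year i+1 with year i, so the change is in year i+1.
    return deltas[best], best + 1


score, year_index = yd_score([50, 52, 51, 90])
```

To detect drops in vegetation instead of increases, the difference can be reversed or replaced by its absolute value; which one applies depends on the convention of the data set.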
Figure: the actual change occurs in year 2008, where the difference in annual EVI is high. 
Figure: a change is flagged in year 2005 due to natural variations; the difference in annual EVI is high, but not very high compared with the mean variability score.
Time series data mining methods 
 Predictive model based 
◦ Variability distribution algorithm
Algorithm: 
 Each annual segment in the first k years is considered a model and remaining k-1 
values are considered as the observed values. 
 Mean Manhattan distance is computed for the k-1 years of model to give the 
distribution of variability scores for that location. 
 Modified score value called VD score is used which is 
$$\text{VD score} = \text{YD score} - \mu$$ 
where μ is the mean of distribution. 
 The mean is estimated using Maximum Likelihood Estimation method 
Special features: 
 Makes use of information about natural variation in EVI. 
 Any year for which annual EVI deviates significantly from the mean annual EVI for k 
years should be discarded 
Limitation : 
 Some of the vegetation types such as open shrubs have large variations in spread of 
annual variability
Scatter plot of mean variability against YD score for forest cover, with lines of constant YD score and constant VD score and the change point marked (Courtesy: Mithal et al. [19]). 
As only one vegetation type, i.e. forests, is considered, YD also performs well.
Scatter plot of mean variability against YD score for savannas, with lines of constant YD score and constant VD score (Courtesy: Mithal et al. [19]). 
Savannas consist of trees, shrubs, grasses etc. Different vegetation types have different threshold change scores for a change to be considered an actual change. Therefore, VD performs better than the YD algorithm.
Scatter plot of mean variability against YD score for shrublands, with lines of constant YD score and constant VD score (Courtesy: Mithal et al. [19]). 
Open shrublands show a different spread of variability at different locations even though the vegetation type is the same; therefore both YD and VD produce many false positives.
Time series data mining methods 
 Predictive model based 
◦ Vegetation Independent Variability Distribution Algorithm
Algorithm: 
 Mean and standard deviation of variability score distribution are 
estimated as maximum likelihood estimates of distribution 
 New score called VID score is used and calculated as follows: 
$$\text{VID score} = \frac{\text{YD score} - \mu}{\sigma}$$ 
Salient features: 
 Takes into account the information about spread of variability score 
distribution and therefore reduces false alarm rates 
 High VID score implies lower false positive rate and vice versa.
Figure: curves of the variability score for pixel 1 and pixel 2, plotted against mean annual EVI; for each pixel, a variability score in the marked area indicates change. 
Both pixels correspond to shrub vegetation type whose spread of variability score varies 
from location to location and time to time.
Maximum likelihood estimation (MLE) 
 Every model is specified by the parameters. 
 MLE is a parameter estimation method which finds the parameter values of a 
model that best fits the data. 
 As fluctuations in variability score for particular vegetation type are normally 
distributed for a location, therefore parameters are calculated for normal 
distribution 
 The mean and standard deviation are the parameters for the normal distribution. 
 Calculation of mean and standard deviation using MLE 
◦ Let f(y|w) denote the probability density function (PDF) that specifies the probability of observing data 
vector y given the parameter w. 
◦ If the individual observations yi are independent of each other, then the PDF for the data 
y=(y1,.......,yn) given the parameter vector w can be expressed as 
the product of the individual PDFs. 
f(y=(y1,…..yn)|w) = f1(y1|w) f2(y2|w)…..fn(yn|w)
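 The PDF for one observation is 
$$P(x_i) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$$ 
 The PDF for multiple independent observations is 
$$f(x_1,\ldots,x_n \mid \mu,\sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} = (2\pi\sigma^2)^{-\frac{n}{2}} \, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2}$$ 
 Taking log on both sides 
$$\ln(f) = -\frac{n}{2}\ln(2\pi) - n\ln(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2$$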
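 In order for data to best fit the model, the value of the parameter 
vector should maximize the PDF, or equivalently its logarithm. 
 The partial derivative of ln(f) with respect to each component 
parameter of the vector should be zero 
$$\frac{\partial \ln(f)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0 \;\Rightarrow\; \mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$ 
$$\frac{\partial \ln(f)}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i-\mu)^2 = 0 \;\Rightarrow\; \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$$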
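The maximum likelihood estimates of a normal distribution are the sample mean and the standard deviation with divisor n (not n-1). The sketch below computes them for a list of variability scores and then derives the VD and VID scores from a given YD score. The function names and the sample values are hypothetical, chosen only to illustrate the arithmetic.

```python
import math


def mle_normal(scores):
    """Maximum likelihood estimates (mu, sigma) of a normal distribution."""
    n = len(scores)
    mu = sum(scores) / n
    # MLE of the variance divides by n, not by n-1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in scores) / n)
    return mu, sigma


def vd_score(yd, mu):
    """VD score: YD score shifted by the mean variability."""
    return yd - mu


def vid_score(yd, mu, sigma):
    """VID score: YD score standardized by mean and spread of variability."""
    return (yd - mu) / sigma


# Hypothetical variability scores for one location.
mu, sigma = mle_normal([2, 4, 4, 4, 5, 5, 7, 9])
vd = vd_score(11, mu)
vid = vid_score(11, mu, sigma)
```

Because VID divides by σ, the same raw change counts for less at a location whose variability scores are widely spread.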
Yearly Delta algorithm | Variability distribution method | Vegetation Independent Variability Distribution algorithm 
Does not consider the type of vegetation. The same YD score value may be an actual change for forests but not for savannas or shrublands. | Considers the type of vegetation. The same VD score may be an actual change for regions such as savannas (having less variation in variability value) but not for shrublands (having high variation in variability value). | Considers the type of vegetation. The VID score works for all the vegetation types. 
Does not consider the average change score value (μ) or the degree of variability in value (σ) | Considers only the average change score value (μ) | Considers both the average change score value (μ) and the degree of variability in value (σ) 
YD score = max over i = 1 to n-1 of (annual EVI current year − annual EVI previous year) | VD score = YD score − μ | VID score = (YD score − μ) / σ
Where TPn = true positives, 
FPn = false positives, 
M = total no of pixels considered
Legend: green line = YD score, red line = VD score, black line = VID score. 
VD and VID give better results than YD. 
Reason: the graph corresponds to the forest region only. The MODIS forest map used to detect forest cover pixels is inaccurate and includes some shrubs and agricultural land labeled as forests.
Legend: green line = YD score, red line = VD score, black line = VID score. 
VID performs slightly worse than VD. 
Reason: the initial few years selected to model variability may contain some noise. The mean variability for that location is then modeled as high, and changes in later years go undetected.
Legend: green line = YD score, red line = VD score, black line = VID score. 
The performance of VID is best. 
Reason: shrubs form the dominant land cover type in California, and they show high variability in the spread of the variability score due to their higher sensitivity to climatic variations.
Legend: green line = YD score, red line = VD score, black line = VID score. 
The performance of YD is exceptionally poor and that of VID is exceptionally good. 
Reason: high variability in the spread of the variability score across different locations with shrub vegetation.
 Anomaly detection 
◦ Outlier precipitation detection
 Input : 
◦ South American precipitation dataset in the geoscience format NetCDF 
Variable | Value 
Num Year Periods | 10 
Year Range | 1995-2004 
Grid Size | 2.5º×2.5º 
Num Latitudes | 31 
Num Longitudes | 23 
Total Grids | 713 
 Output: 
◦ The top k=5 outliers are found for every year 
◦ Total of 155 outlier sequences were found over a period of 10 years 
 Running time of algorithm is 229s.
 Aim: 
◦ To find and track the position of outliers with time 
 Method description: 
◦ Top k outliers are found for every year using Exact-Grid Top-k algorithm 
◦ Outliers are tracked using the OutStretch algorithm 
◦ The outlier sequences generated are analyzed 
 How to find the outlier (Exact-Grid Top-k algorithm) 
◦ Concept of discrepancy is used 
◦ Discrepancy value is assigned to each rectangular region using 
Kulldorff’s scan statistic. 
◦ Top-k outliers are selected for further processing as it is necessary in 
order to track the outliers
 How to calculate the discrepancy? 
◦ Two parameters are required: 
 a measurement m (number of incidences of an event) 
 a baseline b (total population at risk) 
◦ The measurement M and baseline B values for the whole dataset (U) are 
calculated as 
$$M = \sum_{p \in U} m(p), \qquad B = \sum_{p \in U} b(p)$$ 
◦ The measurement and baseline values for the region (R) are calculated 
as 
$$m_R = \frac{\sum_{p \in R} m(p)}{M}, \qquad b_R = \frac{\sum_{p \in R} b(p)}{B}$$ 
◦ The discrepancy score of the shaded area is calculated by using the given 
formula: 
$$d(m_R, b_R) = m_R \log\frac{m_R}{b_R} + (1 - m_R)\log\frac{1 - m_R}{1 - b_R}$$ 
◦ For the above figure, M=6, B=16, mR= 4/6, and bR = 4/16
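The discrepancy for the worked figure (M=6, B=16, with 4 incidences and 4 baseline units inside the region) can be evaluated with a short Python sketch. The function name is hypothetical, the natural logarithm is assumed, and the region's baseline share is assumed to lie strictly between 0 and 1.

```python
import math


def discrepancy(m_region, b_region, m_total, b_total):
    """Kulldorff-style discrepancy d(m_R, b_R) of a region."""
    m_r = m_region / m_total  # share of all incidences that fall in the region
    b_r = b_region / b_total  # share of the whole baseline that falls in the region

    def term(x, y):
        # By convention 0 * log(0) is taken as 0.
        return 0.0 if x == 0 else x * math.log(x / y)

    return term(m_r, b_r) + term(1 - m_r, 1 - b_r)


d = discrepancy(4, 4, 6, 16)
```

A region whose share of incidences equals its share of the baseline gets a discrepancy of zero, so high values mark regions that deviate from what the baseline predicts.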
 Outstretch algorithm 
◦ The region is stretched around each side of the outlier region of the 
previous year 
◦ Each of outlier in current year is examined to see whether it lies in the 
region consisting of stretched region and outlier region of previous year 
◦ If it is, then it will be added to child list of previous year outlier 
 RecurseNode algorithm: 
◦ All the sequences starting at root node of trees and ending at leaf node 
are fetched. 
Figure: the outlier region of the previous year, surrounded by the stretched region. 
Figure: forest built by applying the Outstretch algorithm recursively; (1,1), (2,2) and (3,2) correspond to one outlier sequence.
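The tracking idea can be sketched in Python under some simplifying assumptions: regions are axis-aligned rectangles (x1, y1, x2, y2), the stretch is a fixed number of grid cells r on every side, and "lies in" means full containment. All function names here are hypothetical, and nodes are identified by (year index, outlier index), both counted from zero.

```python
def stretch(region, r):
    """Grow a rectangle (x1, y1, x2, y2) by r on every side."""
    x1, y1, x2, y2 = region
    return (x1 - r, y1 - r, x2 + r, y2 + r)


def contains(outer, inner):
    """True if rectangle inner lies completely inside rectangle outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def out_stretch(outliers_by_year, r):
    """Map each outlier (year, k) to the next-year outliers it is linked to."""
    children = {}
    for year, regions in enumerate(outliers_by_year):
        has_next = year + 1 < len(outliers_by_year)
        following = outliers_by_year[year + 1] if has_next else []
        for k, region in enumerate(regions):
            grown = stretch(region, r)
            children[(year, k)] = [(year + 1, j)
                                   for j, cand in enumerate(following)
                                   if contains(grown, cand)]
    return children


def sequences(children, node):
    """All paths from node down to a leaf, in the spirit of RecurseNode."""
    if not children[node]:
        return [[node]]
    return [[node] + tail
            for child in children[node]
            for tail in sequences(children, child)]


years = [[(0, 0, 1, 1)], [(1, 1, 2, 2), (5, 5, 6, 6)], [(2, 1, 3, 2)]]
links = out_stretch(years, r=1)
paths = sequences(links, (0, 0))
```

In this example the first outlier of each year links to the first outlier of the next, giving a single three-year sequence, while the distant outlier (5, 5, 6, 6) stays unlinked.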
 Anomaly detection 
◦ Outlier time interval detection
 Input: 
◦ Sea surface temperature (SST) data of Equatorial Pacific Ocean. 
◦ The data consisted of measurements of sea surface temperature 
for 44 sensors in Pacific Ocean 
◦ Each sensor had a time series of 1440 data points. 
 Output: 
◦ Time intervals where spatial neighborhood has shown abnormal 
behavior.
Terms: 
 Spatial distance (sd) : Distance between 2 locations based on distance between 
spatial coordinates 
$$sd = \sqrt{(s_{px} - s_{qx})^2 + (s_{py} - s_{qy})^2}$$ 
 Measurement distance (md) : Distance between 2 points based on difference 
between features of 2 points. 
$$md = \sqrt{\sum_{m} (s_{p a_m} - s_{q a_m})^2}$$ 
where p and q are 2 locations and a_m is the m-th measured feature 
 Spatial neighborhood : Cluster of locations such that the spatial distance (sd) 
and measurement distance (md) between every 2 locations is less than the 
respective threshold values. 
 Sum of squared error (SSE) : Measure of degree of abnormality of the interval 
$$SSE = \sum_{bn=1}^{BN} dist(val_{bn}, \mu_{int})^2$$ 
where val_bn is each temporal reading in the base interval and μ_int is the mean 
of the temporal readings
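Two of these terms are easy to illustrate in Python. The sketch assumes that locations are (x, y) coordinate pairs and that dist in the SSE is the plain difference between a reading and the interval mean; the function names are hypothetical.

```python
import math


def spatial_distance(p, q):
    """Euclidean distance between two locations given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def sse(readings):
    """Sum of squared differences between each reading and the interval mean."""
    mu = sum(readings) / len(readings)
    return sum((value - mu) ** 2 for value in readings)


sd = spatial_distance((0, 0), (3, 4))
interval_sse = sse([1, 2, 3])
```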
 Aim: 
◦ To find time intervals where spatial neighborhoods are likely to show 
abnormal behavior. 
 Algorithm: 
◦ Time series is first divided into a set of base equal size temporal 
intervals 
◦ Spatial neighborhoods are found for every base interval 
◦ Each of spatial node in every base interval is analyzed and binary 
classified as 1 if showing abnormal behavior or 0 otherwise 
◦ Count of spatial nodes having a binary error classification of 1 is 
found for every base interval and this count is called vote count. 
◦ A threshold mv is then applied and those intervals for which votes > 
mv are binary classified as 1 and others as 0.
◦ Consecutive base intervals which have same binary classification are 
merged to form the larger intervals. 
◦ Mean value for each edge is calculated for every interval. 
◦ Spatial neighborhoods are calculated for each interval using the 
mean value of edge.
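The voting and merging steps can be sketched in Python as follows, assuming the vote count of every base interval has already been computed. The function name and the sample vote counts are hypothetical.

```python
from itertools import groupby


def merge_base_intervals(vote_counts, mv):
    """Label base intervals by vote threshold and merge equal neighbours.

    Returns (first base interval, last base interval, label) triples.
    """
    # An interval is labelled 1 only when its votes exceed the threshold mv.
    labels = [1 if votes > mv else 0 for votes in vote_counts]
    merged, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        merged.append((start, start + length - 1, label))
        start += length
    return merged


intervals = merge_base_intervals([0, 1, 5, 6, 2, 7], mv=3)
```

Each merged interval keeps the indices of the first and last base interval it covers, so the original time stamps can be recovered afterwards.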
Agglomerative temporal intervals for SST data 
Location : 0ºN latitude and 110ºW longitude 
Time period : 10 day period from 01/01/2004 to 01/10/2004 
No of measurements: approx.1400
Neighborhood (a) represents cooler water 
Neighborhood (b) represents warmer water 
Neighborhood (c) and (d) represents moderate water 
• Edge clustering is validated by satellite image of SST. 
• Light regions represent cooler temperatures 
•Dark regions represent warmer temperatures.
Figure: neighborhood quality (SSE) for each interval. 
The SSE of neighborhood (a) shows an interesting pattern between intervals 16 and 19: it goes from high to low and then back from low to high. 
Neighborhood (a) has more spread during the 16th interval than during the 17th interval.
 Input : 
◦ Land cover type 
◦ 8-day composite surface reflectance for VIS band (CH1) and NIR 
band (CH2) 
◦ CH2-CH1 
◦ CH2/CH1 
◦ NDVI dataset 
◦ Data before flooding in Mississippi basin is used as training dataset 
◦ Data after flooding in Mississippi basin that occurred on June 17-19, 
2008 is used for testing 
 Output: 
◦ The best attribute (R) for classification i.e. CH2-CH1 is found. 
◦ The threshold values of the best attribute (R) for pure water and pure 
land are found. 
 Validation data: 
◦ 30m spatial resolution Landsat TM imagery for validation purposes
 Aim: 
◦ To find the fraction of water in flood pixels which are usually water 
mixed with land cover features for MODIS dataset which has 
coarse resolution 
 Method description: 
◦ Decision tree approach is used to find 
 the best parameter (predictor) in order to differentiate between 
land and water. 
 the threshold values of the predictor R for pure water (Rwater) 
and pure land (Rland) 
◦ Water fraction per pixel can be found by comparing actual value 
of predictor with its value for pure water or pure land 
R WF * R  (1WF )* 
Rmix water land 
((R R ) /(R R ))*100 land mix land water 
WF   
Experimental Results: 
 Some of the rules used for deciding threshold values are : 
◦ (CH2-CH1) > 9.17 -> class Land 
◦ (CH2-CH1) <= 2.91 -> class Water 
 Correlation between TM and MODIS water fractions is 0.97 with 
bias of 4.47% and standard deviation of 4.4%.
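The linear mixing relation can be turned into a small Python helper. As an assumption for illustration only, the two rule thresholds quoted above are used as the pure-water and pure-land values of the predictor CH2-CH1, and results are clamped to the range 0 to 100; the function name is hypothetical.

```python
def water_fraction(r_mix, r_water=2.91, r_land=9.17):
    """Water fraction of a pixel in percent, from the linear mixing model."""
    wf = (r_land - r_mix) / (r_land - r_water) * 100.0
    # Values beyond the pure-land or pure-water end are clamped.
    return max(0.0, min(100.0, wf))


half_flooded = water_fraction(6.04)  # halfway between the two thresholds
```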
Decision tree created 
using C4.5 algorithm
 Detection of forest cover per pixel
 Input : 
◦ Land surface temperature from monthly composited MOD11C3 
product 
◦ NDVI and EVI from monthly composited MOD13C2 product 
◦ Land cover type from MCD12C1 yearly product 
 Output: 
◦ Fraction of forest cover per pixel 
 Validation: 
◦ Forest cover information from PRODES data at 90 m resolution 
available in GeoTiff format is used for validation purposes
 Aim : 
◦ To find the forest cover per pixel for MODIS dataset having coarse 
resolution 
◦ The data values for parameters like NDVI, EVI, land surface temperature 
etc are available per pixel. 
◦ Therefore, value is affected by vegetation cover of every point covered in 
that pixel. 
◦ Same parameter value may correspond to different fraction of forest 
cover depending on vegetation type for whole area per pixel. 
 Algorithm: 
◦ Modification of Leeuwen et al. approach. 
◦ Leeuwen et al. approach gives the single logistic regression model for all 
vegetation types. 
◦ But, improved algorithm considers vegetation type and gives 
independent logistic regression model for each vegetation type
Leeuwen et al. approach 
Terms: 
 pit : Fraction of forest cover for pixel i in year t (generated from the 
analysis of high-resolution Landsat TM images) 
 Xit : Vector of MODIS observations for pixel i in year t 
 β: Vector of model parameters (which are estimated from a set of 
training data) 
 The vectors Xit and β each have three components: 
◦ the first corresponding to a constant intercept term 
◦ the second to a NDVI measurement, 
◦ and the third to a LST measurement. 
Model : 
$$\ln\left(\frac{p_{it}}{1 - p_{it}}\right) = X_{it}^{T}\beta$$
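Solving the model for p gives the logistic function of the linear predictor. The Python sketch below evaluates it for one pixel; the coefficient values are hypothetical and only illustrate the three-component layout (intercept, NDVI, LST), they are not estimates from the paper.

```python
import math


def forest_fraction(x, beta):
    """Forest cover fraction p with ln(p / (1 - p)) = x . beta."""
    z = sum(xi * bi for xi, bi in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: intercept, NDVI weight, LST weight.
beta = (0.5, 2.0, -0.01)
# Observation vector: constant 1, NDVI, LST in kelvin.
x = (1.0, 0.8, 300.0)
p = forest_fraction(x, beta)
```

A vegetation-specific variant would keep one such coefficient vector per vegetation type and select it by the land cover class of the pixel.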
 Learning independent regression models requires segmentation 
of the observation space into multiple categories. 
 Segmentation is done by partitioning the feature space, an n-dimensional 
space with one axis per feature. 
 Features are selected based on their ability in differentiating 
between different vegetation types. 
 For example: Forests show high inter-annual NDVI and EVI mean 
and low inter-annual LST mean but intra-annual variance of NDVI, 
EVI and LST is low. 
 Therefore, mean(μ) and variance(σ2) are selected as features.
Figure: vegetation type distribution in feature space (μ, σ2) of NDVI 
◦ Forests show high inter-annual mean and low intra-annual variance 
◦ Farms show high intra-annual variance due to crop cycles 
◦ Grasslands show high intra-annual variance and high inter-annual mean 
◦ Water locations show high intra-annual variance and low inter-annual mean
Analysis of the partition corresponding to the forest vegetation type 
Figure: scatter plot of the residual of the baseline approach against the residual of the vegetation-specific approach. 
The residual of the vegetation-specific approach has lower magnitude than that of the baseline model; therefore, the vegetation-specific approach is better than the baseline model.
Analysis of the partition corresponding to the cropland vegetation type 
The residual of the vegetation-specific model is lower in magnitude than that of the baseline model.
 Conclusion
 Various research works related to anomaly detection and 
detection of water fraction or forest cover per pixel have been 
discussed. 
 Most of the research works are pixel-based and do not 
consider the spatial neighborhood of a pixel. 
 Domain knowledge is also required along with data mining 
techniques 
 Future works should work towards addressing these 
limitations.
 References
 [1] Jonathan T. Overpeck, Gerald A. Meehl, Sandrine Bony, David R. Easterling, "Climate 
data challenges in the 21st century," in Science, 2011. 
 [2] James H. Faghmous and Vipin Kumar," Spatio-temporal data mining for climate data : Advances, 
Challenges and Opportunities," in Data Mining and Knowledge Discovery for Big Data, 2014. 
 [3] Donglian Sun, Yunyue Yu, Mitchell D. Goldberg ,"Deriving Water Fraction and Flood Maps From 
MODIS Images Using Decision Tree Approach," in IEEE Journal of Selected Topics In Applied Earth 
Observations And Remote Sensing , 2011. 
 [4] Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster," Land 
Cover Change Detection : A Case Study," in Knowledge Discovery in Databases Proceedings, 2008. 
 [5] Elizabeth Wu, Wei Liu, Sanjay Chawla ,"Spatio-Temporal Outlier Detection in Precipitation Data," 
in Knowledge Discovery in Databases Proceedings, 2008. 
 [6] Hong Yeon Cho, Ji Hee Oh, Kyeong Ok Kim, and Jae Seol Shim, "Outlier Detection and missing 
data filling methods for coastal water temperature data," in Journal of Coastal Research, 2013.
 [7] C.T.Dhanya and D.Nagesh Kumar, " Data mining for evolution of association rules for droughts 
and floods in India using climate inputs," in Journal of Geophysical Research, 2009. 
 [8] Ruixin Yang, Jiang Tang, and Donglian Sun, " Association Rule Data Mining Applications for 
Atlantic Tropical Cyclone Intensity Changes," in Journal of American Meteorological Society, 2011. 
 [9] James H.Faghmous, Yashu Chamber, Shyam Boriah, Stefan Liess, Vipin Kumar, "A novel and 
scalable spatio-temporal technique for ocean eddy monitoring," in Association for Advancement of 
Artificial Intelligence, 2012. 
 [10] Imran Maqsood, Muhammad Riaz Khan, and Ajith Abraham, "An ensemble of neural networks for 
weather forecasting," in Neural Computing & Applications, 2004. 
 [11] Agboola A.H., Gabriel A.J., Aliyu E.O., Alese B.K., "Development of a Fuzzy Logic Based 
Rainfall Prediction Model," in International Journal of Engineering and Technology, 2013. 
 [12] Christopher G.Healey, "On the Use of Perceptual Cues and Data Mining for Effective 
Visualization of Scientific Datasets," in Proceedings Graphics Interface, 1998.
 [13] Wenwen Li, Chaowei Yang, Donglian Sun, "Mining geophysical Parameters through Decision Tree 
Analysis to Determine Correlation with Tropical Cyclone Development," in Computers & Geosciences, 
2008. 
 [14]Pinky Saikia Dutta, and Hitesh Tahbilder, "Prediction of Rainfall using Data mining Technique over 
Assam," in Indian Journal of Computer Science and Engineering (IJCSE),2014. 
 [15]Anuj Karpatne, Mace Blank, Michael Lau, Shyam Boriah, Karsten Steinhaeuser, Michael Steinbach 
and Vipin Kumar," Importance of Vegetation Type in Forest Cover," in Intelligent Data Understanding, 
2012. 
 [16] James H.Faghmous, Mathew Le, Muhammed Uluyol, Vipin Kumar and Snigdhansu Chatterjee, "A 
parameter-free spatio-temporal pattern mining model to catalog ocean dynamics," in IEEE 13th 
International Conference on Data Mining, 2013. 
 [17] Rie Honda and Osamu Konishi, "Temporal rule discovery for Time-Series Satellite Images and 
Integration with RDB," in Principles of Data Mining and Knowledge Discovery, Lecture Notes in 
Computer Science ,2001.
 [18] Pol R. Coppin and Marvin E. Bauer, "Change Detection in Forest Ecosystems with Remote Sensing 
Digital Imagery ," in Remote Sensing Reviews, 1996. 
 [19] Varun Mithal, Ashish Garg, Ivan Brugere, Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher 
Potter, Steven Klooster, "Incorporating Natural Variation Into Time Series-Based Land Cover Change 
Identification," in Proceedings of the NASA Conference on Intelligent Data Understanding, 2011. 
 [20] Varun Mithal, Shyam Boriah, Ashish Garg, Michael Steinbach, Vipin Kumar, "Monitoring Global 
Forest Cover Using Data Mining," in Journal of Association for Computing Machinery, Volume V, 2010. 
 [21] D. Agarwal, A. McGregor, J.M.Phillips, S.Venkatsubramanian, and Z.Zhu, "Spatial Scan Statistics: 
Approximations and Performance Study," in Knowledge Discovery in Databases Proceedings, 2006. 
 [22] Michael P. McGuire, Vandana P. Janeja, Aryya Gangopadhyay, "Spatiotemporal Neighborhood 
Discovery for Sensor Data" in Knowledge Discovery in Databases Proceedings, 2008. 
 [23]Thijs T. van Leeuwen, Andrew J. Frank, Yufang Jin, Padhraic Smyth, Michael L. Goulden, Guido R. 
van der Werf and James T. Randerson, "Optimal use of land surface temperature data to detect changes 
in tropical forest cover," in Journal of Geophysical Research: Biogeosciences, 2011.

Seminar final1

  • 1. Amod Aggarwal (13535005) Guided by: Dr. Padam Kumar Dr. Dhaval Patel
  • 2. Introduction; Literature Survey; Conclusion; References
  • 3. What is Meteorology and Oceanography? ◦ The study of spatial and temporal variations of atmospheric, oceanographic and land parameters over long time periods ◦ Helps in predicting disasters, which helps prevent loss of life and property. What is data mining? ◦ The process of extracting implicit, previously unknown and potentially useful information from large amounts of data
  • 4. Techniques and their applications. Anomaly detection: detection of land cover change and of outlier precipitation values. Association rule mining: finding associations between oceanographic parameters and cyclone intensification. Pattern mining: understanding natural events; for example, eddies sustain energy for weeks or months and can therefore appear as a connected group of gradually increasing or decreasing time series. Classification: detection of water fraction per flood pixel. Regression: detection of forest cover per pixel.
  • 5. Ocean eddies are swirls of ocean currents that play a significant role in the transport of water, heat, salt, and nutrients. Figure: the green swirl is an ocean eddy. Figure: gradually decreasing segments of the time series enclosed between the red and green lines are signatures of an eddy.
  • 6. This is challenging for the following reasons: ◦ Not concrete objects: Spatio-temporal phenomena are not concrete objects but patterns that evolve over space and time, whereas in traditional data mining objects are concrete, i.e. they are either present or absent. Transactions: an item is either present or absent (0 or 1). Hurricanes: continuous, gradual evolution; they do not simply appear and disappear. ◦ Uncertainty: Measurements may be biased, and some values may be missing, for example because of cloud cover. ◦ Diversity: The data is heterogeneous in space and time, as it may come from different sources at different spatial and temporal resolutions. ◦ Variability: Values captured for the same location a short interval apart may differ because of local climatic variations.
  • 7. Data retrieved from remote sensing satellites comes as data products in different data formats. The standard format for most data products is HDF; other formats include NetCDF and KML. One data product contains data for one parameter. Authenticated users can download Indian satellite data from ISRO's mosdac.gov.in website. MOSDAC disseminates data for around 20 parameters, including: ◦ Normalized Difference Vegetation Index (NDVI) ◦ Land Surface Temperature (LST) ◦ Aerosol Optical Depth (AOD) ◦ Cloud Liquid Water (CLW) ◦ Mean Sea Surface (MSS)
  • 8. HDF: a container for storing a variety of scientific data, composed of two primary types of objects: ◦ Groups: a grouping structure containing one or more HDF objects together with supporting metadata ◦ Datasets: a multidimensional array of data elements together with supporting metadata
  • 9.
  • 10.
  • 11. Introduction; Literature Survey; Conclusion; References
  • 12. Anomaly detection ◦ Land cover change detection ◦ Outlier precipitation detection ◦ Outlier time interval detection. Detection of water fraction per flood pixel. Detection of forest cover per pixel.
  • 13. Aim: ◦ To find locations that undergo significant and sudden change during a particular time period. ◦ To determine the time at which the change occurs. Importance: ◦ Helps in mapping damage following natural disasters such as fires, droughts and floods.
  • 14. Land cover change detection methods: bitemporal methods; time series data mining techniques. Legend: red marks the techniques focused on.
  • 15. Bitemporal methods: image differencing; image ratioing; principal component analysis; change vector analysis. (Land cover change detection methods: bitemporal methods; time series data mining techniques.)
  • 16. Time series data mining techniques. Predictive model based: Yearly Delta algorithm; Variability Distribution algorithm; Vegetation Independent Variability Distribution algorithm. Segmentation based: top-down approach; bottom-up approach; recursive merging algorithm. (Land cover change detection methods: bitemporal methods; time series data mining techniques.)
  • 17. Bitemporal methods versus time series data mining methods. Bitemporal methods: two time instants are compared; they do not provide the time of change; computational complexity is lower. Time series data mining methods: the vegetation time series is analyzed at each location and changes in the time series are identified; they provide the time of change; computational complexity is high because a long time series has to be analyzed. Segmentation based approach versus predictive model based approach. Segmentation based: the time series is partitioned into homogeneous segments, and the boundaries between segments may be change points. Predictive model based: a model is constructed from a portion of the time series and used to predict future time points; time points that differ sufficiently from the prediction are considered change points.
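As a rough illustration of the bitemporal idea on slides 15 and 17, the sketch below applies image differencing to two small grids of vegetation index values. The function name, the sample grids and the fixed threshold are all assumptions made for this example; the slides do not say how a threshold would be chosen.

```python
def image_difference_changes(before, after, threshold):
    """Return (row, col) positions whose value changed by more than threshold.

    `before` and `after` are equally sized 2-D lists holding one vegetation
    index value per pixel at two time instants. The threshold is a
    hypothetical parameter chosen by the analyst.
    """
    changed = []
    for r, (row_before, row_after) in enumerate(zip(before, after)):
        for c, (v_before, v_after) in enumerate(zip(row_before, row_after)):
            # Image differencing: compare the same pixel at the two instants.
            if abs(v_after - v_before) > threshold:
                changed.append((r, c))
    return changed


# Hypothetical 2x2 scene in which only the top-right pixel loses vegetation.
before = [[0.8, 0.7], [0.6, 0.5]]
after = [[0.8, 0.2], [0.6, 0.5]]
changed_pixels = image_difference_changes(before, after, threshold=0.3)
```

Because only two instants are compared, the result says where a change happened but not when, which is the drawback the comparison above attributes to bitemporal methods.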
  • 18.
  • 19. Time series data mining methods. Segmentation based: ◦ Recursive merging algorithm. Predictive model based: ◦ Yearly Delta algorithm ◦ Variability distribution algorithm ◦ Vegetation independent variability distribution algorithm
  • 20. Input: Monthly composited EVI (Enhanced Vegetation Index) dataset for the state of California for the years 2000-2006. Output: Detected land cover changes ◦ Forest fires ◦ Conversion to farming ◦ Construction or logging
  • 21. Algorithm: The time series of each pixel is analyzed as follows: 1. Let {b1, b2, …, bn} be the list of annual EVI sums, where each annual sum is the sum of the vegetation index values of all months in that year. 2. The two consecutive segments with the most similar annual EVI sums are merged. • Suppose b1 and b2 are the most similar; then at the end of this step the list is {(b1+b2)/2, b3, …, bn}, which has one element fewer. • Merge cost s1 = dist(b1, b2). 3. Step 2 is applied recursively until the list contains one element. 4. The list of merge costs is s1, s2, …, sn-1. 5. The change score for a location (pixel) is $\text{change score} = \dfrac{\max_{1 \le i \le n-1} s_i}{\min_{1 \le i \le n-1} s_i}$. 6. Pixels are ranked by change score, and the top-ranked pixels are considered changes.
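The merging steps on slide 21 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that dist is the absolute difference between two annual sums, and it adds a guard for a zero minimum merge cost, which the slide does not discuss. The function name and sample values are hypothetical.

```python
def recursive_merge_change_score(annual_sums):
    """Return (change_score, merge_costs) for one pixel.

    `annual_sums` is the list b1..bn of annual EVI sums. Adjacent segments
    are merged until one remains; the change score is the ratio of the
    largest to the smallest merge cost.
    """
    segments = list(annual_sums)
    if len(segments) < 2:
        raise ValueError("need at least two annual sums")
    merge_costs = []
    while len(segments) > 1:
        # Assumption: dist(a, b) is the absolute difference.
        costs = [abs(segments[i] - segments[i + 1])
                 for i in range(len(segments) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        merge_costs.append(costs[i])
        # Replace the most similar adjacent pair by its mean, as on the slide.
        segments[i:i + 2] = [(segments[i] + segments[i + 1]) / 2]
    smallest, largest = min(merge_costs), max(merge_costs)
    if smallest == 0:
        # Guard added for this sketch only.
        return (float("inf") if largest > 0 else 0.0), merge_costs
    return largest / smallest, merge_costs


# Hypothetical pixels: one drops sharply after year 3, one drifts slowly.
changed_score, changed_costs = recursive_merge_change_score(
    [10.0, 10.5, 10.2, 4.0, 4.1])
stable_score, _ = recursive_merge_change_score(
    [10.0, 10.1, 10.2, 10.3, 10.4])
```

Dividing by the minimum merge cost normalizes the largest jump by the pixel's own typical year-to-year variation, so the abrupt drop scores far higher than the slow drift.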
  • 22. b1 b2 b3 Time series for one pixel
  • 23.  Change score is calculated in such a way so as to take into account the type of vegetation ◦ very small change can be considered as change point for stable forests ◦ large change may not be change point for high variability regions such as grasslands  Helps in reducing the detection of false positives  Limitations: ◦ Minimum cost of merging is considered as variability value due to local climatic changes. ◦ But, the minimum cost may have occurred very rare and have been captured by chance
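The recursive merging steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `dist` is taken to be the absolute difference between two annual EVI sums, a zero minimum cost is mapped to an infinite score, and the function name `recursive_merge_change_score` is made up for this example.

```python
def recursive_merge_change_score(annual_sums):
    """Return (change_score, merge_costs) for one pixel's annual EVI sums."""
    segments = [float(v) for v in annual_sums]
    if len(segments) < 2:
        raise ValueError("need at least two annual EVI sums")

    merge_costs = []
    while len(segments) > 1:
        # Cost of merging each pair of consecutive segments
        # (assumption: dist is the absolute difference).
        costs = [abs(segments[i] - segments[i + 1])
                 for i in range(len(segments) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        merge_costs.append(costs[i])
        # Replace the most similar pair with its average (step 2).
        segments[i:i + 2] = [(segments[i] + segments[i + 1]) / 2.0]

    smallest = min(merge_costs)
    # Assumption: identical years give a zero cost, treated as an infinite ratio.
    score = float("inf") if smallest == 0 else max(merge_costs) / smallest
    return score, merge_costs


if __name__ == "__main__":
    print(recursive_merge_change_score([10, 11, 10, 11]))  # fluctuating pixel
    print(recursive_merge_change_score([10, 11, 4, 5]))    # pixel with a drop
```

Dividing by the minimum merge cost is what normalizes the score by the pixel's own background variability, which is also why a single unusually small cost can inflate the score.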
  • 24. Time series data mining methods  Segmentation based ◦ Recursive merging algorithm  Predictive model based ◦ Yearly Delta algorithm ◦ Variability distribution algorithm ◦ Vegetation independent variability distribution algorithm
  • 25. Input: MODIS EVI data for California and Yukon
    ◦ Data for California is at 250 m spatial resolution for the years 2006-2008.
    ◦ Data for Yukon is at 1 km spatial resolution for the years 2004-2008.
    ◦ The time series for each pixel is analyzed independently.
    Output:
    ◦ Land cover change locations (pixels)
    ◦ Time at which the change occurred
    Validation: high-quality fire data generated from an independent source is used for validation.
  • 26. Algorithm:
    ◦ The previous year is used as the model.
    ◦ A change score is assigned to each time step as the difference between the mean annual EVI of the current year and that of the previous year: $\text{change score}_{\text{current year}} = \text{annual EVI}_{\text{current year}} - \text{annual EVI}_{\text{previous year}}$
    ◦ The maximum change score across all time steps is the YD score for a location: $\text{YD score} = \max_{1 \le i \le n-1} (\text{change score}_i)$
    ◦ The top-ranked pixels according to YD score are called change points.
    Limitations:
    ◦ Does not make use of information about natural variation in EVI.
    ◦ Only the single largest change of a time series is considered, although one time series may undergo multiple changes during a given period.
  • 27. Figure: an actual change occurs in 2008; the difference in annual EVI is high.
  • 28. Figure: a change occurs in 2005 due to natural variation; the difference in annual EVI is high, but not very high compared with the mean variability score.
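A minimal sketch of the Yearly Delta score follows. It assumes the input is a list of mean annual EVI values, one per year, and it keeps the sign convention written on the slide (current year minus previous year); an analysis focused on vegetation loss might flip the sign. The function name `yd_score` is illustrative only.

```python
def yd_score(annual_evi):
    """Return (YD score, index of the year where the largest change occurs)."""
    if len(annual_evi) < 2:
        raise ValueError("need at least two years of data")
    # Change score per time step: current year minus previous year.
    deltas = [annual_evi[i] - annual_evi[i - 1]
              for i in range(1, len(annual_evi))]
    best = max(range(len(deltas)), key=deltas.__getitem__)
    # deltas[k] compares year k+1 with year k, so the change year is best + 1.
    return deltas[best], best + 1


if __name__ == "__main__":
    print(yd_score([5.0, 5.5, 5.25, 8.0]))
```

Because only the maximum is returned, a second change in the same series is invisible to this score, which is the limitation noted above.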
  • 29. Time series data mining methods  Segmentation based ◦ Recursive merging algorithm  Predictive model based ◦ Yearly Delta algorithm ◦ Variability distribution algorithm ◦ Vegetation independent variability distribution algorithm
  • 30. Algorithm:
    ◦ Each annual segment in the first k years is used in turn as a model, and the remaining k-1 annual segments are treated as the observed values.
    ◦ The mean Manhattan distance between the model and the other k-1 years is computed, which gives the distribution of variability scores for that location.
    ◦ A modified score, called the VD score, is used: $\text{VD score} = \text{YD score} - \mu$, where μ is the mean of the distribution.
    ◦ The mean is estimated using the maximum likelihood estimation method.
    Special features:
    ◦ Makes use of information about natural variation in EVI.
    ◦ Any year whose annual EVI deviates significantly from the mean annual EVI of the k years should be discarded.
    Limitation:
    ◦ Some vegetation types, such as open shrublands, show large variations in the spread of annual variability.
  • 31. Figure: scatter plot of mean variability against YD score for forest cover, showing the change point and lines of constant YD score and constant VD score (Courtesy: Mithal et al. [19]). Because only one vegetation type (forest) is considered, YD also performs well.
  • 32. Figure: scatter plot of mean variability against YD score for savannas, showing lines of constant YD score and constant VD score (Courtesy: Mithal et al. [19]). Savannas consist of trees, shrubs, grasses etc. Different vegetation types need different threshold change scores before a change counts as an actual change. Therefore, VD performs better than the YD algorithm.
  • 33. Figure: scatter plot of mean variability against YD score for shrublands, showing lines of constant YD score and constant VD score (Courtesy: Mithal et al. [19]). Because open shrublands show a different spread of variability at different locations even though the vegetation type is the same, both YD and VD produce many false positives.
  • 34. Time series data mining methods  Segmentation based ◦ Recursive Merging Algorithm  Predictive model based ◦ Yearly Delta Algorithm ◦ Variability Distribution Algorithm ◦ Vegetation Independent Variability Distribution Algorithm
  • 35. Algorithm:
    ◦ The mean and standard deviation of the variability score distribution are estimated as maximum likelihood estimates.
    ◦ A new score, called the VID score, is calculated as follows: $\text{VID score} = \dfrac{\text{YD score} - \mu}{\sigma}$
    Salient features:
    ◦ Takes into account the spread of the variability score distribution and therefore reduces false alarm rates.
    ◦ A high VID score implies a lower false positive rate and vice versa.
  • 36. Figure: variability score curves for pixel 1 and pixel 2 plotted against mean annual EVI, with the area of each curve in which the variability score indicates change marked for that pixel. Both pixels correspond to the shrub vegetation type, whose spread of variability score varies from location to location and from time to time.
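The relationship between the VD and VID scores can be illustrated with a short sketch. It assumes the variability scores for a location have already been computed and are on the same scale as the YD score; `vd_and_vid` is a hypothetical helper name. `statistics.pstdev` divides by n, which matches the maximum likelihood estimate of the standard deviation.

```python
from statistics import fmean, pstdev


def vd_and_vid(yd, variability_scores):
    """Return (VD score, VID score) for one location."""
    mu = fmean(variability_scores)
    sigma = pstdev(variability_scores)  # divides by n (MLE), not n - 1
    vd = yd - mu
    # Assumption: a location with zero spread gets an infinite VID score.
    vid = vd / sigma if sigma > 0 else float("inf")
    return vd, vid


if __name__ == "__main__":
    # Two hypothetical shrubland pixels: same mean variability, different spread.
    print(vd_and_vid(4.0, [1.0, 2.0, 3.0]))
    print(vd_and_vid(4.0, [0.0, 2.0, 4.0]))
```

Both pixels get the same VD score, but the pixel with the wider spread gets a lower VID score, which is how the normalization by σ suppresses false alarms in high-variability locations.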
  • 37. Maximum likelihood estimation (MLE)
    ◦ Every model is specified by its parameters.
    ◦ MLE is a parameter estimation method that finds the parameter values of a model that best fit the data.
    ◦ Because fluctuations in the variability score of a particular vegetation type are normally distributed at a location, the parameters are estimated for a normal distribution.
    ◦ The mean and standard deviation are the parameters of the normal distribution.
    Calculation of mean and standard deviation using MLE:
    ◦ Let f(y|w) denote the probability density function (PDF) that specifies the probability of observing the data vector y given the parameter vector w.
    ◦ If the individual observations yi are independent of each other, the PDF for the data y = (y1, ..., yn) given w can be expressed as the product of the individual PDFs: $f(y = (y_1, \ldots, y_n) \mid w) = f_1(y_1 \mid w)\, f_2(y_2 \mid w) \cdots f_n(y_n \mid w)$
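  • 38. The PDF for one observation is $P(x_i) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}$
    The PDF for multiple independent observations is $f(x_1, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^{n} \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} = (2\pi\sigma^2)^{-n/2}\, e^{-\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma^2}}$
    Taking the log on both sides: $\ln(f) = -\dfrac{n}{2}\ln(2\pi) - n\ln(\sigma) - \dfrac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$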
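  • 39. For the model to best fit the data, the value of the parameter vector should maximize the likelihood, that is, the PDF evaluated at the observed data.
    The partial derivative of the log-likelihood with respect to each component of the parameter vector should be zero:
    $\dfrac{\partial \ln(f)}{\partial \mu} = \dfrac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \;\Rightarrow\; \mu = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
    $\dfrac{\partial \ln(f)}{\partial \sigma} = -\dfrac{n}{\sigma} + \dfrac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \mu)^2 = 0 \;\Rightarrow\; \sigma^2 = \dfrac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$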
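As a numerical check of the derivation, the sketch below computes the closed-form estimates for a normal distribution and evaluates the log-likelihood so that nearby parameter values can be compared. The sample values and the function names `normal_mle` and `log_likelihood` are illustrative assumptions, not taken from the slides.

```python
import math
from statistics import fmean


def normal_mle(xs):
    """Closed-form MLE for a normal distribution: (mu, sigma)."""
    n = len(xs)
    mu = fmean(xs)
    # Note the division by n, not n - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return mu, sigma


def log_likelihood(xs, mu, sigma):
    """Log-likelihood of xs under a normal distribution N(mu, sigma^2)."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi)
            - n * math.log(sigma)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))


if __name__ == "__main__":
    scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up variability scores
    mu, sigma = normal_mle(scores)
    print(mu, sigma, log_likelihood(scores, mu, sigma))
```

Perturbing either estimate lowers the log-likelihood, which is the property the zero-derivative conditions express.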
  • 40. Comparison of the three algorithms
    ◦ Yearly Delta algorithm: does not consider the type of vegetation. The same YD score may be an actual change for forests but not for savannas or shrublands. Considers neither the average change score (μ) nor the degree of variability (σ). $\text{YD score} = \max_{1 \le i \le n-1}(\text{annual EVI}_{\text{current year}} - \text{annual EVI}_{\text{previous year}})$
    ◦ Variability distribution algorithm: considers the type of vegetation. The same VD score may be an actual change for regions such as savannas (which have less variation in variability) but not for shrublands (which have high variation in variability). Considers only the average change score (μ). $\text{VD score} = \text{YD score} - \mu$
    ◦ Vegetation independent variability distribution algorithm: considers the type of vegetation. The VID score works for all vegetation types. Considers both the average change score (μ) and the degree of variability (σ). $\text{VID score} = (\text{YD score} - \mu) / \sigma$
  • 41. Where TPn = number of true positives, FPn = number of false positives, M = total number of pixels considered
  • 42. Figure legend: green line = YD score, red line = VD score, black line = VID score. VD and VID give better results than YD. Reason: the graph corresponds to the forest region only.
    ◦ The MODIS forest map was used to select forest cover pixels.
    ◦ This map is inaccurate and includes some shrubs and agricultural land labeled as forests.
  • 43. Figure legend: green line = YD score, red line = VD score, black line = VID score. VID performs slightly worse than VD. Reason: the first few years selected to model variability may contain noise, so the mean variability for that location is modeled as high and changes in later years go undetected.
  • 44. Figure legend: green line = YD score, red line = VD score, black line = VID score. VID performs best. Reason: shrubs are the dominant land cover type in California, and they show high variability in the spread of the variability score because of their higher sensitivity to climatic variations.
  • 45. Figure legend: green line = YD score, red line = VD score, black line = VID score. YD performs exceptionally poorly and VID performs exceptionally well. Reason: the spread of the variability score varies strongly between locations whose vegetation type is shrubs.
  • 46.  Anomaly detection ◦ Land cover change detection ◦ Outlier precipitation detection ◦ Outlier time interval detection  Detection of water fraction per flood pixel  Detection of forest cover per pixel
  • 47. Input:
    ◦ South American precipitation dataset in NetCDF, a geoscience data format
    ◦ Dataset parameters (variable: value): Num Year Periods: 10; Year Range: 1995-2004; Grid Size: 2.5º×2.5º; Num Latitudes: 31; Num Longitudes: 23; Total Grids: 713
    Output:
    ◦ The top k=5 outliers are found for every year.
    ◦ A total of 155 outlier sequences were found over the period of 10 years.
    Running time of the algorithm is 229 s.
  • 48.  Aim: ◦ To find and track the position of outliers with time  Method description: ◦ Top k outliers are found for every year using Exact-Grid Top-k algorithm ◦ Outliers are tracked using the OutStretch algorithm ◦ The outlier sequences generated are analyzed  How to find the outlier (Exact-Grid Top-k algorithm) ◦ Concept of discrepancy is used ◦ Discrepancy value is assigned to each rectangular region using Kulldorff’s scan statistic. ◦ Top-k outliers are selected for further processing as it is necessary in order to track the outliers
  • 49. How to calculate the discrepancy?
    ◦ Two parameters are required:
      ◦ a measurement m (number of incidences of an event)
      ◦ a baseline b (total population at risk)
    ◦ The measurement M and baseline B values for the whole dataset (U) are calculated as $M = \sum_{p \in U} m(p)$ and $B = \sum_{p \in U} b(p)$
    ◦ The measurement and baseline fractions for a region (R) are calculated as $m_R = \dfrac{\sum_{p \in R} m(p)}{M}$ and $b_R = \dfrac{\sum_{p \in R} b(p)}{B}$
    ◦ The discrepancy score of the shaded area is calculated using the formula $d(m_R, b_R) = m_R \log\left(\dfrac{m_R}{b_R}\right) + (1 - m_R)\log\left(\dfrac{1 - m_R}{1 - b_R}\right)$
    ◦ For the above figure, M = 6, B = 16, mR = 4/6, and bR = 4/16
  • 50. Outstretch algorithm
    ◦ The outlier region of the previous year is stretched on each side.
    ◦ Each outlier in the current year is examined to see whether it lies in the region made up of the stretched region and the outlier region of the previous year.
    ◦ If it does, it is added to the child list of the previous year's outlier.
    RecurseNode algorithm:
    ◦ All sequences that start at the root node of a tree and end at a leaf node are fetched.
    Figure labels: outlier region of previous year; stretched region.
  • 51. Figure: forest built by applying the Outstretch algorithm recursively. The nodes (1,1), (2,2) and (3,2) correspond to one outlier sequence.
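The worked numbers on this slide can be reproduced with a small sketch. The 4×4 grids below are a made-up layout chosen only so that M = 6, B = 16 and the selected 2×2 region contains 4 measurements and 4 baseline units; the natural logarithm is assumed, and the function names are illustrative.

```python
import math


def region_fractions(m_grid, b_grid, r0, r1, c0, c1):
    """Return (m_R, b_R) for the rectangle rows r0..r1, columns c0..c1 (inclusive)."""
    total_m = sum(map(sum, m_grid))
    total_b = sum(map(sum, b_grid))
    cells = [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    m_in = sum(m_grid[r][c] for r, c in cells)
    b_in = sum(b_grid[r][c] for r, c in cells)
    return m_in / total_m, b_in / total_b


def kulldorff_discrepancy(m_r, b_r):
    """d(m_R, b_R) for fractions strictly between 0 and 1."""
    if not (0 < m_r < 1 and 0 < b_r < 1):
        raise ValueError("fractions must lie strictly between 0 and 1")
    return (m_r * math.log(m_r / b_r)
            + (1 - m_r) * math.log((1 - m_r) / (1 - b_r)))


if __name__ == "__main__":
    # Hypothetical grids: 6 events in total, 4 of them in the top-left 2x2 block.
    m_grid = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    b_grid = [[1] * 4 for _ in range(4)]
    m_r, b_r = region_fractions(m_grid, b_grid, 0, 1, 0, 1)
    print(m_r, b_r, kulldorff_discrepancy(m_r, b_r))
```

An exact top-k search evaluates this score for every rectangular region of the grid, which is what makes the exact approach expensive on larger grids.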
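The linking and path-extraction steps can be sketched as follows. Several details are assumptions made for illustration: outlier regions are axis-aligned boxes in grid coordinates, the stretch is a fixed margin `r` on every side, "lies in" is interpreted as overlapping the stretched box (a stricter containment test would be a one-line change), and the class and function names are invented here.

```python
from dataclasses import dataclass, field


@dataclass
class Outlier:
    year: int
    rank: int
    box: tuple  # (x_min, y_min, x_max, y_max) in grid cells
    children: list = field(default_factory=list)


def stretch(box, r):
    x0, y0, x1, y1 = box
    return (x0 - r, y0 - r, x1 + r, y1 + r)


def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def outstretch(outliers_by_year, r=1):
    """Link outliers across consecutive years; return the root nodes."""
    years = sorted(outliers_by_year)
    has_parent = set()
    for prev_year, year in zip(years, years[1:]):
        if year != prev_year + 1:
            continue  # only consecutive years are linked
        for parent in outliers_by_year[prev_year]:
            region = stretch(parent.box, r)
            for child in outliers_by_year[year]:
                if overlaps(region, child.box):
                    parent.children.append(child)
                    has_parent.add(id(child))
    return [o for y in years for o in outliers_by_year[y]
            if id(o) not in has_parent]


def sequences(node, prefix=()):
    """All root-to-leaf paths below node, as tuples of (year, rank)."""
    path = prefix + ((node.year, node.rank),)
    if not node.children:
        return [path]
    return [seq for child in node.children for seq in sequences(child, path)]
```

A usage example with hypothetical boxes: linking `Outlier(1, 1, (0, 0, 1, 1))`, `Outlier(2, 2, (2, 1, 3, 2))` and `Outlier(3, 2, (4, 2, 5, 3))` with `r=1` yields the single sequence `((1, 1), (2, 2), (3, 2))`.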
  • 52.  Anomaly detection ◦ Land cover change detection ◦ Outlier precipitation detection ◦ Outlier time interval detection  Detection of water fraction per flood pixel  Detection of forest cover per pixel
  • 53.  Input: ◦ Sea surface temperature (SST) data of Equatorial Pacific Ocean. ◦ The data consisted of measurements of sea surface temperature for 44 sensors in Pacific Ocean ◦ Each sensor had a time series of 1440 data points.  Output: ◦ Time intervals where spatial neighborhood has shown abnormal behavior.
  • 54. Terms:
    ◦ Spatial distance (sd): distance between two locations p and q based on the distance between their spatial coordinates: $sd = \sqrt{(s_{p_x} - s_{q_x})^2 + (s_{p_y} - s_{q_y})^2}$
    ◦ Measurement distance (md): distance between two locations p and q based on the difference between their features: $md = \sqrt{\sum_{m} (s_{p a_m} - s_{q a_m})^2}$, where $a_m$ is the m-th measured attribute
    ◦ Spatial neighborhood: cluster of locations such that the spatial distance (sd) and the measurement distance (md) between every two locations are less than their respective threshold values.
    ◦ Sum of squared error (SSE): measure of the degree of abnormality of an interval: $SSE = \sum_{bn=1}^{BN} dist(val_{bn}, \mu_{int})^2$, where $val_{bn}$ is each temporal reading in the base interval and μ is the mean of the temporal readings
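The three quantities can be written down directly. This sketch assumes Euclidean forms for both distances and a plain difference for `dist` in the SSE; the function names and the threshold parameters `sd_max` and `md_max` are illustrative.

```python
import math


def spatial_distance(p_xy, q_xy):
    """Euclidean distance between the coordinates of two locations."""
    return math.hypot(p_xy[0] - q_xy[0], p_xy[1] - q_xy[1])


def measurement_distance(p_attrs, q_attrs):
    """Euclidean distance between the measured attributes of two locations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_attrs, q_attrs)))


def are_neighbors(p_xy, q_xy, p_attrs, q_attrs, sd_max, md_max):
    """True if both distances are below their thresholds."""
    return (spatial_distance(p_xy, q_xy) < sd_max
            and measurement_distance(p_attrs, q_attrs) < md_max)


def interval_sse(readings):
    """Sum of squared deviations of the readings from their mean."""
    mu = sum(readings) / len(readings)
    return sum((v - mu) ** 2 for v in readings)


if __name__ == "__main__":
    print(spatial_distance((0, 0), (3, 4)))
    print(interval_sse([1.0, 2.0, 3.0]))
```

A high SSE means the readings inside an interval are widely spread around their mean, which is the sense in which it measures abnormality.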
  • 55.  Aim: ◦ To find time intervals where spatial neighborhoods are likely to show abnormal behavior.  Algorithm: ◦ Time series is first divided into a set of base equal size temporal intervals ◦ Spatial neighborhoods are found for every base interval ◦ Each of spatial node in every base interval is analyzed and binary classified as 1 if showing abnormal behavior or 0 otherwise ◦ Count of spatial nodes having a binary error classification of 1 is found for every base interval and this count is called vote count. ◦ A threshold mv is then applied and those intervals for which votes > mv are binary classified as 1 and others as 0.
  • 56. ◦ Consecutive base intervals which have same binary classification are merged to form the larger intervals. ◦ Mean value for each edge is calculated for every interval. ◦ Spatial neighborhoods are calculated for each interval using the mean value of edge.
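The voting and merging steps lend themselves to a compact sketch. It assumes the per-node binary error classifications are already available as one list of 0/1 flags per base interval; how a node is flagged as abnormal is outside this example, and the function names are made up.

```python
from itertools import groupby


def classify_intervals(node_flags_per_interval, mv):
    """Label a base interval 1 if its vote count exceeds the threshold mv."""
    return [1 if sum(flags) > mv else 0 for flags in node_flags_per_interval]


def merge_intervals(labels):
    """Merge consecutive base intervals with equal labels.

    Returns (first_index, last_index, label) for each merged interval.
    """
    merged = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        merged.append((start, start + length - 1, label))
        start += length
    return merged


if __name__ == "__main__":
    flags = [[0, 0, 1], [1, 1, 0], [1, 1, 1], [0, 0, 0]]  # 4 intervals, 3 nodes
    labels = classify_intervals(flags, mv=1)
    print(labels, merge_intervals(labels))
```

After merging, the mean edge values and the spatial neighborhoods would be recomputed per merged interval, as the slide describes.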
  • 57. Agglomerative temporal intervals for SST data Location : 0ºN latitude and 110ºW longitude Time period : 10 day period from 01/01/2004 to 01/10/2004 No of measurements: approx.1400
  • 58. Neighborhood (a) represents cooler water. Neighborhood (b) represents warmer water. Neighborhoods (c) and (d) represent moderate water.
    ◦ Edge clustering is validated by a satellite image of SST.
    ◦ Light regions represent cooler temperatures.
    ◦ Dark regions represent warmer temperatures.
  • 59. Neighborhood quality for each interval SSE of neighborhood (a) shows interesting pattern between intervals 16 and 19 SSE goes from high to low and then back from low to high
  • 60. Neighborhood (a) has more spread during 16th interval as compared to 17th interval
  • 61. Input:
    ◦ Land cover type
    ◦ 8-day composite surface reflectance for the VIS band (CH1) and the NIR band (CH2)
    ◦ CH2-CH1
    ◦ CH2/CH1
    ◦ NDVI dataset
    ◦ Data from before the flooding in the Mississippi basin is used as the training dataset.
    ◦ Data from after the flooding in the Mississippi basin, which occurred on June 17-19, 2008, is used for testing.
    Output:
    ◦ The best attribute (R) for classification, which is CH2-CH1.
    ◦ The threshold values of the best attribute (R) for pure water and pure land.
    Validation data:
    ◦ Landsat TM imagery at 30 m spatial resolution
  • 62. Aim:
    ◦ To find the fraction of water in flood pixels, which are usually water mixed with land cover features, for the MODIS dataset, which has coarse resolution.
    Method description:
    ◦ A decision tree approach is used to find
      ◦ the best parameter (predictor) for differentiating between land and water
      ◦ the threshold values of the predictor R for pure water (Rwater) and pure land (Rland)
    ◦ The water fraction per pixel is found by comparing the actual value of the predictor with its values for pure water and pure land: $R_{mix} = WF \cdot R_{water} + (1 - WF) \cdot R_{land}$, which gives $WF = \dfrac{R_{land} - R_{mix}}{R_{land} - R_{water}} \times 100$
  • 63. Experimental Results:  Some of the rules used for deciding threshold values are : ◦ (CH2-CH1) > 9.17 -> class Land ◦ (CH2-CH1) <= 2.91 -> class Water  Correlation between TM and MODIS water fractions is 0.97 with bias of 4.47% and standard deviation of 4.4%.
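A short sketch of the linear mixing calculation, using the two CH2-CH1 thresholds from the rules above as the pure-land and pure-water end members. Clamping the result to the range 0 to 100 is an assumption of this example, as is the function name `water_fraction`.

```python
R_LAND = 9.17   # (CH2-CH1) above this value is classified as land
R_WATER = 2.91  # (CH2-CH1) at or below this value is classified as water


def water_fraction(r_mix, r_land=R_LAND, r_water=R_WATER):
    """Water fraction of a mixed pixel in percent, from linear mixing."""
    if r_land == r_water:
        raise ValueError("land and water end members must differ")
    wf = (r_land - r_mix) / (r_land - r_water) * 100.0
    # Assumption: values outside the end members are clamped to pure classes.
    return min(100.0, max(0.0, wf))


if __name__ == "__main__":
    for r in (1.0, 6.04, 12.0):
        print(r, water_fraction(r))
```

A pixel whose CH2-CH1 value lies halfway between the two thresholds comes out as roughly half water under this model.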
  • 64. Decision tree created using C4.5 algorithm
  • 65.  Anomaly detection ◦ Land cover change detection ◦ Outlier precipitation detection ◦ Outlier time interval detection  Detection of water fraction per flood pixel  Detection of forest cover per pixel
  • 66.  Input : ◦ Land surface temperature 5-monthly composited MOD11C3 product ◦ NDVI and EVI from monthly composited MOD13C2 product ◦ Land cover type from MCD12C1 yearly product  Output: ◦ Fraction of forest cover per pixel  Validation: ◦ Forest cover information from PRODES data at 90 m resolution available in GeoTiff format is used for validation purposes
  • 67.  Aim : ◦ To find the forest cover per pixel for MODIS dataset having coarse resolution ◦ The data values for parameters like NDVI, EVI, land surface temperature etc are available per pixel. ◦ Therefore, value is affected by vegetation cover of every point covered in that pixel. ◦ Same parameter value may correspond to different fraction of forest cover depending on vegetation type for whole area per pixel.  Algorithm: ◦ Modification of Leeuwen et al. approach. ◦ Leeuwen et al. approach gives the single logistic regression model for all vegetation types. ◦ But, improved algorithm considers vegetation type and gives independent logistic regression model for each vegetation type
  • 68. Leeuwen et al. approach
    Terms:
    ◦ pit: fraction of forest cover for pixel i in year t (generated from the analysis of high-resolution Landsat TM images)
    ◦ Xit: vector of MODIS observations for pixel i in year t
    ◦ β: vector of model parameters (estimated from a set of training data)
    ◦ The vectors Xit and β each have three components:
      ◦ the first corresponds to a constant intercept term,
      ◦ the second to an NDVI measurement,
      ◦ and the third to an LST measurement.
    Model: $\ln\left(\dfrac{p_{it}}{1 - p_{it}}\right) = \beta^{T} X_{it}$
  • 69. Learning independent regression models requires segmenting the observation space into multiple categories.
    ◦ Segmentation is done by partitioning the feature space, which is an n-dimensional space with one feature per axis.
    ◦ Features are selected based on their ability to differentiate between vegetation types.
    ◦ For example: forests show a high inter-annual NDVI and EVI mean and a low inter-annual LST mean, while the intra-annual variance of NDVI, EVI and LST is low.
    ◦ Therefore, mean (μ) and variance (σ²) are selected as features.
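Inverting the logit model gives the predicted forest fraction directly. In the sketch below the coefficient values are invented purely for illustration and are not fitted parameters from the cited work; the function names are likewise hypothetical.

```python
import math


def forest_fraction(beta, x):
    """Predicted forest fraction p, where ln(p / (1 - p)) = beta^T x."""
    z = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    beta = (-2.0, 6.0, -0.005)  # hypothetical: intercept, NDVI weight, LST weight
    x = (1.0, 0.8, 300.0)       # constant term, NDVI, LST in kelvin
    p = forest_fraction(beta, x)
    print(p, logit(p))
```

A vegetation-specific variant would keep one `beta` per vegetation type and select it from the pixel's segment in feature space before calling `forest_fraction`.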
  • 70. Figure: vegetation type distribution in the feature space (μ, σ²) of NDVI. Forests show high inter-annual mean and low intra-annual variance. Farms show high intra-annual variance due to crop cycles. Grasslands show high intra-annual variance and high inter-annual mean. Water locations show high intra-annual variance and low inter-annual mean.
  • 71. Analysis of the partition corresponding to the forest vegetation type. Figure: scatter plot of the residual of the baseline approach against the residual of the vegetation-specific approach. The residual of the vegetation-specific approach has lower magnitude than that of the baseline model. Therefore, the vegetation-specific approach is better than the baseline model.
  • 72. Analysis of the partition corresponding to the cropland vegetation type. The residual of the vegetation-specific model is lower in magnitude than that of the baseline model.
  • 73.  Introduction  Literature Review  Conclusion  References
  • 74.  Various research works related to anomaly detection and detection of water fraction or forest cover per pixel have been discussed.  Most of the research works are pixel-based and do not consider the spatial neighborhood of a pixel.  Domain knowledge is also required along with data mining techniques  Future works should work towards addressing these limitations.
  • 75.  Introduction  Literature Review  Conclusion  References
  • 76.  [1] Jonathan T. Overpeck, Gerald A.Meehl, Sandrine Bony, David R. Easterling, D. (2011) ,"Climate data challenges in the 21st century," in Science, 2011.  [2] James H. Faghmous and Vipin Kumar," Spatio-temporal data mining for climate data : Advances, Challenges and Opportunities," in Data Mining and Knowledge Discovery for Big Data, 2014.  [3] Donglian Sun, Yunyue Yu, Mitchell D. Goldberg ,"Deriving Water Fraction and Flood Maps From MODIS Images Using Decision Tree Approach," in IEEE Journal of Selected Topics In Applied Earth Observations And Remote Sensing , 2011.  [4] Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster," Land Cover Change Detection : A Case Study," in Knowledge Discovery in Databases Proceedings, 2008.  [5] Elizabeth Wu, Wei Liu, Sanjay Chawla ,"Spatio-Temporal Outlier Detection in Precipitation Data," in Knowledge Discovery in Databases Proceedings, 2008.  [6] Hong Yeon Cho, Ji Hee Oh, Kyeong Ok Kim, and Jae Seol Shim, "Outlier Detection and missing data filling methods for coastal water temperature data," in Journal of Coastal Research, 2013.
  • 77.  [7] C.T.Dhanya and D.Nagesh Kumar, " Data mining for evolution of association rules for droughts and floods in India using climate inputs," in Journal of Geophysical Research, 2009.  [8] Ruixin Yang, Jiang Tang, and Donglian Sun, " Association Rule Data Mining Applications for Atlantic Tropical Cyclone Intensity Changes," in Journal of American Meteorological Society, 2011.  [9] James H.Faghmous, Yashu Chamber, Shyam Boriah, Stefan Liess, Vipin Kumar, "A novel and scalable spatio-temporal technique for ocean eddy monitoring," in Association for Advancement of Artificial Intelligence, 2012.  [10] Imran Maqsood, Muhammad Riaz Khan, and Ajith Abrahim, "An ensemble of neural networks for weather forecasting," in Neural Computing & Applications , 2004.  [11] Agboola A.H., Gabriel A.J., Aliyu E.O., Alese B.K., "Development of a Fuzzy Logic Based Rainfall Prediction Model," in International Journal of Engineering and Technology, 2013.  [12] Christopher G.Healey, "On the Use of Perceptual Cues and Data Mining for Effective Visualization of Scientific Datasets," in Proceedings Graphics Interface, 1998.
  • 78.  [13] Wenwen Li, Chaowei Yang, Donglian Sun, "Mining geophysical Parameters through Decision Tree Analysis to Determine Correlation with Tropical Cyclone Development," in Computers & Geosciences, 2008.  [14]Pinky Saikia Dutta, and Hitesh Tahbilder, "Prediction of Rainfall using Data mining Technique over Assam," in Indian Journal of Computer Science and Engineering (IJCSE),2014.  [15]Anuj Karpatne, Mace Blank, Michael Lau, Shyam Boriah, Karsten Steinhaeuser, Michael Steinbach and Vipin Kumar," Importance of Vegetation Type in Forest Cover," in Intelligent Data Understanding, 2012.  [16] James H.Faghmous, Mathew Le, Muhammed Uluyol, Vipin Kumar and Snigdhansu Chatterjee, "A parameter-free spatio-temporal pattern mining model to catalog ocean dynamics," in IEEE 13th International Conference on Data Mining, 2013.  [17] Rie Honda and Osamu Konishi, "Temporal rule discovery for Time-Series Satellite Images and Integration with RDB," in Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science ,2001.
  • 79. [18] Pol R. Coppin and Marvin E. Bauer, "Change Detection in Forest Ecosystems with Remote Sensing Digital Imagery," in Remote Sensing Reviews, 1996.  [19] Varun Mithal, Ashish Garg, Ivan Brugere, Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven Klooster, "Incorporating Natural Variation Into Time Series-Based Land Cover Change Identification," in Proceedings of the NASA Conference on Intelligent Data Understanding, 2011.  [20] Varun Mithal, Shyam Boriah, Ashish Garg, Michael Steinbach, Vipin Kumar, "Monitoring Global Forest Cover Using Data Mining," in Journal of Association for Computing Machinery, Volume V, 2010.  [21] D. Agarwal, A. McGregor, J. M. Phillips, S. Venkatasubramanian, and Z. Zhu, "Spatial Scan Statistics: Approximations and Performance Study," in Knowledge Discovery in Databases Proceedings, 2006.  [22] Michael P. McGuire, Vandana P. Janeja, Aryya Gangopadhyay, "Spatiotemporal Neighborhood Discovery for Sensor Data," in Knowledge Discovery in Databases Proceedings, 2008.  [23] Thijs T. van Leeuwen, Andrew J. Frank, Yufang Jin, Padhraic Smyth, Michael L. Goulden, Guido R. van der Werf and James T. Randerson, "Optimal use of land surface temperature data to detect changes in tropical forest cover," in Journal of Geophysical Research: Biogeosciences, 2011.

Editor's notes

  1. Example for concrete objects Uncertainty due to presence of clouds or others
  2. Explanation for the HDF file
  3. Basic idea about these
  4. Residual – difference between actual value of x and predicted value of x.