2. It Is A Bit Too Easy …
• Very easy access to lots of
occurrence data
• Very easy access to rich
geospatial data
• Easy-to-use modeling tools
• Lots of literature setting out
the examples
3. Ecological Niche Modeling
1. Accumulate Input Data
2. Integrate Occurrence and Environmental Data
3. Model Calibration
4. Model Evaluation
5. Summary and Interpretation
4. Accumulate Input Data
Collate primary biodiversity data documenting occurrences
Process environmental layers to be maximally relevant to distributional ecology of species in question
Collate GIS database of relevant data layers
Assess spatial precision of occurrence data; adjust inclusion of data accordingly
Data subsetting for model evaluation
Occurrence and environmental data
Assess spatial autocorrelation
5. Occurrence Data in Niche Modeling
• Goal is to represent the full diversity of
situations under which a particular species
maintains populations
• Spatial biases (i.e., non-random or non-uniform
distribution within G) are not damning
• Biases within E are catastrophic, and will
translate directly into biases in any niche
estimate
• More is usually better, but not always…
8. Georeferencing should …
• Represent the place at which the species was
found
• Represent the certainty and uncertainty with
which that place is characterized
• Summarize the methods used to establish that
place
• Preserve all of the original information for
possible reinterpretation
12. Data Cleaning
• Attempt to detect meaningfully erroneous records,
so that they can be treated with caution in analysis
• Use internal consistency to detect initial problems
– Species names consistent?
– Terrestrial species on land, marine species in the ocean?
– Lat/long matches country, state, district, etc.?
• Use external consistency to go deeper
– Occurrence data match known distribution spatially?
– Occurrence data match known distribution
environmentally?
• If precision data are available, filter to retain only
records that are precise enough for the study
• Iterative process with important consequences
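The internal-consistency checks above can be sketched as a simple record filter. This is only an illustration: the record fields (`species`, `lat`, `lon`, `uncertainty_m`), the function name `flag_record`, and the specific checks are hypothetical, and real data sets will need checks suited to their own structure.

```python
from typing import Optional

def flag_record(species: str, lat: Optional[float], lon: Optional[float],
                uncertainty_m: Optional[float], accepted_names: set,
                max_uncertainty_m: float) -> list:
    """Return reasons to treat a record with caution (empty list if none)."""
    flags = []
    # Species names consistent? (hypothetical accepted-name list)
    if species not in accepted_names:
        flags.append("name not in accepted list")
    # Basic coordinate sanity checks
    if lat is None or lon is None:
        flags.append("missing coordinates")
    elif not (-90 <= lat <= 90 and -180 <= lon <= 180):
        flags.append("coordinates out of range")
    elif lat == 0 and lon == 0:
        flags.append("coordinates at 0,0")
    # Precision filter, applied only when precision data are available
    if uncertainty_m is not None and uncertainty_m > max_uncertainty_m:
        flags.append("too imprecise for this study")
    return flags
```

Flagged records are returned with their reasons rather than deleted, so that they can be treated with caution and revisited as the cleaning is repeated.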
14. Data Subsetting
• Must respond to the question at hand … why
are you doing the study?
• Ideally completely independent data streams
• Failing that, can be
– Macrospatial
– Microspatial (but see spatial autocorrelation)
– Random
• Will return to this point later…
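Two of the fallback options can be sketched as follows. The function names are hypothetical, and splitting at the median longitude is just one possible reading of "macrospatial"; the right split depends on why the study is being done.

```python
import random

def random_split(points, test_fraction=0.5, seed=0):
    """Randomly assign points to calibration and evaluation sets."""
    rng = random.Random(seed)
    shuffled = points[:]          # copy so the input list is not modified
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (calibration, evaluation)

def macrospatial_split(points):
    """Split (lon, lat) points into western and eastern blocks at the median longitude."""
    lons = sorted(p[0] for p in points)
    median_lon = lons[len(lons) // 2]
    west = [p for p in points if p[0] < median_lon]
    east = [p for p in points if p[0] >= median_lon]
    return west, east
```

A random split leaves calibration and evaluation points interleaved in space, so it is the weakest of the options when spatial autocorrelation is present.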
15. Generalities: Environmental Data
• Raster format: i.e., information exists across
entire region of interest
• Relevant information as regards the
distributional potential of the species of
interest
• More dimensions = better (generally), BUT
– collinearity is bad
– too many dimensions is bad
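One simple way to limit collinearity is to screen variables by pairwise correlation. The sketch below is an assumption-laden illustration: the function names, the greedy keep-first rule, and the 0.8 cutoff are all hypothetical choices, not recommendations from the slides.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_collinear(layers, max_abs_r=0.8):
    """Greedily keep variables whose |r| with every already-kept variable is <= max_abs_r.

    `layers` maps a variable name to its values sampled at the same locations.
    The 0.8 default is an arbitrary illustrative cutoff.
    """
    kept = []
    for name in layers:
        if all(abs(pearson(layers[name], layers[k])) <= max_abs_r for k in kept):
            kept.append(name)
    return kept
```

Because the rule is greedy, the result depends on the order of the variables, so the most ecologically relevant layers should come first.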
16. Major Sources
• Climate data – long time span, but low
temporal resolution
• Remote-sensing data – high temporal
resolution, diverse products, short time span
• Topographic data – high spatial resolution,
uncertain connection to species’ distributional
ecology
• Soils data – uneven global coverage,
categorical data
• Others
18. Two Major Implications
• Non-independence in model evaluation
– Available data are often split into data sets for
calibration and evaluation
– Data points that are not independent of one another
may end up in different data sets, thereby
compromising the robustness of the test
• Inflation of sample sizes
– Because individual data points may be non-
independent of one another, sample sizes may appear
larger than they actually are
– This inflation may create opportunity for Type 1 errors
in model evaluation and model comparisons
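One commonly used response to non-independence, which is not named in the slides, is spatial thinning: keeping only points separated by at least some minimum distance. The sketch below assumes points are (lat, lon) pairs in decimal degrees; the function names and the greedy rule are hypothetical choices for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def thin(points, min_km):
    """Greedy thinning: keep a point only if it is >= min_km from all kept points."""
    kept = []
    for lat, lon in points:
        if all(haversine_km(lat, lon, klat, klon) >= min_km for klat, klon in kept):
            kept.append((lat, lon))
    return kept
```

The number of points remaining after thinning gives a more honest idea of the effective sample size than the raw record count.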
21. Integrate Occurrence and Environmental Data
Assess BAM scenario for species in question; avoid M-limited situations
Saupe et al. 2012. Variation in niche and distribution model performance: The need for a priori assessment of key causal factors. Ecological Modelling, 237–238, 11-22.
Estimate M and S as area of analysis in study
Barve et al. 2011. The crucial role of the accessible area in ecological niche modeling and species distribution modeling. Ecological Modelling, 222, 1810-1819.
Reduce dimensionality (PCA or correlation analysis)
Occurrence and environmental data
Occurrence and environmental data ready for analysis
27. BAM Conclusions
• Some situations are not amenable to fitting
ecological niche models that will have
predictive power
• Models tend to fit the potential distribution
much better than the actual distribution
• Must ponder carefully the BAM configuration
in a particular study situation to avoid
configurations that will not yield usable
models
34. M
• When the species has no history in an area:
– Use a radius related to dispersal distances
• When history is short (i.e., environment
constant):
– Use a radius representing compounding of
dispersal distances
• When history is long (i.e., environmental
change is a factor):
– Seek ways of assessing areas that the species’
distribution through time has covered…
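The radius-based cases can be sketched as a buffer around known occurrences. Everything here is an assumption made for illustration: coordinates are taken to be planar and in km, the function names are hypothetical, and "compounding" is read as a simple product of annual dispersal distance and years, which is only a crude upper bound.

```python
import math

def compounded_radius(annual_dispersal_km, years):
    """Crude radius for a short history: annual dispersal accumulated linearly over years."""
    return annual_dispersal_km * years

def in_m(cell_xy, occurrences_xy, radius_km):
    """True if a cell centre lies within radius_km of any occurrence (planar km coordinates)."""
    cx, cy = cell_xy
    return any(math.hypot(cx - ox, cy - oy) <= radius_km for ox, oy in occurrences_xy)
```

The long-history case is not covered by this sketch, since it calls for information about where the species' distribution has been through time rather than a fixed radius.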
37. Model Calibration
Estimate ecological niche (various algorithms)
Model calibration, adjusting parameters to maximize quality
Model thresholding
Peterson et al. 2007. Transferability and model evaluation in ecological niche modeling: A comparison of GARP and Maxent. Ecography, 30, 550-560.
Occurrence and environmental data ready for analysis
“No Silver Bullet” paper to appear
Warren, D. L. and S. N. Seifert. 2011. Ecological niche modeling in Maxent: The importance of model complexity and the performance of model selection criteria. Ecological Applications 21:335-342.
Preliminary models
47. No Silver Bullets in ENM
• Single algorithms may perform ‘best’ on average
• The best algorithm in any given situation,
however, may not be the one that is ‘best’ on average
• NSB thinking suggests that we should not use a
single approach
• Use a suite of approaches (e.g., as implemented
in OM, BIOMOD, BIOENSEMBLES, etc.), challenge
them to predict, and choose the best for that situation
• Maxent is good, but it is not the only algorithm …
50. Thresholding
• Use an approach that prioritizes omission
error over commission error, in view of the
greater reliability of presence data
• Minimum training presence thresholding
seeks the highest suitability value that
includes 100% of the calibration data
• Suggest (strongly) using a parallel approach
that seeks the highest suitability value that
includes (100-E)% of the calibration data
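Both thresholds can be computed from the suitability values at the calibration points. In this sketch, E is read as the percentage of calibration points allowed to fall below the threshold; that reading is an assumption, since the slide does not define E, and the function name is hypothetical.

```python
import math

def suitability_threshold(suitabilities, e_percent=0.0):
    """Highest threshold that still includes at least (100 - e_percent)% of calibration points.

    A point counts as included when its suitability is >= the threshold.
    With e_percent=0 this is the minimum training presence threshold.
    """
    values = sorted(suitabilities)
    # Number of lowest-suitability points that may be omitted
    n_omit = math.floor(len(values) * e_percent / 100.0)
    return values[n_omit]
```

With few calibration points, a small E may not permit omitting even one record, in which case the two thresholds coincide.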
52. Model Evaluation
Project niche model to geographic space
Model evaluation
Peterson et al. 2008. Rethinking receiver operating characteristic analysis applications in ecological niche modelling. Ecological Modelling, 213, 63-72.
Preliminary models
Reset data subsets based on evaluation results
Corroborated models ready for projection to geographic times/regions of interest
54. If predicted suitable area covers
15% of the testing area, then
15% of evaluation points are
expected to fall in the predicted
suitable area by chance.
• p = proportion of area
predicted suitable
• s = number of successes
• n = number of evaluation
points
The cumulative binomial distribution gives the probability of
obtaining at least s successes out of n trials when a proportion p
of the testing area is predicted present. If this probability
is below 0.05, we interpret the result as indicating that the
model’s predictions are significantly better than random.
Threshold-dependent approach
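The cumulative binomial probability described above can be computed directly. This is a minimal sketch using the slide's symbols p, s, and n; the function name is hypothetical, and the test is one-tailed (probability of s or more successes).

```python
from math import comb

def binomial_p_value(s, n, p):
    """P(X >= s) for X ~ Binomial(n, p): chance of at least s successes at random."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s, n + 1))

# Example: 15% of the testing area predicted suitable, 8 of 10 evaluation points correct
probability = binomial_p_value(8, 10, 0.15)
significant = probability < 0.05
```

Because the sum is exact, this is practical for the modest numbers of evaluation points typical of such tests.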
59. Significance vs Performance
• Predictions that are significantly better than
random are important, and are a sine qua non for
model interpretation
• BUT, it is also important to ensure that the
model performs sufficiently well for the
intended uses of the output
• Performance measures include omission rate,
correct classification rate, etc.
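Omission rate, for example, is simple to compute once a threshold has been chosen. In this sketch the function name is hypothetical, and a point is assumed to be omitted when its suitability falls below the threshold.

```python
def omission_rate(suitabilities_at_test_points, threshold):
    """Fraction of evaluation presences predicted unsuitable (below the threshold)."""
    omitted = sum(1 for v in suitabilities_at_test_points if v < threshold)
    return omitted / len(suitabilities_at_test_points)
```

A model can be significantly better than random and still omit too many known presences for a given use, which is why both measures are needed.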
60. Summary and Interpretation
Evaluation of model transfer results
Transfer to other situations (time and space)
Assess extrapolation (MESS and MOP)
Owens, H. L., L. P. Campbell, L. Dornak, E. E. Saupe, N. Barve, J. Soberón, K. Ingenloff, A. Lira-Noriega, C. M. Hensz, C. E. Myers, and A. T. Peterson. 2013. Constraints on interpretation of ecological niche models by limited environmental ranges on calibration areas. Ecological Modelling 263:10-18.
Refine estimate of current distribution via land use, etc.
Compare present and “other” to assess effects of change
Models calibrated and evaluated, and transferred to present and “other” situations
66. MESS and MOP
• Both have the intention of detecting extrapolative
situations
• MESS is implemented within Maxent
• MESS compares the area in question to the
centroid of the calibration cloud
• MOP compares the area in question to the
nearest part of the calibration cloud
• The two agree on ‘out of range’ conditions
• MOP better characterizes similarities between
calibration and transfer regions, and thus is more
optimistic as regards in-range extrapolation
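The contrast between comparing to the centroid and comparing to the nearest part of the calibration cloud can be illustrated with a toy calculation. This is not the Maxent MESS implementation or a published MOP implementation: the function names, the 10% default fraction, and the use of unstandardized Euclidean distance are assumptions made for illustration, and the centroid function simply follows the characterization on the slide.

```python
import math

def out_of_range(point, calibration):
    """True if any variable lies outside the calibration minimum-maximum range."""
    for i, v in enumerate(point):
        column = [c[i] for c in calibration]
        if v < min(column) or v > max(column):
            return True
    return False

def centroid_distance(point, calibration):
    """Euclidean distance from point to the centroid of the calibration cloud."""
    n = len(calibration)
    centroid = [sum(c[i] for c in calibration) / n for i in range(len(point))]
    return math.dist(point, centroid)

def nearest_fraction_distance(point, calibration, fraction=0.1):
    """Mean Euclidean distance from point to the nearest `fraction` of calibration points."""
    distances = sorted(math.dist(point, c) for c in calibration)
    k = max(1, int(len(distances) * fraction))  # always use at least one point
    return sum(distances[:k]) / k
```

A transfer point near the edge of the calibration cloud is far from the centroid but close to its nearest calibration points, which is the sense in which the nearest-part comparison is more optimistic about in-range conditions. In practice the variables would need to be standardized before distances are compared.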
68. Ecological Niche Modeling
1. Accumulate Input Data
2. Integrate Occurrence and Environmental Data
3. Model Calibration
4. Model Evaluation
5. Summary and Interpretation