Brazilian Cerrado geolinked data and qualitative models
1. Brazilian Cerrado ontology network
and qualitative models: a case study
application of geolinked data
approach to Ecology
Adriano Souza
University of Brasilia, Brazil
2013
2. This presentation
• Introduction
• Background and state of the art
• Research question
• Hypothesis
• The proposal
• Work in progress
• What is next…
2 of 41
3. Ecoinformatics
A field of research and development focused on the interface
between ecology, computer science, and information technology.
“Ecologists have recognized the need for integrated data
systems to support cross-disciplinary collaboration to
understand the basic ecological principles that govern the
biosphere.”
Green et al. (2005)
4. AI Technologies can be useful to theoretical development in Ecology:
• To organize knowledge bases compatible with computers,
including qualitative and quantitative knowledge;
• To perform fast assessment of assumptions, hypotheses and
other ideas in a theoretical context;
• To determine the consequences and the logical consistency of
long and complex paths of ecological reasoning.
Perhaps the most immediate impact of AI technologies will be on
the way ecologists organize, develop and implement models.
Rykiel (1989)
5. Why are models necessary?
• Building and using models helps to…
• Understand the structure of systems;
• Predict the behavior of systems;
• Control variables in order to obtain specific results.
• They are used in:
• Scientific research;
• Decision making and management;
• Education and training.
6. Ecological models
Building ecological models is a complex task because...
1. Ecological models are heterogeneous, including both qualitative
and quantitative knowledge;
2. It is hard to collect data and perform experiments;
3. The available data are incomplete, inaccurate, uncertain and
often expressed in qualitative terms;
4. The theoretical foundations and the laws (or first principles) are
still under development.
• New approaches to ecological modeling are required!
Qualitative Reasoning
7. Qualitative Reasoning (QR)
It is an area of artificial intelligence that creates
representations for continuous aspects of the world to
support reasoning with little information.
The use of QR models can help clarify many aspects
and improve the understanding of causal reasoning chains
involving environmental factors and changes in populations
and communities.
Salles & Bredeweg (2006)
8. Qualitative Ecological Models
Are promising because they:
• Allow simulations to be built and run with incomplete knowledge;
• Provide a rich vocabulary for describing a variety of systems;
• Represent causality explicitly, which supports explaining a
system's behavior from its structure;
• Improve the comprehension of complex systems
and support the decision-making process.
Advantages over numerical models
• Imprecise predictions, but CORRECT
• Easy exploration of alternatives
• Automatic interpretation
13. What do all savannas have in common?
“Savanna occurs over a vast range of conditions that have little in
common except for their inability to support rapid tree growth.”
(Hoffman et al., 2012)
Source: Challenges and opportunities in remote sensing of global savannas. Colorado State University.
Available at: http://www.nrel.colostate.edu/projects/srs/
14. Two main reasons why I care about the Cerrado
1. Ecological Theory
• Stability
• Biodiversity
• Equilibrium x Non-equilibrium
2. Conservation Biology
• Hotspot
15. Unstable
Disturbance-driven savannas
Disturbances such as fire, grazing and
browsing are required to maintain
both trees and grasses in the system.
Stable
Climatically determined savannas
Disturbances such as fire and
herbivory, although capable of modifying
tree-to-grass ratios, are not necessary
for coexistence.
Sankaran et al. (2005)
16. About 50% of the Cerrado is deforested
Total area: 2,047,146.35 km²
Cerrado vegetation type (%), IBGE (2004)
Savanna: 61%
Forest: 32%
Grassland: 7%
Cerrado cover type (%), IBGE (2004)
Natural: 51%
Deforested: 48%
Water: 1%
17. Cerrado + Poor, acidic, aluminum-toxic soils + State-of-the-Art Technology + Farmers =
Map legend: remaining areas; deforested area accumulated until 2008
18. Amount of soybean production (ton) in 2005 per municipality
20. Research question
How can the GeoLinked data approach be used to
integrate datasets with qualitative reasoning
ecological models, in order to improve the
understanding of ecological mechanisms and
facilitate access to and management of environmental
information?
21. Exploratory hypothesis
The use of qualitative conceptual simulation
models, combined with data sources made available
through geolinked data semantic techniques, can
improve the interpretive and predictive capacity
of the available data on the dynamics of
Cerrado vegetation.
23. GeoLinked data to retrieve appropriate models
Study A
Species: Ouratea hexasperma
Density: 73
Fire frequency: Fire Protected
+ geographical information, meteorological data, agriculture
Study B
Species: Ouratea hexasperma
Density: 83
Fire frequency: Low
+ geographical information, meteorological data, agriculture
Links: owl:sameAs (three links in the diagram)
Simulation
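The linking idea on this slide can be sketched in plain Python. This is a minimal sketch and not the author's implementation: the record layout, the `ex:` identifiers and the helpers `same_as_links` and `compare_linked` are hypothetical. Only the species name, the densities and the fire-frequency labels come from the slide.

```python
# Hypothetical records; only species, density and fire frequency come from the slide.
STUDY_A = {"uri": "ex:studyA/Ouratea_hexasperma", "species": "Ouratea hexasperma",
           "density": 73, "fire_frequency": "Fire Protected"}
STUDY_B = {"uri": "ex:studyB/Ouratea_hexasperma", "species": "Ouratea hexasperma",
           "density": 83, "fire_frequency": "Low"}

OWL_SAME_AS = "owl:sameAs"


def same_as_links(records_a, records_b):
    """Return (subject, predicate, object) triples for records naming the same species."""
    links = []
    for a in records_a:
        for b in records_b:
            if a["species"].strip().lower() == b["species"].strip().lower():
                links.append((a["uri"], OWL_SAME_AS, b["uri"]))
    return links


def compare_linked(records_a, records_b):
    """For each linked pair, report the density observed under each fire regime."""
    by_uri = {r["uri"]: r for r in records_a + records_b}
    rows = []
    for subj, _, obj in same_as_links(records_a, records_b):
        a, b = by_uri[subj], by_uri[obj]
        rows.append((a["species"], a["fire_frequency"], a["density"],
                     b["fire_frequency"], b["density"]))
    return rows


if __name__ == "__main__":
    for row in compare_linked([STUDY_A], [STUDY_B]):
        print(row)
```

Once the two studies are linked, their densities can be read side by side under different fire regimes, which is the kind of input a qualitative simulation could then use.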
24. Methodological aspects: the Life Cycle Model
Iterative incremental life cycle:
Specification
Modeling
RDF generation
Links generation
Publication
Exploitation
25. Specification
• Ontology Requirements Specification Document
1 Purpose
The purpose of the Cerrado ontology network is to represent the scientific knowledge about the ecology and dynamics of
Cerrado plants. It should express how plant communities in the Cerrado are structured and how they change over time.
2 Scope
Because of the complexity and extent of the domain, the ontologies will focus on the following subdomains:
plant community dynamics and fire.
3 Implementation Language
The ontologies will be developed in the Web Ontology Language (OWL), since it is a W3C recommendation for the
Semantic Web.
4 Intended End-Users
1. Researchers and scientists seeking to understand the functioning of savanna plant communities.
2. Environment managers of conservation units and those responsible for making public policies for environment and
biodiversity conservation.
3. Users of ecological and environmental information and data about the Brazilian Cerrado.
5 Intended Uses
1. To store data and provide information about diversity, composition and dynamics of Cerrado woody plants.
2. To propose a standard and a management practice for the data available about the Cerrado vegetation.
3. To propose a service in which the user can search for species and their locations and assess the changes in populations over time.
6 Ontology Requirements
a. Non-Functional Requirements
NFR1: The ontology network must support a multilingual scenario for Portuguese and English.
b. Functional Requirements: Groups of Competency Questions
The functional requirements were specified with the competency questions technique (Gruninger and Fox, 1994), as recommended by
the NeOn methodology.
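The multilingual requirement NFR1 can be illustrated with a small label check. This is only a sketch under stated assumptions: the term identifiers, the label table and the Portuguese translations are hypothetical and are not taken from the ontologies themselves.

```python
# Hypothetical label table; term IDs and translations are illustrative only.
LABELS = {
    "ex:Fire": {"en": "Fire", "pt": "Fogo"},
    "ex:Savanna": {"en": "Savanna", "pt": "Savana"},
    "ex:PlantCommunity": {"en": "Plant community", "pt": "Comunidade vegetal"},
}
REQUIRED_LANGS = ("pt", "en")


def label(term, lang, fallback="en"):
    """Return the label of a term in the requested language, or the fallback label."""
    names = LABELS[term]
    return names.get(lang, names[fallback])


def missing_labels(labels, required=REQUIRED_LANGS):
    """List (term, language) pairs that lack a label, i.e. violations of NFR1."""
    return [(term, lang)
            for term, names in labels.items()
            for lang in required
            if lang not in names]


if __name__ == "__main__":
    print(label("ex:Fire", "pt"))
    print(missing_labels(LABELS))
```

A check like `missing_labels` could be run before each release so that no term ships with a label in only one of the two languages.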
26. Specification
• Competency Questions
# Competency Questions for CCOn
1 What is a biome?
2 What is a savanna?
3 What characterizes a savanna?
4 What are the determining factors of savannas?
5 What is Cerrado?
6 What characterizes the Cerrado?
7 What is a population?
8 What is population growth?
9 Which processes determine the size of a population?
10 What is mortality?
11 What is natality?
12 What is an ecological community?
13 What are the types of ecological communities?
14 What is biodiversity?
15 What is the species richness of a community?
16 What factors determine the species richness of a community?
17 What is a plant community?
18 What are the types of plant communities?
19 What are woody plants?
20 What are herbaceous plants?
21 What are the main measurements of biological diversity of a community or ecosystem?
# Competency Questions for Fire Ontology
65 Where do wildfires occur in the Cerrado?
66 How often does Cerrado vegetation burn?
67 In what period of the year do wildfires occur in the Cerrado?
68 What is a wildfire?
69 Where are the places with a similar temperature range located?
70 Where are the places with the maximum temperature at a given time located?
71 What are the types of wildfires?
72 What is the severity of each burn event in Cerrado?
73 What is the temperature of a location at a given time?
74 What is the relative humidity of a location at a given time?
75 What causes a burn event?
76 What are the effects of a burn event?
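A competency question is useful when it can be turned into a query that the ontology's data must answer. The sketch below shows this for questions 67 and 72 over a toy triple list. Everything in it is hypothetical: the `ex:` identifiers, the property names and the burn events are invented for illustration and do not come from the Fire Ontology.

```python
# Toy triples; every identifier and value below is hypothetical.
TRIPLES = [
    ("ex:burn1", "rdf:type", "ex:BurnEvent"),
    ("ex:burn1", "ex:hasSeverity", "low"),
    ("ex:burn1", "ex:occursInMonth", 8),
    ("ex:burn2", "rdf:type", "ex:BurnEvent"),
    ("ex:burn2", "ex:hasSeverity", "high"),
    ("ex:burn2", "ex:occursInMonth", 9),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def severity_of_each_burn_event(triples):
    """Question 72: map each burn event to its severity."""
    events = [s for s, _, _ in match(triples, p="rdf:type", o="ex:BurnEvent")]
    return {e: match(triples, s=e, p="ex:hasSeverity")[0][2] for e in events}


def burn_months(triples):
    """Question 67: sorted months in which burn events occur."""
    return sorted({o for _, _, o in match(triples, p="ex:occursInMonth")})


if __name__ == "__main__":
    print(severity_of_each_burn_event(TRIPLES))
    print(burn_months(TRIPLES))
```

In a published dataset the same questions would more likely be written as SPARQL queries; the pattern matcher here only stands in for that step.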
28. Reused x New Terms
Table 2. Reused Classes (number of terms)
Resource   CCOn   Fire Ontology
OBOE       0      2
ENVo       3      1
CWR        24     0
Table 3. Reused Properties
Property          Origin   Reused in
exactMatch        SKOS     CCOn
mappingRelation   SKOS     CCOn
Table 4. New terms
Ontology        N of classes   N of properties   N of individuals
CCOn            58             21                137
Fire Ontology   49             17                9
Total           107            38                146
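The totals reported for the new terms can be checked with a few lines of Python. The dictionary layout and the helper `column_totals` are illustrative; the counts are the ones given on the slide.

```python
# Counts of new terms per ontology, as given on the slide.
NEW_TERMS = {
    "CCOn": {"classes": 58, "properties": 21, "individuals": 137},
    "Fire Ontology": {"classes": 49, "properties": 17, "individuals": 9},
}
REPORTED_TOTAL = {"classes": 107, "properties": 38, "individuals": 146}


def column_totals(table):
    """Sum each kind of term across all ontologies."""
    totals = {}
    for counts in table.values():
        for kind, n in counts.items():
            totals[kind] = totals.get(kind, 0) + n
    return totals


if __name__ == "__main__":
    print(column_totals(NEW_TERMS) == REPORTED_TOTAL)
```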
33. Use case example
Domain specific ontologies and OBOE
Dataset A
Study area: Gerais de Balsas Colonization Project, Maranhão, Brazil
Fragments: 1, 2
First inventory: 1995, 1995
Last inventory: 2002, 2002
Fire: Biennial, Biennial
Mean annual mortality rate / mean annual recruitment rate (values in extraction order): 2.73, 4.88, 3.25, 5.86
Dataset B
Location: Jatobá Biological Reserve, Bahia, Brazil
First inventory: 1991
Last inventory: 2004
Fire: Protected
Mortality rate / recruitment rate: 1.93 / 3.72
[Diagram: mapping of both datasets to OBOE and the domain ontologies. Node labels: Observation, Measurement, Community dynamics, Cerrado sensu stricto, Tree, Wood plant, Fire, Fire Characteristic, Fire frequency. Relation labels: ofEntity, has-measurement, Of-Characteristic, hasCharacteristic, affects, Is-a, Is part of, Is characterized by.]
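The observation and measurement pattern named in the diagram can be sketched with two small classes. This is a loose illustration of the OBOE-style structure and not the ontology itself: the class and attribute names are hypothetical, and reading 1.93 as the mortality rate and 3.72 as the recruitment rate of Dataset B is an assumption about the slide layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measurement:
    of_characteristic: str  # e.g. "Mortality rate"
    value: float


@dataclass
class Observation:
    of_entity: str  # the thing being observed
    measurements: List[Measurement] = field(default_factory=list)

    def value_of(self, characteristic: str) -> Optional[float]:
        """Return the measured value for a characteristic, or None if absent."""
        for m in self.measurements:
            if m.of_characteristic == characteristic:
                return m.value
        return None


# Dataset B; the assignment of the two values is an assumption (see lead-in).
dataset_b = Observation(
    of_entity="Cerrado sensu stricto",
    measurements=[
        Measurement("Mortality rate", 1.93),
        Measurement("Recruitment rate", 3.72),
    ],
)

# Net annual change implied by the two rates under that assumption.
net_change = dataset_b.value_of("Recruitment rate") - dataset_b.value_of("Mortality rate")

if __name__ == "__main__":
    print(round(net_change, 2))
```

Keeping the characteristic separate from the value is what lets measurements from different studies be compared once they point to the same characteristic.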
35. Ontology evaluation: Pitfalls
Table 3. Ontology Pitfalls found in Fire Ontology
Version  P02  P04  P05  P07  P08  P10  P11  P13  P22  P29  P38  Total
0.3      2    11   0    0    54   ONT  7    7    ONT  0    -    81
0.4      0    1    1    0    49   ONT  0    4    ONT  2    -    57
0.5      0    1    0    0    53   0    0    2    ONT  0    -    56
0.6      0    0    0    0    51   0    0    0    ONT  0    -    51
0.7      0    0    0    1    24   0    5    0    0    0    ONT  30
0.8      0    0    0    1    24   0    0    0    0    0    ONT  25
Table 4. Ontology Pitfalls found in CCOn Ontology
Version  P04  P05  P08  P11  P13  P22  P31  P35  P38  Total
0.8.1    2    0    83   8    8    ONT  -    -    -    101
0.8.2    1    1    83   6    8    ONT  -    -    -    99
0.9.0    0    0    88   2    0    ONT  -    -    -    90
0.9.1    0    0    24   2    0    ONT  1    1    ONT  28
0.9.2    0    0    24   0    0    ONT  1    1    ONT  26
[Charts: number of pitfalls per ontology version. Legends: Critical, Important, Minor; Usability-profiling, Structural, Functional.]
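The effect of the successive revisions can be summarized as the relative drop in total pitfalls between the first and last evaluated versions. The sketch below uses the totals read from the two pitfall tables; the CCOn total of 99 for version 0.8.2 and the version label 0.9.0 are taken from displaced values in the extracted slide, and the helper `percent_reduction` is illustrative.

```python
# Total pitfalls per version, as read from the two tables on this slide.
FIRE_TOTALS = {"0.3": 81, "0.4": 57, "0.5": 56, "0.6": 51, "0.7": 30, "0.8": 25}
CCON_TOTALS = {"0.8.1": 101, "0.8.2": 99, "0.9.0": 90, "0.9.1": 28, "0.9.2": 26}


def percent_reduction(totals):
    """Relative drop (in %) from the first to the last version, in insertion order."""
    values = list(totals.values())
    first, last = values[0], values[-1]
    return 100.0 * (first - last) / first


if __name__ == "__main__":
    print(f"Fire Ontology: {percent_reduction(FIRE_TOTALS):.1f}%")
    print(f"CCOn: {percent_reduction(CCON_TOTALS):.1f}%")
```

With these totals the drop is about 69% for the Fire Ontology (81 to 25) and about 74% for CCOn (101 to 26).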
36. Expert Evaluation
• Two questionnaires were prepared (Google Forms)
• Quality
• Completeness
• Correctness
• Likert scale and open questions
Support material: online documentation, BioPortal visualization, competency questions
Questionnaires: Fire Ontology questionnaire, CCOn questionnaire
Evaluation: expert answers
37. What is next
RDF generation: geometry2RDF, Google Refine
Links generation: owl:sameAs, SILK
Publication: Virtuoso, Pubby
Exploitation: Map4RDF
38. Data Sources for RDF generation
Data sources for vegetation dynamics in the scientific literature:
• Souza, A. (2010). Estrutura e Dinâmica da Vegetação Lenhosa de Cerrado sensu stricto no período de 19 anos, na Reserva Ecológica do
IBGE, Distrito Federal, Brasil. 68p. Dissertação de mestrado. Departamento de Ecologia. Universidade de Brasília.
• Roitman, I.; Felfili, J.M.; Rezende, A.V. (2008). Tree dynamics of a fire-protected cerrado sensu stricto surrounded by forest plantations,
over a 13-years period (1991-2004) in Bahia, Brazil. Plant Ecology. 197: 255-267.
• Moreira, A.G. (1992). Fire protection and vegetation dynamics in the Brazilian Cerrado. Ph.D. thesis, Harvard University, Cambridge, MA,
U.S.A.
• Aquino, F. D. G., Walter, B. M. T., & Ribeiro, J. F. (2007). Woody community dynamics in two fragments of “cerrado” stricto sensu over a
seven-year period (1995-2002), MA, Brazil. Revista Brasileira de Botânica, 30(1), 113–121.
• Libano, A. M., & Felfili, J. M. (2006). Mudanças temporais na composição florística e na diversidade de um cerrado sensu stricto do Brasil
Central em um período de 18 anos (1985-2003). Acta Botanica Brasilica, 20(4), 927–936.
Other data sources:
Meteorological data: INMET
Fire occurrence and risk: INPE
Maps and environmental data: IBGE, CSR, IBAMA, MMA, LAPIG
39. Outlook
• This work presents a plan for a pilot study that will serve as a test.
• It involves linked geographical, meteorological, ecological and
environmental open data provided by Brazilian government
agencies.
• A methodology based on the Geolinked data approach is adopted
to create a case study that investigates the application of
linked data principles to ecology.
40. Outlook
• A relevant question to be investigated in this preliminary pilot
study is how to integrate qualitative reasoning models with
maps and other data, reason with the data, make inferences
and finally present the results.
• The topics addressed in this work have the potential both to extend
applications of geolinked data technologies to new areas and
to open new perspectives for research on ecological data
management, integration and use.