The document proposes a global linked and open data infrastructure for agricultural development called the Big Data Aggregator (BDE) platform. It discusses existing infrastructural components like vocabularies, authority data, APIs, and tools that different actors have built. The BDE platform would be aware of and interlink with these existing components in a distributed ecosystem. It would provide a generic, reusable infrastructure with domain-specific instances to evaluate requirements in different societal domains through pilot use cases. The goal is for the new big data infrastructure to be interlinked with current infrastructural elements.
1. A global linked and open data
infrastructure for agricultural
development
Valeria Pesce
Global Forum on Agricultural Research
Food and Agriculture Organization of
the United Nations
2. BDE proposed infrastructure
• ICT infrastructure
• Computing infrastructure
• One re-deployable generic infrastructure, n
“Domain-specific Big Data Integrator Instances”
• These tools may all be thought of as generic tools
and they will be available as options in the
generic platform. However, individual domains
are likely to need more specialised tools and
datasets
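One way to picture "one generic infrastructure, n instances" is a shared set of generic tool options that each domain instance selects from and extends. The sketch below is illustrative only: the class, the tool names, and the example dataset are hypothetical and do not come from the BDE specification.

```python
from dataclasses import dataclass, field

# Hypothetical names for generic tools offered as options by the platform.
GENERIC_TOOLS = frozenset({"storage", "batch-processing", "semantic-indexing"})


@dataclass
class IntegratorInstance:
    """A domain-specific instance built on the generic infrastructure."""
    domain: str
    selected_generic: set = field(default_factory=set)
    specialised_tools: set = field(default_factory=set)
    datasets: set = field(default_factory=set)

    def __post_init__(self):
        # An instance may only pick generic tools the platform actually offers.
        unknown = set(self.selected_generic) - GENERIC_TOOLS
        if unknown:
            raise ValueError(f"not offered by the generic platform: {sorted(unknown)}")

    def toolset(self):
        # Generic options plus the domain's own specialised tools.
        return set(self.selected_generic) | set(self.specialised_tools)


# Hypothetical agriculture instance: two generic options plus one specialised tool.
agri = IntegratorInstance(
    domain="agriculture",
    selected_generic={"storage", "semantic-indexing"},
    specialised_tools={"crop-yield-model"},
    datasets={"example-field-trials"},
)
print(sorted(agri.toolset()))
```

The generic set stays the same for every domain; only the selection and the specialised additions differ per instance.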
3. The ag-data context
• Different actors have built several infrastructural
components over the years under the umbrellas of
different initiatives and projects
• Mostly vocabularies, authority data, online tools,
APIs; mostly for non-big data
• Little work on computational services
IGAD group
4. “Positioning” this infrastructure
within the ag-data context
• Open data for agricultural development includes
both big and non-big data
• Beyond the additional features of being “big”
(the 3 Vs: volume, velocity, variety), big data still
have the same features as non-big data
• A dedicated big data infrastructure should be
aware of and interlink with other existing
infrastructural elements in a distributed and
inter-linked ecosystem
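As a rough illustration of what "aware of and interlinked with" could mean in practice, the sketch below checks which values in a dataset record are URIs pointing at known existing components (vocabularies, authority data) and which are still free text. Everything here is an assumption made for the example: the hosts under example.org, the registry dictionary, and the record fields are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical registry of existing infrastructural components, keyed by URI host.
EXISTING_COMPONENTS = {
    "vocab.example.org": "vocabulary",
    "authority.example.org": "authority data",
}


def link_report(record):
    """Split a record's values into links to known components and unlinked values."""
    linked, unlinked = {}, {}
    for key, value in record.items():
        host = urlparse(value).netloc  # empty string for plain text values
        if host in EXISTING_COMPONENTS:
            linked[key] = EXISTING_COMPONENTS[host]
        else:
            unlinked[key] = value
    return linked, unlinked


# Hypothetical dataset description: two interlinked fields, one free-text field.
record = {
    "subject": "http://vocab.example.org/concept/maize",
    "publisher": "http://authority.example.org/institution/42",
    "region": "East Africa",
}
linked, unlinked = link_report(record)
print(linked)
print(unlinked)
```

Fields that resolve to an existing component reuse shared identifiers; the remaining free-text fields are candidates for further interlinking.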
19. BDE in context
• An infrastructure that minimises the disruption to current
workflows
• An adaptable, easy-to-deploy and easy-to-use solution, which will
allow the interested user groups and stakeholders to
extend their Big Data solutions or introduce Big Data
technology to their business processes
Big Data Aggregator platform
20. Infrastructural components
[Diagram; labels as extracted, grouping approximate]
• Component types: APIs, registries, cloud / SaaS tools,
shared URIs, grid jobs, grid workflows
• KOSs: AGROVOC, NALT, GACS, Gene Ontology, Soil Terms,
local KOSs, controlled lists, etc.
• Description vocabularies: Darwin Core, INSPIRE, DCAT, etc.
• Interoperable vocabularies
• Registries: datasets, vocabularies, APIs, re-usable software
• Vocabulary tools: agINFRA CM / DM tools, VocBench
• Authority data: persons, institutions, projects, etc.
• Processing APIs: agINFRA, BioCatalogue
• Vocabulary APIs: Agrovoc WS, Climate Tagger
• BDE platform
22. BDE approach
• One re-deployable generic infrastructure, n “Domain-
specific Big Data Integrator Instances”
• Provide comprehensive test beds for the evaluation
of the BDE Aggregator Platform according to the
requirements of the respective societal domain
• Carefully select pilot use cases, across different
domains, so that they are adequate test beds but also
self-sustainable systems beyond the action’s end
• These tools may all be thought of as generic tools and
they will be available as options in the generic
platform. However, individual domains are likely to need
more specialised tools and datasets
23. Conclusion
• We would like the new big data infrastructure
to be interlinked with existing infrastructural
components