In the age of Big Data, filtering mechanisms have to be professionalized to make data more accessible. This presentation, held at Knowledge Management Academy in Vienna, shows how technologies derived from the Semantic Web can help to establish more efficient means to manage data and information.
Data Strategies: Metadata, Open Data, Linked Data
1. Data Strategies
Metadata, Open Data & Linked Data
Andreas Blumauer
CEO, Semantic Web Company
www.semantic-web.at
www.poolparty.biz
2. About Semantic Web Company
Company was founded in 2001 in Vienna, Austria
>20 experts in linked data technologies
Product: PoolParty Suite (launched 2009)
Serving global 500 companies & large NGOs
EU- & US-based consulting services
3. Some customers we serve
• Pearson
• Daimler
• Wolters Kluwer
• Ministry of Finance (AUT)
• GBPN
• Credit Suisse
• Council of EU
• Education Services (AUS)
• World Bank
• Roche
• Wood Mackenzie
• REEEP
4. Agenda
Intro
Data management – the current situation
Potential & Benefits of Linked Open Data (LOD) –
what is metadata, open data, linked data,
what is linked open data?
Use Cases
Global Buildings Performance Network (GBPN) & BPIE
World Bank Thesauri
EIP on Water: Marketplace
Renewable Energy & Energy Efficiency Partnership (REEEP)
Q&A
13. Data management in the environmental sector –
The current situation
Example: Buildings performance
“2012 saw the launch of an impressive
number of online portals sharing data and
analysis on energy efficiency in buildings”
(Ingeborg Nolte, Senior Communication
Manager at BPIE)
However: how can the value of so many (open)
data sets be leveraged when they are
actually isolated from each other?
Will Excel be the ultimate solution?
14. What's wrong with Open Data?
<daycare id="Seven Dwarfs" address="...">
  . . .
</daycare>

<kindergarten>
  <name>Seven Dwarfs</name>
  <address>
    <location>...</location>
    <description>...</description>
  </address>
</kindergarten>

<child_care name="Seven Dwarfs">
  <street>...</street>
  <zip>...</zip>
  <text>...</text>
</child_care>
Syntactic heterogeneity – different trees
Semantic heterogeneity – different tags and attributes (e.g.
kindergarten, child_care, daycare)
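To make the cost of this heterogeneity concrete, here is a minimal sketch of what a consumer has to do to read the same facility from three differently shaped sources. The element and attribute names follow the slide; the exact nesting, the address value, and the `normalise` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Three fragments describing the same facility, modelled on the slide.
# The nesting and the address value are assumptions for illustration.
SOURCES = {
    "daycare": '<daycare id="Seven Dwarfs" address="1 Example Street"/>',
    "kindergarten": (
        "<kindergarten><name>Seven Dwarfs</name>"
        "<address><location>1 Example Street</location></address>"
        "</kindergarten>"
    ),
    "child_care": (
        '<child_care name="Seven Dwarfs">'
        "<street>1 Example Street</street></child_care>"
    ),
}

# One hand-written extractor per schema: this is the integration cost.
EXTRACTORS = {
    "daycare": lambda e: (e.get("id"), e.get("address")),
    "kindergarten": lambda e: (e.findtext("name"), e.findtext("address/location")),
    "child_care": lambda e: (e.get("name"), e.findtext("street")),
}


def normalise(xml_text):
    """Parse one fragment and return a schema-independent record."""
    root = ET.fromstring(xml_text)
    name, address = EXTRACTORS[root.tag](root)
    return {"name": name, "address": address}


records = [normalise(text) for text in SOURCES.values()]
```

Every additional publisher with its own tree and tag names needs another extractor, which is the problem shared vocabularies are meant to remove.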
20. What is linked data,
what is linked open data?
The Free Universal Construction Kit connects
Lego®, Duplo®, Fischertechnik®, Gears! Gears! Gears!®, K'Nex®, Krinkles®,
Bristle Blocks®, Lincoln Logs®, Tinkertoys®, Zome®, ZomeTool® and Zoob®
with a low-cost 3D-printed adapter set
CC by Golan Levin (US), Shawn Sims (US)
21. LOD as a giant knowledge base
Which policies in the area of renewable energy have helped to initiate
projects and programmes in the agricultural sector that have ultimately
brought a substantial improvement in the nutritional situation of a given country?
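Answering a question like this means following links across data sets that would typically come from different publishers. The sketch below is only a toy illustration of that idea with an in-memory list of triples; every identifier and predicate name in it is an invented placeholder, not real data or an actual vocabulary.

```python
# Toy triples (subject, predicate, object). All names are invented
# placeholders, not real policies, projects, or countries.
TRIPLES = [
    ("policy:A", "hasTopic", "topic:RenewableEnergy"),
    ("policy:A", "initiated", "project:P1"),
    ("policy:B", "hasTopic", "topic:RenewableEnergy"),
    ("policy:B", "initiated", "project:P2"),
    ("project:P1", "inSector", "sector:Agriculture"),
    ("project:P2", "inSector", "sector:Transport"),
    ("project:P1", "improvedNutritionIn", "country:X"),
]


def objects(subject, predicate):
    """All objects linked from `subject` via `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects linked to `obj` via `predicate`."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def policies_improving_nutrition(country):
    """Follow policy -> project -> sector/country links for one country."""
    hits = set()
    for policy in subjects("hasTopic", "topic:RenewableEnergy"):
        for project in objects(policy, "initiated"):
            in_agriculture = "sector:Agriculture" in objects(project, "inSector")
            helped = country in objects(project, "improvedNutritionIn")
            if in_agriculture and helped:
                hits.add(policy)
    return sorted(hits)
```

In a real setting the same traversal would presumably be expressed as a query over published linked data rather than a Python loop; the point here is only that shared identifiers make the join possible.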
22. Application example #1:
Energy Market Intelligence
Scenario #1:
I am an energy market
researcher at the International
Energy Agency (IEA).
I inform policy makers about the
situation in specific renewable
energy areas to develop
targeted energy support
programs.
For my research I need indicators
of the utilisation levels of all
alternative forms of energy with
regard to geographical and
political categories.
http://integrator.poolparty.biz/report_renewable/
23. How does it work?
Articles about Renewable Energy
72,018 documents
From ~300 web sources
Reegle Thesaurus: ~3,000 concepts
Traverse hierarchies below main categories
(wind, solar, etc.) and classify documents
Geonames
Annotate documents with the geographical entities
they mention
DBpedia
Look up several Yago classes for all extracted geographical
entities to assert additional categories, e.g. EU countries, French-speaking countries etc.
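The thesaurus step above can be sketched roughly as follows: walk every concept below a main category and assign the category to a document when any of those labels occurs in its text. The mini-thesaurus and the substring matching are assumptions for illustration only; the slide does not say how PoolParty actually matches concepts.

```python
# Hypothetical mini-thesaurus: concept -> narrower concepts.
# The real Reegle Thesaurus has ~3,000 concepts; these few are made up.
NARROWER = {
    "wind": ["onshore wind", "offshore wind"],
    "solar": ["photovoltaics", "solar thermal"],
    "photovoltaics": ["thin-film cell"],
}
MAIN_CATEGORIES = ["wind", "solar"]


def concepts_below(category):
    """Return the category and every concept beneath it (depth-first)."""
    found, stack = [], [category]
    while stack:
        concept = stack.pop()
        found.append(concept)
        stack.extend(NARROWER.get(concept, []))
    return found


def classify(text):
    """Assign each main category whose subtree has a label in the text."""
    lowered = text.lower()
    return [
        category
        for category in MAIN_CATEGORIES
        if any(label in lowered for label in concepts_below(category))
    ]
```

Plain substring matching is deliberately naive (it would, for example, match "wind" inside "window"); a production pipeline would need proper tokenisation and disambiguation.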
24. How does it work?
[Diagram: PoolParty Semantic Integrator with Semantic Search, Geospatial Search, … and Data Visualisation]
25. Application example #2: Health Care
Scenario #2:
I am an information officer at
the Global Health
Observatory of the World
Health Organisation.
I inform policy makers about
the global situation in
specific disease areas to
direct support to the
required health support
programs.
For my research I need data
about disease prevalence in
relation to socio-economic
factors.
http://integrator.poolparty.biz/report_medicine/
26. How does it work?
PubMed Articles
Cardiovascular Diseases: 39,911 documents
Neoplasms: 69,937 documents
Nervous System Diseases: 48,128 documents
MeSH: 26,700 concepts / 346,600 triples
Traverse hierarchies below disease main categories and classify
documents
Geonames
Annotate documents with the geographical entities they mention
DBpedia
Look up the HDI (the Human Development Index is a composite
statistic of life expectancy, education, and income indices used to
rank countries into four tiers of human development)
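The last step, enriching geographically annotated documents with a country-level indicator, amounts to a join. Below is a minimal sketch assuming the earlier steps have already produced (category, country) pairs; the country names and tier assignments are placeholders, not real HDI data.

```python
from collections import Counter

# Hypothetical annotated documents: (disease category, country found by
# the geographic annotation step). Countries are placeholders.
DOCUMENTS = [
    ("Neoplasms", "Country A"),
    ("Neoplasms", "Country B"),
    ("Cardiovascular Diseases", "Country A"),
    ("Nervous System Diseases", "Country C"),
]

# Hypothetical lookup result: country -> HDI tier (made-up assignments).
HDI_TIER = {
    "Country A": "very high",
    "Country B": "medium",
}


def documents_per_tier(documents, tiers):
    """Count documents per (category, HDI tier); unmatched countries are 'unknown'."""
    counts = Counter()
    for category, country in documents:
        tier = tiers.get(country, "unknown")
        counts[(category, tier)] += 1
    return counts


counts = documents_per_tier(DOCUMENTS, HDI_TIER)
```

Keeping an explicit "unknown" bucket makes gaps in the lookup visible instead of silently dropping documents.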
27. How does it work?
[Diagram: PoolParty Semantic Integrator with Semantic Search, Geospatial Search and Data Visualisation]
28. Data management in the environmental sector –
The current situation
Example: Energy data
“It’s necessary to split the responsibility for
different data sets between different data
providers.”
(Florian Bauer)
However: how can this 'splitting' be coordinated
and how can additional positive
network effects be stimulated?
29. 5 stars of data standards
• Publish Open Data in RDF, reusing vocabularies which
can be understood and combined by apps in
unforeseen ways (e.g. visualization widgets)
★★★★★ link your data
★★★★ use URIs to denote things
★★★ use non-proprietary formats (e.g., CSV instead of Excel)
★★ make it available as structured data (e.g., Excel instead of image scan of a table)
★ make your stuff available on the Web (whatever format) under an open license
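Because each star builds on the ones below it, the scheme can be read as a cumulative checklist. The sketch below shows one way to score a data set that way; the `Dataset` fields and the `star_rating` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical checklist, one flag per star level (lowest first)."""
    on_web_open_license: bool
    structured: bool
    non_proprietary_format: bool
    uses_uris: bool
    linked_to_other_data: bool


def star_rating(ds):
    """Count stars cumulatively: a level only counts if all lower ones hold."""
    levels = [
        ds.on_web_open_license,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars
```

For instance, an openly licensed CSV file on the Web would score three stars under this reading, while the same CSV without URIs or links cannot reach four or five.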
30. Licensing is key for open data
Kind of license      Num.   %
Not specified        132    39%
Public Domain         69    21%
Attribution           66    20%
Share alike           35    10%
Closed                16     5%
With restrictions      5     2%
Other                  3     1%
Source: http://www.licensius.com/blog/lodlicenses
32. Global Buildings Performance Network (GBPN)
The Global Buildings Performance Network (GBPN) is a globally organised
and regionally focused network whose mission is to advance best practice
policies that can significantly reduce energy consumption and associated
CO2 emissions from buildings.
33. Goals
Launch of the GBPN global Knowledge Platform for the
Energy Performance of Buildings (www.gbpn.org)
Share Knowledge
Build Awareness & showcase best practice
Stimulate collective research
Stimulate collective analysis from experts worldwide
Promote better decision-making
Help the building sector effectively reduce its impact on climate change
Linked Open Data successfully services these objectives!
35. The GBPN Knowledge Platform
LOD-based GBPN Terminology: http://bit.ly/YSbD9S
GBPN News Aggregator Tool: http://bit.ly/13JLJqk
GBPN Policy Comparative Tool: http://bit.ly/X9Vihm
The GBPN Knowledge Platform is a Linked Open Data
project that aims to open and connect with the
best resources, data and information on buildings energy
performance policies worldwide.
Report Database: http://www.gbpn.org/reports
The Laboratory: http://www.gbpn.org/laboratory
GBPN web blog: http://bit.ly/X9VSeW
Live-Demo
40. Standardisation and consistency are key
Based on our experience in establishing knowledge broker
portals we know:
There is a strong need to increase consistency when tagging
climate and energy resources
We need to ensure the consistency of the message
delivered to the public, to avoid the confusion caused by
using terms in different ways
This requires standardisation of the categories and tags used
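One simple way to picture such standardisation is a controlled vocabulary that maps every accepted variant of a term to a single preferred label. The vocabulary entries and the `normalise_tags` helper below are hypothetical; they only sketch the idea and do not reproduce any actual thesaurus.

```python
# Hypothetical controlled vocabulary: preferred label -> alternative labels.
VOCABULARY = {
    "energy efficiency": ["efficient energy use", "energy-efficiency"],
    "photovoltaics": ["pv", "solar pv"],
}

# Reverse index: any known label (lower-cased) -> preferred label.
LOOKUP = {}
for preferred, alternatives in VOCABULARY.items():
    for label in [preferred, *alternatives]:
        LOOKUP[label.lower()] = preferred


def normalise_tags(tags):
    """Map free tags to preferred labels; collect the ones not recognised."""
    accepted, unknown = [], []
    for tag in tags:
        preferred = LOOKUP.get(tag.strip().lower())
        if preferred is None:
            unknown.append(tag)
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, unknown
```

Calling `normalise_tags(["PV", "Solar PV", "green stuff"])` collapses the first two tags into one preferred label and reports the third as unknown, which is where an editor would decide whether the vocabulary needs a new concept.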
44. Impact
[Chart: reegle.info users per year, 2008 to 2012 (not including datasets reused on other sites); vertical axis from 0 to 3,000,000]