1. TOWARDS A PAN-EUROPEAN
E-PROCUREMENT PLATFORM TO AGGREGATE,
PUBLISH AND SEARCH PUBLIC
PROCUREMENT NOTICES POWERED BY
LINKED OPEN DATA:
THE MOLDEAS APPROACH
Dr. Jose María Alvarez Rodríguez
Research Fellow, SEERC
Thessaloniki, 22-02-2013
2. Background and Glossary
e-Procurement
• A public procurement initiated, negotiated and/or concluded using electronic means, i.e. using electronic equipment for the processing and storage of data, in particular through the Internet.
Public procurement
• A procedure initiated by a contracting authority with a view to acquiring goods, services or public works for the fulfillment of its tasks.
Public Procurement Notice, notice, public contract, etc.
• Strictly speaking, these are not the same.
• There are distinct definitions depending on the stage: PriorNotice, AwardNotice, etc.
• For the sake of clarity, we will use these terms to refer to the same thing: “an announcement” of a new public procurement process (first stage and notice).
• CPV – Common Procurement Vocabulary
• LOD – Linking Open Data or Linked Open Data
• NUTS – Nomenclature of Territorial Units for Statistics
• OWL – Web Ontology Language
• PSI – Public Sector Information
• RDF – Resource Description Framework
• SME – Small and Medium-sized Enterprise
• TED – Tenders Electronic Daily
Source: http://ec.europa.eu/internal_market/consultations/docs/2010/e-procurement/siemens-study_en.pdf
22/02/2013 Thessaloniki, Greece 2
3. The problem…
I have a family business that produces beds and other bedroom furniture… but I do not have clients due to the crisis…
I think we could sell our products in other countries… and we could also try to sell beds to public administrations…
Let’s search…
5. Some help…an expert in e-Procurement
We are a Spanish SME that sells an alert service about public procurement opportunities…
We will deliver to you a daily report…
We need the type of contract… the region…
And other variables: value, duration, etc.
6. The interview…
“I can provide different types of “beds” and bedroom furniture”
“Ok! Let’s see some CPV codes…”
• 33192100-3 – Beds for medical use
• 39143116-2 – Cots
• 39143310-2 – Coffee tables
• …
“Do you have any target region?”
“Well, Thessaloniki, Greece…but maybe other countries”
“Ok! Let’s see some NUTS codes…”
• GR – Greece
• GR1 – Voreia Ellada
• GR12 – Kentriki Makedonia
• GR122 – Thessaloniki Prefecture, etc.
7. …
“It is a family business, isn’t it?”
“Yes, we are 10 people…”
“Great!... an SME…”
“Do you have any idea about the duration of the contract or the value?”
“I suppose we could take on contracts of about 60,000 € with a one-year duration…”
“Ok! I am going to collect all these features and I will report the new opportunities to you…”
“Great! I hope to get some business opportunity…”
“For sure! Have no doubt about it!”
8. Let’s start…
• We need public procurement opportunities that
fulfill these requirements:
Feature                         Value
Type of “object” (CPV codes)    33192100-3, 39143116-2, 39143310-2
Location (NUTS codes)           GR, GR1, GR12, GR122 and other European countries
Type of company                 SME
Duration                        1 year
Value                           60,000 €
…                               …
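The requirements above could be captured as a small data structure and checked against each incoming notice. The sketch below is only illustrative: the class names, field names and the sample notices are hypothetical, the value and duration are treated as upper bounds (an assumption, since the slide only lists them), and the “other European countries” part of the location is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertProfile:
    """What the bed manufacturer is looking for (hypothetical structure)."""
    cpv_codes: frozenset        # CPV codes without the check digit
    nuts_prefixes: tuple        # a notice matches if its NUTS code starts with one of these
    max_value_eur: float        # assumption: 60,000 EUR is an upper bound
    max_duration_months: int    # assumption: one year is an upper bound


@dataclass(frozen=True)
class Notice:
    """Minimal view of a public procurement notice (hypothetical structure)."""
    cpv_code: str
    nuts_code: str
    value_eur: float
    duration_months: int


def matches(profile: AlertProfile, notice: Notice) -> bool:
    """Return True when the notice satisfies every feature of the profile."""
    return (
        notice.cpv_code in profile.cpv_codes
        # NUTS codes are hierarchical, so GR122 starts with GR12, GR1 and GR.
        and notice.nuts_code.startswith(profile.nuts_prefixes)
        and notice.value_eur <= profile.max_value_eur
        and notice.duration_months <= profile.max_duration_months
    )


PROFILE = AlertProfile(
    cpv_codes=frozenset({"33192100", "39143116", "39143310"}),
    nuts_prefixes=("GR",),
    max_value_eur=60_000.0,
    max_duration_months=12,
)

# Two made-up notices, used only to exercise the function.
IN_THESSALONIKI = Notice("39143116", "GR122", 45_000.0, 12)
IN_ASTURIAS = Notice("39143116", "ES12", 45_000.0, 12)
```

Because NUTS codes are hierarchical, a simple prefix test is enough to cover GR, GR1, GR12 and GR122 with the single prefix “GR”.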
9. Building the alert…
• We have to retrieve information from different
– Data sources or providers
• Official Bulletins, Official web pages, Newspapers, etc.
– Formats
• PNG, JPEG, PDF, MSOffice, OpenOffice, CSV, RSS, etc.
– Languages
• 23 official languages in the European Union
– Models, services and APIs
• XML-Schema, SQL, REST, WSDL/SOAP, etc.
10. Could you understand this notice?
http://bit.ly/Yw0Rpm
11. …and this one?
http://bit.ly/WTIRYA
12. Yes, you can speak/read/write Spanish…
…and the location, where is “Asturias”?
…and the format?
You have software to read PDF, PNG, etc. files
…and what is the meaning of “2012”?
“2012” is clearly a year
…and what is the meaning of “3.371.282,99 €”?
It is clearly a value (about three million euros) using “.” as the thousands separator and “,” as the decimal separator
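For a machine, reading such an amount is less obvious than it is for a person. In the amount quoted on this slide, “.” groups the thousands and “,” marks the decimals. A minimal sketch of how it could be normalized is shown below; the function name is hypothetical, and it assumes every input follows this Spanish-style convention.

```python
from decimal import Decimal


def parse_european_amount(text: str) -> Decimal:
    """Parse an amount such as '3.371.282,99 €' into a Decimal.

    Assumes '.' is the thousands separator and ',' the decimal separator.
    """
    cleaned = text.replace("€", "").strip()
    # Drop the thousands separators, then switch to a decimal point.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


AMOUNT = parse_european_amount("3.371.282,99 €")
```

A notice written with the opposite convention (“3,371,282.99”) would be misread by this function, which is exactly why a common data model is needed.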
13. Yes, but… we seek an alert service…
• The information and data should be…
– Automatically processed
• Machine-processable format
– Validated against a common data model
– Available for querying via a formal language such as SQL
– Usable to build added-value services
– …
• Someone could say: “Ok! But I can search the web by myself and manually check the features”
– Yes, why not? You can perfectly well check an average of 16K notices per day in the European Union
14. And why?
• e-Procurement is a strategic sector
– 17% of the GDP
• Action Plans 2004 and 2020
• Projects
– E-Certis, Fiscalis 2013, E-Prior, PEPPOL, STORK, etc.
• Other actions
– TED, RAMON metadata server, CPV, NUTS, etc.
• Legal framework (to be transposed in each European country)
• Boost participation (especially SMEs)
– First action could be to alert about new public procurement notices
15. …But
• …a tangled realm of data and information…
– Formats, models, APIs, providers, classifications,
locations, etc.
It is not easy to reuse this valuable
public sector information (PSI)
We should make this information/data
available to be machine-processable…
16. Open Data
Semantic Web
Linked Data
17. 8 principles of Open Data
1. Data Must Be Complete.
2. . . . Primary.
3. . . . Timely.
4. . . . Accessible.
5. . . . Machine processable.
6. Access Must Be Non-Discriminatory.
7. Data Formats Must Be Non-Proprietary.
8. Data Must Be License-free.
18. Public Procurement Data
is a clear example of Open Data
…and due to its relevance for the
economic sector we should ensure
all the principles of this initiative.
19. Semantic web
Common & shared data model
Graph (subject, predicate, object)
RDF with different serialization formats
Implicit multilingual support
Knowledge-representation
Ontologies
OWL (Web Ontology Language)
Logic formalism: DL, F-Logic, etc.
Reasoning
Knowledge-management
Expert systems
Standards
Query languages
Vocabularies
Datasets
…
[Figure: example RDF graph. #me has foaf:name “Jose” and the family name “Alvarez”, and foaf:knows #diego, whose foaf:name is “Diego”.]
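The small graph drawn on this slide can be written down as a set of (subject, predicate, object) triples. The sketch below uses plain Python tuples rather than an RDF library; the `match` helper is hypothetical, and the family-name property is an assumption (foaf:familyName), because the label on the slide is garbled.

```python
FOAF = "http://xmlns.com/foaf/0.1/"  # FOAF vocabulary namespace

# Each triple is (subject, predicate, object), as in the RDF data model.
GRAPH = {
    ("#me", FOAF + "name", "Jose"),
    ("#me", FOAF + "familyName", "Alvarez"),  # assumed property name
    ("#me", FOAF + "knows", "#diego"),
    ("#diego", FOAF + "name", "Diego"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def names_of_acquaintances(graph, person):
    """Follow foaf:knows from `person`, then read each foaf:name."""
    names = []
    for _, _, friend in match(graph, s=person, p=FOAF + "knows"):
        names.extend(obj for _, _, obj in match(graph, s=friend, p=FOAF + "name"))
    return names
```

Following two patterns in a row, as `names_of_acquaintances` does, is a hand-written version of what a query language does over the graph.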
23. “There is not so much gain,
it is just another way to represent
information…”
“Yes, but it is machine-processable
(properties have semantics)
and we can do better!”
• Re-using well-known vocabularies, properties, etc.
• Making use of data properties
• Labeling all resources
• …
25. “It seems better but…”
“Can we also represent the data
in public procurement notices?”
“Yes, of course, we can follow
the same approach!”
26. “Firstly we are going to
introduce the concept of
Linked Data…”
27. Linked Data
Principles
1. Use URIs to name things.
2. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
3. Include links to other URIs.
4. Use HTTP URIs.
[Figure: 5* Model]
http://www.youtube.com/watch?v=ga1aSJXCFe0 (Tim Berners-Lee and “The bag of crisps”)
28. “These principles can be achieved by
applying RDF to represent data and
we can reach 5*!”
“Yes, but you should make links to existing
datasets”
“Where can I find them?”
“In the LOD Cloud there are some RDF
datasets and they are also open!”
29. Linked Open Data Cloud
203 datasets (25 billion RDF triples) and 395 million links (Sept. 2010).
Domains: Media, Geographic, Government (42.09%), Publications, Cross-domain, Life sciences, etc. (Aug. 2011).
393 datasets (Jun. 2012).
http://richard.cyganiak.de/2007/10/lod/
30. “Let’s link our data to existing
datasets…”
CPV 2008
(URI: http://purl.org/weso/pscs/cpv/2008/resource/{id})
Example: http://purl.org/weso/pscs/cpv/2008/resource/33192100
NUTS
(URI: http://nuts.psi.enakting.org/id/{id})
Example: http://nuts.psi.enakting.org/id/GR
We are going to use prefixes to ease the reading of URIs…
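A prefix is just a short alias for the fixed part of a URI pattern. A minimal sketch of the expansion, using the two patterns on this slide, could look as follows; the `expand` function is hypothetical, and the prefix name `nuts` is an assumption (only `cpv` appears later in the deck).

```python
# Prefix -> namespace, taken from the URI patterns on this slide.
PREFIXES = {
    "cpv": "http://purl.org/weso/pscs/cpv/2008/resource/",
    "nuts": "http://nuts.psi.enakting.org/id/",  # prefix name is an assumption
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name such as 'cpv:33192100' into a full URI."""
    prefix, separator, local_id = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local_id
```

For instance, `expand("cpv:33192100")` gives the CPV example URI shown above, and `expand("nuts:GR")` gives the NUTS one.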
32. “Great! We can reuse the information and
data…but…
Can we enrich that data?”
“Yes, you can create a “proxy” resource
with new data and link to the existing one”
“For instance, we are going to add lat/long
to the NUTS code GR”
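One possible shape for such a “proxy” resource is sketched below as plain triples. Several things here are assumptions: the proxy URI is a made-up example.org address, owl:sameAs is only one of the properties that could express the link to the existing resource, and the coordinates are rough illustrative values for Greece. The lat/long properties come from the W3C WGS84 Basic Geo vocabulary.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"      # W3C Basic Geo vocabulary
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # assumed linking property

EXISTING_GR = "http://nuts.psi.enakting.org/id/GR"    # existing NUTS resource
PROXY_GR = "http://example.org/nuts/GR"               # hypothetical proxy URI


def enrich(proxy_uri, existing_uri, latitude, longitude):
    """Describe a proxy resource: a link to the original plus the new data."""
    return [
        (proxy_uri, OWL_SAME_AS, existing_uri),
        (proxy_uri, GEO + "lat", str(latitude)),
        (proxy_uri, GEO + "long", str(longitude)),
    ]


# Rough, illustrative coordinates only.
PROXY_TRIPLES = enrich(PROXY_GR, EXISTING_GR, 39.0, 22.0)
```

The existing dataset is left untouched: the new statements hang off the proxy, and the link lets a consumer merge both descriptions.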
34. “We can easily extend the RDF model to
represent information keeping the
semantics”
“Yes, exactly.”
“Wait, wait, wait…we have a data model,
an implicit semantics and a query
language…so this is like a traditional
database”
35. “Yes, there are common similarities…”
Traditional database    Web of Data
Table                   Graph
E/R model               RDF/OWL semantics
SQL                     SPARQL
And…this is the
Web of Data!
38. “Can I execute SPARQL queries?”
“Yes, you could ask….”
“Give me gymnasts, born in Thessaloniki
that have won an Olympic gold medal,
including their name, date of birth and
some comment about them”
45. What we did…
• Define the processes to produce, publish, consume and validate the Linked Data generated from public procurement notices
• Design an ontology for representing domain knowledge
– Entities and relationships
– …
• Apply the aforementioned points to public procurement data:
– 1M public procurement notices
– 9 Product Scheme Classifications (PSCs) from UN, EU, etc.
– 50K companies/people
– +200 countries
• Validate the generated Linked Data and compare it with existing approaches
– A survey of 196 criteria
• Consume and exploit the generated Linked Data by creating a matchmaking service using different methods
– Syntactic search, concept query expansion and a recommender engine
46. If we speak the same language (RDF)
we can easily fulfill the requirements of our
“bed manufacturer”.
We would report the possibility of tendering in
“Asturias” (ES12)
to provide
“Beds” (CPV-33192100)
and other furniture
(CPV-39143116 & CPV-39143310).
47. Results
http://purl.org/weso/moldeas/ (it is now being updated)
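49. Example of a simple SPARQL query
# Prefix declarations (rdf, dc, cpv, ppn-def, cpv-def) are omitted, as on the slide.
SELECT DISTINCT * WHERE {
  # Public procurement notices and their NUTS codes
  ?ppn rdf:type ppn-def:ppn .
  ?ppn ppn-def:nutsCode ?nutsCode .
  # Keep notices located in Asturias (ES12) or Greece (GR)
  FILTER(?nutsCode = <http://nuts.psi.enakting.org/id/ES12> ||
         ?nutsCode = <http://nuts.psi.enakting.org/id/GR>) .
  # Keep notices classified under one of the three CPV 2008 codes
  ?ppn cpv-def:codeIn2008 ?cpvCode .
  FILTER(?cpvCode = cpv:33192100 ||
         ?cpvCode = cpv:39143116 ||
         ?cpvCode = cpv:39143310) .
  ?ppn dc:date ?date .
}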
http://purl.org/weso/moldeas/
http://bit.ly/XQ0uUV
(see Demo queries)
*This is the old version of MOLDEAS. New procurement notices, etc. are coming soon…
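An alert service would not write such a query by hand for every client. As a rough sketch, the query text could be generated from the client's NUTS and CPV codes. The function below is hypothetical; it reuses the property names from the slide, joins the alternatives with SPARQL's `||` operator, leaves the prefix declarations out (their namespaces are not given in the deck), and does not validate the codes it receives.

```python
NUTS_BASE = "http://nuts.psi.enakting.org/id/"  # URI pattern given earlier in the deck


def build_alert_query(nuts_codes, cpv_codes):
    """Return the text of a SPARQL query filtering notices by NUTS and CPV codes."""
    if not nuts_codes or not cpv_codes:
        raise ValueError("at least one NUTS code and one CPV code are required")
    # Each list becomes a chain of alternatives: a = x || a = y || ...
    nuts_filter = " || ".join(f"?nutsCode = <{NUTS_BASE}{code}>" for code in nuts_codes)
    cpv_filter = " || ".join(f"?cpvCode = cpv:{code}" for code in cpv_codes)
    lines = [
        "SELECT DISTINCT * WHERE {",
        "  ?ppn rdf:type ppn-def:ppn .",
        "  ?ppn ppn-def:nutsCode ?nutsCode .",
        f"  FILTER({nuts_filter}) .",
        "  ?ppn cpv-def:codeIn2008 ?cpvCode .",
        f"  FILTER({cpv_filter}) .",
        "  ?ppn dc:date ?date .",
        "}",
    ]
    return "\n".join(lines)


# The bed manufacturer's codes from the earlier slides.
QUERY = build_alert_query(["ES12", "GR"], ["33192100", "39143116", "39143310"])
```

The resulting text would still need the prefix declarations prepended before being sent to a SPARQL endpoint.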
50. What we got…
A new way of representing the valuable information in public
procurement notices by applying semantic technologies
New datasets that are now part of the LOD Cloud
Dissemination and networking
Expertise, know-how generation and new research lines
…maybe
A step forward to a new way of publishing public data, more
specifically procurement data
Enabling cross-border business opportunities
51. Main Conclusion
We can represent information and
data in public procurement notices
using semantic technologies
(vocabularies, datasets, etc.)
Overcoming most of the problems
in public procurement notices