IBM Research – Ireland

Linked Data and Search

Vanessa Lopez
Smarter Cities Technology Centre
IBM Research Ireland
© 2012 IBM Corporation

Background: Why Linked Data

•  Provides explicit semantics
•  Extensible
•  Interoperability-focused: to enable automatic discovery and ingestion
•  Large existing corpora
•  Fundamentally incremental (like the Web)
•  W3C standard representation and common format
•  Government push (e.g. data.gov, data.gov.uk, Linked Government Data)

Yes, yes... richer structured queries, but...

...limited usability for both data publishers and consumers

How can we help users query and explore Semantic Web content?

State of the art

•  Semantic search over messy, heterogeneous data and mash-ups
•  Exploratory and faceted systems
•  Query builders and relationship finders
•  Question answering over Linked Data sources
•  Google Knowledge Graph

http://technologies.kmi.open.ac.uk/poweraqua

Linked Data and Search - Problem domain:

What makes City Data so special?
How can we make it more accessible?

Semantic processing of urban data: why is it different?

•  How can we go from raw data to insight into the operation of a city with minimal effort?

Return-on-Investment (because data integration is expensive)
Fit-for-all (citizen engagement)

Challenges: Big city data

Volume
•  Lots of relevant information
•  Not linked to authoritative sources

Velocity
•  Streams
•  Frequent updates

Variety
•  Different models and file formats
•  Open domain, unknown schema

Veracity
•  Diverse sources
•  Difficult to assess quality

Business case: open data as a means to an end

Business case

•  Why are ambulances late?

Sources of information
•  Hundreds of datasets from four municipal authorities in Dublin
•  Mostly static, some dynamic
•  Social media: Twitter, LiveDrive, Eventful, Eventbrite, …
•  Linked Data: DBpedia, ..
•  Vocabularies: IPSV, FOAF, VOID, PROV, DCAT, WSG

Domain of information
•  Locations of health services
•  Ambulance call-outs and response times
•  Tweets about traffic congestion
•  Geo-located tweets about people movement
•  Road network
•  Event Web services
•  …

Issues

•  Linked Data to enrich data and give contextual insight for publishers and consumers:
–  Publish (vocabularies, annotation)
–  Discovery and search (metadata / cataloguing, full-text indexing, semantic entities)
–  Link (schema alignment, linked data, social media)
–  Extract interesting views
–  Reason (diagnose traffic problems)

Ubiquitous aspects: Provenance, Governance, Performance, Security, Privacy

Approach: Data model

Layers (towards insight): Documents + Metadata; Structure; Entities; Links; Views

Tabular Graph
  C1 a Cell
  C1 inRow r1
  C1 value "name"
  …

Entity Graph
  e1 a Entity
  e1 inRow r1
  e1 inCol c2
  …

Annotation Graph
  e1 a Entity
  e1 rdfs:label "name"
  e1 addr "X st"
  e1 lat "53.23"
  …

Mapping Graph
  e1 a Entity
  e1 sameAs e2
  …

Pay-as-you-go, Gain-as-you-go
•  Structured metadata -> Queries over the metadata
•  Files into a standard representation -> Queries over the data
•  Partially integrate schemata -> Queries across datasets
•  Integrate globally -> Queries across Web data
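
The Tabular Graph and Entity Graph of the data model could be derived from a plain table roughly as sketched below. This is an illustration under stated assumptions, not the authors' implementation: the function names, the identifier scheme (C1, r1, c2, e1), the inCol triple on cells, and the rule that every non-empty cell in a chosen column becomes an entity are all hypothetical. Only the predicates a, inRow, inCol and value appear on the slide.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def tabular_graph(rows: List[List[str]]) -> List[Triple]:
    """Describe every cell of a table as triples (structure only, no semantics)."""
    triples: List[Triple] = []
    cell_no = 0
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            cell_no += 1
            cell = f"C{cell_no}"  # hypothetical cell identifier
            triples.append((cell, "a", "Cell"))
            triples.append((cell, "inRow", f"r{r}"))
            triples.append((cell, "inCol", f"c{c}"))  # assumed; not shown on the slide
            triples.append((cell, "value", value))
    return triples


def entity_graph(rows: List[List[str]], entity_col: int) -> List[Triple]:
    """Promote the non-empty cells of one column (1-based) to entities."""
    triples: List[Triple] = []
    entity_no = 0
    for r, row in enumerate(rows, start=1):
        if not row[entity_col - 1].strip():
            continue  # nothing to lift from an empty cell
        entity_no += 1
        entity = f"e{entity_no}"  # hypothetical entity identifier
        triples.append((entity, "a", "Entity"))
        triples.append((entity, "inRow", f"r{r}"))
        triples.append((entity, "inCol", f"c{entity_col}"))
    return triples
```

Keeping the structural triples separate from the entity triples mirrors the pay-as-you-go idea: a file can be queried as a table immediately, and entities are added only when someone invests the effort to identify them.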
  
Discovery: Publishing and Cataloguing

•  METADATA
–  Many data publishers and disconnected datasets
–  Link metadata using domain vocabularies: IPSV
–  Convert to simple RDF format

Figure labels: Vocabulary matching; IPSV
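
Converting catalogue metadata to a simple RDF format, with keywords matched against a subject vocabulary, might look like the sketch below. The DCAT and Dublin Core term URIs are real, but everything else is an assumption for illustration: the record layout, the function names, and the SUBJECTS table, which uses made-up example.org URIs as a stand-in for IPSV concepts. The publisher is written as a plain literal for brevity.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# Hypothetical subject vocabulary (label -> concept URI), standing in for IPSV.
SUBJECTS = {
    "beaches": "http://example.org/subject/beaches",
    "traffic": "http://example.org/subject/traffic",
}


def literal(text):
    """Render a plain N-Triples string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def metadata_to_ntriples(dataset_uri, record):
    """Turn one catalogue record (a dict) into a list of N-Triples lines."""
    lines = [f"<{dataset_uri}> <{RDF_TYPE}> <{DCAT}Dataset> ."]
    lines.append(f"<{dataset_uri}> <{DCT}title> {literal(record['title'])} .")
    lines.append(f"<{dataset_uri}> <{DCT}publisher> {literal(record['publisher'])} .")
    for keyword in record.get("keywords", []):
        concept = SUBJECTS.get(keyword.strip().lower())
        if concept:
            # Vocabulary match: link the dataset to the shared concept.
            lines.append(f"<{dataset_uri}> <{DCAT}theme> <{concept}> .")
        else:
            # No match: keep the publisher's free-text keyword.
            lines.append(f"<{dataset_uri}> <{DCAT}keyword> {literal(keyword)} .")
    return lines
```

Datasets from different publishers that resolve to the same concept URI become connected through it, which is what makes the otherwise disconnected catalogue navigable.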

Search and linking

•  Full-text indexing for search over metadata and content
•  Entity linking and navigation (keywords, categories, publishing agencies, regions, ..)
•  Open metadata and vocabularies (VOID, PROV, etc.) for data discovery and linking
•  Mining descriptions (DBpedia Spotlight)

Figure labels: Open metadata; Full text indexing; Entity linking; Mining descriptions
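
Full-text indexing over dataset metadata and content can be illustrated with a small inverted index. This is a generic textbook sketch, not the indexing technology used in the project (which the slides do not name); the function names, the tokenisation rule, and the AND semantics for multi-word queries are assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of dataset ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of datasets that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

The same index can cover both the catalogue metadata and the text of the data files themselves by concatenating them per dataset before calling build_index.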

Faceted search: "beaches in Fingal"
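
A faceted search such as "beaches in Fingal" amounts to filtering catalogue records on facet values and showing counts for the remaining choices. The sketch below assumes a hypothetical record layout with category and region facets; the function names are likewise illustrative and are not taken from the demonstrated system.

```python
from collections import Counter


def refine(records, **selected):
    """Keep only the records whose facets equal every selected value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(record[facet] for record in records if facet in record)


# Made-up catalogue records, for illustration only.
CATALOGUE = [
    {"id": "d1", "category": "beaches", "region": "Fingal"},
    {"id": "d2", "category": "beaches", "region": "Dublin City"},
    {"id": "d3", "category": "traffic", "region": "Fingal"},
]

beaches_in_fingal = refine(CATALOGUE, category="beaches", region="Fingal")
regions_with_beaches = facet_counts(refine(CATALOGUE, category="beaches"), "region")
```

Computing the counts on the already refined result is what lets an interface show only the facet values that still lead to at least one dataset.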

Content integration

•  Incrementally lift data content (beyond search, to querying across dataset content)
–  Extract entities represented in RDF (PAYGO)
–  Label extraction and annotation
–  Link when we have higher confidence (lat, long)
–  Geo-coding and taxonomy of tweets (traffic)

Figure labels: Geocoding; Label extraction; Minimal entry cost; Provenance-based dataset ranking
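
Linking entities only when coordinates give higher confidence can be sketched as a distance test between entities from two datasets. The haversine formula is standard; the 50-metre threshold, the function names, the input layout, and the choice to emit sameAs triples for every pair within range are assumptions for illustration, not the project's actual linking rule.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def link_by_location(left, right, max_distance_m=50.0):
    """Emit (id, "sameAs", id) triples for entity pairs closer than the threshold.

    `left` and `right` map entity ids to (lat, lon) tuples.
    """
    links = []
    for left_id, (llat, llon) in left.items():
        for right_id, (rlat, rlon) in right.items():
            if haversine_m(llat, llon, rlat, rlon) <= max_distance_m:
                links.append((left_id, "sameAs", right_id))
    return links
```

The pairwise loop is quadratic, which is acceptable for a sketch; a real deployment over hundreds of datasets would need a spatial index, and proximity alone would normally be combined with label similarity before asserting identity.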

Views

•  Beyond search, to guiding the user to create meaningful views:
–  Guide users to annotate data, recommend related datasets and create data views on the fly
–  Ranking and context-based recommendations
–  Allow semantic-based analysis on multiple views

Figure labels: Hidden information discovery; Cross domain queries; Multiple endpoints; Multiple interpretations

Demo

•  Currently: Web services and technology demonstrator
•  Next: Open RDF-based data management deployed in Dublin City (read/write). Deployment of traffic diagnoser.
•  SPUD: Semantic Processing of Urban Data (2nd prize at the Semantic Web Challenge, ISWC)
•  Live demo: www.dublinked.ie/sandbox/SemanticWebChall

Spyros Kotoulas, Vanessa Lopez, Raymond Lloyd, Marco Luca Sbodio, Freddy Lecue, Martin Stephenson, Elizabeth Daly, Veli Bicer, Aris Gkoulalas-Divanis, Giusy Di Lorenzo, Anika Schumann, Denis Patterson, and Pol Mac Aonghusa

Thank you!

Reference publication:
•  QuerioCity: A Linked Data Platform for Urban Information Management. V. Lopez, S. Kotoulas, M. L. Sbodio, M. Stephenson, A. Gkoulalas-Divanis, P. Mac Aonghusa. In Use track at the 11th International Semantic Web Conference (ISWC).

City Fabric Team:
