Freebase
A socially managed semantic database



Jamie Taylor
SemTech 2010 Data Camp
Freebase has Many Types of Things
12 Million Topics
A Multiplicity of Strong Identifiers

            http://rdf.freebase.com/ns/en.berlin_wall




            http://www.ellerdale.com/topics/view/0080-6ba0




            http://www.bbc.co.uk/music/artists/7f347782-eb14-40c3-98e2-17b6e1bfe56c

                   http://musicbrainz.org/artist/7f347782-eb14-40c3-98e2-17b6e1bfe56c

http://rdf.freebase.com/ns/authority.musicbrainz.7f347782-eb14-40c3-98e2-17b6e1bfe56c
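
The two rdf.freebase.com URIs above look like slash-delimited Freebase IDs (the style used in the MQL slides, e.g. /en/george_lucas) with the slashes turned into dots. A minimal sketch of that mapping follows; the rule is inferred from the examples on this slide and is an assumption, and the function name is hypothetical.

```python
RDF_NS = "http://rdf.freebase.com/ns/"  # namespace prefix taken from the URIs on the slide


def freebase_id_to_rdf_uri(freebase_id: str) -> str:
    """Map an ID such as /en/berlin_wall to its RDF-style URI.

    Assumed rule (inferred from the slide's examples): drop the leading
    slash and replace the remaining slashes with dots.
    """
    if not freebase_id.startswith("/"):
        raise ValueError("expected an ID that starts with '/'")
    return RDF_NS + freebase_id[1:].replace("/", ".")


if __name__ == "__main__":
    print(freebase_id_to_rdf_uri("/en/berlin_wall"))
    print(freebase_id_to_rdf_uri(
        "/authority/musicbrainz/7f347782-eb14-40c3-98e2-17b6e1bfe56c"))
```

Under this assumption the MusicBrainz identifier is simply another key in the /authority/musicbrainz namespace, which is how one topic can carry several strong identifiers at once.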
400 Million Relations
[Diagram: topics connected by labeled relations: contains, contained-by, event, label, albums, member-of, nationality, education]
What’s in Freebase?
http://www.bestbuy.com/site/She+Wolf…

              http://www.daylife.com/topic/Shakira

                         http://twitter.com/shakira

                  http://www.facebook.com/shakira

                  http://www.myspace.com/shakira

                  http://www.last.fm/music/Shakira

http://www.netflix.com/RoleDisplay/Shakira/20046629

          http://www.guardian.co.uk/music/shakira
99% pure

All data undergoes rigorous QA before load
Major focus is reconciliation
Use sampling to assure 99% accuracy
Data that does not meet 99% accuracy is not loaded
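
The slide does not say how the sampling is carried out. As a rough sketch of a sample-based load gate, assuming a hypothetical `is_correct` check (for example, a human reviewer's verdict on one record) and an arbitrary sample size:

```python
import random


def sample_accuracy(records, is_correct, sample_size, rng):
    """Estimate accuracy from a uniform random sample of the records."""
    if not records:
        raise ValueError("cannot sample from an empty data set")
    sample = rng.sample(records, min(sample_size, len(records)))
    correct = sum(1 for record in sample if is_correct(record))
    return correct / len(sample)


def passes_gate(records, is_correct, sample_size=500, threshold=0.99, seed=0):
    """Return True if the sampled accuracy meets the threshold.

    threshold=0.99 mirrors the 99% figure on the slide; sample_size and
    seed are illustrative assumptions, not Freebase's actual settings.
    """
    rng = random.Random(seed)
    return sample_accuracy(records, is_correct, sample_size, rng) >= threshold


if __name__ == "__main__":
    # Toy data: each record is just its own review verdict.
    print(passes_gate([True] * 1000, bool))
    print(passes_gate([False] * 1000, bool))
```

A sampled estimate carries statistical uncertainty, so a real gate would likely also consider the sample size needed to trust a 99% reading.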
What's been built on Freebase?
Up to 100,000 Queries a Day




 Quarterly dumps of graph
    http://download.freebase.com
Users contribute data




Users extend the data model
The Freebase Commons
                      Top-level domains
                      ·American football       ·Internet
                      ·Anime/Manga             ·Language
                      ·Architecture            ·Law
                      ·Astronomy               ·Library
                      ·Automotive              ·Location
                      ·Aviation                ·Martial Arts
                      ·Awards                  ·Measurement Unit
                      ·Baseball                ·Media Common
                      ·Basketball              ·Medicine
                      ·Bicycles                ·Metaweb Types
                      ·Biology                 ·Meteorology
                      ·Boats                   ·Military
                      ·Broadcast               ·Music
                      ·Business                ·Olympics
                      ·Celebrities             ·Opera
                      ·Chemistry               ·Organization
                      ·Comics                  ·People
                      ·Common                  ·Geography
                      ·Computers               ·Projects
                      ·Conferences             ·Protected Places
                      ·Cricket                 ·Publishing
                      ·Data World              ·Radio
                      ·Digicams                ·Rail
                      ·Education               ·Religion
                      ·Engineering             ·Royalty
                      ·Event                   ·Soccer
                      ·Clothing and Textiles   ·Spaceflight
                      ·Fictional Universes     ·Sports
                      ·Film                    ·Symbols
                      ·Food & Drink            ·Tennis
                      ·Freebase                ·Theater
                      ·Games                   ·Time
                      ·Geology                 ·Transportation
                      ·Government              ·Travel
                      ·Hobbies and Interests   ·TV
                      ·Ice Hockey              ·Video Games
                      ·Influence               ·Visual Art

schema = vocabulary
The Scope of Schema
   10,448 Properties
      describing
     4,936 Types*
     organized into
     641 Domains
     (77 Commons)
            *types with 10 or more instances
Strength through Exemplars
[Chart: Type Instances. Number of instances per type (log scale, 1 to 100,000,000) plotted against type rank (0 to 11,000). Annotations: ">10 instances, 4936 types" and "1424 Commons".]
Metaweb Query Language
      [{
           "name" : null,
           "type" : "/film/film"
      }]




               MQL
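
An MQL query is plain JSON, so it can be built as an ordinary data structure and serialized. The sketch below wraps the query above in a `{"query": ...}` envelope and URL-encodes it. The `mqlread` endpoint URL and the envelope shape do not appear on the slides and are assumptions here, and the service is no longer online, so the code only constructs the URL and makes no request.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: historical read endpoint; not shown on the slides.
MQLREAD_URL = "http://api.freebase.com/api/service/mqlread"


def build_mqlread_url(query):
    """Serialize an MQL query and place it in the 'query' URL parameter."""
    envelope = {"query": query}  # assumed envelope shape
    return MQLREAD_URL + "?" + urlencode({"query": json.dumps(envelope)})


def decode_mqlread_url(url):
    """Recover the MQL query from a URL built by build_mqlread_url."""
    raw = parse_qs(urlparse(url).query)["query"][0]
    return json.loads(raw)["query"]


# Python None serializes to JSON null, which in MQL means "fill this in".
all_films = [{"name": None, "type": "/film/film"}]

if __name__ == "__main__":
    print(build_mqlread_url(all_films))
```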
[{
     "name" : null,
     "type" : "/film/film",
     "directed_by":{"id":"/en/george_lucas"},
     "starring":[{
            "actor":{"id":"/en/harrison_ford"}
         }]
}]




                      MQL
[{
     "name" : null,
     "type" : "/film/film",
     "directed_by":{"id":"/en/george_lucas"},
     "starring":[{
            "actor":{
                "name": null,
                "film":[{
                    "film":{"id":"/en/the_great_escape"}
                }]
            }
         }]
}]

Result: Donald Pleasence, in THX 1138
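
The queries above work by example: a `null` asks for a value to be filled in, a literal or `{"id": ...}` constrains the match, and `[{...}]` matches against a list of related items. The toy matcher below illustrates that idea over a small in-memory nested structure. It is a sketch, not how Freebase evaluates MQL, and the sample records (including the IDs /en/thx_1138 and /en/american_graffiti) are illustrative assumptions.

```python
def match(template, node):
    """Return a filled-in copy of template if node satisfies it, else None."""
    if template is None:                 # null: report whatever value is there
        return node
    if isinstance(template, dict):       # every key in the template must match
        if not isinstance(node, dict):
            return None
        filled = {}
        for key, sub in template.items():
            if key not in node:
                return None
            value = match(sub, node[key])
            if value is None:
                return None
            filled[key] = value
        return filled
    if isinstance(template, list):       # [{...}]: keep the items that match
        if not isinstance(node, list):
            return None
        hits = [m for m in (match(template[0], item) for item in node)
                if m is not None]
        return hits or None
    return node if node == template else None   # literal constraint


def acted_in(*film_ids):
    """Build an actor's filmography in the nested shape the query expects."""
    return [{"film": {"id": film_id}} for film_id in film_ids]


FILMS = [  # toy data for illustration only
    {"name": "THX 1138", "type": "/film/film",
     "directed_by": {"id": "/en/george_lucas"},
     "starring": [
         {"actor": {"name": "Donald Pleasence",
                    "film": acted_in("/en/the_great_escape", "/en/thx_1138")}},
         {"actor": {"name": "Robert Duvall",
                    "film": acted_in("/en/thx_1138")}}]},
    {"name": "American Graffiti", "type": "/film/film",
     "directed_by": {"id": "/en/george_lucas"},
     "starring": [
         {"actor": {"name": "Ron Howard",
                    "film": acted_in("/en/american_graffiti")}}]},
]

QUERY = [{
    "name": None,
    "type": "/film/film",
    "directed_by": {"id": "/en/george_lucas"},
    "starring": [{"actor": {"name": None,
                            "film": [{"film": {"id": "/en/the_great_escape"}}]}}],
}]

if __name__ == "__main__":
    for film in match(QUERY, FILMS) or []:
        for role in film["starring"]:
            print(role["actor"]["name"], "in", film["name"])
```

On this toy data the only film that survives is THX 1138, with Donald Pleasence as the only matching actor, which mirrors the answer shown on the slide.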
Freebase Suggest
Reconciliation
Request:
        {
             "/type/object/name":"Blade Runner",
             "/type/object/type":"/film/film",
             "/film/film/starring/actor":["Harrison Ford", "Rutger Hauer"],
             "/film/film/director":"Ridley Scott",
             "/film/film/release_date_s":"1981"
         }

Response:
[{
     "id":"/guid/9202a8c04000641f8000000000009e89",
     "name":["Blade Runner", "Bladerunner"],
     "score":1.4320519,
     "match":true,
     "type":["/common/topic", "/film/film", "/media_common/adapted_work", "/award/award_winning_work"]
 },
 {
     "id":"/guid/9202a8c04000641f80000000002643d0",
     "name":["Blade"],
     "score":0.48852453,
     "match":false,
     "type":["/common/topic", "/film/film", "/award/award_winning_work", "/award/award_nominated_work"]
 }]

               http://data.labs.freebase.com/recon/
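
A client consuming a reconciliation response like the one above would typically look for a candidate flagged as a match. A minimal sketch, assuming the response has the shape shown on the slide (trimmed here to the fields the code uses); the tie-breaking rule by highest score is an assumption, not documented service behavior.

```python
import json

# Response trimmed from the slide; only id, name, score and match are kept.
RESPONSE = """
[{"id": "/guid/9202a8c04000641f8000000000009e89",
  "name": ["Blade Runner", "Bladerunner"],
  "score": 1.4320519,
  "match": true},
 {"id": "/guid/9202a8c04000641f80000000002643d0",
  "name": ["Blade"],
  "score": 0.48852453,
  "match": false}]
"""


def best_match(candidates):
    """Return the highest-scoring candidate flagged match=true, or None."""
    confirmed = [c for c in candidates if c.get("match")]
    if not confirmed:
        return None  # nothing confident enough; leave for manual review
    return max(confirmed, key=lambda c: c["score"])


if __name__ == "__main__":
    winner = best_match(json.loads(RESPONSE))
    print(winner["id"], winner["name"][0])
```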
Topic Blocks
Topic API
         Shortcut to building Topic displays
         Two forms:
             basic (names, types, description)
             standard (basic + keys, properties)




http://www.freebase.com/experimental/topic/standard?id=/en/ncis
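
The URL above shows the standard form. A small sketch of building such URLs for either form follows; the `basic` path is assumed by analogy with the `standard` one, the helper name is hypothetical, and the endpoint was marked experimental, so treat this as illustrative only.

```python
from urllib.parse import quote

TOPIC_BASE = "http://www.freebase.com/experimental/topic"  # from the slide's URL
FORMS = ("basic", "standard")  # the two forms named on the slide


def topic_url(topic_id, form="standard"):
    """Build a Topic API URL for the given topic ID and form.

    Assumption: the 'basic' form lives at .../topic/basic, mirroring the
    .../topic/standard URL shown on the slide.
    """
    if form not in FORMS:
        raise ValueError("form must be 'basic' or 'standard'")
    # Keep slashes unescaped so the ID appears as it does on the slide.
    return f"{TOPIC_BASE}/{form}?id={quote(topic_id, safe='/')}"


if __name__ == "__main__":
    print(topic_url("/en/ncis"))
    print(topic_url("/en/ncis", form="basic"))
```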
Geo Search API



Semantic              Spatial              Semantic




      http://www.freebase.com/docs/geosearch
Gridworks
Acre Development Environment
Getting Started++
•   Freebase Documentation Hub
    •   http://www.freebase.com/docs
•   Developer Mailing List
    •   http://lists.freebase.com/mailman/listinfo/freebase-discuss
    •   http://freebase.markmail.org
•   Real Time help on IRC
    •   Freenode #freebase
•   Freebase Happenings
    •   http://blog.freebase.com
•   About the Graph Store
    •   Google: "ACM SIGMOD schema last tuple store"

Freebase - Semantic Technologies 2010 Code Camp
