Calais
Thomson Reuters Calais Initiative
Overview
• Going to discuss five basic topics


  – What is Calais?
  – Why we’re doing it & what our goals are
  – How it works / What’s under the hood?
  – A few examples
  – Where it’s headed
Calais…


• Calais extracts smart metadata from unstructured
  text and links that metadata to the Linked Data
  cloud.
Calais progress to date
• Launched in late January, 2008
• 9,500 developers have joined
  OpenCalais.com
• 1-3 million content ‘transactions’ per day
• Delivered four major update releases
• Free (as in free) for commercial or non-commercial use
 1. Unstructured text
 2. Calais extracts entities, facts and events
 3. Metadata returned to the user with keys
 4. Keys provide access to the Calais Linked Data cloud
 5. Which provides information and other Linked Data pointers
 6. To a range of open and partner Linked Data assets, including Thomson Reuters
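The numbered flow on this slide can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the Calais API: the `extract_entities` and `follow_key` functions, the tiny lexicon, and the `example.org` keys are all hypothetical stand-ins. Only the DBpedia address comes from the deck itself.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical stand-in for the extraction engine: a tiny lexicon lookup.
# The keys are made-up example URIs, not real Calais identifiers.
LEXICON = {
    "IBM": ("Company", "http://example.org/er/company/ibm"),
    "Reuters": ("Company", "http://example.org/er/company/reuters"),
}

# Hypothetical Linked Data cloud: key -> pointers into other datasets.
LINKED_DATA: Dict[str, List[str]] = {
    "http://example.org/er/company/ibm": ["http://dbpedia.org/page/IBM"],
}


@dataclass
class Entity:
    name: str
    type: str
    key: str  # URI returned alongside the metadata (step 3)


def extract_entities(text: str) -> List[Entity]:
    """Steps 1-3: take unstructured text, return metadata with keys."""
    return [
        Entity(name, etype, key)
        for name, (etype, key) in LEXICON.items()
        if name in text
    ]


def follow_key(key: str) -> List[str]:
    """Steps 4-6: use a key to reach other Linked Data pointers."""
    return LINKED_DATA.get(key, [])


entities = extract_entities("IBM announced a partnership on Tuesday.")
pointers = {e.name: follow_key(e.key) for e in entities}
```

The point of the split is that the caller keeps only the keys; anything richer is fetched later by following them.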
Quick Demo
 You can find the Calais Viewer demonstration tool here:
 http://viewer.opencalais.com (Note that the Calais Viewer is not the
 Calais service. It is merely a demonstration of how the service works.)
  – Copy and paste the text of a business news article from AP, Dow Jones
    or Reuters.com into the viewer, and press submit. The article is sent to
    the Calais engine which tags the content and returns it, marked-up.
  – The tags appear on the left hand rail, and you can click on the plus (+)
    sign to see the tags expand.
  – Since we are now on Calais 4.0, you can also use the viewer to see the
    Linked Data assets related to the tags Calais returns.
       • Click on a company name on the left hand rail to find a Calais summary page
         featuring a basic description for that company, as well as a number of links.
       • Follow those links to see the other data entries on that company that are
         available for public use in the Linked Data Cloud.
  – For example, here is the Calais summary page for IBM:
    http://d.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633.html
  – And here is the summary page for IBM in DBpedia (Wikipedia content
    published as structured, machine-readable data): http://dbpedia.org/page/IBM
Why & What


 1. Derive semantic metadata from textual assets
 2. Use that semantic metadata to create entry points into
    the linked data ecosystem
 3. Provide a simple mechanism for the sharing of semantic
    metadata about textual content assets
 4. And just why are you doing this…
1: Semantics from Text: The Text Problem

                      • People consume text
                      • Most of it isn’t semantically enabled
                      • Most of it won’t be semantically
                        enabled
                      • This isn’t about standards –
                        microformats vs. RDFa vs.
                        whatever.


                      • Why: Latency, cost and short shelf life
1: Semantics from Text: The Text Problem
• Target areas where:
  – The economics don’t support metadata creation
  – The value of metadata is potentially high
  – The value of aggregated metadata is potentially extremely high

[Chart: Shelf Life (seconds to years) plotted against Latency (seconds to years). Content types, from shortest to longest: Tweets, New Gen News, Legacy News, Scient. Pubs, Great Novels]
2: Getting from Text to the Linked Data Ecosystem
The Linked Data Cloud
3: Semantic Metadata Transport Layer
                        • I’m a content producer.
                          We’ve loaded the car
                          with rich semantic
                          metadata


                           – I’m sharing it within my
                             four walls
                           – How do I transport it to
                             my consumers?
                           – RSS / Atom, XML,
                             Proprietary data feeds,
                             Content APIs
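As one way to picture the transport question, the sketch below attaches semantic tags to an Atom entry using the standard Atom `category` element (`term` and `scheme` attributes). It is an assumption-laden illustration, not a format the deck prescribes: the `atom_entry` helper, the `example.org` URLs and the scheme URI are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom as the default namespace


def atom_entry(entry_id, title, updated, link, tags):
    """Build an Atom <entry> whose <category> elements carry the metadata.

    `tags` is a list of (term, scheme) pairs; both values are hypothetical
    examples here, not identifiers issued by any real service.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM}}}link", href=link)
    for term, scheme in tags:
        ET.SubElement(entry, f"{{{ATOM}}}category", term=term, scheme=scheme)
    return entry


entry = atom_entry(
    "urn:example:article-1",
    "Example article",
    "2009-01-01T00:00:00Z",
    "http://example.org/articles/1",
    [("IBM", "http://example.org/schemes/company")],
)
xml_text = ET.tostring(entry, encoding="unicode")
```

A consumer that ignores the extra elements still gets a normal feed entry, which is the appeal of reusing an existing feed format as the carrier.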
4: Why We’re Doing It


• Two simple answers:


  – Hyper-evolution of capabilities – better, faster, stronger


  – The walled garden content world
How it Works – Under the Hood of Calais
[Diagram] Components shown:
• Calais Web Service
• ClearForest NLP Engine
  – Stat Tools
  – Rule Base
  – Lexicons
• Disambig. Engine
• Reference Data Assets
• RDF
• Metadata Management
  – Document Level Metadata
  – Entity Level Linked Data and …
• Output Formatting
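The stages named in this diagram can be imagined as a chain of small functions. The sketch below is a guess at the general shape only: the stage order, the function names, the reference data and the `example.org` vocabulary are all assumptions made for illustration, and simple substring matching stands in for the real NLP engine.

```python
# Hypothetical reference data: several surface forms map to one canonical URI.
REFERENCE_DATA = {
    "ibm": "http://example.org/company/ibm",
    "international business machines": "http://example.org/company/ibm",
}


def nlp_extract(text):
    """Stand-in for the NLP engine: find known surface forms in the text."""
    lowered = text.lower()
    return [form for form in REFERENCE_DATA if form in lowered]


def disambiguate(mentions):
    """Map each mention to its canonical URI, dropping duplicates."""
    return sorted({REFERENCE_DATA[m] for m in mentions})


def format_ntriples(doc_uri, entity_uris):
    """Output formatting: one N-Triples line per entity the document mentions."""
    predicate = "http://example.org/vocab/mentions"  # made-up predicate
    return [f"<{doc_uri}> <{predicate}> <{uri}> ." for uri in entity_uris]


text = "International Business Machines, better known as IBM, reported earnings."
mentions = nlp_extract(text)
triples = format_ntriples("http://example.org/doc/1", disambiguate(mentions))
```

Two different mentions collapse to a single URI before formatting, which is the job the disambiguation box performs in the diagram.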
Where From Here?
• We’ve seen examples of first generation uses.
• Where does this go in the future?
• Beyond the document
  – Social Resume analysis
  – Museum Content Coalitions
  – Knowledge Management Applications
  – Investigative Journalism*
Investigative Journalism



• FOIA Contract Documents → Calais Web Service → Company:Contract, Company:Affiliation
• News → Calais Web Service → Company:Person, FamilyRelation
• Both feed into one Big Fuzzy Graph
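A rough sketch of the "Big Fuzzy Graph" idea: relations extracted from different sources are merged into one graph, and a search then surfaces connections that no single document states. The relation types are taken from the slide; the names, the tuple layout and both helper functions are invented for illustration.

```python
from collections import defaultdict, deque

# Hypothetical extraction output: (source, relation type, subject, object).
relations = [
    ("foia", "Company:Contract", "Acme Corp", "City Water Dept"),
    ("foia", "Company:Affiliation", "Acme Corp", "Acme Holdings"),
    ("news", "Company:Person", "Acme Holdings", "J. Doe"),
    ("news", "FamilyRelation", "J. Doe", "R. Doe"),
]


def build_graph(rels):
    """Merge relations from all sources into one undirected graph."""
    graph = defaultdict(set)
    for _source, rel_type, a, b in rels:
        graph[a].add((rel_type, b))
        graph[b].add((rel_type, a))
    return graph


def connected(graph, start, goal):
    """Breadth-first search: is there any chain of relations between the two?"""
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            return True
        for _rel, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


graph = build_graph(relations)
linked = connected(graph, "City Water Dept", "R. Doe")
```

Here the contract comes from the FOIA documents and the family tie from the news, so the link only appears once both sources share a graph.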
What’s in the Pipeline?
• 2009 (this is a fuzzy list)


  – Person disambiguation @ domain level?
  – Other disambiguation
  – Continued expansion of URIs (entities & events)
  – Calais as hub
  – Exposure of the IDE?
  – User managed lexicons
  – Languages
  – Opt-in SPARQL Endpoint?
• www.opencalais.com

  – Gallery – code and applications examples
  – Forums
  – Documentation


• Twitter @opencalais, Facebook Group

Open Calais For SF And LA Meetups
