MJ no more:
Using Wikipedia Concurrent Edit Spikes
With Social Network Plausibility Checks
For Breaking News Detection
Thomas Steiner (tomac@google.com, @tomayac)
Seth van Hooland (svhoolan@ulb.ac.be, @sethvanhooland)
Ed Summers (edsu@loc.gov, @edsu)
News increasingly doesn't break on the newswire
First Story Detection on Realtime Social Networks
Typically based on Twitter because of its Streaming API [Twitter2012].
Try to detect spikes in time, locality, and text (often restricted to one domain, e.g., earthquake prediction).
A typical representative of this kind of approach is [Petrović2010].
High recall
Low precision
[Twitter2012] https://dev.twitter.com/docs/streaming-apis/streams/public
[Petrović2010] Saša Petrović, Miles Osborne, and Victor Lavrenko. 2010. Streaming first story detection with
application to Twitter. In Human Language Technologies: The 2010 Annual Conference of the North American
Chapter of the Association for Computational Linguistics (HLT '10). Association for Computational Linguistics,
Stroudsburg, PA, USA, 181–189.
Curation based on Wikipedia
Wikipedia page view logs are publicly available [Wikipedia2012] and updated on an hourly basis.
Osborne et al. have shown that there is a relation between Wikipedia page views and news events [Osborne2012].
Improves the approach of [Petrović2010] by using Wikipedia logs.
Key findings:
Wikipedia lags about 2 hours behind the news.
Newly created pages add noise.
[Wikipedia2012] http://dumps.wikimedia.org/other/pagecounts-raw/
[Osborne2012] M. Osborne, S. Petrović, R. McCreadie, C. Macdonald, I. Ounis. 2012. Bieber no more: First Story Detection using Twitter and Wikipedia. In SIGIR 2012 Workshop on Time-aware Information Access (#TAIA2012), Portland, Oregon, USA.
Key idea: invert the process
Use Wikipedia's live IRC stream of recent changes [WikipediaIRC2012], then do a sanity check on social networks.
[WikipediaIRC2012] http://meta.wikimedia.org/wiki/IRC/Channels#Raw_feeds
Introducing Wikipedia Live Monitor
Hooks into the Wikipedia recent changes IRC channels for all Wikipedia
locales.
Channel names follow the pattern
#language.project, e.g., #de.wikipedia
When an article gets edited, retrieve all language versions and treat them
as a cluster.
E.g., en:Albert_Einstein is in the same cluster as de:Albert_Einstein.
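The naming and clustering steps above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper names `channel_name`, `langlinks_url`, and `build_cluster` are hypothetical, and the use of the MediaWiki `langlinks` query to find language versions is an assumption, since the slides do not say how they are retrieved.

```python
from urllib.parse import urlencode


def channel_name(language: str, project: str = "wikipedia") -> str:
    """Build an IRC channel name following the #language.project pattern."""
    return f"#{language}.{project}"


def langlinks_url(language: str, title: str) -> str:
    """URL of a MediaWiki API query listing a page's other language versions.

    Assumption: the slides do not name the lookup mechanism; the standard
    action=query&prop=langlinks call is used here for illustration only.
    """
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "format": "json",
    }
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"


def build_cluster(language: str, title: str, langlinks: dict[str, str]) -> frozenset[str]:
    """Combine an edited article and its language versions into one cluster."""
    members = {f"{language}:{title}"}
    members.update(f"{lang}:{other}" for lang, other in langlinks.items())
    return frozenset(members)


# Hypothetical lookup result for en:Albert_Einstein (no network access here).
cluster = build_cluster("en", "Albert_Einstein", {"de": "Albert_Einstein"})
print(channel_name("de"))
print(sorted(cluster))
```

Representing the cluster as a `frozenset` lets an edit to any member be matched against the same cluster regardless of which language version was edited first.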
Breaking News Conditions
1) ≥5 Occurrences
An article cluster must have at least n edits (here n = 5) before it is considered a breaking news candidate.
2) ≤60 Seconds Between Edits
An article cluster may have at most n seconds (here n = 60) between edits in order to be regarded as a breaking news candidate.
3) ≥2 Concurrent Editors
An article cluster must be edited by at least n concurrent editors (here n = 2) before it is considered a breaking news candidate.
4) ≤240 Seconds Since Last Edit
An article cluster is thrown out of the monitoring loop if its last edit is more than n seconds ago (here n = 240).
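The four conditions can be combined into a single check, sketched below under stated assumptions. The class `Cluster` and its method names are hypothetical. The sketch reads condition 2 as applying to every pair of consecutive tracked edits and treats "concurrent editors" as distinct editor names; the slides do not spell out either detail. Edits are assumed to arrive in chronological order, with timestamps in seconds.

```python
from dataclasses import dataclass, field

MIN_OCCURRENCES = 5            # condition 1: at least 5 edits
MAX_SECONDS_BETWEEN = 60       # condition 2: at most 60 s between edits
MIN_EDITORS = 2                # condition 3: at least 2 editors
MAX_SECONDS_SINCE_LAST = 240   # condition 4: drop after 240 s of silence


@dataclass
class Cluster:
    """Edits observed for one article cluster, as (timestamp, editor) pairs."""
    edits: list[tuple[float, str]] = field(default_factory=list)

    def add_edit(self, timestamp: float, editor: str) -> None:
        self.edits.append((timestamp, editor))

    def is_stale(self, now: float) -> bool:
        """Condition 4: the last edit is too long ago to keep monitoring."""
        return bool(self.edits) and now - self.edits[-1][0] > MAX_SECONDS_SINCE_LAST

    def is_candidate(self, now: float) -> bool:
        """True if the cluster currently satisfies conditions 1 to 4."""
        if self.is_stale(now) or len(self.edits) < MIN_OCCURRENCES:
            return False
        times = [t for t, _ in self.edits]
        gaps_ok = all(b - a <= MAX_SECONDS_BETWEEN for a, b in zip(times, times[1:]))
        editors = {editor for _, editor in self.edits}
        return gaps_ok and len(editors) >= MIN_EDITORS


# Five edits, 30 s apart, by two alternating (made-up) editors.
busy = Cluster()
for i in range(5):
    busy.add_edit(30.0 * i, "alice" if i % 2 == 0 else "bob")

print(busy.is_candidate(now=130.0))  # True
print(busy.is_candidate(now=400.0))  # False: last edit was 280 s ago
```

A real monitor would also need to evict stale clusters from memory; here `is_stale` only reports the condition.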
Evaluation: Does it work at all?
Koninginnedag (http://twitpic.com/cn1vgf/full)
Champions League Semi Final BVB vs. RMD with Lewandowski (http://twitpic.com/clo0s0)
Boston Bombings (https://twitter.com/jason_koebler/statuses/323892465545388033, http://www.usnews.com/news/articles/2013/04/15/is-wikipedia-better-for-breaking-news-than-twitter)
Evaluation: How well does it work?
Lag time for global events: <5 min
Resignation of Pope Benedict XVI (http://en.wikipedia.org/wiki/Resignation_of_Pope_Benedict_XVI)
First three edit times (UTC) after the news broke on Feb 11, 2013:
● English Wikipedia article: 10:58, 10:59, 11:02
● French Wikipedia article: 11:00, 11:00, 11:01
This implies that by looking at only two language versions of the Pope article (the actual number of monitored versions is 42), the system would have reported the news at 11:01.
The Twitter account of Reuters announced the news at 10:59.
Vatican Radio's announcement was made at 10:57:47.
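The 11:01 figure and the resulting lag can be checked with a short calculation. This is a sketch under assumptions: the edit times are given only to the minute, so each is treated as falling at second :00, and the report time is taken to be the moment the fifth edit across both language versions arrives (the ≥5 occurrences condition). Editor names are not listed in the slides, so the concurrent-editor condition cannot be checked here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def at(clock: str) -> datetime:
    """Parse a UTC clock time on Feb 11, 2013."""
    return datetime.strptime("2013-02-11 " + clock, FMT)


# Edit times from the slide; seconds are assumed to be :00.
english = ["10:58:00", "10:59:00", "11:02:00"]
french = ["11:00:00", "11:00:00", "11:01:00"]

edits = sorted(at(clock) for clock in english + french)
fifth_edit = edits[4]  # the fifth edit satisfies the >=5 occurrences condition

# Largest gap between consecutive edits up to the fifth one, in seconds.
max_gap = max((b - a).total_seconds() for a, b in zip(edits[:5], edits[1:5]))

lag_vatican = (fifth_edit - at("10:57:47")).total_seconds()
lag_reuters = (fifth_edit - at("10:59:00")).total_seconds()

print(fifth_edit.strftime("%H:%M"))  # 11:01
print(max_gap)                       # 60.0
print(lag_vatican, lag_reuters)      # 193.0 120.0
```

Under these assumptions the system trails Vatican Radio by 3 min 13 s and the Reuters tweet by 2 min, consistent with the stated lag of under 5 minutes.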
Future Work
Work with realtime page view logs in addition to page edit logs (API format currently being defined by Wikimedia)
News categorization and classification
E.g., Category Living-Persons removed from a person's article implies (sad) news
Reduce the false-positive rate, strengthen the connection between social networks and actual article edits
Auto notification system upon breaking news candidates
Pre-announcement: follow @WikiLiveMon
Demo and thank you
Play with the system at http://wikipedia-irc.herokuapp.com/
Read the paper at http://arxiv.org/abs/1303.4702
Ask questions here or via tomac@google.com & @tomayac
