Toward Formal Reasoning with Epistemic Policies About Information Quality in the Twittersphere
Brian Ulicny, VIStology, Inc., bulicny@vistology.com
Mieczyslaw Kokar, Northeastern University and VIStology, Inc., kokar@coe.neu.edu
VIStology, Inc - Fusion 2011

Arab Spring Uprisings 2011
Situation Awareness (?): Al Jazeera’s Twitter Monitor
Situation Awareness: Attention Spikes from Twitter
Situation Awareness: Flu Trends from Social Media. Aron Culotta, "Detecting influenza outbreaks by analyzing Twitter messages," arXiv:1007.4748v1 [cs.IR], 27 Jul 2010.
Twitter as Open Source Intel
Confidence = <Reliability, Credibility>
Problem Statement
How can we assess not only the volume of tweets per time period and the frequency of the terms they contain, but also the reliability, credibility, and confidence of the information they convey, in a potentially adversarial situation?

Naïve STANAG 2022 for Twitter
Reliability = F: Cannot Be Judged, because all Twitter accounts are "sources not used in the past."
Credibility = 1: Confirmed by Other Sources, whenever there are more than two string-identical tweets?
Or Credibility = 3: Possibly True, because the sources are not independent, given that there is a path between all sources in the Twitter graph.

Need Tractable Way to Calculate:
Twitter Source Reliability
Twitter Content Credibility
Twitter Source Independence
Where the entire Twitter graph contains:
105 million users (as of April 2010)
55 million tweets per day
3 billion requests per day to the Twitter API

The Argument from Google
There are too many Twitter sources to evaluate their reliability directly.
However, Google has shown that there is great value in using eigenvector centrality (PageRank) as a proxy for reliability.
Therefore, we assume that a PageRank-like metric correlates with reliability because:
(1) We assume that people do not pass along information they believe to be unreliable.
(2) Eigenvector centrality/retweet influence, unlike simple indegree centrality, is difficult to fake.

Not Every Twitter User is Real: CENTCOM Operation Earnest Voice
TunkRank as Reliability
Influence(X) = expected number of people who will read a tweet that X tweets, including all retweets of that tweet.
For simplicity, we assume that, if a person reads the same message twice (because of retweets), both readings count.
If X is a member of Followers(Y), then there is a 1/||Following(X)|| probability that X will read a tweet posted by Y, where Following(X) is the set of people that X follows.
If X reads a tweet from Y, there’s a constant probability p that X will retweet it.
D. Tunkelang. 2009. A Twitter Analog to PageRank. http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/

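The definition above can be read as a recurrence: each follower X of Y contributes (1 + p * Influence(X)) / ||Following(X)|| to Influence(Y). The following is a minimal sketch of a fixed-point iteration over that recurrence, not the TunkRank service itself. The function name `tunkrank`, the dictionary-based graph, the default `p`, and the iteration count are all assumptions made for illustration.

```python
def tunkrank(following, p=0.1, iterations=50):
    """Approximate TunkRank-style influence by fixed-point iteration.

    following: dict mapping each user to the set of users they follow.
    p: assumed constant retweet probability (hypothetical default).
    """
    # Collect every user that appears as a follower or as someone followed.
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the graph: followers[y] is the set of users who follow y.
    followers = {u: set() for u in users}
    for x, followed in following.items():
        for y in followed:
            followers[y].add(x)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower x reads y's tweet with probability 1/|Following(x)|
        # and, with probability p, passes it on to x's own audience.
        influence = {
            y: sum((1.0 + p * influence[x]) / len(following[x])
                   for x in followers[y])
            for y in users
        }
    return influence


# Toy graph: a and b follow only c; d follows both a and c.
example_following = {
    "a": {"c"},
    "b": {"c"},
    "c": set(),
    "d": {"a", "c"},
}
scores = tunkrank(example_following, p=0.1)
```

In this toy graph, c is read by three followers and additionally inherits a little influence through a, whose own tweets reach d. Accounts with no followers score zero regardless of how many accounts they follow, which is the property that makes the metric harder to inflate than a raw follower count.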
TunkRank as Reliability (figures): TunkRank vs. Indegree Centrality (log scale); Mapping TunkRank to STANAG 2022 Reliability
Unreliability Indicators
If X retweets a message, e.g. "RT @Whitehouse Zombie uprising in Scranton", and there is no corresponding original tweet, then X is E: Unreliable.
If X tweets a message with the same URL (shortened or dereferenced) but different content more than twice, then X is D: Not Usually Reliable.
(On the other hand: Verification: Reliability)

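The two indicators above can be sketched as simple checks over a collection of (author, text) pairs. This is an illustrative sketch under several assumptions: the retweet pattern, the reading of "more than twice" as more than two distinct texts sharing one URL, and the function name `unreliability_ratings` are all hypothetical, and shortened URLs are compared literally rather than dereferenced.

```python
import re
from collections import defaultdict

# Assumed retweet convention: "RT @user text" (an optional colon is tolerated).
RT_PATTERN = re.compile(r"^RT @(\w+):?\s+(.*)$")
URL_PATTERN = re.compile(r"https?://\S+")


def unreliability_ratings(tweets):
    """Return {author: 'E' or 'D'} for authors who trip an indicator.

    tweets: iterable of (author, text) pairs.
    """
    tweets = list(tweets)
    # Every non-retweet counts as a possible original for later matching.
    originals = {(author.lower(), text.strip())
                 for author, text in tweets
                 if not RT_PATTERN.match(text)}

    ratings = {}
    bodies_by_author_url = defaultdict(set)
    for author, text in tweets:
        match = RT_PATTERN.match(text)
        if match:
            claimed = (match.group(1).lower(), match.group(2).strip())
            if claimed not in originals:
                # Retweet of a message the named source never posted.
                ratings[author] = "E"
            continue
        body = URL_PATTERN.sub("", text).strip()
        for url in URL_PATTERN.findall(text):
            bodies_by_author_url[(author, url)].add(body)

    for (author, _url), bodies in bodies_by_author_url.items():
        # Same URL attached to more than two different messages.
        if len(bodies) > 2 and ratings.get(author) != "E":
            ratings[author] = "D"
    return ratings


sample = [
    ("whitehouse", "Budget briefing at noon"),
    ("alice", "RT @whitehouse Budget briefing at noon"),
    ("bob", "RT @Whitehouse Zombie uprising in Scranton"),
    ("carol", "Win a phone http://example.com/x"),
    ("carol", "Breaking news http://example.com/x"),
    ("carol", "Free tickets http://example.com/x"),
]
ratings = unreliability_ratings(sample)
```

Here alice's retweet matches an original and is left unrated, bob's has no corresponding original, and carol reuses one URL under three different messages. In practice the set of originals would have to come from the named account's timeline rather than from the search results alone.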
Source Independence
There is a path connecting (nearly) every user in the Twitter graph.
This does not mean that there is no source independence in Twitter.
We count any sources as independent if they originate the message, and the shortest path between them is ≥ 4.
In T.H. dataset, 4/20 tweets cite same NY Times URL via 3 shortened URLs. So, not independent.
Other news sources: 2 cite Guardian, 1 BBC, 1 Der Spiegel, 1 WaPo, 1 Times of London.
No explicit retweets, no implicit retweets => 16 originating sources.
Compute distance between remaining sources.

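The distance test above amounts to a breadth-first search between two originating sources. The sketch below assumes the follow graph is treated as undirected and is already held in memory as an adjacency dictionary; the helper names and the treatment of unreachable pairs as independent are illustrative assumptions.

```python
from collections import deque


def build_undirected(edges):
    """Build an adjacency dict from (user, user) pairs, ignoring direction."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, start, goal):
    """Return the hop count between start and goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def are_independent(graph, a, b, min_distance=4):
    """Two originating sources are independent if they are >= 4 hops apart."""
    dist = shortest_path_length(graph, a, b)
    return dist is None or dist >= min_distance


# Toy chain: s1 - s2 - s3 - s4 - s5
chain = build_undirected([("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s4", "s5")])
```

With this chain, `are_independent(chain, "s1", "s5")` holds because the sources are four hops apart, while `s1` and `s4` are only three hops apart and so fail the test. Since only distances below the threshold matter, the search can stop expanding once that depth is reached, which keeps the computation local to each source's neighborhood.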
Sameness of Content
String-identical tweets are not independent. Implicit retweets:
@BWJones: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T (4/20/2011 6:16:25 PM)
@Frieze_magazine: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T (4/20/2011 7:01:30 PM)
Custom regexes to handle dead/alive:
Tweet =~ (<subject> .* (dead|died|killed|not alive|RIP)) && Tweet !~ (<subject> .* (not (dead|died|killed))) => Dead
E.g. "Tim Hetherington, Restrepo director has been killed in Misurata"
OR: Tweet =~ (<subject> .* (alive|(not (killed|dead|died)))) && Tweet !~ (<subject> .* (not alive|RIP)) => Alive
E.g. "C H still alive." (true positive); "Wish T H were still alive" (false positive)
Misses: "C H in serious condition" (|= alive)
Credibility from report counts:
> 2x P vs not-P: Confirmed P; Improbable not-P
> 1.5x P vs not-P: Probably True P; Doubtful not-P
~same P, not-P: Possibly True P; Possibly True not-P
435 tweets report C H dead vs 7 C H alive: Confirmed: C H dead; Improbable: C H not dead.

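The dead/alive patterns and the count thresholds above translate fairly directly into code. The sketch below is one possible reading, not the authors' implementation: the word boundaries, case-insensitive matching, function names, and the decision to treat every ratio at or below 1.5x as "roughly the same" are assumptions added for illustration.

```python
import re


def classify(tweet, subject):
    """Return 'dead', 'alive', or None for a tweet mentioning subject."""
    s = re.escape(subject)
    flags = re.IGNORECASE  # assumption: the slide does not specify case handling

    says_dead = re.search(rf"{s}.*\b(dead|died|killed|not alive|RIP)\b", tweet, flags)
    negates_dead = re.search(rf"{s}.*\bnot (dead|died|killed)\b", tweet, flags)
    says_alive = re.search(rf"{s}.*\b(alive|not (killed|dead|died))\b", tweet, flags)
    negates_alive = re.search(rf"{s}.*\b(not alive|RIP)\b", tweet, flags)

    if says_dead and not negates_dead:
        return "dead"
    if says_alive and not negates_alive:
        return "alive"
    return None


def rate_claims(p_count, not_p_count):
    """Return (rating for P, rating for not-P) from report counts."""
    if not_p_count > p_count:
        # Apply the same thresholds with the roles of P and not-P swapped.
        not_p_rating, p_rating = rate_claims(not_p_count, p_count)
        return p_rating, not_p_rating
    if p_count > 2 * not_p_count:
        return "Confirmed", "Improbable"
    if p_count > 1.5 * not_p_count:
        return "Probably True", "Doubtful"
    return "Possibly True", "Possibly True"


label = classify(
    "Tim Hetherington, Restrepo director has been killed in Misurata",
    "Tim Hetherington",
)
verdict = rate_claims(435, 7)
```

The sketch reproduces the behavior noted on the slide, including its weaknesses: "Wish T H were still alive" is still classified as alive, and "C H in serious condition" is not classified at all. The counts passed to `rate_claims` would ideally be counts of independent sources rather than raw tweets.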
Recap: Algorithm
Identify the set of tweets via the Search API, searching on the name.
Classify them into Dead/Alive content.
Calculate TunkRank on users.
Discount false retweeters.
Calculate source independence: group same media URLs, retweets, and implicit retweets; calculate the distance between sources for a joint network two hops out for each source.
Example: @NYTImesPhoto: An attack in Misurata, Libya today killed the photographer Tim Hetherington. (4/20/2011 7:11:15 PM)
TunkRank: 99th percentile; > 5 independent sources assert T H died; 0 alive. <A: Completely Reliable, 1: Confirmed by Other Sources>
Example: @Cmovila: Sad news Tim Hetherington died in Misrata now when covering the front line. (4/20/2011 4:39:57 PM)
TunkRank: 0th percentile; > 5 independent sources assert T H died; 0 alive. <E: Unreliable, 1: Confirmed by Other Sources>
T H Alive: 5: Improbable

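The assessments in the recap are pairs of a reliability letter and a credibility number. A small data structure makes that pairing explicit; the sketch below uses the labels that appear in the slides and fills in the remaining grades (B, C, 2, 4, 6) from the commonly cited STANAG 2022 scale. The class name `Confidence` and the output format are assumptions, and the mapping from TunkRank percentile to a letter grade is deliberately left out because the slides do not state its thresholds.

```python
from dataclasses import dataclass

# Reliability of the source (letter) and credibility of the content (number).
RELIABILITY = {
    "A": "Completely Reliable",
    "B": "Usually Reliable",
    "C": "Fairly Reliable",
    "D": "Not Usually Reliable",
    "E": "Unreliable",
    "F": "Cannot Be Judged",
}
CREDIBILITY = {
    1: "Confirmed by Other Sources",
    2: "Probably True",
    3: "Possibly True",
    4: "Doubtful",
    5: "Improbable",
    6: "Cannot Be Judged",
}


@dataclass(frozen=True)
class Confidence:
    """Confidence = <Reliability, Credibility> for one tweet."""
    reliability: str
    credibility: int

    def __post_init__(self):
        # Reject grades outside the two scales.
        if self.reliability not in RELIABILITY:
            raise ValueError(f"unknown reliability grade: {self.reliability!r}")
        if self.credibility not in CREDIBILITY:
            raise ValueError(f"unknown credibility grade: {self.credibility!r}")

    def __str__(self):
        return (f"<{self.reliability}: {RELIABILITY[self.reliability]}, "
                f"{self.credibility}: {CREDIBILITY[self.credibility]}>")


high = Confidence("A", 1)
low = Confidence("E", 1)
```

Printing `high` and `low` yields the two assessments shown in the recap examples. Keeping the source grade and the content grade separate preserves the point of the second example: an unreliable account can still carry a claim that is confirmed elsewhere.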
Notional Architecture (diagram). Components: Twitter Search API; Tweet to RDF Conversion; Message Classifier; Twitter API; BaseVISor Inference Engine; TunkRank API; Distance Calculator. Output: Tweets Augmented with STANAG 2022 Assessments.
Conclusions
Treating all tweets as equally legitimate is OK in non-adversarial, high-volume situations.
As OSINT, tweets need to be evaluated according to the STANAG 2022 rubric.
We have outlined tractable ways to calculate reliability (TunkRank), credibility (sameness of content), and source (in)dependence.
By converting tweets to RDF, we can reason about them formally with a formal reasoner (BaseVISor).
Future work: a large-scale demonstration showing efficacy in distinguishing low-confidence death rumors from high-confidence death notices on Twitter.

Questions?
