Searching Twitter:
Separating the Tweet
from the Chaff
Jonathan Hurlock & Max L. Wilson
You sure can!
How do I follow my Interests?

http://www.flickr.com/photos/stevegarfield/5397972626/
http://www.flickr.com/photos/apelad/3684843147/
Yet more Data
Meta Data, Profile Data, Linked Data
Any of it Useful?
Who cares how much data there is!

“I think the challenge not only for Twitter, but for
the technology industry at large, is building
more relevant filters, in real time. Like being
able to surface valuable information
immediately. No matter who it is, who's
listening or who's broadcasting, is a really
really hard problem, and it makes Twitter a lot
more meaningful [...] We've gotten really really
good at being able to put content in, into media
[...] getting it out in a relevant, valuable way,
in real time is still very difficult.”

- Jack Dorsey (Creator of Twitter)
Why Twitter?
Where is the value?




Let's go back...
                  Great Scott!




http://www.flickr.com/photos/milesdeelite/5309712846/
Asking Friends
Hey, what are you doing?




                           you   &   me
Social Search
What is everyone else doing?




                               you   &   me
friend                    friend       friend




Social Search
What is everyone else doing?




     friend                    you      &    me
bob             &        lisa



        Existing Knowledge
        No need to reinvent the wheel




                                                                    you            &         me


Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a
survey study of status message & behavior. In Proceedings of the 28th international conference on Human factors in
computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.
lisa

        Existing Knowledge                                                 bob         &        me
        No need to reinvent the wheel


                                                                                    you




Let's go back to the network
Remember...




                     you   &   me
friend           friend    friend




and if we take a step back...
Please mind the gap




     friend            you       me
We start to see interesting things...
Which have value!
Location, experiences, temporal data




Yardi, Sarita and Boyd, Danah. Tweeting from the Town Square: Measuring Geographic Local Networks. ICWSM 2010.
http://www.flickr.com/photos/24423474@N08/4999891492/
http://www.flickr.com/photos/mdid/4560003881/
http://www.flickr.com/photos/seanhobson/3256437306/
http://www.flickr.com/photos/gcaw/5445225362/
http://en.wikipedia.org/wiki/File:Plane_crash_into_Hudson_River_(crop).jpg
Location, experiences, temporal data
Political upheaval, emergency events... so what are you tweeting now?




Twitter Search
How do you find useful information?
Displaying Results
Realtime
Displaying Results


                                                          RT
                  Time, ReTweets, Location, Popularity?




http://www.flickr.com/photos/publicenergy/394124407/
Displaying Results
           Making sense of the data.




Michael S. Bernstein, Bongwon Suh, Lichan Hong, Jilin Chen, Sanjay Kairam, Ed H. Chi. Eddi: Interactive Topic-based Browsing of Social Status Streams.
In Proc. of ACM User Interface Software and Technology (UIST) conference, Oct. 2010. New York, NY.
Displaying Results
           Making sense of the data.




Diakopoulos, N.; Naaman, M.; Kivran-Swaine, F.; , "Diamonds in the rough: Social media visual analytics for journalistic inquiry,"
Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on , vol., no., pp.115-122, 25-26 Oct. 2010
Interestingness
                 Not necessarily useful!




  Naveed, Nasir and Gottron, Thomas and Kunegis, Jérôme and Alhadi, Arifah Che (2011) Bad News Travel Fast: A Content-based Analysis of
               Interestingness on Twitter. pp. 1-7. In: Proceedings of the ACM WebSci'11, June 14-17 2011, Koblenz, Germany.
http://www.flickr.com/photos/wwarby/2460655511/
How are we different?
What makes us unique?
Finding Usefulness!
                   What constitutes a useful Tweet?








http://www.flickr.com/photos/edduddiee/4346349664/
The Method
How did we go about this?
Information Seeking
3 Information Seeking Tasks

Teevan, J., Ramage, D., & Morris, M. R. (2011). #TwitterSearch: a comparison of microblog search and web search. WSDM
'11: Proceedings of the fourth ACM international conference on Web search and data mining (pp. 35-44). New York, NY, USA:
ACM.




http://www.bbc.co.uk/proms/2010/share/badgewidget.shtml

http://www.flickr.com/photos/ivyfield/4731067396/

http://www.flickr.com/photos/anniemole/241655156/
20 Participants
They were really nice people!
Search Interface
A simple, easy to understand interface
Think aloud + Interviews
To help us provide more insight

It’s useful because...
I didn’t because...
Analysis
Lots and lots of it!
Grounded Theory
         Inductive Coding = Lots of Post-its!




Glaser, B. G., & Strauss, A. L. (2009).
The Discovery of Grounded Theory: strategies for qualitative research.
Piscataway, New Jersey, USA: Transaction Publishers.
Kappa Analysis
       Cohen... Fleiss....




Landis, R. J., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data. Biometrics , 33 (1),
159-174.
Extended Kappa Analysis
Multi Coded Kappa

0.73 (Substantial Agreement) between evaluators
0.62 (Substantial Agreement) with an independent untrained coder

Harris, J. K., & Burke, R. C. (2005). Do you see what I see? An application of inter-coder reliability in qualitative analysis.
American Public Health Association 133rd Annual Meeting & Exposition. Washington, DC, USA: American Public Health
Association.
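
To make the agreement figures above more concrete, here is a minimal sketch of plain two-coder Cohen's kappa, together with the Landis & Koch (1977) labels used to interpret it. This is not the extended multi-coded kappa of Harris & Burke that the slides report; it only illustrates the basic idea of correcting observed agreement for chance. The function names and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each gave one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis & Koch (1977) agreement label."""
    if kappa < 0.0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost Perfect"


# Hypothetical judgments: U = useful, N = not useful.
coder_a = ["U", "U", "N", "N", "U", "N", "U", "N", "U", "U"]
coder_b = ["U", "U", "N", "N", "U", "N", "U", "U", "U", "U"]
kappa = cohen_kappa(coder_a, coder_b)
print(round(kappa, 2), landis_koch(kappa))
```

Under this scale both values reported on the slide, 0.73 and 0.62, fall in the 0.61 to 0.80 band, which is why both are labelled substantial agreement.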
What did we find?
Useful & Not-Useful
Useful

In Tweet Content
    Experience               Someone reporting a personal experience, but not necessarily a suggestion or direction.
    Direct Recommendation    Someone making a direct recommendation, but not necessarily relaying a personal experience.
    Social Knowledge         Contains information that is spreading socially, or becoming general knowledge.
    Specific Information     Facts are listed directly in the tweet, e.g. prices, times, etc.

Reflection on Tweet
    Entertaining             The reader finds the tweet amusing.
    Shared Sentiment         The reader agrees with the author of the tweet.

Relevant
    Time                     The time is current.
    Location                 The location is relevant to the query.
Useful (cont.)

Trust
    Trusted Author           The Twitter account has a reputation / following.
    Trusted Avatar           The visual appearance cultivates trust.
    Trusted Link             A link to a trustworthy, recognisable domain.

Links
    Actionable Link          The user can perform a transaction by using the link (heavily dependent on trust).
    Media Link               The link is to rich multimedia content.
    Useful Link              The link provides valuable information content, e.g. authoritative information, educated reviews.

Meta Tweet
    ReTweeted Lots           It's information that others have passed on many times.
    Conversation             It's part of a series of tweets, and they all need to be useful.
Not Useful

Tweet Content
    No Information           Absence of any content, event, or factual points.
    Introspective            Personal content and personal thoughts of no social benefit.
    Off Topic                Result is not related to the query given (TF-IDF irrelevant).
    Too Technical            The content requires specific domain knowledge the reader doesn't possess.
    Poorly Constructed       Tweets that may have grammatical or spelling errors, or malformed URLs.

Bad Tweets
    SPAM                     Irrelevant or inappropriate messages.
    Wrong Language           Messages written in a language foreign to the reader.
    Dead Link                A URL which does not work, i.e. a 404.

Not Relevant
    Time                     Out-of-date content.
    Location                 Wrong geographic location.
Not Useful (cont.)

Trust
    Un-trusted Author        An author the reader feels uneasy about or suspicious of.
    Un-trusted Link          A link the reader feels is suspicious.

Subjective
    Perspective Oriented     A tweet that is perspective centric, meaning the author is providing their view or projecting
                             an attitude on a subject matter or to a subject / reader.
    Disagree with Tweet      A conflict of agreement between the reader and the author.
    Not Funny                A tweet that aims to be humorous, which the reader does not find humorous.

Meta Tweet
    QnA                      Part of a conversation; the reader wants the whole conversation, not just the question or the answer.
    Repeated                 Content the reader has seen before.
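
The "Off Topic" code above mentions TF-IDF relevance. As a rough illustration only, the sketch below scores tweets against a query using smoothed TF-IDF weights and cosine similarity, with a score of zero meaning no shared terms. The slides do not describe how relevance was actually computed, so the tokenizer, the smoothing, the function names, and the example query and tweets are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word-like tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency, so every term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_against_query(query, tweets):
    """Return one similarity score per tweet; 0.0 means no term overlap."""
    vectors = tfidf_vectors([query] + tweets)
    return [cosine(vectors[0], v) for v in vectors[1:]]


# Hypothetical query and tweets, for illustration only.
query = "proms tickets royal albert hall"
tweets = [
    "Queued two hours for Proms tickets at the Royal Albert Hall, worth it",
    "Just had a sandwich for lunch",
]
scores = score_against_query(query, tweets)
for tweet, score in zip(tweets, scores):
    print(f"{score:.2f}  {tweet}")
```

A term-overlap score like this can only flag a tweet as off topic; most of the other codes in the scheme (trust, sentiment, humour) depend on the reader and would not be captured by it.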
Insights
                 Interesting finds




http://www.flickr.com/photos/foxmulderven/3063598624
The Possible Impact
Where could we see the impact of this work?
Search System
A work in progress
Conclusions
So just remember.
Thank you for Listening
Jonathan Hurlock
   @jonhurlock

Max L. Wilson
   @gingdottwit


                  Like the talk? Then please tweet it, by quickly visiting:
                          http://moourl.com/LikedTheTalk

Searching Twitter: Separating the Tweet from the Chaff

  • 1. Searching Twitter: Separating the Tweet from the Chaff Jonathan Hurlock & Max L. Wilson
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. llow I fo ? do ests H ow ter y In m http://www.flickr.com/photos/stevegarfield/5397972626/
  • 9.
  • 10. Yet more Data Meta Data, Profile Data, Linked Data
  • 11. Any of it Useful? Who cares how much data there is! “I think the challenge not only for twitter, but for the technology industry at large. Is building more relevant filters, in real time. Like being able to surface valuable information immediately. No matter who it is, whoʼs listening or whoʼs broadcasting, is a really really hard problem, and it makes twitter alot more meaningful[... ]Weʼve gotten really really good at being able to put content in, into media [...] getting it out in a relevant, valueable way, in real time is still very difficult.” - Jack Dorsey (Creator of Twitter)
  • 12. Why Twitter? Where is the value? $ ₧ ƒ ! ₥ ₧ ₤ ₣ ¢ ₠! ! ₣£ ₡ ₱£ ! ₧ ₡ ¢ ₤ ₠ ₱! ₥ ₣ $ ƒ
  • 14. Lets go back... Great Scott! http://www.flickr.com/photos/milesdeelite/5309712846/
  • 15. Asking Friends Hey, what are you doing? you & me
  • 16. Social Search What is everyone else doing? you & me
  • 17. friend friend friend Social Search What is everyone else doing? friend you & me
  • 18. bob & lisa Existing Knowledge No need to reinvent the wheel you & me Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a survey study of status message & behavior. In Proceedings of the 28th international conference on Human factors in computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.
  • 19. lisa Existing Knowledge bob & me No need to reinvent the wheel you Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a survey study of status message & behavior. In Proceedings of the 28th international conference on Human factors in computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.
  • 20. Lets go back to the network Remember... you & me
  • 21. friend friend friend and if we take a step back... Please mind the gap friend you me
  • 22. We start to see interesting things...
  • 24. Location, experiences, temporal data Yardi, Sarita and Boyd, Danah. ICWSM 2010. http://www.flickr.com/photos/24423474@N08/4999891492/ http://www.flickr.com/photos/mdid/4560003881/ Tweeting from the Town Square: Measuring Geographic http://www.flickr.com/photos/seanhobson/3256437306/ Local Networks http://www.flickr.com/photos/gcaw/5445225362/ http://en.wikipedia.org/wiki/File:Plane_crash_into_Hudson_River_(crop).jpg
  • 25. Location, experiences, temporal data Political upheaval, emergency events .. so what are you tweeting now? Yardi, Sarita and Boyd, Danah. ICWSM 2010. http://www.flickr.com/photos/24423474@N08/4999891492/ http://www.flickr.com/photos/mdid/4560003881/ Tweeting from the Town Square: Measuring Geographic http://www.flickr.com/photos/seanhobson/3256437306/ Local Networks http://www.flickr.com/photos/gcaw/5445225362/ http://en.wikipedia.org/wiki/File:Plane_crash_into_Hudson_River_(crop).jpg
  • 26. Twitter Search How do you find useful information?
  • 28. Displaying Results RT Time, ReTweets, Location, Popularity? http://www.flickr.com/photos/publicenergy/394124407/
  • 29. Displaying Results RT Time, ReTweets, Location, Popularity? http://www.flickr.com/photos/publicenergy/394124407/
  • 31. Displaying Results Making sense of the data. Michael S. Bernstein, Bongwon Suh, Lichan Hong, Jilin Chen, Sanjay Kairam, Ed H. Chi. Eddi: Interactive Topic-based Browsing of Social Status Streams. In Proc. of ACM User Interface Software and Technology (UIST) conference, Oct. 2010. New York, NY.
  • 32. Displaying Results Making sense of the data. Diakopoulos, N.; Naaman, M.; Kivran-Swaine, F.; , "Diamonds in the rough: Social media visual analytics for journalistic inquiry," Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on , vol., no., pp.115-122, 25-26 Oct. 2010
  • 33. Interestingness Not necessarily useful! Naveed, Nasir and Gottron, Thomas and Kunegis, Jérôme and Alhadi, Arifah Che (2011) Bad News Travel Fast: A Content-based Analysis of Interestingness on Twitter. pp. 1-7. In: Proceedings of the ACM WebSci'11, June 14-17 2011, Koblenz, Germany. http://www.flickr.com/photos/wwarby/2460655511/
  • 34. How we are different? What makes us unique?
  • 35. Finding Usefulness! What constitutes a useful Tweet? fuln ess use http://www.flickr.com/photos/edduddiee/4346349664/
  • 36. The Method How did we go about this?
  • 37. Teevan, J., Ramage, D., & Morris, M. R. (2011). #TwitterSearch: a comparison of microblog search and web search. WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining (pp. 35-44). New York, NY, USA: ACM. Information Seeking 3 Information Seeking Tasks http://www.bbc.co.uk/proms/2010/share/badgewidget.shtml http://www.flickr.com/photos/ivyfield/4731067396/ http://www.flickr.com/photos/anniemole/241655156/
  • 38. 20 Participants They were really nice people!
  • 39. Search Interface A simple, easy to understand interface
  • 41. Think aloud + Interviews To help us provide more insight. "It’s useful because..." "I didn’t because..."
  • 43. Grounded Theory Inductive Coding = Lots of Post-its! Glaser, B. G., & Strauss, A. L. (2009). The Discovery of Grounded Theory: strategies for qualitative research. Piscataway, New Jersey, USA: Transaction Publishers.
  • 44. Kappa Analysis Cohen... Fleiss... Landis, R. J., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data. Biometrics, 33 (1), 159-174.
  • 45. Extended Kappa Analysis Multi Coded Kappa 0.73 (Substantial Agreement) Between Evaluators & 0.62 (Substantial Agreement) with Independent Untrained Coder Harris, J. K., & Burke, R. C. (2005). Do you see what I see? An application of inter-coder reliability in qualitative analysis. American Public Health Association 133rd Annual Meeting & Exposition. Washington, DC, USA: American Public Health Association.
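
The agreement figures on the kappa slides can be illustrated with a minimal sketch. It assumes the simplest case: two coders who each assign exactly one label per tweet, scored with plain Cohen's kappa. This is not the multi-coded extended kappa (Harris & Burke) that the slides report, and the function names and sample labels are hypothetical. The interpretation bands follow Landis & Koch (1977).

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders assigning one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label proportions.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis & Koch (1977) agreement band."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical labels from two coders for four tweets.
coder_1 = ["useful", "useful", "not useful", "not useful"]
coder_2 = ["useful", "not useful", "not useful", "not useful"]
kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 2), landis_koch(kappa))
```

Under these bands, both values reported on the slide (0.73 and 0.62) fall in the 0.61 to 0.80 range labelled substantial agreement.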
  • 46. What did we find? Useful & Not-Useful
  • 47. Useful
      In Tweet Content
        Experience: Someone reporting a personal experience, but not necessarily suggestion / direction.
        Direct Recommendation: Someone making a direct recommendation, but not necessarily relaying a personal experience.
        Social Knowledge: Containing information that is spreading socially, or becoming general knowledge.
        Specific Information: Where facts are listed directly in tweets, e.g. prices, times etc.
      Reflection on Tweet
        Entertaining: The reader finds them amusing.
        Shared Sentiment: The reader agrees with the author of the tweet.
      Relevant
        Time: The time is current.
        Location: The location is relevant to the query.
  • 48. Useful (cont.)
      Trust
        Trusted Author: The Twitter account has a reputation / following.
        Trusted Avatar: The visual appearance cultivates trust.
        Trusted Link: A link to a trustworthy, recognisable domain.
      Links
        Actionable Link: The user can perform a transaction by using the link (heavily dependent on trust).
        Media Link: The link is to rich multimedia content.
        Useful Link: The link provides valuable information content, e.g. authoritative information, educated reviews.
      Meta Tweet
        ReTweeted Lots: It's information that others have passed on a lot.
        Conversation: It's part of a series of tweets, and they all need to be useful.
  • 49. Not Useful
      Tweet Content
        No Information: Absence of anything, event, factual points.
        Introspective: Personal content and personal thoughts for no social benefit.
        Off Topic: Result not related to the given query / TF-IDF irrelevant.
        Too Technical: The content requires specific domain knowledge the reader doesn’t possess.
      Bad Tweets
        Poorly Constructed: Tweets that may have grammatical / spelling errors, or malformed URLs.
        SPAM: Irrelevant or inappropriate messages.
        Wrong Language: Messages sent in a language foreign to the reader.
        Dead Link: A URL which does not work, i.e. a 404.
      Not Relevant
        Time: Out of date content.
        Location: Wrong geographic location.
  • 50. Not Useful (cont.)
      Trust
        Un-trusted Author: An author the reader feels uneasy about or suspicious of.
        Un-trusted Link: A link the reader feels is suspicious.
      Subjective
        Perspective Oriented: A tweet that is perspective centric, meaning the author is providing their view or projecting an attitude on a subject matter or to a subject / reader.
        Disagree with Tweet: A disagreement between the reader and the author.
        Not Funny: A tweet that is aimed to be humorous, which the reader does not feel is humorous.
      Meta Tweet
        QnA: Part of a conversation; the reader desires the whole conversation, not just the question or the answer.
        Repeated: Content the reader has seen before.
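
The coding scheme on the four slides above could be held as a simple lookup structure when tagging search results. The sketch below is one possible representation, not the authors' implementation: the grouping of codes under categories is inferred from the slide layout, and the names `USEFUL_CODES`, `NOT_USEFUL_CODES`, `category_of`, and `tally_by_category` are hypothetical.

```python
from collections import Counter

# Codes for why a tweet was judged useful, grouped by category
# (grouping inferred from the slide layout; an assumption).
USEFUL_CODES = {
    "In Tweet Content": ["Experience", "Direct Recommendation",
                         "Social Knowledge", "Specific Information"],
    "Reflection on Tweet": ["Entertaining", "Shared Sentiment"],
    "Relevant": ["Time", "Location"],
    "Trust": ["Trusted Author", "Trusted Avatar", "Trusted Link"],
    "Links": ["Actionable Link", "Media Link", "Useful Link"],
    "Meta Tweet": ["ReTweeted Lots", "Conversation"],
}

# Codes for why a tweet was judged not useful.
NOT_USEFUL_CODES = {
    "Tweet Content": ["No Information", "Introspective",
                      "Off Topic", "Too Technical"],
    "Bad Tweets": ["Poorly Constructed", "SPAM",
                   "Wrong Language", "Dead Link"],
    "Not Relevant": ["Time", "Location"],
    "Trust": ["Un-trusted Author", "Un-trusted Link"],
    "Subjective": ["Perspective Oriented", "Disagree with Tweet", "Not Funny"],
    "Meta Tweet": ["QnA", "Repeated"],
}


def category_of(scheme, code):
    """Return the category a code belongs to within one scheme."""
    for category, codes in scheme.items():
        if code in codes:
            return category
    raise KeyError(f"unknown code: {code}")


def tally_by_category(scheme, assigned_codes):
    """Count how many assigned codes fall under each category."""
    return Counter(category_of(scheme, code) for code in assigned_codes)


# Hypothetical codes assigned to tweets a participant found useful.
print(tally_by_category(
    USEFUL_CODES,
    ["Experience", "Trusted Link", "Media Link", "Useful Link"],
))
```

Keeping the two schemes separate avoids ambiguity for codes such as Time and Location, which appear on both the useful and not-useful sides.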
  • 51. Insights Interesting finds http://www.flickr.com/photos/foxmulderven/3063598624
  • 52. The Possible Impact Where could we see the impact of this work?
  • 53. Search System A work in progress
  • 55. Thank you for Listening Jonathan Hurlock @jonhurlock Max L. Wilson @gingdottwit Like the talk? Then please tweet it, by quickly visiting: http://moourl.com/LikedTheTalk