Effective Use of the
Twitter Search API
Eric Jensen
Twitter Search

Submit your questions via
http://bit.ly/chirpsearch
or hashtag #chirpsearch
Agenda
•   Mission of the Twitter Search API

•   History

•   Most recently: ranking the top results

•   What’s next
Search API Mission

Connect users with what's most
important and interesting to
them in the here and now

(return the best stuff for a query)
Search Stats
•   Over 600 million queries per day

•   Typically less than 200 milliseconds per query

•   Typically less than 20 seconds indexing
    latency

•   Index of hundreds of millions of tweets
Search API Use Cases
•   Search interfaces: collecta, oneriot, crowdeye, ...

•   Dashboard clients: tweetdeck, seesmic, ...

•   Widgets: twitter, tweetgrid, monitter, ...

•   Location search: trendsmap, foursquare, ...

•   Visualizations: radian6, crimsonhexagon, twistori, ...

•   Analytics: stocktwits, trendrr, tweetstats, ...

•   Recommenders: mrtweet, ...

•   Thousands not listed here + not invented yet
Search vs. Streaming
•   Do use the search API for your app when:

    •   The user can input a query

    •   You need immediate results, not tracking

•   Don’t use the search API for your app when:

    •   Your user experience requires comprehensive
        results (all the tweets, not just the best ones)

    •   You only need tweets from/to/at particular users
Refreshing Results
Client -> API:  search.json?q=twitter
API -> Client:  "refresh_url":"?since_id=9290798834&q=twitter"

(~20 seconds later)

Client -> API:  search.json?since_id=9290798834&q=twitter
API -> Client:  "refresh_url":"?since_id=9290800152&q=twitter"
Why is this OK?
[Diagram, two panels]
Initial query:  search.json?q=twitter -> Timeline Cache -> Search Index
Refresh:        search.json?since_id=9290798834&q=twitter -> Timeline Cache (q=twitter: 1 2 3 4) <- Tweets
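A toy model may help make the diagram concrete. The sketch below is not Twitter's implementation; it only assumes, as the slide suggests, that a cache keeps one timeline per registered query and can answer a since_id refresh without touching the search index. The names TimelineCache, register, append and since are all hypothetical.

```javascript
// Toy model only: the real timeline cache is not described beyond the slide.
function TimelineCache() {
  // query string -> array of tweets, oldest first
  this.timelines = Object.create(null);
}

// Register a standing query the first time it is seen,
// seeding it with the results of the initial index lookup.
TimelineCache.prototype.register = function (query, initialTweets) {
  if (!(query in this.timelines)) {
    this.timelines[query] = initialTweets.slice();
  }
};

// Append a newly arrived tweet to an already registered timeline.
TimelineCache.prototype.append = function (query, tweet) {
  if (query in this.timelines) {
    this.timelines[query].push(tweet);
  }
};

// Serve a refresh: only tweets newer than the client's cursor.
TimelineCache.prototype.since = function (query, sinceId) {
  var timeline = this.timelines[query] || [];
  return timeline.filter(function (tweet) {
    return tweet.id > sinceId;
  });
};
```

In this model a client that polls with the since_id it was given only ever costs a lookup in the cached timeline, which is one plausible reading of why cursor-based polling is cheap to serve.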
Search API History

•   Apr 4, 2008: Summize Launches Twitter Search

•   Jul 14, 2008: Summize Acquired by Twitter

•   Apr 1, 2009: Search on Twitter.com

•   Nov 5, 2009: Quality Filtering on Trends

•   Jan 6, 2010: Local Trends

•   Apr 1, 2010: Top Results Include Popular

•   Apr 15, 2010: Chirp!
Ranking Top Results
             • Best stuff for a query

             • Many factors

             • First step

             • Available from API
Top Results API
•   New parameter: result_type

    •   mixed: Eventually this will become the
        default value. Include both popular and real
        time results in the response.

    •   recent: The current default value. Return
        only the most recent results in the response.

    •   popular: Return only the most popular
        results in the response.
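As a rough sketch of how a client might use the parameter, the helper below builds a search URL and rejects values other than the three listed on the slide. The function name buildSearchUrl and the choice to URL-encode the query here are assumptions for illustration; the endpoint is the one shown elsewhere in the deck.

```javascript
// The three values named on the slide.
var RESULT_TYPES = ['mixed', 'recent', 'popular'];

// Hypothetical helper: build a search URL, optionally with result_type.
function buildSearchUrl(format, query, resultType) {
  var url = 'http://search.twitter.com/search.' + format +
            '?q=' + encodeURIComponent(query);
  if (resultType !== undefined) {
    if (RESULT_TYPES.indexOf(resultType) === -1) {
      throw new Error('Unknown result_type: ' + resultType);
    }
    url += '&result_type=' + resultType;
  }
  return url;
}
```

Leaving resultType out keeps whatever the server default is, which the slide says is currently recent and will eventually become mixed.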
Top Results Metadata
{"results":[
     {"text":"@twitterapi  http://tinyurl.com/ctrefg",
      "from_user":"jkoum",
      "metadata":
      {
       "result_type":"popular",
       "recent_retweets": 100
      },
      "id":1478555574,
      ...
Top Results API Example
        • Initial load includes top results

        • Metadata annotates them

        • Refreshes recent results on top
Include Top Results
// Ask for popular and recent results together.
url =
  'http://search.twitter.com/search.' +
  format +
  '?q=' + query +
  '&result_type=mixed'
Annotate w/ Metadata
// Label popular results with their recent retweet count.
if (tweet.metadata.result_type == 'popular') {
    return '<div class="twtr-popular">' +
     tweet.metadata.recent_retweets +
     ' recent retweets</div>';
}
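A slightly fuller version of the annotation step might look like the sketch below. It assumes a result may arrive without a metadata object, and it escapes user text before placing it in HTML. The helper names and the twtr-tweet class are made up for illustration; only twtr-popular, result_type and recent_retweets come from the slides.

```javascript
// Minimal HTML escaping for text that comes from tweets.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Return the "recent retweets" badge for popular results, '' otherwise.
function popularBadge(tweet) {
  var meta = tweet.metadata;
  if (meta && meta.result_type === 'popular') {
    return '<div class="twtr-popular">' +
      Number(meta.recent_retweets) + ' recent retweets</div>';
  }
  return '';
}

// Render one result; the twtr-tweet class name is hypothetical.
function renderTweet(tweet) {
  return '<div class="twtr-tweet">' + popularBadge(tweet) +
    escapeHtml(tweet.from_user) + ': ' + escapeHtml(tweet.text) + '</div>';
}
```

Keeping the badge in its own function means recent results pass through unchanged and only popular ones gain the extra markup.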
Refresh Recent Results
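// Save the cursor returned with the previous response.
refresh_url = response.refresh_url

...

// refresh_url already starts with "?since_id=...", so append it as-is.
url =
  'http://search.twitter.com/search.' +
  format +
  refresh_url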
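Put together, one refresh cycle could be sketched as follows: build the next URL from the previous response, then place the newly returned tweets above the ones already on screen. The function names are hypothetical, and dropping duplicate ids is an added safeguard of this sketch rather than something the slides call for.

```javascript
// Build the next poll URL from the previous response's refresh_url,
// which already carries since_id and the query.
function refreshUrl(format, response) {
  return 'http://search.twitter.com/search.' + format + response.refresh_url;
}

// Put refreshed tweets above those already displayed, dropping duplicate ids.
function mergeOnTop(displayed, refreshed) {
  var seen = Object.create(null);
  var merged = [];
  refreshed.concat(displayed).forEach(function (tweet) {
    var key = String(tweet.id);
    if (!seen[key]) {
      seen[key] = true;
      merged.push(tweet);
    }
  });
  return merged;
}
```

A client would call refreshUrl after each response, fetch it on its polling interval (the earlier sequence shows roughly 20 seconds), and pass the new results to mergeOnTop.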
The Near Future
•   Remove duplicates (retweets)

•   Deeper index

•   Hit highlighting in the API

•   More consistency (with the REST API)

•   Better rate limiting
The Future (cont)
•   More relevance

•   More metadata

•   More stuff

•   More operators

    •   places, @anywhere, annotations
Open Source in Search
•   http://twitter.com/about/opensource

    •   mysql, hadoop, kestrel, twitter-text, etc.

•   lucene

•   commons-pipeline

•   varnish

•   jmeter

•   nutch language identifier

•   mecab
We’re Hiring
•   http://twitter.com/jobs

•   Data Analyst - Search

•   Product Manager - Search

•   Software Engineer - Search

•   Software Engineer - Search Front-End

•   Software Engineer - Search Relevance
Questions?

http://bit.ly/chirpsearch
or hashtag #chirpsearch

Also join us at the Real-Time
Search Birds of a Feather @
1:30 in The Coop

Editor's Notes

  1. I will talk about:
     - Some of our thinking about why we have a search API and what differentiates it from the other APIs Twitter offers.
     - Some technical implications of these differences with respect to polling on search versus tracking keywords on the streaming API.
     - Briefly, how the search API has changed over time.
     - The most recent change, where we began ranking the top results beyond recency order. I'll show you how I've modified one of our own search API clients to take advantage of that change.
  2. Simple definition: the user provides a query by engaging with an API application, and we provide the best stuff (currently tweets and trends) for that query. Obviously the "best" stuff for Twitter has a lot to do with how recent it is, so our primary focus is on the "here and now".
  3. Just to give you an idea of the parameters search operates under:
     - As Ev told you yesterday, we are doing more than 600M queries per day, and we have seen up to 750M on a recent day.
     - While realtime is our main focus, our index does contain hundreds of millions of tweets, and we've roughly doubled its size in the last six months.
     - Of course, the number of tweets has grown even faster than we've increased that index size, so the index only covers about a week of them right now, but that is something we're currently working on expanding.
  4. So obviously we're operating at a large scale, but what's really interesting to me about the search API is the variety of applications you as developers have found for it. I've listed just a few here to illustrate what people are currently doing with the API.
  5. So that's what people are doing with the search API, but the streaming API also supports tracking keywords and some location and language filtering. So, if you're developing a new app, how do you decide which to use?
  6. The biggest difference between the search API and the track API is how you get new results matching your standing query. On the streaming API the push model makes this obvious: new results are sent to you as they come in. Since the focus of the search API is on apps that let the user manipulate the query (whether explicitly or implicitly), registering a standing query for every request makes less sense. Instead, the search API uses a polling model with a cursor. (Reviewer note: make sure you explain this diagram by pointing at it, or at least describing it. It took me a minute to get the visual presentation.)
  7. One question that comes up frequently is why we encourage apps to use this cursor to poll and how that helps us to support refreshes more efficiently, so here's a diagram of what happens under the covers. A lot like the streaming API, when you make any query to search we actually do register that as a standing query, but only in one of our caching layers, which we call the timeline cache.
  8. Next I'd like to take a step back and talk briefly about the history of the search API and how our thinking about it has developed. Twitter search and the API have been around for about two years now, and we made a lot of changes early on, like supporting location search, but after that we had to shift our focus to scaling the system to support the growth in tweets and queries. It's really just in the last six months that we've made enough progress with scaling and grown the search team enough to be able to focus more on relevance and figuring out what that means for Twitter search.
  9. Our mission. Under "many factors", note that it's not always the popular users that show up here; that seems to be an early misconception. Our algorithm looks for things that are interesting from any user: things that "resonate", to use a word that Dick talked about yesterday (good to tie it in to other things being said at Chirp). Rather than "not final" (which seems to imply there is a "final" step after which we won't be improving this), I'd say something like "First step of a long road of relevance improvements" (implying that we've got lots of ideas and we'll be delivering cool stuff for a long while).
  10. right now at the top
  11. explain that this uses since_id
  12. we want to hear from you