4. Tinkering with Twitter’s API(1/2)
✤ Setup
✤ easy_install twitter
✤ Note: Twitter’s API has since been updated
✤ http://github.com/sixohsix/twitter/issues/56
✤ The Minimalist Twitter API for Python is a Python API for Twitter
✤ Equivalent REST query
✤ http://search.twitter.com/trends.json
Thursday, October 20, 2011
5. Tinkering with Twitter’s API(2/2)
✤ Retrieving Twitter search trends
# ex.3
import twitter
twitter_api = twitter.Twitter()
WORLD_WOE_ID = 1 # The Yahoo! Where On Earth ID for the entire world
world_trends = twitter_api.trends._(WORLD_WOE_ID) # get back a callable
#[ trend["name"] for trend in world_trends()[0]['trends'] ] # call the callable
for trend in world_trends()[0]['trends']: # call the callable
    print(trend["name"])
✤ Paging through Twitter search results
# ex.4
search_results = []
for page in range(1,6):
    search_results.append(twitter_api.search(q="Dennis Ritchie", rpp=20, page=page))
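✤ A minimal sketch of flattening the paged results into one list of tweet texts. It assumes each page is a dict with a 'results' list whose items carry a 'text' field; the sample pages are made up for illustration, and flatten_tweets is a hypothetical helper name.

```python
def flatten_tweets(pages):
    """Collect the text of every tweet across all result pages."""
    return [r["text"] for page in pages for r in page.get("results", [])]

# Made-up stand-ins for two pages of search results (assumed response shape).
sample_pages = [
    {"results": [{"text": "RIP Dennis Ritchie"}, {"text": "Thanks for C, dmr"}]},
    {"results": [{"text": "goodbye, world"}]},
]

tweets = flatten_tweets(sample_pages)
print(len(tweets))
```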
6. Frequency Analysis and Lexical
Diversity(1/5)
✤ Lexical diversity
✤ One of the most intuitive measurements that can be applied to
unstructured text
✤ Expression of the number of unique tokens in the text divided by
the total number of tokens
>>> words = []
>>> for t in tweets:
...     words += [ w for w in t.split() ]
>>> len(words) # total words
7238
>>> len(set(words)) # unique words
1636
>>> 1.0*len(set(words))/len(words) # lexical diversity
0.22602928985907708
>>> 1.0*sum([ len(t.split()) for t in tweets ])/len(tweets) # avg words per tweet
14.476000000000001
✤ Each tweet carries about 20 percent unique information
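✤ The same measurements can be wrapped in small functions. This is a sketch on made-up sample tweets, not the data behind the numbers above; lexical_diversity and average_words are hypothetical helper names.

```python
def lexical_diversity(tweets):
    """Unique tokens divided by total tokens (0.0 when there are no tokens)."""
    words = [w for t in tweets for w in t.split()]
    if not words:
        return 0.0
    return len(set(words)) / float(len(words))

def average_words(tweets):
    """Average number of whitespace-separated tokens per tweet."""
    if not tweets:
        return 0.0
    return sum(len(t.split()) for t in tweets) / float(len(tweets))

# Made-up sample: 7 tokens in total, 5 of them unique.
sample_tweets = ["the cat sat", "the dog sat down"]
print(lexical_diversity(sample_tweets))
print(average_words(sample_tweets))
```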
8. Frequency Analysis and Lexical
Diversity(3/5)
✤ Extracting relationships from the tweets
✤ The social web is foremost about the linkages between people
✤ One highly convenient format for storing social web data is a graph
✤ Using regular expressions to find retweets
✤ RT followed by a username
✤ via followed by a username
>>> import re
>>> rt_patterns = re.compile(r"(RT|via)((?:\b\W*@\w+)+)", re.IGNORECASE)
>>> example_tweets = ["RT @SocialWebMining Justin Bieber is on SNL 2nite. w00t?!?",
...     "Justin Bieber is on SNL 2nite. w00t?!? (via @SocialWebMining)"]
>>> for t in example_tweets:
...     rt_patterns.findall(t)
...
[('RT', ' @SocialWebMining')]
[('via', ' @SocialWebMining')]
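✤ A sketch of turning the retweet matches into a clean list of source usernames, which is the form needed to add edges to a graph. get_rt_sources is a hypothetical helper name, and the pattern is written out in full here with its backslashes.

```python
import re

# "RT" or "via", followed by one or more @usernames.
rt_patterns = re.compile(r"(RT|via)((?:\b\W*@\w+)+)", re.IGNORECASE)

def get_rt_sources(tweet):
    """Return the usernames (without '@') that a tweet credits with RT or via."""
    sources = []
    for _, users in rt_patterns.findall(tweet):
        # The second group may hold several names, e.g. " @a @b".
        sources.extend(re.findall(r"@(\w+)", users))
    return sources

print(get_rt_sources("RT @SocialWebMining Justin Bieber is on SNL 2nite. w00t?!?"))
```

Because the pattern ignores case and is not anchored to a word start, a word that merely ends in "rt" before an @username would also match; treat the result as a heuristic.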
10. Frequency Analysis and Lexical
Diversity(5/5)
✤ Analysis
✤ 500 tweets
✤ 160 users: number of nodes
✤ 160 users involved in retweet relationships with one another
✤ 125 edges connecting them
✤ 1.28 (160/125): ratio of nodes to edges; some nodes are connected to
more than one node
✤ 37: The graph consists of 37 subgraphs and is not fully
connected
✤ The output of degree
✤ Shows how many connections each node has
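✤ A pure-Python sketch of the same kind of measurements (node count, edge count, number of connected subgraphs, degrees) on a made-up edge list. It is a stand-in for what a graph library such as NetworkX reports, not the code behind the numbers above; graph_stats is a hypothetical helper name, and degree here counts distinct neighbours regardless of edge direction.

```python
from collections import defaultdict

def graph_stats(edges):
    """Return (nodes, edges, connected components, sorted degrees)
    for a list of (source, target) pairs, ignoring direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in neighbors:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbors[node] - seen)

    degrees = sorted(len(n) for n in neighbors.values())
    return len(neighbors), len(set(edges)), components, degrees

# Made-up retweet edges: "@a" was retweeted by b and c, "@d" by e.
sample_edges = [("@a", "b"), ("@a", "c"), ("@d", "e")]
nodes, n_edges, components, degrees = graph_stats(sample_edges)
print(nodes, n_edges, components, degrees)
```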
11. Visualizing Tweet Graphs(1/3)
✤ Dot language
✤ Text graph description language
✤ Supports a simple way of describing graphs that both humans and
computer programs can use
✤ Graphviz
✤ install from source: http://www.graphviz.org/
✤ pygraphviz
✤ easy_install pygraphviz
✤ setup.py: library_path, include_path
12. Visualizing Tweet Graphs(2/3)
✤ Generating DOT language output
import networkx as nx  # g is the retweet graph built earlier

OUT = "snl_search_results.dot"
try:
    nx.drawing.write_dot(g, OUT)
except ImportError:
    # Help for Windows users:
    # Not a general-purpose method, but representative of
    # the same output write_dot would provide for this graph
    # if installed and easy to implement
    dot = ['"%s" -> "%s" [tweet_id=%s]' % (n1, n2, g[n1][n2]['tweet_id'])
           for n1, n2 in g.edges()]
    f = open(OUT, 'w')
    f.write('strict digraph {\n%s\n}' % (';\n'.join(dot),))
    f.close()
✤ Output
strict digraph {
"@ericastolte" -> "bonitasworld" [tweet_id=11965974697];
"@mpcoelho" -> "Lil_Amaral" [tweet_id=11965954427];
"@BieberBelle123" -> "BELIEBE4EVER" [tweet_id=11966261062];
"@BieberBelle123" -> "sabrina9451" [tweet_id=11966197327];
}
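✤ A sketch of producing the same DOT text from plain (source, target, tweet_id) tuples, with no graph library involved. to_dot is a hypothetical helper name; the two sample edges are taken from the output shown on this slide.

```python
def to_dot(edges):
    """Render (source, target, tweet_id) triples as a strict DOT digraph."""
    lines = ['"%s" -> "%s" [tweet_id=%s]' % (n1, n2, tid)
             for n1, n2, tid in edges]
    return "strict digraph {\n%s\n}" % ";\n".join(lines)

sample_edges = [
    ("@ericastolte", "bonitasworld", 11965974697),
    ("@mpcoelho", "Lil_Amaral", 11965954427),
]
dot_text = to_dot(sample_edges)
print(dot_text)
```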
13. Visualizing Tweet Graphs(3/3)
✤ Convert
✤ $ circo -Tpng -Osnl_search_results snl_search_results.dot
14. Closing Remarks
✤ Illustrated how easy it is to use Python’s interactive interpreter to
explore and visualize Twitter data
✤ Feel comfortable with your Python development environment
✤ Spend some time with the Twitter APIs and Graphviz
✤ Canviz project
✤ Draw Graphviz graphs on a web browser <canvas> element.