While journalism is evolving toward a rather open-minded participatory paradigm, social media presents overwhelming streams of data that make it difficult to identify the information of a journalist's interest. Given the increasing interest of journalists in broadening and democratizing news by incorporating social media sources, we have developed TweetGathering, a prototype tool that provides curated and contextualized access to news stories on Twitter. This tool was built with the aim of assisting journalists both with gathering and with researching news stories as users comment on them. Five journalism professionals who tested the tool found helpful characteristics that could assist them with gathering additional facts on breaking news, as well as facilitating discovery of potential information sources such as witnesses in the geographical locations of news.
Curating and Contextualizing Twitter Stories to Assist with Social Newsgathering
1. Curating and Contextualizing Twitter Stories to Assist with Social Newsgathering
Arkaitz Zubiaga (City University of New York)
Heng Ji (City University of New York)
Kevin Knight (University of Southern California)
IUI 2013 - March 21, 2013
2. Motivation
● Users share news on Twitter, often complementing news media:
  – News breaks early.
  – Users contribute additional facts/info.
● Journalists are interested in gathering facts and finding sources.
8. Two main shortcomings
(1) Overwhelming amount of content.
  ✔ Need for curation.
(2) Lack of context.
  ✔ Need for contextualization.
9. Data
● 2,593 trending topics (Feb 1-28)
  – Up to 1,500 tweets per trending topic (3.6M tweets).
● Tweets in 46 languages.
  – 57.8% en, 14.9% es, 7.6% pt, etc.
● Split into:
  – Feb 1-21: Development set.
  – Feb 22-28: Test set.
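The date-based split above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_split`, the record layout, and the sample topic names are hypothetical, and only the day boundaries (Feb 1-21 and Feb 22-28) come from the slide.

```python
from collections import Counter

# Boundaries taken from the slide: Feb 1-21 is development, Feb 22-28 is test.
DEV_LAST_DAY = 21
LAST_DAY = 28


def assign_split(day_of_february: int) -> str:
    """Return 'dev' or 'test' for the day of February a topic trended on."""
    if not 1 <= day_of_february <= LAST_DAY:
        raise ValueError(f"day outside the collection window: {day_of_february}")
    return "dev" if day_of_february <= DEV_LAST_DAY else "test"


# Hypothetical (topic, day) records, used only to exercise the function.
topics = [("#example1", 3), ("#example2", 21), ("#example3", 22), ("#example4", 28)]
split_counts = Counter(assign_split(day) for _, day in topics)
print(split_counts)
```

Splitting by date rather than at random keeps the test set strictly later in time than the development set, which is closer to how the tool would be used on incoming stories.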
10. Annotation
● Annotation of the 2,593 trending topics.
  – Newsworthy if the story was later covered by news media (thepaperboy.com).
  – Unnewsworthy otherwise.
● 358 deemed newsworthy.
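The labeling rule above amounts to a membership check, and the two counts on the slide give the share of newsworthy topics. The sketch below is an assumption-laden illustration: `label_topic`, the `covered` set, and the sample topic names are hypothetical, and how coverage was actually looked up on thepaperboy.com is not described in the slides.

```python
from typing import Set


def label_topic(topic: str, covered_by_news_media: Set[str]) -> str:
    """Label a trending topic by whether news media later covered the story."""
    return "newsworthy" if topic in covered_by_news_media else "unnewsworthy"


# Hypothetical coverage set; the real lookup procedure is not given in the slides.
covered = {"#example_story"}
labels = [label_topic(t, covered) for t in ("#example_story", "#example_meme")]
print(labels)

# Share of newsworthy topics, computed from the counts on the slide.
TOTAL_TOPICS = 2593
NEWSWORTHY_TOPICS = 358
newsworthy_share = round(NEWSWORTHY_TOPICS / TOTAL_TOPICS * 100, 1)
print(newsworthy_share)  # 13.8
```

Roughly one trending topic in seven was labeled newsworthy, so most of what trends falls into the unnewsworthy class.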
12. Data processing
● Translation of tweets into English
  – 98.1% translated.
  – 27 of 46 languages (19 of top 20).
● Ranking of trending topics.
● Curation + contextualization of contents.
22. User Study
● Tested on-site by 5 journalism practitioners.
  – Twitter users.
  – Native English speakers.
  – They use Twitter for newsgathering.
● They used the tool while thinking aloud, and we interviewed them afterward.
23. Feedback: news discovery
● “the ranking is very helpful, the stories in the bottom of the list are mostly memes and pointless conversations”.
● “it's easy to catch the scoop, and find out whether a story is worth exploring in more detail”.
24. Feedback: curation
“Twitter conversations are full of spam instead of talking about the actual news. Having a few tweets selected really helps me discover salient tweets instead of doing it manually”
27. Feedback: contextualization
“It all depends on whether the story contains links to news media. When there
are, I can research the news written by others”
28. Feedback: contextualization
“We often look into contributing users to find sources, or even witnesses that might be reporting about the news. We rely on those as information sources, and sometimes contact them to learn more”
29. Feedback: contextualization
● Hashtags: “It is sometimes worth exploring the hashtags that are co-occurring with the story. I was really wondering why #ows appears so frequently in a story about #wellsfargo, that’s something I needed to look into more detail”.
30. Feedback: contextualization
● Events: “They show what people are commenting on the story. It’s not only about the story itself. I found in a story about a political announcement that a popular event being mentioned in the tweets was resignation, this must be a strong community that wants this politician to step down”.
31. Feedback: contextualization
● Named entities & external descriptions: “They may reveal where the story is happening, and so makes it easier to locate it, as well as easily identify involved people and organizations, especially when it is culturally foreign to me”.
32. Final remarks / Future work
● Customized stories and contents.
● Categorize news by topic.
● Identify location of users.
● Test the tool in real time to quantify results.