Social Media Analytics Research at the QUT Digital Media Research Centre
1. Social Media Analytics Research
at the QUT Digital Media Research Centre
Prof. Axel Bruns
ARC Future Fellow
Digital Media Research Centre
Queensland University of Technology
a.bruns@qut.edu.au – @snurb_dot_info
2. QUT Digital Media Research Centre
The Digital Media Research Centre (DMRC) conducts world-leading
research that helps society understand and adapt to the social,
cultural and economic transformations associated with digital media
technologies, and trains the researchers of tomorrow.
For more, see: http://www.qut.edu.au/research/dmrc
6. The Promise of Big Social Data
• Social media and big data:
– Substantial growth in social media usage
– User activity generates data and metadata
– Readily accessible through APIs
– New tools for processing and visualising big data at scale
• Emergence of social media analytics:
– Large-scale tracking of public user activities
– ‘Trending topics’, user sentiment, network influencers
– Scholarly and commercial research
– A ‘computational turn’ towards the digital humanities (David Berry)
– Ethical concerns around profiling and content ownership
7. Big Data and Society
• New methodologies:
– Empirical, large-scale, real-time investigation
– Data-led, comprehensive evaluation rather than small-scale sampling of public
communication
– But also: combined quantitative/qualitative approaches
– Not studying the Internet, but studying society with the Internet (Richard Rogers)
• Applications:
– Political engagement, especially during elections, crises, scandals
– Crisis communication during natural and human-made disasters
– Engagement with mainstream media: watching, reading, sharing, …
– Brand communication, especially during brand crises
– Identification of earthquakes (USGS), tracking of epidemics (Google)
– …
11. Big Data, Rare Data?
• The political economy of social media research:
– API-based data access is shaped to privilege certain approaches
– Research funding is easier to obtain for specific, limited purposes
– Longitudinal, ‘big’ data access requires ongoing, substantial funding and infrastructure
– Exploratory, data-driven research is difficult to sell to most funding bodies
– Also related to divergent resources available to different scholarly disciplines
• Most ‘difficult’ large-scale social media research is conducted by Facebook /
Twitter and commercial research institutes
12. Social Media and Beyond
• Facebook, Twitter:
– Useful but highly particular areas of online activity
– Not necessarily generalisable to overall activity patterns
– Current research approaches and API limitations introduce further biases
• E.g. publics on Twitter:
– Micro: @reply and retweet conversations
– Meso: follower/followee networks
– Macro: #hashtag ‘communities’ (Bruns & Moe, 2014)
• Key needs in Twitter research:
– Understand how hashtags are situated in a wider communicative ecology on Twitter
– Document the day-to-day uses of Twitter, beyond and outside hashtags
– Trace the dynamics of Twitter as a platform for everyday quasi-private, interpersonal,
and/or public communication
– Track the impact of social and technological changes on these uses
14. The Australian Twittersphere
• Twitter in Australia:
– Strong take-up since 2009
– Centred around 25-55 age range, urban, educated, affluent users (but gradually broadening)
– Significant role in crisis communication, political communication, audience engagement, …
• Mapping the Twittersphere:
– Long-term project to identify all Australian Twitter accounts
– First iteration: snowball crawl of follower/followee networks
• Starting with key hashtag populations (#auspol, #spill, …)
• Map of ~1m accounts in early 2012
– Second iteration: full crawl of global Twitter ID numberspace through to Sep. 2013 (~870m accounts)
– Third iteration: full crawl of global Twitter ID numberspace through to Feb. 2016 (~1.4b accounts)
• Filtering by description, location, timezone fields: identifiably Australian cities, states, timezones, etc.
• 4 million Australian accounts identified (by Feb. 2016)
• Retrieval of their follower/followee lists
– Continuous gathering of their public tweets
• Capturing ~1.3m new tweets per day
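The profile-based filtering step above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the field names (`description`, `location`, `time_zone`), the short marker lists, and the function name `looks_australian` are all hypothetical.

```python
import re

# Hypothetical marker lists; a real filter would need far more complete ones.
AU_PLACES = ["australia", "sydney", "melbourne", "brisbane", "perth",
             "adelaide", "hobart", "darwin", "canberra", "queensland",
             "new south wales", "tasmania"]
AU_TIMEZONES = {"brisbane", "sydney", "melbourne", "perth", "adelaide",
                "hobart", "darwin", "canberra"}

# Word-boundary match, so "perth" does not match inside "Perthshire".
_PLACE_RE = re.compile(r"\b(" + "|".join(map(re.escape, AU_PLACES)) + r")\b")


def looks_australian(profile):
    """Return True if any profile field carries an Australian marker."""
    text = " ".join([profile.get("description") or "",
                     profile.get("location") or ""]).lower()
    if _PLACE_RE.search(text):
        return True
    timezone = (profile.get("time_zone") or "").lower()
    return timezone in AU_TIMEZONES


# Toy profiles, invented for the example.
profiles = [
    {"id": 1, "description": "Coffee and footy", "location": "Brisbane, QLD",
     "time_zone": None},
    {"id": 2, "description": "Writer", "location": "Perthshire, Scotland",
     "time_zone": "London"},
    {"id": 3, "description": "", "location": "", "time_zone": "Sydney"},
]
australian_ids = [p["id"] for p in profiles if looks_australian(p)]
```

Keyword matching of this kind will miss accounts with empty or playful profile fields and will admit some false positives, so the resulting count is best read as "identifiably Australian" accounts rather than a complete census.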
17. Mapping the Australian Userbase
• Mapping the Twittersphere:
– Filtered to include only accounts with (followers + followees) >= 1000
• ~255k accounts, 61m follower/followee connections within this group
– Mapped using the Gephi ForceAtlas2 layout algorithm (LinLog mode, scaling 0.00001, gravity 1.0)
• Force-directed visualisation: closely interconnected groups of accounts will form clusters in the network
• Clusters in the Twittersphere:
– Identification of clusters using the Louvain community detection algorithm (resolutions 0.5 and 0.25)
– Qualitative interpretation of cluster themes based on high-degree nodes in each cluster
• Applications:
– Combined analysis of network structures and tweeting activities
– Evaluation of potential and actual information flows across the network
– Comparative benchmarking of clusters across different markers
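The degree filter described above can be illustrated with a small sketch. It assumes the network is available as a list of (follower, followee) pairs and counts degree within that list; the slides do not say whether the threshold was applied to network-internal or global follower/followee counts, so that choice, like the function name `filter_by_degree`, is an assumption.

```python
from collections import Counter


def filter_by_degree(edges, min_degree):
    """Keep accounts with (followers + followees) >= min_degree,
    plus the edges that connect two kept accounts."""
    degree = Counter()
    for follower, followee in edges:
        degree[follower] += 1   # counts as one followee for the follower
        degree[followee] += 1   # counts as one follower for the followee
    kept = {account for account, d in degree.items() if d >= min_degree}
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


# Toy edge list, invented for the example; the slides use a threshold of 1000.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("d", "a"), ("b", "c")]
kept, kept_edges = filter_by_degree(edges, min_degree=3)

# Share of accounts retained, using the figures reported on the slides.
share = 255_000 / 4_000_000  # 0.06375, i.e. about 6.4%
```

The filtered edge list could then be exported to a tool such as Gephi for layout and community detection. Note that resolution parameters are not necessarily comparable between different Louvain implementations, so the values 0.5 and 0.25 should not be carried over to another library without checking its documentation.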
18. The Australian Twittersphere, 2016
4m known Australian accounts
Network of follower connections
Filtered for degree ≥1000
255k nodes (6.4%), 61m edges
Edges not shown in graph
20. 4m known Australian accounts
Network of follower connections
Filtered for degree ≥1000
255k nodes (6.4%), 61m edges
Edges not shown in graph
Clusters (labels on network map): Teen Culture, Aspirational, Sports, Netizens, Arts & Culture, Politics, Television, Fashion, Popular Music, Food & Drinks, Agriculture Activism, Porn, Education, Cycling, News & Generic, Hard Right, Progressive, South Australia, Celebrities, Horse Racing
36. Tweets per Cluster (Average)
Colour scale: yellow to red
Non-tweeting accounts in grey
Louvain modularity resolution 0.5
Average over tweeting accounts only
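The per-cluster metric on this slide can be sketched as follows: a minimal example assuming each account record carries a cluster label and a tweet count. The record layout and the function name `average_tweets_per_cluster` are hypothetical, and the numbers are invented.

```python
from collections import defaultdict


def average_tweets_per_cluster(accounts):
    """Mean tweets per cluster, averaged over tweeting accounts only."""
    totals = defaultdict(int)    # tweets summed per cluster
    tweeting = defaultdict(int)  # number of accounts with at least one tweet
    for account in accounts:
        if account["tweets"] > 0:  # non-tweeting accounts are excluded
            totals[account["cluster"]] += account["tweets"]
            tweeting[account["cluster"]] += 1
    return {cluster: totals[cluster] / tweeting[cluster] for cluster in totals}


accounts = [
    {"cluster": "Politics", "tweets": 120},
    {"cluster": "Politics", "tweets": 0},
    {"cluster": "Politics", "tweets": 80},
    {"cluster": "Sports", "tweets": 30},
    {"cluster": "Cycling", "tweets": 0},
]
averages = average_tweets_per_cluster(accounts)
```

Excluding non-tweeting accounts keeps dormant accounts from dragging a cluster's average down; a cluster with no tweeting accounts at all simply has no average in this sketch.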
48. Future Research Perspectives
• The end of the beginning:
– Social media analytics now widely utilised (but still poorly understood and operationalised)
– Substantial innovation in powerful tools and methods (but more in computer than social sciences)
– Broad range of mainstream commercial solutions (but often black boxes with dubious assumptions)
– Platform providers offering various data products (but unreliable and at inflated prices)
• Next steps:
– Beyond simplistic analytics (hashtags, keywords, text-based content)
– Towards (post)demographic perspectives based on interest profiles
– Multi-platform and cross-platform user and information flows
– Critical analysis of roles played by platform algorithms and social bots
• Key concerns:
– Susceptibility to commercial and political interference
– ‘Fake news’, ‘echo chambers’, ‘filter bubbles’, etc.
– Exclusion of independent scholarly researchers through access and pricing policies
– Long-term commercial viability of leading platforms