AI as Research Assistant: Upscaling Content Analysis to Identify Patterns of Polarisation in the News
CRICOS No.00213J
Axel Bruns
Australian Laureate Fellow, Digital Media Research Centre
Queensland University of Technology, Brisbane, Australia
With important contributions from: Laura Vodden, Katharina Esau, Sebastian Svegaard, Tariq Choucair, Samantha Vilkins, Kate O’Connor Farfan, Laura Lefevre, Vishnu PS, Carly Lubicz-Zaorski
a.bruns@qut.edu.au
@snurb_dot_info | @snurb@aoir.social | @snurb.bsky.social
Our Project
• Australian Laureate Fellowship (2022-27)
• Determining the Drivers and Dynamics of Partisanship and Polarisation in Online Public
Debate
• Digital Media Research Centre, Queensland University of Technology, Brisbane, Australia
• 4 postdocs, 4 + 4* PhD students, 1 data scientist
• Cross-national comparisons (intended: AU, US, UK, DE, DK, CH, probably + BR, PE, CA)
• Longitudinal analysis over the course of the project
* Starting in 2024 – interested? Get in touch! (a.bruns@qut.edu.au)
Forms of Polarisation
• Polarisation at what levels?
• Micro: between individuals
• Meso: between groups
• Macro: across society
• Mass: involving everyone
• Elite: amongst formal political actors (however defined)
• See: Esau et al. (2023) — https://eprints.qut.edu.au/238775/
• (and chapter forthcoming in the Routledge Handbook of Political Campaigning)
Forms of Polarisation
• Polarisation on what attributes?
• Issue-based: disagreements over specific policy settings
• Ideological: fundamental differences based on political belief systems
• Affective: political beliefs turned into deeply felt in-group / out-group identity
• Perceived: view of society, as based on personal views and media reporting
• Interpretive: reading of issues, events, and media coverage based on personal views
• Interactional: manifested in choices to interact with or ignore other individuals/groups
• (and more…)
Agonism? Polarisation? Dysfunction?
• How bad is it, exactly?
• All politics is polarised (just not to the point of dysfunction)
• Much (most?) politics is multipolar, not just left/right
• When does mild antagonism turn into destructive polarisation?
• We suggest five symptoms (Esau et al., 2023):
a) breakdown of communication;
b) discrediting and dismissing of information;
c) erasure of complexities;
d) exacerbated attention and space for extreme voices;
e) exclusion through emotions.
Image: Midjourney
News Audience Polarisation
Source: Park, Sora, Caroline Fisher, Kieran McGuinness, Jee Young Lee, and Kerry McCallum. 2021. Digital News Report: Australia 2021. Canberra: News and Media Research Centre. https://doi.org/10.25916/KYGY-S066.
Can we assess news content polarisation?
(And by extension, news audience polarisation?)
This study examines non-editorial news coverage in
leading US newspapers as a source of ideological
differences on climate change. A quantitative content
analysis compared how the threat of climate change and
efficacy for actions to address it were represented in
climate change coverage across The New York Times,
The Wall Street Journal, The Washington Post, and USA
Today between 2006 and 2011. Results show that The
Wall Street Journal was least likely to discuss the
impacts of and threat posed by climate change and
most likely to include negative efficacy information
and use conflict and negative economic framing when
discussing actions to address climate change. The
inclusion of positive efficacy information was similar across
newspapers. Also, across all newspapers, climate impacts
and actions to address climate change were more likely to
be discussed separately than together in the same article.
Implications for public engagement and ideological
polarization are discussed.
(http://journals.sagepub.com/doi/10.1177/0963662515595348)
What Questions Can We Ask?
• Polarisation in news coverage:
• Who gets to speak in the coverage (viewpoint diversity)?
• How are they presented (detail and frequency; attribution; direct/indirect speech)?
• What language and key terms are used in the journalistic text (framing)?
• How does this match the language of which speakers / actors / stakeholders?
• How consistent is this across articles from the same news outlet, over time?
• How uniform or diverse is this across different news outlets?
• Are such coverage differences issue-specific, or persistent across issues?
• Do such differences map onto perceived outlet polarisation or audience polarisation?
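One simple way to operationalise the first of these questions, viewpoint diversity, is Shannon entropy over the categories of quoted speakers in an article. This is a minimal sketch, not the project's actual measure; the speaker category labels below are invented toy annotations.

```python
import math
from collections import Counter

def viewpoint_diversity(speaker_categories):
    """Shannon entropy (in bits) over the categories of quoted speakers.

    0.0 means every quoted source belongs to the same category;
    higher values indicate a more diverse mix of viewpoints.
    """
    counts = Counter(speaker_categories)
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical annotations for two articles:
one_sided = ["government"] * 5
mixed = ["government", "opposition", "scientist", "industry", "citizen"]

print(viewpoint_diversity(one_sided))  # 0.0
print(viewpoint_diversity(mixed))      # ≈ 2.32 (log2 of 5 distinct categories)
```

The same score can then be compared across outlets or tracked over time, per the later questions in the list.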
News Data Are Hard to Find
• The trouble with news databases:
• Factiva, ProQuest, LexisNexis, … are geared for library use and qualitative research
• Some surprising gaps in news outlet coverage (and very text-centric)
• Licensing arrangements prohibit large-scale content exports (>100 articles)
• Exacerbated by growing publisher concerns about use of news data in AI training
• Example:
• Factiva – ~US$30,000 per annum for API access, double for longer-term storage
Data Sources (NewsDataIO, ProQuest TDM)
→ Manual Coding (3,000–5,000 articles, hand-coded)
→ AI (LLM) Training (prompt development and refining, iterated)
Data Sources (NewsDataIO, ProQuest TDM)
→ LLM Coding (large number of articles, coded by trained AI)
→ Evaluation and Analysis (patterns in news coverage, signs of polarisation)
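The two-phase workflow can be sketched as a minimal loop: evaluate the coder against a hand-coded sample, then scale up. The LLM call is stubbed with a keyword rule here so the example is self-contained; the article texts and labels are invented toy data, not the project's codebook.

```python
GOLD = [  # phase 1: a hand-coded 'gold standard' sample (toy data)
    ("Minister defends climate policy amid protests", "politics"),
    ("Wheat futures rise on drought forecasts", "economy"),
]

def llm_code(article_text: str) -> str:
    """Stand-in for the trained LLM coder (assumed single-label coding).

    In practice this would be a call to an LLM API with the refined prompt.
    """
    keywords = ("futures", "market", "prices")
    return "economy" if any(w in article_text.lower() for w in keywords) else "politics"

# Phase 1: evaluate the prompt/model against the hand-coded sample.
accuracy = sum(llm_code(text) == label for text, label in GOLD) / len(GOLD)

# Phase 2: once performance is acceptable, code the full corpus at scale.
corpus = ["Senate debates media reform", "Oil prices fall as supply grows"]
coded = {text: llm_code(text) for text in corpus}
print(accuracy, coded)
```

The point of the structure is that the expensive manual coding happens once, on a few thousand articles, while the trained coder then handles an arbitrarily large corpus.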
Viewpoint Diversity and Stance Detection
• Working prototypes:
• Who is speaking? e.g. Jane Smith; a spokesperson
• How are they introduced? e.g. Defence Minister; eyewitness
• Who are they said to represent? e.g. the government; themselves
• Are they afforded direct or indirect speech? e.g. direct quote; paraphrase
• What are they saying? e.g. “It wasn’t me. I didn’t do it.”
• Further steps:
• Potentially more successful after Named Entity Recognition (to prime LLM process)
• More to be done on stance detection (what stance does their statement represent?)
LLM Prompting, Finetuning, Evaluation
• Prompt development:
• Highly LLM- (and version-) specific – e.g. GPT-3.5 vs GPT-4
• Room for uncertainty (‘other’ category) and asking for code choice explanations may help
• Finetuning:
• Repeated tuning against manually coded ‘gold standard’ datasets can improve performance
• But may risk overtuning to specific contexts, creating worse results for more diverse data
• Evaluation:
• Potential to combine and average repeated LLM runs – using the same or different LLMs
• Important to remember that human coders don’t always agree either
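Since human coders don't always agree either, evaluation usually means chance-corrected agreement rather than raw accuracy. A minimal sketch of Cohen's kappa, applied here to invented human and LLM labels; the same measure works between two humans, or between a human and a majority vote over repeated LLM runs.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two label sequences."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Agreement expected by chance, from each coder's marginal label frequencies:
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

human = ["pro", "anti", "neutral", "pro", "anti", "pro"]
llm   = ["pro", "anti", "pro",     "pro", "anti", "neutral"]
print(round(cohens_kappa(human, llm), 3))  # ≈ 0.455
```

Raw agreement here is 4/6 ≈ 0.67, but kappa discounts the agreement the two coders would reach by chance given their label distributions.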
Towards Advanced AI-Supported Analysis
• First steps towards frame analysis:
• Framing remains very topic-specific – may be difficult to develop general approach
• Potential to take initial topic modelling step (to identify key themes), …
• … and then ask LLM to determine how they are framed in articles
• Can we further enrich / preprocess the input data through Natural Language Processing?
• Exploring non-textual data:
• (Online) news content is increasingly image-, audio-, and video-based
• Some early steps towards AI transcription of AV content (but need to identify speakers)
• Promising tools for image clustering (still images / keyframes from videos)
• Much more to do in combining these data points in a meaningful way
Beyond Qualitative Interpretation: Practice Mapping*
Current thinking: quantifying specific aspects of individual participant activities, then identifying and interpreting similar patterns at a group level.
* With particular thanks to Kateryna Kasianenko.
Image: Midjourney
[Figures: Twitter @mention network during the Voice to Parliament campaign (red: exclusively using #VoteNo; green: exclusively using #VoteYes), and Twitter interaction pattern similarity network, based on cosine similarity between normalised interaction vectors per account, colours based on modularity detection. Labelled clusters: pro-Voice campaigners, anti-Voice campaigners, Labor supporters, Liberal/National supporters.]
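The similarity computation behind the second figure can be sketched directly: each account is represented by a vector of its interaction counts (e.g. @mentions of every other account), and accounts are linked when the cosine similarity of those vectors is high. The interaction vectors below are invented toy data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two interaction-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy interaction vectors (counts of @mentions of accounts X, Y, Z):
accounts = {
    "A": [5, 0, 1],   # mostly mentions X
    "B": [4, 1, 0],   # a similar practice to A
    "C": [0, 6, 5],   # a very different practice
}

edges = {(a, b): round(cosine(accounts[a], accounts[b]), 3)
         for a in accounts for b in accounts if a < b}
print(edges)  # A and B link strongly; C stands apart
```

Cosine similarity on normalised vectors compares the *pattern* of an account's activity rather than its volume, so a prolific account and an occasional one with the same targets still cluster together; community detection (e.g. modularity) then colours the resulting network.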
Potential Patterns to Operationalise in Practice Mapping
• Account-to-account interactions (relative to the interactive affordances available on any given social media platform)
• Account’s post content (topics, sentiment, hashtags, named entities, etc.)
• Account’s use of sources (URLs, domains, AI-coded source content features, etc.)
• Account’s profile information (name, description, etc.)
• Manually and computationally coded information about the account and its posts
• …
Image: Midjourney
Assessing Destructive Polarisation
• Key questions:
• Does practice mapping show distinct practices?
• What divergent patterns drive such distinctions?
• Do these patterns map onto one of the symptoms
of destructive polarisation?
• (Or: do they represent a new pattern that might be
seen as destructive – a new symptom?)
• How severe are these differences (i.e. how deeply
and destructively polarised is the situation)?
• How are these patterns evolving over time?
Image: Midjourney
Acknowledgments
This research is supported by the Australian Research Council through the Australian Laureate Fellowship project Determining the Drivers and Dynamics of Partisanship and Polarisation in Online Public Debate.