Interactive visualization and exploration
of network data with Gephi
and some conceptual context
Bernhard Rieder
Universiteit van Amsterdam
Mediastudies Department
Two kinds of mathematics
Can there be data analysis without math? No.
Does this imply epistemological commitments? Yes.
But there are choices, e.g. between:
☉ Confirmatory data analysis => deductive
☉ Exploratory data analysis (Tukey 1962) => inductive
Two kinds of mathematics
Statistics
Observed: objects and properties
Inferred: social forces
Data representation: the table
Visual representation: quantity charts
Grouping: "class" (similar properties)
Graph theory
Observed: objects and relations
Inferred: structure
Data representation: the matrix
Visual representation: network diagrams
Grouping: "clique" (dense relations)
Graph theory
Leonhard Euler, "Seven Bridges of Königsberg", 1735
Introducing the "point and line" model
Graph theory
Develops over the 20th century, in particular the second half.
Integrates branches of mathematics (topology, geometry, statistics, etc.).
Graph theory is "the mathematics of structure" (Harary 1965), "a
mathematical model for any system involving a binary relation" (Harary
1969); it makes relational structure calculable.
"Perhaps even more than to the contact between mankind and nature, graph theory owes to
the contact of human beings between each other." (König 1936)
Basic ideas
Moreno 1934
Graph theory developed in
exchange with sociometry,
small-group research and
(later) social exchange
theory.
Starting point:
"the sociometric test"
(experimental definition of
"relation")
Forsyth and Katz, 1946, "adjacency matrix"
Harary, Graph Theory, 1969
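To make the matrix representation concrete, here is a minimal sketch that turns a list of lines into an adjacency matrix. The four-point graph and the function name `adjacency_matrix` are illustrative assumptions, not taken from the cited works.

```python
def adjacency_matrix(points, lines, directed=False):
    """Return a 0/1 matrix with one row and one column per point."""
    index = {point: i for i, point in enumerate(points)}
    matrix = [[0] * len(points) for _ in points]
    for source, target in lines:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # An undirected line is recorded in both directions.
            matrix[index[target]][index[source]] = 1
    return matrix


# Hypothetical example graph (assumption for illustration).
points = ["A", "B", "C", "D"]
lines = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]
matrix = adjacency_matrix(points, lines)

for point, row in zip(points, matrix):
    print(point, row)
```

In the undirected case the matrix is symmetric, and each row sum equals the degree of the corresponding point.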
The network singularity
Why do network analysis and visualization? Which arguments are put
forward?
☉ New media: technical and conceptual structures modeled as networks
☉ The network imaginary: networks as analytical device and trending topic
☉ Calculative capacities: powerful techniques and tools
☉ Visualization: the network diagram, "visual analytics"
☉ Logistics: data, software, and hardware are available and cheap
☉ Methodology I: dissatisfaction with statistics => social network analysis (SNA)
☉ Methodology II: a "new science of networks" (Watts 2004) emerged
☉ Society: diversification, problems with demographics / statistics / theory
Adamic and Glance, "Divided They Blog", 2005
Graph theory
Graph theory consists of or provides:
☉ A basic conceptual and formal model (point and line)
☉ Descriptive and analytical language to talk about specific graphs
☉ Extensive calculability of structure
☉ Various “native” (and non-native) forms of visualization
Formalization
"As we have seen, the basic terms of digraph theory are point and line. Thus, if an
appropriate coordination is made so that each entity of an empirical system is identified
with a point and each relationship is identified with a line, then for all true statements
about structural properties of the obtained digraph there are corresponding true statements
about structural properties of the empirical system." (Harary et al. 1965)
There is always an epistemological commitment!
=> What can "carry" the reductionism and formalization?
Much of this data can be
analyzed as graphs.
Social media formalize
interaction at the interface.
What Kind of Phenomena/Data?
Interactive networks (Watts 2004): link encodes tangible interaction
☉ social network
☉ citation networks
☉ hypertext networks
Symbolic networks (Watts 2004): link is conceptual
☉ co-presence (Tracker Tracker, IMDB, etc.)
☉ co-word
☉ any kind of "structure" that can be formalized as point and line
=> do all kinds of analysis (SNA, transportation, text mining, etc.)
=> analyze structural properties in various ways
File formats
To begin, we need data in a graph file format. A number of different
file formats are used to specify graphs.
Different formats have different capacities (e.g. .gexf allows time
intervals to be specified).
The GUESS (.gdf) format:
http://courses.polsys.net/gephi/
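As a rough illustration of what a .gdf file contains, the sketch below builds one as text: a `nodedef>` header followed by node rows, then an `edgedef>` header followed by edge rows. The column choice (name, label, weight), the helper name `to_gdf`, and the four-node graph are assumptions made for this example.

```python
def to_gdf(nodes, edges):
    """Build GDF text from (name, label) nodes and (source, target, weight) edges."""
    rows = ["nodedef>name VARCHAR,label VARCHAR"]
    for name, label in nodes:
        rows.append(f"{name},{label}")
    rows.append("edgedef>node1 VARCHAR,node2 VARCHAR,weight DOUBLE")
    for source, target, weight in edges:
        rows.append(f"{source},{target},{weight}")
    return "\n".join(rows) + "\n"


# Hypothetical graph (assumption for illustration).
nodes = [("a", "A"), ("b", "B"), ("c", "C"), ("d", "D")]
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("b", "d", 1.0), ("c", "d", 2.0)]
gdf_text = to_gdf(nodes, edges)
print(gdf_text)

# To open the result in Gephi, the text could be saved to a file, e.g.:
# with open("example.gdf", "w", encoding="utf-8") as handle:
#     handle.write(gdf_text)
```

Labels containing commas would need quoting, which this sketch does not handle.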
What is a graph?
An abstract representation of nodes connected by links.
Two ways of analyzing graphs:
☉ numerical analysis (graph statistics, structural measures, etc.)
☉ visualization (network diagram, matrix, arc diagram, etc.)
Wikipedia: Glossary of graph theory
Tools are easy, concepts are hard
http://courses.polsys.net/gephi/
Vertices and edges!
Nodes and lines!
Two main types:
Directed (e.g. Twitter)
Undirected (e.g. Facebook)
Properties of nodes:
degree, centrality, etc.
Properties of edges:
weight, direction, etc.
Properties of the graph:
averages, diameter, communities, etc.
What is a graph?
Nodes: A, B, C, D
Edges: a-b, b-c, b-d, c-d
Nodes, Degree:
A: 1, B: 3, C: 2, D: 2
Nodes, Weighted Degree:
A: 1, B: 3, C: 3, D: 3
Edges, Weight:
a-b: 1, b-c: 1, b-d: 1, c-d: 2
Graph, diameter: 2
Graph, density: 0.667 (4 edges out of 6 possible)
Graph, average shortest path: 1.333 (8 steps over 6 node pairs)
Numbers are great for comparison!
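The measures for this small graph can be recomputed in a few lines. The following is a minimal sketch using only the standard library; variable names are illustrative, and path lengths are counted in steps (edge weights are ignored for distances), which is an assumption about how the slide's figures were obtained.

```python
from collections import deque
from itertools import combinations

# Undirected edges with weights, as listed on the slide.
edges = {("A", "B"): 1, ("B", "C"): 1, ("B", "D"): 1, ("C", "D"): 2}
nodes = sorted({node for edge in edges for node in edge})

neighbors = {node: set() for node in nodes}
weighted_degree = {node: 0 for node in nodes}
for (u, v), weight in edges.items():
    neighbors[u].add(v)
    neighbors[v].add(u)
    weighted_degree[u] += weight
    weighted_degree[v] += weight
degree = {node: len(adjacent) for node, adjacent in neighbors.items()}


def steps_from(start):
    """Breadth-first search: number of steps from start to every reachable node."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


n = len(nodes)
density = len(edges) / (n * (n - 1) / 2)  # existing edges / possible edges
pair_distances = [steps_from(u)[v] for u, v in combinations(nodes, 2)]
diameter = max(pair_distances)
average_shortest_path = sum(pair_distances) / len(pair_distances)

print(degree, weighted_degree)
print(diameter, round(density, 3), round(average_shortest_path, 3))
```

The six pairwise distances are 1, 2, 2, 1, 1, 1, so the average shortest path is 8/6.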
Interactive visual analytics
Bringing structure to the surface (gephi panel: "layout")
☉ different spatializations (force, geometry, etc.)
Projecting variables into the diagram (gephi panel: "ranking")
☉ Size (nodes, edges, labels, etc.)
☉ Color (nodes, edges, labels, etc.)
Deriving measures (gephi panel: "statistics")
☉ Properties of nodes, edges, structure => new variables
Analysis: e.g. correlation between spatial layout and variables?
Layout algorithms transform n-dimensional
adjacency matrices into two-dimensional diagrams
Every algorithm/technique reveals the structure
of the graph differently, shows different aspects
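To give a feel for what a force-based spatialization does, here is a minimal sketch loosely following the Fruchterman-Reingold scheme: all node pairs repel, connected nodes attract, and movement is capped by a cooling temperature. It is not Gephi's implementation; the constants, the function name `force_layout`, and the four-node graph are assumptions chosen for illustration.

```python
import math
import random


def force_layout(nodes, edges, iterations=100, seed=42):
    """Return {node: (x, y)} after a simple force-directed relaxation."""
    rng = random.Random(seed)
    pos = {node: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for node in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # assumed ideal distance between nodes
    temperature = 0.1                # maximum movement per iteration
    for _ in range(iterations):
        disp = {node: [0.0, 0.0] for node in nodes}
        # Repulsion between every pair of nodes: strength k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Attraction along edges: strength distance^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node along its net force, limited by the temperature.
        for node in nodes:
            length = max(math.hypot(disp[node][0], disp[node][1]), 1e-9)
            step = min(length, temperature)
            pos[node][0] += disp[node][0] / length * step
            pos[node][1] += disp[node][1] / length * step
        temperature *= 0.95  # cool down so the layout settles
    return {node: (x, y) for node, (x, y) in pos.items()}


# Hypothetical graph (assumption for illustration).
layout = force_layout(["A", "B", "C", "D"],
                      [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")])
for node, (x, y) in layout.items():
    print(node, round(x, 3), round(y, 3))
```

A different seed or different force constants yields a different picture of the same graph, which is one reason to compare several layouts before reading structure off a diagram.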
Nine measures of centrality (Freeman 1979)
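Two of the point centralities can be sketched briefly: relative degree (degree divided by n - 1) and relative closeness (n - 1 divided by the sum of distances to all other nodes). The normalizations follow the commonly cited forms; the four-node graph is a hypothetical example, and the closeness computation assumes a connected graph.

```python
from collections import deque


def steps_from(neighbors, start):
    """Breadth-first search distances (in steps) from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


# Hypothetical undirected graph (assumption for illustration).
neighbors = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}
n = len(neighbors)

# Relative degree centrality: share of the other nodes a node is tied to.
degree_centrality = {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Relative closeness centrality: inverse of the mean distance to all others.
closeness_centrality = {}
for node in neighbors:
    total_distance = sum(steps_from(neighbors, node).values())
    closeness_centrality[node] = (n - 1) / total_distance

for node in sorted(neighbors):
    print(node, round(degree_centrality[node], 3), round(closeness_centrality[node], 3))
```

Node B scores 1.0 on both measures because it is directly tied to every other node.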
Label  PR α=0.85  PR α=0.7  PR α=0.55  PR α=0.4  In-Degree  Out-Degree  Degree
n34    0.0944     0.0743    0.0584     0.0460    4          1           5
n1     0.0867     0.0617    0.0450     0.0345    1          2           3
n17    0.0668     0.0521    0.0423     0.0355    2          1           3
n39    0.0663     0.0541    0.0453     0.0388    5          1           6
n22    0.0619     0.0506    0.0441     0.0393    5          1           6
n27    0.0591     0.0451    0.0371     0.0318    1          0           1
n38    0.0522     0.0561    0.0542     0.0486    6          0           6
n11    0.0492     0.0372    0.0306     0.0274    3          1           4
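How the α parameter shapes PageRank (PR) can be explored with a small power-iteration sketch. Here α is assumed to be the damping factor, i.e. the probability of following a link rather than jumping to a random node; the function name `pagerank` and the four-node directed graph are hypothetical and unrelated to the network behind the table.

```python
def pagerank(nodes, edges, alpha=0.85, iterations=100):
    """Power iteration; nodes without out-links spread their score evenly."""
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        # Random jump plus the evenly shared score of dangling nodes.
        new_rank = {node: (1 - alpha) / n + alpha * dangling / n for node in nodes}
        for source in nodes:
            targets = out_links[source]
            for target in targets:
                new_rank[target] += alpha * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical directed graph (assumption): a, b and d link to c; c links to a.
nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]

pr_high = pagerank(nodes, edges, alpha=0.85)
pr_low = pagerank(nodes, edges, alpha=0.4)
for node in nodes:
    print(node, round(pr_high[node], 4), round(pr_low[node], 4))
```

With a lower α more weight goes to the random jump, so scores move closer together and the advantage of well-linked nodes shrinks.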
FB group "Islam is dangerous"
Friendship network, color: betweenness centrality
2,339 members
Average degree of 39.69
81.7% have at least one friend in the group
55.4% five or more
37.2% have 20 or more
founder and admin has 609 friends
Twitter 1% sample, co-hashtag analysis
227,029 unique hashtags, 1,627 displayed (freq >= 50)
Three views of the same network:
☉ Size: frequency, color: modularity
☉ Size: frequency, color: user diversity
☉ Size: frequency, color: degree
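A co-hashtag network of this kind can be built by linking hashtags that appear in the same tweet and filtering by frequency. The sketch below is one possible construction, not the pipeline used for the slides; the tweets, the function name `co_hashtag_network`, and the choice to count frequency as the number of tweets containing a hashtag are assumptions.

```python
from collections import Counter
from itertools import combinations


def co_hashtag_network(tweets, min_freq=1):
    """Return (hashtag frequencies, weighted edges between co-occurring hashtags)."""
    frequency = Counter()
    for tags in tweets:
        frequency.update(set(tags))  # count each hashtag once per tweet
    kept = {tag for tag, count in frequency.items() if count >= min_freq}
    edge_weights = Counter()
    for tags in tweets:
        present = sorted(set(tags) & kept)
        # Every pair of kept hashtags in the same tweet adds 1 to the edge weight.
        for first, second in combinations(present, 2):
            edge_weights[(first, second)] += 1
    return frequency, edge_weights


# Hypothetical tweets, each reduced to its list of hashtags.
tweets = [
    ["#election", "#debate"],
    ["#election", "#debate", "#news"],
    ["#election", "#news"],
    ["#cats"],
]
frequency, edge_weights = co_hashtag_network(tweets, min_freq=2)
for (first, second), weight in sorted(edge_weights.items()):
    print(first, second, weight)
```

The resulting frequencies and edge weights could then be exported to a graph file and mapped to node size and edge thickness in Gephi.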
Network statistics
betweenness centrality
degree
Relational elements of graphs can
be represented as tables (nodes
have properties) and analyzed
through statistics.
Network statistics bridge the gap
between individual units and the
structural forms they are
embedded in.
This is currently an extremely
prolific field of research.
Twitter 1% sample, co-hashtag analysis
Scatter plots: degree vs. wordFrequency, degree vs. userDiversity
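Once node properties sit in a table, a relationship such as degree vs. wordFrequency can be summarized with a correlation coefficient. The following sketch computes Pearson's r by hand; the rows of `node_table` are invented values for illustration and do not come from the Twitter sample.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical node table: one row per hashtag (values are made up).
node_table = [
    {"label": "#tag1", "degree": 12, "wordFrequency": 340},
    {"label": "#tag2", "degree": 7, "wordFrequency": 150},
    {"label": "#tag3", "degree": 3, "wordFrequency": 60},
    {"label": "#tag4", "degree": 9, "wordFrequency": 410},
    {"label": "#tag5", "degree": 1, "wordFrequency": 55},
]
degrees = [row["degree"] for row in node_table]
frequencies = [row["wordFrequency"] for row in node_table]
r = pearson(degrees, frequencies)
print(round(r, 3))
```

For heavily skewed distributions, which are common in network data, a rank-based coefficient may be a better fit than Pearson's r.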
PlugIn: Spatial Ranking
Co-like analysis of my personal FB network:
Nodes: users / Links: "liking the same thing"
Example 3: our imagination
PlugIn: Multimodal Projection
PlugIn: GeoLayout
Thank You
rieder@uva.nl
https://www.digitalmethods.net
http://thepoliticsofsystems.net
"Far better an approximate answer to the right question,
which is often vague, than an exact answer to the wrong
question, which can always be made precise. Data
analysis must progress by approximate answers, at best,
since its knowledge of what the problem really is will at
best be approximate." (Tukey 1962)

Interactive visualization and exploration of network data with Gephi

  • 1. Interactive visualization and exploration of network data with gephi Bernhard Rieder Universiteit van Amsterdam Mediastudies Department and some conceptual context
  • 2. Two kinds of mathematics Can there be data analysis without math? No. Does this imply epistemological commitments? Yes. But there are choices, e.g. between: ☉ Confirmatory data analysis => deductive ☉ Exploratory data analysis (Tukey 1962) => inductive
  • 3. Two kinds of mathematics Statistics Observed: objects and properties Inferred: social forces Data representation: the table Visual representation: quantity charts Grouping: "class" (similar properties) Graph-theory Observed: objects and relations Inferred: structure Data representation: the matrix Visual representation: network diagrams Grouping: "clique" (dense relations)
  • 4. Graph theory Leonhard Euler, "Seven Bridges of Königsberg", 1735 Introducing the "point and line" model
  • 5. Graph theory Develops over the 20th century, in particular the second half. Integrates branches of mathematics (topology, geometry, statistics, etc.). Graph theory is "the mathematics of structure" (Harary 1965), "a mathematical model for any system involving a binary relation" (Harary 1969); it makes relational structure calculable. "Perhaps even more than to the contact between mankind and nature, graph theory owes to the contact of human beings between each other." (König 1936)
  • 6. Basic ideas Moreno 1934 Graph theory developed in exchange with sociometry, small-group research and (later) social exchange theory. Starting point: "the sociometric test" (experimental definition of "relation")
  • 8. Forsythe and Katz, 1946, "adjacency matrix"
  • 10. Basic ideas The network singularity Why do network analysis and visualization? Which arguments are put forward? ☉ New media: technical and conceptual structures modeled as networks ☉ The network imaginary: networks as analytical device and trending topic ☉ Calculative capacities: powerful techniques and tools ☉ Visualization: the network diagram, "visual analytics" ☉ Logistics: data, software, and hardware are available and cheap ☉ Methodology I: dissatisfaction with statistics => SNA ☉ Methodology II: a "new science of networks" (Watts 2005) emerged ☉ Society: diversification, problems with demographics / statistics / theory
  • 11. Basic ideas Adamic and Glance, "Divided They Blog", 2005
  • 12. Graph theory Graph theory consists of or provides: ☉ A basic conceptual and formal model (point and line) ☉ Descriptive and analytical language to talk about specific graphs ☉ Extensive calculability of structure ☉ Various “native” (and non-native) forms of visualization
  • 13. Formalization "As we have seen, the basic terms of digraph theory are point and line. Thus, if an appropriate coordination is made so that each entity of an empirical system is identified with a point and each relationship is identified with a line, then for all true statements about structural properties of the obtained digraph there are corresponding true statements about structural properties of the empirical system." (Harary et al. 1965) There is always an epistemological commitment! => What can "carry" the reductionism and formalization?
  • 14. Much of these data can be analyzed as graphs. Social media formalize interaction at the interface.
  • 15. Basic ideas What Kind of Phenomena/Data? Interactive networks (Watts 2004): link encodes tangible interaction ☉ social network ☉ citation networks ☉ hypertext networks Symbolic networks (Watts 2004): link is conceptual ☉ co-presence (Tracker Tracker, IMDB, etc.) ☉ co-word ☉ any kind of "structure" that can be formalized as point and line => do all kinds of analysis (SNA, transportation, text mining, etc.) => analyze structural properties in various ways
  • 16. Basic ideas File formats To be able to begin, we need data in a graph file format. There are a number of different file formats used to specify graphs. Different formats have different capacities (e.g. .gexf allows time intervals to be specified). The GUESS (.gdf) format: http://courses.polsys.net/gephi/
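As an illustration of what such a file looks like, here is a minimal sketch that writes a small graph in the .gdf layout (a node table introduced by `nodedef>`, followed by an edge table introduced by `edgedef>`). The helper name `to_gdf`, the four-node sample data, and the choice of columns are assumptions made for this example; labels containing commas would need quoting, which the sketch does not handle.

```python
def to_gdf(nodes, edges):
    """Serialize nodes and weighted edges into GDF-style text.

    nodes: list of (name, label) tuples
    edges: list of (source, target, weight) tuples
    """
    lines = ["nodedef>name VARCHAR,label VARCHAR"]
    for name, label in nodes:
        lines.append(f"{name},{label}")
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,weight DOUBLE")
    for source, target, weight in edges:
        lines.append(f"{source},{target},{float(weight)}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data: four nodes, four weighted edges.
nodes = [("a", "A"), ("b", "B"), ("c", "C"), ("d", "D")]
edges = [("a", "b", 1), ("b", "c", 1), ("b", "d", 1), ("c", "d", 2)]

gdf_text = to_gdf(nodes, edges)
print(gdf_text)
```

Saving `gdf_text` to a file with a `.gdf` extension should give something that can be opened in gephi; other formats such as .gexf carry more information but are more verbose.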
  • 17. Basic ideas What is a graph? An abstract representation of nodes connected by links. Two ways of analyzing graphs: ☉ numerical analysis (graph statistics, structural measures, etc.) ☉ visualization (network diagram, matrix, arc diagram, etc.)
  • 18. Basic ideas Wikipedia: Glossary of graph theory Tools are easy, concepts are hard http://courses.polsys.net/gephi/
  • 19. Basic ideas What is a graph? Vertices and edges! Nodes and lines! Two main types: directed (e.g. Twitter) and undirected (e.g. Facebook). Properties of nodes: degree, centrality, etc. Properties of edges: weight, direction, etc. Properties of the graph: averages, diameter, communities, etc. Example graph with nodes A, B, C, D and edges a-b, b-c, b-d, c-d. Nodes, Degree: A: 1, B: 3, C: 2, D: 2 Nodes, Weighted Degree: A: 1, B: 3, C: 3, D: 3 Edges, Weight: a-b: 1, b-c: 1, b-d: 1, c-d: 2 Graph, diameter: 2 Graph, density: 0.667 (4 edges out of 6) Graph, average shortest path: 1.333 Numbers are great for comparison!
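The measures on this slide can be recomputed by hand or with a few lines of code. The sketch below uses only the Python standard library and assumes, as the slide's diameter of 2 suggests, that path lengths are counted in hops rather than in edge weights; the variable and function names are illustrative.

```python
from collections import deque

# Undirected example graph from the slide: (node, node, weight)
edges = [("A", "B", 1), ("B", "C", 1), ("B", "D", 1), ("C", "D", 2)]

# Adjacency structure: node -> {neighbour: weight}
adjacency = {}
for u, v, w in edges:
    adjacency.setdefault(u, {})[v] = w
    adjacency.setdefault(v, {})[u] = w

degree = {n: len(nbrs) for n, nbrs in adjacency.items()}
weighted_degree = {n: sum(nbrs.values()) for n, nbrs in adjacency.items()}


def hop_distances(start):
    """Breadth-first search: hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


nodes = sorted(adjacency)
n = len(nodes)

# Shortest-path length for every unordered pair of nodes.
pair_distances = [
    hop_distances(u)[v] for i, u in enumerate(nodes) for v in nodes[i + 1:]
]

density = len(edges) / (n * (n - 1) / 2)  # existing edges / possible edges
diameter = max(pair_distances)            # longest shortest path
average_shortest_path = sum(pair_distances) / len(pair_distances)

print(degree, weighted_degree)
print(round(density, 3), diameter, round(average_shortest_path, 3))
```

For this graph the six pairwise distances sum to 8, so the average shortest path is 8/6 ≈ 1.333, and the density is 4/6 ≈ 0.667.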
  • 22. Basic ideas Interactive visual analytics Bringing structure to the surface (gephi panel: "layout") ☉ different spatializations (force, geometry, etc.) Projecting variables into the diagram (gephi panel: "ranking") ☉ Size (nodes, edges, labels, etc.) ☉ Color (nodes, edges, labels, etc.) Deriving measures (gephi panel: "statistics") ☉ Properties of nodes, edges, structure => new variables Analysis: e.g. correlation between spatial layout and variables?
  • 23. Layout algorithms transform n-dimensional adjacency matrices into two-dimensional diagrams
  • 24. Every algorithm/technique reveals the structure of the graph differently, shows different aspects
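To make the idea of a layout algorithm more concrete, the following is a simplified force-directed sketch in the spirit of Fruchterman-Reingold: all node pairs repel each other, linked pairs additionally attract, and a cooling "temperature" caps how far a node may move per step. It is not gephi's implementation; the function name `spring_layout`, the parameter defaults, and the sample graph are assumptions for illustration.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, seed=0):
    """Return {node: (x, y)} from a basic force-directed simulation.

    nodes: list of node names; edges: list of (node, node) pairs (undirected).
    """
    rng = random.Random(seed)  # seeded so that results are reproducible
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # rough "ideal" distance between nodes
    linked = {frozenset(e) for e in edges}

    for step in range(iterations):
        # Maximum displacement per step, shrinking linearly over time.
        temperature = 0.1 * (1 - step / iterations)
        disp = {n: [0.0, 0.0] for n in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                force = k * k / d              # repulsion between all pairs
                if frozenset((u, v)) in linked:
                    force -= d * d / k         # attraction along edges
                fx, fy = dx / d * force, dy / d * force
                disp[u][0] += fx
                disp[u][1] += fy
                disp[v][0] -= fx
                disp[v][1] -= fy
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            scale = min(length, temperature) / length
            pos[n][0] += disp[n][0] * scale
            pos[n][1] += disp[n][1] * scale
    return {n: (p[0], p[1]) for n, p in pos.items()}


# Hypothetical sample: two triangles joined by one bridge edge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
layout = spring_layout(nodes, edges)
for name, (x, y) in layout.items():
    print(name, round(x, 2), round(y, 2))
```

Changing the seed, the force formulas, or the cooling schedule yields a different picture of the same graph, which is one way to see why each layout makes different aspects of the structure salient.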
  • 25. Basic ideas Interactive visual analytics Bringing structure to the surface (gephi panel: "layout") ☉ different spatializations (force, geometry, etc.) Projecting variables into the diagram (gephi panel: "ranking") ☉ Size (nodes, edges, labels, etc.) ☉ Color (nodes, edges, labels, etc.) Analysis: e.g. “correlation” between spatial layout and variables?
  • 27. Nine measures of centrality (Freeman 1979)
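Freeman's measures rest on three ideas: degree, betweenness, and closeness. The sketch below computes one raw, node-level value for each idea on a small undirected, unweighted graph; it does not reproduce all nine of Freeman's variants. Betweenness is computed with a Brandes-style accumulation, closeness is taken here as (reachable nodes) / (sum of distances), and the function name and sample graph are assumptions for this example.

```python
from collections import deque


def centralities(adjacency):
    """Return (degree, closeness, betweenness) dicts for an undirected graph.

    adjacency: {node: iterable of neighbour nodes}
    """
    nodes = list(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    closeness = {}
    betweenness = {n: 0.0 for n in nodes}

    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and remembering predecessors on those paths.
        dist = {s: 0}
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        total = sum(dist.values())
        closeness[s] = (len(dist) - 1) / total if total else 0.0

        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]

    # In an undirected graph every pair was counted from both ends.
    betweenness = {n: b / 2 for n, b in betweenness.items()}
    return degree, closeness, betweenness


# Hypothetical sample graph: edges A-B, B-C, B-D, C-D.
graph = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B", "D"], "D": ["B", "C"]}
degree, closeness, betweenness = centralities(graph)
print(degree)
print(closeness)
print(betweenness)
```

In this sample, B has the highest value on all three measures because every shortest path from A to the rest of the graph passes through it; in larger networks the three rankings often diverge.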
  • 28. Basic ideas Interactive visual analytics Bringing structure to the surface (gephi panel: "layout") ☉ different spatializations (force, geometry, etc.) Projecting variables into the diagram (gephi panel: "ranking") ☉ Size (nodes, edges, labels, etc.) ☉ Color (nodes, edges, labels, etc.) Deriving measures (gephi panel: "statistics") ☉ Properties of nodes, edges, structure => new variables Analysis: e.g. “correlation” between spatial layout and variables?
  • 31. PageRank (PR) for four values of α, with in-degree, out-degree, and degree:
      Label  PR α=0.85  PR α=0.7  PR α=0.55  PR α=0.4  In-Degree  Out-Degree  Degree
      n34    0.0944     0.0743    0.0584     0.0460    4          1           5
      n1     0.0867     0.0617    0.0450     0.0345    1          2           3
      n17    0.0668     0.0521    0.0423     0.0355    2          1           3
      n39    0.0663     0.0541    0.0453     0.0388    5          1           6
      n22    0.0619     0.0506    0.0441     0.0393    5          1           6
      n27    0.0591     0.0451    0.0371     0.0318    1          0           1
      n38    0.0522     0.0561    0.0542     0.0486    6          0           6
      n11    0.0492     0.0372    0.0306     0.0274    3          1           4
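How strongly PageRank depends on α can be explored with a short power-iteration sketch. The network behind the slide's table is not available here, so the four-node directed graph below is hypothetical, as are the function name and the fixed number of iterations; nodes without outgoing links spread their score evenly over all nodes, which is one common convention among several.

```python
def pagerank(nodes, edges, alpha=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph.

    nodes: list of node names; edges: list of (source, target) pairs.
    alpha: damping factor (probability of following a link).
    """
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - alpha) / n + alpha * dangling / n for node in nodes
        }
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += alpha * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Hypothetical directed graph: a -> c, b -> c, c -> d, d -> a.
nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "a")]

results = {alpha: pagerank(nodes, edges, alpha) for alpha in (0.85, 0.7, 0.55, 0.4)}
for alpha, ranks in results.items():
    print(alpha, {node: round(score, 4) for node, score in ranks.items()})
```

Lower values of α give more weight to the uniform "teleport" term, so scores move closer together and, as in the slide's table, the ranking of nodes can change.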
  • 32. FB group "Islam is dangerous" Friendship network, color: betweenness centrality. 2,339 members; average degree of 39.69; 81.7% have at least one friend in the group; 55.4% have five or more; 37.2% have 20 or more; founder and admin has 609 friends
  • 33. Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50) Size: frequency Color: modularity
  • 34. Size: frequency Color: user diversity Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50)
  • 35. Size: frequency Color: degree Twitter 1% sample, co-hashtag analysis 227,029 unique hashtags, 1627 displayed (freq >= 50)
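A co-hashtag network of this kind can be sketched as follows: hashtags are nodes, two hashtags are linked when they appear in the same tweet, and the link weight counts how often that happens. The sample tweets, the function name, and the decision to count a hashtag once per tweet are assumptions for this example; user diversity follows the definition given in the notes to this deck (number of unique users of a hashtag divided by hashtag frequency).

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample: (user, hashtags used in one tweet)
tweets = [
    ("u1", ["gephi", "dataviz"]),
    ("u2", ["gephi", "dataviz", "sna"]),
    ("u3", ["sna", "gephi"]),
    ("u1", ["dataviz"]),
    ("u1", ["rare", "gephi"]),
]


def co_hashtag_network(tweets, min_frequency=1):
    """Return (nodes, edges) for hashtags used at least min_frequency times."""
    frequency = Counter()
    users = defaultdict(set)
    for user, tags in tweets:
        for tag in set(tags):  # count each hashtag once per tweet
            frequency[tag] += 1
            users[tag].add(user)

    kept = {tag for tag, count in frequency.items() if count >= min_frequency}
    nodes = {
        tag: {
            "frequency": frequency[tag],
            "user_diversity": len(users[tag]) / frequency[tag],
        }
        for tag in kept
    }

    # Edge weight = number of tweets in which both hashtags occur.
    edges = Counter()
    for _, tags in tweets:
        for pair in combinations(sorted(set(tags) & kept), 2):
            edges[pair] += 1
    return nodes, dict(edges)


nodes, edges = co_hashtag_network(tweets, min_frequency=2)
print(nodes)
print(edges)
```

The `min_frequency` threshold plays the role of the "freq >= 50" filter on the slides, and the node attributes are the kind of variables that can then be projected onto size or color.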
  • 36. Network statistics (chart axes: betweenness centrality, degree) Relational elements of graphs can be represented as tables (nodes have properties) and analyzed through statistics. Network statistics bridge the gap between individual units and the structural forms they are embedded in. This is currently an extremely prolific field of research.
  • 37. Twitter 1% sample Co-hashtag analysis Degree vs. wordFrequency
  • 38. Degree vs. userDiversity Twitter 1% sample Co-hashtag analysis
  • 40. Co-like analysis of my personal FB network: Nodes: users / Links: "liking the same thing" Example 3: our imagination
  • 44. Thank You rieder@uva.nl https://www.digitalmethods.net http://thepoliticsofsystems.net "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (Tukey 1962)

Editor's notes

  1. Tukey, The Future of Data Analysis, 1962. We don't have rigorous methods for hypothesis testing in network analysis.
  2. Allows for all kinds of folding, combinations, etc. Math is not homogeneous, but sprawling! Different forms of reasoning, different modes of aggregation. These are already analytical frameworks, different ways of formalizing. Statistics: atomism, structure is implicit ("hidden forces", "social forces", cf. Durkheim) => groups are abstractions, constituted by socioeconomic similarity. Social Network Analysis: atomism, structure is explicit ("dyadic forces") => groups are concrete, constituted by social exchange.
  3. Now we can calculate (in particular via matrix algebra).
  4. Handbooks on graph theory are full of exhaustive discussions of basic graph types. Loads of vocabulary and analytical approaches.
  5. Handbooks on graph theory are full of exhaustive discussions of basic graph types. Loads of vocabulary and analytical approaches.
  6. Very large scale systems on the one side, but highly concentrated data repositories on the other. The promise of data analysis is, of course, to use that data to make sense of all the complexity.
  7. Visualization is, again, one type of analysis. Which properties of the network are "made salient" by an algorithm? http://thepoliticsofsystems.net/2010/10/one-network-and-four-algorithms/ Models behind: spring simulation, simulated annealing (http://wiki.cns.iu.edu/pages/viewpage.action?pageId=1704113)
  8. Non-force-based layouts can be extremely useful. Gephi can produce those as well.
  9. Network analysis has produced a large number of calculated metrics that take into account the structure of the network. "All in all, this process resulted in the specification of nine centrality measures based on three conceptual foundations. Three are based on the degrees of points and are indexes of communication activity. Three are based on the betweenness of points and are indexes of potential for control of communication. And three are based on closeness and are indexes either of independence or efficiency." (Freeman 1979) What concepts are these metrics based on?
  10. Network metrics are highly dependent on individual variables. Here: the same network with PageRank calculated for four different values of the damping parameter alpha (red = highest PR value, yellow = second highest, turquoise = third highest). See Rieder 2012: http://computationalculture.net/article/what_is_in_pagerank
  11. From DMI workshop on anti-Islamism and right-wing extremism. We can also look at interaction patterns: activity structure, held together by leaders?
  12. Extend word lists (what am I missing?), account for refraction. Rieder & Gerlitz 2013: http://journal.media-culture.org.au/index.php/mcjournal/article/viewArticle/620 Rieder 2012: http://firstmonday.org/ojs/index.php/fm/article/view/4199/3359
  13. Project variables into the graph. User diversity = number of unique users of a hashtag divided by hashtag frequency.
  14. Larger roles of hashtags, not all are issue markers!
  15. There is no need to analyze and visualize a graph as a network. Characterize hashtags in relation to a whole (their role beyond a particular topic sample), better understand our "fishing pole" (the sampling technique) and the weight it carries. Tbt: throwback Thursday.
  16. This is a technical process, but to be a method, there needs to be a fit between a conceptual element and a technical one. These steps translate a large number of commitments to particular ideas. A postdemographic (Rogers) approach.