Networks
           Part II

            Sharad Goel
        Columbia University
Computational Social Science: Lecture 6

            March 1, 2013
Corporate E-mail Communication
[ Adamic & Adar, 2004 ]
via Easley & Kleinberg
Networks/Graphs

             Nodes/vertices
people, organizations, webpages, computers

                  Edges
represent connections between pairs of nodes
Distance
Length of the shortest path between two nodes
Breadth-first Search
iteratively explore nodes one layer at a time
# G: iterable of nodes; neighbors: dict mapping node -> set of adjacent nodes
# u0: starting node

# initialize distances (None = not yet reached)
dist = {}
for u in G:
    dist[u] = None

dist[u0] = 0

d = 0
periphery = {u0}
while len(periphery) > 0:
    # find unvisited nodes one step away from the periphery
    next_level = set()
    for u in periphery:
        next_level |= {w for w in neighbors[u] if dist[w] is None}

    # update distances
    d += 1
    for u in next_level:
        dist[u] = d

    # update periphery
    periphery = next_level
BFS @ scale
    undirected network

           Input
 edge list, starting node u0

          Output
Distance to all nodes from u0
BFS @ scale
        undirected network

Input: edge list, distances (u, d)
1. join distances with edge list
2. foreach (u, d, w) output (w, d+1)
   [ also output (u0, 0) ]
3. group by w, and output min d
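The three steps above can be sketched in plain Python, with dictionaries standing in for the distributed join and group-by. This is only an illustration under assumptions: the function names `bfs_round` and `bfs_at_scale` are hypothetical, and the loop that repeats rounds until nothing changes is an assumed stopping rule that the slide does not state.

```python
from collections import defaultdict


def bfs_round(edges, distances, u0):
    """One join / emit / group-by-min round over an undirected edge list.

    edges: iterable of (u, w) pairs
    distances: dict mapping node -> distance found so far
    """
    candidates = defaultdict(list)
    for u, w in edges:
        # The network is undirected, so each edge is joined from both ends.
        for a, b in ((u, w), (w, u)):
            if a in distances:
                # Steps 1-2: a joined record (a, d, b) emits (b, d + 1).
                candidates[b].append(distances[a] + 1)
    # Also emit (u0, 0) so the start node keeps distance zero.
    candidates[u0].append(0)
    # Step 3: group by node and keep the minimum distance.
    return {node: min(ds) for node, ds in candidates.items()}


def bfs_at_scale(edges, u0):
    """Repeat rounds until no distance changes (assumed stopping rule)."""
    distances = {u0: 0}
    while True:
        updated = bfs_round(edges, distances, u0)
        if updated == distances:
            return distances
        distances = updated
```

Each round extends the set of reached nodes by one layer, so the number of rounds grows with the largest distance from u0. Nodes that are not reachable from u0 never appear in the output.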
Connected Components
     undirected network

             Input
            Edge list

            Output
List of nodes for each component
Connected Graph
There is a path between every pair of nodes
Connected Component
 A connected subset of nodes that is not
contained in any larger connected subset
Connected Components
             undirected network

1. Select a node u0 that has not yet been assigned
2. BFS starting from u0
3. Record nodes reached by BFS as one component
4. Repeat until every node has been assigned
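A minimal in-memory sketch of this procedure follows, assuming the edge list fits in memory. The function name `connected_components` and the optional `isolated` argument (for nodes that appear in no edge) are hypothetical additions for illustration.

```python
from collections import defaultdict, deque


def connected_components(edges, isolated=()):
    """Return a list of node sets, one per connected component."""
    # Build an undirected adjacency structure from the edge list.
    neighbors = defaultdict(set)
    for u, w in edges:
        neighbors[u].add(w)
        neighbors[w].add(u)
    for u in isolated:
        neighbors.setdefault(u, set())

    assigned = set()
    components = []
    for u0 in neighbors:
        # Select a node that has not yet been assigned to a component.
        if u0 in assigned:
            continue
        # BFS from u0, recording every node reached.
        component = {u0}
        queue = deque([u0])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in component:
                    component.add(w)
                    queue.append(w)
        assigned |= component
        components.append(component)
    return components
```

Every node and edge is visited once, so the running time is linear in the size of the edge list.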
Consider the global human social network,
with an edge between every pair of friends

       Is this network connected?

No – there are people with no (living) friends, who
 are hence isolated from the rest of the network
Consider the global human social network,
with an edge between every pair of friends

Is there a “giant” connected component?
Consider the global human social network,
with an edge between every pair of friends

Is there more than one “giant” component?

        No – unlikely to have two
    large disconnected sets of people

        Historically it was more likely
  e.g., pre-Columbian America & Eurasia
Consider the global human social network,
   with an edge between every pair of friends

On average, how far are people from one another?
The Small-world Experiment
            Stanley Milgram, 1967

296 people were randomly selected in Omaha and Wichita

Packages sent to the selected individuals with instructions to
forward to a particular stock broker in Boston through a chain of
people they knew on a first-name basis.
The Small-world Experiment
     Stanley Milgram, 1967

  Of the 296 packages, 232 did not reach target

Of the 64 that did arrive, average path length was 6

            “Six degrees of separation”
Small-world phenomenon

Is “six degrees” big or small?
Small-world phenomenon

navigational vs. topological
The Anatomy of the Facebook Social Graph
J. Ugander, B. Karrer, L. Backstrom, C. Marlow

     721 million users, 69 billion edges
         5 degrees of separation
Edge list  degree distribution
       undirected network

             Input
            Edge list

            Output
       Degree distribution
[Figure: example network with nodes numbered 1 to 7]

Degree of node u
number of edges incident on u
Edge list  degree distribution
       undirected network

              Map
         input: (u, w)
     output: (u, w), key := u
     output: (w, u), key := w

            Reduce
      input: u, {w1, …, wk}
          output: u, k
Edge list  degree distribution
       undirected network

              Map
           input: u, k
        identity, key := k

            Reduce
       input: k, {u1, …, um}
          output: k, m
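The two map/reduce rounds can be mimicked in a few lines of Python, with a `Counter` playing the role of each reduce step. This is a sketch that assumes the edge list has no duplicate edges or self-loops; the function name `degree_distribution` is hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Return a Counter mapping degree k -> number of nodes with degree k."""
    # Round 1: emit each edge under both endpoints, then count records per node.
    degree = Counter()
    for u, w in edges:
        degree[u] += 1
        degree[w] += 1
    # Round 2: re-key each (u, k) record by k and count nodes per degree.
    return Counter(degree.values())
```

Nodes that appear in no edge have degree zero and are not visible from the edge list alone, so they are missing from the result.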
An email network of 130M users
Edges indicate reciprocated communication
                (log-log plot)
Clustering
Triadic closure

1. Opportunity
2. Incentive
3. Commonality
Counting Triangles
          undirected network

                 Input
              adjacency list

                Output
Number of triangles incident on each node
Counting Triangles
                     In memory


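# neighbors: dict mapping node -> set of adjacent nodes
triangles = {}
for u in nodes:
    triangles[u] = 0
    for w in neighbors[u]:
        # every common neighbor x of u and w closes a triangle u-w-x
        triangles[u] += len(neighbors[w] & neighbors[u])
    # each triangle at u was counted twice (once via w, once via x)
    triangles[u] //= 2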
Counting Triangles
                       @ scale


Every node needs to know to which nodes it is connected
                         and
      to which nodes its neighbors are connected
Counting Triangles
        @ scale

         Map
  input: u {w1, …, wk}
  foreach wi:
            output wi u {w1, …, wk}

        Reduce
In memory triangle count
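One way to read the map and reduce steps above is sketched below, with a dictionary grouping the mapper output by key. The names `map_phase` and `reduce_phase` are hypothetical, and the sketch assumes a symmetric adjacency list with neighbor sets stored as Python sets.

```python
from collections import defaultdict


def map_phase(adjacency):
    """For each node u, send u's full neighbor set to every neighbor wi."""
    for u, nbrs in adjacency.items():
        for w in nbrs:
            yield w, (u, nbrs)


def reduce_phase(records):
    """Count triangles at each node from the neighbor sets it received."""
    grouped = defaultdict(list)
    for key, value in records:
        grouped[key].append(value)

    triangles = {}
    for node, received in grouped.items():
        # The senders are exactly this node's own neighbors.
        own = {u for u, _ in received}
        # Same count as the in-memory version: shared neighbors per edge.
        total = sum(len(own & nbrs) for _, nbrs in received)
        # Each triangle is seen twice, once from each of the other two corners.
        triangles[node] = total // 2
    return triangles
```

Each reducer then holds exactly what the in-memory algorithm needs for one node. The cost is that a node of degree k has its neighbor list copied k times, which can be expensive for very high-degree nodes.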
Homophily
the tendency of individuals to associate with similar others


              “birds of a feather flock together”
Birds of a Feather: Homophily in Social Networks
          McPherson, Smith-Lovin, Cook

            race, sex, age, religion, education,
            occupation, social class, behaviors,
              attitudes, abilities, aspirations
Homophily

1. Preference
2. Influence
3. Opportunity
Fantasy Football
Computing Homophily

                  Input
    Edge list, race of each individual

                  Output
   Distribution of race among friends

          White    Black   Latino   Asian
White
Black
Latino
Asian
Computing Homophily

1.   Join edges (u, v) by u, demographics (w, race) by w
2.   Join edges (u, v, urace) by v, demographics (w, race) by w
3.   Group edges (u, v, urace, vrace) by sorted([urace, vrace])
4.   Count edges in each group
5.   Normalize the table
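The five steps can be sketched in memory as follows, with dictionary lookups standing in for the two joins. The function name `homophily_table` is hypothetical, and normalizing by the total number of edges is an assumption: the slide does not say whether the table is normalized overall or row by row.

```python
from collections import Counter


def homophily_table(edges, race):
    """Return the share of edges falling in each unordered pair of groups.

    edges: iterable of (u, v) pairs
    race: dict mapping individual -> group label
    """
    # Steps 1-2: attach the label of each endpoint (the two joins).
    joined = [
        (u, v, race[u], race[v])
        for u, v in edges
        if u in race and v in race
    ]
    # Steps 3-4: group by the sorted label pair and count edges per group.
    counts = Counter(tuple(sorted((urace, vrace))) for _, _, urace, vrace in joined)
    # Step 5: normalize by the total number of labeled edges (assumption).
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}
```

Sorting the label pair makes (a, b) and (b, a) land in the same group, which is what an undirected network requires.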
How do ideas and products
 spread through society?
The structure of diffusion


[Figure: cascade structures with their relative frequencies: 93%, 5%, 1%, 0.3%, 0.3%]
Computing the structure of diffusion

                Input
          Twitter network
 Time-stamped “adoptions” of 1B URLs

               Output
   Distribution of cascade structures
Computing the structure of diffusion

 We assume v influenced u to adopt link t if
v is the last of u’s contacts to adopt before u

[Figure: example network with nodes labeled 2, 3, 5 and 9]
Computing the structure of diffusion

   Draw a labeled edge from v to u




[Figure: an edge labeled t drawn from the node labeled 3 to the node labeled 5]
Computing the structure of diffusion

   Group edges by their labels (URL)
Computing the structure of diffusion

Compute the connected components for each
      forest corresponding to a URL
Computing the structure of diffusion

Definition. Two (rooted) trees are isomorphic if they are
identical under a relabeling of the vertices.
x          (x)            (x, x)             ((x))         (x, (x))




Basis. The canonical name c(T) for the one-node tree T is x.

Induction. If T has more than one node, let T1, . . . ,Tk denote the
subtrees of the root indexed such that c(T1) ≤ c(T2) ≤ · · · ≤ c(Tk)
under the lexicographic order. Then the canonical name for T is
(c(T1), . . . ,c(Tk)).
                           Aho et al. [1974]
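The recursive definition translates directly into a short function. This is a sketch: the name `canonical_name` and the child-list representation are hypothetical, and sorting shorter names first is an assumed ordering, chosen so that the output matches the example names shown above, such as (x, (x)). Any fixed total order on names gives a valid canonical form.

```python
def canonical_name(children, root):
    """Canonical name of the rooted tree below `root`.

    children: dict mapping node -> list of child nodes (leaves may be absent)
    """
    kids = children.get(root, [])
    # Basis: the one-node tree is named x.
    if not kids:
        return "x"
    # Induction: name each subtree, sort the names, and wrap them in parentheses.
    names = sorted(
        (canonical_name(children, c) for c in kids),
        key=lambda s: (len(s), s),  # assumed order: shorter names first
    )
    return "(" + ", ".join(names) + ")"
```

Two rooted trees are isomorphic exactly when their canonical names are equal, so grouping trees by name groups them by shape. The recursion depth equals the tree height, which could hit Python's recursion limit on very deep trees.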
Computing the structure of diffusion

 Compute the canonical name for each
       tree in the URL forests
Computing the structure of diffusion

Count the number of trees of each type
Computing the structure of diffusion


1. Join adoptions (link, u, uts) by u, edges (u, w) by u
2. Join (link, u, uts, w) by (link, w),
   adoptions (link, w, wts) by (link, w)
3. Group (link, u, uts, w, wts) by (link, u)
4. Output unique “parent” edge (link, u, uts, w, wts) for each group
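An in-memory sketch of these four steps follows, with dictionary lookups in place of the joins. The function name `parent_edges` is hypothetical. The sketch assumes that an edge (u, w) means w is one of u's contacts and that each user adopts a given link at most once.

```python
from collections import defaultdict


def parent_edges(adoptions, edges):
    """Map (link, u) -> (w, wts): the last contact of u to adopt before u.

    adoptions: iterable of (link, user, timestamp)
    edges: iterable of (u, w), read as "w is a contact of u" (assumption)
    """
    contacts = defaultdict(set)
    for u, w in edges:
        contacts[u].add(w)
    adopted_at = {(link, user): ts for link, user, ts in adoptions}

    parents = {}
    for (link, u), uts in adopted_at.items():
        # Steps 1-3: collect u's contacts who adopted the same link before u.
        candidates = [
            (adopted_at[(link, w)], w)
            for w in contacts.get(u, ())
            if (link, w) in adopted_at and adopted_at[(link, w)] < uts
        ]
        # Step 4: keep the single most recent earlier adopter as the parent.
        if candidates:
            wts, w = max(candidates)
            parents[(link, u)] = (w, wts)
    return parents
```

Users with no earlier-adopting contact get no parent edge, so they become the roots of the trees for that link.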

Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 

Computational Social Science, Lecture 06: Networks, Part II

  • 1. Networks Part II Sharad Goel Columbia University Computational Social Science: Lecture 6 March 1, 2013
  • 2. Corporate E-mail Communication [ Adamic & Adar, 2004 ] via Easley & Kleinberg
  • 3. Networks/Graphs Nodes/vertices people, organizations, webpages, computers Edges represent connections between pairs of nodes
  • 5. Distance Length of the shortest path between two nodes
  • 6. Distance Length of the shortest path between two nodes
  • 7. Breadth-first Search iteratively explore nodes one layer at a time
  • 8.
        # initialize distances (None = not yet reached)
        dist = {u: None for u in G}
        dist[u0] = 0

        d = 0
        periphery = {u0}
        while len(periphery) > 0:
            # find unvisited nodes one step away from the periphery
            next_level = set()
            for u in periphery:
                next_level |= {w for w in neighbors[u] if dist[w] is None}

            # update distances
            d += 1
            for u in next_level:
                dist[u] = d

            # update periphery
            periphery = next_level
  • 9. BFS @ scale (undirected network). Input: edge list, starting node u0. Output: distance to all nodes from u0.
  • 10. BFS @ scale (undirected network). Input: edge list, distances (u, d). 1. Join distances with edge list. 2. For each (u, d, w), output (w, d+1) [also output (u0, 0)]. 3. Group by w, and output min d.
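The join, emit, and group-by-min steps above can be sketched in plain Python. This is only an in-memory illustration, not the lecture's actual implementation: the function names `bfs_round` and `bfs_at_scale` are hypothetical, and the sketch assumes the round is repeated until the distances stop changing, which the slide does not state explicitly.

```python
def bfs_round(edges, dist, u0):
    """One join / emit / group-by-min round over an undirected edge list."""
    # Assumption: each undirected edge is expanded into two directed records,
    # so that joining on the first endpoint sees both directions.
    directed = edges + [(w, u) for (u, w) in edges]
    # Steps 1-2: join distances with the edge list on u and emit (w, d + 1).
    candidates = [(w, dist[u] + 1) for (u, w) in directed if u in dist]
    # Also emit the starting node with distance 0.
    candidates.append((u0, 0))
    # Step 3: group by w and keep the minimum distance.
    new_dist = {}
    for w, d in candidates:
        new_dist[w] = min(d, new_dist.get(w, d))
    return new_dist


def bfs_at_scale(edges, u0):
    """Repeat rounds until no distance changes (assumed stopping rule)."""
    dist = {u0: 0}
    while True:
        new_dist = bfs_round(edges, dist, u0)
        if new_dist == dist:
            return dist
        dist = new_dist


edges = [(1, 2), (2, 3), (3, 4), (1, 5), (6, 7)]
distances = bfs_at_scale(edges, 1)
```

Nodes that cannot be reached from u0 (here 6 and 7) never receive a distance, so they are simply absent from the result.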
  • 11. Connected Components (undirected network). Input: edge list. Output: list of nodes for each component.
  • 12. Connected Graph There is a path between every pair of nodes
  • 13. Connected Graph There is a path between every pair of nodes
  • 14. Connected Component A connected subset of nodes that is not contained in any larger connected subset
  • 15. Connected Components (undirected network). 1. Select a node u0 that has not yet been assigned. 2. BFS starting from u0. 3. Record nodes reached by BFS. Repeat until every node has been assigned.
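A minimal in-memory sketch of this procedure, assuming the graph fits in a Python dictionary of neighbor sets. The function name `connected_components` is illustrative. Nodes with no edges do not appear in an edge list, so they would need to be added separately.

```python
from collections import deque


def connected_components(edges):
    """Return a list of components, each a sorted list of nodes."""
    # Build neighbor sets from the undirected edge list.
    neighbors = {}
    for u, w in edges:
        neighbors.setdefault(u, set()).add(w)
        neighbors.setdefault(w, set()).add(u)

    assigned = set()
    components = []
    for u0 in neighbors:
        # Step 1: select a node that has not yet been assigned.
        if u0 in assigned:
            continue
        # Step 2: BFS starting from u0.
        reached = {u0}
        queue = deque([u0])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in reached:
                    reached.add(w)
                    queue.append(w)
        # Step 3: record the nodes reached by the BFS.
        assigned |= reached
        components.append(sorted(reached))
    return components


components = connected_components([(1, 2), (2, 3), (4, 5)])
```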
  • 16. Consider the global human social network, with an edge between every pair of friends Is this network connected?
  • 17. Consider the global human social network, with an edge between every pair of friends Is this network connected? No – there are people with no (living) friends, who are hence isolated from the rest of the network
  • 18. Consider the global human social network, with an edge between every pair of friends Is there a “giant” connected component?
  • 19. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component?
  • 20. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component? No – unlikely to have two large disconnected sets of people
  • 21. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component? No – unlikely to have two large disconnected sets of people Historically it was more likely e.g., pre-Columbian America & Eurasia
  • 22. Consider the global human social network, with an edge between every pair of friends On average, how far are people from one another?
  • 23. The Small-world Experiment (Stanley Milgram, 1967). 296 people were randomly selected in Omaha and Wichita. Packages were sent to the selected individuals with instructions to forward them to a particular stock broker in Boston through a chain of people they knew on a first-name basis.
  • 24. The Small-world Experiment (Stanley Milgram, 1967). Of the 296 packages, 232 did not reach the target. Of the 64 that did arrive, the average path length was 6: “Six degrees of separation”
  • 25. Small-world phenomenon Is “six degrees” big or small?
  • 27. The Anatomy of the Facebook Social Graph J. Ugander, B. Karrer, L. Backstrom, C. Marlow 721 million users, 69 billion edges 5 degrees of separation
  • 28. Edge list → degree distribution (undirected network). Input: edge list. Output: degree distribution.
  • 29. Degree of node u: # of edges incident on u [figure: example graph with nodes labeled 1 to 7]
  • 30. Edge list → degree distribution (undirected network). Map: input (u, w); output (u, w) with key := u, and (w, u) with key := w. Reduce: input u, {w1, …, wk}; output u, k.
  • 31. Edge list → degree distribution (undirected network). Map: input u, k; identity, key := k. Reduce: input k, {u1, …, um}; output k, m.
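The two map/reduce stages above can be imitated in memory with ordinary dictionaries. This is a sketch only: the function names `degrees` and `degree_distribution` are hypothetical, and grouping by key stands in for the shuffle step of a real MapReduce job.

```python
from collections import Counter, defaultdict


def degrees(edges):
    """Stage 1: key each edge by both endpoints, then count per node."""
    grouped = defaultdict(list)
    for u, w in edges:
        grouped[u].append(w)  # output (u, w), key := u
        grouped[w].append(u)  # output (w, u), key := w
    # Reduce: input u, {w1, ..., wk}; output u, k.
    return {u: len(ws) for u, ws in grouped.items()}


def degree_distribution(deg):
    """Stage 2: key each node by its degree k, then count nodes per k."""
    return dict(Counter(deg.values()))


edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
deg = degrees(edges)
dist_of_degrees = degree_distribution(deg)
```

For this small example, one node has degree 3, two have degree 2, and one has degree 1.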
  • 32. An email network of 130M users Edges indicate reciprocated communication
  • 33. An email network of 130M users Edges indicate reciprocated communication (log-log plot)
  • 36. Triadic closure 1. Opportunity 2. Incentive 3. Commonality
  • 37. Counting Triangles (undirected network). Input: adjacency list. Output: number of triangles incident on each node.
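  • 38. Counting Triangles (in memory)
        # neighbors[u] is assumed to be a set of nodes adjacent to u
        triangles = {}
        for u in nodes:
            triangles[u] = 0
            for w in neighbors[u]:
                triangles[u] += len(neighbors[w] & neighbors[u])
            # each triangle at u is found twice, once from each of its other two corners
            triangles[u] = triangles[u] // 2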
  • 39. Counting Triangles @ scale Every node needs to know to which nodes it is connected and to which nodes its neighbors are connected
  • 40. Counting Triangles @ scale. Map: input u, {w1, …, wk}; for each wi, output key wi with value (u, {w1, …, wk}). Reduce: in-memory triangle count.
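One way to read this map/reduce step is sketched below in plain Python. The names `map_phase` and `reduce_phase` are illustrative, and the grouping dictionary stands in for the shuffle. Each node w receives the adjacency set of every neighbor, which is exactly the information the in-memory count needs.

```python
from collections import defaultdict


def map_phase(adjacency):
    """For each neighbor wi of u, send u's whole adjacency set to wi."""
    for u, nbrs in adjacency.items():
        for w in nbrs:
            yield w, (u, nbrs)


def reduce_phase(records):
    """At each node w, run the in-memory triangle count."""
    grouped = defaultdict(dict)
    for w, (u, nbrs) in records:
        grouped[w][u] = nbrs
    triangles = {}
    for w, nbr_sets in grouped.items():
        # The senders are exactly w's neighbors.
        my_nbrs = set(nbr_sets)
        total = sum(len(my_nbrs & nbrs) for nbrs in nbr_sets.values())
        # Each triangle at w is seen twice, once from each of its other corners.
        triangles[w] = total // 2
    return triangles


adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
triangle_counts = reduce_phase(map_phase(adjacency))
```

A practical concern with this layout is that a node's full adjacency set is copied once per neighbor, so high-degree nodes generate a lot of intermediate data.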
  • 41. Homophily the tendency of individuals to associate with similar others “birds of a feather flock together”
  • 42. Birds of a Feather: Homophily in Social Networks McPherson, Smith-Lovin, Cook race, sex, age, religion, education, occupation, social class, behaviors, attitudes, abilities, aspirations
  • 45. Computing Homophily. Input: edge list, race of each individual. Output: distribution of race among friends [table: rows and columns White, Black, Latino, Asian]
  • 46. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w
  • 47. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace])
  • 48. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace]) 4. Count edges in each group 5. Normalize the table
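The five steps can be sketched in memory as follows. The function name `race_mixing_table` is hypothetical, and two points are assumptions rather than something the slides specify: the dictionary lookups stand in for the two joins, and "normalize" is taken to mean dividing each count by the total number of edges (normalizing by row would be another reasonable reading).

```python
from collections import Counter


def race_mixing_table(edges, race):
    """Fraction of edges falling in each unordered pair of races."""
    # Steps 1-2: attach urace and vrace to each edge (u, v).
    joined = [
        (u, v, race[u], race[v])
        for u, v in edges
        if u in race and v in race
    ]
    # Steps 3-4: group by sorted([urace, vrace]) and count edges per group.
    counts = Counter(tuple(sorted([ur, vr])) for _, _, ur, vr in joined)
    # Step 5: normalize (assumed here: divide by the total edge count).
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


edges = [(1, 2), (1, 3), (2, 4), (3, 4)]
race = {1: "White", 2: "White", 3: "Black", 4: "Black"}
table = race_mixing_table(edges, race)
```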
  • 49. How do ideas and products spread through society?
  • 50. The structure of diffusion [figure: cascade structures with frequencies 93%, 5%, 1%, 0.3%, 0.3%]
  • 51. Computing the structure of diffusion. Input: Twitter network, time-stamped “adoptions” of 1B URLs. Output: distribution of cascade structures.
  • 52. Computing the structure of diffusion. We assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u [figure: nodes with adoption times 2, 3, 5, 9]
  • 53. Computing the structure of diffusion. Draw a labeled edge from v to u [figure: edge labeled t from the node at time 3 to the node at time 5]
  • 54. Computing the structure of diffusion Group edges by their labels (URL)
  • 55. Computing the structure of diffusion Compute the connected components for each forest corresponding to a URL
  • 56. Computing the structure of diffusion Definition. Two (rooted) trees are isomorphic if they are identical under a relabeling of the vertices.
  • 57. Canonical names for rooted trees (Aho et al. [1974]). Examples: x, (x), (x, x), ((x)), (x, (x)). Basis: the canonical name c(T) for the one-node tree T is x. Induction: if T has more than one node, let T1, . . . , Tk denote the subtrees of the root, indexed such that c(T1) ≤ c(T2) ≤ · · · ≤ c(Tk) under the lexicographic order. Then the canonical name for T is (c(T1), . . . , c(Tk)).
  • 58. Computing the structure of diffusion Compute the canonical name for each tree in the URL forests
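A small recursive sketch of the canonical-name construction. The helper names `canonical_form` and `render` are hypothetical, and the sketch assumes each tree is given as a dictionary mapping a node to its list of children. It uses nested tuples as the internal form, because Python compares tuples lexicographically and the empty tuple (a leaf) sorts first, which reproduces the ordering seen in the example name (x, (x)).

```python
def canonical_form(children, root):
    """Nested-tuple form of the tree: a leaf is (), an inner node is the
    sorted tuple of its subtrees' forms."""
    return tuple(
        sorted(canonical_form(children, c) for c in children.get(root, ()))
    )


def render(form):
    """Write the nested-tuple form using the x / parenthesis notation."""
    if not form:
        return "x"
    return "(" + ", ".join(render(f) for f in form) + ")"


# Two trees that differ only in vertex labels and child order.
tree_a = {"a": ["b", "c"], "c": ["d"]}
tree_b = {1: [2, 3], 2: [4]}
name_a = render(canonical_form(tree_a, "a"))
name_b = render(canonical_form(tree_b, 1))
```

Because isomorphic trees receive the same name, counting tree types reduces to counting identical strings.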
  • 59. Computing the structure of diffusion Count the number of trees of each type
  • 60. Computing the structure of diffusion. We assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u [figure: nodes with adoption times 2, 3, 5, 9]
  • 61. Computing the structure of diffusion. Draw a labeled edge from v to u [figure: edge labeled t from the node at time 3 to the node at time 5]
  • 62. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w)
  • 63. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w) 3. Group (link, u, uts, w, wts) by (link, u)
  • 64. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w) 3. Group (link, u, uts, w, wts) by (link, u) 4. Output unique “parent” edge (link, u, uts, w, wts) for each group
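The four join and group steps can be sketched in memory as below. The function name `parent_edges` is illustrative, and the sketch assumes each user adopts a given link at most once, that an edge (u, w) means w is one of u's contacts, and that dictionary lookups may stand in for the joins.

```python
def parent_edges(adoptions, edges):
    """Map (link, u) to (w, wts): the last contact of u to adopt before u."""
    # Index adoption times by (link, user); assumes one adoption per pair.
    adopted_at = {(link, u): ts for link, u, ts in adoptions}
    contacts = {}
    for u, w in edges:
        contacts.setdefault(u, []).append(w)

    parents = {}
    for link, u, uts in adoptions:
        # Steps 1-2: join u's adoption with u's contacts, then with each
        # contact's adoption of the same link.
        candidates = [
            (adopted_at[(link, w)], w)
            for w in contacts.get(u, [])
            if (link, w) in adopted_at and adopted_at[(link, w)] < uts
        ]
        # Steps 3-4: within the (link, u) group, keep the unique parent,
        # i.e. the contact with the latest adoption time before uts.
        if candidates:
            wts, w = max(candidates)
            parents[(link, u)] = (w, wts)
    return parents


adoptions = [("t", "a", 2), ("t", "b", 3), ("t", "c", 5), ("t", "u", 9), ("s", "u", 1)]
edges = [("u", "a"), ("u", "b"), ("u", "c"), ("b", "a")]
parents = parent_edges(adoptions, edges)
```

Adopters with no earlier-adopting contact get no parent edge, which makes them the roots of the cascade trees.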