Introduction to Networks
dsdht.wikispaces.com
Networks
A network consists of:
• a collection of individuals or entities, each called a vertex or node
• a list of pairs of vertices that are neighbors, representing edges or links
Examples:
• vertices are Facebook users; edges represent Facebook friendships
• vertices are drugs and targets; edges represent drug-target relations
Network Basic definitions
N denotes the number of vertices in a network.
• Number of possible edges (undirected): N(N-1)/2 ≈ N^2/2
• The degree of a vertex is its number of neighbors.
• The distance between two vertices is the length of the shortest path connecting them.
• If two vertices are in different components, their distance is undefined or infinite.
• The diameter of a network is taken here to be the average distance between pairs (it measures how near or far typical individuals are from each other). Many texts instead define the diameter as the largest distance between any pair and call the average the "average path length".
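The definitions above can be tried out on a small graph. The following is a minimal sketch, assuming a five-vertex undirected graph (A to E, with D isolated) chosen purely for illustration; the names `neighbors`, `distances_from`, and `pair_dist` are hypothetical helpers, not part of the deck.

```python
from collections import deque

# Illustrative undirected graph (an assumption): D has no edges.
vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E")]

neighbors = {v: set() for v in vertices}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

def distances_from(source):
    """Breadth-first search: shortest-path length to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

N = len(vertices)
max_edges = N * (N - 1) // 2                     # N(N-1)/2 possible edges
degree = {v: len(neighbors[v]) for v in vertices}

# Distance for every unordered pair; unreachable pairs get infinity.
pair_dist = {}
for i, u in enumerate(vertices):
    d = distances_from(u)
    for v in vertices[i + 1:]:
        pair_dist[(u, v)] = d.get(v, float("inf"))

finite = [d for d in pair_dist.values() if d != float("inf")]
average_distance = sum(finite) / len(finite)     # "diameter" in the deck's sense
longest_distance = max(finite)                   # diameter in the common sense
```

Pairs that involve D come out as infinite because D sits in its own component, so the average and the maximum are taken over connected pairs only.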
Network Basic definitions
Directed and Undirected Networks
• If the pairs are unordered, then the graph is undirected:
vertices = {A, B, C, D, E}
edges = ({A, B}, {A, C}, {B, C}, {C, E}).
• Otherwise it is directed:
vertices = {A, B, C, D, E}
edges = ((A, B),(A, C),(B, C),(C, E)).
Examples
• I can follow you on Twitter without you following me.
• Web page A may link to page B, but not vice versa.
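One way to make the ordered/unordered distinction concrete is to store undirected edges as unordered sets and directed edges as ordered tuples. This is only a sketch; the helper names `has_undirected_edge` and `has_directed_edge` are assumptions made for illustration.

```python
vertices = {"A", "B", "C", "D", "E"}
pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "E")]

# Undirected: each edge is an unordered pair, so {A, B} equals {B, A}.
undirected_edges = {frozenset(p) for p in pairs}

# Directed: each edge is an ordered pair, so (A, B) differs from (B, A).
directed_edges = set(pairs)

def has_undirected_edge(u, v):
    return frozenset((u, v)) in undirected_edges

def has_directed_edge(u, v):
    return (u, v) in directed_edges
```

With these definitions, `has_undirected_edge("B", "A")` is true, while `has_directed_edge("B", "A")` is false, mirroring the Twitter and web-link examples.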
Representing Networks
• edge list
2, 3
2, 4
3, 2
3, 4
4, 5
5, 2
5, 1
• adjacency list
1:
2: 3 4
3: 2 4
4: 5
5: 1 2
• adjacency matrix (row = source node, column = target node, nodes 1 to 5)
A =
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 0 0 0 1
1 1 0 0 0
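The three representations carry the same information, so one can be derived from another. A minimal sketch, starting from the edge list on this slide and assuming nodes are numbered 1 to 5 (variable names are illustrative):

```python
# Directed edge list from the slide: (source, target)
edge_list = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]
n = 5

# Adjacency list: for each node, the sorted targets of its outgoing edges.
adjacency_list = {node: [] for node in range(1, n + 1)}
for source, target in edge_list:
    adjacency_list[source].append(target)
for node in adjacency_list:
    adjacency_list[node].sort()

# Adjacency matrix: entry [i][j] is 1 when there is an edge from node i+1 to node j+1.
adjacency_matrix = [[0] * n for _ in range(n)]
for source, target in edge_list:
    adjacency_matrix[source - 1][target - 1] = 1
```

The matrix costs N^2 entries regardless of how many edges exist, whereas the edge list and adjacency list grow with the number of edges, which is usually the better trade-off for sparse networks.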
Degree Properties
• In-degree: how many directed edges (arcs) are incident on a node (example value in the slide's figure: 2)
• Out-degree: how many directed edges (arcs) originate at a node (example value in the slide's figure: 2)
• Degree (in or out): number of edges incident on a node (example value in the slide's figure: 3)
• Degree distribution: a frequency count of the occurrence of each degree, written below as (degree, number of nodes) pairs
• In-degree distribution:
[(2,3) (1,4) (0,1)]
• Out-degree distribution:
[(2,4) (1,3) (0,1)]
• (Undirected) degree distribution:
[(3,3) (2,2) (1,3)]
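As a sketch of how such distributions are computed, the snippet below uses the five-node directed edge list from the "Representing Networks" slide, not the figure whose distributions are quoted above (that figure's edges are not recoverable from the text). Variable names are illustrative.

```python
from collections import Counter

edge_list = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]
nodes = range(1, 6)

# Per-node degrees; Counter returns 0 for nodes that never appear.
in_degree = Counter(target for _, target in edge_list)
out_degree = Counter(source for source, _ in edge_list)

# Frequency count of each degree value, as (degree, number of nodes) pairs.
in_distribution = sorted(Counter(in_degree[n] for n in nodes).items(), reverse=True)
out_distribution = sorted(Counter(out_degree[n] for n in nodes).items(), reverse=True)
```

For this edge list, `in_distribution` is `[(2, 2), (1, 3)]` and `out_distribution` is `[(2, 3), (1, 1), (0, 1)]`.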
Connected Components
• Strongly connected components: each node within the component can be reached from every other node in the component by following directed links
Strongly connected components in the slide's example: {B, C, D, E}, {A}, {G, H}, {F}
• Weakly connected components: every node can be reached from every other node by following links in either direction
Weakly connected components in the slide's example: {A, B, C, D, E}, {G, H, F}
• In undirected networks one talks simply about
“connected components”
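The two notions can be compared in code. The sketch below uses a hypothetical set of directed edges, chosen only because it reproduces the components listed on this slide; the original figure's edges are not available. It uses a simple reachability approach rather than a linear-time algorithm such as Tarjan's.

```python
from collections import defaultdict

def reachable(adj, start):
    """All nodes reachable from start by following edges in adj."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def components(nodes, edges, strong):
    forward, backward, either = defaultdict(set), defaultdict(set), defaultdict(set)
    for u, v in edges:
        forward[u].add(v)
        backward[v].add(u)
        either[u].add(v)
        either[v].add(u)
    found, assigned = [], set()
    for n in nodes:
        if n in assigned:
            continue
        if strong:
            # Nodes that n can reach AND that can reach n.
            comp = reachable(forward, n) & reachable(backward, n)
        else:
            # Ignore edge direction entirely.
            comp = reachable(either, n)
        assigned |= comp
        found.append(sorted(comp))
    return found

# Hypothetical edges (an assumption, not taken from the slide's figure).
nodes = list("ABCDEFGH")
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "B"),
         ("F", "G"), ("G", "H"), ("H", "G")]

strong_components = components(nodes, edges, strong=True)
weak_components = components(nodes, edges, strong=False)
```

Here A can reach the cycle B, C, D, E but nothing leads back to A, so A forms its own strongly connected component while still belonging to the same weakly connected component.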
Giant Component
In a network, a "component" is a group of nodes (people) that are all connected to
each other, directly or indirectly. A "giant component" is a single component that
contains a large fraction of the nodes, so if a network has one, almost every node
is reachable from almost every other.
Centrality Measures
• We often want to know which nodes in a network are important.
• Centrality is one way to measure this.
There are many centrality measures; this course looks at four:
Degree – connectedness
Closeness – ease of reaching other nodes
Betweenness – role as an intermediary node
Eigenvector – not what you know, but who you know
Things to Remember
• Centrality is a measure of a node; centralization is a measure of
the network.
• It matters whether you are considering a directed or an undirected
network.
• Most centrality measures work on binary/unweighted networks
Centrality Measures
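Node degree centrality: divide the degree by the maximum possible, i.e. (N-1).
Degree centralization:
$$C_D = \frac{\sum_{i=1}^{g} \left[ C_D(n^*) - C_D(i) \right]}{(N-1)(N-2)}$$
where C_D(n^*) is the maximum value in the network and the sum runs over all g nodes (g = N here).
Example values from the slide's figures: C_D = 0.167 and C_D = 1.0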
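A short sketch of both quantities, assuming undirected graphs with nodes numbered from 0 and using raw degrees in the centralization sum. The star and path graphs are illustrative choices; on five nodes they give 1.0 and 2/12 ≈ 0.167 respectively, which agree with the values quoted on the slide, although the slide's own figures are not recoverable here.

```python
def degrees(n_nodes, edges):
    """Raw degree of each node in an undirected graph."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

def degree_centrality(n_nodes, edges):
    """Degree divided by the maximum possible degree, N - 1."""
    return [d / (n_nodes - 1) for d in degrees(n_nodes, edges)]

def degree_centralization(n_nodes, edges):
    """Sum of (max degree - degree) over all nodes, divided by (N-1)(N-2)."""
    degree = degrees(n_nodes, edges)
    max_degree = max(degree)
    total_gap = sum(max_degree - d for d in degree)
    return total_gap / ((n_nodes - 1) * (n_nodes - 2))

star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]   # node 0 is the hub
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # a simple line

star_centralization = degree_centralization(5, star_edges)
path_centralization = degree_centralization(5, path_edges)
```

The star is the most centralized arrangement possible, which is why the denominator (N-1)(N-2) equals the star's total gap and the value tops out at 1.0.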
Centrality Measures
Closeness Centrality
• One still wants to be in the “middle” of the network, not
too far from the center.
• Closeness is based on the length of the average shortest
path between a node and all other nodes in the network.
Closeness centrality:
$$C_C(i) = \left[ \sum_{j=1}^{N} d(i,j) \right]^{-1}$$
Normalized closeness centrality:
$$C'_C(i) = C_C(i) \cdot (N-1)$$
Worked example for a node A whose distances to the other four nodes are 1, 2, 3, and 4:
$$C'_C(A) = \left[ \frac{\sum_{j=1}^{N} d(A,j)}{N-1} \right]^{-1} = \left[ \frac{1+2+3+4}{4} \right]^{-1} = \left[ \frac{10}{4} \right]^{-1} = 0.4$$
An actor who has very low closeness centrality takes many more steps to get to everyone.
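The worked value of 0.4 can be reproduced with a breadth-first search. This sketch assumes a connected, undirected five-node path A - B - C - D - E, which is one graph that yields distances 1, 2, 3, 4 from A; the function names are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def normalized_closeness(adj, node):
    """(N - 1) divided by the sum of distances; assumes a connected graph."""
    dist = bfs_distances(adj, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(adj) - 1) / total

# Assumed example graph: the path A - B - C - D - E.
path = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

closeness = {node: normalized_closeness(path, node) for node in path}
```

On this path the end node A scores 4/10 = 0.4, while the middle node C scores 4/6 ≈ 0.67, reflecting that C needs fewer steps to reach everyone.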
Centrality Measures
Betweenness Centrality
• A node has high betweenness centrality when it lies on the
geodesics (shortest paths) connecting many pairs of other actors
in the network.
• It counts the shortest paths between all pairs of other vertices
that pass through that node, with fractional credit when a pair has several shortest paths.
$$C_B(i) = \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}$$
where g_{jk} is the number of shortest paths connecting j and k, and
g_{jk}(i) is the number of those paths that actor i is on.
Usually normalized by the number of pairs of vertices excluding the vertex itself:
$$C'_B(i) = \frac{C_B(i)}{(n-1)(n-2)/2}$$
Betweenness Centrality
• Non-normalized version, for the path A - B - C - D - E:
A lies between no two other vertices
B lies between A and 3 other vertices: C, D, and E
C lies between 4 pairs of vertices: (A,D), (A,E), (B,D), (B,E)
Note that there are no alternate paths for these pairs to
take, so C gets full credit.
Betweenness Example
Why do C and D each have betweenness 1?
They are both on shortest paths for the pairs
(A,E) and (B,E), and so must share
credit:
• ½ + ½ = 1
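Both betweenness examples can be checked with a direct implementation of the pair-counting formula. This is a sketch for small undirected graphs, not an efficient algorithm. The path graph follows the non-normalized example; the second graph is an assumption, chosen as one edge set consistent with C and D each scoring 1, since the slide's figure is not recoverable from the text.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Distances from source and the number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Non-normalized betweenness: sum over pairs j<k of g_jk(i) / g_jk."""
    info = {v: bfs_counts(adj, v) for v in adj}
    score = {v: 0.0 for v in adj}
    for j, k in combinations(sorted(adj), 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are in different components
        for i in adj:
            if i in (j, k) or i not in dist_j:
                continue
            dist_i, sigma_i = info[i]
            # i is on a shortest j-k path when the distances add up exactly.
            if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                score[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return score

def as_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

path = as_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
# Assumed edges for the shared-credit example (hypothetical reconstruction).
shared = as_adjacency([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
                       ("B", "D"), ("C", "E"), ("D", "E")])

path_scores = betweenness(path)
shared_scores = betweenness(shared)
normalized_c = path_scores["C"] / ((5 - 1) * (5 - 2) / 2)
```

In the assumed second graph, the pairs (A,E) and (B,E) each have two shortest paths, one through C and one through D, so each of the two nodes collects ½ from each pair.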
Eigenvector Centrality
Eigenvector centrality is calculated by assessing how well connected an individual is
to the parts of the network with the greatest connectivity.
In the slide's figure, the yellow node has the highest eigenvector centrality because it is
connected to the nodes with the greatest connectivity.
Applications: Individuals with high eigenvector centrality are leaders of the network.
They are often public figures with many connections to other high-profile
individuals. Thus, they often act as key opinion leaders and shape public
perception. A related example is Google’s PageRank algorithm, which is
closely related to eigenvector centrality calculated on websites based on links to
them.
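One common way to compute eigenvector centrality is power iteration: repeatedly replace each node's score with the sum of its neighbors' scores and rescale. The sketch below assumes a small connected, undirected graph that contains a triangle (so the iteration settles down); the graph and the function name are illustrative, not taken from the slide.

```python
def eigenvector_centrality(adj, iterations=200):
    """Power iteration; scores are rescaled so the largest equals 1."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Each node's new score is the sum of its neighbors' current scores.
        new_score = {v: sum(score[u] for u in adj[v]) for v in adj}
        largest = max(new_score.values())
        score = {v: s / largest for v, s in new_score.items()}
    return score

# Assumed example: triangle A-B-C with a tail C-D-E.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

centrality = eigenvector_centrality(graph)
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

A and D both have degree 2, yet A ends up with a higher score than D because A's neighbors (B and C) are themselves better connected than D's neighbors (C and E).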
Resources
Lada Adamic:
https://www.coursera.org/instructor/~267
Networks, Crowds, and Markets:
http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book.pdf
