3. Density problem
Dense networks are hard to visualize and interpret
Solution: pruning networks
PathFinder (Schvaneveldt, 1990)
Deleting low-weight links (De Nooy, Mrvar, and Batagelj, 2005)
Cocitation and bibliographic coupling (Persson, 2010)
Threshold for cosine values (Leydesdorff, 2007; Egghe & Leydesdorff, 2009)
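As a rough illustration of the simplest pruning idea listed above (deleting low-weight links), here is a minimal sketch. The function name `prune_by_weight`, the toy link weights, and the cutoff value are assumptions made for illustration; they are not taken from any of the cited works.

```python
def prune_by_weight(edges, threshold):
    """Keep only links whose weight is at least `threshold`.

    `edges` maps a node pair to its link weight (hypothetical layout).
    """
    return {pair: w for pair, w in edges.items() if w >= threshold}


# Hypothetical weighted links between four authors A-D.
edges = {("A", "B"): 5, ("A", "C"): 1, ("B", "C"): 2, ("C", "D"): 7}

# Drop every link with weight below 2 (arbitrary cutoff for the example).
pruned = prune_by_weight(edges, threshold=2)
```

The same pattern applies to a cosine threshold: replace the raw weights with cosine values and choose the cutoff accordingly.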
4. Cooccurrence networks
E.g. cocitation, bibliographic coupling, coauthorship…
Especially prone to density problem
[Figure: two-mode network (node sets: e.g., authors; e.g., citing papers) and the corresponding cooccurrence network]
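To make the step from a two-mode network to a cooccurrence network concrete, the sketch below counts, for every pair of authors, how many citing papers refer to both. The data layout (a dict from paper to a set of authors), the function name `cooccurrence_counts`, and the toy data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(two_mode):
    """Count how often each pair of authors is cited by the same paper.

    `two_mode` maps each citing paper to the set of authors it cites.
    """
    counts = Counter()
    for authors in two_mode.values():
        # Every pair of authors cited together adds one cooccurrence.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical two-mode network: three citing papers, three cited authors.
two_mode = {
    "p1": {"A", "B", "C"},
    "p2": {"A", "B"},
    "p3": {"B", "C"},
}
cooc = cooccurrence_counts(two_mode)
```

Even in this tiny example every author pair ends up linked, which hints at why such networks tend to become dense.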
6. Steps
Based on Zweig and Kaufmann (2011): we start from the two-mode network
1. Define pattern of interest
2. Determine interestingness of cooccurrence
3. If cooccurrence is interesting, authors are linked
7. Why interestingness?
Highly cited author
High cooccurrence counts with many other authors
Citing paper referring to many authors under consideration
Resulting cooccurrences are less important
8. Determining interestingness
Here: z-score of the observed cooccurrence count, z = (observed - Exp) / σ
How to determine Exp and σ?
Estimate by sampling from the Fixed Degree Sequence Model (FDSM): all two-mode networks with the same node degrees
Markov Chain Monte Carlo simulation: link swapping
If p < 0.0001 (or z > 3.29), we consider the link interesting
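The procedure above can be sketched as follows: repeatedly swap links in the two-mode network so that all node degrees stay fixed, record the cooccurrence counts in each sampled network, and compare the observed count with the sample mean (Exp) and standard deviation (σ). This is a minimal sketch, not the implementation used for the results reported here. The names `swap_links` and `z_scores`, the number of samples, the number of swaps per sample, the toy edge list, and the handling of σ = 0 are all assumptions.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(edges):
    """Count author pairs cited by the same paper; edges are (paper, author)."""
    by_paper = {}
    for paper, author in edges:
        by_paper.setdefault(paper, set()).add(author)
    counts = Counter()
    for authors in by_paper.values():
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


def swap_links(edges, n_swaps, rng):
    """Attempt n_swaps degree-preserving link swaps; return the new edge list."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i = rng.randrange(len(edge_list))
        j = rng.randrange(len(edge_list))
        (p1, a1), (p2, a2) = edge_list[i], edge_list[j]
        # Skip swaps that change nothing or would create a duplicate link.
        if p1 == p2 or a1 == a2:
            continue
        if (p1, a2) in edge_set or (p2, a1) in edge_set:
            continue
        edge_set -= {(p1, a1), (p2, a2)}
        edge_set |= {(p1, a2), (p2, a1)}
        edge_list[i], edge_list[j] = (p1, a2), (p2, a1)
    return edge_list


def z_scores(edges, n_samples=200, seed=0):
    """Estimate Exp and sigma per author pair by sampling, then return z."""
    rng = random.Random(seed)
    observed = cooccurrence_counts(edges)
    authors = sorted({a for _, a in edges})
    pairs = list(combinations(authors, 2))
    samples = {pair: [] for pair in pairs}
    current = list(edges)
    swaps = 10 * len(current)  # assumed mixing length between samples
    for _ in range(n_samples):
        current = swap_links(current, swaps, rng)
        counts = cooccurrence_counts(current)
        for pair in pairs:
            samples[pair].append(counts[pair])
    result = {}
    for pair in pairs:
        exp, sigma = mean(samples[pair]), pstdev(samples[pair])
        # Assumption: a pair with no variation in the samples gets z = 0.
        result[pair] = (observed[pair] - exp) / sigma if sigma > 0 else 0.0
    return result


# Hypothetical two-mode network as (citing paper, cited author) links.
edges = [
    ("p1", "A"), ("p1", "B"), ("p2", "A"), ("p2", "B"),
    ("p3", "A"), ("p3", "B"), ("p4", "C"), ("p4", "D"),
    ("p5", "C"), ("p6", "D"), ("p7", "A"), ("p7", "C"),
]
z = z_scores(edges)
interesting = {pair for pair, score in z.items() if score > 3.29}
```

Because the swaps leave every node's degree unchanged, a highly cited author or a paper with a long reference list is just as prominent in each sampled network, so only cooccurrence beyond what the degrees alone would produce yields a high z.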
12. Author cocitation
Author (co-)citations to
12 authors from bibliometrics
12 authors from information retrieval
in Scientometrics and JASIS, 1996-2000
Same data set studied by
Ahlgren, Jarneving & Rousseau (2003)
Egghe & Leydesdorff (2009)
Leydesdorff & Vaughan (2006)
18. Conclusions
Advantages
1. Both positive and negative cooccurrences
2. Thresholds correspond to specific p-values
3. Accounts for degree variations of bottom nodes
Disadvantages
1. Some nodes may become isolates
2. More computationally intensive than cosine similarity