This document summarizes a presentation about bibliometric network analysis tools and techniques. It discusses two software tools, VOSviewer and CitNetExplorer, that are used to construct and visualize bibliometric networks. It also outlines various network analysis techniques, including layout algorithms, community detection methods, and a unified approach to mapping and clustering networks. Finally, it provides an analysis of the structure and evolution of the field of network science based on a large bibliometric dataset.
Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University
1. Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University
Ludo Waltman and Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University
LCN2 Seminar
Leiden, November 27, 2015
2. Centre for Science and Technology Studies (CWTS)
• Research center at Leiden University focusing on science and technology studies
• About 30 staff members
• History of more than 25 years in bibliometric and scientometric research
• Contract research
• Full access to large bibliographic databases (Web of Science and Scopus)
3. Bibliographic databases: "Big data"
              Web of Science   Scopus
Journals      12,000           20,000
Publications  45 million       35 million
Citations     1 billion        0.9 billion
4. Bibliometric networks
[Diagram: a bibliographic database (Web of Science, Scopus) is the source of the following networks]
• Citation network of publications
• Co-authorship network of authors / organizations
• Co-citation network of pubs / authors / journals
• Co-occurrence network of terms
• Bibliographic coupling network of pubs / authors / journals
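As a minimal sketch of how two of these networks relate to direct citations, the NumPy snippet below derives bibliographic coupling and co-citation counts from a citation matrix. The function name `bibliometric_matrices` and the toy data are illustrative assumptions, not part of the presentation or of any tool mentioned in it.

```python
import numpy as np

def bibliometric_matrices(citations):
    """Derive bibliographic coupling and co-citation counts from direct citations.

    citations[i, j] == 1 means publication i cites publication j.
    """
    c = np.asarray(citations, dtype=int)
    # Two publications are coupled once for each reference they share.
    coupling = c @ c.T
    # Two publications are co-cited once for each publication citing both.
    co_citation = c.T @ c
    # A publication's relation with itself is not an edge in the network.
    np.fill_diagonal(coupling, 0)
    np.fill_diagonal(co_citation, 0)
    return coupling, co_citation

# Toy example (made-up data): publications 2 and 3 both cite publications 0 and 1.
citations = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
])
coupling, co_citation = bibliometric_matrices(citations)
print(coupling)
print(co_citation)
```

In this toy case publications 2 and 3 share two references, and publications 0 and 1 are cited together twice; real databases would need sparse matrices rather than dense arrays.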
12. VOSviewer and CitNetExplorer
VOSviewer
• Any type of bibliometric network
• Co-authorship, co-citation, and bibliographic coupling
• Time dimension is ignored
• Networks of at most ~10,000 nodes are supported
CitNetExplorer
• Only citation networks of publications
• Direct citation relations
• Time dimension is explicitly considered
• Millions of publications are supported
17. Unified approach to mapping and clustering
Minimize
\[ Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} d_{ij}^2 - \sum_{i<j} d_{ij} \]
where
n: number of nodes in the network
m: total weight of all edges in the network
A_ij: weight of edge between nodes i and j
k_i: total weight of all edges of node i
Mapping
x_i: vector denoting the location of node i in a p-dimensional space
\[ d_{ij} = \lVert x_i - x_j \rVert = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2} \]
Clustering
x_i: integer denoting the community to which node i belongs
γ: resolution parameter
\[ d_{ij} = \begin{cases} 0 & \text{if } x_i = x_j \\ 1/\gamma & \text{if } x_i \neq x_j \end{cases} \]
18. Unified approach: Clustering
Equivalent to a weighted variant of modularity-based community detection (Waltman et al., 2010)
Maximize
\[ \hat{Q}(x_1, \ldots, x_n) = \frac{1}{2m} \sum_{i<j} \delta(x_i, x_j) \, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right) \]
where
δ(x_i, x_j) equals 1 if x_i = x_j and 0 otherwise
\[ w_{ij} = \frac{2m}{k_i k_j} \]
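The equivalence can be checked numerically. Substituting the clustering distances (0 within a community, 1/γ between communities) into the unified quality function gives Q = (C − 2m·Q̂)/γ², where C = Σ_{i<j} (w_ij A_ij − γ) does not depend on the partition, so minimizing Q amounts to maximizing Q̂. The sketch below verifies this on a toy graph; the function names and the graph are illustrative assumptions, and the code assumes a symmetric weight matrix with zero diagonal and no isolated nodes.

```python
import numpy as np

def edge_scaling(A):
    """Return m (total edge weight) and the matrix w_ij = 2m / (k_i k_j)."""
    k = A.sum(axis=1)
    m = A.sum() / 2.0
    return m, 2.0 * m / np.outer(k, k)

def unified_q(A, labels, gamma=1.0):
    """Unified quality function with clustering distances d_ij in {0, 1/gamma}."""
    m, w = edge_scaling(A)
    d = (labels[:, None] != labels[None, :]) / gamma
    iu = np.triu_indices_from(A, k=1)  # all pairs with i < j
    return float(np.sum(w[iu] * A[iu] * d[iu] ** 2) - np.sum(d[iu]))

def weighted_modularity(A, labels, gamma=1.0):
    """Weighted modularity variant, summed over pairs i < j."""
    m, w = edge_scaling(A)
    k = A.sum(axis=1)
    same = labels[:, None] == labels[None, :]
    terms = same * w * (A - gamma * np.outer(k, k) / (2.0 * m))
    iu = np.triu_indices_from(A, k=1)
    return float(terms[iu].sum() / (2.0 * m))

# Toy graph (made-up): two triangles joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
labels = np.array([0, 0, 0, 1, 1, 1])
gamma = 1.0

m, w = edge_scaling(A)
iu = np.triu_indices_from(A, k=1)
const = float(np.sum(w[iu] * A[iu] - gamma))  # partition-independent term C
q = unified_q(A, labels, gamma)
q_hat = weighted_modularity(A, labels, gamma)
print(q, (const - 2.0 * m * q_hat) / gamma ** 2)  # the two values should agree
```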
19. Unified approach: Mapping
• Equivalent to the VOS (visualization of similarities) technique (Van Eck & Waltman, 2007)
• Limit case of multidimensional scaling (Van Eck et al., 2010)
VOS
\[ Q = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} \lVert x_i - x_j \rVert^2 - \sum_{i<j} \lVert x_i - x_j \rVert \]
MDS
\[ \sigma = \sum_{i<j} W_{ij} \left( \lVert x_i - x_j \rVert - D_{ij} \right)^2 \]
\[ D_{ij} = \frac{k_i k_j}{2m} A_{ij}^{-1}, \qquad W_{ij} = \frac{2m}{k_i k_j} A_{ij} \]
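The relation between the two objectives can be illustrated with a small numerical check. Assuming every pair of nodes has A_ij > 0, so that all D_ij are finite, expanding the square and using W_ij D_ij = 1 gives σ(2x) = 4·Q(x) + Σ_{i<j} D_ij; the two objectives then rank layouts identically up to a rescaling of the coordinates. The sketch below only covers this special case and does not handle the limit needed when some A_ij = 0. Function names and the random test data are illustrative assumptions.

```python
import numpy as np

def pairwise(A, X):
    """Return W_ij and Euclidean distances for all pairs i < j."""
    k = A.sum(axis=1)
    m = A.sum() / 2.0
    iu = np.triu_indices_from(A, k=1)
    W = (2.0 * m * A / np.outer(k, k))[iu]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))[iu]
    return W, dist

def vos_objective(A, X):
    """VOS objective: weighted squared distances minus the sum of distances."""
    W, dist = pairwise(A, X)
    return float(np.sum(W * dist ** 2) - np.sum(dist))

def mds_stress(A, X):
    """Weighted MDS stress with D_ij = 1 / W_ij (requires A_ij > 0 for i != j)."""
    W, dist = pairwise(A, X)
    D = 1.0 / W
    return float(np.sum(W * (dist - D) ** 2))

# Made-up complete weighted network with 5 nodes and a random 2-D layout.
rng = np.random.default_rng(0)
upper = np.triu(rng.uniform(0.5, 2.0, size=(5, 5)), k=1)
A = upper + upper.T
X = rng.normal(size=(5, 2))

W, _ = pairwise(A, X)
lhs = mds_stress(A, 2.0 * X)
rhs = 4.0 * vos_objective(A, X) + float(np.sum(1.0 / W))
print(lhs, rhs)  # the two values should agree up to rounding
```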
20. Unified approach
A commonly used clustering technique (modularity) and a commonly used mapping technique (MDS) can be brought together in a unified framework
[Diagram: the unified approach connects weighted modularity (clustering) with VOS (mapping), which is a limit case of MDS]
21. Louvain algorithm
• The "Louvain algorithm" (Blondel et al., 2008) is the most popular heuristic algorithm for large-scale modularity optimization
22. Louvain algorithm
[Figure: illustration of the Louvain algorithm. Labels: original network, local moving heuristic, reduced network, local moving heuristic. Modularity values shown: Q = 0.3791 and Q = 0.4151]
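The local moving heuristic named in the figure can be sketched as follows. This is a deliberately simple illustration, not the implementation of Blondel et al. or of the authors: it uses standard modularity, recomputes it from scratch for every trial move (real implementations update the gain incrementally), and omits the network reduction step. The function names, the seed parameter, and the toy graph are assumptions.

```python
import random
import numpy as np

def modularity(A, labels, gamma=1.0):
    """Standard modularity: (1/2m) * sum_ij (A_ij - gamma k_i k_j / 2m) * delta(c_i, c_j)."""
    k = A.sum(axis=1)
    two_m = A.sum()
    same = labels[:, None] == labels[None, :]
    return float(((A - gamma * np.outer(k, k) / two_m) * same).sum() / two_m)

def local_moving(A, gamma=1.0, seed=0):
    """Repeatedly move single nodes to the neighbouring community with the best gain."""
    n = A.shape[0]
    labels = np.arange(n)  # start with every node in its own community
    rng = random.Random(seed)
    improved = True
    while improved:
        improved = False
        order = list(range(n))
        rng.shuffle(order)  # nodes are visited in random order
        for i in order:
            best_label = labels[i]
            best_q = modularity(A, labels, gamma)
            # Only communities of direct neighbours are candidate targets.
            for c in set(labels[np.nonzero(A[i])[0]].tolist()):
                if c == labels[i]:
                    continue
                trial = labels.copy()
                trial[i] = c
                q = modularity(A, trial, gamma)
                if q > best_q + 1e-12:  # accept strict improvements only
                    best_q, best_label = q, c
            if best_label != labels[i]:
                labels[i] = best_label
                improved = True
    return labels

# Toy graph (made-up): two triangles joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
labels = local_moving(A)
print(labels, modularity(A, labels))
```

Because every accepted move strictly increases modularity, the loop terminates; the result is a local optimum that may depend on the visiting order.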
23. Smart local moving algorithm
• Smart local moving algorithm extends the Louvain algorithm in two ways:
  1. Multiple algorithm iterations, with output of one iteration serving as input for the next iteration
  2. Recursive application of the local moving heuristic
24. Smart local moving algorithm
[Figure: illustration of the smart local moving algorithm. Labels: original network, local moving heuristic, local moving heuristic in subnetworks, reduced network. Modularity values shown: Q = 0.3791 and Q = 0.4198]
25. Empirical comparison (large networks)
• 6 networks
• Algorithms:
  – Louvain (1 iteration)
  – Louvain (10 iterations)
  – Smart local moving (10 iterations)
• 10 algorithm runs using different random numbers
28. Algorithmic classification systems of science
• Publications (not journals) are clustered into research areas based on citation relations
• Research areas are defined at different levels of granularity and are organized hierarchically
• Clustering is performed using the smart local moving algorithm (improved Louvain algorithm; Waltman & Van Eck, 2013)
29. Algorithmically constructed classification system of science
• 16.2 million publications from the period 2000–2014 indexed in Web of Science
• 241.7 million citation relations
• Classification system of 3 hierarchical levels:
  – 28 broad disciplines
  – 813 fields
  – 3,822 subfields
30. Breakdown of scientific literature into 3,822 subfields
[Map of subfields; labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Physical sciences and engineering; Mathematics and computer science]
34. Application: Emerging research areas in physics
[Map of physics research areas; labeled areas: Particle physics; Astronomy and astrophysics; Optics; Applied physics; Atomic, molecular, and chemical physics; Condensed matter physics]
37. Network science according to
Wikipedia
Network science is an interdisciplinary academic field
which studies complex networks such as
telecommunication networks, computer networks,
biological networks, cognitive and semantic networks,
and social networks. The field draws on theories and
methods including graph theory from mathematics,
statistical mechanics from physics, data mining and
information visualization from computer science,
inferential modeling from statistics, and social
structure from sociology.
38. Networks textbook by Mark Newman
The scientific study of networks, including computer
networks, social networks, and biological networks,
has received an enormous amount of interest in the
last few years. (...) The study of networks is broadly
interdisciplinary and important developments have
occurred in many fields, including mathematics,
physics, computer and information
sciences, biology, and the social sciences.
39. Journal of Complex Networks
The journal covers everything from the basic
mathematical, physical and computational principles
needed for studying complex networks to their
applications leading to predictive models in
molecular, biological, ecological, informational,
engineering, social, technological and other systems.
40. Network Science journal
Network Science is a new journal for a new discipline -
one using the network paradigm, focusing on actors
and relational linkages, to inform research,
methodology, and applications from many fields
across the natural, social, engineering and
informational sciences.
41. Popular network terms
[Term map; labeled terms: neural network, social network, wireless sensor network, complex network, wireless network, regulatory network]
42. Network publications
• Web of Science database
• Time period 1992–2014
• Research articles and review articles
• "network" or "graph" in title or abstract
• 0.7 million publications
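The selection criteria above could be expressed as a simple filter. The sketch below is only an illustration: the record fields (`title`, `abstract`, `year`, `doc_type`) are hypothetical, and the matching rule (case-insensitive whole words, plurals accepted) is an assumption, since the slide does not say how the terms were matched.

```python
import re

# Assumption: whole-word, case-insensitive match; plural forms are also accepted.
PATTERN = re.compile(r"\b(network|graph)s?\b", re.IGNORECASE)

def is_network_publication(record):
    """Apply the slide's criteria to a record with hypothetical field names."""
    in_period = 1992 <= record["year"] <= 2014
    right_type = record["doc_type"] in {"article", "review"}
    text = f"{record.get('title', '')} {record.get('abstract', '')}"
    return in_period and right_type and bool(PATTERN.search(text))

# Made-up example records.
records = [
    {"title": "Community structure in social networks", "abstract": "",
     "year": 2004, "doc_type": "article"},
    {"title": "Graphene transistors", "abstract": "Carbon devices.",
     "year": 2010, "doc_type": "article"},
    {"title": "A graph-based model", "abstract": "", "year": 1990,
     "doc_type": "article"},
]
selected = [r for r in records if is_network_publication(r)]
print(len(selected))
```

Under the whole-word assumption, "graphene" is not treated as a match; a plain substring search would select it.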
44. Co-occurrence relations between terms in network publications
[Term map; labeled areas: Biology, Neuroscience, Social science, Chemistry, Mathematics, Computer science]
45. Co-occurrence relations between terms in network publications
[Term map; labeled areas: Biology, Neuroscience, Social science, Chemistry, Mathematics, Computer science]
46. Network fields
• Network publications are clustered into fields
• Based on 3.1 million citation relations between network publications
• Clustering methodology of Waltman and Van Eck (2012, 2013)
• Publications in the same journal are assigned to the same cluster, except for multidisciplinary journals
• 13 main clusters, covering 97% of all 0.7 million network publications
48. Citation relations between journals with ≥ 100 network publications
[Journal map; labeled areas: Computer science, Mathematics, Physics, Neuroscience, Biology, Chemistry]
49. Convergence toward an integrated network science field?
Number of citations between network fields (× 100; 5-year citation window)
[Figure: two network diagrams, 2004 and 2014, each with the nodes Physics, Math, CS, Biology, SS, and Neuro; edge labels give the citation counts between pairs of fields]
50. Convergence toward an integrated network science field?
% of publications in each of two fields citing at least one publication in the other field (5-year citation window)
[Figure: two network diagrams, 2004 and 2014, each with the nodes Physics, Math, CS, Biology, SS, and Neuro; edge labels give the percentages for pairs of fields]
52. Citation relations between journals at the SS-physics interface (2005–2014)
[Journal map; labels: Scientometrics, Economics, Sociology and SNA, Physica A, PRE, PRL, PLOS ONE, PNAS, Nature, Science, Sci. Rep., JSTAT, EPL, EPJ B]
53. Leiden University's institutes with most publications on network science
• LUMC
• Leiden Institute of Advanced Computer Science (Science)
• Leiden Institute of Chemistry (Science)
• Leiden Institute of Physics (Science)
• Institute of Psychology (FSW)
• Mathematical Institute (Science)
• Leiden Observatory (Science)
• Institute of Biology Leiden (Science)
• Centre for Science and Technology Studies (FSW)
54. Citation relations between journals with ≥ 100 network publications
[Journal map; labeled areas: Computer science, Mathematics, Physics, Neuroscience, Biology, Chemistry]
56. Leiden University's publication output in network science journals
[Journal map annotated with institutes: CWTS, Leiden Institute of Chemistry, LIACS, Leiden Institute of Physics (two locations), Institute of Psychology, LUMC, Institute of Biology Leiden, Mathematical Institute]
57. Conclusions
• Network research has increased tremendously during the past 10–15 years
• Network research covers many fields of science, but there is only limited evidence of increasing integration
• Network research in social science and physics is becoming more connected
• Leiden University contributes to all major areas of network research, although its contribution in the area of computer science is somewhat modest