Citation Graph Analysis to Identify Memes in
Scientific Literature
Tobias Kuhn, Matjaz Perc, and Dirk Helbing
http://www.tkuhn.ch
@txkuhn
ETH Zurich
Quid Inc.
11 June 2014
Citation Graph of Scientific Publications
Nodes: publications
Edges: citations (in gray)
Tobias Kuhn, ETH Zurich Citation Graph Analysis to Identify Memes in Scientific Literature 2 / 21
Citation Graph of Scientific Publications
Nodes: publications
Edges: citations (in gray)
Legend:
Natural/Agricultural Sciences
(except Physical Sciences)
Physical Sciences
Engineering and Technology
Medical and Health Sciences
Social Sciences / Humanities
Citation Graph of Scientific Publications
Entire giant component (33 million nodes) of the citation
graph of the Thomson Reuters Web of Science dataset.
Legend:
Natural/Agricultural Sciences
(except Physical Sciences)
Physical Sciences
Engineering and Technology
Medical and Health Sciences
Social Sciences / Humanities
Citation Graph: American Physical Society
Citation graph of the Physical Review journals (463k nodes).
Legend:
A: Atomic, molecular, optical phys.
B: Condensed matter, materials phys.
C: Nuclear phys.
D: Particles, fields, gravitation, cosmology
E: Statistical, nonlinear, soft matter phys.
other journals
Citation Graph: Memes
Specific phrases or “memes”
localize to specific regions in
the citation graph.
Legend:
quantum
fission
graphene
self-organized criticality
traffic flow
Scientific Memes
“Meme” was coined by Richard Dawkins:
“Just as genes propagate themselves in the gene pool by leaping from body
to body via sperm or eggs, so memes propagate themselves in the meme pool
by leaping from brain to brain via a process which, in the broad sense, can
be called imitation.” [Dawkins, The Selfish Gene]
Examples of memes:
• Melodies
• Recipes
• Cultural habits
• Scientific concepts
Genes/Memes as Network Patterns!
Dawkins’ Definition of “Gene”:
“I am using the word gene to mean a genetic unit that is small enough to last
for a number of generations and to be distributed around in many copies.”
[Dawkins, The Selfish Gene]
Our Working Definition of “Scientific Meme”:
A scientific meme is a short unit of text in a publication that is replicated in
citing publications and thereby distributed around in many copies.
Propagation Score
Propagation score P quantifies the degree to which a meme’s
occurrence aligns with the citation graph:
$$P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} \,/\, d_{\to m}}{d_{m \to \not{m}} \,/\, d_{\to \not{m}}}$$
To prevent infrequent phrases from getting a high propagation score
by chance, we add a small amount of controlled noise δ (we use
δ = 3):
$$P_m = \frac{d_{m \to m} \,/\, (d_{\to m} + \delta)}{(d_{m \to \not{m}} + \delta) \,/\, (d_{\to \not{m}} + \delta)}$$
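The noise-corrected propagation score can be sketched in a few lines of Python. This is only an illustration under assumptions the slide does not spell out: following the pre-print, d_{m→m} is taken to count meme-carrying publications that cite at least one meme-carrying publication, d_{→m} all publications that cite at least one meme-carrying publication, and the two "not m" counts the corresponding publications that cite no meme-carrying publication. The function name and the input format (a set of carrier ids and a dict of reference lists) are hypothetical.

```python
from typing import Dict, List, Set


def propagation_score(carriers: Set[str],
                      cites: Dict[str, List[str]],
                      delta: float = 3.0) -> float:
    """Propagation score P_m with controlled noise delta.

    carriers: ids of publications containing the meme (assumed input format)
    cites:    maps every publication id to the list of ids it cites
    """
    d_m_to_m = 0      # carry the meme and cite a carrier
    d_to_m = 0        # cite a carrier (meme-carrying or not)
    d_m_to_not_m = 0  # carry the meme but cite no carrier
    d_to_not_m = 0    # cite no carrier (meme-carrying or not)

    for pub, refs in cites.items():
        has_meme = pub in carriers
        cites_meme = any(ref in carriers for ref in refs)
        if cites_meme:
            d_to_m += 1
            d_m_to_m += int(has_meme)
        else:
            d_to_not_m += 1
            d_m_to_not_m += int(has_meme)

    # delta > 0 keeps both denominators and the sparking factor non-zero
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_to_not_m + delta) / (d_to_not_m + delta)
    return sticking / sparking


# Toy graph, made up for illustration: "a" and "b" carry the meme.
toy_cites = {"a": [], "b": ["a"], "c": ["a"], "d": [], "e": ["d"]}
toy_carriers = {"a", "b"}
score = propagation_score(toy_carriers, toy_cites, delta=3.0)
```

Note that with delta = 0 the function can divide by zero for phrases that are never cited or never sparked, which is one practical reason for the noise term.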
Frequency/Propagation Score for APS Data
[Figure: density of n-grams (color scale 10^0 to 10^5) over propagation score (x-axis, 10^-2 to 10^6) and relative frequency (y-axis, 10^-6 to 10^0) for the APS data, n = 1,372,365. Marked memes: quantum, fission, graphene, self-organized criticality, traffic flow.]
Randomized Network
[Figure: density of n-grams (color scale 10^0 to 10^5) over propagation score (x-axis, 10^-2 to 10^6) and relative frequency (y-axis, 10^-6 to 10^0) for the randomized (time-preserving) APS citation graph, n = 89,356.]
Meme Score
The meme score M is the product of the relative frequency f and the
propagation score P:
$$M_m = f_m P_m$$
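As a rough illustration of how the product is used for ranking, the sketch below computes meme scores for a handful of candidate phrases and sorts them. It assumes that the relative frequency f is the fraction of all publications containing the phrase; the function name and the numbers are made up for the example and are not taken from the APS data.

```python
from typing import Dict, List, Tuple


def rank_by_meme_score(stats: Dict[str, Tuple[int, float]],
                       n_publications: int) -> List[Tuple[str, float]]:
    """Rank phrases by M = f * P.

    stats maps each phrase to (number of publications containing it,
    propagation score P); both are assumed to be computed beforehand.
    """
    scores = {
        phrase: (count / n_publications) * p  # f_m * P_m
        for phrase, (count, p) in stats.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative values only: a rare phrase that propagates strongly can
# outrank a frequent phrase that propagates weakly.
toy_stats = {"rare but sticky": (10, 2.0), "common filler": (50, 0.1)}
ranking = rank_by_meme_score(toy_stats, n_publications=100)
```

The toy input shows the intended trade-off: frequency alone would favor the filler phrase, while the product favors the phrase that follows the citation graph.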
Top 20 Memes:
1. loop quantum cosmology +*
2. unparticle +*
3. sonoluminescence +*
4. MgB2 +
5. stochastic resonance +*
6. carbon nanotubes +*
7. NbSe3 +
8. black hole +*
9. nanotubes +
10. lattice Boltzmann +*
11. dark energy +*
12. Rashba
13. CuGeO3 +
14. strange nonchaotic
15. in NbSe3
16. spin Hall +
17. elliptic flow +*
18. quantum Hall +*
19. CeCoIn5 +
20. inflation +
+ annotators agreed that this is an interesting and important physics concept
* also found on the list of terms extracted from Wikipedia
Properties of the Meme Score
The meme score has a number of nice properties:
• Can be calculated efficiently and exhaustively, even on very large
datasets
• No upper limit on the length of n-grams
• No dependence on external linguistic or ontological knowledge
• No stop-word lists or other kinds of arbitrary filters or thresholds
Manual Annotation
• Two annotators (A1, A2): PhD students with a physics degree
• Annotation with respect to (1) physics concept or not and (2)
linguistic category
• Randomly extracted phrases for comparison
[Figure: annotations by A1 and A2 for terms selected by meme score, randomly, and by weighted random sampling (x-axis: number of terms, 30 to 150). Categories: physics concept / not a physics concept; noun phrase, verb, adjective or adverb, other.]
Comparison to Alternative Metrics
[Figure, first panel: area under curve A (axis 0 to 0.5) for meme score, frequency, max. absolute change over time, max. relative change over time, max. absolute difference across journals, and max. relative difference across journals.]
[Figure, second panel: percentage of Wikipedia terms (0 to 100) among the top x terms by meme score (x from 10^1 to 10^3). 40% of the top 50 terms are found on the Wikipedia list.]
Evolution over Time: Exemplary Memes
[Figure: meme score (δ = 1, axis 0 to 14) versus publication count (0.5 to 4.5 × 10^5), with years from 1940 to 2008 marked along the count axis, for quantum, fission, graphene, self-organized criticality, and traffic flow.]
Evolution over Time
[Figure: meme score (axis 0 to 12) versus publication count (0.5 to 4.5 × 10^5), with years from 1940 to 2008 marked along the count axis. Labeled memes: graphene, entanglement, MgB2, nanotubes, carbon nanotubes, quark, neutrino, Bose-Einstein, quantum Hall, black, C60, Hubbard model, quantum wells, graphite, reactions, photoemission, black hole, tricritical, Kondo, superconducting, fission, MeV, diffuse scattering.]
Meme Score Calculation
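1. Collect all phrases that stick at least once (not counting
“free-riding” on larger memes)
2. Calculate sticking and sparking factors for all collected phrases
$$M_m = f_m P_m \quad \text{with} \quad P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} \,/\, (d_{\to m} + \delta)}{(d_{m \to \not{m}} + \delta) \,/\, (d_{\to \not{m}} + \delta)}$$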
Example
Citing title:
covariant effective action for loop quantum cosmology from order reduction
Cited titles:
– quantum nature of the big bang
– absence of a singularity in loop quantum cosmology
– large scale effective theory for cosmological bounces
Sticking phrases: loop quantum cosmology, quantum, effective, for
Sparking phrases: covariant, covariant effective action, order reduction, ...
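The example above can be reproduced with a short Python sketch. It is one possible reading of the phrase collection step, not necessarily the authors' implementation: a phrase is treated as sticking if it is an n-gram shared by the citing title and a cited title and that occurrence cannot be extended to a longer shared n-gram (which is how "free-riding" is interpreted here), and as sparking if it appears in the citing title but in none of the cited titles. All function names are hypothetical, and titles are assumed to be lowercased and whitespace-tokenized.

```python
from typing import List, Set


def ngrams(tokens: List[str]) -> Set[str]:
    """All contiguous word sequences of any length."""
    n = len(tokens)
    return {" ".join(tokens[i:j]) for i in range(n) for j in range(i + 1, n + 1)}


def sticking_phrases(citing: str, cited: List[str]) -> Set[str]:
    """Shared n-grams, keeping only occurrences that are maximal in a cited title."""
    citing_grams = ngrams(citing.split())
    result: Set[str] = set()
    for title in cited:
        c = title.split()

        def shared(i: int, j: int) -> bool:
            # True if c[i:j] is a valid span that also occurs in the citing title
            return 0 <= i < j <= len(c) and " ".join(c[i:j]) in citing_grams

        for i in range(len(c)):
            for j in range(i + 1, len(c) + 1):
                # Skip spans that extend to a longer shared span (free-riding)
                if shared(i, j) and not shared(i - 1, j) and not shared(i, j + 1):
                    result.add(" ".join(c[i:j]))
    return result


def sparking_phrases(citing: str, cited: List[str]) -> Set[str]:
    """N-grams of the citing title that occur in no cited title."""
    cited_grams: Set[str] = set()
    for title in cited:
        cited_grams |= ngrams(title.split())
    return ngrams(citing.split()) - cited_grams


citing_title = ("covariant effective action for loop quantum cosmology "
                "from order reduction")
cited_titles = [
    "quantum nature of the big bang",
    "absence of a singularity in loop quantum cosmology",
    "large scale effective theory for cosmological bounces",
]
sticking = sticking_phrases(citing_title, cited_titles)
sparking = sparking_phrases(citing_title, cited_titles)
```

Under this reading, "quantum" sticks on its own because it also occurs in the first cited title, whereas "loop quantum" and "cosmology" only occur inside the longer shared phrase and are dropped.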
Conclusions
Inheritance patterns of memes in the scientific citation graph reveal a
simple mathematical regularity.
This regularity can be formalized by the meme score.
The meme score allows memes to be studied exhaustively.
Thank you for your Attention!
Twitter: @txkuhn
Pre-print article:
http://arxiv.org/abs/1404.3757
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 

Citation Graph Analysis to Identify Memes in Scientific Literature

  • 1. Citation Graph Analysis to Identify Memes in Scientific Literature. Tobias Kuhn, Matjaz Perc, and Dirk Helbing. http://www.tkuhn.ch, @txkuhn. ETH Zurich. Quid Inc., 11 June 2014.
  • 2. Citation Graph of Scientific Publications Nodes: publications Edges: citations (in gray) Tobias Kuhn, ETH Zurich Citation Graph Analysis to Identify Memes in Scientific Literature 2 / 21
  • 3. Citation Graph of Scientific Publications. Nodes: publications. Edges: citations (in gray). Legend: Natural/Agricultural Sciences (except Physical Sciences); Physical Sciences; Engineering and Technology; Medical and Health Sciences; Social Sciences / Humanities.
  • 4. Citation Graph of Scientific Publications. Nodes: publications. Edges: citations (in gray). Legend: Natural/Agricultural Sciences (except Physical Sciences); Physical Sciences; Engineering and Technology; Medical and Health Sciences; Social Sciences / Humanities.
  • 5. Citation Graph of Scientific Publications. Entire giant component (33 million nodes) of the citation graph of Thomson Reuters' Web of Science dataset. Legend: Natural/Agricultural Sciences (except Physical Sciences); Physical Sciences; Engineering and Technology; Medical and Health Sciences; Social Sciences / Humanities.
  • 6. Citation Graph: American Physical Society. Citation graph of the Physical Review journals (463k nodes). Legend: A: Atomic, molecular, optical phys.; B: Condensed matter, materials phys.; C: Nuclear phys.; D: Particles, fields, gravitation, cosmology; E: Statistical, nonlinear, soft matter phys.; other journals.
  • 7. Citation Graph: Memes. Specific phrases or “memes” localize to specific regions in the citation graph. Legend: quantum; fission; graphene; self-organized criticality; traffic flow.
  • 8. Scientific Memes
      “Meme” was coined by Richard Dawkins: “Just as genes propagate themselves in the gene pool by leaping from body to body via sperm or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation.” [Dawkins, The Selfish Gene]
      Examples of memes:
      – Melodies
      – Recipes
      – Cultural habits
      – Scientific concepts
  • 9. Genes/Memes as Network Patterns!
      Dawkins’ Definition of “Gene”: “I am using the word gene to mean a genetic unit that is small enough to last for a number of generations and to be distributed around in many copies.” [Dawkins, The Selfish Gene]
      Our Working Definition of “Scientific Meme”: A scientific meme is a short unit of text in a publication that is replicated in citing publications and thereby distributed around in many copies.
  • 10. Propagation Score
      The propagation score $P_m$ quantifies the degree to which a meme’s occurrence aligns with the citation graph:
      $$P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} / d_{\to m}}{d_{m \to \not{m}} / d_{\to \not{m}}}$$
      To prevent infrequent phrases from getting a high propagation score by chance, we can add a small amount of controlled noise $\delta$ (we use $\delta = 3$):
      $$P_m = \frac{d_{m \to m} / (d_{\to m} + \delta)}{(d_{m \to \not{m}} + \delta) / (d_{\to \not{m}} + \delta)}$$
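The propagation score above can be sketched on a toy citation graph. This is a minimal illustration, not the authors' implementation. It assumes the following reading of the counts: $d_{\to m}$ is the number of publications citing at least one meme-carrying publication, $d_{m \to m}$ is the number of those that carry the meme themselves, and $d_{\to \not{m}}$ and $d_{m \to \not{m}}$ are the corresponding counts among publications that cite no meme-carrying publication. The function name, the dictionaries `memes_of` and `cites`, and the toy data are hypothetical.

```python
from typing import Dict, Set


def propagation_score(meme: str,
                      memes_of: Dict[str, Set[str]],
                      cites: Dict[str, Set[str]],
                      delta: float = 3.0) -> float:
    """Propagation score P_m with controlled noise delta (delta > 0 assumed)."""
    d_m_to_m = d_to_m = d_m_to_notm = d_to_notm = 0
    for paper, carried in memes_of.items():
        has_meme = meme in carried
        # Does this paper cite at least one paper that carries the meme?
        cites_meme = any(meme in memes_of.get(ref, set())
                         for ref in cites.get(paper, set()))
        if cites_meme:
            d_to_m += 1
            d_m_to_m += has_meme
        else:
            d_to_notm += 1
            d_m_to_notm += has_meme
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_to_notm + delta) / (d_to_notm + delta)
    return sticking / sparking


# Hypothetical toy data: paper id -> phrases it carries, paper id -> cited ids.
memes_of = {"a": {"graphene"}, "b": {"graphene"}, "c": {"graphene"},
            "d": set(), "e": set()}
cites = {"b": {"a"}, "c": {"b"}, "d": {"a"}, "e": {"d"}}

# Sticking: 2 / (3 + 1); sparking: (1 + 1) / (2 + 1); ratio is about 0.75.
print(propagation_score("graphene", memes_of, cites, delta=1.0))
```

With `delta` set to zero the ratio can divide by zero for phrases that never stick or never spark, which is one practical reason for the noise term.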
  • 11. Frequency/Propagation Score for APS Data
      [Figure: density of n-grams in the APS data (n = 1,372,365) by propagation score (10^−2 to 10^6) and relative frequency (10^−6 to 10^0), both on logarithmic axes; color scale: density of n-grams (10^0 to 10^5). Highlighted memes: quantum, fission, graphene, self-organized criticality, traffic flow.]
  • 12. Randomized Network
      [Figure: the same plot for the randomized (time-preserving) APS network (n = 89,356); axes: propagation score and relative frequency; color scale: density of n-grams.]
  • 13. Meme Score
      The meme score $M$ is the product of relative frequency $f$ and propagation score $P$: $M_m = f_m P_m$
      Top 20 Memes:
       1. loop quantum cosmology + *
       2. unparticle + *
       3. sonoluminescence + *
       4. MgB₂ +
       5. stochastic resonance + *
       6. carbon nanotubes + *
       7. NbSe₃ +
       8. black hole + *
       9. nanotubes +
      10. lattice Boltzmann + *
      11. dark energy + *
      12. Rashba
      13. CuGeO₃ +
      14. strange nonchaotic
      15. in NbSe₃
      16. spin Hall +
      17. elliptic flow + *
      18. quantum Hall + *
      19. CeCoIn₅ +
      20. inflation +
      + annotators agreed that this is an interesting and important physics concept
      * also found on the list of terms extracted from Wikipedia
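As a rough illustration of how the product $M_m = f_m P_m$ ranks phrases, the sketch below computes the meme score from the four citation counts. It assumes that the relative frequency $f_m$ is the fraction of all publications carrying the phrase, and that every publication falls into exactly one of the two groups (citing or not citing a meme-carrying publication), so that both totals can be derived from the counts. The function name and the candidate numbers are hypothetical, not results from the talk.

```python
def meme_score(d_m_to_m: int, d_to_m: int,
               d_m_to_notm: int, d_to_notm: int,
               delta: float = 3.0) -> float:
    """Meme score M_m = f_m * P_m from the four citation counts (delta > 0)."""
    n_total = d_to_m + d_to_notm      # all publications
    n_meme = d_m_to_m + d_m_to_notm   # publications carrying the phrase
    f = n_meme / n_total              # relative frequency f_m
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_to_notm + delta) / (d_to_notm + delta)
    return f * sticking / sparking


# Hypothetical counts: (d_m_to_m, d_to_m, d_m_to_notm, d_to_notm).
candidates = {
    "rare but inherited phrase": (40, 60, 5, 940),
    "common filler phrase": (300, 900, 100, 100),
}
ranked = sorted(candidates,
                key=lambda phrase: meme_score(*candidates[phrase]),
                reverse=True)
for phrase in ranked:
    print(phrase, round(meme_score(*candidates[phrase]), 3))
```

In this made-up example the rarer phrase ranks first because it appears almost only in publications that cite other carriers, whereas the frequent phrase appears regardless of what is cited.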
  • 14. Properties of the Meme Score
      The meme score has a number of nice properties:
      – Can be calculated efficiently and exhaustively, even on very large datasets
      – No upper limit on the length of n-grams
      – No dependence on external linguistic or ontological knowledge
      – No stop-word lists or other kinds of arbitrary filters or thresholds
  • 15. Manual Annotation
      – Two annotators (A1, A2): PhD students with physics degrees
      – Annotation with respect to (1) physics concept or not and (2) linguistic category
      – Randomly extracted phrases for comparison
      [Figure: bar charts of the annotations by A1 and A2 for terms selected by meme score, at random, and by weighted random sampling; axis: number of terms (30 to 150); categories: physics concept / not a physics concept, and noun phrase / verb / adjective or adverb / other.]
  • 16. Comparison to Alternative Metrics
      [Figure, one panel: area under curve A (0 to 0.5) for the meme score, frequency, max. absolute change over time, max. relative change over time, max. absolute difference across journals, and max. relative difference across journals. Other panel: percentage of Wikipedia terms (0 to 100) among the top x terms by meme score, for x from 10^1 to 10^3.]
      40% of the top 50 terms are found on the Wikipedia list.
  • 17. Evolution over Time: Exemplary Memes
      [Figure: meme score (δ = 1; 0 to 14) against publication count (0.5 to 4.5 × 10^5), with years from 1940 to 2008 marked along the publication-count axis; memes shown: quantum, fission, graphene, self-organized criticality, traffic flow.]
  • 18. Evolution over Time
      [Figure: meme score (0 to 12) against publication count (0.5 to 4.5 × 10^5), with years from 1940 to 2008 marked along the publication-count axis; labeled memes: graphene, entanglement, MgB₂, nanotubes, carbon nanotubes, quark, neutrino, Bose-Einstein, quantum Hall, black, C60, Hubbard model, quantum wells, graphite, reactions, photoemission, black hole, tricritical, Kondo, superconducting, fission, MeV, diffuse scattering.]
  • 19. Meme Score Calculation
      1. Collect all phrases that stick at least once (not counting “free-riding” on larger memes)
      2. Calculate sticking and sparking factors for all collected phrases
      $$M_m = f_m P_m \quad \text{with} \quad P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} / (d_{\to m} + \delta)}{(d_{m \to \not{m}} + \delta) / (d_{\to \not{m}} + \delta)}$$
      Example
      Citing title: covariant effective action for loop quantum cosmology from order reduction
      Cited titles:
      – quantum nature of the big bang
      – absence of a singularity in loop quantum cosmology
      – large scale effective theory for cosmological bounces
      Sticking phrases: loop quantum cosmology, quantum, effective, for
      Sparking phrases: covariant, covariant effective action, order reduction, ...
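The first step can be illustrated on the example titles. The sketch below is one possible reading, not the authors' procedure: a phrase of the citing title "sticks" if it also occurs in a cited title, and it is treated as "free-riding" when every cited title containing it also contains a longer shared phrase that includes it. This rule is an assumption chosen because it reproduces the sticking phrases listed on the slide. The function names `ngrams`, `contains`, and `classify` are hypothetical.

```python
def ngrams(title):
    """All contiguous word sequences of a title, as tuples of words."""
    words = title.split()
    return {tuple(words[i:j])
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)}


def contains(outer, inner):
    """True if the word tuple `inner` occurs contiguously inside `outer`."""
    n = len(inner)
    return any(outer[i:i + n] == inner for i in range(len(outer) - n + 1))


def classify(citing_title, cited_titles):
    """Return (sticking, sparking) phrase sets for one citing publication."""
    citing = ngrams(citing_title)
    cited = [ngrams(t) for t in cited_titles]
    shared = {g for g in citing if any(g in c for c in cited)}
    sparking = citing - shared
    sticking = set()
    for g in shared:
        # Assumed free-riding rule: g only ever occurs in cited titles
        # that also contain a longer shared phrase covering g.
        free_riding = all(
            any(len(h) > len(g) and contains(h, g) and h in c for h in shared)
            for c in cited if g in c
        )
        if not free_riding:
            sticking.add(g)
    return ({" ".join(g) for g in sticking},
            {" ".join(g) for g in sparking})


citing_title = ("covariant effective action for loop quantum cosmology "
                "from order reduction")
cited_titles = [
    "quantum nature of the big bang",
    "absence of a singularity in loop quantum cosmology",
    "large scale effective theory for cosmological bounces",
]
sticking, sparking = classify(citing_title, cited_titles)
print(sorted(sticking))
```

Under this rule "loop" and "quantum cosmology" are dropped as free-riders on "loop quantum cosmology", while "quantum" is kept because it also occurs on its own in "quantum nature of the big bang".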
  • 20. Conclusions. Inheritance patterns of memes in the scientific citation graph reveal a simple mathematical regularity. This regularity can be formalized by the meme score, which allows memes to be studied exhaustively.
  • 21. Thank you for your Attention! Twitter: @txkuhn. Pre-print article: http://arxiv.org/abs/1404.3757