Early Detection and Forecasting of
Research Trends
Angelo Antonio Salatino
@angelosalatino
Advisors:
Prof. Enrico Motta
Dr. Francesco Osborne
ISWC 2015 – Doctoral Consortium
Problem
Who cares?
• Researchers: following the evolution of
the research environment
• Academic publishers: promoting up-to-date
and interesting content
• Companies: early intelligence on
potentially important research trends to
remain at the forefront of innovation
• Funding bodies: improved understanding
of the research landscape
State of the art: Trend detection
• Topic evolution using bibliometric analysis:
– Content analysis
• Topics extraction
• Main terms in documents
– Citation analysis
– Main limitation: cannot detect new trends
early enough in the lifecycle
[Wu et al. 2011, Bolelli et al. 2009, He et al. 2009]
State of the art: Forecasting impact
• Impact based on number of publications and
authors associated with topics
• Approaches based on exponential
smoothing, simple moving average and
machine learning
• Limitations:
– These approaches don’t work at embryonic and
early stages
– They only use a limited set of data sources
[Budi et al. 2012, Jun et al. 2010, Tseng et al. 2009]
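The two classical forecasting techniques named on this slide can be sketched as follows. This is a minimal illustration, not the method of any of the cited works: the function names, the smoothing factor and the yearly publication counts are all assumptions made up for the example.

```python
def simple_moving_average(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing(series, alpha):
    """Forecast the next value with simple exponential smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical yearly publication counts for one topic (illustrative only).
papers_per_year = [2, 3, 5, 8, 13, 21]

sma_forecast = simple_moving_average(papers_per_year, window=3)
ses_forecast = exponential_smoothing(papers_per_year, alpha=0.5)
```

Both functions need a history of observations to work from. With only one or two data points they return little more than the last observed value, which is one way to read the limitation listed above for the embryonic and early stages.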
Planned approach
Wider range of data sources:
comprehensive knowledge base integrating
both scholarly data and social media
Planned approach
Focus on discovering patterns emerging from the
research dynamics:
– For example, before the Semantic Web
emerged explicitly as a research area we
could identify new interesting dynamics
involving authors from different research
areas such as knowledge representation,
agent systems, hypertext and databases.
– Creation of a model that takes into
account all the discovered patterns which
may involve different entities (e.g.,
authors, venues, topics, communities)
Initial study
• Goal: To identify the dynamics that may
indicate the emergence of a new topic
• Approach:
– Integration of Keywords network and Semantic
topics network (Klink-2, Osborne et al. @ ISWC
2015)
– Analysis of the evolution in time of sub-networks
that will generate new topics vs. a control
group of established topics.
• Debutant group (new topics)
• Non-debutant group (established topics)
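The idea of following a keyword sub-network over time can be sketched as below. This is a toy illustration under stated assumptions: it builds a plain keyword co-occurrence network per year and does not reproduce the integration with the Klink-2 semantic topic network. The paper data, the function names and the "total co-occurrence weight" measure are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_year(papers):
    """Map each year to a Counter of keyword pairs that co-occur in a paper."""
    networks = {}
    for year, keywords in papers:
        # Sort so that each undirected pair has one canonical form.
        pairs = combinations(sorted(set(keywords)), 2)
        networks.setdefault(year, Counter()).update(pairs)
    return networks


def subnetwork_weight(network, keywords):
    """Total co-occurrence weight among the given keywords in one year's network."""
    selected = set(keywords)
    return sum(w for (a, b), w in network.items() if a in selected and b in selected)


# Hypothetical toy data: (year, author keywords) per paper.
papers = [
    (1998, ["knowledge representation", "agent systems"]),
    (1999, ["knowledge representation", "agent systems", "hypertext"]),
    (1999, ["hypertext", "databases"]),
    (2000, ["knowledge representation", "hypertext", "databases"]),
    (2000, ["agent systems", "hypertext", "knowledge representation"]),
]

networks = cooccurrence_by_year(papers)
group = ["knowledge representation", "agent systems", "hypertext"]
activity = {year: subnetwork_weight(net, group) for year, net in sorted(networks.items())}
```

Computing `activity` for sub-networks that later produced a new topic and for a control set of established topics would give the two distributions to compare.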
Preliminary results
• My analysis indicates that for Debutant Topics there is
intense activity between the most co-occurring
keywords, which would normally be established topics
• My hypothesis is that I can use this understanding for
the early detection of new topics on the basis of the
activity of established topics
Student’s t-test on the two distributions:
• p-value = 2.81 × 10^-83
• null hypothesis can be rejected
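For readers unfamiliar with the test, the sketch below computes the pooled-variance (Student's) two-sample t statistic using only the Python standard library. The sample values are invented for illustration and are not the study's data; the function name is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical activity scores (illustrative only, not the study's data).
debutant = [0.9, 1.1, 1.3, 1.0, 1.2]
non_debutant = [0.4, 0.6, 0.5, 0.7, 0.3]

t_stat, dof = student_t(debutant, non_debutant)
```

Turning the statistic into a p-value requires the t distribution's cumulative distribution function; if SciPy is available, `scipy.stats.ttest_ind(debutant, non_debutant)` returns both the statistic and the p-value.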
Evaluation plan
• Quantitative: retrospective analysis and
detection of historical trends
• Qualitative: informal feedback from
domain experts, including senior editors
and publishers at Springer, on the system's
suggestions for future trends
Reflections
• So far, my initial experiments have provided
promising results which confirm the initial
hypotheses
• The adoption of semantic technologies
has helped to improve these
results
Next steps
• Analyse dynamics in other networks (e.g.,
authors, communities and venues)
• Integration of social media data
Early Detection and Forecasting of Research Trends

Weitere ähnliche Inhalte

Was ist angesagt?

Algorithms for the thematic analysis of twitter datasets
Algorithms for the thematic analysis of twitter datasetsAlgorithms for the thematic analysis of twitter datasets
Algorithms for the thematic analysis of twitter datasetsaneeshabakharia
 
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...Angelo Salatino
 
Mining from Open Answers in Questionnaire Data
Mining from Open Answers in Questionnaire DataMining from Open Answers in Questionnaire Data
Mining from Open Answers in Questionnaire Datafeiwin
 
Social Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIASocial Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIAInsight_Altmetrics
 
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...Seoul National University
 
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...Seoul National University
 
Intra- and interdisciplinary cross-concordances for information retrieval
Intra- and interdisciplinary cross-concordances for information retrieval Intra- and interdisciplinary cross-concordances for information retrieval
Intra- and interdisciplinary cross-concordances for information retrieval GESIS
 
Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Salam Shah
 
WIDS 2021--An Introduction to Network Science
WIDS 2021--An Introduction to Network ScienceWIDS 2021--An Introduction to Network Science
WIDS 2021--An Introduction to Network ScienceColleen Farrelly
 
Topic model
Topic modelTopic model
Topic modelLiam Bui
 
Assigning semantic labels to data sources
Assigning semantic labels to data sourcesAssigning semantic labels to data sources
Assigning semantic labels to data sourcesCraig Knoblock
 
A scalable architecture for extracting, aligning, linking, and visualizing mu...
A scalable architecture for extracting, aligning, linking, and visualizing mu...A scalable architecture for extracting, aligning, linking, and visualizing mu...
A scalable architecture for extracting, aligning, linking, and visualizing mu...Craig Knoblock
 
Proposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkProposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkAravind Sesagiri Raamkumar
 
What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...Aravind Sesagiri Raamkumar
 

Was ist angesagt? (20)

Analyzing User Reviews in Tourism with Topic Models
Analyzing User Reviews in Tourism with Topic ModelsAnalyzing User Reviews in Tourism with Topic Models
Analyzing User Reviews in Tourism with Topic Models
 
Algorithms for the thematic analysis of twitter datasets
Algorithms for the thematic analysis of twitter datasetsAlgorithms for the thematic analysis of twitter datasets
Algorithms for the thematic analysis of twitter datasets
 
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...
 
Data wrangling week 9
Data wrangling week 9Data wrangling week 9
Data wrangling week 9
 
Mining from Open Answers in Questionnaire Data
Mining from Open Answers in Questionnaire DataMining from Open Answers in Questionnaire Data
Mining from Open Answers in Questionnaire Data
 
Cluster stability
Cluster stabilityCluster stability
Cluster stability
 
Social Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIASocial Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIA
 
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarch...
 
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...
[poster] Detecting Incongruity Between News Headline and Body Text via a Deep...
 
Intra- and interdisciplinary cross-concordances for information retrieval
Intra- and interdisciplinary cross-concordances for information retrieval Intra- and interdisciplinary cross-concordances for information retrieval
Intra- and interdisciplinary cross-concordances for information retrieval
 
Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...Navigation through citation network based on content similarity using cosine ...
Navigation through citation network based on content similarity using cosine ...
 
WIDS 2021--An Introduction to Network Science
WIDS 2021--An Introduction to Network ScienceWIDS 2021--An Introduction to Network Science
WIDS 2021--An Introduction to Network Science
 
Topic model
Topic modelTopic model
Topic model
 
Relation-wise Automatic Domain-Range Information Management for Knowledge Ent...
Relation-wise Automatic Domain-Range Information Management for Knowledge Ent...Relation-wise Automatic Domain-Range Information Management for Knowledge Ent...
Relation-wise Automatic Domain-Range Information Management for Knowledge Ent...
 
Sybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal PresentationSybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal Presentation
 
Assigning semantic labels to data sources
Assigning semantic labels to data sourcesAssigning semantic labels to data sources
Assigning semantic labels to data sources
 
resume_data
resume_dataresume_data
resume_data
 
A scalable architecture for extracting, aligning, linking, and visualizing mu...
A scalable architecture for extracting, aligning, linking, and visualizing mu...A scalable architecture for extracting, aligning, linking, and visualizing mu...
A scalable architecture for extracting, aligning, linking, and visualizing mu...
 
Proposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender FrameworkProposing a Scientific Paper Retrieval and Recommender Framework
Proposing a Scientific Paper Retrieval and Recommender Framework
 
What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...What papers should I cite from my reading list? User evaluation of a manuscri...
What papers should I cite from my reading list? User evaluation of a manuscri...
 

Ähnlich wie Early Detection and Forecasting of Research Trends

researchmethodologyi-140707092303-phpapp02.pdf
researchmethodologyi-140707092303-phpapp02.pdfresearchmethodologyi-140707092303-phpapp02.pdf
researchmethodologyi-140707092303-phpapp02.pdfMdali657802
 
Research Methodology Part I
Research Methodology Part IResearch Methodology Part I
Research Methodology Part IAnwar Siddiqui
 
chapter 1 Course Overview.pptx
chapter 1 Course Overview.pptxchapter 1 Course Overview.pptx
chapter 1 Course Overview.pptxYoniYoni7
 
Applied research methodology lecture 1
Applied research methodology lecture 1Applied research methodology lecture 1
Applied research methodology lecture 1Pulchowk Campus
 
Introduction to research methodology
Introduction to research methodologyIntroduction to research methodology
Introduction to research methodologyYogeshSorot
 
Trends in-connecting-research-sgd-2013
Trends in-connecting-research-sgd-2013Trends in-connecting-research-sgd-2013
Trends in-connecting-research-sgd-2013Sanjeev Deshmukh
 
Modern political Science.pptx
Modern political Science.pptxModern political Science.pptx
Modern political Science.pptxROSHANRAI52
 
Early Detection of Research Trends [thesis defence]
Early Detection of Research Trends [thesis defence]Early Detection of Research Trends [thesis defence]
Early Detection of Research Trends [thesis defence]Angelo Salatino
 
Introduction to research methodology
Introduction to research methodologyIntroduction to research methodology
Introduction to research methodologyASIM MANZOOR
 
research process in nursing nursing process.ppsx
research process in nursing  nursing process.ppsxresearch process in nursing  nursing process.ppsx
research process in nursing nursing process.ppsxlovedhaliwal1
 
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptx
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptxINTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptx
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptxSamuelAbragham
 
Part 1 research and evaluation edited
Part 1 research and evaluation editedPart 1 research and evaluation edited
Part 1 research and evaluation editedYISMAW MENGGISTU
 
lec1.pdf
lec1.pdflec1.pdf
lec1.pdfjeys3
 
Chapter 3 The Research Process: The broad problem area and defining the pro...
Chapter 3 The Research Process: The broad  problem area and defining the  pro...Chapter 3 The Research Process: The broad  problem area and defining the  pro...
Chapter 3 The Research Process: The broad problem area and defining the pro...Nardin A
 
BRM PPT 1.pptxbufyf6f7f6fydyddddfftsr6sidfg
BRM  PPT  1.pptxbufyf6f7f6fydyddddfftsr6sidfgBRM  PPT  1.pptxbufyf6f7f6fydyddddfftsr6sidfg
BRM PPT 1.pptxbufyf6f7f6fydyddddfftsr6sidfgAMANPathak744625
 

Ähnlich wie Early Detection and Forecasting of Research Trends (20)

19 2
19 219 2
19 2
 
researchmethodologyi-140707092303-phpapp02.pdf
researchmethodologyi-140707092303-phpapp02.pdfresearchmethodologyi-140707092303-phpapp02.pdf
researchmethodologyi-140707092303-phpapp02.pdf
 
Research Methodology Part I
Research Methodology Part IResearch Methodology Part I
Research Methodology Part I
 
chapter 1 Course Overview.pptx
chapter 1 Course Overview.pptxchapter 1 Course Overview.pptx
chapter 1 Course Overview.pptx
 
Applied research methodology lecture 1
Applied research methodology lecture 1Applied research methodology lecture 1
Applied research methodology lecture 1
 
Introduction to research methodology
Introduction to research methodologyIntroduction to research methodology
Introduction to research methodology
 
zero.pptx
zero.pptxzero.pptx
zero.pptx
 
Trends in-connecting-research-sgd-2013
Trends in-connecting-research-sgd-2013Trends in-connecting-research-sgd-2013
Trends in-connecting-research-sgd-2013
 
Research methodology
Research methodologyResearch methodology
Research methodology
 
Modern political Science.pptx
Modern political Science.pptxModern political Science.pptx
Modern political Science.pptx
 
Early Detection of Research Trends [thesis defence]
Early Detection of Research Trends [thesis defence]Early Detection of Research Trends [thesis defence]
Early Detection of Research Trends [thesis defence]
 
Introduction to research methodology
Introduction to research methodologyIntroduction to research methodology
Introduction to research methodology
 
research process in nursing nursing process.ppsx
research process in nursing  nursing process.ppsxresearch process in nursing  nursing process.ppsx
research process in nursing nursing process.ppsx
 
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptx
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptxINTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptx
INTELLECTUAL AND PROPERTY RIGHTSunit 1 R23 (1).pptx
 
Introduction to Research
Introduction to ResearchIntroduction to Research
Introduction to Research
 
Types of research
Types of researchTypes of research
Types of research
 
Part 1 research and evaluation edited
Part 1 research and evaluation editedPart 1 research and evaluation edited
Part 1 research and evaluation edited
 
lec1.pdf
lec1.pdflec1.pdf
lec1.pdf
 
Chapter 3 The Research Process: The broad problem area and defining the pro...
Chapter 3 The Research Process: The broad  problem area and defining the  pro...Chapter 3 The Research Process: The broad  problem area and defining the  pro...
Chapter 3 The Research Process: The broad problem area and defining the pro...
 
BRM PPT 1.pptxbufyf6f7f6fydyddddfftsr6sidfg
BRM  PPT  1.pptxbufyf6f7f6fydyddddfftsr6sidfgBRM  PPT  1.pptxbufyf6f7f6fydyddddfftsr6sidfg
BRM PPT 1.pptxbufyf6f7f6fydyddddfftsr6sidfg
 

Mehr von Angelo Salatino

Scientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewScientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewAngelo Salatino
 
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainAngelo Salatino
 
ResearchFlow: Understanding the Knowledge Flow between Academia and Industry
ResearchFlow: Understanding the Knowledge Flow between Academia and IndustryResearchFlow: Understanding the Knowledge Flow between Academia and Industry
ResearchFlow: Understanding the Knowledge Flow between Academia and IndustryAngelo Salatino
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics Angelo Salatino
 
AUGUR: Forecasting the Emergence of New Research Topics
AUGUR: Forecasting the Emergence of New Research TopicsAUGUR: Forecasting the Emergence of New Research Topics
AUGUR: Forecasting the Emergence of New Research TopicsAngelo Salatino
 
Introductory Lecture to Audio Signal Processing
Introductory Lecture to Audio Signal ProcessingIntroductory Lecture to Audio Signal Processing
Introductory Lecture to Audio Signal ProcessingAngelo Salatino
 

Mehr von Angelo Salatino (9)

Scientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewScientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an Overview
 
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domain
 
ResearchFlow: Understanding the Knowledge Flow between Academia and Industry
ResearchFlow: Understanding the Knowledge Flow between Academia and IndustryResearchFlow: Understanding the Knowledge Flow between Academia and Industry
ResearchFlow: Understanding the Knowledge Flow between Academia and Industry
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics
 
AUGUR: Forecasting the Emergence of New Research Topics
AUGUR: Forecasting the Emergence of New Research TopicsAUGUR: Forecasting the Emergence of New Research Topics
AUGUR: Forecasting the Emergence of New Research Topics
 
Tesi Triennale Slide
Tesi Triennale SlideTesi Triennale Slide
Tesi Triennale Slide
 
Introductory Lecture to Audio Signal Processing
Introductory Lecture to Audio Signal ProcessingIntroductory Lecture to Audio Signal Processing
Introductory Lecture to Audio Signal Processing
 

Kürzlich hochgeladen

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 

Kürzlich hochgeladen (20)

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 

Early Detection and Forecasting of Research Trends

  • 1. Early Detection and Forecasting of Research Trends Angelo Antonio Salatino @angelosalatino Advisors: Prof. Enrico Motta Dr. Francesco Osborne ISWC 2015 – Doctoral Consortium
  • 3. Who cares? • Researchers: following the evolution of the research environment • Academic publishers: promoting up-to- date and interesting contents • Companies: early intelligence on potentially important research trends to remain at the forefront of innovation • Funding bodies: improved understanding of the research landscape
  • 4. State of the art: Trend detection • Topic evolution using bibliometric analysis: – Content analysis • Topics extraction • Main terms in documents – Citation analysis – Main limitation: cannot detect new trends early enough in the lifecycle [Wu et al. 2011, Bolelli et al. 2009, He et al. 2009]
  • 5. State of the art: Forecasting impact • Impact based on number of publications and authors associated with topics • Approaches based on exponential smoothing, simple medium average and machine learning • Limitations: – These approaches don’t work at embryonic and early stages – They only use a limited set of data sources [Budi al. 2012, Jun et al. 2010, Tseng et al. 2009]
  • 6. Planned approach Wider range of data sources: comprehensive knowledge base integrating both scholarly data and social media
  • 7. Planned approach – For example, before the Semantic Web emerged explicitly as research area we could identify new interesting dynamics involving authors from different research areas such as knowledge representation, agent systems, hypertext and databases. – Creation of a model that takes into account all the discovered patterns which may involve different entities (e.g., authors, venues, topics, communities) Focus on discovering patterns emerging from the research dynamics:
  • 8. Initial study • Goal: To identify the dynamics that may indicate the emergence of a new topic • Approach: – Integration of Keywords network and Semantic topics network (Klink-2, Osborne et al. @ ISWC 2015) – Analysis of the evolution in time of sub-networks that will generate new topics vs. a control group of established topics. • Debutant group (new topics) • Non-debutant group (established topics)
  • 9. Preliminary results • My analysis indicates that for Debutant Topics there is an intense activity between the most co-occurring keywords which would normally be established topics • My hypothesis is that I can use this understanding for the early detection of new topics on the basis of the activity of established topics Student’s t-test on the two distributions: • p-value = 2.81 × 10^-83 • null hypothesis can be rejected
  • 10. Evaluation plan • Quantitative: retrospective analysis and detection of historical trends • Qualitative: informal feedback from domain experts, including senior editors and publishers at Springer, on the system suggestions for future trends
  • 11. Reflections • So far, my initial experiments provided promising results which confirm the initial hypotheses • The adoption of semantic technologies has been beneficial to improve these results
  • 12. Next steps • Analyse dynamics in other networks (e.g., authors, communities and venues) • Integration of social media data
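
The two statistical baselines named on slide 5, exponential smoothing and a simple moving average, can be sketched as follows. This is only a minimal illustration, not the implementation used in the cited works: the yearly publication counts, the smoothing factor `alpha` and the window size are made-up values, and both function names are hypothetical.

```python
def simple_moving_average(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # New level = weighted mix of the latest observation and the old level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical yearly publication counts for one topic (illustrative only).
papers_per_year = [2, 3, 5, 9, 14, 22]

sma_forecast = simple_moving_average(papers_per_year, window=3)
ses_forecast = exponential_smoothing(papers_per_year, alpha=0.5)
print(sma_forecast, ses_forecast)
```

Both baselines forecast from a history of counts for the topic itself, so they have little to work with when a topic has only just appeared.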

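The Student's t-test mentioned on slide 9 compares the pace-of-collaboration distributions of the two groups. As a rough sketch (not the author's analysis code), the pooled-variance t statistic can be computed with the Python standard library as below. The sample values are invented for illustration and the function name `student_t_statistic` is hypothetical. Turning the statistic into a p-value additionally needs the t-distribution, which a library routine such as `scipy.stats.ttest_ind` provides.

```python
import math
import statistics


def student_t_statistic(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    dof = n_a + n_b - 2  # degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, dof


# Hypothetical "pace of collaboration" values (illustrative only).
debutant = [0.8, 1.1, 0.9, 1.3, 1.0]
non_debutant = [0.1, -0.2, 0.0, 0.2, -0.1]

t_stat, dof = student_t_statistic(debutant, non_debutant)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

A large positive statistic means the first sample has a higher mean than the second, relative to the spread within the two samples.
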
Editor's notes

  1. The research environment evolves rapidly. New research areas emerge while others fade out, making it difficult to keep up with these dynamics. At the moment, the task of identifying the main emerging areas is accomplished either automatically or semi-automatically, using systems such as Rexplore, Saffron, ArnetMiner, MAS, Google Scholar, Faceted DBLP and CiteSeer. Taking as an example the evolution over time of a topic based on its number of papers, such as the Semantic Web in the figure, we can recognise three main stages: embryonic, early stage and recognised. In fact, it can be argued that a number of topics start to exist in an embryonic way, often as a combination of other topics, before being officially identified and named by researchers. For example, the Semantic Web emerged as a common area for researchers working on Artificial Intelligence, WWW and Knowledge-Based Systems before being acknowledged and labelled in the 2001 paper by Tim Berners-Lee. The early stage starts when a group of scientists agree on some theories related to the topic, build their own conceptual framework, and potentially give birth to a new scientific community. Finally, in the recognised stage, many authors become aware of the topic and start working on it, producing results and publishing research papers. The problem is that all the aforementioned systems can detect trends only when the research area is already recognised, not before. They actually need some years to make sense of these new trends. Moreover, there are no systems able to forecast the impact of trends at the early stage. I am interested in identifying, making sense of and forecasting the impact of research trends.
  2. Who is really interested? Researchers need to be updated regularly on the evolution of the research environment because they are interested in new trends related to their topics. For academic publishers and editors, knowing about new emerging topics in advance is crucial for offering the most up-to-date and interesting content. For example, an editor can gain a competitive advantage by being the first to recognise the importance of a new trend and publish a special issue or a journal about it. In fact, my PhD project is supported by Springer-Verlag. Institutional funding bodies and companies also need to be aware of research developments and promising research trends. For example, being aware of future research trends will allow them to move early when making important investments.
  3. This problem can be analysed from two points of view: topic trend detection and forecasting the impact of topics. As for trend detection, current approaches use bibliometric analysis to extract either topics or main terms from the text, and then analyse the evolution of these topics by investigating the citation network. The main limitation of these approaches is that the content for a specific topic first needs to be produced and then cited, so it takes years before they can detect the trend.
  4. On the other hand, for forecasting the impact there are approaches that define impact as the number of publications and authors associated with a topic. They are mainly based on statistical techniques such as exponential smoothing and simple moving average, and also on machine learning algorithms. The main limitations of these approaches are that they do not work in the first phases of the evolution of topics and that they employ a limited set of features. However, it can be argued that a different definition of impact, based also on social media data, can improve the forecasting phase and allow us to perform it on a shorter timescale.
  5. Initially, I will aim to integrate a variety of heterogeneous data sources, including scholarly data and social media data, in order to create a comprehensive knowledge base. This knowledge base will make use of an ontology to describe all the relationships between the research elements.
  6. Afterwards I will focus on analysing patterns that can lead to the emergence of a new research topic. For example, before Tim Berners-Lee officially named the Semantic Web as a research area, we were already able to identify that AI, WWW and KBS were sharing their knowledge in this new common area. An interesting fact about scholarly data is that it stores information about papers, so many research elements such as topics, authors, communities, venues and organisations can be inferred. All these research elements are inherently interconnected: an author writes papers about certain topics, and an author belongs to a community that is connected to a topic. These relationships can be analysed diachronically to derive new dynamics that can lead to the emergence of new topics, and then I can design a comprehensive model that takes into account all the discovered patterns.
  7. I conducted an initial study aiming to identify the dynamics that may lead to the emergence of a new topic using only scholarly data. To do so, I first combined the keywords network and the semantic topic network available in the Rexplore database. The keywords network, as the name suggests, is a network in which nodes represent the keywords tagged in papers, and the link between two keywords represents the number of papers in which the two keywords co-occur in each year. The semantic topic network is also a network of keywords, but in this case they are connected by semantic relationships (subAreaOf, sameAs and so on) that create a hierarchy of research topics. As a next step, I conducted a diachronic analysis on the portions of this joint network that are related to two different kinds of topics: debutant and non-debutant.
  8. As a result, I found that for the portion of the network related to the debutant group of topics the pace of collaboration between topics is higher than for the portion related to the non-debutant group. In this picture we can see the two distributions of the pace of collaboration of topics over time. The green line is for the topics belonging to the non-debutant group, while the blue line is for topics belonging to the debutant group. We can see that the distribution of the pace of collaboration for the non-debutant group is centred on zero, which means that overall this group does not show any increase in collaboration, while for the debutant group the distribution is shifted toward positive values, showing that in this case the pace of collaboration is increasing. Moreover, applying Student’s t-test to the two distributions allows us to reject the null hypothesis, which here is that the two groups do not differ in their pace of collaboration. For this reason I believe that the acquired know-how can be applied to understanding the emergence of new topics based on the established ones. As preliminary results, I joined the Keywords Network, a co-occurrence graph with nodes representing topics and links representing the number of co-occurrences between them, and the Semantic Topic Network, a taxonomy of topics connected by semantic relationships extracted by Klink. I conducted a diachronic analysis on some portions of this joined graph to check whether the creation of novel topics is actually correlated with an increase in the pace of collaboration of already existing ones. These portions of the graph were related to two different groups of topics: debutant and non-debutant.
  9. I plan to evaluate my work from both a quantitative and a qualitative perspective. From a quantitative point of view, I will use historical data to estimate statistical measures such as precision, recall and F-measure. From the qualitative perspective, I intend to receive informal feedback about future trends from domain experts, such as senior editors and publishers at Springer.
  10. The initial experiments provided promising results, which also confirm the initial hypotheses about the emergence of new topics. In addition, the adoption of semantic technologies such as the semantic topic network has helped to improve these results.
  11. As a next step I aim to analyse the dynamics of other research elements, such as authors, communities and venues, that can lead to the emergence of new research topics, and also to integrate entities from social media such as tweets and blog posts.
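
The keywords network from note 7 (keyword pairs weighted by the number of papers in which they co-occur each year) can be sketched in a few lines. This is an assumption-laden illustration rather than the Rexplore implementation: the papers are invented, the function names are hypothetical, and the least-squares slope of a link's yearly weight is just one possible proxy for the "pace of collaboration", which the notes do not define formally.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_year(papers):
    """Map each year to a Counter of keyword pairs -> number of papers containing both."""
    network = {}
    for year, keywords in papers:
        # Sort so each pair has one canonical order; set() ignores repeated keywords.
        pairs = combinations(sorted(set(keywords)), 2)
        network.setdefault(year, Counter()).update(pairs)
    return network


def linear_slope(xs, ys):
    """Least-squares slope of ys against xs (assumed proxy for the pace of collaboration)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("need at least two distinct x values")
    return numerator / denominator


# Hypothetical papers: (publication year, author keywords). Illustrative only.
papers = [
    (1998, ["knowledge representation", "hypertext"]),
    (1999, ["knowledge representation", "hypertext"]),
    (1999, ["hypertext", "knowledge representation", "databases"]),
    (2000, ["knowledge representation", "hypertext", "agent systems"]),
    (2000, ["hypertext", "knowledge representation"]),
    (2000, ["knowledge representation", "hypertext", "databases"]),
]

network = cooccurrence_by_year(papers)
pair = ("hypertext", "knowledge representation")
years = sorted(network)
weights = [network[year][pair] for year in years]
pace = linear_slope(years, weights)
print(weights, pace)
```

A positive slope means the two keywords co-occur in more papers each year, which is the kind of increase the debutant group shows in note 8.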