(Scaling analysis) of author-level bibliometric indicators.
Lorna Wildgaard, Royal School of Library and Information Science
Birger Larsen, Department of Communication, AAU-CPH
CONTRIBUTE TO THE DISCUSSION: 
TODAY’S MEET 
https://todaysmeet.com/STI2014 
sign in with your email, create a password, confirm and 
log in 
OR 
Log in with Gmail 
15/09/2014 
Slide 2
PURPOSE OF THE INVESTIGATION 
Quantifiable and objective alternative to other 
metrics when evaluating faculty members for 
academic advancement. 
WHAT IS A RESEARCHER? 
MOTIVATION 
DATA 
2154 scholars identified in online questionnaire.
793 working links to online CVs identified in sampling strategy across 4 disciplines and 5 seniorities:

Discipline      Total   PhD   Post Doc   Assis Prof   Assoc Prof   Prof
Astronomy       203     15    49         27           72           40
Environment     203     3     18         42           85           55
Philosophy      250     9     23         49           82           87
Public Health   137     9     14         31           53           30

Data collection start date: 13th June 2013.
CVs and publication data of 793 scholars collected from Google Scholar, via Publish or Perish.

Excluded n43:
Dead links n12
Not included discipline n15
Duplicates n1
No publication list n13
No identifiable seniority n1
Impossible to find in POP n1

Publication data of 750 researchers included:

Discipline      PhD   Post Doc   Assis Prof   Assoc Prof   Prof
Astronomy       15    48         26           67           37
Environment     3     17         39           85           51
Philosophy      9     22         45           75           78
Public Health   9     14         30           50           29

Note: the slide heads these four groups with the sampled totals (n203, n203, n250, n137); the seniority counts shown here sum to 193, 195, 229 and 132.

Data collection completed: July 10th 2013
IDENTIFICATION OF INDICATORS 
SUBJECTIVE GROUPING OF 54 INDICATORS 
"A review of the characteristics of 
108 author-level bibliometric 
indicators", Scientometrics, 
DOI: 10.1007/s11192-014-1423-3
METHOD 1: IDENTIFICATION OF CENTRAL INDICATORS 
Discipline       Index    nCorr.
Astronomy        hg       25
Environ. Sci.    h, h2    26
Philosophy       IQP      28
Pub. Health      g        23

Calculation:
hg: the square root of (h multiplied by g).
h: publications are ranked in descending order by number of citations; h is the highest rank at which the number of citations is at least equal to the rank.
h2: the highest rank at which the number of citations is at least equal to the square of the rank.
IQP: the expected average performance of a scholar in the field, the number of papers that are cited more frequently than average, and how much more than average they are cited (Tc>a).
g: publications are ranked in descending order by number of citations; g is the highest rank at which the square root of the cumulative sum of citations is at least equal to the rank.
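The rank-based definitions above can be sketched in a few lines of Python. This is only an illustration under the usual textbook definitions of h, g, h2 and hg; the function names and the citation list are hypothetical and are not taken from the study. The g value is capped at the number of papers in the list, which is one common convention.

```python
import math


def h_index(citations):
    """Largest rank h such that the paper at rank h has at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest rank g such that the top g papers total at least g**2 citations."""
    ranked = sorted(citations, reverse=True)
    g, cumulative = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        cumulative += cites
        if cumulative >= rank * rank:
            g = rank
    return g


def h2_index(citations):
    """Largest rank r such that the paper at rank r has at least r**2 citations."""
    ranked = sorted(citations, reverse=True)
    r = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank * rank:
            r = rank
        else:
            break
    return r


def hg_index(citations):
    """Square root of (h multiplied by g)."""
    return math.sqrt(h_index(citations) * g_index(citations))


# Hypothetical citation counts for one scholar (not data from the study).
example = [10, 8, 5, 4, 3, 0, 0]
print(h_index(example), g_index(example), h2_index(example), hg_index(example))
```

Because all four indices are derived from the same ranked citation list, strong correlations between them are to be expected.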
MDS SCALING: SIMILARITY/DISSIMILARITY OF INDICES 
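[Figure: MDS plots, one panel per discipline (Astronomy, Environmental Science, Philosophy, Public Health). Percentage labels on the slide: 25%, 24%, 47%, 38%.]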
EXPLORATORY FACTOR ANALYSIS
Discipline       Publication &    Normalized for    Miscellaneous
                 recognition      field or time
Astronomy        57.3% (0.78)     11.8 (0.49)       8.3 (-0.028)
Environ. Sci.    57.2% (0.77)     6.2 (0.04)        10.4 (0.89)
Philosophy       53.6 (0.82)      7.0 (0.50)        10.4 (0.03)
Public Health    56.2 (0.77)      6.6 (0.00)        12.1 (0.59)

24-32 indicators in dimension 1
4-9 indicators in dimension 2
3-15 indicators in dimension 3
REASSESSING THE METHOD 
Purpose: Quantifiable and objective alternative to other metrics
when evaluating faculty members for academic
advancement.
What we have learnt so far:
1. Publication and citation data are highly skewed.
2. Transforming the variables with log, inverse and sqrt did not
improve the normality of the data, nor did it improve
the MDS or the factor analysis.
3. Recoding the variables into categorical groups resulted in a
loss of detail and still no significant results (a lot of work,
inconclusive results).
So we returned to non-parametric and descriptive analyses of
the data. Simple seems to be more informative when we
have skewed data that builds on publications and citations.
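A small sketch of what such a descriptive check might look like, using only the Python standard library. The citation counts are hypothetical, and the `log(c + 1)` transform is just one of several that could be tried. The example shows how the mean and median diverge for skewed counts; it does not reproduce the authors' finding about transformations.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the third standardized moment."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical citation counts: a few highly cited scholars dominate.
citations = [0, 1, 1, 2, 3, 5, 8, 13, 40, 250]
log_citations = [math.log(c + 1) for c in citations]

summary = {
    "mean": statistics.fmean(citations),
    "median": statistics.median(citations),
    "quartiles": statistics.quantiles(citations, n=4),
    "skew_raw": skewness(citations),
    "skew_log": skewness(log_citations),
}
for name, value in summary.items():
    print(name, value)
```

In this toy sample the mean sits far above the median, which is why medians and quartiles tend to describe such data better than means.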
DIFFERENCE IN MEDIAN PUBLICATIONS
BETWEEN SENIORITIES
Publications (median)
Discipline      Post Doc   Assis Prof    Assoc Prof     Prof           Mean
                - PhD      - Post Doc    - Assis Prof   - Assoc Prof   difference
Astronomy       12.5       20            22             28.5           20.7
Environment     5          9             11             22.5           11.8
Philosophy      3          2.5           0.5            11             4.25
Public Health   5          11            21             33             17.5
DIFFERENCE IN MEDIAN CITATIONS
BETWEEN SENIORITIES
Citations (median)
Discipline      Post Doc   Assis Prof    Assoc Prof     Prof           Mean
                - PhD      - Post Doc    - Assis Prof   - Assoc Prof   difference
Astronomy       51.1       500.5         512            675            434.7
Environment     7          107           178            109            100.2
Philosophy      7.5        -1.5          1.5            21             7.1
Public Health   20.5       86.5          351            436            223.5
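As a worked check of the "Mean difference" column, the sketch below averages the four step-to-step differences for each discipline. The numbers are copied from the two slides; the variable names are illustrative. The unrounded means are 20.75 and 11.875 for publications in Astronomy and Environment, so the slide's 20.7 and 11.8 appear to be truncated rather than rounded.

```python
from statistics import fmean

# Differences in median publications between adjacent seniorities
# (Post Doc - PhD, Assis Prof - Post Doc, Assoc Prof - Assis Prof, Prof - Assoc Prof).
publication_steps = {
    "Astronomy": [12.5, 20, 22, 28.5],
    "Environment": [5, 9, 11, 22.5],
    "Philosophy": [3, 2.5, 0.5, 11],
    "Public Health": [5, 11, 21, 33],
}

# The same differences for median citations.
citation_steps = {
    "Astronomy": [51.1, 500.5, 512, 675],
    "Environment": [7, 107, 178, 109],
    "Philosophy": [7.5, -1.5, 1.5, 21],
    "Public Health": [20.5, 86.5, 351, 436],
}

mean_pub_steps = {d: fmean(s) for d, s in publication_steps.items()}
mean_cit_steps = {d: fmean(s) for d, s in citation_steps.items()}

for discipline in publication_steps:
    print(discipline, mean_pub_steps[discipline], mean_cit_steps[discipline])
```

Since the steps telescope, the sum of the four differences in each row equals the gap between the median for professors and the median for PhD students.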
P & C INCREASE WITH SENIORITY. DO OTHER INDICATORS?

Astronomy
Output: P
Effect of output: C, Sc, nnc, Sig, Csc, Fc
Impact over time: Cage, AWCR, AWCRpa, AW, AR
Qualify impact to field: Sum pp top ncits, IQP, NprodP
Rank portfolio: Millers h, h, A, R, g, hg, e, Q2, POPh

Enviro. Sci.
Output: P, Fp
Effect of output: C, CPP, Sc, FracCPP, nnc, Sig, Csc, Fc
Impact over time: Cage, AWCR, AWCRpa, AW, AR
Qualify impact to field: Mcs, Sum pp top ncits, mean mjs mcs, max mjs mcs, IQP, NprodP
Rank portfolio: Millers h, h, m, A, R, g, hg, e, Q2, h2, POPh

Philosophy
Output: P, Fp
Effect of output: C, Sc, nnc, Sig, Csc, Fc
Impact over time: Cage, AR
Qualify impact to field: NprodP
Rank portfolio: m, A, R, g, e, h2

Pub. Health
Output: P, Fp
Effect of output: C, Sc, nnc, Sig, Csc, Fc
Impact over time: AWCR, AWCRpa, AW, AR
Qualify impact to field: Mcs, Sum pp top ncits, Sum pp top prop, NprodP
Rank portfolio: Millers h, m, A, R, g, hg, e, Q2, h2, POPh
ARE PUBLICATION & CITATION COUNTS AFFECTED BY
GENDER?
Discipline      nMales   nFemales   Md P,   Md P,    Md C,   Md C,
                                    male    female   male    female
Astronomy       162      30         48      39       881     518
Environ. Sci.   160      35         29      18       321     135
Philosophy      179      43         9       8        12      8
Pub. Health     79       53         31      29       311     353

Environmental Science: Significant difference in the number of
publications produced by male and female researchers,
U=2036, z=-2.525, p=0.012, r=0.18. Significant difference in
the number of citations male and female researchers receive,
U=2056, z=-2.460, p=0.014, r=0.176.
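The U and z statistics suggest a Mann-Whitney U test, for which the effect size is conventionally computed as r = |z| / sqrt(N). The sketch below assumes that convention and takes N as the 160 male plus 35 female Environmental Science researchers from the table; under those assumptions it reproduces the reported r values. The function name is illustrative.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r = |z| / sqrt(N) for a rank-based two-group test."""
    return abs(z) / math.sqrt(n_total)


# Environmental Science group sizes from the table: 160 males, 35 females.
n_env = 160 + 35

r_publications = effect_size_r(-2.525, n_env)
r_citations = effect_size_r(-2.460, n_env)

print(round(r_publications, 2), round(r_citations, 3))
```

By the commonly used benchmarks (0.1 small, 0.3 medium, 0.5 large), both values indicate a small effect.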
ARE PUBLICATION & CITATION COUNTS AFFECTED BY
ORIGIN?
ARE PUBLICATION & CITATION COUNTS
AFFECTED BY ACADEMIC AGE OR SENIORITY?
Purpose:
How well do seniority and academic age predict the number of
publications? How much of the variance in publication counts
can be explained by these two variables?
Method: Multiple Regression
Results (ALL FIELDS):
The model with seniority and academic age as predictors
explains 22-36.2% of the variance in publications
(A=36%, E=36%, P=30%, PH=22%) and 1-22% of the
variance in citations (A=18%, E=19%, P=0.9%, PH=22%).
Conclusions:
Academic age makes the largest unique contribution as a
predictor of publications or citations; seniority makes very
little contribution.
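For a model with exactly two predictors, the variance explained and the standardized beta weights can be written in terms of the three pairwise correlations. The sketch below uses that textbook identity on hypothetical data; the variable names and values are invented for illustration and this is not the authors' analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_predictor_fit(y, x1, x2):
    """Standardized betas and R^2 for a least-squares fit of y on x1 and x2."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Hypothetical scholars: years since first publication, seniority coded 1-5,
# and number of publications.
academic_age = [2, 4, 6, 9, 12, 15, 20, 25]
seniority = [1, 1, 2, 2, 3, 4, 4, 5]
publications = [1, 5, 4, 12, 20, 18, 45, 40]

beta_age, beta_seniority, r2 = two_predictor_fit(publications, academic_age, seniority)
print(beta_age, beta_seniority, r2)
```

When the two predictors are themselves strongly correlated, as academic age and seniority are likely to be, the individual beta weights become unstable even if R^2 is well determined.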
ARE PUBLICATION & CITATION COUNTS AFFECTED BY
ACADEMIC AGE OR SENIORITY WHEN CONTROLLING
FOR GENDER AND ORIGIN?
Purpose:
Controlling for the effect of gender and origin, is our set of variables
(academic age and seniority) still able to predict a significant
amount of the variance in publication count?
Method: Hierarchical Regression (ATT: highly correlated data,
assumptions of normality violated)
Results (All Disciplines):
Only academic age and seniority made a statistically significant
contribution to the model, with academic age recording a higher
beta value (.30-.46) than seniority (.18-.25) in each discipline.
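The hierarchical step can be sketched as two nested least-squares fits: first the control variables alone, then the controls plus academic age and seniority, comparing R^2 between the two. The code below is a minimal pure-Python illustration on hypothetical data; the solver, the 0/1 coding of gender and origin, and all values are assumptions rather than the authors' procedure.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical scholars (not data from the study).
gender = [0, 1, 0, 1, 0, 1, 0, 1, 0, 0]
origin = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]
academic_age = [2, 4, 6, 9, 12, 15, 20, 25, 8, 17]
seniority = [1, 1, 2, 2, 3, 4, 4, 5, 2, 3]
publications = [1, 5, 4, 12, 20, 18, 45, 40, 9, 22]

r2_controls = r_squared(publications, [gender, origin])
r2_full = r_squared(publications, [gender, origin, academic_age, seniority])
r2_change = r2_full - r2_controls
print(r2_controls, r2_full, r2_change)
```

The R^2 change is the share of variance attributable to academic age and seniority beyond what gender and origin already explain.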
FINDINGS: RECOMMENDING PUBLICATION AND
CITATION INDICATORS
CONCLUSIONS (SO FAR) 
1. Indicator values are affected to varying degrees,
dependent on discipline, by gender, origin, seniority and
academic age (database & version).
2. Academic age is dependent on how it is calculated. Here
it is highly dependent on database coverage. Seniority is
more understandable. But does it make sense?
3. We don't have to wrap data in algorithms. It is more informative
to summarize patterns between indicators.
4. It is important to report the database and which version
of the database was used to collect the data.
CONCLUSIONS (SO FAR) 
5. Variance in the number of publications between scholars
differs from discipline to discipline. Clear difference in
the number of publications and citations.
6. The indicators are estimates. Report confidence intervals 
and range to contextualize the values. 
7. Strong correlation between indicators. Central and 
isolated indicators need further investigation allowing for 
confounders. 
8. No one indicator can stand alone. Work continues to 
identify indicators suitable for discipline and seniority.
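One way to follow the recommendation in point 6 is to report a bootstrap percentile interval and the range alongside the median. The sketch below is an assumption about how that could be done, not the authors' method; the citation counts, the number of resamples and the seed are all hypothetical.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower = medians[int((1 - level) / 2 * n_resamples)]
    upper = medians[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper


# Hypothetical citation counts for one group of scholars.
citations = [0, 1, 1, 2, 3, 5, 8, 13, 40, 250]

median = statistics.median(citations)
ci_low, ci_high = bootstrap_median_ci(citations)
value_range = (min(citations), max(citations))

print("median", median)
print("95% CI", ci_low, ci_high)
print("range", value_range)
```

With small, skewed groups the interval is typically wide and asymmetric, which makes the uncertainty around a single indicator value visible.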
THANK YOU FOR YOUR ATTENTION! 
Q. When does adjusting the data to fit the model 
become cherry picking?

Scaling analysis of author-level bibliometric indicators

  • 1. (Scaling analysis) of author-level bibliometric indicators. Lorna Wildgaard Royal School of Library and Information Science Birger Larsen Department of Communication, AAU-CPH
  • 2. CONTRIBUTE TO THE DISCUSSION: TODAY’S MEET https://todaysmeet.com/STI2014 sign in with your email, create a password, confirm and log in OR Log in with Gmail 15/09/2014 Dias 2
  • 3. PURPOSE OF THE INVESTIGATION Quantifiable and objective alternative to other metrics when evaluating faculty members for academic advancement. 15/09/2014 Dias 3
  • 4. WHAT IS A RESEARCHER? 15/09/2014 Dias 4
  • 6. DATA 2154 scholars identified in online questionnaire. 793 working links to online CVs identified in sampling strategy across 4 disciplines and 5 seniorities Astronomy n203: PhD n15 Post Doc n49 Assis Prof n27 Assoc Prof n72 Prof n40 Environment n203: PhD n3 Post Doc n18 Assis Prof n42 Assoc Prof n85 Prof n55 Philosophy n250: PhD n9 Post Doc n23 Assis Prof n49 Assoc Prof n82 Prof n87 Public Health n137: PhD n9 Post Doc n14 Assis Prof n31 Assoc Prof n53 Prof n30 Data collection start date: 13th June 2013. Publication data of 750 researchers included. Data collection completed: July 10th 2013 Astronomy n203: PhD n15 Post Doc n48 Assis Prof n26 Assoc Prof n67 Prof n37 Environment n203: PhD n3 Post Doc n17 Assis Prof n39 Assoc Prof n85 Prof n51 Philosophy n250: PhD n9 Post Doc n22 Assis Prof n45 Assoc Prof n75 Prof n78 Public Health n137: PhD n9 Post Doc n14 Assis Prof n30 Assoc Prof n50 Prof n29 CVs and publication data of 793 scholars collected from Google Scholar, via Publish or Perish. Excluded n43: Dead links n12 Not included discipline: n15 Duplicates: n1 No publication list: n13 No identifiable seniority: n1 Impossible to find in POP: n1
  • 7. IDENTIFICATION OF INDICATORS 15/09/2014 Dias 7
  • 8. SUBJECTIVE GROUPING OF 54 INDICATORS 15/09/2014 Dias 8 "A review of the characteristics of 108 author-level bibliometric indicators", Scientometrics, DOI: 10.1007/s11192-014-1423-3
  • 9. METHOD 1: IDENTIFICATION OF CENTRAL INDICATORS Discipline Index Calculation nCorr. Astronomy Hg The square root of (h multiplied by g). 25 Environ. Sci. H, H2 Publications are ranked in descending order after number of citations. H is where number of citations and rank is the same. H2 is where the square of the number of papers is equal to the number of citations. 26 Philosophy IQP IQP= expected average performance of scholar in the field, amount of papers that are cited more frequently than average and how much more than average they are cited (Tc>a) 28 Pub. Health G Publications are ranked in descending order after number of citations. G is where the the square root of the cumulative sum of citations is equal to the rank 23
  • 10. MDS SCALING: SIMILARITY/DISSIMILARITY OF INDICES ASTRO Enviro. Sci 25% 24% ENVIRO 47% 38 % PHIL PUB. HEALTH
  • 11. EXPLORATIVE FACTOR ANALYSIS Discipline Publication & 15/09/2014 Dias 11 recognition Normalized for field or time Miscellaneas Astronomy 57.3 % (0.78) 11.8 (0.49) 8.3 (-0.028) Environ. Sci 57.2% (0.77) 6.2 (0.04) 10.4 (0.89) Philosophy 53.6 (0.82) 7.0 (0.50) 10.4 (0.03) Public Health 56.2 (0.77) 6.6 (0.00) 12.1 (0.59) 24-32 indicators in dimension 1 4-9 indicators in dimension 2 3-15 in dimension 3
  • 12. REASSESSING THE METHOD Purpose: Quantifiable and objective alternative to other metrics when evaluating faculty members for academic advancement. What we have learnt so far: 1. Publication and citation data is highly skewed 2. Transforming the variables with log, inverse, sqrt did not improve the normality assumption of the data or improve the MDS or the Factor Analysis, 3. Recoding the variables into categorical groups resulted in lack of detail and still not significant results (a lot of work, inconclusive results So we returned to non-parametric and descriptive analyses of the data – simple seems to be more informative when we have skewed data that builds on publications and citations. 15/09/2014 Dias 12
  • 13. DIFFERENCE IN MEDIAN PUBLICATIONS BETWEEN SENIORITIES Publications Median Post Doc- PhD Assis Prof – Post Doc Assoc. Prof – Assis Prof Prof.-Assoc Prof Mean difference Astronomy 12.5 20 22 28.5 20.7 Environment 5 9 11 22.5 11.8 Philosophy 3 2.5 0.5 11 4.25 Public Health 5 11 21 33 17.5 DIFFERENCE IN MEDIAN CITATIONS BETWEEN SENIORITIES Citations Median 15/09/2014 Dias 13 Post Doc- PhD Assis Prof – Post Doc Assoc. Prof – Assis Prof Prof.-Assoc Prof Mean difference Astronomy 51.1 500.5 512 675 434.7 Environment 7 107 178 109 100.2 Philosophy 7.5 -1.5 1.5 21 7.1 Public Health 20.5 86.5 351 436 223.5
  • 14. P & C INCREASE WITH SENIORITY. DO OTHER INDICATORS? DISCIPLINE OUTPUT EFFECT OF 15/09/2014 Dias 14 OUTPUT IMPACT OVER TIME QUALIFY IMPACT TO FIELD RANK PORTFOLIO Astronomy P C, sc, nnc, Sig, Csc, Fc, Cage, AWCR, AWCRpa, AW, AR Sum pp top ncits, IQP, NprodP Millers H, h, A, R, g, hg, e, Q2, POPh Enviro. Sci. P, Fp C, CPP, Sc, FracCPP, nnc, Sig, Csc, Fc Cage, AWCR, AWCRpa, AW, AR Mcs, sum pp top ncits, mean mjs mcs, max mjs mcs, IQP, NprodP Millers h, h, m, A, R, g, hg, e, Q2, H2, POPh Philosophy P, Fp C, Sc, nnc, Sig, Csc, Fc Cage, AR NprodP m,A,R,g,e, H2 Pub. Health P, Fp C, Sc, nnc, Sig, Csc, Fc AWCR, AWCRpa, AW, AR Mcs, Sum pp rop ncits, Sum pp top prop, NprodP Millers h, m, A, R, g, hg, e, Q2, H2, PopH
  • 15. ARE PUBLICATION & CITATION COUNTS AFFECTED BY GENDER?
      Discipline      n males   n females   Md P, male   Md P, female   Md C, male   Md C, female
      Astronomy       162       30          48           39             881          518
      Environ. Sci    160       35          29           18             321          135
      Philosophy      179       43          9            8              12           8
      Pub. Health     79        53          31           29             311          353
    Environmental Science: Significant difference in the number of publications produced by male and female researchers, U=2036, z=-2.525, p=0.012, r=0.18. Significant difference in the number of citations male and female researchers receive, U=2056, z=-2.460, p=0.014, r=0.176.
  • 16. ARE PUBLICATION & CITATION COUNTS AFFECTED BY ORIGIN?
  • 17. ARE PUBLICATION & CITATION COUNTS AFFECTED BY ACADEMIC AGE OR SENIORITY? Purpose: How well do seniority and academic age predict the number of publications? How much of the variance in publication scores can be explained by scores on these two scales? Method: Multiple regression. Results (all fields): The model which controls for seniority and academic age explains 22-36.2% of the variance in publications (A=36%, E=36%, P=30%, PH=22%) and 1-22% of the variance in citations (A=18%, E=19%, P=0.9%, PH=22%). Conclusions: Academic age makes the largest unique contribution as a predictor of publications or citations; seniority makes very little contribution.
  • 18. ARE PUBLICATION & CITATION COUNTS AFFECTED BY ACADEMIC AGE OR SENIORITY WHEN CONTROLLING FOR GENDER AND ORIGIN? Purpose: Controlling for the effect of gender and origin, is our set of variables (academic age and seniority) still able to predict a significant amount of the variance in publication count? Method: Hierarchical regression (note: highly correlated data, assumptions of normality violated). Results (all disciplines): Only academic age and seniority made a statistically significant contribution to the model, with academic age recording a higher beta value (.30-.46) than seniority (.18-.25) in each discipline.
  • 19. FINDINGS: RECOMMENDING PUBLICATION AND CITATION INDICATORS
  • 20. CONCLUSIONS (SO FAR) 1. Indicator values are affected to varying degrees, dependent on discipline, by gender, origin, seniority and academic age (database & version). 2. Academic age is dependent on how it is calculated. Here it is highly dependent on database coverage. Seniority is more understandable. But does it make sense? 3. We don’t have to wrap data in algorithms. It is more informative to summarize patterns between indicators. 4. It is important to report the database and which version of the database was used to collect the data.
  • 21. CONCLUSIONS (SO FAR) 5. Variance in the number of publications between scholars differs from discipline to discipline. Clear difference in the number of publications and citations. 6. The indicators are estimates. Report confidence intervals and range to contextualize the values. 7. Strong correlation between indicators. Central and isolated indicators need further investigation allowing for confounders. 8. No one indicator can stand alone. Work continues to identify indicators suitable for discipline and seniority.
  • 22. THANK YOU FOR YOUR ATTENTION! Q. When does adjusting the data to fit the model become cherry picking?
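
Slide 15 reports an effect size r alongside each Mann-Whitney U result. The slides do not say how r was computed, so the following is only a minimal sketch under the assumption that the conventional formula r = |z| / sqrt(N) was used, with N the combined size of the two groups. The function name `effect_size_r` is illustrative and not from the deck.

```python
import math


def effect_size_r(z: float, n_total: int) -> float:
    """Effect size r for a Mann-Whitney U test, assuming r = |z| / sqrt(N)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return abs(z) / math.sqrt(n_total)


# Environmental Science group sizes from slide 15: 160 males + 35 females.
n_env = 160 + 35

# z statistics as reported on slide 15.
r_pub = effect_size_r(-2.525, n_env)  # publications
r_cit = effect_size_r(-2.460, n_env)  # citations

print(f"publications: r = {r_pub:.3f}")
print(f"citations:    r = {r_cit:.3f}")
```

Under this assumption the results round to the r=0.18 and r=0.176 shown on the slide. By Cohen's commonly cited benchmarks (0.1 small, 0.3 medium, 0.5 large), both are small effects.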

Editor's notes

  1. Then there is no need to wrap a ‘closed’ algebraic graphical solution on your multidimensional matrix. Explorative factor analysis: Determine the nature and number of “latent” variables that account for observed variation and covariation among the set of observed indicators. In other words, what “causes” these observed correlations? Summarize patterns of correlation among indicators. The solution is an end (i.e., is of interest) in and of itself. Here you can (colour) code the indicators according to your categorization and then proceed with the analysis; where different codes correlate on the same dimension you can go back and analyse WHY they correlate, now that we think that they are conceptually different. The indicators were grouped into 3 or 4 components, dependent on field, that explain 4.5 to 57% of the total variance in the data. The PC methodology for factor extraction allows for non-normally distributed variables. On this slide the dimensions are presented with the % variance explained by each dimension. Dimension reduction was supported by parallel analysis (Monte Carlo PA), which identified the components with eigenvalues greater than the corresponding criterion values for a randomly generated data matrix of the same size. The rotated solution (oblimin rotation) aided interpretation of the components. This revealed which items loaded strongly on one component. There was a weak correlation between components, less than 0.3. The reliability of the components was tested using Cronbach's alpha. The results of this analysis support our idea of using the indicators as separate scales that all use citations and publications but measure different aspects of publication performance at the individual level, and some indicators are more useful in some disciplines than others.
  2. Determine the nature and number of “latent” variables that account for observed variation and covariation among the set of observed indicators. In other words, what “causes” these observed correlations (i.e. determining what causes the variation and covariation)? Summarize patterns of correlation among indicators. The solution is an end (i.e., is of interest) in and of itself. Here you can (colour) code the indicators according to your categorization and then proceed with the analysis; where different codes correlate on the same dimension you can go back and analyse WHY they correlate, now that we think that they are conceptually different. Same circus of publications and citations. The focus should be explorative analyses of the matrices, either factor analysis or simply extracting the eigenvalues and eigenvectors of the matrix using Principal Components Analysis.
  3. The number of publications (P) and citations (C) increased with academic rank across all disciplines, apart from Philosophy, where PhD students and assistant professors have the lowest median citation counts. Further examination demonstrated statistically significant increases through all academic ranks in publication levels, Kruskal-Wallis test χ2(2, n=192)=92.267, p<.001, and a statistically significant difference in the number of citations, χ2(2, n=192)=68.54, p<.001. Tests of the four a priori hypotheses were conducted using Bonferroni adjusted alpha levels (COMPLETE BONFERRONI). The data are highly skewed and attempts to normalize the data to enable regression analysis were not successful. Nonparametric statistics are used; these are less powerful than parametric measures, tend to be less sensitive, and may fail to detect differences between groups that actually exist.
  4. As publication and citation counts reliably increase with academic rank and the values vary between disciplines, it is relevant to investigate whether some indicators are more appropriate for some seniorities and disciplines than others. AR and R measure the same. Within field and category: when we rank with these indicators, what does this mean for the researcher? Do they change position? This is where we can start to reduce the number of potentially useful indicators. Kruskal-Wallis: statistical difference between the values of indicators and academic rank. Yes, there was a statistical difference, but not all of these increased with academic rank, reducing the set even further. Indicators that increase reliably with rank.
  5. How to correct for gender in Environmental Science? What is it in this discipline that causes the discrepancy? However, there are some “confounders” to consider that might also affect our results: gender, country, and academic age rather than seniority. We compared the median publications and median citations of male and female researchers using the Mann-Whitney U test, which ranks the values across the two genders. As the scores are converted to ranks, the actual distribution doesn’t matter. Effect size r=0.18: what does this mean? A small effect size is weak, and there might not be a consistent difference in the number of publications and citations between men and women.
  6. Group scholars into top 25, upper middle 50, lower middle 50 and bottom 25% in discipline. Academic age: categorized into 5-year groups, as the average for a PhD in Europe is 4-5 years according to PhD regulations, not completion time. Country code: WHO classification. The multinomial regression was inconclusive; we couldn’t get a good model fit. Only a little of the variance was explained and the analyses were not significant. It is inconclusive across all indicators whether country, seniority, academic age and gender make a significant contribution to the model. Grouping is defined by WHO member states, defined by geography, state of economic and demographic development and mortality stratum. In this study these are the developed countries (Amr n=9, Eur-A n=645, Eur-B n=37, Eur-C n=45 and Wpr n=7) and high-mortality developing countries (Afr n=5, Sear n=6).
     By WHO group:
       Astro: no diff in P; Bonferroni adjustment revealed sig diff between number of citations in Eur-A and all other member states.
       Enviro: no diff in the number of P or C.
       Phil: no diff in the number of P or C.
       Public Health: no diff in Pub; statistically sig diff in citations between Eur-A and other groups, but with small effect size (r=0.18).
     By country:
       Pub Astro: Italy/Eastern Europe X2(2, n=58), U=226, z=-2.971, p=0.003, r=0.3, moderate effect.
       Cit Astro: France n=18/East n=32, U=98, z=-3.840, p=0.00, r=0.54; Scandinavia n=6/East n=32, U=34, z=-2.482, p=.011, r=0.4. Bias towards Eastern Europe (publishing less and cited least).
       Pub Enviro: no stat sig diff between n pub. Lowest: Germany, n=7, md=16; highest: other, n=12, md=54.
       Cit Enviro: no statistically sig diff between number of citations; lowest: Netherlands, n=14 (md=113).
       Pub Phil: Spain producing sig fewer publications than other groups. Bonferroni correction (0.05/4=0.0125): NL/Spain (U=94.5, p=0.003, z=-2.964, r=0.4), UK/Spain sig (U=303, z=-3.610, p=0.00, r=0.4).
       Cit Phil: Spain producing less and cited less; bias against Eastern European: NL n=18/Eastern Europe n=19: X2(2, 37)=50.0, z=-3.684, p=0.000, r=0.6.
       Pub Public Health: no sig difference in number of publications or citations.
  7. Seniority is a label given by the university; likewise, academic age is defined in our study by the number of years since the first article registered in WoS. Seniority, as we have seen, is a useful benchmark for expected indicator values, whereas academic age is dependent on the database used to source the data or is a subjective measure (years since PhD defence, years since first meaningful publication?). Knowing the data is very skewed, after studying it so carefully, I felt confident to do a hierarchical regression to see if academic age or seniority had a greater effect on the number of publications. Beta: distinct contribution of a variable, excluding overlap with other predictor variables. Controlling for the effect of gender and origin, is our set of variables (academic age and seniority) still able to predict a significant amount of the variance in publication count?
  8. Astro and Public Health: wise to normalize citations for country. Enviro: normalize for gender. Use h-type indicators with care in Phil; coverage is limited in WoS. All countries: normalise for seniority. Academic age? Make sure the indicators are calculated in the same version of the same database.
  9. Indicators good at discriminating between top performers and bottom performers.
  10. Fishing trip after the method, which is OK, as this is not a medical investigation with a strict protocol. How to identify homogeneous data, within x standard deviations? What can be excluded? 2.25% at each end of the scale. Limitation of the study: it is based only on limited data. How much of the data must the model represent? Is data manipulation the way forward: ranking, sorting, looking for patterns and trends, “playing with the data” without fundamentally changing it?
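
The notes refer to Bonferroni-adjusted alpha levels (0.05/4 for four comparisons). A minimal sketch of that adjustment, assuming a family-wise alpha of 0.05 split evenly over the comparisons. Only the NL vs Spain p-value comes from the notes; the other p-values, the labels and the function names are hypothetical and serve only to illustrate the rule.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values: dict, alpha: float = 0.05) -> dict:
    """Flag each comparison whose p-value is below the adjusted threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


p_values = {
    "NL vs Spain": 0.003,   # reported in the notes (Philosophy, publications)
    "pair B": 0.011,        # hypothetical
    "pair C": 0.020,        # hypothetical: below 0.05 but above the adjusted level
    "pair D": 0.300,        # hypothetical
}

threshold = bonferroni_alpha(0.05, len(p_values))
decisions = significant_after_bonferroni(p_values)

print(f"adjusted alpha: {threshold}")
for name, is_sig in decisions.items():
    print(f"{name}: {'significant' if is_sig else 'not significant'}")
```

The hypothetical "pair C" shows why the adjustment matters: a p-value of 0.020 would pass an unadjusted 0.05 level but not the stricter per-comparison level.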