This presentation examines submission data for the SIGITE conference between the years 2007-2012. SIGITE is an ACM computing conference on IT education. It describes which external factors and which internal characteristics of the submissions are related to eventual reviewer ratings. Ramifications of the findings for future authors and conference organizers are also discussed. The full paper is available at http://dl.acm.org/citation.cfm?id=2656450.2656465
A longitudinal examination of SIGITE conference submission data
1. A LONGITUDINAL EXAMINATION OF
SIGITE CONFERENCE SUBMISSION DATA
2007‐2012
Presentation for SIGITE 2014 by Randy Connolly, Janet Miller, and Rob Friedman
2. THE ABSTRACT
This paper examines
submission data for the
SIGITE conference between
the years 2007‐2012.
It examines which external
factors and which internal
characteristics of the
submissions are related to
eventual reviewer ratings.
Ramifications of the findings
for future authors and
conference organizers are
also discussed.
3. RELATED WORK
Peer review is the main quality control
mechanism within the academic
sciences and is used for assessing the
merits of a written work as well as for
ensuring the standards of the academic
field.
4. PEER REVIEW
Enjoys broad support, yet …
BIAS PROBLEMS
• Author/Institution status
• Asymmetrical power
relations
SOLUTIONS
•Single‐Blind Reviews (SBR)
•Double‐Blind Reviews (DBR)
SIGITE 2007‐2012
Used Double‐Blind reviews
5. RESEARCH ON SBR AND DBR
RELIABILITY
ISSUES
VALIDITY
ISSUES
6. PEER REVIEW OFTEN LACKS RELIABILITY
That is, reviewers often differ strongly about the merits of any given paper.
7. PEER REVIEW OFTEN LACKS VALIDITY
There is often little relationship between the judgments of
reviewers and the subsequent judgments of the relevant
larger scholarly community as defined by eventual citations.
8. SOME RESEARCH
DISAGREES
Others have found that there is indeed a
“statistically significant association between
selection decisions and the applicants' scientific
achievements, if quantity and impact of research
publications are used as a criterion for scientific
achievement”
9. Our Study
PROVIDES A UNIQUE ADDITION TO THIS
LITERATURE
Unlike earlier work, our study assesses reviews and submissions for a single
international computing conference across an extended time period (2007‐2012).
It assesses the reliability of the peer review process at SIGITE by examining both
internal and external factors; the combination of these analyses is also unique.
This paper also provides some innovation in the measures it uses to assess the
validity of the peer review process.
10. METHOD
From 2007 to 2012, the ACM SIGITE
conference used the same “Grinnell”
submission system as the larger
SIGCSE and ITiCSE education
conferences.
This web‐based system was used by
authors to submit their work, by
reviewers to review submissions, and
by program committees to evaluate
reviews and to organize the eventual
conference program.
11. DATA COLLECTION
STEP 1
Individual Access databases used by the submission system for each year had to be merged into a single file.
STEP 2
Since the 2007-2010 conferences used a slightly different process, the data had to be normalized.
STEP 3
Other relevant data (e.g., number of references, citation rates, etc.) were manually gathered.
STEP 4
Data were further manipulated in Excel and then exported and statistically analyzed using SPSS.
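Steps 1 and 2 above (merge per-year exports, normalize differing schemas) can be sketched in Python. Everything below is illustrative: the CSV contents, column names, and the COLUMN_MAPS renaming table are invented, since the presentation does not describe the real Access schemas, and the Excel/SPSS steps are out of scope.

```python
import csv
import io

# Hypothetical per-year exports from the Access databases; the column
# names are made up for illustration, not the real submission schema.
YEAR_2007 = "paper_id,title,overall\n1,Teaching IT,5\n"
YEAR_2012 = "id,paper_title,overall_rating\n9,Web Security,4\n"

# Step 2: map each year's column names onto one normalized schema.
COLUMN_MAPS = {
    2007: {"paper_id": "paper_id", "title": "title", "overall": "overall"},
    2012: {"id": "paper_id", "paper_title": "title",
           "overall_rating": "overall"},
}

def normalize(raw_csv, year):
    """Rename a year's columns to the shared schema and tag the year."""
    mapping = COLUMN_MAPS[year]
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        rows.append({mapping[k]: v for k, v in row.items()} | {"year": year})
    return rows

# Step 1: merge every year's records into a single data set.
merged = normalize(YEAR_2007, 2007) + normalize(YEAR_2012, 2012)
for rec in merged:
    print(rec["year"], rec["paper_id"], rec["title"], rec["overall"])
```

The point of the renaming table is that each year's quirks live in one place; adding a year means adding one dictionary entry, not touching the merge code.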
12. RESULTS
Over the six years, there were 1026
reviews from 192 different reviewers,
and 508 authors were involved in
submitting a total of 332 papers.
The 2010 version of the conference
had the lowest number of paper
submissions (n=37), while 2012
had the largest (n=87).
13. AUTHOR AND PAPER INFORMATION
Who were our authors and how did they do on their papers?
14. PAPERS WERE SUBMITTED FROM 32 DIFFERENT COUNTRIES
USA (n=378), Canada (n=24), Saudi Arabia (n=14), Pakistan (n=8), Italy (n=8), United Arab Emirates (n=8), Finland (n=7), Korea (n=7)
15. Acceptance
Rate (74.1%)
However, this acceptance figure is
not representative of the true
acceptance rate of SIGITE,
because the review process was
altered back in 2011.
From 2007‐2010 there was a
separate abstract submission
stage, which helped reduce the
eventual number of rejected
papers during those years.
16. Actual acceptance rates were:
41% (2007)
63% (2008)
68% (2009)
49% (2010)
52% (2011)
58% (2012)
17. Single author: 31%; two authors: 38%; three authors: 15%; four+ authors: 16%
There was no difference in acceptance rates between multi-author and single-author papers.
18. PAPER CATEGORIES
What were our papers about?
19. CATEGORIES BY IT PILLAR
21. REVIEWER INFORMATION
Who were our reviewers?
22. REVIEWER INFORMATION
1026 reviews from 192 reviewers (3.11 reviews per paper)
70% of papers were reviewed by 3 or 4 reviewers
23. INTERESTING FINDING
The number of reviews a paper had was negatively correlated with its
probability of being accepted to the conference.
Generally speaking, the more reviews a paper had, the less likely it was to
be accepted!
24. RATING INFORMATION
What did the ratings look like?
25. FIVE CATEGORIES
Reviewers supplied a rating between 1 and 6 for five different categories
Technical: 3.62 mean; Organization: 3.86 mean; Originality: 3.70 mean; Significance: 3.75 mean; Overall: 3.60 mean
26. OVERALL RATING
Rating definitions and number received
Overall Rating Description N %
1 Deficient 51 5.0%
2 Below Average 192 18.7%
3 Average 223 21.7%
4 Very Good 254 24.8%
5 Outstanding 267 26.0%
6 Exceptional 39 3.8%
Total 1026 100.0%
27. INTERESTING FINDING
These subcategory ratings were significantly correlated (p < .001) with the overall rating.
Additional post-hoc testing showed significant relationships between every one of these
four factors and every level of overall rating, which suggested strong internal reliability
for each of the reviewers (i.e., each reviewer was consistent with him/herself).
Generally speaking, this means that the subcategory ratings were not really needed.
28. REVIEWER VARIABILITY
Central tendency statistics for these ratings alone do not adequately capture the
variability of reviewer scoring for poor, average, and excellent papers.
29. REVIEWER VARIABILITY
Combination of min vs. max overall rating (330 papers)

Min \ Max    1    2    3    4    5    6 |   N
1            2    5    8   10   14    2 |  41
2                 8   23   29   47   10 | 117
3                     11   21   51    5 |  88
4                          16   31   14 |  61
5                               16    5 |  21
6                                     2 |   2
30. INTERESTING FINDING
While the overall statistics exhibited a strong tendency towards the mean,
paper ratings can vary considerably from reviewer to reviewer.
Based on these findings, it is recommended that future program
committees individually consider papers where rating scores deviate by 2
or more rating points.
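The recommended committee rule above can be expressed in a few lines of Python; the ratings dictionary below is invented sample data, not the study's data set.

```python
# Flag any paper whose overall ratings (1-6 scale) differ by 2 or more
# points across its reviewers, per the recommendation above.
ratings_by_paper = {
    "paper-A": [4, 5, 4],     # spread 1 -> no discussion needed
    "paper-B": [1, 4, 5],     # spread 4 -> flag for individual discussion
    "paper-C": [3, 3, 5, 2],  # spread 3 -> flag
}

def needs_discussion(ratings, threshold=2):
    """True when reviewer scores deviate by `threshold` or more points."""
    return max(ratings) - min(ratings) >= threshold

flagged = sorted(p for p, r in ratings_by_paper.items()
                 if needs_discussion(r))
print(flagged)  # the papers the committee should consider individually
```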
31. FACTORS AFFECTING RATING
What things affect reviewer ratings?
32. REVIEWER CHARACTERISTICS
Here we looked at two
characteristics that may
impact reviewer ratings:
1. familiarity with the
subject being reviewed
2. regional location.
33. REVIEWER FAMILIARITY
FAMILIARITY
•For each review, reviewers
assigned themselves a
familiarity rating of low,
medium, or high
ANALYSIS
•We performed ANOVA tests
to see if the reviewer’s
familiarity affected their
ratings.
THERE WERE NO DIFFERENCES BETWEEN GROUPS
This supports findings of other researchers
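The one-way ANOVA used here can be sketched with only the Python standard library. The familiarity groups and their 1-6 ratings below are made-up examples; a real analysis would compare the resulting F statistic against the F distribution's critical value, as SPSS does.

```python
from statistics import fmean

# Invented sample ratings grouped by self-reported reviewer familiarity.
groups = {
    "low":    [3, 4, 4, 2, 5],
    "medium": [4, 3, 5, 4, 3],
    "high":   [5, 4, 3, 4, 4],
}

def one_way_anova_f(samples):
    """Return the F statistic: between-group over within-group variance."""
    all_values = [x for g in samples for x in g]
    grand_mean = fmean(all_values)
    k, n = len(samples), len(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in samples)
    ss_within = sum((x - fmean(g)) ** 2 for g in samples for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = one_way_anova_f(list(groups.values()))
# A small F (well below the critical value) means no detectable difference
# between familiarity groups, which is what the study found.
print(round(f_stat, 3))
```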
34. WHAT ABOUT
REVIEWER LOCATION?
35. REVIEWER LOCATION
English-speaking: n=903; Europe: n=53; everywhere else: n=70
We found no differences between regions
36. TEXTUAL CHARACTERISTICS
We compared several
quantitative textual
measures on a subset of our
papers to see if any of them
were related to reviewers’
overall ratings.
The readability indices that
we tested included the
following:
the percentage of complex
words, the Flesch-Kincaid
Reading Ease Index, the
Gunning Fog Score, the
SMOG Index, and the
Coleman-Liau Index.
All of these indices are
meant to measure the
reading difficulty of a block
of text.
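As a concrete illustration of one such index, here is a rough Flesch Reading Ease calculator. The syllable counter is a crude vowel-group heuristic, so its scores are approximate and will differ from a polished implementation.

```python
import re

def count_syllables(word):
    """Approximate syllables as runs of vowels (minimum 1 per word)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllables / len(words)))

sample_text = ("Peer review is the main quality control mechanism. "
               "Reviewers differ.")
print(round(flesch_reading_ease(sample_text), 1))
```

Higher scores mean easier reading; dense academic prose typically lands well below everyday text, which matches the paper-level means reported on the next slide.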
37. TEXTUAL CHARACTERISTICS
The results

Characteristic | Significant | Correlation
Total number of words in paper (n=55; M=3152.22) | No | r = 0.264, p = 0.052
Readability indices of paper (n=55; M=39.33) | No | r = -0.016, p = 0.909
Readability indices of abstract (n=34; M=30.96) | No | r = -0.083, p = 0.641
Total # of words in abstract (n=159; M=115.13) | Yes | r = 0.379, p < .001
Number of references in paper (n=159; M=16.47) | Yes | r = 0.270, p = 0.001
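The r values in this table are Pearson correlations, which are straightforward to compute. The abstract-length and rating lists below are toy data, and the table's p-values come from a t test on r with n - 2 degrees of freedom, which is only set up (not evaluated) here since it needs a t distribution.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

abstract_words = [80, 95, 110, 120, 150, 170]  # toy values
overall_rating = [2, 3, 3, 4, 4, 5]            # toy values

r = pearson_r(abstract_words, overall_rating)
# Test statistic whose p-value comes from the t distribution (df = n - 2):
t = r * sqrt((len(abstract_words) - 2) / (1 - r * r))
print(round(r, 3), round(t, 3))
```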
38. INTERESTING FINDING
We were not surprised to find that the number of references in a paper
affected reviewer ratings.
We were surprised to discover that the length of the abstract also affected
reviewer ratings!
39. PEER REVIEW VALIDITY
How accurate were our reviewers?
40. WHAT IS VALIDITY?
Validity refers to the degree to which a reviewer’s
ratings of a paper are reflective of the paper’s
actual value.
While this may be the goal of all peer
review, it is difficult to measure
objectively.
Perhaps the easiest way to assess the
academic impact and quality of a paper is
to examine the paper’s eventual citation
count.
We grouped all the accepted papers
(n=245) into four quartiles based on
average overall rating.
We then took a random sampling of 96
papers from all six years, with an even
number from each year and each quartile.
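The quartile-and-sampling step can be sketched as follows. The paper tuples are synthetic stand-ins for the 245 accepted papers, and the study's additional balancing by year is simplified here to a per-quartile draw.

```python
import random
from statistics import quantiles

random.seed(42)
# Synthetic (id, year, mean_overall_rating) tuples for 245 accepted papers.
papers = [(i, 2007 + i % 6, round(random.uniform(1, 6), 2))
          for i in range(245)]

ratings = [p[2] for p in papers]
q1, q2, q3 = quantiles(ratings, n=4)  # the three quartile cut points

def quartile(rating):
    return 1 if rating <= q1 else 2 if rating <= q2 else 3 if rating <= q3 else 4

by_quartile = {q: [] for q in (1, 2, 3, 4)}
for p in papers:
    by_quartile[quartile(p[2])].append(p)

# 96 papers total -> 24 drawn at random from each quartile.
sample = [p for q in (1, 2, 3, 4)
          for p in random.sample(by_quartile[q], 24)]
print(len(sample))  # prints 96
```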
41. For each of these 96 papers, we gathered the number of citations from Google Scholar as well as the number of downloads from the ACM Digital Library, and then checked whether reviewer ratings were reflective of citations or downloads.
42. VALIDITY MEASURES
Did the peer review process at SIGITE predict the longer-term impact of the paper?

Characteristic | Significant | Correlation
Number of Google Scholar citations (n=96; M=4.60) | No | r = 0.121, p = 0.241
Cumulative ACM DL downloads to date (n=96; M=239.61) | No | r = 0.096, p = 0.351
Number of ACM DL downloads in past year (n=96; M=37.23) | No | r = 0.023, p = 0.822
43. This study has several limitations.
Our data set contained six years of data for a
computing education conference: such
conferences arguably have a unique set of
reviewers and authors in comparison to
“normal” computing conferences.
As such, there may be limits to the
generalizability of our results.
It is also important to recognize that
correlations are not the same as causation.
44. OTHER LIMITATIONS
In the future, we hope also to examine
whether reviewer reliability is related to
the experience level of the reviewer.
We would like to also fine tune our
validity analysis by seeing if correlations
differ for the top or bottom quartile of
papers.
46. SIGNIFICANT VARIABILITY IN REVIEWER RATINGS
Reviewer #1: 4 | Reviewer #2: 5 | Reviewer #3: 1 | Reviewer #4: 3 | Reviewer #5: 2
Future program chairs would be advised to control
for this variability by increasing the number of
reviewers per paper.
47. We need 4.0 reviewers per paper in the future.
48. EXTERNAL FACTORS DID NOT MATTER
Happily, there was no evidence that the nationality of the reviewer or the author
(or whether they were native English speakers) played a statistically
significant role in the eventual ratings a paper received.
49. SOME TEXTUAL FACTORS DID MATTER
Significant: number of references
Significant: number of words in abstract
No significance: total number of words in paper
No significance: readability indices
50. WHY THE ABSTRACT?
We were quite surprised to find that
the number of words in the abstract
was statistically significant.
Presumably, reviewers read the
abstract particularly carefully.
As such, our results suggest that
erring on the side of abstract brevity
is usually a mistake; it is important
for authors to make sure the abstract
contains sufficient information.
51. We also found that the number of
references was significant.
Acceptance probability rose with the number of references:
papers with almost no or very few references tended toward rejection,
while those with sufficient or many references tended toward acceptance.
52. AVG # OF REFERENCES PER PAPER
SIGITE: 16.47
ACM Digital Library: 21.26
Science Citation Index: 34.36
53. OBVIOUS CONCLUSIONS
Making a concerted effort at increasing citations is likely to improve a
paper’s ratings with reviewers.
It should be emphasized that the number of citations is not itself the cause of
lower or higher reviewer ratings.
Rather, the number of citations is likely a proxy measure for determining if
the paper under review is a properly researched paper that is connected to
the broader scholarly community.
54. Final Conclusion
VALIDITY
We did not find any connection between reviewers’ ratings of a paper and its
subsequent academic impact (measured by citations) or practical impact (measured by
ACM Digital Library downloads).
This might seem to be a disturbing result.
However, other research in this area also found no correlation between reviewer ratings
and subsequent academic impact.
It is important to remember that, “the aim of the peer review process is not the selection
of high impact papers, but is simply to filter junk papers and accept only the ones above
a certain quality threshold”.
55. FUTURE WORK
We hope to extend our analysis to
include not only more recent years, but
also to include more fine‐grained
examinations of the different factors
affecting peer review at the SIGITE
conference.