2. PeerJudge Overview
• Based on a dictionary of review sentiment terms
and phrases from F1000Research reviews
• Each dictionary term or phrase has a praise or
criticism score
• Well written: +2
• Flawed: -4
• Each review is given the maximum positive and
negative scores of the words or phrases found in
each sentence.
• -1: no criticism … -5: very strong criticism
• 1: no praise … 5: very strong praise
• Also 12 linguistic rules to cope with negation and
booster words (e.g. very, slightly)
3. PeerJudge Example
• The paper is well written but the study is poorly
designed.
• Praise: 2; Criticism: -4.
• Try it online:
• http://sentistrength.wlv.ac.uk/PeerJudge.html
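The scoring scheme above can be sketched in Java (the tool's own implementation language). The mini-dictionary, class and method names here are illustrative assumptions, not the real code; the full dictionary and the 12 linguistic rules ship with the jar.

```java
import java.util.Map;

public class PeerJudgeSketch {
    // Hypothetical mini-dictionary; the real terms and weights live in
    // the external plain-text dictionary files.
    static final Map<String, Integer> DICT = Map.of(
        "well written", 2,
        "flawed", -4,
        "poorly designed", -4,
        "compelling", 3
    );

    // Returns {praise, criticism} for one sentence: the maximum positive
    // and most negative dictionary score found, on the 1..5 / -1..-5 scales.
    static int[] score(String sentence) {
        String s = sentence.toLowerCase();
        int praise = 1;       // 1 = no praise
        int criticism = -1;   // -1 = no criticism
        for (Map.Entry<String, Integer> e : DICT.entrySet()) {
            if (s.contains(e.getKey())) {
                int w = e.getValue();
                if (w > 0) praise = Math.max(praise, w);
                else criticism = Math.min(criticism, w);
            }
        }
        return new int[]{praise, criticism};
    }

    public static void main(String[] args) {
        int[] r = score("The paper is well written but the study is poorly designed.");
        System.out.println("Praise: " + r[0] + "; Criticism: " + r[1]);
        // → Praise: 2; Criticism: -4 (the slide's example)
    }
}
```

The sketch reproduces the slide's example: "well written" (+2) sets the praise score and "poorly designed" (-4) the criticism score, and neither displaces the other because the two scales are kept separate.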
4. Part of the dictionary
acceptabl* 3
accurate 2
adequat* 3
appropriate 3
arbitrary -2
balanced 2
bewilder* -3
but 1
careful* 3
clarify -2
clear 4
clearer -3
compelling 3
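A trailing * in a dictionary entry appears to act as a prefix wildcard, so "acceptabl*" covers "acceptable", "acceptably" and "acceptability", while an exact entry like "clear" does not match "clearer" (which has its own weight). A minimal matching sketch, with class and method names assumed:

```java
public class WildcardMatch {
    // A trailing '*' matches any word beginning with that prefix;
    // otherwise the entry must equal the word exactly.
    static boolean matches(String entry, String word) {
        if (entry.endsWith("*")) {
            return word.startsWith(entry.substring(0, entry.length() - 1));
        }
        return word.equals(entry);
    }

    public static void main(String[] args) {
        System.out.println(matches("acceptabl*", "acceptably"));  // true
        System.out.println(matches("clear", "clearer"));          // false
    }
}
```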
5. Technical details
• Java jar program
• portable
• Dictionaries are external plain text files
• Easily customizable
• Fast
• 14,000 reviews per second
• Explains its judgement
• So it is transparent, and the owner can adjust the
dictionary to fix recurrent problems
• Agrees with reviewer scores at above-chance rates
• Because it is dictionary-based, it does not “cheat” by
identifying hot topics, fields, affiliations or jargon
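Since the dictionaries are external plain-text files, customization is a matter of editing "term weight" lines like those on the previous slide. A hypothetical loader sketch, assuming one space-separated term/weight pair per line (the real file format may differ):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class DictLoader {
    // Parses "term weight" lines, e.g. "acceptabl* 3" or "well written 2".
    // The last space separates the (possibly multi-word) term from its weight.
    static Map<String, Integer> load(Reader in) throws IOException {
        Map<String, Integer> dict = new LinkedHashMap<>();
        BufferedReader br = new BufferedReader(in);
        String line;
        while ((line = br.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) continue;
            int split = line.lastIndexOf(' ');
            dict.put(line.substring(0, split).trim(),
                     Integer.parseInt(line.substring(split + 1)));
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> d =
            load(new StringReader("acceptabl* 3\nwell written 2\narbitrary -2\n"));
        System.out.println(d);
    }
}
```

Splitting on the last space keeps multi-word phrases ("well written") intact while still handling negative weights.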
6. Where is the dictionary from?
• Human evaluation of a development dataset of
F1000Research reviews
• Machine learning to suggest extra terms and
different weights
7. Limitations
• Designed for F1000Research decisions – needs
dictionary modification for good performance on other
review datasets.
• F1000Research reviews are unbalanced – few negative
decisions
• F1000Research reviews contain standard concluding
text that had to be removed – so referees might not
state conclusions in their own words
• Referees often give judgements in field-specialist
language, avoiding general conclusions
• More substantial modifications may be needed for
technical domains.
• Difficult to make these modifications in advance because
very few outlets publish reviews and scores
8. Applications
• Warning reviewers if their judgements are
apparently out of line with their scores?
• Warning reviewers if they have not given any
praise.
• As above for editors
• On a larger scale, allow publishers to check for
anomalies in the reviewer process, such as by
identifying journals with uncritical referees (low
average criticism scores).
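The last application could be sketched as a simple aggregation over per-journal criticism scores (on the -1 to -5 scale). The journal names, scores and the -1.5 flagging threshold below are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JournalAnomaly {
    // Flags journals whose mean criticism score sits near -1,
    // i.e. whose referees rarely voice criticism.
    static List<String> uncriticalJournals(Map<String, List<Integer>> criticismByJournal,
                                           double threshold) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : criticismByJournal.entrySet()) {
            double mean = e.getValue().stream()
                           .mapToInt(Integer::intValue).average().orElse(-1);
            if (mean > threshold) flagged.add(e.getKey());  // closer to -1 than the cutoff
        }
        return flagged;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> data = new LinkedHashMap<>();
        data.put("Journal A", List.of(-1, -1, -2));  // mean ≈ -1.33: uncritical
        data.put("Journal B", List.of(-3, -4, -2));  // mean = -3.0: critical
        System.out.println(uncriticalJournals(data, -1.5));
    }
}
```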