A contribution to the pro-iBiosphere Final Conference on June 12, 2014 in Meise, Belgium. More info via http://wiki.pro-ibiosphere.eu/wiki/Final_Conference .
2. ● hundreds of millions of pages
○ ca. 20k treatments of new taxa per year
○ 50-100k re-descriptions annually
○ scattered across thousands of journals and books
Biodiversity literature
3. ● geared towards the human reader
● not machine-readable (scans/PDF)
● accumulated over three centuries
● includes much of what is published today
Legacy literature
4. ● digital > paper-only
● open access > hidden
● with > without open data
● soon: machine-readable > PDF
● these biases may skew analyses
Use & citation
5. ● identifying concepts
● linking them using controlled vocabularies
● integrating with other sources of information
Markup
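The first two markup steps on this slide (identifying concepts and linking them through a controlled vocabulary) could look roughly like the sketch below. It is a minimal illustration, not the method of any particular markup tool: the `VOCABULARY` table, the `example:` identifiers, the `mark_up` function, and the `<span>` output format are all assumptions made for this example.

```python
import re
from html import escape

# Hypothetical controlled vocabulary: surface form -> (concept type, identifier).
# The "example:" identifiers are placeholders, not real registry IDs.
VOCABULARY = {
    "Apis mellifera": ("taxon", "example:taxon/1"),
    "Meise": ("place", "example:place/1"),
}


def mark_up(text, vocabulary=VOCABULARY):
    """Wrap every known term in a <span> that carries its concept identifier."""
    # Try longer terms first so a long name wins over a shorter one it contains.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))

    parts = []
    position = 0
    for match in pattern.finditer(text):
        # Copy the untouched text before the match, escaped for HTML.
        parts.append(escape(text[position:match.start()]))
        kind, identifier = vocabulary[match.group(0)]
        parts.append(
            '<span class="%s" data-id="%s">%s</span>'
            % (kind, escape(identifier), escape(match.group(0)))
        )
        position = match.end()
    parts.append(escape(text[position:]))
    return "".join(parts)


print(mark_up("Apis mellifera was collected near Meise."))
```

The identifiers attached to each span are what would let a third step integrate the text with other sources of information. Exact string matching is used only to keep the sketch short and assumes a non-empty vocabulary; real taxonomic names need handling of abbreviations, synonyms, and authorship strings.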
6. Reis et al. (2008), marked up by Shotton et al. (2009). CC BY 2.5
7. ● automated markup of prospective literature
● crowdsourced markup of legacy literature
● semi-automated markup with expert assistance
Scaling up
8. ● mark up taxonomic publications henceforth
● focus on revisionary works (biotas)
● adjust granularity to concrete use cases
● follow standards
● automate workflows
Recommendations