The Future of Scientific Publishing Donat Agosti  (Plazi, Bern) 21 January 2011 Paris
I don’t know the future, but I have a dream…
Immersing in the knowledge
I want to ask a publication a question, not have the author tell me what I have to read.
I want to find out  how many and which species are there?  how are they related?  do they disappear? how are they distributed?
I want to find out how many and which species there are, how they are related, and whether they disappear. Other people have different interests.
“Protein-protein interaction networks” John Wilbanks, Neurocommons
 
In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other: “protein-protein interaction networks” John Wilbanks, Neurocommons. 27,266 papers; 4,563 papers; 41,985 papers; 10,365 papers; 128,437 papers
It will open up scientific literature for data mining: “protein-protein interaction networks” John Wilbanks, Neurocommons
Taxon mining project
1996 Conservation, Phylogeny, Systematics, Curiosity, Aesthetics, Fascination
2011 Experience, Frustration, Wonder, Excitement, Satisfaction, Determination
Modeling taxonomic literature: TaxonX Taxpub NLM DTD Plazi
Plazi workflow: GoldenGate mark up as an example
- Get bibliographic metadata from HNS (MODS)
- Get bibliographic GUIDs from bioguid (or EDIT?)
- Get geographic long/lat from geonames.org
The semantically enhanced treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
Plazi Search and Retrieval Server: access to data (TAPIR, SPM) for you, whether human or machine
The conversion comes at a cost, even though GoldenGate and other editors exist
Time per minute to produce clean OCR using ABBYY; publications in chronological order. Production metrics to measure effort and compare various approaches and algorithms
How to mark up a large body of legacy publications? Inhouse? Build / use commercial services? Use the community, e.g. volunteers? Activation energy Gutenberg Semantic Web Cost per knowledge
Training and demos...
Avoid it
Prospective publications: Zookeys / Phytokeys
Semantic enhancements to published texts
2036 ?
Why do we publish?
Public funded research
Contribute to the welfare of the nations…
Dissemination
Access
Before antbase.org, Harvard’s Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org’s digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in one month alone).
Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
The Biodiversity Heritage Library is currently digitizing and making accessible >100 million pages, most of them out of copyright, i.e. older than 1925. ........ to be finished in 2048...
What is a publication from public funded science?
 
Open Access
What is a scientific publication? Print, journal, article, treatment, public funding, pdf, xml Tool to disseminate scientific knowledge
Why do we publish the way we publish?
What kind of publications serve our needs?
IPBES
Access
Beyond the PDF
Access to what?
Scratchpad, EOL page, Wikipage, species page
Treatment
Treatments come with a lot of overhead
The structure of a systematics publication
Publication: Title, Author, Abstract, Introduction, Taxon descriptions, Suppl. Materials, Acknowledgments, References
Taxon descriptions (Genus): Diagnosis, Notes, Biology, Distribution, Key to sp., Species descriptions
Species treatments: Species 1, Species 2, Species 3, Species 4, Species .., Species n
Species 1: Nomenclature, Diagnosis, Distribution, Material Examined, Comments, Description, Graphic art
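The nesting on this slide can be sketched as a small data model. This is only an illustration: the class names `Publication` and `Treatment` and the helper `taxon_names` are hypothetical, the field names are taken from the slide's section labels, and the sample title and author are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Treatment:
    """One species treatment; fields mirror the slide's section labels."""
    nomenclature: str
    diagnosis: str = ""
    distribution: str = ""
    material_examined: str = ""
    comments: str = ""
    description: str = ""
    graphic_art: List[str] = field(default_factory=list)


@dataclass
class Publication:
    """A systematics publication: the treatments plus everything around them."""
    title: str
    authors: List[str]
    abstract: str = ""
    introduction: str = ""
    treatments: List[Treatment] = field(default_factory=list)
    supplementary_materials: List[str] = field(default_factory=list)
    acknowledgments: str = ""
    references: List[str] = field(default_factory=list)

    def taxon_names(self) -> List[str]:
        # Answers "which species are there?" without reading the whole paper.
        return [t.nomenclature for t in self.treatments]


# Placeholder publication; only the species name comes from the slides.
paper = Publication(
    title="Hypothetical revision",
    authors=["A. Author"],
    treatments=[Treatment(nomenclature="Mystrium leonie")],
)
print(paper.taxon_names())
```

Everything outside `treatments` is the overhead a reader must skip when only the treatments are of interest.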
Treatments come with a lot of overhead Treatments are highly structured
Treatments come with a lot of overhead Treatments are highly structured Content is defined
Treatments come with a lot of overhead Treatments are highly structured Content is defined XML can define it
This can also be applied to entire sections of text, such as the descriptions of a species and its parts.
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus ....</tax:p>
  </tax:div>
</tax:treatment>
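Once a treatment is marked up like this, a machine can answer simple questions about it. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The slide does not show the namespace declarations, so the two namespace URIs are placeholders, the function name `summarize_treatment` is hypothetical, and the sample is a shortened, closed copy of the fragment above.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: assumptions, not the real TaxonX declarations.
NS = {"tax": "http://example.org/tax", "dc": "http://example.org/dc"}

TREATMENT = """
<tax:treatment xmlns:tax="http://example.org/tax"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30,
    SI 137, PW 0.73, ML 0.38.</tax:p>
  </tax:div>
</tax:treatment>
"""

# Two capital letters followed by a number, e.g. "TL 3.95" or "CI 93".
MEASUREMENT = re.compile(r"\b([A-Z]{2})\s+(\d+(?:\.\d+)?)")


def summarize_treatment(xml_text):
    """Pull the name, status, external id, sections and measurements."""
    root = ET.fromstring(xml_text)
    genus = root.findtext(".//dc:Genus", namespaces=NS)
    species = root.findtext(".//dc:Species", namespaces=NS)
    xid = root.find(".//tax:xid", NS)
    description = root.findtext(".//tax:p", default="", namespaces=NS)
    return {
        "name": f"{genus} {species}",
        "status": root.findtext(".//tax:status", namespaces=NS),
        "source": xid.get("source"),
        "identifier": xid.get("identifier"),
        "sections": [div.get("type") for div in root.findall("tax:div", NS)],
        "measurements": {
            key: float(value) for key, value in MEASUREMENT.findall(description)
        },
    }


summary = summarize_treatment(TREATMENT)
print(summary["name"], summary["status"], summary["measurements"]["TL"])
```

The regular expression for measurements is a deliberately naive heuristic for this one sentence; real descriptions would need more careful parsing.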
Treatments come with a lot of overhead Treatments are highly structured Content is defined XML defines them The question is how to get them
Mark-up of legacy publications
$$$$$$$$$$$$$$$$$
Prospective semantic mark-up and linking to external sources is the future
Treatment repository + external resources
BHL-Modern
The future is writable.
Happy Birthday! January 15, 2001
What is a scientific publication? Wikipedia entry as a publication?
Quality control
What is a scientific publication? Centrifugal versus centripetal forces or  are we attractive enough?
Continuity
$$$$$$$
http://plazi.org Thank you very much! Donat Agosti

Setting the Scene for ViBRANT – Strategy, Philosophy and Communication

  • 1. The Future of Scientific Publishing Donat Agosti (Plazi, Bern) 21 January 2011 Paris
  • 2. I don‘t know the future, but I have a dream…
  • 3. Immersing in the knowledge
  • 4. I want to ask a publication a question, not the author telling me what I have to read.
  • 5. I want to find out how many and which species are there? how are they related? do they disappear? how are they distributed?
  • 6. I want to find out how many and which species there are, how they are related, and whether they disappear. Other people have different interests.
  • 9. In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other: "protein-protein interaction networks" (John Wilbanks, Neurocommons). 27,266 papers, 4,563 papers, 41,985 papers, 10,365 papers, 128,437 papers
  • 10. It will open up scientific literature for data mining: "protein-protein interaction networks" (John Wilbanks, Neurocommons)
  • 12. 1996 Conservation, Phylogeny, Systematics, Curiosity, Aesthetics, Fascination
  • 13. 2011 Experience, Frustration, Wonder, Excitement, Satisfaction, Determination
  • 14. Modeling taxonomic literature: TaxonX Taxpub NLM DTD Plazi
  • 16. The semantically enhanced treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
  • 17. Plazi Search and Retrieval Server: Access to data TAPIR, SPM You You You human machine
  • 18. The conversion comes at a cost, even though GoldenGate and other editors exist
  • 19. Time per minute to produce clean OCR using ABBYY; publications in chronological order. Production metrics to measure effort and compare various approaches and algorithms
  • 20. How to mark up a large body of legacy publications? In-house? Build / use commercial services? Use the community, e.g. volunteers? Activation energy. Gutenberg. Semantic Web. Cost per knowledge
  • 24. Semantic enhancements to published texts
  • 26. Why do we publish?
  • 28. Contribute to the welfare of the nations…
  • 31. Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org's digital library, this body of literature is accessible worldwide, and it is actively used (>10,000 visits in a single month).
  • 32. Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
  • 33. The Biodiversity Heritage Library is currently digitizing and making accessible >100 million pages, most of them out of copyright, i.e. older than 1925. ........ to be finished in 2048...
  • 34. What is a publication from publicly funded science?
  • 37. What is a scientific publication? Print, journal, article, treatment, public funding, pdf, xml Tool to disseminate scientific knowledge
  • 38. Why do we publish the way we publish?
  • 39. What kind of publications serve our needs?
  • 40. IPBES
  • 44. Scratchpad, EOL page, Wikipage, species page
  • 46. Treatments come with a lot of overhead
  • 47. The structure of a systematics publication. Publication: Title, Author, Abstract, Introduction, Taxon descriptions, Suppl. Materials, Acknowledgments, References. Genus: Diagnosis, Notes, Biology, Distribution, Key to sp., Species descriptions. Species treatments: Species 1, Species 2, Species 3, Species 4, Species .., Species n. Species 1: Nomenclature, Diagnosis, Distribution, Material Examined, Comments, Description, Graphic art.
  • 48. Treatments come with a lot of overhead Treatments are highly structured
  • 49. The structure of a systematics publication. Publication: Title, Author, Abstract, Introduction, Taxon descriptions, Suppl. Materials, Acknowledgments, References. Genus: Diagnosis, Notes, Biology, Distribution, Key to sp., Species descriptions. Species treatments: Species 1, Species 2, Species 3, Species 4, Species .., Species n. Species 1: Nomenclature, Diagnosis, Distribution, Material Examined, Comments, Description, Graphic art.
  • 50. Treatments come with a lot of overhead. Treatments are highly structured. Content is defined.
  • 51. Treatments come with a lot of overhead. Treatments are highly structured. Content is defined. XML can define it.
  • 52. This can also be applied to entire sections of text, such as the descriptions of a species and its parts. <tax:treatment> <tax:nomenclature> <tax:name> <tax:xid source="HNS" identifier="193329"/> <tax:xmldata> <dc:Genus>Mystrium</dc:Genus> <dc:Species>leonie</dc:Species> </tax:xmldata> Mystrium leonie </tax:name> <tax:status>n. sp.</tax:status> Fig 1 D - F </tax:nomenclature> <tax:div type="description"> <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus $ described below from paratypes.) Median clypeus .... </tax:treatment>
  • 53. Treatments come with a lot of overhead. Treatments are highly structured. Content is defined. XML defines them. The question is how to get them.
  • 54. Mark-up of legacy publications
  • 56. Prospective semantic mark-up and linking to external sources is the future
  • 57. Treatment repository + external resources
  • 59. The future is writable.
  • 61. What is a scientific publication? Wikipedia entry as a publication?
  • 63. What is a scientific publication? Centrifugal versus centripetal forces or are we attractive enough?
  • 66. http://plazi.org Thank you very much! Donat Agosti [email_address]
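
The markup on slide 52 is what makes it possible to "ask a publication a question". As a minimal sketch (not Plazi's actual tooling), the following Python reads a completed version of that snippet and pulls out the name, the HNS identifier, and the morphometric values. The namespace URIs, the closing tags added to make the sample well-formed, and the function name `summarize_treatment` are assumptions made for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions); the real TaxonX / Dublin Core
# namespaces are not given on the slides.
NS = {"tax": "http://example.org/taxonx", "dc": "http://example.org/dc"}

# Slide 52's snippet, shortened and closed so that it is well-formed XML.
SAMPLE = """
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38.</tax:p>
  </tax:div>
</tax:treatment>
"""


def summarize_treatment(xml_text):
    """Return name, identifier, status and measurements of one treatment."""
    root = ET.fromstring(xml_text.strip())
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    para = root.find("tax:div[@type='description']/tax:p", NS)
    description = "".join(para.itertext())
    # Two-letter codes followed by a number, e.g. "TL 3.95" or "CI 93".
    pairs = re.findall(r"\b([A-Z]{2})\s+(\d+(?:\.\d+)?)", description)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "xid": (xid.get("source"), xid.get("identifier")),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "measurements": {code: float(value) for code, value in pairs},
    }


if __name__ == "__main__":
    print(summarize_treatment(SAMPLE))
```

Once treatments are stored in this form, the same function could in principle be run over every treatment in a repository, which is the kind of data mining the earlier slides ask for.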

Editor's notes

  1. Notes: Add in Plazi and the idea of the treatment server
  2. Publicly funded science (which is what we talk about today, not military or industry-funded science) has the opposite business model to commercial publishing. Its funding includes the creation of a product (the publication), whereas commercial publishing earns its revenue from selling publications. The task of the scientific community is to disseminate its findings as widely as possible. Therefore, barriers linked to copyright need to be avoided: passwords, pay-per-view, etc. Finding information is the paramount act in science, and thus every impediment to discovery must be removed. This does not exclude the need for a working business model. But neither does it mean that we should become opportunists and let control over our data slip (see context, bibliographic data, other databases).
  3. Building a DNA – go to the library and find all about the DNA
  4. Reuse of information
  5. Reuse of information
  6. Reuse of information
  7. Measuring and monitoring biodiversity
  14. Part of scientific discourse Records
  15. Part of scientific discourse Records
  16. Access is enough?
  17. Access is enough?
  18. Access is enough?
  19. From print to PDF to content
  22. What is context? Can we afford to create and maintain context? If you had to bet, what is the limiting factor for our future? What is hosted where? Who is paying for it? Do we need cross-sectoral financing? What is the role of the natural history museums? What comes after the infotainment wave? Intelligent customers as opposed to consumers?
  23. Where is the border of science, and where is it not?
  24. Part of scientific discourse Records