Donat Agosti Plazi
http://plazi.org
OpenData.ch/2015
1 July, University of Bern
Open Biodiversity Data:
In Search of the Lost Species
Latin names as access to science
Treatment
XML
RDF
The scientific challenge
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat
181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca
241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc
301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga
361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca
421 acatttattt
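As a rough illustration of how a program might read a listing like the one above, here is a minimal sketch. It assumes a GenBank-style layout (a running position number followed by groups of ten bases) and uses only the first two lines of the listing; `parse_sequence` is a hypothetical helper name, not part of any Plazi tool.

```python
import re
from collections import Counter

# First two lines of the sequence listing shown above.
# Assumed layout: position number, then groups of ten bases.
block = """\
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
"""


def parse_sequence(text: str) -> str:
    """Strip position numbers and whitespace, returning the bare sequence."""
    return re.sub(r"[\d\s]+", "", text).lower()


sequence = parse_sequence(block)
counts = Counter(sequence)

print(len(sequence))     # number of bases parsed
print(counts["n"])       # ambiguous base calls ("n")
print(sorted(counts))    # distinct symbols found in the sequence
```

The sequence alone carries no species name, so a program that reads it still needs an external link to the taxonomic literature to say what organism it belongs to.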
[Figure labels: LOD, PDF, HNS]
The Plazi vision: Giant Global Biodiversity Graph
Legal
Social
Technical
Ontologies
Infrastructure
500 M pages
5*
What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud
Plazi workflow
Plazi
SRS
find, scan, «OCR», markup, store + access
Focus on access to biodiversity data.
Text
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI
    1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margi
    to a sharp apical tooth, the apex parallel to the an
    (Holotype with material in mandibles, so mandibles a
    $ described below from paratypes.) Median clypeus
    ....
</tax:treatment>
Semantically enhanced text (TaxonX)
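To show what the markup buys a machine reader, the following minimal sketch pulls the name, status and external identifier out of the nomenclature part of the excerpt. It is an illustration under stated assumptions: the excerpt is reduced to a closed, well-formed fragment, and the namespace URIs are placeholders, because the slide does not show the real declarations.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the excerpt above does not
# include the actual xmlns declarations.
NS = {"tax": "http://example.org/taxonx", "dc": "http://example.org/dc"}

# A closed, well-formed reduction of the nomenclature section of the excerpt.
doc = """\
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""

root = ET.fromstring(doc)
name = root.find("tax:nomenclature/tax:name", NS)
xid = name.find("tax:xid", NS)

# Collect the machine-readable parts of the name into one record.
record = {
    "name": "{} {}".format(
        name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        name.findtext("tax:xmldata/dc:Species", namespaces=NS),
    ),
    "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
    "source": xid.get("source"),
    "identifier": xid.get("identifier"),
}
print(record)
```

With the genus and species tagged separately from the running text, the name can be matched against an external name server such as HNS without any text mining.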
… alternatives: From human- to machine-readable text
RDF
Treatment
Linking the data to external references
5*
2014
NCBI
Access to scientific literature: DOI via Zenodo/CERN
Plazi workflow: scientific names
Plazi workflow: tables
«Treatment»
Scientific species name
Distribution record
Cataglyphis tartessica workers
Variable          mean ± SD
Head length       11.23 ± 0.12
Head width        11.15 ± 0.12
Scape length      11.47 ± 0.12
Mesosoma length   11.94 ± 0.16
Femur length      12.03 ± 0.14
Cephalic index    0 93.60 ± 3.940
Scape index       128.10 ± 7.660
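The table step can be pictured as turning each printed row into a typed record. The sketch below is one simple way to do that with a regular expression. It is not Plazi's actual extraction code: `ROW` and `parse_row` are hypothetical names, and the two sample rows are copied exactly as they appear in the text above.

```python
import re

# Two rows copied verbatim from the table above.
rows = [
    "Head length 11.23 ± 0.12",
    "Scape index 128.10 ± 7.660",
]

# Label, then a mean, then "±", then a standard deviation.
ROW = re.compile(
    r"^(?P<variable>.+?)\s+"
    r"(?P<mean>\d+(?:\.\d+)?)\s*±\s*"
    r"(?P<sd>\d+(?:\.\d+)?)$"
)


def parse_row(line):
    """Return (variable, mean, sd), or None if the line is not a data row."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return (match["variable"], float(match["mean"]), float(match["sd"]))


for row in rows:
    print(parse_row(row))
```

Lines that do not fit the pattern, such as the table title, return `None` and can be kept as captions or flagged for manual review.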
Plazi workflow: finding and parsing bibliographic references
Plazi workflow: data mining for observation data
Plazi workflow: “Treatment”: a combination of tools
Status quo
• 50,000+ treatments live
• RDF in beta version
• GoldenGate Imagine (text mining tool) in beta version
• Data provider for NCBI, GBIF, EOL, AntWeb
• Biodiversity Literature Repository functional
Next steps
• 1 million treatments live
• RDF version 1
• GoldenGate Imagine (text mining tool) version 1
• Data provider for NCBI, GBIF, EOL, AntWeb
• Biodiversity Literature Repository with 100,000 bibliographic references and digital versions
Thank you!
Donat Agosti
agosti@plazi.org


Editor's notes

  • #2 Notes: Add in Plazi and the idea of the treatment server
  • #14 The Linking Open Data cloud diagram This web page is the home of the LOD cloud diagram. This image shows datasets that have been published in Linked Data format, by contributors to the Linking Open Data community project and other individuals and organisations. It is based on metadata collected and curated by contributors to the CKAN directory. Clicking the image will take you to an image map, where each dataset is a hyperlink to its homepage.
  • #26 Who are we?