Force11: Enabling transparency and efficiency in the research landscape

Presented at the February 2015 NISO Virtual Conference
Scientific Data Management: Caring for Your Institution and its Intellectual Wealth
http://www.niso.org/news/events/2015/virtual_conferences/sci_data_management/

  1. Melissa Haendel, PhD, Oregon Health & Science University. Future of Research Communications and E-Scholarship: Enabling transparency and efficiency in the research landscape. @force11rescomm @ontowonka
  2. Research pre-Web: Do an experiment. Document in a lab notebook. Publish your results.
  3. The Research Life Cycle: TECHNIQUE, COLLABORATION, PUBLICATION, DATASET, GRANT
  4. Impetus for change: Is our current method serving science? 47 of 50 major published preclinical cancer studies could not be replicated. "The scientific community assumes that the claims in a preclinical study can be taken at face value: that although there might be some errors in detail, the main message of the paper can be relied on and the data will, for the most part, stand the test of time. Unfortunately, this is not always the case." Begley and Ellis, Nature 483:531, 29 March 2012
  5. Not all content is available for synthesis and discovery. Search PubMed: Spinal Muscular Atrophy
  6. The scientific corpus is fragmented: ~25 million articles total, each covering a fragment of the biomedical space; each publisher (Wiley, Elsevier, Macmillan, Oxford) owns a fragment of a particular field; and the current process is inefficient and slow. Example: Spinal Muscular Atrophy
  7. Committee on Academic Promotions. What counts: money, grants, papers, teaching, service. What does not: sharing data, sharing software, open access, collaboration, patents, startups. Getting Ahead as a Computational Biologist in Academia, PLOS Comp Biol, doi:10.1371/journal.pcbi.1002001
  8. Beyond the PDF: a conference/unconference where all stakeholders come together as equals to discuss issues: publishers, technologists, scholars, library scientists, humanists, policy makers, and funders. An incubator for change. What would you do to change scholarly communication? San Diego, Jan 2011; Amsterdam, March 2013; Oxford, 2015. http://www.force11.org/beyondthepdf2
  9. FORCE11, the Future of Research Communications and E-Scholarship: a grassroots effort to accelerate the pace and nature of scholarly communications and e-scholarship through technology, education, and community. Why 11? We were born in 2011 in Dagstuhl, Germany. Principles laid out in the FORCE11 Manifesto. FORCE11 launched in July 2012. www.force11.org
  10. Promote community, cross-fertilization, and interoperability. FORCE11 helps facilitate communication across disciplines and communities; the issues are not identical, but we can learn from each other. Community platform: meetings, discussions, tools and resources, blogs, event calendar, community projects. Working groups: Data Citation, Resource Identification Initiative, Attribution, Data Standards/BioSharing
  11. Data Citation Working Group: FORCE11 provides a neutral space for bringing groups together. 35 individuals representing more than 20 organizations concerned with data citation conducted a review of current data citation recommendations from 4 different organizations and arrived at consensus principles. http://www.force11.org/datacitation
  12. Data Citation Principles: consensus data citation principles, ready for comment, designed to be high level and easy to understand: 1. Importance; 2. Credit and Attribution; 3. Evidence; 4. Unique Identifiers; 5. Access; 6. Persistence; 7. Versioning; 8. Interoperability and Flexibility
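For illustration, here is a minimal sketch of assembling a human-readable data citation that touches several of the principles above (credit and attribution, unique identifiers, access, versioning). This is a toy example, not a FORCE11 reference implementation, and all field values below are hypothetical.

```python
# Toy example: build a data citation string carrying creator credit, a
# version, and a persistent identifier (DOI). The exact layout is one of
# many reasonable renderings, not a FORCE11-mandated format.

def format_data_citation(creators, year, title, version, repository, doi):
    """Render a citation like:
    Smith J; Lee K (2014): Title. Version 2.1. Repository. https://doi.org/10.xxxx/yyyy
    """
    authors = "; ".join(creators)
    return (f"{authors} ({year}): {title}. Version {version}. "
            f"{repository}. https://doi.org/{doi}")

# All values below are invented for demonstration.
print(format_data_citation(
    creators=["Smith J", "Lee K"],
    year=2014,
    title="Example RNA-seq expression dataset",
    version="2.1",
    repository="Example Data Repository",
    doi="10.1234/example.5678",
))
```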
  13. Data Citation Implementation: https://www.force11.org/datacitationimplementation and https://peerj.com/preprints/697/
  14. BioCADDIE Data Discovery Index: https://www.force11.org/group/biocaddie/cewg
  15. Challenge: working with Web data. Datasets often have inadequate descriptions, so we don't know what they are about or how they were constructed. Datasets change over time but often don't come with versioning information. They may have been constructed from other data, but it is not clear which version of that data was used or whether it was modified. Data may be available in a variety of formats. There may be multiple copies of data from different providers, but it is unclear whether they are exact copies or derivatives. The version of the standard or vocabulary used is not indicated. Data registries are not synchronized and can contain conflicting information.
  16. W3C HCLS Dataset Description: develop a guidance note for reusing existing vocabularies to describe datasets with RDF, with mandatory, recommended, and optional descriptors covering identifiers, versioning, attribution, provenance, and content summarization; recommend vocabulary-linked attributes and value sets; provide a reference editor and validation.
  17. Metadata model: description – version – distribution. http://tiny.cc/hcls-datadesc
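As a concrete illustration of the summary – version – distribution model, the sketch below uses Python's rdflib to describe a dataset at all three levels. The vocabularies (dcterms, dcat, pav) are among those such guidance draws on, but the specific URIs and property choices here are simplified placeholders, not the guidance note's normative profile.

```python
# A simplified three-level dataset description (summary -> version ->
# distribution) expressed in RDF with rdflib. All example.org URIs are
# placeholders invented for this sketch.
from rdflib import Graph, Literal, Namespace, URIRef

DCT = Namespace("http://purl.org/dc/terms/")
DCAT = Namespace("http://www.w3.org/ns/dcat#")
PAV = Namespace("http://purl.org/pav/")
EX = Namespace("http://example.org/dataset/")

g = Graph()
g.bind("dct", DCT)
g.bind("dcat", DCAT)
g.bind("pav", PAV)

summary = EX["sma-genes"]             # the dataset in the abstract
version = EX["sma-genes/release-2"]   # one concrete release
dist = EX["sma-genes/release-2.ttl"]  # one downloadable file

# Summary level: what the dataset is and who publishes it.
g.add((summary, DCT.title, Literal("SMA gene associations (example)")))
g.add((summary, DCT.publisher, URIRef("http://example.org/some-lab")))

# Version level: versioning and provenance descriptors.
g.add((version, DCT.isVersionOf, summary))
g.add((version, PAV.version, Literal("2.0")))
g.add((version, PAV.derivedFrom, EX["sma-genes/release-1"]))

# Distribution level: a concrete serialization in a concrete format.
g.add((version, DCAT.distribution, dist))
g.add((dist, DCT.format, Literal("text/turtle")))
g.add((dist, DCAT.downloadURL,
       URIRef("http://example.org/files/sma-genes-release-2.ttl")))

print(g.serialize(format="turtle"))
```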
  18. On another planet the FORCE was strong…
  19. Journal guidelines for methods are often poor, and space is limited. "All companies from which materials were obtained should be listed." (a well-known journal) Reproducibility depends, at a minimum, on using the same resources. But…
  20. How identifiable are resources in the published literature?
  21. Only ~50% of resources were identifiable. Vasilevsky et al., 2013, PeerJ
  22. There is no correlation between impact factor and resource identification. [Scatter plot: fraction of resources identified (0.0–1.0) vs. journal impact factor (0–40), for antibodies, cell lines, constructs, knockdown reagents, and organisms.]
  23. RRIDs should be: machine readable; consistent across publishers and journals; free to generate and access. http://www.force11.org/Resource_Identification_Initiative. Numerous endorsers: https://www.force11.org/RII/SignUp. Implementation of the new standard: http://biosharing.org/bsg-000532
  24. Publishing workflow: 1. Researcher submits a manuscript for publication. 2. Editor or publisher asks for inclusion of RRIDs. 3. Author goes to the Resource Identification Portal to locate the RRID. 4. RRID is included in the Methods section and as a keyword. Sample citation: polyclonal rabbit anti-MAPK3 antibody, Abgent, Cat# AP7251E, RRID:AB_2140114
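Because RRIDs are meant to be machine readable, a downstream tool can pull them out of a methods section with a simple pattern match. The sketch below is a rough, illustrative scanner, not an official validator; real RRID syntax has more prefix variants than this pattern covers.

```python
import re

# Rough pattern for RRIDs as they appear in text, e.g. "RRID:AB_2140114".
# Prefixes vary by resource type (AB_ for antibodies, SCR_ for software
# tools, CVCL_ for cell lines, among others); this regex is deliberately
# loose and only for demonstration.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")

methods_text = ("Polyclonal rabbit anti-MAPK3 antibody, Abgent, "
                "Cat# AP7251E, RRID:AB_2140114")

for match in RRID_PATTERN.finditer(methods_text):
    print("Found RRID:", match.group(1))  # prints: Found RRID: AB_2140114
```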
  25. What is the relationship of a person to a publication?
  26. Example scenario: Melissa creates mouse1. David creates mouse2. Layne performs RNA-seq analysis on mouse1 and mouse2 to generate dataset3, which he subsequently curates and analyzes. Layne writes publication pmid:12345 about the results of his analysis. Layne explicitly credits Melissa as an author, but not David.
  27. Credit is connected: credit to Melissa is asserted, but credit to David can be inferred.
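A minimal sketch of that inference: represent each product with its direct contributor and its derived-from links, then walk backwards from the publication and collect everyone reachable. The encoding below is a toy data model invented for this sketch, not a FORCE11-specified format; the names and identifiers come from the scenario on the previous slide.

```python
# Toy provenance graph for the scenario: pmid:12345 was written from
# dataset3, which was generated from mouse1 and mouse2.
derived_from = {
    "pmid:12345": ["dataset3"],
    "dataset3": ["mouse1", "mouse2"],
}
contributor = {
    "mouse1": "Melissa",
    "mouse2": "David",
    "dataset3": "Layne",
    "pmid:12345": "Layne",
}

def inferred_credit(product):
    """Collect contributors reachable through derived-from links."""
    credited, stack = set(), [product]
    while stack:
        node = stack.pop()
        credited.add(contributor[node])
        stack.extend(derived_from.get(node, []))
    return credited

# Melissa is asserted on the paper; David's credit falls out of the graph.
print(inferred_credit("pmid:12345"))  # {'Layne', 'Melissa', 'David'}
```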
  28. Attribution Working Group: https://www.force11.org/group/attributionwg. Related efforts: Project CRediT, the VIVO-ISF ontology, PROV, the Becker model, transitive credit, and the Scholarly Contributions and Roles ontology. The goal is to catalyze rapid convergence on requirements, approaches, and practical implementation of a system for tracking contributions to any scholarly product.
  29. The 1K Challenge: What would you do with £1k today to make research communication better, anticipating the increasing scale of people and machines?
  30. Starting at ground zero: consultations with a researcher plus 2–3 members of the Data Stewardship Team.
  31. Conclusions and new directions: Researchers DO need assistance with finding and choosing data standards, file versioning, and applying metadata to facilitate data sharing. The "Gummi Bear" themed data management exercise resonated well with students. There is a lack of awareness of the services and expertise offered by the Library; the OHSU Library is developing data services for researchers. http://laughingsquid.com/the-anatomy-of-a-gummy-bear-by-jason-freeny/ DOI:10.6083/M4QC0273
  32. 1K Challenge entries (vote at https://www.force11.org/force2015/1k-challenge-vote): "Meta Makes My Machine Marvellous (5M)"; "Crowdreviewing: the sharing economy at its finest"; "Science bots"; "Scientific articles are too expensive to publish and to read". Join FORCE11: https://www.force11.org/
  33. FORCE11 Vision: • Modern technologies enable vastly improved knowledge transfer and far wider impact; freed from the restrictions of paper, numerous advantages appear. • We see a future in which scientific information, and scholarly communication more generally, become part of a global, universal, and explicit network of knowledge. • To enable this vision, we need to create and use new forms of scholarly publication that work with reusable scholarly artifacts. • To obtain the benefits that networked knowledge promises, we have to put in place reward systems that encourage scholars and researchers to participate and contribute. • To ensure that this exciting future can develop and be sustained, we have to support the rich, variegated, integrated, and disparate knowledge offerings that new technologies enable. What is the 21st-century equivalent of the library?
  34. Acknowledgements: Maryann Martone, Phil Bourne, Michel Dumontier, Nicole Vasilevsky, Stephanie Hagstrom, and all 1000+ members of FORCE11

Editor's notes

  • Science used to be pretty linear, and slow.
    Clone by phone.
  • Now science is a web of interconnected resources and activities, only a portion of which is the scientific literature.
  • Should science be reproducible? Can it be? How would we make it so? How will we evaluate reproducibility? What does the scholarly article need to be, or connect to, to make it a venue for reproducibility?
  • First 6 results in PubMed for SMA: can't access them; 3 different publishers; only one is freely available.
  • This WG came out of the first one. Examples here are recommendations having to do with allowing metadata identifier systems.
    The paper is in preprint and will be out soon.
  • NIH-funded BD2K initiative to develop recommendations for a data discovery index.
  • 84 journals, 248 papers, 5 disciplines, 3 impact factor ranges, 3 reporting guideline stringencies
  • The RRID working group has numerous publishers and journals that have implemented the standard.
  • We are working on determining how to deal with this longer term: is this a new data citation that goes alongside the paper? It needs to be in the keywords so it is mineable.
  • Not all contributions to a work end up in authorship.
  • A graph representing this scenario. Note also that we intentionally attributed Melissa on the publication, but not David. David's attribution could be inferred from the graph.
  • There are many contributors to the work presented.
    Some of the slides in this deck are directly adapted or borrowed from the above people, thank you very much.
    Maryann is currently the president of Force11.
    Phil was instrumental in helping start Force11.
    Michel is co-leading the HCLS dataset description work.
    Nicole did the research resource identification project.
    Stephanie keeps the Force in Force11.
