National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Software Citation:
Principles, Discussion, and Metadata
Daniel S. Katz
Associate Director for Scientific Software & Applications, NCSA
Research Associate Professor, ECE
Research Associate Professor, GSLIS
dskatz@illinois.edu, d.katz@ieee.org, @danielskatz
Principles work: with Arfon M. Smith, Kyle E. Niemeyer & WG
Metadata work: by Matthew B. Jones and Carl Boettiger
General Motivation
• Scientific research is becoming:
• More open – scientists want to collaborate; want/need to share
• More digital – outputs such as software and data; easier to share
• Significant time spent developing software & data
• Efforts not recognized or rewarded
• Citations for papers systematically collected, metrics
built
• But not for software & data
• Hypothesis:
Better measurement of contributions (citations, impact, metrics)
—> Rewards (incentives)
—> Career paths, willingness to join communities
—> More sustainable software
Credit is a problem in Academia
http://www.phdcomics.com/comics/archive.php?comicid=562
Software citation motivation
• Research process is increasingly digital
• Research outputs/products not just papers and books
• Also software, data, slides, posters, graphs, maps, etc.
• Research knowledge is embedded in these components
• Papers too are more digital; can be executable and reproducible
• But citation system was created for papers/books
• We need to either/both
1. Jam software into current citation system
2. Rework citation system
• Focus on 1 as possible; 2 is very hard.
• Challenge: not just how to identify software in a paper
• How to identify software used within research process
Software citation today
• Software and other digital resources currently appear in
publications in very inconsistent ways
• Howison: random sample of 90 articles in the biology
literature -> 7 different ways that software was
mentioned
• Studies on data and facility citation -> similar results
J. Howison and J. Bullard. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the
Association for Information Science and Technology, 2015. In press. http://dx.doi.org/10.1002/asi.23538.
Why software citation matters
• Understanding Research Fields
• Software is a product of research, and by not citing it, we leave holes in the
record of research progress in those fields
• Academic Credit
• Academic researchers at all levels, including students, postdocs, faculty, and
staff, should be credited for the software products they develop and contribute to,
particularly when those products enable or further research done by others
• Discovering Software
• Citations enable the specific software used in a research product to be found.
Additional researchers can then use the same software for different purposes,
leading to credit for those responsible for the software
• Reproducibility
• Citation of specific software used is necessary for reproducibility, but is not
sufficient. Additional information, such as configurations and platform issues, is
also needed
Software citation principles: People
• FORCE11 Software Citation working group (Arfon Smith
& Dan Katz)
• Started July 2015, ~40 members
• Working on GitHub: https://github.com/force11/force11-scwg
• Also listed on FORCE11:
https://www.force11.org/group/software-citation-working-group
• WSSSPE3 Credit & Citation working group (Kyle
Niemeyer)
• September 2015
• http://wssspe.researchcomputing.org.uk/wssspe3/
• WSSSPE3 group merged into FORCE11 group in Oct.
• ~55 members (researchers, developers, publishers, repositories,
librarians)
Software citation principles: Process
• Review of existing community practices
• Software Sustainability Institute, WSSSPE, Project CRediT,
Ontosoft, CodeMeta
• Astronomy and astrophysics, life sciences, geosciences
• Developed use cases (collaborative via Google Doc)
• Drafted software citation principles document
• Started with data citation principles
• Updated based on software use cases and related work
• Updated based on working group discussions, community feedback
and review of draft, workshop at FORCE2016 in April
Software citation principles
• Contents (details on next slides):
• 6 principles: Importance, Credit and Attribution, Unique
Identification, Persistence, Accessibility, Specificity
• Motivation, summary of use cases, related work, and discussion
(including recommendations)
• Format: working document in GitHub, linked from
FORCE11 SCWG page, discussion has been via GitHub
issues, changes have been tracked
• https://github.com/force11/force11-scwg
Principle 1. Importance
• Software should be considered a legitimate and citable
product of research. Software citations should be
accorded the same importance in the scholarly record as
citations of other research products, such as publications
and data; they should be included in the metadata of the
citing work, for example in the reference list of a journal
article, and should not be omitted or separated. Software
should be cited on the same basis as any other research
products such as papers or books, that is, authors
should cite the appropriate set of software products just
as they cite the appropriate set of papers.
Principle 2. Credit and Attribution
• Software citations should facilitate giving scholarly credit
and normative and legal attribution to all contributors to
the software, recognizing that a single style or
mechanism of attribution may not be applicable to all
software.
Principle 3. Unique Identification
• A software citation should include a method for
identification that is machine actionable, globally unique,
interoperable, and recognized by at least a community of
the corresponding domain experts, and preferably by
general public researchers.
Principle 4. Persistence
• Unique identifiers and metadata describing the software
and its disposition should persist – even beyond the
lifespan of the software they describe.
Principle 5. Accessibility
• Software citations should permit and facilitate access to
the software itself and to its associated metadata,
documentation, data, and other materials necessary for
both humans and machines to make informed use of the
referenced software.
Principle 6. Specificity
• Software citations should facilitate identification of, and
access to, the specific version of software that was used.
Software identification should be as specific as
necessary, such as using version numbers, revision
numbers, or variants such as platforms.
Use cases
Related work
• General community
• Blogs & papers studying the issue by groups (e.g., SSI), people
(e.g., Wilson), and workshop reports (e.g., by WSSSPE and SSI)
• Domain-specific
• Work by journals to encourage software publication & citation
(e.g., TOMS, AAS, ASCL, NIH SDI, Ontosoft)
• Metadata-focused
• For citation: DOAP, Research Objects, The Software Ontology,
EDAM Ontology, Project CRediT, Ontosoft, RRR/JISC
guidelines
• Also for build/distribution: Debian package format, Python
package descriptions, R package descriptions
• CodeMeta crosswalk activity to be discussed
Discussion: What to cite
• Importance principle: “…authors should cite the
appropriate set of software products just as they cite the
appropriate set of papers”
• What software to cite decided by author(s) of product, in
context of community norms and practices
• POWL: “Do not cite standard office software (e.g. Word,
Excel) or programming languages. Provide references
only for specialized software.”
• i.e., if using different software could produce different
data or results, then the software used should be cited
Purdue Online Writing Lab. Reference List: Electronic Sources (Web Publications). https://owl.english.purdue.edu/owl/resource/560/10/, 2015.
Discussion: What to cite (citation vs
provenance & reproducibility)
• Provenance/reproducibility requirements > citation
requirements
• Citation: software important to research outcome
• Provenance: all steps (including software) in research
• For data research product, provenance data includes all
cited software, not vice versa
• Software citation principles cover minimal needs for
software citation for software identification
• Provenance & reproducibility may need more metadata
Discussion: Software papers
• Goal: Software should be cited
• Practice: Papers about software (“software papers”) are
published and cited
• Importance principle (1) and other discussion: The
software itself should be cited on the same basis as
any other research product; authors should cite the
appropriate set of software products
• Ok to cite software paper too, if it contains results
(performance, validation, etc.) that are important to the
work
• If the software authors ask users to cite software paper,
can do so, in addition to citing the software
Discussion: Derived software
• Imagine Code A is derived from Code B, and a paper
uses and cites Code A
• Should the paper also cite Code B?
• No, any research builds on other research
• Each research product just cites those products that it
directly builds on
• Together, this gives credit and knowledge chains
• Science historians study these chains
• More automated analyses may also develop, such as
transitive credit
D. S. Katz and A. M. Smith. Implementing transitive credit with JSON-LD. Journal of Open Research Software, 3:e7, 2015. http://dx.doi.org/10.5334/jors.by.
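The transitive credit idea mentioned on this slide can be illustrated with a small calculation. This is a minimal sketch, assuming each product assigns fractional credit weights (summing to 1) to its direct contributors, which may be people or other products. The product names, people, and weights below are all hypothetical and are not taken from the cited paper, and the sketch assumes the chain of products contains no cycles.

```python
from collections import defaultdict

# Hypothetical credit maps: each product splits a total weight of 1.0
# among its direct contributors (people or other products).
CREDIT_MAPS = {
    "paper": {"alice": 0.6, "code_a": 0.4},
    "code_a": {"bob": 0.75, "code_b": 0.25},
    "code_b": {"carol": 1.0},
}


def transitive_credit(product, credit_maps, weight=1.0, totals=None):
    """Push `weight` down the chain of products until it reaches people."""
    if totals is None:
        totals = defaultdict(float)
    for contributor, share in credit_maps[product].items():
        if contributor in credit_maps:
            # The contributor is itself a product: recurse with the scaled weight.
            transitive_credit(contributor, credit_maps, weight * share, totals)
        else:
            # The contributor is a person: accumulate their share.
            totals[contributor] += weight * share
    return dict(totals)


credit = transitive_credit("paper", CREDIT_MAPS)
```

In this sketch the paper lists only Code A directly, yet the author of Code B still receives a share through the chain, which is the sense in which each product only needs to cite what it directly builds on.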
Discussion: Software peer review
• Important issue for software in science
• Probably out-of-scope in citation discussion
• Goal of software citation is to identify software that has
been used in a scholarly product
• Whether or not that software has been peer-reviewed is
irrelevant
• Possible exception: if peer-review status of software is
part of software metadata
• Working group: this is not needed to identify the software
Discussion: Citations in text
• Each publisher/publication has a style it prefers
• e.g., AMS, APA, Chicago, MLA
• Examples for software using these styles published by
Lipson
• Citations typically sent to publishers as text formatted in
that citation style, not as structured metadata
• Recommendation
• All text citation styles should support:
• a) a label indicating that this is software, e.g. [Computer
program]
• b) support for version information, e.g. Version 1.8.7
C. Lipson. Cite Right, Second Edition: A Quick Guide to Citation Styles–MLA, APA, Chicago, the Sciences, Professions, and More. Chicago Guides to Writing,
Editing, and Publishing. University of Chicago Press, 2011.
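The two recommended elements, a software label and version information, can be sketched as a small text formatter. This is an illustration only: the field order and punctuation are assumptions and do not reproduce any particular published style, and the author, title, repository, and identifier values are hypothetical placeholders.

```python
def format_software_citation(authors, year, title, version, publisher, identifier):
    """Render a plain-text software citation with a type label and a version."""
    return (
        f"{authors} ({year}). {title} (Version {version}) "
        f"[Computer program]. {publisher}. {identifier}"
    )


# Hypothetical example values, used only to exercise the function.
citation = format_software_citation(
    authors="Doe, J.",
    year=2016,
    title="ExampleSolver",
    version="1.8.7",
    publisher="Example Repository",
    identifier="https://doi.org/10.5555/example.v1.8.7",
)
```

Because citations usually reach publishers as formatted text rather than structured metadata, the label and version have to appear in the string itself for a reader or a text-mining tool to recognize the entry as software.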
Discussion: Citation limits
• Software citation principles
• –> more software citations in scholarly products
• –> more overall citations
• Some journals have strict limits on
• Number of citations
• Number of pages (including references)
• Recommendations to publishers:
• Add specific instructions regarding software citations to author
guidelines to not disincentivize software citation
• Don’t include references in content counted against page limits
Discussion: Unique identification
• Recommend DOIs for identification of published software
• However, identifier can point to
1. a specific version of a piece of software
2. the piece of software (all versions of the software)
3. the latest version of a piece of software
• One piece of software may have identifiers of all 3 types
• And maybe 1+ software papers, each with identifiers
• Use cases:
• Cite a specific version
• Cite the software in general
• Cite the latest version
• Link multiple releases together, to understand all citations
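The three identifier targets and the use cases above can be sketched as a small data structure. This is a hypothetical model, not something defined by the working group: the class name, method names, and identifier strings are assumptions made for illustration, and the "latest" lookup assumes releases are registered in chronological order.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareIdentifiers:
    """Hypothetical record of the identifiers one piece of software may have."""

    concept_id: str  # identifies the software as a whole (all versions)
    version_ids: dict = field(default_factory=dict)  # version string -> identifier

    def add_release(self, version, identifier):
        self.version_ids[version] = identifier

    def resolve(self, version=None, latest=False):
        """Pick the identifier that matches the citation use case."""
        if latest:
            # Assumes releases were added in chronological order;
            # dicts preserve insertion order in Python 3.7+.
            return list(self.version_ids.values())[-1]
        if version is not None:
            return self.version_ids[version]
        return self.concept_id

    def all_ids(self):
        """Every identifier whose citations should be counted together."""
        return {self.concept_id, *self.version_ids.values()}


# Placeholder identifiers, not real DOIs.
ids = SoftwareIdentifiers(concept_id="10.5555/example")
ids.add_release("1.0.0", "10.5555/example.v1.0.0")
ids.add_release("1.1.0", "10.5555/example.v1.1.0")
```

Keeping the version identifiers attached to the general one is what makes the last use case possible: citations to any release can be grouped under the same piece of software.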
Discussion: Unique identification (cont.)
• Principles intended to apply at all levels
• To all identifier types, e.g., DOIs, RRIDs, ARKs, etc.
• Though again: when possible, recommend using DOIs that
identify specific versions of source code
• RRIDs developed by the FORCE11 Resource
Identification Initiative
• Discussed for use to identify software packages (not specific
versions)
• FORCE11 Resource Identification Technical Specifications
Working Group says “Information resources like software are
better suited to the Software Citation WG”
• Currently no consensus on RRIDs for software
Discussion: Types of software
• Principles and discussion generally focus on software as
source code
• But some software is only available as an executable, a
container, or a service
• Principles intended to apply to all these forms of
software
• Implementation of principles will differ by software type
• When software exists as both source code and another
type, cite the source code
Discussion: Access to software
• Accessibility principle: “software citations should permit
and facilitate access to the software itself”
• Metadata should provide access information
• Free software: metadata includes UID that resolves to
URL to specific version of software
• Commercial software: metadata provides information on
how to access the specific software
• E.g., company’s product number, URL to buy the software
• If software isn’t available now, it still should be cited
along with information about how it was accessed
Software citation principles: Status & next
steps
• Draft document published on FORCE11 website
• Two-week open comment period (ended May 20)
• Updated version will be published on FORCE11 website
and on archive site (e.g., F1000Research, figshare,
Zenodo)
• Endorsement period for both individuals and
organizations
• Will create infographic and 1–3 slides
• Software Citation Working Group ends
• Software Citation Implementation group starts? (work
with institutions, publishers, funders, researchers, etc. -
and write implementation examples paper?)
Journal of Open Source Software (JOSS)
• A developer-friendly journal for research software packages
• “If you've already licensed your code and have good documentation
then we expect that it should take less than an hour to prepare and
submit your paper to JOSS”
• Everything is open:
• Submitted/published paper: http://joss.theoj.org
• Code itself: where is up to the author(s)
• Reviews & process: https://github.com/openjournals/joss-reviews
• Code for the journal itself: https://github.com/openjournals/joss
• Zenodo archives JOSS papers and issues DOIs
• First paper submitted May 4
• As of 6 June: 8 accepted papers, 5 under review
CodeMeta
• Intended as a Rosetta Stone for software metadata
• NSF-funded, led by Matthew B. Jones (NCEAS) and
Carl Boettiger (UC Berkeley)
• Began with Fidgit
• A proof of concept integration between a GitHub repo and
Figshare in an effort to get a DOI for a GitHub repository
• Developed by Arfon Smith, Kay Thaney, and Mark Hahnel at
Open Science Codefest in Santa Barbara in 2014
• When a repository is tagged for release on GitHub, Fidgit will
import the release into Figshare thus giving the code bundle a
DOI
• Requires metadata about code, stored as JSON-LD
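To make "metadata about code, stored as JSON-LD" concrete, here is a minimal sketch of what such a record might look like. It uses schema.org's SoftwareSourceCode terms as a stand-in, since the slides do not show the code.jsonld vocabulary itself. The project name, repository URL, and author are hypothetical.

```python
import json

# Hypothetical metadata for a tagged release, expressed with schema.org terms.
# This is NOT the code.jsonld vocabulary, which the slides do not show.
metadata = {
    "@context": "https://schema.org",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "version": "1.0.0",
    "codeRepository": "https://github.com/example-org/example-tool",
    "license": "https://spdx.org/licenses/MIT",
    "author": [{"@type": "Person", "name": "Jane Doe"}],
}

# Serialize to the text that would be stored alongside the code,
# then parse it back to confirm nothing is lost.
document = json.dumps(metadata, indent=2)
round_trip = json.loads(document)
```

A file like this could live in the repository so that a tool importing a release into an archive has the author, version, and license information it needs to register an identifier.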
Code metadata vocabularies
• code.jsonld
• Figshare
• Zenodo
• NIH Software Discovery
Index
• Software Ontology
• Dublin Core
• R Package
• Python Distutils (PyPI)
• Debian Package
• Schema.org
• GitHub
• DataCite (Software
Entities Model)
• Trove Software Map
• Perl Module
• JavaScript package (npm)
• Java (Maven)
• Octave
• Ruby Gem
• OntoSoft
CodeMeta: build crosswalk table for these, then harmonize
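The crosswalk idea can be sketched as a table that maps a common term to the field name each vocabulary uses for it. The table below is illustrative only and is not the CodeMeta crosswalk: the common terms and the helper function are assumptions, and only four fields from three of the listed vocabularies (Python package metadata, R package DESCRIPTION, npm package.json) are shown.

```python
# Illustrative crosswalk: common term -> native field name per vocabulary.
CROSSWALK = {
    "name": {"python": "name", "r": "Package", "npm": "name"},
    "version": {"python": "version", "r": "Version", "npm": "version"},
    "license": {"python": "license", "r": "License", "npm": "license"},
    "description": {"python": "description", "r": "Description", "npm": "description"},
}


def translate(record, source, target, crosswalk=CROSSWALK):
    """Rename the fields of `record` from one vocabulary to another.

    Fields with no crosswalk entry are dropped rather than guessed.
    """
    native_to_common = {names[source]: term for term, names in crosswalk.items()}
    translated = {}
    for key, value in record.items():
        term = native_to_common.get(key)
        if term is not None:
            translated[crosswalk[term][target]] = value
    return translated


# Hypothetical R package record; "Depends" has no entry in the table above.
r_record = {
    "Package": "examplepkg",
    "Version": "0.2.1",
    "License": "MIT",
    "Depends": "R (>= 3.0)",
}
python_record = translate(r_record, source="r", target="python")
```

The dropped field shows why harmonizing follows building the table: wherever one vocabulary has a concept the others lack, the crosswalk has a gap that someone must decide how to fill.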
CodeMeta status
• Bringing together a collaboration of stakeholders in the software
archive pipeline to harmonize their disparate approaches to software
metadata
• Participants include representatives from dedicated software
archives (e.g., ASCL), general purpose repositories (e.g., Zenodo,
GitHub, figshare), domain and institutional data repositories (e.g.,
DataONE, DataVerse), open science communities (e.g., Software
Sustainability Institute, Mozilla Science Lab), librarians, tool
developers, and domain researchers.
• Other stakeholders: publishers, funders
• Guiding principles:
• driven by real world use cases
• lead to results that are as simple as possible
• be conscious of how it will impact existing practices and repositories
CodeMeta status (cont.)
• Overarching goal: produce a crosswalk among software
metadata approaches that enables software metadata
creators and repositories to communicate effectively
about software using a consensus vocabulary
• 2-day workshop in April with FORCE11:
• Made good progress on crosswalk table
• Began documenting use cases
• Software actions: Deposit, Discover, Analyze, Use, C3R3*
• Began vision paper
• Future work will provide reference implementations in
some stakeholder community repositories and
applications
*Credit, Comply, Count; Roles, Respect, Reputation
Read http://danielskatzblog.wordpress.org for more

Weitere ähnliche Inhalte

Was ist angesagt?

Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in AcademiaDaniel S. Katz
 
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...Sean Cleveland Ph.D.
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and PracticeDaniel S. Katz
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...Andrew Sallans
 
Bioinformatic core facilities discussion
Bioinformatic core facilities discussionBioinformatic core facilities discussion
Bioinformatic core facilities discussionJennifer Shelton
 
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...Andrew Sallans
 
Enhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning ProcessEnhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning ProcessUniversity of California Curation Center
 
Gathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate ImpactGathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate ImpactIUPUI
 
Methods for measuring citizen-science impact
Methods for measuring citizen-science impactMethods for measuring citizen-science impact
Methods for measuring citizen-science impactLuigi Ceccaroni
 
NSF Data Management Plan Case Study: UVa’s Response.
NSF Data Management Plan Case Study:  UVa’s Response.NSF Data Management Plan Case Study:  UVa’s Response.
NSF Data Management Plan Case Study: UVa’s Response.Andrew Sallans
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansAndrew Sallans
 
Your Systematic Review: Getting Started
Your Systematic Review: Getting StartedYour Systematic Review: Getting Started
Your Systematic Review: Getting StartedElaine Lasda
 
SGCI HICSS50 Presentation
SGCI HICSS50 PresentationSGCI HICSS50 Presentation
SGCI HICSS50 Presentationmaytaldahan
 
Hands-On Data Management Planning for Life Sciences
Hands-On Data Management Planning for Life SciencesHands-On Data Management Planning for Life Sciences
Hands-On Data Management Planning for Life SciencesAndrew Sallans
 

Was ist angesagt? (20)

Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in Academia
 
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and Practice
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
 
Bioinformatic core facilities discussion
Bioinformatic core facilities discussionBioinformatic core facilities discussion
Bioinformatic core facilities discussion
 
Zucca "Technology & Systems"
Zucca "Technology & Systems"Zucca "Technology & Systems"
Zucca "Technology & Systems"
 
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...Improving Integrity, Transparency, and Reproducibility Through Connection of ...
Improving Integrity, Transparency, and Reproducibility Through Connection of ...
 
Snowball Metrics: University-owned Benchmarking to Reveal Strengths within Al...
Snowball Metrics: University-owned Benchmarking to Reveal Strengths within Al...Snowball Metrics: University-owned Benchmarking to Reveal Strengths within Al...
Snowball Metrics: University-owned Benchmarking to Reveal Strengths within Al...
 
NISO Altmetrics Initiative: A Project Update - Martin Fenner, Technical Lead ...
NISO Altmetrics Initiative: A Project Update - Martin Fenner, Technical Lead ...NISO Altmetrics Initiative: A Project Update - Martin Fenner, Technical Lead ...
NISO Altmetrics Initiative: A Project Update - Martin Fenner, Technical Lead ...
 
Enhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning ProcessEnhancing DMPTool: Further Streamlineing Data Mangement Planning Process
Enhancing DMPTool: Further Streamlineing Data Mangement Planning Process
 
Gathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate ImpactGathering Evidence to Demonstrate Impact
Gathering Evidence to Demonstrate Impact
 
Methods for measuring citizen-science impact
Methods for measuring citizen-science impactMethods for measuring citizen-science impact
Methods for measuring citizen-science impact
 
NSF Data Management Plan Case Study: UVa’s Response.
NSF Data Management Plan Case Study:  UVa’s Response.NSF Data Management Plan Case Study:  UVa’s Response.
NSF Data Management Plan Case Study: UVa’s Response.
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
NSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for LibrariansNSF Data Management Plan - Implications for Librarians
NSF Data Management Plan - Implications for Librarians
 
Assessing and Reporting Research Impact – A Role for the Library - Kristi L....
Assessing and Reporting Research Impact – A Role for the Library  - Kristi L....Assessing and Reporting Research Impact – A Role for the Library  - Kristi L....
Assessing and Reporting Research Impact – A Role for the Library - Kristi L....
 
Your Systematic Review: Getting Started
Your Systematic Review: Getting StartedYour Systematic Review: Getting Started
Your Systematic Review: Getting Started
 
SGCI HICSS50 Presentation
SGCI HICSS50 PresentationSGCI HICSS50 Presentation
SGCI HICSS50 Presentation
 
Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis
 
Hands-On Data Management Planning for Life Sciences
Hands-On Data Management Planning for Life SciencesHands-On Data Management Planning for Life Sciences
Hands-On Data Management Planning for Life Sciences
 

Ă„hnlich wie 20160607 citation4software panel

Software citation
Software citationSoftware citation
Software citationDaniel S. Katz
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCarole Goble
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsDaniel S. Katz
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Daniel S. Katz
 
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014dreusser
 
Software Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental ScanSoftware Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental ScanMicah Altman
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
Online Journal Management using Open Journal Systems (OJS)
Online Journal Management using Open Journal Systems (OJS)Online Journal Management using Open Journal Systems (OJS)
Online Journal Management using Open Journal Systems (OJS)Ina Smith
 
ufsojs-161024084446 (1).pdf
ufsojs-161024084446 (1).pdfufsojs-161024084446 (1).pdf
ufsojs-161024084446 (1).pdfTeshome Oljira
 
Software Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanSoftware Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanMicah Altman
 
Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine JonesJisc RDM
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...Daniel S. Katz
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSADaniel S. Katz
 
Software as a Well-Formed Research Object
Software as a Well-Formed Research ObjectSoftware as a Well-Formed Research Object
Software as a Well-Formed Research ObjectYasmin AlNoamany, PhD
 
Software: impact, metrics, and citation
Software: impact, metrics, and citationSoftware: impact, metrics, and citation
Software: impact, metrics, and citationDaniel S. Katz
 
20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-Hong20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-HongLancaster University Library
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data ManagementAnita de Waard
 
Planning and writing your documents - Software documentation
Planning and writing your documents - Software documentationPlanning and writing your documents - Software documentation
Planning and writing your documents - Software documentationRa'Fat Al-Msie'deen
 
Tracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsTracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsETH-Bibliothek
 
Systematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping StudiesSystematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping Studiesalessio_ferrari
 

Ă„hnlich wie 20160607 citation4software panel (20)

Software citation
Software citationSoftware citation
Software citation
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groups
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
20160607 citation4software panel

  • 1. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Software Citation: Principles, Discussion, and Metadata Daniel S. Katz Associate Director for Scientific Software & Applications, NCSA Research Associate Professor, ECE Research Associate Professor, GSLIS dskatz@illinois.edu, d.katz@ieee.org, @danielskatz Principles work: with Arfon M. Smith, Kyle E. Niemeyer & WG Metadata work: by Matthew B. Jones and Carl Boettiger
  • 2. General Motivation • Scientific research is becoming: • More open – scientists want to collaborate; want/need to share • More digital – outputs such as software and data; easier to share • Significant time spent developing software & data • Efforts not recognized or rewarded • Citations for papers systematically collected, metrics built • But not for software & data • Hypothesis: Better measurement of contributions (citations, impact, metrics) —> Rewards (incentives) —> Career paths, willingness to join communities —> More sustainable software
  • 3. Credit is a problem in Academia http://www.phdcomics.com/comics/archive.php?comicid=562
  • 4. Software citation motivation • Research process is increasingly digital • Research outputs/products not just papers and books • Also software, data, slides, posters, graphs, maps, etc. • Research knowledge is embedded in these components • Papers too are more digital; can be executable and reproducible • But citation system was created for papers/books • We need to either/both 1. Jam software into current citation system 2. Rework citation system • Focus on 1 as possible; 2 is very hard. • Challenge: not just how to identify software in a paper • How to identify software used within research process
  • 5. Software citation today • Software and other digital resources currently appear in publications in very inconsistent ways • Howison: random sample of 90 articles in the biology literature -> 7 different ways that software was mentioned • Studies on data and facility citation -> similar results J. Howison and J. Bullard. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the Association for Information Science and Technology, 2015. In press. http://dx.doi.org/10.1002/asi.23538.
  • 6. Why software citation matters • Understanding Research Fields • Software is a product of research, and by not citing it, we leave holes in the record of research progress in those fields • Academic Credit • Academic researchers at all levels, including students, postdocs, faculty, and staff, should be credited for the software products they develop and contribute to, particularly when those products enable or further research done by others • Discovering Software • Citations enable the specific software used in a research product to be found. Additional researchers can then use the same software for different purposes, leading to credit for those responsible for the software • Reproducibility • Citation of specific software used is necessary for reproducibility, but is not sufficient. Additional information such as configurations and platform issues are also needed
  • 7. Software citation principles: People • FORCE11 Software Citation working group (Arfon Smith & Dan Katz) • Started July 2015, ~40 members • Working on GitHub: https://github.com/force11/force11-scwg • Also listed on FORCE11: https://www.force11.org/group/software-citation-working-group • WSSSPE3 Credit & Citation working group (Kyle Niemeyer) • September 2015 • http://wssspe.researchcomputing.org.uk/wssspe3/ • WSSSPE3 group merged into FORCE11 group in Oct. • ~55 members (researchers, developers, publishers, repositories, librarians)
  • 8. Software citation principles: Process • Review of existing community practices • Software Sustainability Institute, WSSSPE, Project CRediT, Ontosoft, CodeMeta • Astronomy and astrophysics, life sciences, geosciences • Developed use cases (collaborative via Google Doc) • Drafted software citation principles document • Started with data citation principles • Updated based on software use cases and related work • Updated based on working group discussions, community feedback and review of draft, workshop at FORCE2016 in April
  • 9. Software citation principles • Contents (details on next slides): • 6 principles: Importance, Credit and Attribution, Unique Identification, Persistence, Accessibility, Specificity • Motivation, summary of use cases, related work, and discussion (including recommendations) • Format: working document in GitHub, linked from FORCE11 SCWG page, discussion has been via GitHub issues, changes have been tracked • https://github.com/force11/force11-scwg
  • 10. Principle 1. Importance • Software should be considered a legitimate and citable product of research. Software citations should be accorded the same importance in the scholarly record as citations of other research products, such as publications and data; they should be included in the metadata of the citing work, for example in the reference list of a journal article, and should not be omitted or separated. Software should be cited on the same basis as any other research products such as papers or books, that is, authors should cite the appropriate set of software products just as they cite the appropriate set of papers.
  • 11. Principle 2. Credit and Attribution • Software citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the software, recognizing that a single style or mechanism of attribution may not be applicable to all software.
  • 12. Principle 3. Unique Identification • A software citation should include a method for identification that is machine actionable, globally unique, interoperable, and recognized by at least a community of the corresponding domain experts, and preferably by general public researchers.
  • 13. Principle 4. Persistence • Unique identifiers and metadata describing the software and its disposition should persist – even beyond the lifespan of the software they describe.
  • 14. Principle 5. Accessibility • Software citations should permit and facilitate access to the software itself and to its associated metadata, documentation, data, and other materials necessary for both humans and machines to make informed use of the referenced software.
  • 15. Principle 6. Specificity • Software citations should facilitate identification of, and access to, the specific version of software that was used. Software identification should be as specific as necessary, such as using version numbers, revision numbers, or variants such as platforms.
  • 17. Related work • General community • Blogs & papers studying the issue by groups (e.g., SSI), people (e.g., Wilson), and workshop reports (e.g., by WSSSPE and SSI) • Domain-specific • Work by journals to encourage software publication & citation (e.g., TOMS, AAS, ASCL, NIH SDI, Ontosoft) • Metadata-focused • For citation: DOAP, Research Objects, The Software Ontology, EDAM Ontology, Project CRediT, Ontosoft, RRR/JISC guidelines • Also for build/distribution: Debian package format, Python package descriptions, R package descriptions • CodeMeta crosswalk activity to be discussed
  • 18. Discussion: What to cite • Importance principle: “…authors should cite the appropriate set of software products just as they cite the appropriate set of papers” • What software to cite decided by author(s) of product, in context of community norms and practices • POWL: “Do not cite standard office software (e.g. Word, Excel) or programming languages. Provide references only for specialized software.” • i.e., if using different software could produce different data or results, then the software used should be cited Purdue Online Writing Lab. Reference List: Electronic Sources (Web Publications). https://owl.english.purdue.edu/owl/resource/560/10/, 2015.
  • 19. Discussion: What to cite (citation vs provenance & reproducibility) • Provenance/reproducibility requirements > citation requirements • Citation: software important to research outcome • Provenance: all steps (including software) in research • For data research product, provenance data includes all cited software, not vice versa • Software citation principles cover minimal needs for software citation for software identification • Provenance & reproducibility may need more metadata
  • 20. Discussion: Software papers • Goal: Software should be cited • Practice: Papers about software (“software papers”) are published and cited • Importance principle (1) and other discussion: The software itself should be cited on the same basis as any other research product; authors should cite the appropriate set of software products • Ok to cite software paper too, if it contains results (performance, validation, etc.) that are important to the work • If the software authors ask users to cite software paper, can do so, in addition to citing the software
  • 21. Discussion: Derived software • Imagine Code A is derived from Code B, and a paper uses and cites Code A • Should the paper also cite Code B? • No, any research builds on other research • Each research product just cites those products that it directly builds on • Together, this gives credit and knowledge chains • Science historians study these chains • More automated analyses may also develop, such as transitive credit D. S. Katz and A. M. Smith. Implementing transitive credit with JSON-LD. Journal of Open Research Software, 3:e7, 2015. http://dx.doi.org/10.5334/jors.by.
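The derived-software scenario on slide 21 can be sketched as a small credit-propagation example. This is a simplified illustration of the transitive credit idea, not the JSON-LD implementation from the cited paper: the credit maps, the weights, and the names `CREDIT` and `transitive_credit` are all hypothetical, and the sketch assumes the dependency graph has no cycles.

```python
# Hypothetical credit maps: each product splits a total weight of 1.0
# among people and the products it directly builds on.
CREDIT = {
    "paper": {"alice": 0.75, "code_a": 0.25},   # the paper cites only Code A
    "code_a": {"bob": 0.5, "code_b": 0.5},      # Code A is derived from Code B
    "code_b": {"carol": 1.0},
}


def transitive_credit(product, weight=1.0, totals=None):
    """Accumulate each person's share of `product`, following direct links only."""
    if totals is None:
        totals = {}
    for name, share in CREDIT[product].items():
        if name in CREDIT:
            # Another product: pass the scaled credit along the chain.
            transitive_credit(name, weight * share, totals)
        else:
            # A person: record the credit that reached them.
            totals[name] = totals.get(name, 0.0) + weight * share
    return totals


print(transitive_credit("paper"))
```

In this sketch the paper never cites Code B directly, yet Code B's author still receives a share through Code A, which is the chain the slide describes.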
  • 22. Discussion: Software peer review • Important issue for software in science • Probably out-of-scope in citation discussion • Goal of software citation is to identify software that has been used in a scholarly product • Whether or not that software has been peer-reviewed is irrelevant • Possible exception: if peer-review status of software is part of software metadata • Working group: this is not needed to identify the software
  • 23. Discussion: Citations in text • Each publisher/publication has a style it prefers • e.g., AMS, APA, Chicago, MLA • Examples for software using these styles published by Lipson • Citations typically sent to publishers as text formatted in that citation style, not as structured metadata • Recommendation • All text citation styles should support: • a) a label indicating that this is software, e.g. [Computer program] • b) support for version information, e.g. Version 1.8.7 C. Lipson. Cite Right, Second Edition: A Quick Guide to Citation Styles–MLA, APA, Chicago, the Sciences, Professions, and More. Chicago Guides to Writing, Editing, and Publishing. University of Chicago Press, 2011.
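The two recommendations on slide 23 (a software label and version information) can be illustrated with a minimal formatter. The field names, the output layout, and the example record are assumptions made for illustration; they do not reproduce any particular publisher style, and the identifier shown is a placeholder rather than a real DOI.

```python
from dataclasses import dataclass


@dataclass
class SoftwareRef:
    """Hypothetical minimal record for one cited piece of software."""
    authors: str
    year: int
    title: str
    version: str
    identifier: str  # e.g. a DOI URL; a placeholder is used below


def format_citation(ref: SoftwareRef) -> str:
    """Render a plain-text citation carrying a software label and a version."""
    return (
        f"{ref.authors} ({ref.year}). {ref.title} "
        f"(Version {ref.version}) [Computer program]. {ref.identifier}"
    )


example = SoftwareRef(
    authors="Doe, J.",
    year=2016,
    title="ExampleTool",
    version="1.8.7",
    identifier="https://doi.org/10.xxxx/example",  # placeholder, not a real DOI
)
print(format_citation(example))
```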
  • 24. Discussion: Citation limits • Software citation principles • –> more software citations in scholarly products • –> more overall citations • Some journals have strict limits on • Number of citations • Number of pages (including references) • Recommendations to publishers: • Add specific instructions regarding software citations to author guidelines to not disincentivize software citation • Don’t include references in content counted against page limits
  • 25. Discussion: Unique identification • Recommend DOIs for identification of published software • However, identifier can point to 1. a specific version of a piece of software 2. the piece of software (all versions of the software) 3. the latest version of a piece of software • One piece of software may have identifiers of all 3 types • And maybe 1+ software papers, each with identifiers • Use cases: • Cite a specific version • Cite the software in general • Cite the latest version • Link multiple releases together, to understand all citations
  • 26. Discussion: Unique identification (cont.) • Principles intended to apply at all levels • To all identifier types, e.g., DOIs, RRIDs, ARKs, etc. • Though again: recommend when possible use DOIs that identify specific versions of source code • RRIDs developed by the FORCE11 Resource Identification Initiative • Discussed for use to identify software packages (not specific versions) • FORCE11 Resource Identification Technical Specifications Working Group says “Information resources like software are better suited to the Software Citation WG” • Currently no consensus on RRIDs for software
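The three identifier targets from slides 25 and 26, and the use case of linking releases to understand all citations, can be sketched as follows. The identifier strings, the `REGISTRY` table, and the function name are hypothetical; a real system would resolve DOIs or other identifiers through their registries rather than a local dictionary.

```python
from collections import Counter
from enum import Enum


class Target(Enum):
    """What an identifier points to, following the three cases on the slide."""
    VERSION = "specific version"
    CONCEPT = "all versions"
    LATEST = "latest version"


# Hypothetical identifiers, each mapped to (software name, target type).
REGISTRY = {
    "id:tool-v1.0": ("tool", Target.VERSION),
    "id:tool-v1.1": ("tool", Target.VERSION),
    "id:tool": ("tool", Target.CONCEPT),
    "id:tool-latest": ("tool", Target.LATEST),
}


def citations_per_software(cited_ids):
    """Roll citations of any identifier type up to the software they refer to."""
    counts = Counter()
    for ident in cited_ids:
        software, _target = REGISTRY[ident]
        counts[software] += 1
    return counts


cited = ["id:tool-v1.0", "id:tool", "id:tool-v1.1", "id:tool-latest"]
print(citations_per_software(cited))
```

Keeping the target type alongside each identifier lets the same table answer both questions: which exact version was used, and how often the software as a whole was cited.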
  • 27. Discussion: Types of software • Principles and discussion generally focus on software as source code • But some software is only available as an executable, a container, or a service • Principles intended to apply to all these forms of software • Implementation of principles will differ by software type • When software exists as both source code and another type, cite the source code
  • 28. Discussion: Access to software • Accessibility principle: “software citations should permit and facilitate access to the software itself” • Metadata should provide access information • Free software: metadata includes UID that resolves to URL to specific version of software • Commercial software: metadata provides information on how to access the specific software • E.g., company’s product number, URL to buy the software • If software isn’t available now, it still should be cited along with information about how it was accessed
  • 29. Software citation principles: Status & next steps • Draft document published on FORCE11 website • Two week open comment period (ended May 20) • Updated version will be published on FORCE11 website and on archive site (e.g., F1000Research, figshare, Zenodo) • Endorsement period for both individuals and organizations • Will create infographic and 1–3 slides • Software Citation Working Group ends • Software Citation Implementation group starts? (work with institutions, publishers, funders, researchers, etc. - and write implementation examples paper?)
  • 30. Journal of Open Source Software (JOSS) • A developer friendly journal for research software packages • “If you've already licensed your code and have good documentation then we expect that it should take less than an hour to prepare and submit your paper to JOSS” • Everything is open: • Submitted/published paper: http://joss.theoj.org • Code itself: where is up to the author(s) • Reviews & process: https://github.com/openjournals/joss-reviews • Code for the journal itself: https://github.com/openjournals/joss • Zenodo archives JOSS papers and issues DOIs • First paper submitted May 4 • As of 6 June: 8 accepted papers, 5 under review
  • 31. CodeMeta • Intended as a Rosetta Stone for software metadata • NSF-funded, led by Matthew B. Jones (NCEAS) and Carl Boettiger (UC Berkeley) • Began with Fidgit • A proof of concept integration between a GitHub repo and Figshare in an effort to get a DOI for a GitHub repository • Developed by Arfon Smith, Kay Thaney, and Mark Hahnel at Open Science Codefest in Santa Barbara in 2014 • When a repository is tagged for release on GitHub Fidgit will import the release into Figshare thus giving the code bundle a DOI • Requires metadata about code, stored as JSON-LD
  • 32. Code metadata vocabularies • code.jsonld • Figshare • Zenodo • NIH Software Discovery Index • Software Ontology • Dublin Core • R Package • Python Distutils (PyPI) • Debian Package • Schema.org • GitHub • DataCite (Software Entities Model) • Trove Software Map • Perl Module • JavaScript package (npm) • Java (Maven) • Octave • Ruby Gem • OntoSoft • CodeMeta: build crosswalk table for these, then harmonize
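The "crosswalk, then harmonize" idea can be sketched as a lookup table from each vocabulary's field names to one shared set of terms. The table below is illustrative only and is not the actual CodeMeta crosswalk: the source field names follow npm's `package.json` and Python package metadata, while the shared terms (`name`, `version`, `description`, `license`, `url`) and the function name `harmonize` are assumptions.

```python
# Illustrative crosswalk: source field name -> shared term.
CROSSWALK = {
    "npm": {
        "name": "name",
        "version": "version",
        "description": "description",
        "license": "license",
        "homepage": "url",
    },
    "pypi": {
        "Name": "name",
        "Version": "version",
        "Summary": "description",
        "License": "license",
        "Home-page": "url",
    },
}


def harmonize(source: str, metadata: dict) -> dict:
    """Rename known fields to the shared terms; drop unmapped fields."""
    mapping = CROSSWALK[source]
    return {mapping[key]: value for key, value in metadata.items() if key in mapping}


# Placeholder metadata for the same hypothetical package in two ecosystems.
npm_meta = {"name": "example", "version": "1.0.0", "homepage": "https://example.org"}
pypi_meta = {"Name": "example", "Version": "1.0.0", "Home-page": "https://example.org"}
```

With this table, `harmonize("npm", npm_meta)` and `harmonize("pypi", pypi_meta)` return the same dictionary, which is the property a crosswalk is meant to provide.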
  • 33. CodeMeta status • Bringing together a collaboration of stakeholders in the software archive pipeline to harmonize their disparate approaches to software metadata • Participants include representatives from dedicated software archives (e.g., ASCL), general purpose repositories (e.g., Zenodo, GitHub, figshare), domain and institutional data repositories (e.g., DataONE, Dataverse), open science communities (e.g., Software Sustainability Institute, Mozilla Science Lab), librarians, tool developers, and domain researchers • Other stakeholders: publishers, funders • Guiding principles: • driven by real-world use cases • lead to results that are as simple as possible • be conscious of how the work will impact existing practices and repositories
  • 34. CodeMeta status • Overarching goal: produce a crosswalk among software metadata approaches that enables software metadata creators and repositories to communicate effectively about software using a consensus vocabulary • 2-day workshop in April with FORCE11: • Made good progress on crosswalk table • Began documenting use cases • Software actions: Deposit, Discover, Analyze, Use, C3R3* • Began vision paper • Future work will provide reference implementations in some stakeholder community repositories and applications • *C3R3: Credit, Comply, Count; Roles, Respect, Reputation
  • 35. Software Citation: Principles, Discussion, and Metadata • Daniel S. Katz (dskatz@illinois.edu, d.katz@ieee.org, @danielskatz) • Read http://danielskatzblog.wordpress.org for more