The document presents a vision for the future of scientific journals in which research data and results are better connected and tracked. Key points of the vision include:
1) Each research item carries metadata and links to other data items, which improves tracking and reuse.
2) Tools for workflows, authoring, reviewing, and publishing are needed that integrate research data and provenance into the scientific process.
3) Standards, social practices, repositories, and publishing systems must also evolve to fully realize this vision of better-connected research activities and outputs.
8. What is the problem?
1. Researchers can’t keep track of their data.
2. Data is not stored in a way that is easy for authors.
3. For readers, article text is not linked to the underlying data.
15. The Vision (work done with Ed Hovy, Phil Bourne,
Gully Burns and Cartic Ramakrishnan)
1. Research: Each item in the system has metadata
(including provenance) and relations to other data items
added to it.
2. Workflow: All data items created in the lab are added
to a (lab-owned) workflow system.
3. Authoring: A paper is written in an authoring tool which
can pull data with provenance from the workflow tool in the
appropriate representation into the document.
4. Editing and review: Once the co-authors agree, the
paper is ‘exposed’ to the editors, who in turn expose it to
reviewers. Reports are stored in the authoring/editing
system, the paper gets updated, until it is validated.
5. Publishing and distribution: When a paper is
published, a collection of validated information is
exposed to the world. It remains connected to its related
data item, and its heritage can be traced.
6. User applications: distributed applications run on this
‘exposed data’ universe.
[Figure: data items tagged with metadata; sample article text "Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-"; an Edit / Review / Revise cycle; "Some other publisher".]
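The idea in steps 1, 2 and 5 (every item carries provenance, lives in a lab-owned workflow system, and has a traceable heritage) can be sketched in a few lines. This is only an illustration under stated assumptions: the `DataItem` and `Workflow` classes, their field names, and the item ids are hypothetical and do not come from any tool named in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataItem:
    """A research item with minimal provenance metadata (hypothetical model)."""
    item_id: str
    kind: str            # e.g. "raw-data", "figure", "paper"
    created_by: str      # provenance: who produced the item
    derived_from: List[str] = field(default_factory=list)  # ids of parent items


class Workflow:
    """A toy, in-memory stand-in for a lab-owned workflow system."""

    def __init__(self) -> None:
        self._items: Dict[str, DataItem] = {}

    def add(self, item: DataItem) -> None:
        self._items[item.item_id] = item

    def heritage(self, item_id: str) -> List[str]:
        """Return the ids of all ancestors of an item, nearest first."""
        seen: List[str] = []
        queue = list(self._items[item_id].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.append(parent)
            queue.extend(self._items[parent].derived_from)
        return seen


# Assumed example data: raw measurements -> figure -> paper.
lab = Workflow()
lab.add(DataItem("raw-1", "raw-data", "lab technician"))
lab.add(DataItem("fig-2", "figure", "first author", derived_from=["raw-1"]))
lab.add(DataItem("paper-1", "paper", "first author", derived_from=["fig-2"]))

print(lab.heritage("paper-1"))  # ['fig-2', 'raw-1']
```

Following `derived_from` links from the paper back to the raw data is what "click on fig 2 to see underlying data" would rely on in such a system.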
28. What is needed to get there?
A. Workflow tools: Linked-data-based workflow tools for all
sciences: scalable, safe, and user-friendly (tool builders)
B. Authoring and reviewing tools: that enable use of rich
and provenance-tracked elements (tool builders)
C. Metadata standards: Standards that allow exchange of
information on any knowledge item created in a lab,
including provenance/privacy/IPR rights (standards bodies)
D. Social change: Scientists who store, track and annotate
their work (institutes, funding bodies, individuals)
E. Semantic/Linked Data XML repositories (publishers)
F. Publishing systems that run application servers (publishers)
32. A. Workflow tools are emerging
http://VisTrails.org
http://MyExperiment.org
http://wings.isi.edu/
34. B. Authoring ‘ecosystems’: SWAN
[Figure: SWAN Semantic Relationships. Node types: person, group, comment, concept, claim, hypothesis, gene, publication. Relations: annotates, authoredBy, authorOf, makes, hasEvidence, shareWith, discussedIn, describes. Sources: Excel file, PDFs, MSWORD file; public and private items. Slide by Tim Clark.]
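A graph of this kind can be held as subject-predicate-object triples. The sketch below is only an illustration: the predicate names (`authoredBy`, `makes`, `hasEvidence`, `annotates`) are taken from the slide, but the specific nodes, their ids, and which node links to which are assumptions made up for the example, and the helper functions are hypothetical rather than part of SWAN.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Assumed example statements; only the predicate names come from the slide.
triples: List[Triple] = [
    ("publication:1", "authoredBy", "person:alice"),
    ("publication:1", "makes", "claim:1"),
    ("claim:1", "hasEvidence", "publication:2"),
    ("comment:1", "annotates", "claim:1"),
    ("comment:1", "authoredBy", "person:bob"),
]


def objects(statements: List[Triple], subject: str, predicate: str) -> List[str]:
    """All objects o such that (subject, predicate, o) is stated."""
    return [o for s, p, o in statements if s == subject and p == predicate]


def subjects(statements: List[Triple], predicate: str, obj: str) -> List[str]:
    """All subjects s such that (s, predicate, obj) is stated."""
    return [s for s, p, o in statements if p == predicate and o == obj]


# What supports claim:1, and who has commented on it?
print(objects(triples, "claim:1", "hasEvidence"))   # ['publication:2']
print(subjects(triples, "annotates", "claim:1"))    # ['comment:1']
```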
35. C. Metadata: HCLS SiG Scientific Discourse
http://esw.w3.org/HCLSIG/SWANSIOC:
Project Description
Provide a Semantic Web platform for biomedical discourse which can be
evolved over time into a more general facility for many types of scientific
discourse, and which is linked to key biological categories specified by
ontologies.
Discourse categories should include research questions, scientific
assertions or claims, hypotheses, comments and discussion, experiments,
data, publications, citations, and evidence.
Our primary scientific use cases will be derived from problems in digital
scientific communications and web-based research collaboratories
supporting research in neurological disorders and therapies.
The scientific use cases will motivate a series of informatics use cases
which can later be generalized across wider areas of biology and
medicine.
36. C. Metadata: SWAN
[Figure: The Knowledge Ecosystem: Interlocking Cycles of Research. Labels: Create/modify hypothesis, Perform experiment, Collect data, Draw conclusions, Communicate, Gather info, Synthesize, SWAN. Slide by Tim Clark.]
37. C. Metadata: Annotation Ontology
[Figure: example annotation. pav:createdBy: http://www.ht.org/foaf.rdf#me (rdf:type foaf:person); pav:createdOn: June 1, 2010; ann:annotates: http://anyurl.com/sf_pat01.html; hasTag: atomic tag "Linear skull fracture"; hasTopic: FMA:skull; ann:context: InitEndCornerSelector (rdfs:subClassOf ImageSelector) with init (304, 507) and end (380, 618), onDocument. Slide by Tim Clark.]
Other annotations on the same document:
1. Atomic annotation on image (tag: “hematoma”)
2. General annotation (tag: “injury”)
Other annotations on similar documents:
1. General annotation (tag: “skull fracture”)
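The annotation in this figure can be approximated as a small record with an image-region selector. This is a sketch, not the Annotation Ontology's actual data model: the class and field names below are hypothetical Python stand-ins, and the pairing of the "Linear skull fracture" label with the FMA:skull topic is an assumption; the URLs, date and corner coordinates are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InitEndCornerSelector:
    """Rectangular image region given by two opposite corners (pixels)."""
    init: Tuple[int, int]
    end: Tuple[int, int]

    def contains(self, x: int, y: int) -> bool:
        (x0, y0), (x1, y1) = self.init, self.end
        return min(x0, x1) <= x <= max(x0, x1) and min(y0, y1) <= y <= max(y0, y1)


@dataclass(frozen=True)
class Annotation:
    annotates: str                   # URL of the annotated document
    created_by: str                  # URI identifying the annotator
    created_on: str                  # ISO 8601 date
    topic: str                       # ontology term
    label: str                       # human-readable tag (assumed pairing)
    context: InitEndCornerSelector   # where on the image the tag applies


ann = Annotation(
    annotates="http://anyurl.com/sf_pat01.html",
    created_by="http://www.ht.org/foaf.rdf#me",
    created_on="2010-06-01",
    topic="FMA:skull",
    label="Linear skull fracture",
    context=InitEndCornerSelector(init=(304, 507), end=(380, 618)),
)

print(ann.context.contains(350, 550))  # True: inside the selected region
print(ann.context.contains(10, 10))    # False: outside it
```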
44. D. Linked Data: E.g. for Elsevier
this says: <ce:section id=#123> mice like cheese
  said @anita on May 31 2010
    but we all know she was jetlagged then
immutable, $$, proprietary vs. dynamic, personal, task-driven, open?
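The layering on this slide (a piece of content, a statement about who said it, and a remark about that statement) amounts to statements that point at other statements. A minimal sketch, assuming a hypothetical `Statement` class with an `about` link; it is not a representation used by any publisher system:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    text: str
    about: Optional["Statement"] = None  # the statement this one comments on


# The three layers shown on the slide, innermost first.
content = Statement("<ce:section id=#123> mice like cheese")
attribution = Statement("said @anita on May 31 2010", about=content)
caveat = Statement("but we all know she was jetlagged then", about=attribution)


def chain(statement: Statement) -> List[str]:
    """Walk from a statement down to the content it ultimately refers to."""
    texts: List[str] = []
    current: Optional[Statement] = statement
    while current is not None:
        texts.append(current.text)
        current = current.about
    return texts


for depth, text in enumerate(chain(caveat)):
    print("  " * depth + text)
```

Keeping each layer as a separate object lets the underlying content stay fixed while attributions and remarks are added or withdrawn independently.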
51. D. What to link? Semantic annotation grid
Granularity: collection, document, claim, triple, entity, measure
Moment: author/editor, typesetter/production, reader/data mining
Means: manual, semi-automated, automated
Examples placed on the grid: Automated Copy Editing, Reflect
52. D. A start: XMP RDF metadata in all our PDFs, using Dublin Core (DC) + PRISM
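For readers unfamiliar with XMP, the sketch below builds a small RDF/XML fragment with Dublin Core and PRISM fields using only the Python standard library. It illustrates the general shape of such metadata, not the publisher's actual packets: the function name `build_xmp`, the choice of fields, and all values (title, author, DOI) are assumptions, and the `<?xpacket ...?>` wrapper that surrounds XMP inside a real PDF is omitted.

```python
import xml.etree.ElementTree as ET

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Use readable prefixes instead of ns0, ns1, ... when serializing.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix: str, local: str) -> str:
    """Qualified tag name in ElementTree's {namespace}local form."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(title: str, creators: list, doi: str) -> str:
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # In XMP, dc:title is a language alternative (rdf:Alt).
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list (rdf:Seq).
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in creators:
        ET.SubElement(seq, q("rdf", "li")).text = name

    # A PRISM field carrying the article identifier (placeholder DOI).
    ET.SubElement(desc, q("prism", "doi")).text = doi
    return ET.tostring(xmpmeta, encoding="unicode")


packet = build_xmp("Example article", ["A. Author"], "10.1000/example")
print(packet)
```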
55. Next Steps:
• Fall 2010: ‘Beyond the PDF’, workshop organized by Phil Bourne @UCSD:
– Take one paper from his group
– And all data that went into making that paper
– Including all correspondence, raw data, etc.
– Challenge: how better to represent that?
• 2010 - 2011: Try to gather resources, current efforts, etc. on virtual platform
• August 2011: FoRC: Future of Research Communications
– Dagstuhl Workshop
– Involve key people (include funding bodies, libraries, institutions) to see where bottlenecks are
• Start using these tools and writing this way!