SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
Introducing the Whole Tale Project:
Merging Science and Cyberinfrastructure Pathways
Bertram Ludäscher Victoria Stodden Matt Turk
Kyle Chard (U Chicago), Niall Gaffney (TACC), Matt Jones (UCSB),
Jarek Nabrzyski (Notre Dame), Kandace Turner (NCSA)
CIRSS Seminar
September 2, 2016
Problems Facing Data Researchers
Workflow for data research is fragmented
● Data comes from many sources and is “integrated
the old fashioned way” e.g. via chains of email
● Use a collection of cloud services copying data
from Dropbox and Box to local storage with a
distributed directory structures to organize (and
provide discovery) to data
● Actions taken on data are not recorded (custom
scripts, some version of a community developed
and supported codebase)
● Publication of final data as prescribed by a Data
Management Plan (hopefully with a DOI) with link in
publications gives no reproducibility
Whole Tale ~ Whole Story (research à publication)
~ Long Tail of Science (Lil’Data/MPC) + Big Data/HPC
Whole Tale (WT) Big Picture
• WT will leverage & contribute
to existing CI and tools to
support the whole science
story (= run-to-pub-cycle), and
providing access to big CI &
HPC for long tail researchers.
➡ Integrated tools to simplify
usage and promote best
practices.
• NSF CC*DNI DIBBS:
– 5 Institutions, 5 Years ($5M total)
– Cooperative Agreement
WT Project Org & Working Groups
Whole	Tale:	Merging	Science	&	CI	Pathways	
…	through	Working	Groups!
Working	Groups	Driving	Use	
Cases	and		Adoption
Working	Groups	to	Provide	Key	
Components
GSLIS Research Showcase
K.	Bocinsky,	T.	Kohler,	A	2000-year	reconstruction	of	the	rain-fed	
maize	agricultural	niche	in	the	US	Southwest.	Nature
Communications.	doi:10.1038/ncomms6618
Map	showing	the	"selected"	trees	for	reconstructing	
precipitation	at	four	sites	in	theCAR regression	
approach	(Correlation-Adjusted	corRelation).
Reconstructions	
for	AD	1247	
Science Pathways: Archaeology
Joining data from different environments to enable new research:
• Streamline gathering, integrating, and analyzing environmental data
needed to build up a fuller picture of the paleoclimate.
• Enables access and interrogation of data from DataONE, iPlant, and
the Long Term Ecological Research Network (LTER), leveraging
Globus On-line, Brown Dog’s data tilling services, RStudio, and
XSEDE resources.
• Enables access to the Digital Archeological Record (tDAR), both
through a native API and via a tDAR member node in DataONE.
The CI Side of this Science
Pathways (Archaeology)
Reproducible Science Example:
Paleoclimate Reconstruction
Science paper (OA) uses:
• open source code:
– R, PaleoCAR, …
• multiple tree-ring databases
• HPC resources
• Example WG Goal:
– Reproducibility study using
• YesWorkflow toolkit:
Workflow & provenance from
code
• Jupyter notebook
Science Pathways: Astronomy
• Researcher A uses university credentials to access large cosmological
simulation outputs from Blue Waters published into WT, does analysis
using Whole Tale services in a Jupyter Notebook, and creates a new
result. With the publication, user creates a DOI linking data and source
code used to generate data tied to original input data and references
this in his reviewed and published research paper.
• Another researcher finds the DOI, and is able to access data and
analysis to then compare model output with new observations from the
Hobby Eberly Telescope Dark Energy experiment on TACC systems.
Results are shared with the original author and a new DOI is created
for these results.
• Enabling direct analysis and
collaborative research on simulation
outputs stored in Whole Tale
enabled repositories via user-
supplied Python scripts.
• YT (yt-project.org), will provide
advanced, customizable analysis
and visualization, leveraging Jupyter
for provide the scripting support.
• Federation will allow jobs to move to
data or visa versa where appropriate
Science Pathways: Astronomy
The Whole Tale’s Approach
WT will integrate well established CI components creating a simple and
unified environment to use, share, and publish data and workflows
1. Unified Authentication via Globus Auth
2. Abstracted Storage Layer with a unified namespace
3. Integrated Python and R APIs integrated with Jupyter Notebook Environments
4. Ingest and publication service linking data, computations, and scholarly articles
5. OwnCloud desktop integration for “Dropbox like interface”
6. Event System to react to changes (e.g. new data published)
7. Data Dashboard to ease data management and service interactions
• Capture full workflow via Notebooks, scripts, and applications to be
published along with Data and Research publications
WT Dashboard
• Web-based interface to enable ‘live
articles’ and research repeatability by
enabling the execution of research
methods on data using NDS labs
Docker containers and notebooks
(Jupyter).
• Research methods: provided support
for running python scripts
• Research Data: interfaced with NDS
Labs Python API - connecting the
desktop to the NDS Labs data
storage mechanisms (iRODS,
Dropbox, Google Drive, SciDrive and
local file integration)
• Provide a Docker “diff” tarball for
downloading research run results
Demonstrated	@	SC	2014
http://ndspilot.com/nds/ndspilot1080p.mp4
Base Share Integrate Reproduce Operation
• Ingest	data	from	
HTTP,	Globus,	and	
DataONE
• Store	data	in	a	
private	cloud	based	
home	directory
• Move	and	manage	
data	in	iRODS
• Interact	with	data	
using	Jupyter
• Manage	data	
across	ownClound
&	iRODS
• Authenticate	
using	ORCID
• Interact	with	data	
through	a	suite	of	
frontends
• Automatically	
extract	key	
metadata
• Search	and	manage	
distributed	data	
from	within	
frontends
• Operate	on	remote	
data	as	if	it	were	
local	(including	
using	OAI-ORE)
• Utilize	a	single	
identity	across	
services	
• Discover	and	
share	frontends	
through	global	
repository
• Integrate	data	and	
workflows	with	
publications
• Issue,	resolve,	and	
track	identifiers	
for	distributed	
data
• Discover	data	using	
federated	and	
distributed	queries
• Track	provenance	
across	services
• Organize	data	
collections	via	user-
defined		
namespaces
High-level Milestones, Phases
Working with the NDS and Others
WT is an NDS Pilot Project
– Will develop and integrate system as
components in NDS Labs
– Work with communities to create
powerful environments tailored for the
communities needs
WT will work with NDS to provide status and
train other users
– NDS Meeting Hackathons
– Cooperation and collaboration with
other projects in NDS
WT will collaborate with other projects (DIBBs,
BDHub, RDA…)
è Get involved! (e.g., Working Groups,
Hackathons, Summer Internship) wholetale.org
Part 2
Victoria Stodden
WT “Philosophy”
• Why a platform?
• Computation is near-ubiquitious in
research, yet we have few best practices
or dissemination standards
• And it’s complex! Small changes in a
computational implementation or in the
data can have a dramatic impact on the
result.
Some Bold Assertions
• Software is used as a tool of discovery in nearly all
research today.
• When software is a key part of the discovery
process, it should be subject to the same
philosophy of transparency as any method.
• Software is an integral and inseparable
component of the computational infrastructure in
which most research takes place.
• Computational research is embedded in a social
structure which includes many stakeholders.
Example: Facebook Study
Dissemination is Incomplete
• Publications are missing details that are
necessary to understand and verify
published findings…
• => credibility crisis in computational
science
• How to reproduce the result? What’s
needed?
Computational Reproducibility
Traditionally two branches to the scientific method:
• Branch 1 (deductive): mathematics, formal logic,
• Branch 2 (empirical): statistical analysis of
controlled experiments.
Now, new branches due to technological changes?
• Branch 3,4? (computational): large scale
simulations / data driven computational science.
The Ubiquity of Error
The central motivation for the scientific method is to
root out error:
• Deductive branch: the well-defined concept of the
proof,
• Empirical branch: the machinery of hypothesis
testing, appropriate statistical methods, structured
communication of methods and protocols.
Claim: Computation presents only a potential
third/fourth branch of the scientific method, until
the development of comparable standards.
Whole Tale?
Proposed Solution:
• Capture computational steps / provide
compute environment
• Provide unique identifiers to
data/code/workflows associated with results
• Provide links to embed in the publication for
discoverability
• Preserve digital scholarly objects
So it looks pretty simple..
• What about big data?
• Complex codes?
• Reuse and bug fixes?
• Meta-analysis?
• Working with external groups, such as publishers?
• Incentives? What if they don’t come?
• Allocating resources? Sustainability models?
• What does citation mean and how are
contributions to be rewarded?
Incompleteness
• “I ran all the stuff and it’s still the wrong
answer!”
• “I got a different result, using your code
and data!”
• “Your code doesn’t work!”
• “Where’s all the documentation? I can’t
figure this thing out.”
Part 3 (Demo)
Matt Turk

Weitere ähnliche Inhalte

Was ist angesagt?

The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference LinkingHerbert Van de Sompel
 
Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Globus
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsAnubhav Jain
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Scott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data PublishingScott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data PublishingGigaScience, BGI Hong Kong
 
GlobusWorld 2015
GlobusWorld 2015GlobusWorld 2015
GlobusWorld 2015Tanu Malik
 
Using Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-SwitchboardUsing Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-Switchboardamiraryani
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsAnubhav Jain
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationIan Foster
 
Research Automation for Data-Driven Discovery
Research Automationfor Data-Driven DiscoveryResearch Automationfor Data-Driven Discovery
Research Automation for Data-Driven DiscoveryGlobus
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?Paul Groth
 
Understanding WeboNaver
Understanding WeboNaverUnderstanding WeboNaver
Understanding WeboNaverHan Woo PARK
 
TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...BrianDeCost
 
An Overview of Bionimbus (March 2010)
An Overview of Bionimbus (March 2010)An Overview of Bionimbus (March 2010)
An Overview of Bionimbus (March 2010)Robert Grossman
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalWaqas Tariq
 

Was ist angesagt? (20)

The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
 
FAIRy Stories
FAIRy StoriesFAIRy Stories
FAIRy Stories
 
Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects Sciunits: Reusable Research Objects
Sciunits: Reusable Research Objects
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
DuraMat Data Management and Analytics
DuraMat Data Management and AnalyticsDuraMat Data Management and Analytics
DuraMat Data Management and Analytics
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Scott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data PublishingScott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data Publishing
 
UPennONS
UPennONSUPennONS
UPennONS
 
Braintalk cuso nm
Braintalk cuso nmBraintalk cuso nm
Braintalk cuso nm
 
GlobusWorld 2015
GlobusWorld 2015GlobusWorld 2015
GlobusWorld 2015
 
Using Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-SwitchboardUsing Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-Switchboard
 
Open-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data setsOpen-source tools for generating and analyzing large materials data sets
Open-source tools for generating and analyzing large materials data sets
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Research Automation for Data-Driven Discovery
Research Automationfor Data-Driven DiscoveryResearch Automationfor Data-Driven Discovery
Research Automation for Data-Driven Discovery
 
More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?More ways of symbol grounding for knowledge graphs?
More ways of symbol grounding for knowledge graphs?
 
Understanding WeboNaver
Understanding WeboNaverUnderstanding WeboNaver
Understanding WeboNaver
 
TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...TMS workshop on machine learning in materials science: Intro to deep learning...
TMS workshop on machine learning in materials science: Intro to deep learning...
 
MUDROD - Ranking
MUDROD - RankingMUDROD - Ranking
MUDROD - Ranking
 
An Overview of Bionimbus (March 2010)
An Overview of Bionimbus (March 2010)An Overview of Bionimbus (March 2010)
An Overview of Bionimbus (March 2010)
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information Retrieval
 

Ähnlich wie Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure Pathways

Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynoteCarole Goble
 
Open Data (and Software, and other Research Artefacts) - A proper management
Open Data (and Software, and other Research Artefacts) -A proper managementOpen Data (and Software, and other Research Artefacts) -A proper management
Open Data (and Software, and other Research Artefacts) - A proper management Oscar Corcho
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureXiaogang (Marshall) Ma
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research PaperAnita de Waard
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumAnita de Waard
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Paul Groth
 
Research Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibilityResearch Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibilityOscar Corcho
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of PublishingAnita de Waard
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...BigData_Europe
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13DataDryad
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseAnita de Waard
 
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...OpenAIRE
 
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Paolo Manghi
 
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Anita de Waard
 
Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Jeroen Rombouts
 

Ähnlich wie Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure Pathways (20)

A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
 
Open Data (and Software, and other Research Artefacts) - A proper management
Open Data (and Software, and other Research Artefacts) -A proper managementOpen Data (and Software, and other Research Artefacts) -A proper management
Open Data (and Software, and other Research Artefacts) - A proper management
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
 
How to Execute A Research Paper
How to Execute A Research PaperHow to Execute A Research Paper
How to Execute A Research Paper
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
Data management
Data management Data management
Data management
 
Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.Data Communities - reusable data in and outside your organization.
Data Communities - reusable data in and outside your organization.
 
Research Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibilityResearch Objects for improved sharing and reproducibility
Research Objects for improved sharing and reproducibility
 
Big Data and the Future of Publishing
Big Data and the Future of PublishingBig Data and the Future of Publishing
Big Data and the Future of Publishing
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
 
Shifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data ProviderShifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data Provider
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...Enabling better science: Results and vision of the OpenAIRE infrastructure an...
Enabling better science: Results and vision of the OpenAIRE infrastructure an...
 
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...Enabling better science - Results and vision of the OpenAIRE infrastructure a...
Enabling better science - Results and vision of the OpenAIRE infrastructure a...
 
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10
 

Mehr von Bertram Ludäscher

Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family ReunionGames, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family ReunionBertram Ludäscher
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Bertram Ludäscher
 
[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database RulesBertram Ludäscher
 
[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database RulesBertram Ludäscher
 
Answering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsAnswering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsBertram Ludäscher
 
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Bertram Ludäscher
 
Which Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueWhich Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueBertram Ludäscher
 
From Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesFrom Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesBertram Ludäscher
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesBertram Ludäscher
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsBertram Ludäscher
 
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseDeduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseBertram Ludäscher
 
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...Bertram Ludäscher
 
Dissecting Reproducibility: A case study with ecological niche models in th...
Dissecting Reproducibility:  A case study with ecological niche models  in th...Dissecting Reproducibility:  A case study with ecological niche models  in th...
Dissecting Reproducibility: A case study with ecological niche models in th...Bertram Ludäscher
 
Incremental Recomputation: Those who cannot remember the past are condemned ...
Incremental Recomputation:  Those who cannot remember the past are condemned ...Incremental Recomputation:  Those who cannot remember the past are condemned ...
Incremental Recomputation: Those who cannot remember the past are condemned ...Bertram Ludäscher
 
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsValidation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsBertram Ludäscher
 
An ontology-driven framework for data transformation in scientific workflows
An ontology-driven framework for data transformation in scientific workflowsAn ontology-driven framework for data transformation in scientific workflows
An ontology-driven framework for data transformation in scientific workflowsBertram Ludäscher
 
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachKnowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachBertram Ludäscher
 
Whole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchWhole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchBertram Ludäscher
 
ETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatBertram Ludäscher
 

Mehr von Bertram Ludäscher (20)

Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family ReunionGames, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
 
[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules
 
[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules
 
Answering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsAnswering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query Patterns
 
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
 
Which Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueWhich Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A Dialogue
 
From Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesFrom Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science Tales
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science Tales
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
 
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseDeduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
 
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
 
Dissecting Reproducibility: A case study with ecological niche models in th...
Dissecting Reproducibility:  A case study with ecological niche models  in th...Dissecting Reproducibility:  A case study with ecological niche models  in th...
Dissecting Reproducibility: A case study with ecological niche models in th...
 
Incremental Recomputation: Those who cannot remember the past are condemned ...
Incremental Recomputation:  Those who cannot remember the past are condemned ...Incremental Recomputation:  Those who cannot remember the past are condemned ...
Incremental Recomputation: Those who cannot remember the past are condemned ...
 
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsValidation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
 
An ontology-driven framework for data transformation in scientific workflows
An ontology-driven framework for data transformation in scientific workflowsAn ontology-driven framework for data transformation in scientific workflows
An ontology-driven framework for data transformation in scientific workflows
 
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachKnowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
 
Whole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchWhole-Tale: The Experience of Research
Whole-Tale: The Experience of Research
 
ETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatETC & Authors in the Driver's Seat
ETC & Authors in the Driver's Seat
 

Kürzlich hochgeladen

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...HyderabadDolls
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangeThinkInnovation
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfSayantanBiswas37
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...kumargunjan9515
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...kumargunjan9515
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxronsairoathenadugay
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...kumargunjan9515
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...HyderabadDolls
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...HyderabadDolls
 

Kürzlich hochgeladen (20)

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 

Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure Pathways

  • 1. Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure Pathways Bertram Ludäscher Victoria Stodden Matt Turk Kyle Chard (U Chicago), Niall Gaffney (TACC), Matt Jones (UCSB), Jarek Nabrzyski (Notre Dame), Kandace Turner (NCSA) CIRSS Seminar September 2, 2016
  • 2. Problems Facing Data Researchers Workflow for data research is fragmented ● Data comes from many sources and is “integrated the old fashioned way” e.g. via chains of email ● Use a collection of cloud services copying data from Dropbox and Box to local storage with a distributed directory structures to organize (and provide discovery) to data ● Actions taken on data are not recorded (custom scripts, some version of a community developed and supported codebase) ● Publication of final data as prescribed by a Data Management Plan (hopefully with a DOI) with link in publications gives no reproducibility
  • 3. Whole Tale ~ Whole Story (research à publication) ~ Long Tail of Science (Lil’Data/MPC) + Big Data/HPC
  • 4. Whole Tale (WT) Big Picture • WT will leverage & contribute to existing CI and tools to support the whole science story (= run-to-pub-cycle), and providing access to big CI & HPC for long tail researchers. ➡ Integrated tools to simplify usage and promote best practices. • NSF CC*DNI DIBBS: – 5 Institutions, 5 Years ($5M total) – Cooperative Agreement
  • 5. WT Project Org & Working Groups Whole Tale: Merging Science & CI Pathways … through Working Groups! Working Groups Driving Use Cases and Adoption Working Groups to Provide Key Components
  • 7. Joining data from different environments to enable new research: • Streamline gathering, integrating, and analyzing environmental data needed to build up a fuller picture of the paleoclimate. • Enables access and interrogation of data from DataONE, iPlant, and the Long Term Ecological Research Network (LTER), leveraging Globus On-line, Brown Dog’s data tilling services, RStudio, and XSEDE resources. • Enables access to the Digital Archeological Record (tDAR), both through a native API and via a tDAR member node in DataONE. The CI Side of this Science Pathways (Archaeology)
  • 8. Reproducible Science Example: Paleoclimate Reconstruction Science paper (OA) uses: • open source code: – R, PaleoCAR, … • multiple tree-ring databases • HPC resources • Example WG Goal: – Reproducibility study using • YesWorkflow toolkit: Workflow & provenance from code • Jupyter notebook
  • 9. Science Pathways: Astronomy • Researcher A uses university credentials to access large cosmological simulation outputs from Blue Waters published into WT, does analysis using Whole Tale services in a Jupyter Notebook, and creates a new result. With the publication, user creates a DOI linking data and source code used to generate data tied to original input data and references this in his reviewed and published research paper. • Another researcher finds the DOI, and is able to access data and analysis to then compare model output with new observations from the Hobby Eberly Telescope Dark Energy experiment on TACC systems. Results are shared with the original author and a new DOI is created for these results.
  • 10. • Enabling direct analysis and collaborative research on simulation outputs stored in Whole Tale enabled repositories via user- supplied Python scripts. • YT (yt-project.org), will provide advanced, customizable analysis and visualization, leveraging Jupyter for provide the scripting support. • Federation will allow jobs to move to data or visa versa where appropriate Science Pathways: Astronomy
  • 11. The Whole Tale’s Approach WT will integrate well established CI components creating a simple and unified environment to use, share, and publish data and workflows 1. Unified Authentication via Globus Auth 2. Abstracted Storage Layer with a unified namespace 3. Integrated Python and R APIs integrated with Jupyter Notebook Environments 4. Ingest and publication service linking data, computations, and scholarly articles 5. OwnCloud desktop integration for “Dropbox like interface” 6. Event System to react to changes (e.g. new data published) 7. Data Dashboard to ease data management and service interactions • Capture full workflow via Notebooks, scripts, and applications to be published along with Data and Research publications
  • 12. WT Dashboard • Web-based interface to enable ‘live articles’ and research repeatability by enabling the execution of research methods on data using NDS labs Docker containers and notebooks (Jupyter). • Research methods: provided support for running python scripts • Research Data: interfaced with NDS Labs Python API - connecting the desktop to the NDS Labs data storage mechanisms (iRODS, Dropbox, Google Drive, SciDrive and local file integration) • Provide a Docker “diff” tarball for downloading research run results Demonstrated @ SC 2014 http://ndspilot.com/nds/ndspilot1080p.mp4
  • 13. Base Share Integrate Reproduce Operation • Ingest data from HTTP, Globus, and DataONE • Store data in a private cloud based home directory • Move and manage data in iRODS • Interact with data using Jupyter • Manage data across ownClound & iRODS • Authenticate using ORCID • Interact with data through a suite of frontends • Automatically extract key metadata • Search and manage distributed data from within frontends • Operate on remote data as if it were local (including using OAI-ORE) • Utilize a single identity across services • Discover and share frontends through global repository • Integrate data and workflows with publications • Issue, resolve, and track identifiers for distributed data • Discover data using federated and distributed queries • Track provenance across services • Organize data collections via user- defined namespaces High-level Milestones, Phases
  • 14. Working with the NDS and Others WT is an NDS Pilot Project – Will develop and integrate system as components in NDS Labs – Work with communities to create powerful environments tailored for the communities needs WT will work with NDS to provide status and train other users – NDS Meeting Hackathons – Cooperation and collaboration with other projects in NDS WT will collaborate with other projects (DIBBs, BDHub, RDA…) è Get involved! (e.g., Working Groups, Hackathons, Summer Internship) wholetale.org
  • 16. WT “Philosophy” • Why a platform? • Computation is near-ubiquitious in research, yet we have few best practices or dissemination standards • And it’s complex! Small changes in a computational implementation or in the data can have a dramatic impact on the result.
  • 17. Some Bold Assertions • Software is used as a tool of discovery in nearly all research today. • When software is a key part of the discovery process, it should be subject to the same philosophy of transparency as any method. • Software is an integral and inseparable component of the computational infrastructure in which most research takes place. • Computational research is embedded in a social structure which includes many stakeholders.
  • 19. Dissemination is Incomplete • Publications are missing details that are necessary to understand and verify published findings… • => credibility crisis in computational science • How to reproduce the result? What’s needed?
  • 20. Computational Reproducibility Traditionally two branches to the scientific method: • Branch 1 (deductive): mathematics, formal logic, • Branch 2 (empirical): statistical analysis of controlled experiments. Now, new branches due to technological changes? • Branch 3,4? (computational): large scale simulations / data driven computational science.
  • 21. The Ubiquity of Error The central motivation for the scientific method is to root out error: • Deductive branch: the well-defined concept of the proof, • Empirical branch: the machinery of hypothesis testing, appropriate statistical methods, structured communication of methods and protocols. Claim: Computation presents only a potential third/fourth branch of the scientific method, until the development of comparable standards.
  • 22. Whole Tale? Proposed Solution: • Capture computational steps / provide compute environment • Provide unique identifiers to data/code/workflows associated with results • Provide links to embed in the publication for discoverability • Preserve digital scholarly objects
  • 23. So it looks pretty simple.. • What about big data? • Complex codes? • Reuse and bug fixes? • Meta-analysis? • Working with external groups, such as publishers? • Incentives? What if they don’t come? • Allocating resources? Sustainability models? • What does citation mean and how are contributions to be rewarded?
  • 24. Incompleteness • “I ran all the stuff and it’s still the wrong answer!” • “I got a different result, using your code and data!” • “Your code doesn’t work!” • “Where’s all the documentation? I can’t figure this thing out.”