Aryani, Amir; Schmidt, Heinz, Research Data and the Future of Software Engineering. Australian Software Engineering Conference (ASWEC2014),
http://dx.doi.org/10.6084/m9.figshare.956086
Research Data and the Future of Software Engineering
1. Research Data and the Future of
Software Engineering
Amir Aryani
Australian National Data Service (ANDS)
Heinz Schmidt
RMIT University
2. ANDS is enabling transformation of
Australian research data
• Funded by the Australian Commonwealth Government through
the National Collaborative Research Infrastructure Strategy
(NCRIS) with the mission of transforming Australia’s research data.
• Since 2009 ANDS has spent $80 million building skills, knowledge,
services and community around data.
Amir Aryani (twitter: @amir_at_ands)
3. About eResearch@RMIT
• Established to provide IT systems, software and services
support to researchers who need high-performance computing
(HPC), high-bandwidth networks and data-intensive
collaborative spaces, such as
– Research Data Capture and Curation,
– Campus Cloud,
– Data Visualization, …
4. Research Data and the Future of
Software Engineering: Agenda
• Changing the landscape of science
– New paradigm of research
– Funding agencies and research data: NSF, Wellcome Trust, ARC
– Research Data Alliance
– Riding the Wave
• Research data management (RDM)
– RDM frameworks: policy, ethics, metadata, research life cycle
– Domain-specific repositories
• Software engineering is a key enabler
– Software curation embedded in data curation
– Software, data, model
– Research data in software engineering research: model, software, process, data
• Gap: The majority of Australian universities have no domain-specific
data curation solution for researchers in computer science and
computer engineering.
• Future work
– Formal approach to RDM in SE
– National/international infrastructure for RDM in CS and SE
– Creating a culture of data reuse and data citation
• Sharing data and open access
– Credibility and transparency
– Collaboration: better return on investment, new research opportunities
– Data citation: new citation, adding data to your resume
5. Fourth Paradigm
• A thousand years ago:
science was empirical
– describing natural phenomena
• Last few hundred years:
a theoretical branch
– using models, generalizations
• Last few decades:
a computational branch
– simulating complex phenomena
• Today:
data exploration (eScience)
– unifying theory, experiment, and
simulation
Ref: Tony Hey, Stewart Tansley, and Kristin Tolle, The fourth paradigm: data-intensive scientific discovery, Microsoft Research, 2009
Jim Gray on eScience:
A Transformed Scientific Method
6. Riding the Wave
Presented to Neelie Kroes, European Commission Vice-President for the Digital
Agenda, the report "Riding the Wave: How Europe can gain from the rising tide of
scientific data" is the result of six months of intense brainstorming and
consultations by the High-Level Group on Scientific Data.
Ref: http://ec.europa.eu/information_society/newsroom/cf/itemlongdetail.cfm?item_id=6204
7.
Investigators are expected to share
with other researchers, at no more
than incremental cost and within a
reasonable time, the primary data,
samples, physical collections and
other supporting materials created
or gathered in the course of work
under NSF grants.
Ref: http://www.nsf.gov/bfa/dias/policy/dmp.jsp
8.
Making research data widely available to
the research community in a timely and
responsible manner ensures that these
data can be verified, built upon and
used to advance knowledge…
Ref: http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm
9.
Ref: http://www.arc.gov.au/pdf/LIEF15/LE15%20Funding%20Rules.pdf
“Researchers and institutions have an
obligation to care for and maintain
research data in accordance with the
Australian Code for the Responsible
Conduct of Research (2007). The ARC
considers data management planning
an important part of the responsible
conduct of research and strongly
encourages the depositing of data
arising from a Project in an appropriate
publicly accessible subject and/or
institutional repository”.
10. Research Data Alliance
477 academics and policymakers from around the globe gathered for the
Research Data Alliance’s third plenary meeting in Dublin (March 2014).
12. Research Data
“The data, records, files or other evidence,
irrespective of their content or form (e.g. in
print, digital, physical or other forms), that
comprise research observations, findings or
outcomes, including primary materials and
analysed data.”
Monash University Research Data Policy
ANDS Guideline:
ands.org.au/guides/what-is-research-data.html
13. Research Data
(a simple perspective)
[Diagram: Research Data in relation to research input and research output. Labels: digital data; prints and forms; analysed data; logs, models and processes]
14. Why share your data?
• Credibility
• Transparency
• Collaboration
– Better return on
investment
– New research
opportunities
• Data citation
– Adding data to your
resume
15. Data Management Framework*
• Institutional policies and procedures
• IT infrastructure
(hardware & software)
• Support services
(people & advice)
• Managing metadata
*www.ands.org.au/datamanagement/overview.html
19. Software is embedded in
the research data lifecycle
[Diagram: Research Data in relation to research input and research output, with three "Software" labels added. Other labels: digital data; prints and forms; analysed data; logs, models and processes]
22. What is missing?
The majority of Australian universities have no domain
specific data curation solution to support researchers
in computer science and computer engineering.
23. Roadmap for future work
• Research: formal approach to data management in
Computer Science and Software Engineering
(CS & SE)
• Infrastructure: national/international data management
infrastructure for CS & SE
• Policy and Practice: building the culture of data citation
24. Research Problem
Formal approach (domain-based method) to data
management in software engineering
[Diagram labels: Research Data, Hypothesis, Software, Model, Process, Experiment, Results]
Opportunity: collaborative research
25. Infrastructure Gap
• National/international infrastructure for data
management in computer science and
software engineering
Requires: cross-institution collaboration
27. Last comment:
Open data = new research opportunities
Find these slides at
Twitter: @amir_at_ands
slideshare.net/amiraryani