The Past, Present and Future of data


Opening keynote for NZ eResearch Symposium 2010. http://www.eresearch.org.nz/. Discusses the past, present and future of data.




1. The Past, Present and Future of Data Dr Andrew Treloar Director of Technology Australian National Data Service

2. THE PAST

10.

11.

12.

13.

14. THE PRESENT

15. Inconvenient data DOI: 10.1098/rsta.2005.1569

16. Imprisoned data DOI: 10.1098/rsta.2006.1793

17. Invisible data DOI: 10.1098/rsta.2006.1793

18. Inaccessible data

19. Ignored negative data

20. Why re-use data? • Efficiency • Validation • Integrity • Value for money • Self-interest

21.

22.

23. Cancer Micro-array trials

24. Piwowar et al., “Sharing Detailed Research Data Is Associated with Increased Citation Rate” http://www.plosone.org/article/info:doi/10.1371/journal.pone.0000308

25. Climate Archæology de la Mare, William K., 1997, "Abrupt mid-twentieth-century decline in Antarctic sea-ice extent from whaling records", Nature, vol. 389, pp. 87-90, 4 Sept 1997

26. ANDS

27.

28. Australian National Data Service (ANDS) • An initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy ($A24M) and the Super Science Initiative ($A48M) • A collaboration between Monash University, the Australian National University and CSIRO • Nearly 50 staff, funded to mid 2013 • More researchers re-using more data more often • Data as a first-class object • ands.org.au

29. Unmanaged

30. Managed

31. Disconnected

32. Connected

33. Invisible

34. Findable

35. Single use

36. Reusable

37.

38.

39. THE FUTURE

40. “The future is already here – it’s just not very evenly distributed” William Gibson

41.

42.

43. Create: Open Notebook Science • http://usefulchem.wikispaces.com/malaria • http://www.ourexperiment.org/racemic_pzq • http://www.infotoday.com/it/sep10/Poynder.shtml

44. Describe, Store: PIC Cloud Demo • http://www.polarcommons.org/ • http://piccloud.arcs.org.au/piccloud/

45. Discover, Access: RDA Demo • http://www.google.com/ • http://services.ands.org.au/pages/

46. Identify: Journal Demo • http://dx.doi.org/10.1016/j.yqres.2010.04.004 • “Elsevier and PANGAEA (Publishing Network for Geoscientific & Environmental Data) announced their next step in interconnecting the diverse elements of scientific research. Elsevier articles at ScienceDirect are now enriched with graphical information linking to associated research data sets that are deposited at PANGAEA. This enrichment functionality offers a blueprint of how Elsevier would like to work with data set repositories all over the world [emphasis added].” http://newsbreaks.infotoday.com/Digest/Elsevier-Enriches-Articles-With-Research-Data-Sets-69148.asp

47. CONCLUSION

48. 2001 • http://www.youtube.com/watch?annotation_id=annotation_701469&v=TSW69UwxKbU&feature=iv – 5:04 through 6:00

49. Acknowledgements • http://www.flickr.com/photos/shashwat/1215492062 • http://www.flickr.com/photos/carbonnyc/3160378286 • http://jpkc.fimmu.com/sfzx/new/Upload/20091024163545503.JPG • NASA/courtesy of nasaimages.org • http://www.pri.kyoto-u.ac.jp/press/20090716/bossou_chimpanzee_stone-tool_use.jpg • http://www.flickr.com/photos/13238706@N00/136830103/ • http://www.flickr.com/photos/mplemmon/215790292/ • http://www.utexas.edu/features/archive/2003/vase.html • http://www.flickr.com/photos/steveharris/84026155/ • Clip of 2001 shown in accordance with section 47(2) of Copyright Act 1994 No 143 (as at 07 July 2010) – http://legislation.govt.nz/act/public/1994/0143/latest/DLM345972.html#DLM345972

50. Questions/Links • ands.org.au • services.ands.org.au • andrew.treloar@ands.org.au • andrew.treloar.net

Editor's Notes

Start with a question. What is the difference between these?
And these?
Thanks to machines like these, we now know that at the genetic level
It’s only 1% of this. But that’s just genetics. What about culture?
We now know that a range of species (including crows!) are tool users, and they pass on particular techniques. Think of this as a chimpanzee tutorial…
But this sort of transmission of culture doesn’t transcend either time or space. You need to be in the same time and place to learn.
For our species one of the big breakthroughs was the development of language. This now allowed for easier transmission than show and tell, but still didn’t address the time and space problem. So, where am I going with this? To data of course…
These are data from 7,000 BCE. Each token is a particular value. Initially they were used on their own (a bit like coins today).
Then around 4,000 BCE we see the emergence of these: bullae. Explain: seal (identify), signs for what is traded, contents as tokens. Essentially the first written contracts.
To avoid having to literally break the contract to see what numbers it contained, the next step was to provide a representation on the outside. Then in 2900 BCE some genius made the crucial conceptual leap: if we have the numbers in symbolic form on the outside, do we need them in physical form on the inside? Answer: No, and so we get clay tablets. And those strange marks next to the numbers? The very beginnings of pictographic writing…
And then very quickly, the first libraries. I don’t have time to cover the entire history of writing, but just want to make the point that writing came from the need to capture and manage data. Or to put it another way, much of what we regard as civilisation started with accounting. Any accounting graduates in the audience?
So, let’s fast forward about 45 centuries to the present and look at the state of data in scholarly communication. Unfortunately, it’s inconvenient, imprisoned, invisible, inaccessible, and ignored
Need to retype
Near impossible to liberate. Talk about ChemXSeer example and DataThief Java application
Too transformed
Discipline scientist may know how to get these data but I don’t
Only journal like this I know. Anecdotal evidence that it is hard to get negative papers published. All of the above problems are really about difficulties in getting to the data so it can be re-used. But why would you want to re-use data?
NOTE: Some of these arguments are at individual, national, global level. Efficiency – don’t reinvent the wheel. Validation – repeatability of research. Integrity – of the scholarly record. Value for money – public money funded it, so it should be available to the public (ClimateGate!). Self-interest – sharing with a future self, greater visibility. So, what are some good stories around data sharing?
The Hubble Space Telescope (HST) has been operating since 1990. Observations are proposed and, if accepted, data is collected and made available to the proposers, who then write a research paper. Each year around 1,000 proposals are reviewed and approximately 200 are selected, for a total of 20,000 individual observations. Data is stored at the Space Telescope Science Institute and made available after an embargo period.
GO = General Observer program; AR = Archival Research program
From Wikipedia: “A DNA microarray is a multiplex technology used in molecular biology. It consists of an arrayed series of thousands of microscopic spots of DNA oligonucleotides, called features, each containing picomoles (10⁻¹² moles) of a specific DNA sequence, known as probes (or reporters). These can be a short section of a gene or other DNA element that are used to hybridize a cDNA or cRNA sample (called target) under high-stringency conditions. Probe-target hybridization is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets to determine relative abundance of nucleic acid sequences in the target. Since an array can contain tens of thousands of probes, a microarray experiment can accomplish many genetic tests in parallel. Therefore arrays have dramatically accelerated many types of investigation.”
Heather Piwowar looked at the citation history of cancer microarray clinical trial publications. She found that publicly available data was associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin.
Climate researchers need to be able to run their models forward (forecasting) and backwards (backcasting) to check they are correct. The southern limit of whaling is constrained by sea ice, and since 1931 whaling records have been collected for every whale caught. This paper took these records and used them as a proxy for the position of the sea-ice edge. De la Mare's analysis indicates that the Antarctic summer sea-ice edge moved southwards by 2.8° of latitude between the mid 1950s and early 1970s. This suggests a decline in the area covered by sea ice of some 25%.
A number of initiatives around the world are working to do a better job on data: NSF DataNet (Bill at end of conference), JISC Managing Research Data, NL SURF/DANS.
I want to talk about one from New Zealand’s West Island…
So, how are we doing this? We’ve got a whole series of programs of activity, but one way to visualise the infrastructure that is needed is to distinguish…
The current picture for Australian (and other) research data. From…
The components that ANDS is adding to produce the ARDC
So, if that is a partial view of the present (Bill will tell you more tomorrow, I’m sure), what about the future?
Talk about ANDS being a founding member of DataCite. TIB in Germany was another and is providing the data DOIs for this example.
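For anyone following up the DOIs quoted on the slides, here is a minimal sketch of turning a DOI as written ("DOI: 10.1098/…") into a resolvable link. The helper name `doi_to_url` is hypothetical and only for illustration; it uses the public resolver at https://doi.org/, whereas the slides use the older http://dx.doi.org/ form. This is not how ANDS or DataCite mint or resolve identifiers, just a way to build the link.

```python
from urllib.parse import quote

# Public DOI resolver (assumption: preferred over the older dx.doi.org form on the slides).
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(raw: str) -> str:
    """Hypothetical helper: turn a DOI as written on a slide into a resolver URL."""
    doi = raw.strip()
    # Drop an optional leading "DOI" or "DOI:" label, case-insensitively.
    if doi.lower().startswith("doi"):
        doi = doi[3:].lstrip(": ").strip()
    # Every DOI starts with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {raw!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/" as-is.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    for slide_doi in (
        "DOI: 10.1098/rsta.2005.1569",
        "DOI 10.1098/rsta.2006.1793",
        "10.1016/j.yqres.2010.04.004",
    ):
        print(doi_to_url(slide_doi))
```

The point of the demo on this slide is the same one the sketch makes: a DOI is a stable name, and the location it resolves to can change without breaking the citation.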
So, to conclude: The need to manage data is not just a modern problem – it drove crucial developments in Western civilisation nearly 9,000 years ago. For most of the last two hundred years, data has largely been the neglected stepdaughter in scholarly communication, eclipsed by its more glamorous sister the journal article, and I’ve reviewed some of the attendant problems arising from this. Two things are driving a change in this approach: the shift to more data-intensive research, and growth in information systems that can better manage and make available the underlying data. I showed you some of the bits of the future that are starting to appear – forerunners of the way the research world might look for many disciplines in the next 10-20 years. Or to put it another way, data is what helped to make it possible to go from this [click]
Thanks to all those who made their images available under CC licensing for re-use [click]
And thank you for the opportunity to speak to you this morning.
