2. On Science Research Data
Jeffrey Lancaster, Ph.D.
Emerging Technologies Coordinator
Columbia University Libraries
jeffrey.lancaster@columbia.edu
@j_lancaster
3. How do you feel about being ever-increasingly bombarded by data?
15. The Research Workflow
[Diagram, adapted from Laura Croft @ Nature]
Stages: IDEA (big or small); RESEARCH PLAN; EXPERIMENTS; ANALYSIS & RESULTS; FUNDING; PEOPLE; PUBLISH; JOB & FUNDING
Labels: Applications? Plausible? Useful? Novel?; New Questions; New Knowledge; Patents; Background; Justification; Conferences; Community; Conversation; Data; Analysis; Confirmation; Reagents; Protocols; Learning; Up to Speed; Researchers; Students; Justification; Grants; Discussion; Conferences; Talks; Articles; Thesis; Talks
18. Case Study 1: I’m on a Boat!
MCS Acquisition
• Syntrak 960-24
• SSI Seisnet active tape emulation
Hydrophone arrays
• Sentry solid cable
• 12.5 meter groups
• 150 m sections
• up to four towed
• separation 50 - 150 meters
Source Arrays
• 4 x 10 gun strings
• 9 active, one spare / string
• 15 meter string length
• 1650 cu. in. per string
Source Controller
• DigiShot
MCS geometry sensors
• Digicourse 5011 Compassbirds
• Digicourse Digirange
• Tailbuoy GPS
• Source GPS (1 per string)
MCS Navigation
• Concept Systems, Ltd Spectra, Sprint, Reflex
MCS QC
• Syntrak
• SeisNet
• ProMaxx
• Focus
Communications
• HighSeasNet
• Inmarsat Sailor 500 FleetBroadband
• Iridium Sailor Satellite Phone
Multibeam / Echosounder
• Kongsberg EM122 1° x 1°
• Knudsen 3260 Echosounder
Marine Mammals Observation/Mitigation
• Seiche Passive Acoustic Monitoring Streamer
• 2 x Fujinon Big Eye Binoculars
General
• Bell BGM-3 Gravimeter
• Geometrics 882 Magnetometer
• RDI 75 kHz ADCP
• Stbd Side A frame
• Telescoping Stern Boom
• Sippican Mk21 Expendable Probe Launcher
• Teflon-lined Uncontaminated Seawater System
• Seabird SBE21 Thermosalinograph
• LDEO PCO2
• RM Young Weather Station
19. Activity: Spreadsheets
What do you observe about the data?
Can you describe the experiment that was being done?
What did the researcher do well?
What can be improved in how the data is kept/shared?
21. Activity: Lab Notebooks
What do you observe about the data?
Can you describe the experiment that was being done?
What did the researcher do well?
What can be improved in how the data is kept/shared?
22. Case Study 3: Needle in a Haystack
http://core-genomics.blogspot.com/2012/05/resources-for-public-understanding-of.html
23. Big Data + Data Science
CERN:
approx. 1 PB/sec = 1000 TB/sec = 1000000 GB/sec
filtered to 1 GB/sec
http://arstechnica.com/science/2010/08/lhc-computing-grid-pushes-petabytes-of-data-beats-expectations/
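The unit conversion on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, as the slide's own figures imply. The one-day total at the end is a purely illustrative assumption (continuous running for 24 hours) and is not a figure from the slide or the linked article.

```python
# Decimal (SI) storage units, matching the slide: 1 PB = 1000 TB, 1 TB = 1000 GB.
TB_PER_PB = 1000
GB_PER_TB = 1000

raw_rate_pb_per_s = 1   # approximate raw data rate quoted on the slide
kept_rate_gb_per_s = 1  # rate after filtering, quoted on the slide

# Convert the raw rate step by step, as the slide does.
raw_rate_tb_per_s = raw_rate_pb_per_s * TB_PER_PB
raw_rate_gb_per_s = raw_rate_tb_per_s * GB_PER_TB

# Fraction of the raw stream that survives filtering.
kept_fraction = kept_rate_gb_per_s / raw_rate_gb_per_s

# Assumption for illustration only: the filtered stream runs for a full day.
SECONDS_PER_DAY = 24 * 60 * 60
kept_tb_per_day = kept_rate_gb_per_s * SECONDS_PER_DAY / GB_PER_TB

print(f"raw: {raw_rate_tb_per_s} TB/sec = {raw_rate_gb_per_s} GB/sec")
print(f"kept fraction: {kept_fraction:.0e}")
print(f"kept per day (hypothetical continuous run): {kept_tb_per_day} TB")
```

In other words, roughly one part in a million of the raw stream is kept, and even that filtered stream would add up to tens of terabytes per day under the continuous-running assumption.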
24. Big Data + Data Science
Institute for Data Sciences & Engineering:
• Cybersecurity Center
• Financial and Business Analytics Center
• Foundations of Data Science Center
• Health Analytics Center
• New Media Center
• Smart Cities Center
25. Conversation: Code
What is special about code?
What do you need to know to help a patron code?
What are best practices for code use?
Could you find out the most used bits of code in 2014?
26. Some disciplines have repositories.
Some don’t.
Some institutions have repositories.
Some don’t.
33. Crowd-Funding Science
Funding science may no longer rely upon government.
Interested people, engaged by social media presence, are key to raising money from the crowd.
34. Reproducibility Initiative
Address the reproducibility of your research in a blind, fee-for-service validation.
Validated studies receive a Certificate of Reproducibility acknowledging that their results have been independently reproduced.
36. ORCID, ResearcherID, etc.
Unique identifiers for researchers to cross-reference publications, activities, etc.
John Smith vs. J. Smith vs. John D. Smith vs. J. D. Smith vs. JD Smith vs. …
Wang Kim vs. W. Kim vs. Kim Wang vs. K. Wang …
ORCID: 0000-0003-0458-2127
ResearcherID: J-6870-2012
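One practical advantage of an ORCID iD over a name string is that it can be checked mechanically: the final character is a check character computed from the first 15 digits using ISO 7064 MOD 11-2. The sketch below shows that check in Python. The function names `orcid_check_character` and `is_valid_orcid` are hypothetical, chosen here for illustration, and the code only validates the format and checksum; it does not confirm that an iD is registered to anyone.

```python
def orcid_check_character(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits, and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


# The iD shown on the slide passes the checksum.
print(is_valid_orcid("0000-0003-0458-2127"))  # True

# Swapping the last two characters (a typical typing error) is caught.
print(is_valid_orcid("0000-0003-0458-2172"))  # False
```

A name such as "J. Smith" offers no comparable way to detect a transcription error, which is part of why identifiers are useful for cross-referencing.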
40. So. Many. Tools.
[Diagram: The Research Workflow, as on slide 15]