SlideShare ist ein Scribd-Unternehmen logo
1 von 68
Downloaden Sie, um offline zu lesen
From	
  Calisphere,	
  	
  Couretsy	
  of	
  	
  UC	
  Riverside,	
  California	
  Museum	
  of	
  Photography	
  
From	
  Calisphere,	
  	
  Courtesy	
  of	
  Thousand	
  Oaks	
  Library	
  	
  	
  

Data	
  Management	
  
for	
  Researchers	
  

Tips,	
  Tools,	
  &	
  	
  
Why	
  You	
  Should	
  Care	
  

	
  

Carly	
  Strasser,	
  PhD	
  
California	
  Digital	
  Library	
  
@carlystrasser	
  
carly.strasser@ucop.edu	
  

Cal	
  Poly	
  	
  
Oct	
  2013	
  
Roadmap	
  

4.  Toolbox	
  
	
  
3.  Best	
  practices	
  
2.  Why	
  you	
  should	
  care	
  
1.  Background	
  
	
  
What	
  role	
  can	
  
libraries	
  play	
  in	
  
data	
  education?	
  
NSF	
  funded	
  DataNet	
  Project	
  
Office	
  of	
  Cyberinfrastructure	
  
Why	
  don’t	
  people	
  
share	
  data?	
  
Do	
  attitudes	
  about	
  
sharing	
  differ	
  
among	
  disciplines?	
  

What	
  barriers	
  to	
  sharing	
  
can	
  we	
  eliminate?	
  

Is	
  data	
  management	
  
being	
  taught?	
  
How	
  can	
  we	
  promote	
  storing	
  
data	
  in	
  repositories?	
  
What	
  role	
  can	
  
libraries	
  play	
  in	
  
data	
  education?	
  

Why	
  don’t	
  people	
  
share	
  data?	
  
Do	
  attitudes	
  about	
  
sharing	
  differ	
  
among	
  disciplines?	
  

What	
  barriers	
  to	
  sharing	
  
can	
  we	
  eliminate?	
  

Is	
  data	
  management	
  
being	
  taught?	
  
How	
  can	
  we	
  promote	
  storing	
  
data	
  in	
  repositories?	
  
From	
  Calisphere	
  via	
  Santa	
  Clara	
  University,	
  	
  
ark:/13030/kt696nc7j2	
  

A	
  Brief	
  
History	
  of	
  
Data	
  
Collection	
  
Or…	
  how	
  scientists	
  came	
  to	
  be	
  so	
  
bad	
  at	
  data	
  management	
  
Back in the day…

Curie	
  

Newton	
  

Da	
  Vinci	
  

classicalschool.blogspot.com	
  

Darwin	
  
From	
  Flickr	
  by	
  US	
  Army	
  Environmental	
  Command	
  

From	
  Flickr	
  by	
  	
  deltaMike	
  

From	
  Flickr	
  by	
  	
  DW0825	
  
From	
  Flickr	
  by	
  Flickmor	
  

Courtesey	
  of	
  WHOI	
  

Digital	
  data	
  

C.	
  Strasser	
  
Digital	
  data	
  
+	
  	
  
Complex	
  
workflows	
  
Data	
  management	
  
Documentation	
  
Reproducibility	
  

From	
  Flickr	
  by	
  ~Minnea~	
  
•  Cost	
  
•  Confusion	
  about	
  
standards	
  
•  Lack	
  of	
  training	
  
•  Fear	
  of	
  lost	
  rights	
  or	
  
benefits	
  
•  No	
  incentives	
  
From	
  Flickr	
  by	
  iowa_spirit_walker	
  
From	
  sandierpastures.com	
  

the
Truth
You need
to know
about

Data	
  management	
  
Metadata	
  
Data	
  repositories	
  
Data	
  sharing	
  
Why	
  you	
  
should	
  care	
  

From	
  Flickr	
  by	
  johntrainor	
  
Because	
  they	
  care:	
  

From	
  Flickr	
  by	
  Redden-­‐McAllister	
  
Because	
  they	
  care:	
  
All	
  data	
  must	
  be	
  in	
  a	
  
public	
  archive.	
  

You	
  can’t	
  hoard	
  it.	
  If	
  it’s	
  not	
  
available	
  you	
  can’t	
  cite	
  it.	
  

Include	
  a	
  data	
  section	
  with	
  
how	
  to	
  find	
  datasets.	
  
Data	
  
Management:	
  
Who	
  Knew	
  	
  Could	
  
be	
  a	
  Hot	
  Topic?	
  

r!	
  
te
La
From	
  Flickr	
  by	
  Velo	
  Steve	
  

Carly	
  Strasser,	
  PhD	
  
California	
  Digital	
  Library	
  
@carlystrasser	
  
Cal	
  Poly	
  
Oct	
  2013	
  
What	
  should	
  
OT
N
Vbe	
  doing?	
  
you	
  

From	
  Flickr	
  by	
  whatthefeed	
  
2	
  tables	
  

Random	
  notes	
  

C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1
Stable Isotope Data Sheet
Sampling Site / Identifier: Wash Cresc Lake
Sample Type: Algal
Date: Dec. 16
Tray ID and Sequence: Tray 004
Reference statistics: SD for delta

Position
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
C1
C2
C3

SampleID

Weight (mg)
0.98
0.98
0.98
1.01
3.05
3.06
2.91
2.91
3.04
2.95
3.01
3
2.99
2.92
2.9
1.01
ref
0.99
ref
3.04
3.09
3.05
2.98
3.04
0.99
ref
ref
ref
ref
ref

ALG01
Lk Outlet Alg
ALG03
ALG05
ALG07
ALG06
ALG04
ALG02
ALG01
ALG03
ALG07

Lk Outlet Alg
ALG06
ALG02
ALG04
ALG05

13

From	
  Stephanie	
  Hampton	
  

C = 0.07

%C
38.27
39.78
40.37
42.23
1.88
31.55
6.85
35.56
33.49
41.17
43.74
4.51
1.59
4.37
33.58
44.94
42.28
31.43
35.57
5.52
37.90
31.74
38.46
23.78

SD for delta

delta 13C
-25.05
-25.00
-24.99
-25.06
-24.34
-30.17
-21.11
-28.05
-29.56
-27.32
-27.50
-22.68
-24.58
-21.06
-29.44
-25.00
-24.87
-29.69
-27.26
-22.31
-27.42
-27.93
-25.09

delta 13C_ca
-24.59
-24.54
-24.53
-24.60
-23.88
-29.71
-20.65
-27.59
-29.10
-26.86
-27.04
-22.22
-24.12
-20.60
-28.98
-24.54
-24.41
-29.23
-26.80
-21.85
-26.96
-27.47
-24.63

%N
1.96
2.03
2.04
2.17
0.17
0.92
0.48
2.30
1.68
1.97
1.36
0.34
0.15
0.34
1.74
2.59
2.37
1.07
1.96
0.45
1.36
2.40
2.40
1.17

15

Peter's lab
Washed Rocks

Don't use - old data

Shore
-1.26
1.26

Avg Con
-27.22
0.32

N = 0.15

delta 15N
4.12
4.01
4.09
4.20
-1.65
0.87
-0.97
0.59
0.79
2.71
0.99
4.31
-1.69
-1.52
0.62
3.96
4.33
0.95
2.79
4.72
1.21
0.73
4.37

delta 15N_ca
3.47
3.36
3.44
3.55
-2.30
0.22
-1.62
-0.06
0.14
2.06
0.34
3.66
-2.34
-2.17
-0.03
3.31
3.68
0.30
2.14
4.07
0.56
0.08
3.72

Spec. No.
25354
25356
25358
25360
25362
25364
25366
25368
25370
25372
25374
25376
25378
25380
25382
25384
25386
25388
25390
25392
25394
25396
25398

c
c

c
c
c

c

From	
  Stephanie	
  Hampton	
  (2010)	
  
ESA	
  Workshop	
  on	
  Best	
  Practices	
  

	
  	
  
Wash	
  Cres	
  Lake	
  Dec	
  15	
  Dont_Use.xls	
  
C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1
Stable Isotope Data Sheet
Sampling Site / Identifier: Wash Cresc Lake
Sample Type: Algal
Date: Dec. 16
Tray ID and Sequence: Tray 004
Reference statistics: SD for delta

Position
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
C1
C2
C3

SampleID

Weight (mg)
0.98
0.98
0.98
1.01
3.05
3.06
2.91
2.91
3.04
2.95
3.01
3
2.99
2.92
2.9
1.01
ref
0.99
ref
3.04
3.09
3.05
2.98
3.04
0.99
ref
ref
ref
ref
ref

ALG01
Lk Outlet Alg
ALG03
ALG05
ALG07
ALG06
ALG04
ALG02
ALG01
ALG03
ALG07

Lk Outlet Alg
ALG06
ALG02
ALG04
ALG05

13

From	
  Stephanie	
  Hampton	
  

C = 0.07

%C
38.27
39.78
40.37
42.23
1.88
31.55
6.85
35.56
33.49
41.17
43.74
4.51
1.59
4.37
33.58
44.94
42.28
31.43
35.57
5.52
37.90
31.74
38.46
23.78

SD for delta

delta 13C
-25.05
-25.00
-24.99
-25.06
-24.34
-30.17
-21.11
-28.05
-29.56
-27.32
-27.50
-22.68
-24.58
-21.06
-29.44
-25.00
-24.87
-29.69
-27.26
-22.31
-27.42
-27.93
-25.09

delta 13C_ca
-24.59
-24.54
-24.53
-24.60
-23.88
-29.71
-20.65
-27.59
-29.10
-26.86
-27.04
-22.22
-24.12
-20.60
-28.98
-24.54
-24.41
-29.23
-26.80
-21.85
-26.96
-27.47
-24.63

%N
1.96
2.03
2.04
2.17
0.17
0.92
0.48
2.30
1.68
1.97
1.36
0.34
0.15
0.34
1.74
2.59
2.37
1.07
1.96
0.45
1.36
2.40
2.40
1.17

15

Peter's lab
Washed Rocks

Don't use - old data

Shore
-1.26
1.26

Avg Con
-27.22
0.32

N = 0.15

delta 15N
4.12
4.01
4.09
4.20
-1.65
0.87
-0.97
0.59
0.79
2.71
0.99
4.31
-1.69
-1.52
0.62
3.96
4.33
0.95
2.79
4.72
1.21
0.73
4.37

delta 15N_ca
3.47
3.36
3.44
3.55
-2.30
0.22
-1.62
-0.06
0.14
2.06
0.34
3.66
-2.34
-2.17
-0.03
3.31
3.68
0.30
2.14
4.07
0.56
0.08
3.72

Spec. No.
25354
25356
25358
25360
25362
25364
25366
25368
25370
25372
25374
25376
25378
25380
25382
25384
25386
25388
25390
25392
25394
25396
25398

c
c

c
c
c

c

From	
  Stephanie	
  Hampton	
  (2010)	
  
ESA	
  Workshop	
  on	
  Best	
  Practices	
  

	
  	
  
Random	
  stats	
  output	
  

C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1
Stable Isotope Data Sheet
Sampling Site / Identifier: Wash Cresc Lake
Sample Type: Algal
Date: Dec. 16
Tray ID and Sequence: Tray 004
13

SampleID

Weight (mg)
0.98
0.98
0.98
1.01
3.05
3.06
2.91
2.91
3.04
2.95
3.01
3
2.99
2.92
2.9
1.01
ref
0.99
ref
3.04
3.09
3.05
2.98
3.04
0.99
ref
ref
ref
ref
ref

ALG01
Lk Outlet Alg
ALG03
ALG05
ALG07
ALG06
ALG04
ALG02
ALG01
ALG03
ALG07

Lk Outlet Alg
ALG06
ALG02
ALG04
ALG05

%C
38.27
39.78
40.37
42.23
1.88
31.55
6.85
35.56
33.49
41.17
43.74
4.51
1.59
4.37
33.58
44.94
42.28
31.43
35.57
5.52
37.90
31.74
38.46
23.78

Don't use - old data

Shore
-1.26
1.26

Avg Con
-27.22
0.32

15

Reference statistics: SD for delta C = 0.07

Position
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
C1
C2
C3

Peter's lab
Washed Rocks

SD for delta N = 0.15

delta 13C
-25.05
-25.00
-24.99
-25.06
-24.34
-30.17
-21.11
-28.05
-29.56
-27.32
-27.50
-22.68
-24.58
-21.06
-29.44
-25.00
-24.87
-29.69
-27.26
-22.31
-27.42
-27.93
-25.09

delta 13C_ca
-24.59
-24.54
-24.53
-24.60
-23.88
-29.71
-20.65
-27.59
-29.10
-26.86
-27.04
-22.22
-24.12
-20.60
-28.98
-24.54
-24.41
-29.23
-26.80
-21.85
-26.96
-27.47
-24.63

%N
1.96
2.03
2.04
2.17
0.17
0.92
0.48
2.30
1.68
1.97
1.36
0.34
0.15
0.34
1.74
2.59
2.37
1.07
1.96
0.45
1.36
2.40
2.40
1.17

delta 15N
4.12
4.01
4.09
4.20
-1.65
0.87
-0.97
0.59
0.79
2.71
0.99
4.31
-1.69
-1.52
0.62
3.96
4.33
0.95
2.79
4.72
1.21
0.73
4.37

delta 15N_ca Spec. No.
3.47
25354
3.36
25356
3.44
25358
3.55
25360
-2.30
25362
0.22
25364
-1.62
25366
-0.06
25368
0.14
25370
2.06
25372
0.34
25374
3.66
25376
-2.34
25378
-2.17
25380
-0.03
25382
3.31
25384
3.68
25386
0.30
25388
2.14
25390
4.07
25392
0.56
25394
0.08
25396
3.72
25398

c
c

c

SUMMARY OUTPUT

c
c

Regression Statistics
Multiple R 0.283158
R Square 0.080178
Adjusted R Square
-0.022024
Standard Error
1.906378
Observations
11
ANOVA

c

df
Regression
Residual
Total

SS
MS
F Significance F
1 2.851116 2.851116 0.784507 0.398813
9 32.7085 3.634278
10 35.55962

Coefficients
Standard Error t Stat
P-value Lower 95%Upper 95%
Lower 95.0%
Upper 95.0%
Intercept -4.297428 4.671099 -0.920003 0.381568 -14.8642 6.269341 -14.8642 6.269341
X Variable 1
-0.158022 0.17841 -0.885724 0.398813 -0.561612 0.245569 -0.561612 0.245569

From	
  Stephanie	
  Hampton	
  
C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1
Stable Isotope Data Sheet
Sampling Site / Identifier: Wash Cresc Lake
Sample Type: Algal
Date: Dec. 16
Tray ID and Sequence: Tray 004
Reference statistics: SD for delta

Position
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
C1
C2
C3

SampleID

Weight (mg)
0.98
0.98
0.98
1.01
3.05
3.06
2.91
2.91
3.04
2.95
3.01
3
2.99
2.92
2.9
1.01
ref
0.99
ref
3.04
3.09
3.05
2.98
3.04
0.99
ref
ref
ref
ref
ref

ALG01
Lk Outlet Alg
ALG03
ALG05
ALG07
ALG06
ALG04
ALG02
ALG01
ALG03
ALG07

Lk Outlet Alg
ALG06
ALG02
ALG04
ALG05

13

C = 0.07

%C
38.27
39.78
40.37
42.23
1.88
31.55
6.85
35.56
33.49
41.17
43.74
4.51
1.59
4.37
33.58
44.94
42.28
31.43
35.57
5.52
37.90
31.74
38.46
23.78

SD for delta

delta 13C delta 13C_ca
-25.05
-24.59
-25.00
-24.54
-24.99
-24.53
-25.06
-24.60
-24.34
-23.88
-30.17
-29.71
-21.11
-20.65
-28.05
-27.59
-29.56
-29.10
-27.32
-26.86
-27.50
-27.04
SampleID
-22.68
-22.22
-24.58
-24.12
-21.06
-20.60
Weight (mg)
-29.44
-28.98
-25.00
-24.54
-24.87
-24.41
-29.69 %C-29.23
-27.26
-26.80
delta 13C
-22.31
-21.85
delta 13C_ca
-27.42
-26.96
-27.93
-27.47
-25.09
-24.63

%N
delta 15N
delta 15N_ca

15

Don't use - old data

Shore
-1.26
1.26

Avg Con
-27.22
0.32

N = 0.15

%N
1.96
2.03
2.04
2.17
0.17
0.92
0.48
2.30
1.68
1.97
1.36
ALG03
0.34
0.15
0.34
2.91
1.74
2.59
2.37
6.85
1.07
1.96
-21.11
0.45
-20.65
1.36
2.40
2.40
0.48
1.17

-0.97
-1.62

Peter's lab
Washed Rocks

delta 15N delta 15N_ca
4.12
3.47
4.01
3.36
4.09
3.44
4.20
3.55
-1.65
-2.30
0.87
0.22
-0.97
-1.62
0.59
-0.06
0.79
0.14
2.71
2.06
0.99
0.34
ALG05
4.31
3.66
-1.69
-2.34
-1.52
-2.17
2.91
0.62
-0.03
3.96
3.31
4.33
3.68
35.56
0.95
0.30
2.79
2.14
-28.05
4.72
4.07
-27.59
1.21
0.56
0.73
0.08
4.37
3.72

2.30
0.59
-0.06

Spec. No.
25354
25356
25358
25360
25362 c
25364
25366 c
25368
25370
25372
25374 c
ALG07
25376
25378 c
25380 c
25382 3.04
25384
25386
25388 33.49
25390
-29.56
25392
25394 -29.10
c
25396
25398

1.68
0.79
0.14

SUMMARY OUTPUT

ALG06

ALG04

Regression Statistics
Multiple R 0.283158
2.95 Square 0.080178
3.01
R
Adjusted R Square
-0.022024
Standard Error
1.906378
41.17
43.74
Observations
11

-27.32
-27.50
ANOVA
-26.86
-27.04
df
Regression
Residual
1.97
Total

2.71
2.06
Intercept

ALG02

ALG01
3

4.51
-22.68
-22.22
MS

2.99
1.59
-24.58
-24.12
Significance F

SS
F
1 2.851116 2.851116 0.784507 0.398813
9 32.7085 3.634278
1.3610 35.55962 0.34
0.15

0.99

4.31

-1.69

ALG03

ALG07

2.92

2.9

4.37
-21.06
-20.60

33.58
-29.44
-28.98

0.34
-1.52

1.74
0.62
-0.03

Coefficients
Standard Error t Stat
P-value Lower 95%Upper 95%
Lower 95.0%
Upper 95.0%
0.34
-2.34
-2.17
-4.297428 4.671099 3.66
-0.920003 0.381568 -14.8642 6.269341 -14.8642 6.269341
X Variable 1
-0.158022 0.17841 -0.885724 0.398813 -0.561612 0.245569 -0.561612 0.245569

4.00

3.00

2.00

1.00
Series1
-35.00

-30.00

-25.00

-20.00

-15.00

-10.00

-5.00

0.00
0.00
-1.00

-2.00

-3.00

From	
  Stephanie	
  Hampton	
  
What	
  should	
  
you	
  be	
  doing?	
  

From	
  Flickr	
  by	
  whatthefeed	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
2.	
  Data	
  collection	
  &	
  organization	
  
Create	
  unique	
  identifiers	
  

From	
  Flickr	
  by	
  zebbie	
  

•  Decide	
  on	
  naming	
  scheme	
  early	
  
•  Create	
  a	
  key	
  
•  Different	
  for	
  each	
  sample	
  

From	
  Flickr	
  by	
  sjbresnahan	
  
2.	
  Data	
  collection	
  &	
  organization	
  
Standardize	
  
•  Consistent	
  within	
  columns	
  
– only	
  numbers,	
  dates,	
  or	
  text	
  

•  Consistent	
  names,	
  codes,	
  formats	
  

Modified	
  from	
  K.	
  Vanderbilt	
  	
  

From	
  Pink	
  Floyd,	
  The	
  Wall	
  	
  	
  themurkyfringe.com	
  
2.	
  Data	
  collection	
  &	
  organization	
  
Standardize	
  
•  Reduce	
  possibility	
  
of	
  manual	
  error	
  by	
  
constraining	
  entry	
  
choices	
  
Excel	
  lists	
  
Google	
  Docs	
  
Data
	
  
Forms	
  
validataion	
  
Modified	
  from	
  K.	
  Vanderbilt	
  	
  
2.	
  Data	
  collection	
  &	
  organization	
  
	
  	
  

Create	
  parameter	
  table	
  
Create	
  a	
  site	
  table	
  

From	
  doi:10.3334/ORNLDAAC/777	
  
From	
  doi:10.3334/ORNLDAAC/777	
  

From	
  R	
  Cook,	
  ESA	
  Best	
  Practices	
  Workshop	
  2010	
  
2.	
  Data	
  collection	
  &	
  organization	
  
What	
  about	
  
databases?	
  

A	
  relational	
  database	
  is	
  	
  
	
  A	
  set	
  of	
  tables	
  
	
  Relationships	
  among	
  the	
  tables	
  
	
  A	
  language	
  to	
  specify	
  &	
  query	
  the	
  tables	
  
	
  
A	
  RDB	
  provides	
  
	
  Scalability:	
  millions+	
  records	
  
	
  Features	
  for	
  sub-­‐setting,	
  querying,	
  sorting	
  
	
  Reduced	
  redundancy	
  &	
  entry	
  errors	
  
	
  

From	
  Mark	
  Schildhauer	
  
2.	
  Data	
  collection	
  &	
  organization	
  
You	
  should	
  invest	
  time	
  in	
  learning	
  databases	
  if	
  	
  
	
  your	
  data	
  sets	
  are	
  large	
  or	
  complex	
  
	
  

Consider	
  investing	
  time	
  in	
  learning	
  databases	
  if	
  
	
  your	
  data	
  are	
  small	
  and	
  humble	
  
	
  you	
  ever	
  intend	
  to	
  share	
  your	
  data	
  
	
  you	
  are	
  <	
  30	
  years	
  old	
  

From	
  Mark	
  Schildhauer	
  
2.	
  Data	
  collection	
  &	
  organization	
  
	
  Use	
  descriptive	
  file	
  names	
  *	
  
•  Unique	
  
•  Reflect	
  contents	
  
Bad:	
  
	
  
	
  

	
  Mydata.xls	
  
	
  2001_data.csv	
  
	
  best	
  version.txt	
  

Better: 	
  Eaffinis_nanaimo_2010_counts.xls	
  
Study	
  
organism	
  

Site	
  
name	
  

Year	
  

What	
  was	
  
measured	
  	
  

*Not	
  for	
  everyone	
  
From	
  R	
  Cook,	
  ESA	
  Best	
  Practices	
  Workshop	
  2010	
  
2.	
  Data	
  collection	
  &	
  organization	
  
Organize	
  files	
  	
  logically	
  
Biodiversity	
  

Lake	
  
Experiments	
   Biodiv_H20_heatExp_2005to2008.csv	
  
Biodiv_H20_predatorExp_2001to2003.csv	
  
…	
  
Field	
  work	
   Biodiv_H20_PlanktonCount_2001toActive.csv	
  
Biodiv_H20_ChlAprofiles_2003.csv	
  
…	
  
	
  

Grassland	
  
From	
  S.	
  Hampton	
  
2.	
  Data	
  collection	
  &	
  organization	
  
	
  Preserve	
  information	
  
•  Keep	
  raw	
  data	
  raw	
  
•  Use	
  scripts	
  to	
  process	
  data	
  
	
  &	
  save	
  them	
  with	
  data	
  
Raw	
  data	
  as	
  .csv	
  

R	
  script	
  for	
  processing	
  &	
  
analysis	
  

	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
3.	
  Quality	
  control	
  and	
  quality	
  assurance	
  
Before	
  data	
  collection	
  

From	
  Flickr	
  by	
  StacieBee	
  

•  Define	
  &	
  enforce	
  standards	
  
•  Assign	
  responsibility	
  for	
  data	
  quality	
  
3.	
  Quality	
  control	
  and	
  quality	
  assurance	
  
After	
  data	
  entry	
  
•  Check	
  for	
  missing,	
  impossible,	
  
anomalous	
  values	
  
•  Perform	
  statistical	
  summaries	
  	
  
•  Look	
  for	
  outliers	
  
	
  
60	
  
50	
  
40	
  
30	
  
20	
  
10	
  
0	
  
0	
  

10	
  

20	
  

30	
  

40	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
4.	
  Metadata	
  basics	
  

Why	
  are	
  you	
  
What	
  is	
  
promoting	
  
metadata?	
  
Excel?	
  
4.	
  Metadata	
  basics	
  
From	
  Flickr	
  by	
  	
  //ichael	
  Patric|{	
  
	
  

	
  	
  Metadata	
  =	
  Data	
  reporting	
  
	
  

WHO	
  created	
  the	
  data?	
  
WHAT	
  is	
  the	
  content	
  	
  
	
  of	
  the	
  data	
  set?	
  
WHEN	
  was	
  it	
  created?	
  
WHERE	
  was	
  it	
  collected?	
  
HOW	
  was	
  it	
  developed?	
  
WHY	
  was	
  it	
  developed?	
  
4.	
  Metadata	
  basics	
  

• 

Scientific	
  context	
  

• 

• 

Environmental	
  conditions	
  during	
  collection	
  

• 

Where	
  collected	
  &	
  spatial	
  resolution	
  When	
  
collected	
  &	
  temporal	
  resolution	
  

• 

The	
  name(s)	
  of	
  the	
  data	
  file(s)	
  in	
  the	
  data	
  
set	
  

What	
  instruments	
  (including	
  model	
  &	
  
serial	
  number)	
  were	
  used	
  

Standards	
  or	
  calibrations	
  used	
  

Name	
  of	
  the	
  data	
  set	
  

• 

What	
  data	
  were	
  collected	
  

• 

Digital	
  context	
  

Scientific	
  reason	
  why	
  the	
  data	
  were	
  
collected	
  

• 
• 

• 

• 

Date	
  the	
  data	
  set	
  was	
  last	
  modified	
  

• 

Example	
  data	
  file	
  records	
  for	
  each	
  data	
  
type	
  file	
  

• 

Pertinent	
  companion	
  files	
  

• 

• 

Information	
  about	
  parameters	
  

List	
  of	
  related	
  or	
  ancillary	
  data	
  sets	
  

How	
  each	
  was	
  measured	
  or	
  produced	
  

• 

Software	
  (including	
  version	
  number)	
  
used	
  to	
  prepare/read	
  	
  the	
  data	
  set	
  

Units	
  of	
  measure	
  

• 

• 

Format	
  used	
  in	
  the	
  data	
  set	
  

• 
• 

• 

Data	
  processing	
  that	
  was	
  performed	
  

• 

Precision	
  &	
  accuracy	
  if	
  known	
  

Personnel	
  &	
  stakeholders	
  
• 

• 

Quality	
  assurance	
  &	
  control	
  measures	
  

• 

Funders	
  

Definitions	
  of	
  codes	
  used	
  

• 

Who	
  to	
  contact	
  with	
  questions	
  

• 

Information	
  about	
  data	
  
• 

Who	
  collected	
  	
  

• 

• 

Known	
  problems	
  that	
  limit	
  data	
  use	
  (e.g.	
  
uncertainty,	
  sampling	
  problems)	
  	
  

How	
  to	
  cite	
  the	
  data	
  set	
  
4.	
  Metadata	
  basics	
  

What	
  is	
  
metadata?	
  

Select	
  the	
  appropriate	
  standard	
  
•  Provides	
  structure	
  to	
  describe	
  data	
  
Common	
  terms	
  	
  |	
  	
  definitions	
  	
  |	
  	
  language	
  	
  |	
  	
  structure	
  

•  Lots	
  of	
  different	
  standards	
  
	
  EML	
  ,	
  FGDC,	
  ISO19115,	
  DarwinCore,…	
  

•  Tools	
  for	
  creating	
  metadata	
  files	
  
	
  Morpho	
  (EML),	
  Metavist	
  (FGDC),	
  NOAA	
  MERMaid	
  (CSGDM)	
  	
  
	
  
	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
5.	
  Workflows	
  
Workflow:	
  how	
  you	
  get	
  from	
  the	
  raw	
  data	
  to	
  the	
  final	
  
products	
  of	
  your	
  research	
  
	
  

Simple	
  workflows:	
  flow	
  charts	
  
Temperature	
  
data	
  
Salinity	
  	
  	
  	
  	
  	
  	
  	
  
data	
  
“Clean”	
  T	
  
&	
  S	
  data	
  

Data	
  import	
  into	
  R	
  

Data	
  in	
  R	
  
format	
  

Quality	
  control	
  &	
  
data	
  cleaning	
  
Analysis:	
  mean,	
  SD	
  
Graph	
  production	
  

Summary	
  
statistics	
  
5.	
  Workflows	
  
Workflow:	
  how	
  you	
  get	
  from	
  the	
  raw	
  data	
  to	
  the	
  final	
  
products	
  of	
  your	
  research	
  
	
  

Simple	
  workflows:	
  commented	
  scripts	
  
•  R,	
  SAS,	
  MATLAB	
  
•  Well-­‐documented	
  code	
  is…	
  
Easier	
  to	
  review	
  
Easier	
  to	
  share	
  
Easier	
  to	
  repeat	
  analysis	
  

%	
  
#	
   $	
  
&	
  
5.	
  Workflows	
  
Fancy	
  Schmancy	
  workflows:	
  Kepler	
  
Resulting	
  output	
  

https://kepler-­‐project.org	
  
5.	
  Workflows	
  
Workflows	
  enable	
  
Reproducibility	
  
Transparency	
  	
  
Executability	
  	
  
	
  

From	
  Flickr	
  by	
  merlinprincesse	
  
5.	
  Workflows	
  
Minimally:	
  document	
  your	
  analysis	
  
	
   	
  commented	
  code;	
  simple	
  flow-­‐chart	
  
	
  

www.littlebytesoflife.com	
  

Emerging	
  workflow	
  applications	
  will…	
  
−  Link	
  software	
  for	
  executable	
  end-­‐to-­‐end	
  analysis	
  
−  Provide	
  detailed	
  info	
  about	
  data	
  &	
  analysis	
  
−  Facilitate	
  re-­‐use	
  &	
  refinement	
  of	
  complex,	
  mn:	
  
o ulti-­‐step	
  
o
analyses	
  
ng	
  S aring	
  
omi 	
  sh
C
−  Enable	
  efficient	
  swapping	
  of	
  alternative	
  models	
  s! 	
  
	
  
kflow ent &
r
wo
algorithms	
  
uirem
req
−  Help	
  automate	
  tedious	
  tasks	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
6.	
  Data	
  stewardship	
  &	
  reuse	
  
From	
  Flickr	
  by	
  greensambaman	
  

The	
  20-­‐Year	
  Rule	
  
The	
  metadata	
  accompanying	
  a	
  
data	
  set	
  should	
  be	
  written	
  for	
  a	
  
user	
  20	
  years	
  into	
  the	
  future	
  
	
  
	
  

RULE	
  

(National	
  Research	
  Council	
  1991)	
  
6.	
  Data	
  stewardship	
  &	
  reuse	
  
Use	
  stable	
  formats	
  
	
  

	
  csv,	
  txt,	
  tiff	
  

Create	
  back-­‐up	
  copies	
  	
  
original,	
  near,	
  far	
  

Periodically	
  test	
  ability	
  to	
  restore	
  information	
  

Modified from R. Cook	
  
6.	
  Data	
  stewardship	
  &	
  reuse	
  
Store	
  your	
  data	
  in	
  a	
  repository	
  
Institutional	
  archive	
  

Ask	
  a	
  librarian	
  

Discipline/specialty	
  archive	
  

	
  
	
  

	
  

Repos	
  of	
  repos:	
  

databib.org	
  
re3data.org	
  
From	
  Flickr	
  by	
  torkildr	
  
6.	
  Data	
  stewardship	
  &	
  reuse	
  
Practice	
  Data	
  Citation	
  
Example:	
  
Sidlauskas,	
  B.	
  2007.	
  Data	
  from:	
  Testing	
  for	
  unequal	
  rates	
  of	
  
morphological	
  diversification	
  in	
  the	
  absence	
  of	
  a	
  detailed	
  
phylogeny:	
  a	
  case	
  study	
  from	
  characiform	
  fishes.	
  Dryad	
  Digital	
  
Repository.	
  doi:10.5061/dryad.20	
  
Persistent	
  Unique	
  

Identifier	
  

Allows	
  readers	
  to	
  find	
  data	
  products	
  
Get	
  credit	
  for	
  data	
  and	
  publications	
  
Promotes	
  reproducibility	
  
Better	
  measure	
  of	
  research	
  impact	
  
From	
  Flickr	
  by	
  Big	
  Swede	
  Guy	
  

Best	
  Practices	
  

ent
data managem

1.  Planning	
  
2.  Data	
  collection	
  &	
  
organization	
  
3.  Quality	
  control	
  &	
  assurance	
  
4.  Metadata	
  
5.  Workflows	
  
6.  Data	
  stewardship	
  &	
  reuse	
  
What	
  is	
  a	
  data	
  
management	
  plan?	
  
A	
  document	
  that	
  
describes	
  what	
  you	
  will	
  
do	
  with	
  your	
  data	
  
throughout	
  	
  
the	
  research	
  project	
  

From Flickr by Barbies Land
From	
  Flickr	
  by	
  401(K)	
  2013	
  

DMP	
  for	
  funders:	
  
A	
  short	
  plan	
  submitted	
  
alongside	
  grant	
  applications	
  
	
  An	
  outline	
  of	
  	
  
–  what	
  will	
  be	
  collected	
  
–  methods	
  
–  Standards	
  
But they all have
–  Metadata	
  
different requirements
–  sharing/access	
  
and express them in
–  long-­‐term	
  storage	
  
different ways

	
  Includes	
  how	
  and	
  why	
  
NSF	
  DMP	
  Requirements	
  
From	
  Grant	
  Proposal	
  Guidelines:	
  
	
  DMP	
  supplement	
  may	
  include:	
  
1.  the	
  types	
  of	
  data,	
  samples,	
  physical	
  collections,	
  software,	
  curriculum	
  
materials,	
  and	
  other	
  materials	
  to	
  be	
  produced	
  in	
  the	
  course	
  of	
  the	
  project	
  
2.  	
  the	
  standards	
  to	
  be	
  used	
  for	
  data	
  and	
  metadata	
  format	
  and	
  content	
  (where	
  
existing	
  standards	
  are	
  absent	
  or	
  deemed	
  inadequate,	
  this	
  should	
  be	
  
documented	
  along	
  with	
  any	
  proposed	
  solutions	
  or	
  remedies)	
  
3.  	
  policies	
  for	
  access	
  and	
  sharing	
  including	
  provisions	
  for	
  appropriate	
  
protection	
  of	
  privacy,	
  confidentiality,	
  security,	
  intellectual	
  property,	
  or	
  other	
  
rights	
  or	
  requirements	
  
4.  	
  policies	
  and	
  provisions	
  for	
  re-­‐use,	
  re-­‐distribution,	
  and	
  the	
  production	
  of	
  
derivatives	
  
5.  	
  plans	
  for	
  archiving	
  data,	
  samples,	
  and	
  other	
  research	
  products,	
  and	
  for	
  
preservation	
  of	
  access	
  to	
  them	
  
From	
  Flickr	
  by	
  OZinOH	
  

The	
  Data	
  Management	
  
Planning	
  Tool	
  

DMPTool	
  	
  

	
  

Carly	
  Strasser	
  	
  	
  |	
  @carlystrasser	
  
California	
  Digital	
  Library	
  
	
  

5	
  August	
  2013	
  
ESA	
  2013	
  SS	
  2	
  
From	
  Flickr	
  by	
  dipster1	
  

Toolbox	
  
Write	
  a	
  DMP	
  

dmptool.org	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  

Step-­‐by-­‐step	
  wizard	
  for	
  generating	
  DMP	
  
create	
  |	
  edit	
  |	
  re-­‐use	
  |	
  share	
  
Free	
  &	
  open	
  to	
  community	
  	
  
Find	
  a	
  repository	
  

Where	
  
should	
  I	
  put	
  
my	
  data?	
  

databib.org	
  
From Flickr by thewmatt

Get	
  help	
  
From	
  Flickr	
  by	
  North	
  Carolina	
  Digital	
  
Heritage	
  Center	
  

Get	
  help	
  from	
  your	
  library	
  

From	
  Flickr	
  by	
  Madison	
  Guy	
  
Get	
  help	
  

Toolbox:	
  
	
  DCXL	
  blog:	
  dcxl.cdlib.org	
  
From	
  Flickr	
  by	
  dotpolka	
  

Doing	
  science	
  is	
  a	
  
privilege	
  –	
  not	
  a	
  right	
  
From	
  Flickr	
  by	
  Michael	
  Tinkler	
  
From	
  Flickr	
  by	
  mikerosebery	
  

	
  There	
  is	
  a	
  social	
  contract	
  of	
  science:	
  we	
  

have	
  an	
  obligation	
  to	
  ensure	
  dissemination,	
  
validation,	
  &	
  advancement.	
  
To	
  not	
  do	
  so	
  is	
  science	
  malpractice.	
  	
  
	
  
–	
  Brian	
  Hole,	
  Ubiquity	
  Press	
  at	
  UCL	
  
My	
  website	
  
Email	
  me	
  
Tweet	
  me	
  
My	
  slides	
  

carlystrasser.net	
  
carlystrasser@gmail.com	
  
@carlystrasser	
  	
  
slideshare.net/carlystrasser	
  

Weitere ähnliche Inhalte

Was ist angesagt?

NISO Webinar on data curation services at the CDL
NISO Webinar on data curation services at the CDLNISO Webinar on data curation services at the CDL
NISO Webinar on data curation services at the CDLCarly Strasser
 
ESA Ignite talk on UC3 Dash platform for data sharing
ESA Ignite talk on UC3 Dash platform for data sharingESA Ignite talk on UC3 Dash platform for data sharing
ESA Ignite talk on UC3 Dash platform for data sharingCarly Strasser
 
Lightning Talk on open data for #oaw14sky
Lightning Talk on open data for #oaw14skyLightning Talk on open data for #oaw14sky
Lightning Talk on open data for #oaw14skyCarly Strasser
 
Data management overview and UC3 tools for IASSIST 2014
Data management overview and UC3 tools for IASSIST 2014Data management overview and UC3 tools for IASSIST 2014
Data management overview and UC3 tools for IASSIST 2014Carly Strasser
 
Research Life Cycle for GeoData 2014
Research Life Cycle for GeoData 2014Research Life Cycle for GeoData 2014
Research Life Cycle for GeoData 2014Carly Strasser
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - finalKathy Fontaine
 
"Undergrad ecologists aren't learning data management" - ESA 2013
"Undergrad ecologists aren't learning data management" -  ESA 2013"Undergrad ecologists aren't learning data management" -  ESA 2013
"Undergrad ecologists aren't learning data management" - ESA 2013Carly Strasser
 
DMPTool: Data Management Made Easier at CNI 2012
DMPTool: Data Management Made Easier at CNI 2012DMPTool: Data Management Made Easier at CNI 2012
DMPTool: Data Management Made Easier at CNI 2012Carly Strasser
 
Making Data Dynamic: Views from UC3, CDL
Making Data Dynamic: Views from UC3, CDLMaking Data Dynamic: Views from UC3, CDL
Making Data Dynamic: Views from UC3, CDLCarly Strasser
 
Data Management: The Current Landscape
Data Management: The Current LandscapeData Management: The Current Landscape
Data Management: The Current LandscapeCarly Strasser
 
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)University of California Curation Center
 
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...University of California Curation Center
 
Data Publication at CDL for IDCC14
Data Publication at CDL for IDCC14Data Publication at CDL for IDCC14
Data Publication at CDL for IDCC14Carly Strasser
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionUniversity of Washington
 
DataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupDataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupCarly Strasser
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) CommonsJames Hendler
 

Was ist angesagt? (20)

Dash for IASSIST 2014
Dash for IASSIST 2014Dash for IASSIST 2014
Dash for IASSIST 2014
 
NISO Webinar on data curation services at the CDL
NISO Webinar on data curation services at the CDLNISO Webinar on data curation services at the CDL
NISO Webinar on data curation services at the CDL
 
ESA Ignite talk on UC3 Dash platform for data sharing
ESA Ignite talk on UC3 Dash platform for data sharingESA Ignite talk on UC3 Dash platform for data sharing
ESA Ignite talk on UC3 Dash platform for data sharing
 
DataUp at ACRL 2013
DataUp at ACRL 2013DataUp at ACRL 2013
DataUp at ACRL 2013
 
Lightning Talk on open data for #oaw14sky
Lightning Talk on open data for #oaw14skyLightning Talk on open data for #oaw14sky
Lightning Talk on open data for #oaw14sky
 
Data management overview and UC3 tools for IASSIST 2014
Data management overview and UC3 tools for IASSIST 2014Data management overview and UC3 tools for IASSIST 2014
Data management overview and UC3 tools for IASSIST 2014
 
Research Life Cycle for GeoData 2014
Research Life Cycle for GeoData 2014Research Life Cycle for GeoData 2014
Research Life Cycle for GeoData 2014
 
Rda nitrd 2015 berman - final
Rda nitrd 2015 berman  - finalRda nitrd 2015 berman  - final
Rda nitrd 2015 berman - final
 
"Undergrad ecologists aren't learning data management" - ESA 2013
"Undergrad ecologists aren't learning data management" -  ESA 2013"Undergrad ecologists aren't learning data management" -  ESA 2013
"Undergrad ecologists aren't learning data management" - ESA 2013
 
DMPTool: Data Management Made Easier at CNI 2012
DMPTool: Data Management Made Easier at CNI 2012DMPTool: Data Management Made Easier at CNI 2012
DMPTool: Data Management Made Easier at CNI 2012
 
Making Data Dynamic: Views from UC3, CDL
Making Data Dynamic: Views from UC3, CDLMaking Data Dynamic: Views from UC3, CDL
Making Data Dynamic: Views from UC3, CDL
 
Christine borgman keynote
Christine borgman keynoteChristine borgman keynote
Christine borgman keynote
 
Data Management: The Current Landscape
Data Management: The Current LandscapeData Management: The Current Landscape
Data Management: The Current Landscape
 
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)
DMPTool Webinar 6: Health Sciences and the DMPTool (presented by Lisa Federer)
 
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
 
Data Publication at CDL for IDCC14
Data Publication at CDL for IDCC14Data Publication at CDL for IDCC14
Data Publication at CDL for IDCC14
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
 
DataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupDataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users Group
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) Commons
 
Uc3 ucacc-2015-11-16
Uc3 ucacc-2015-11-16Uc3 ucacc-2015-11-16
Uc3 ucacc-2015-11-16
 

Andere mochten auch

SBML (the Systems Biology Markup Language)
SBML (the Systems Biology Markup Language)SBML (the Systems Biology Markup Language)
SBML (the Systems Biology Markup Language)Mike Hucka
 
Ita2005
Ita2005Ita2005
Ita2005cavip
 
Sxsw ala-presentation
Sxsw ala-presentationSxsw ala-presentation
Sxsw ala-presentationCarson Block
 
Procrastination presentation
Procrastination presentationProcrastination presentation
Procrastination presentationssmomml
 
Sample 100-index-patterns-english-spelling-volumes-1-10
Sample 100-index-patterns-english-spelling-volumes-1-10Sample 100-index-patterns-english-spelling-volumes-1-10
Sample 100-index-patterns-english-spelling-volumes-1-10brian_avko_org
 
Bookmarking sheet(11 may-2012)(2)
Bookmarking sheet(11 may-2012)(2)Bookmarking sheet(11 may-2012)(2)
Bookmarking sheet(11 may-2012)(2)nareshcssoft
 

Andere mochten auch (6)

SBML (the Systems Biology Markup Language)
SBML (the Systems Biology Markup Language)SBML (the Systems Biology Markup Language)
SBML (the Systems Biology Markup Language)
 
Ita2005
Ita2005Ita2005
Ita2005
 
Sxsw ala-presentation
Sxsw ala-presentationSxsw ala-presentation
Sxsw ala-presentation
 
Procrastination presentation
Procrastination presentationProcrastination presentation
Procrastination presentation
 
Sample 100-index-patterns-english-spelling-volumes-1-10
Sample 100-index-patterns-english-spelling-volumes-1-10Sample 100-index-patterns-english-spelling-volumes-1-10
Sample 100-index-patterns-english-spelling-volumes-1-10
 
Bookmarking sheet(11 may-2012)(2)
Bookmarking sheet(11 may-2012)(2)Bookmarking sheet(11 may-2012)(2)
Bookmarking sheet(11 may-2012)(2)
 

Ähnlich wie Stable Isotope Data from Wash Cres Lake Algal Samples

Data Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopData Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopCarly Strasser
 
UCLA: Data Management for Scientists
UCLA: Data Management for ScientistsUCLA: Data Management for Scientists
UCLA: Data Management for ScientistsCarly Strasser
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekCarly Strasser
 
CDL Tools for DataCite 2014
CDL Tools for DataCite 2014CDL Tools for DataCite 2014
CDL Tools for DataCite 2014Carly Strasser
 
UC Riverside: Data Management for Scientists
UC Riverside: Data Management for ScientistsUC Riverside: Data Management for Scientists
UC Riverside: Data Management for ScientistsCarly Strasser
 
Research Data Curation _ Grad Humanities Class
Research Data Curation _ Grad Humanities ClassResearch Data Curation _ Grad Humanities Class
Research Data Curation _ Grad Humanities ClassAaron Collie
 
Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Carly Strasser
 
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopKeeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopCarly Strasser
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data managementcunera
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE
 
DataStarR: A Data Sharing and Publication Infrastructure to Support Research
DataStarR: A Data Sharing and Publication Infrastructure to Support ResearchDataStarR: A Data Sharing and Publication Infrastructure to Support Research
DataStarR: A Data Sharing and Publication Infrastructure to Support ResearchIAALD Community
 
Keeping Up with Data
Keeping Up with Data Keeping Up with Data
Keeping Up with Data AbigailGoben
 
Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6ARDC
 
Data Matters for AGU Early Career Conference
Data Matters for AGU Early Career ConferenceData Matters for AGU Early Career Conference
Data Matters for AGU Early Career ConferenceCarly Strasser
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing dataSarah Jones
 
Data Management Planning and the DMPTool
Data Management Planning and the DMPToolData Management Planning and the DMPTool
Data Management Planning and the DMPToolCarly Strasser
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsAaron Collie
 

Ähnlich wie Stable Isotope Data from Wash Cres Lake Algal Samples (20)

Data Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopData Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation Workshop
 
UCLA: Data Management for Scientists
UCLA: Data Management for ScientistsUCLA: Data Management for Scientists
UCLA: Data Management for Scientists
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA Week
 
CDL Tools for DataCite 2014
CDL Tools for DataCite 2014CDL Tools for DataCite 2014
CDL Tools for DataCite 2014
 
UC Riverside: Data Management for Scientists
UC Riverside: Data Management for ScientistsUC Riverside: Data Management for Scientists
UC Riverside: Data Management for Scientists
 
Research Data Curation _ Grad Humanities Class
Research Data Curation _ Grad Humanities ClassResearch Data Curation _ Grad Humanities Class
Research Data Curation _ Grad Humanities Class
 
Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012
 
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation WorkshopKeeping Up to Date on Data Management - UC3 Data Curation Workshop
Keeping Up to Date on Data Management - UC3 Data Curation Workshop
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data management
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
 
DataStarR: A Data Sharing and Publication Infrastructure to Support Research
DataStarR: A Data Sharing and Publication Infrastructure to Support ResearchDataStarR: A Data Sharing and Publication Infrastructure to Support Research
DataStarR: A Data Sharing and Publication Infrastructure to Support Research
 
Keeping Up with Data
Keeping Up with Data Keeping Up with Data
Keeping Up with Data
 
Creating a Green Environment @ your library
Creating a Green Environment @ your library Creating a Green Environment @ your library
Creating a Green Environment @ your library
 
Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6
 
Data Matters for AGU Early Career Conference
Data Matters for AGU Early Career ConferenceData Matters for AGU Early Career Conference
Data Matters for AGU Early Career Conference
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing data
 
ARLIS-NY Presentation
ARLIS-NY PresentationARLIS-NY Presentation
ARLIS-NY Presentation
 
Data Management Planning and the DMPTool
Data Management Planning and the DMPToolData Management Planning and the DMPTool
Data Management Planning and the DMPTool
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering Students
 
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
 

Mehr von Carly Strasser

Funders and Publishers: Agents of Change
Funders and Publishers: Agents of ChangeFunders and Publishers: Agents of Change
Funders and Publishers: Agents of ChangeCarly Strasser
 
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015AIBS Bioinformatics Workforce Needs Workshop, Dec 2015
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015Carly Strasser
 
ESA Ignite talk on quality control for data
ESA Ignite talk on quality control for dataESA Ignite talk on quality control for data
ESA Ignite talk on quality control for dataCarly Strasser
 
Data publication and Citation for CLIR postdoc seminar
Data publication and Citation for CLIR postdoc seminarData publication and Citation for CLIR postdoc seminar
Data publication and Citation for CLIR postdoc seminarCarly Strasser
 
Libraries & Research Data Management for CO Alliance of Resrch Libraries
Libraries & Research Data Management for CO Alliance of Resrch LibrariesLibraries & Research Data Management for CO Alliance of Resrch Libraries
Libraries & Research Data Management for CO Alliance of Resrch LibrariesCarly Strasser
 
Open Science for Australian Institute of Marine Science Workshop
Open Science for Australian Institute of Marine Science WorkshopOpen Science for Australian Institute of Marine Science Workshop
Open Science for Australian Institute of Marine Science WorkshopCarly Strasser
 
Data Stewardship for SPATIAL/IsoCamp 2014
Data Stewardship for SPATIAL/IsoCamp 2014Data Stewardship for SPATIAL/IsoCamp 2014
Data Stewardship for SPATIAL/IsoCamp 2014Carly Strasser
 
Coping with Data for WHOI JP Students
Coping with Data for WHOI JP StudentsCoping with Data for WHOI JP Students
Coping with Data for WHOI JP StudentsCarly Strasser
 
DMPTool for UMass eScience Symposium
DMPTool for UMass eScience SymposiumDMPTool for UMass eScience Symposium
DMPTool for UMass eScience SymposiumCarly Strasser
 
DMPTool 2.0 for #IDCC14
DMPTool 2.0 for #IDCC14DMPTool 2.0 for #IDCC14
DMPTool 2.0 for #IDCC14Carly Strasser
 
Data Publication for UC Davis Publish or Perish
Data Publication for UC Davis Publish or PerishData Publication for UC Davis Publish or Perish
Data Publication for UC Davis Publish or PerishCarly Strasser
 
DMPTool for IMLS #WebWise14
DMPTool for IMLS #WebWise14DMPTool for IMLS #WebWise14
DMPTool for IMLS #WebWise14Carly Strasser
 
Bren - UCSB - Spooky spreadsheets
Bren - UCSB - Spooky spreadsheetsBren - UCSB - Spooky spreadsheets
Bren - UCSB - Spooky spreadsheetsCarly Strasser
 
Cal Poly - An Overview of Open Science
Cal Poly - An Overview of Open ScienceCal Poly - An Overview of Open Science
Cal Poly - An Overview of Open ScienceCarly Strasser
 
PLOS ALM Talk on UC3 Services and Altmetrics
PLOS ALM Talk on UC3 Services and AltmetricsPLOS ALM Talk on UC3 Services and Altmetrics
PLOS ALM Talk on UC3 Services and AltmetricsCarly Strasser
 

Mehr von Carly Strasser (15)

Funders and Publishers: Agents of Change
Funders and Publishers: Agents of ChangeFunders and Publishers: Agents of Change
Funders and Publishers: Agents of Change
 
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015AIBS Bioinformatics Workforce Needs Workshop, Dec 2015
AIBS Bioinformatics Workforce Needs Workshop, Dec 2015
 
ESA Ignite talk on quality control for data
ESA Ignite talk on quality control for dataESA Ignite talk on quality control for data
ESA Ignite talk on quality control for data
 
Data publication and Citation for CLIR postdoc seminar
Data publication and Citation for CLIR postdoc seminarData publication and Citation for CLIR postdoc seminar
Data publication and Citation for CLIR postdoc seminar
 
Libraries & Research Data Management for CO Alliance of Resrch Libraries
Libraries & Research Data Management for CO Alliance of Resrch LibrariesLibraries & Research Data Management for CO Alliance of Resrch Libraries
Libraries & Research Data Management for CO Alliance of Resrch Libraries
 
Open Science for Australian Institute of Marine Science Workshop
Open Science for Australian Institute of Marine Science WorkshopOpen Science for Australian Institute of Marine Science Workshop
Open Science for Australian Institute of Marine Science Workshop
 
Data Stewardship for SPATIAL/IsoCamp 2014
Data Stewardship for SPATIAL/IsoCamp 2014Data Stewardship for SPATIAL/IsoCamp 2014
Data Stewardship for SPATIAL/IsoCamp 2014
 
Coping with Data for WHOI JP Students
Coping with Data for WHOI JP StudentsCoping with Data for WHOI JP Students
Coping with Data for WHOI JP Students
 
DMPTool for UMass eScience Symposium
DMPTool for UMass eScience SymposiumDMPTool for UMass eScience Symposium
DMPTool for UMass eScience Symposium
 
DMPTool 2.0 for #IDCC14
DMPTool 2.0 for #IDCC14DMPTool 2.0 for #IDCC14
DMPTool 2.0 for #IDCC14
 
Data Publication for UC Davis Publish or Perish
Data Publication for UC Davis Publish or PerishData Publication for UC Davis Publish or Perish
Data Publication for UC Davis Publish or Perish
 
DMPTool for IMLS #WebWise14
DMPTool for IMLS #WebWise14DMPTool for IMLS #WebWise14
DMPTool for IMLS #WebWise14
 
Bren - UCSB - Spooky spreadsheets
Bren - UCSB - Spooky spreadsheetsBren - UCSB - Spooky spreadsheets
Bren - UCSB - Spooky spreadsheets
 
Cal Poly - An Overview of Open Science
Cal Poly - An Overview of Open ScienceCal Poly - An Overview of Open Science
Cal Poly - An Overview of Open Science
 
PLOS ALM Talk on UC3 Services and Altmetrics
PLOS ALM Talk on UC3 Services and AltmetricsPLOS ALM Talk on UC3 Services and Altmetrics
PLOS ALM Talk on UC3 Services and Altmetrics
 

Kürzlich hochgeladen

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Kürzlich hochgeladen (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Stable Isotope Data from Wash Cres Lake Algal Samples

  • 1. From  Calisphere,    Couretsy  of    UC  Riverside,  California  Museum  of  Photography   From  Calisphere,    Courtesy  of  Thousand  Oaks  Library       Data  Management   for  Researchers   Tips,  Tools,  &     Why  You  Should  Care     Carly  Strasser,  PhD   California  Digital  Library   @carlystrasser   carly.strasser@ucop.edu   Cal  Poly     Oct  2013  
  • 2. Roadmap   4.  Toolbox     3.  Best  practices   2.  Why  you  should  care   1.  Background    
  • 3. What  role  can   libraries  play  in   data  education?   NSF  funded  DataNet  Project   Office  of  Cyberinfrastructure   Why  don’t  people   share  data?   Do  attitudes  about   sharing  differ   among  disciplines?   What  barriers  to  sharing   can  we  eliminate?   Is  data  management   being  taught?   How  can  we  promote  storing   data  in  repositories?  
  • 4. What  role  can   libraries  play  in   data  education?   Why  don’t  people   share  data?   Do  attitudes  about   sharing  differ   among  disciplines?   What  barriers  to  sharing   can  we  eliminate?   Is  data  management   being  taught?   How  can  we  promote  storing   data  in  repositories?  
  • 5.
  • 6. From  Calisphere  via  Santa  Clara  University,     ark:/13030/kt696nc7j2   A  Brief   History  of   Data   Collection   Or…  how  scientists  came  to  be  so   bad  at  data  management  
  • 7. Back in the day… Curie   Newton   Da  Vinci   classicalschool.blogspot.com   Darwin  
  • 8. From  Flickr  by  US  Army  Environmental  Command   From  Flickr  by    deltaMike   From  Flickr  by    DW0825   From  Flickr  by  Flickmor   Courtesey  of  WHOI   Digital  data   C.  Strasser  
  • 9. Digital  data   +     Complex   workflows  
  • 10. Data  management   Documentation   Reproducibility   From  Flickr  by  ~Minnea~  
  • 11. •  Cost   •  Confusion  about   standards   •  Lack  of  training   •  Fear  of  lost  rights  or   benefits   •  No  incentives   From  Flickr  by  iowa_spirit_walker  
  • 12. From  sandierpastures.com   the Truth You need to know about Data  management   Metadata   Data  repositories   Data  sharing  
  • 13. Why  you   should  care   From  Flickr  by  johntrainor  
  • 14. Because  they  care:   From  Flickr  by  Redden-­‐McAllister  
  • 15. Because  they  care:   All  data  must  be  in  a   public  archive.   You  can’t  hoard  it.  If  it’s  not   available  you  can’t  cite  it.   Include  a  data  section  with   how  to  find  datasets.  
  • 16. Data   Management:   Who  Knew    Could   be  a  Hot  Topic?   r!   te La From  Flickr  by  Velo  Steve   Carly  Strasser,  PhD   California  Digital  Library   @carlystrasser   Cal  Poly   Oct  2013  
  • 17. What  should   OT N Vbe  doing?   you   From  Flickr  by  whatthefeed  
  • 18. 2  tables   Random  notes   C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1 Stable Isotope Data Sheet Sampling Site / Identifier: Wash Cresc Lake Sample Type: Algal Date: Dec. 16 Tray ID and Sequence: Tray 004 Reference statistics: SD for delta Position A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 C1 C2 C3 SampleID Weight (mg) 0.98 0.98 0.98 1.01 3.05 3.06 2.91 2.91 3.04 2.95 3.01 3 2.99 2.92 2.9 1.01 ref 0.99 ref 3.04 3.09 3.05 2.98 3.04 0.99 ref ref ref ref ref ALG01 Lk Outlet Alg ALG03 ALG05 ALG07 ALG06 ALG04 ALG02 ALG01 ALG03 ALG07 Lk Outlet Alg ALG06 ALG02 ALG04 ALG05 13 From  Stephanie  Hampton   C = 0.07 %C 38.27 39.78 40.37 42.23 1.88 31.55 6.85 35.56 33.49 41.17 43.74 4.51 1.59 4.37 33.58 44.94 42.28 31.43 35.57 5.52 37.90 31.74 38.46 23.78 SD for delta delta 13C -25.05 -25.00 -24.99 -25.06 -24.34 -30.17 -21.11 -28.05 -29.56 -27.32 -27.50 -22.68 -24.58 -21.06 -29.44 -25.00 -24.87 -29.69 -27.26 -22.31 -27.42 -27.93 -25.09 delta 13C_ca -24.59 -24.54 -24.53 -24.60 -23.88 -29.71 -20.65 -27.59 -29.10 -26.86 -27.04 -22.22 -24.12 -20.60 -28.98 -24.54 -24.41 -29.23 -26.80 -21.85 -26.96 -27.47 -24.63 %N 1.96 2.03 2.04 2.17 0.17 0.92 0.48 2.30 1.68 1.97 1.36 0.34 0.15 0.34 1.74 2.59 2.37 1.07 1.96 0.45 1.36 2.40 2.40 1.17 15 Peter's lab Washed Rocks Don't use - old data Shore -1.26 1.26 Avg Con -27.22 0.32 N = 0.15 delta 15N 4.12 4.01 4.09 4.20 -1.65 0.87 -0.97 0.59 0.79 2.71 0.99 4.31 -1.69 -1.52 0.62 3.96 4.33 0.95 2.79 4.72 1.21 0.73 4.37 delta 15N_ca 3.47 3.36 3.44 3.55 -2.30 0.22 -1.62 -0.06 0.14 2.06 0.34 3.66 -2.34 -2.17 -0.03 3.31 3.68 0.30 2.14 4.07 0.56 0.08 3.72 Spec. No. 25354 25356 25358 25360 25362 25364 25366 25368 25370 25372 25374 25376 25378 25380 25382 25384 25386 25388 25390 25392 25394 25396 25398 c c c c c c From  Stephanie  Hampton  (2010)   ESA  Workshop  on  Best  Practices      
  • 19. Wash  Cres  Lake  Dec  15  Dont_Use.xls   C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1 Stable Isotope Data Sheet Sampling Site / Identifier: Wash Cresc Lake Sample Type: Algal Date: Dec. 16 Tray ID and Sequence: Tray 004 Reference statistics: SD for delta Position A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 C1 C2 C3 SampleID Weight (mg) 0.98 0.98 0.98 1.01 3.05 3.06 2.91 2.91 3.04 2.95 3.01 3 2.99 2.92 2.9 1.01 ref 0.99 ref 3.04 3.09 3.05 2.98 3.04 0.99 ref ref ref ref ref ALG01 Lk Outlet Alg ALG03 ALG05 ALG07 ALG06 ALG04 ALG02 ALG01 ALG03 ALG07 Lk Outlet Alg ALG06 ALG02 ALG04 ALG05 13 From  Stephanie  Hampton   C = 0.07 %C 38.27 39.78 40.37 42.23 1.88 31.55 6.85 35.56 33.49 41.17 43.74 4.51 1.59 4.37 33.58 44.94 42.28 31.43 35.57 5.52 37.90 31.74 38.46 23.78 SD for delta delta 13C -25.05 -25.00 -24.99 -25.06 -24.34 -30.17 -21.11 -28.05 -29.56 -27.32 -27.50 -22.68 -24.58 -21.06 -29.44 -25.00 -24.87 -29.69 -27.26 -22.31 -27.42 -27.93 -25.09 delta 13C_ca -24.59 -24.54 -24.53 -24.60 -23.88 -29.71 -20.65 -27.59 -29.10 -26.86 -27.04 -22.22 -24.12 -20.60 -28.98 -24.54 -24.41 -29.23 -26.80 -21.85 -26.96 -27.47 -24.63 %N 1.96 2.03 2.04 2.17 0.17 0.92 0.48 2.30 1.68 1.97 1.36 0.34 0.15 0.34 1.74 2.59 2.37 1.07 1.96 0.45 1.36 2.40 2.40 1.17 15 Peter's lab Washed Rocks Don't use - old data Shore -1.26 1.26 Avg Con -27.22 0.32 N = 0.15 delta 15N 4.12 4.01 4.09 4.20 -1.65 0.87 -0.97 0.59 0.79 2.71 0.99 4.31 -1.69 -1.52 0.62 3.96 4.33 0.95 2.79 4.72 1.21 0.73 4.37 delta 15N_ca 3.47 3.36 3.44 3.55 -2.30 0.22 -1.62 -0.06 0.14 2.06 0.34 3.66 -2.34 -2.17 -0.03 3.31 3.68 0.30 2.14 4.07 0.56 0.08 3.72 Spec. No. 25354 25356 25358 25360 25362 25364 25366 25368 25370 25372 25374 25376 25378 25380 25382 25384 25386 25388 25390 25392 25394 25396 25398 c c c c c c From  Stephanie  Hampton  (2010)   ESA  Workshop  on  Best  Practices      
  • 20. Random  stats  output   C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1 Stable Isotope Data Sheet Sampling Site / Identifier: Wash Cresc Lake Sample Type: Algal Date: Dec. 16 Tray ID and Sequence: Tray 004 13 SampleID Weight (mg) 0.98 0.98 0.98 1.01 3.05 3.06 2.91 2.91 3.04 2.95 3.01 3 2.99 2.92 2.9 1.01 ref 0.99 ref 3.04 3.09 3.05 2.98 3.04 0.99 ref ref ref ref ref ALG01 Lk Outlet Alg ALG03 ALG05 ALG07 ALG06 ALG04 ALG02 ALG01 ALG03 ALG07 Lk Outlet Alg ALG06 ALG02 ALG04 ALG05 %C 38.27 39.78 40.37 42.23 1.88 31.55 6.85 35.56 33.49 41.17 43.74 4.51 1.59 4.37 33.58 44.94 42.28 31.43 35.57 5.52 37.90 31.74 38.46 23.78 Don't use - old data Shore -1.26 1.26 Avg Con -27.22 0.32 15 Reference statistics: SD for delta C = 0.07 Position A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 C1 C2 C3 Peter's lab Washed Rocks SD for delta N = 0.15 delta 13C -25.05 -25.00 -24.99 -25.06 -24.34 -30.17 -21.11 -28.05 -29.56 -27.32 -27.50 -22.68 -24.58 -21.06 -29.44 -25.00 -24.87 -29.69 -27.26 -22.31 -27.42 -27.93 -25.09 delta 13C_ca -24.59 -24.54 -24.53 -24.60 -23.88 -29.71 -20.65 -27.59 -29.10 -26.86 -27.04 -22.22 -24.12 -20.60 -28.98 -24.54 -24.41 -29.23 -26.80 -21.85 -26.96 -27.47 -24.63 %N 1.96 2.03 2.04 2.17 0.17 0.92 0.48 2.30 1.68 1.97 1.36 0.34 0.15 0.34 1.74 2.59 2.37 1.07 1.96 0.45 1.36 2.40 2.40 1.17 delta 15N 4.12 4.01 4.09 4.20 -1.65 0.87 -0.97 0.59 0.79 2.71 0.99 4.31 -1.69 -1.52 0.62 3.96 4.33 0.95 2.79 4.72 1.21 0.73 4.37 delta 15N_ca Spec. No. 3.47 25354 3.36 25356 3.44 25358 3.55 25360 -2.30 25362 0.22 25364 -1.62 25366 -0.06 25368 0.14 25370 2.06 25372 0.34 25374 3.66 25376 -2.34 25378 -2.17 25380 -0.03 25382 3.31 25384 3.68 25386 0.30 25388 2.14 25390 4.07 25392 0.56 25394 0.08 25396 3.72 25398 c c c SUMMARY OUTPUT c c Regression Statistics Multiple R 0.283158 R Square 0.080178 Adjusted R Square -0.022024 Standard Error 1.906378 Observations 11 ANOVA c df Regression Residual Total SS MS F Significance F 1 2.851116 2.851116 0.784507 0.398813 9 32.7085 3.634278 10 35.55962 Coefficients Standard Error t Stat P-value Lower 95%Upper 95% Lower 95.0% Upper 95.0% Intercept -4.297428 4.671099 -0.920003 0.381568 -14.8642 6.269341 -14.8642 6.269341 X Variable 1 -0.158022 0.17841 -0.885724 0.398813 -0.561612 0.245569 -0.561612 0.245569 From  Stephanie  Hampton  
  • 21. C:Documents and SettingshamptonMy DocumentsNCEAS Distributed Graduate Seminars[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1 Stable Isotope Data Sheet Sampling Site / Identifier: Wash Cresc Lake Sample Type: Algal Date: Dec. 16 Tray ID and Sequence: Tray 004 Reference statistics: SD for delta Position A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 C1 C2 C3 SampleID Weight (mg) 0.98 0.98 0.98 1.01 3.05 3.06 2.91 2.91 3.04 2.95 3.01 3 2.99 2.92 2.9 1.01 ref 0.99 ref 3.04 3.09 3.05 2.98 3.04 0.99 ref ref ref ref ref ALG01 Lk Outlet Alg ALG03 ALG05 ALG07 ALG06 ALG04 ALG02 ALG01 ALG03 ALG07 Lk Outlet Alg ALG06 ALG02 ALG04 ALG05 13 C = 0.07 %C 38.27 39.78 40.37 42.23 1.88 31.55 6.85 35.56 33.49 41.17 43.74 4.51 1.59 4.37 33.58 44.94 42.28 31.43 35.57 5.52 37.90 31.74 38.46 23.78 SD for delta delta 13C delta 13C_ca -25.05 -24.59 -25.00 -24.54 -24.99 -24.53 -25.06 -24.60 -24.34 -23.88 -30.17 -29.71 -21.11 -20.65 -28.05 -27.59 -29.56 -29.10 -27.32 -26.86 -27.50 -27.04 SampleID -22.68 -22.22 -24.58 -24.12 -21.06 -20.60 Weight (mg) -29.44 -28.98 -25.00 -24.54 -24.87 -24.41 -29.69 %C-29.23 -27.26 -26.80 delta 13C -22.31 -21.85 delta 13C_ca -27.42 -26.96 -27.93 -27.47 -25.09 -24.63 %N delta 15N delta 15N_ca 15 Don't use - old data Shore -1.26 1.26 Avg Con -27.22 0.32 N = 0.15 %N 1.96 2.03 2.04 2.17 0.17 0.92 0.48 2.30 1.68 1.97 1.36 ALG03 0.34 0.15 0.34 2.91 1.74 2.59 2.37 6.85 1.07 1.96 -21.11 0.45 -20.65 1.36 2.40 2.40 0.48 1.17 -0.97 -1.62 Peter's lab Washed Rocks delta 15N delta 15N_ca 4.12 3.47 4.01 3.36 4.09 3.44 4.20 3.55 -1.65 -2.30 0.87 0.22 -0.97 -1.62 0.59 -0.06 0.79 0.14 2.71 2.06 0.99 0.34 ALG05 4.31 3.66 -1.69 -2.34 -1.52 -2.17 2.91 0.62 -0.03 3.96 3.31 4.33 3.68 35.56 0.95 0.30 2.79 2.14 -28.05 4.72 4.07 -27.59 1.21 0.56 0.73 0.08 4.37 3.72 2.30 0.59 -0.06 Spec. No. 25354 25356 25358 25360 25362 c 25364 25366 c 25368 25370 25372 25374 c ALG07 25376 25378 c 25380 c 25382 3.04 25384 25386 25388 33.49 25390 -29.56 25392 25394 -29.10 c 25396 25398 1.68 0.79 0.14 SUMMARY OUTPUT ALG06 ALG04 Regression Statistics Multiple R 0.283158 2.95 Square 0.080178 3.01 R Adjusted R Square -0.022024 Standard Error 1.906378 41.17 43.74 Observations 11 -27.32 -27.50 ANOVA -26.86 -27.04 df Regression Residual 1.97 Total 2.71 2.06 Intercept ALG02 ALG01 3 4.51 -22.68 -22.22 MS 2.99 1.59 -24.58 -24.12 Significance F SS F 1 2.851116 2.851116 0.784507 0.398813 9 32.7085 3.634278 1.3610 35.55962 0.34 0.15 0.99 4.31 -1.69 ALG03 ALG07 2.92 2.9 4.37 -21.06 -20.60 33.58 -29.44 -28.98 0.34 -1.52 1.74 0.62 -0.03 Coefficients Standard Error t Stat P-value Lower 95%Upper 95% Lower 95.0% Upper 95.0% 0.34 -2.34 -2.17 -4.297428 4.671099 3.66 -0.920003 0.381568 -14.8642 6.269341 -14.8642 6.269341 X Variable 1 -0.158022 0.17841 -0.885724 0.398813 -0.561612 0.245569 -0.561612 0.245569 4.00 3.00 2.00 1.00 Series1 -35.00 -30.00 -25.00 -20.00 -15.00 -10.00 -5.00 0.00 0.00 -1.00 -2.00 -3.00 From  Stephanie  Hampton  
  • 22. What  should   you  be  doing?   From  Flickr  by  whatthefeed  
  • 23. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 24. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 25. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 26. 2.  Data  collection  &  organization   Create  unique  identifiers   From  Flickr  by  zebbie   •  Decide  on  naming  scheme  early   •  Create  a  key   •  Different  for  each  sample   From  Flickr  by  sjbresnahan  
  • 27. 2.  Data  collection  &  organization   Standardize   •  Consistent  within  columns   – only  numbers,  dates,  or  text   •  Consistent  names,  codes,  formats   Modified  from  K.  Vanderbilt     From  Pink  Floyd,  The  Wall      themurkyfringe.com  
  • 28. 2.  Data  collection  &  organization   Standardize   •  Reduce  possibility   of  manual  error  by   constraining  entry   choices   Excel  lists   Google  Docs   Data   Forms   validataion   Modified  from  K.  Vanderbilt    
  • 29. 2.  Data  collection  &  organization       Create  parameter  table   Create  a  site  table   From  doi:10.3334/ORNLDAAC/777   From  doi:10.3334/ORNLDAAC/777   From  R  Cook,  ESA  Best  Practices  Workshop  2010  
  • 30. 2.  Data  collection  &  organization   What  about   databases?   A  relational  database  is      A  set  of  tables    Relationships  among  the  tables    A  language  to  specify  &  query  the  tables     A  RDB  provides    Scalability:  millions+  records    Features  for  sub-­‐setting,  querying,  sorting    Reduced  redundancy  &  entry  errors     From  Mark  Schildhauer  
  • 31. 2.  Data  collection  &  organization   You  should  invest  time  in  learning  databases  if      your  data  sets  are  large  or  complex     Consider  investing  time  in  learning  databases  if    your  data  are  small  and  humble    you  ever  intend  to  share  your  data    you  are  <  30  years  old   From  Mark  Schildhauer  
  • 32. 2.  Data  collection  &  organization    Use  descriptive  file  names  *   •  Unique   •  Reflect  contents   Bad:        Mydata.xls    2001_data.csv    best  version.txt   Better:  Eaffinis_nanaimo_2010_counts.xls   Study   organism   Site   name   Year   What  was   measured     *Not  for  everyone   From  R  Cook,  ESA  Best  Practices  Workshop  2010  
  • 33. 2.  Data  collection  &  organization   Organize  files    logically   Biodiversity   Lake   Experiments   Biodiv_H20_heatExp_2005to2008.csv   Biodiv_H20_predatorExp_2001to2003.csv   …   Field  work   Biodiv_H20_PlanktonCount_2001toActive.csv   Biodiv_H20_ChlAprofiles_2003.csv   …     Grassland   From  S.  Hampton  
  • 34. 2.  Data  collection  &  organization    Preserve  information   •  Keep  raw  data  raw   •  Use  scripts  to  process  data    &  save  them  with  data   Raw  data  as  .csv   R  script  for  processing  &   analysis    
  • 35. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 36. 3.  Quality  control  and  quality  assurance   Before  data  collection   From  Flickr  by  StacieBee   •  Define  &  enforce  standards   •  Assign  responsibility  for  data  quality  
  • 37. 3.  Quality  control  and  quality  assurance   After  data  entry   •  Check  for  missing,  impossible,   anomalous  values   •  Perform  statistical  summaries     •  Look  for  outliers     60   50   40   30   20   10   0   0   10   20   30   40  
  • 38. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 39. 4.  Metadata  basics   Why  are  you   What  is   promoting   metadata?   Excel?  
  • 40. 4.  Metadata  basics   From  Flickr  by    //ichael  Patric|{        Metadata  =  Data  reporting     WHO  created  the  data?   WHAT  is  the  content      of  the  data  set?   WHEN  was  it  created?   WHERE  was  it  collected?   HOW  was  it  developed?   WHY  was  it  developed?  
  • 41. 4.  Metadata  basics   •  Scientific  context   •  •  Environmental  conditions  during  collection   •  Where  collected  &  spatial  resolution  When   collected  &  temporal  resolution   •  The  name(s)  of  the  data  file(s)  in  the  data   set   What  instruments  (including  model  &   serial  number)  were  used   Standards  or  calibrations  used   Name  of  the  data  set   •  What  data  were  collected   •  Digital  context   Scientific  reason  why  the  data  were   collected   •  •  •  •  Date  the  data  set  was  last  modified   •  Example  data  file  records  for  each  data   type  file   •  Pertinent  companion  files   •  •  Information  about  parameters   List  of  related  or  ancillary  data  sets   How  each  was  measured  or  produced   •  Software  (including  version  number)   used  to  prepare/read    the  data  set   Units  of  measure   •  •  Format  used  in  the  data  set   •  •  •  Data  processing  that  was  performed   •  Precision  &  accuracy  if  known   Personnel  &  stakeholders   •  •  Quality  assurance  &  control  measures   •  Funders   Definitions  of  codes  used   •  Who  to  contact  with  questions   •  Information  about  data   •  Who  collected     •  •  Known  problems  that  limit  data  use  (e.g.   uncertainty,  sampling  problems)     How  to  cite  the  data  set  
  • 42. 4.  Metadata  basics   What  is   metadata?   Select  the  appropriate  standard   •  Provides  structure  to  describe  data   Common  terms    |    definitions    |    language    |    structure   •  Lots  of  different  standards    EML  ,  FGDC,  ISO19115,  DarwinCore,…   •  Tools  for  creating  metadata  files    Morpho  (EML),  Metavist  (FGDC),  NOAA  MERMaid  (CSGDM)        
  • 43. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 44. 5.  Workflows   Workflow:  how  you  get  from  the  raw  data  to  the  final   products  of  your  research     Simple  workflows:  flow  charts   Temperature   data   Salinity                 data   “Clean”  T   &  S  data   Data  import  into  R   Data  in  R   format   Quality  control  &   data  cleaning   Analysis:  mean,  SD   Graph  production   Summary   statistics  
  • 45. 5.  Workflows   Workflow:  how  you  get  from  the  raw  data  to  the  final   products  of  your  research     Simple  workflows:  commented  scripts   •  R,  SAS,  MATLAB   •  Well-­‐documented  code  is…   Easier  to  review   Easier  to  share   Easier  to  repeat  analysis   %   #   $   &  
  • 46. 5.  Workflows   Fancy  Schmancy  workflows:  Kepler   Resulting  output   https://kepler-­‐project.org  
  • 47. 5.  Workflows   Workflows  enable   Reproducibility   Transparency     Executability       From  Flickr  by  merlinprincesse  
  • 48. 5.  Workflows   Minimally:  document  your  analysis      commented  code;  simple  flow-­‐chart     www.littlebytesoflife.com   Emerging  workflow  applications  will…   −  Link  software  for  executable  end-­‐to-­‐end  analysis   −  Provide  detailed  info  about  data  &  analysis   −  Facilitate  re-­‐use  &  refinement  of  complex,  mn:   o ulti-­‐step   o analyses   ng  S aring   omi  sh C −  Enable  efficient  swapping  of  alternative  models  s!     kflow ent & r wo algorithms   uirem req −  Help  automate  tedious  tasks  
  • 49. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 50. 6.  Data  stewardship  &  reuse   From  Flickr  by  greensambaman   The  20-­‐Year  Rule   The  metadata  accompanying  a   data  set  should  be  written  for  a   user  20  years  into  the  future       RULE   (National  Research  Council  1991)  
  • 51. 6.  Data  stewardship  &  reuse   Use  stable  formats      csv,  txt,  tiff   Create  back-­‐up  copies     original,  near,  far   Periodically  test  ability  to  restore  information   Modified from R. Cook  
  • 52. 6.  Data  stewardship  &  reuse   Store  your  data  in  a  repository   Institutional  archive   Ask  a  librarian   Discipline/specialty  archive         Repos  of  repos:   databib.org   re3data.org   From  Flickr  by  torkildr  
  • 53. 6.  Data  stewardship  &  reuse   Practice  Data  Citation   Example:   Sidlauskas,  B.  2007.  Data  from:  Testing  for  unequal  rates  of   morphological  diversification  in  the  absence  of  a  detailed   phylogeny:  a  case  study  from  characiform  fishes.  Dryad  Digital   Repository.  doi:10.5061/dryad.20   Persistent  Unique   Identifier   Allows  readers  to  find  data  products   Get  credit  for  data  and  publications   Promotes  reproducibility   Better  measure  of  research  impact  
  • 54. From  Flickr  by  Big  Swede  Guy   Best  Practices   ent data managem 1.  Planning   2.  Data  collection  &   organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 55. What  is  a  data   management  plan?   A  document  that   describes  what  you  will   do  with  your  data   throughout     the  research  project   From Flickr by Barbies Land
  • 56. From  Flickr  by  401(K)  2013   DMP  for  funders:   A  short  plan  submitted   alongside  grant  applications    An  outline  of     –  what  will  be  collected   –  methods   –  Standards   But they all have –  Metadata   different requirements –  sharing/access   and express them in –  long-­‐term  storage   different ways  Includes  how  and  why  
  • 57. NSF  DMP  Requirements   From  Grant  Proposal  Guidelines:    DMP  supplement  may  include:   1.  the  types  of  data,  samples,  physical  collections,  software,  curriculum   materials,  and  other  materials  to  be  produced  in  the  course  of  the  project   2.   the  standards  to  be  used  for  data  and  metadata  format  and  content  (where   existing  standards  are  absent  or  deemed  inadequate,  this  should  be   documented  along  with  any  proposed  solutions  or  remedies)   3.   policies  for  access  and  sharing  including  provisions  for  appropriate   protection  of  privacy,  confidentiality,  security,  intellectual  property,  or  other   rights  or  requirements   4.   policies  and  provisions  for  re-­‐use,  re-­‐distribution,  and  the  production  of   derivatives   5.   plans  for  archiving  data,  samples,  and  other  research  products,  and  for   preservation  of  access  to  them  
  • 58. From  Flickr  by  OZinOH   The  Data  Management   Planning  Tool   DMPTool       Carly  Strasser      |  @carlystrasser   California  Digital  Library     5  August  2013   ESA  2013  SS  2  
  • 59. From  Flickr  by  dipster1   Toolbox  
  • 60. Write  a  DMP   dmptool.org                     Step-­‐by-­‐step  wizard  for  generating  DMP   create  |  edit  |  re-­‐use  |  share   Free  &  open  to  community    
  • 61. Find  a  repository   Where   should  I  put   my  data?   databib.org  
  • 62. From Flickr by thewmatt Get  help  
  • 63. From  Flickr  by  North  Carolina  Digital   Heritage  Center   Get  help  from  your  library   From  Flickr  by  Madison  Guy  
  • 64. Get  help   Toolbox:    DCXL  blog:  dcxl.cdlib.org  
  • 65. From  Flickr  by  dotpolka   Doing  science  is  a   privilege  –  not  a  right  
  • 66. From  Flickr  by  Michael  Tinkler  
  • 67. From  Flickr  by  mikerosebery    There  is  a  social  contract  of  science:  we   have  an  obligation  to  ensure  dissemination,   validation,  &  advancement.   To  not  do  so  is  science  malpractice.       –  Brian  Hole,  Ubiquity  Press  at  UCL  
  • 68. My  website   Email  me   Tweet  me   My  slides   carlystrasser.net   carlystrasser@gmail.com   @carlystrasser     slideshare.net/carlystrasser