9. Prevent Errors Before Collection
• Define & enforce standards
• Formats
• Codes
• Measurement units
• Metadata
• Assign responsibility for data quality
From Flickr by StacieBee
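One way to enforce such standards is to check every record against them before it is accepted. The Python sketch below is illustrative only: the `STANDARD` dictionary, the field names, the site codes, and the unit are all assumptions, not part of the original slides.

```python
import re

# Hypothetical project standard; every name and value here is an assumption.
STANDARD = {
    "date_format": re.compile(r"^\d{4}-\d{2}-\d{2}$"),  # dates as YYYY-MM-DD
    "site_codes": {"A1", "A2", "B1"},                   # allowed site codes
    "temp_unit": "C",                                   # agreed measurement unit
}

def check_record(record):
    """Return a list of standard violations for one record (empty if none)."""
    problems = []
    if not STANDARD["date_format"].match(record.get("date", "")):
        problems.append("date not in YYYY-MM-DD format")
    if record.get("site") not in STANDARD["site_codes"]:
        problems.append("unknown site code")
    if record.get("temp_unit") != STANDARD["temp_unit"]:
        problems.append("temperature unit is not C")
    return problems

good = {"date": "2012-06-01", "site": "A1", "temp_unit": "C"}
bad = {"date": "6/1/12", "site": "a1", "temp_unit": "F"}
print(check_record(good))
print(check_record(bad))
```

Returning a list of problems, rather than stopping at the first one, lets the person responsible for data quality see every violation in a record at once.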
10. Prevent Errors Before Collection
• Comments & notes fields: allow handling of unexpected situations
• Allow “other” values
From Flickr by Olga Nohra
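A minimal sketch of how an "other" value and a notes field can work together, so that an unexpected observation is kept rather than forced into a wrong category. The `habitat` field, its codes, and the `make_record` helper are hypothetical.

```python
# Hypothetical controlled vocabulary for a "habitat" field.
HABITAT_CODES = {"forest", "grassland", "wetland", "other"}

def make_record(habitat, notes=""):
    """Build a record; unexpected habitats are stored as "other" plus a note."""
    if habitat in HABITAT_CODES:
        return {"habitat": habitat, "notes": notes}
    # Keep the unexpected value in the notes instead of discarding it
    # or forcing it into an existing category.
    detail = f"habitat recorded as '{habitat}'"
    combined = f"{detail}; {notes}" if notes else detail
    return {"habitat": "other", "notes": combined}

print(make_record("forest"))
print(make_record("parking lot", notes="site under construction"))
```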
11. Minimize Errors During Collection
• Eliminate manual data entry
• Design data storage well
• Minimize repeat entry
• Use consistent terminology
• Atomize data
From Flickr by Butal Lee
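To illustrate atomizing data, the sketch below splits a combined entry into one value per field. The "Genus species count" input layout and the `atomize` helper are assumptions made for this example.

```python
def atomize(entry):
    """Split a combined 'Genus species count' string into separate fields."""
    genus, species, count = entry.split()
    return {"genus": genus, "species": species, "count": int(count)}

row = atomize("Quercus alba 12")
print(row)
```

Once each value sits in its own field, it can be sorted, filtered, and validated independently.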
12. Minimize Errors: Use databases
You should invest time in learning databases if
• your data sets are large or complex
Consider investing time in learning databases if
• your data are small and humble
• you ever intend to share your data
• you are < 30 years old
From Mark Schildhauer
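One reason databases help minimize errors is that they can reject bad values at entry time. The sketch below uses Python's built-in `sqlite3` module as a stand-in for any database; the table, column names, site codes, and temperature range are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE observation (
        id      INTEGER PRIMARY KEY,
        site    TEXT NOT NULL CHECK (site IN ('A1', 'A2', 'B1')),
        temp_c  REAL CHECK (temp_c BETWEEN -50 AND 60)
    )
""")
conn.execute("INSERT INTO observation (site, temp_c) VALUES (?, ?)", ("A1", 21.5))

# A likely typo (215.0 instead of 21.5) violates the CHECK constraint.
rejected = False
try:
    conn.execute("INSERT INTO observation (site, temp_c) VALUES (?, ?)", ("A1", 215.0))
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM observation").fetchone()[0]
print(rejected, row_count)
```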
14. Minimize Errors: Tools
Databases
• FileMaker Pro (Mac)
• Access (PC)
• LibreOffice Base
Spreadsheets
• Google Forms
• LibreOffice Calc
• Lists & data validation in Excel
16. Detect Errors After Collection
Look for outliers
Goal is not to eliminate outliers but to identify potential data contamination
Strategies
• Normal probability plots
• Regression
• Scatter plots
• Maps
[Figure: plot with x-axis from 0 to 40 and y-axis from 0 to 60]
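As a rough sketch of the regression strategy, the code below fits a least-squares line and flags points whose residuals are unusually large. The data are made up for illustration, and the threshold `k` (two residual standard deviations) is an arbitrary assumption. Points are flagged for inspection, not removed.

```python
from statistics import mean, pstdev

def flag_outliers(xs, ys, k=2.0):
    """Fit y = a + b*x by least squares; return indices with large residuals."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)  # residuals have mean zero by construction
    return [i for i, r in enumerate(residuals) if abs(r) > k * spread]

# Made-up data: a roughly linear trend with one suspicious value at x = 25.
xs = [0, 5, 10, 15, 20, 25, 30, 35, 40]
ys = [1, 8, 15, 23, 30, 60, 45, 52, 61]
suspects = flag_outliers(xs, ys)
print(suspects)
```

A flagged index is only a prompt to check field notes or instrument logs; the value may still be a genuine observation.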
18. Handle Errors
• Case-by-case decision
• Flag them?
• Remove them?
• Fix them?
• Document all changes (readme.txt, scripts)
• Keep original data separate
• Use scripts (e.g., raw data as .csv, R script for QA/QC)
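The slide names an R script for QA/QC; the sketch below shows the same workflow in Python instead. It reads raw CSV text, leaves it untouched, writes a separate copy with a flag column, and keeps a log of what was flagged. The column names, the valid range, and the sample rows are assumptions.

```python
import csv
import io

# Made-up raw data; in practice this would be a read-only raw .csv file.
RAW_CSV = "site,temp_c\nA1,21.5\nA2,215.0\nB1,19.0\n"

def qaqc(raw_text, low=-50.0, high=60.0):
    """Return (cleaned CSV text, change log) without modifying the raw text."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    log = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        ok = low <= float(row["temp_c"]) <= high
        row["flag"] = "" if ok else "suspect"  # flag the value, do not delete it
        if not ok:
            log.append(f"line {line_no}: temp_c={row['temp_c']} flagged as suspect")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["site", "temp_c", "flag"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), log

cleaned, changes = qaqc(RAW_CSV)
print(cleaned)
print(changes)
```

Because every change is made by the script, rerunning it on the raw file reproduces the cleaned file, and the log can be saved alongside readme.txt as documentation.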