The Natural History Open Data Challenge @ OTA16
1.
2. Diverse collections spanning space and time
Challenge of scale: >80 million specimens!
Challenge of speed (digitising within a lifetime)
Ambitious digitisation programme (DCP)
Institutional policy: “open by default”
3.
4. Higher Classification
Scientific name: Thymelicus lineola (Ochsenheimer, 1808)
Family: Hesperiidae
Location
Locality: Tilbury Docks
State/province: England
Country: United Kingdom
Continent: Europe
Decimal latitude: 51.4605
Decimal longitude: 0.3449
Collection Event
Recorded by: T G. Howarth; Howarth
Collection date: 31 / 07 / 1938
Most iCollections specimens will have ~30 fields containing data
(over 100 different fields across all collections)
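To make the record above easier to work with in code, here is a minimal Python sketch that stores it as a dictionary and converts the text fields to typed values. The keys are the labels shown on the slide; the dictionary layout and the `parse_record` helper are assumptions for illustration, not the data portal's actual schema or API.

```python
from datetime import datetime

# Field labels and values copied from the slide; the dict layout is an assumption.
record = {
    "Scientific name": "Thymelicus lineola (Ochsenheimer, 1808)",
    "Family": "Hesperiidae",
    "Locality": "Tilbury Docks",
    "State/province": "England",
    "Country": "United Kingdom",
    "Continent": "Europe",
    "Decimal latitude": "51.4605",
    "Decimal longitude": "0.3449",
    "Recorded by": "T G. Howarth; Howarth",
    "Collection date": "31 / 07 / 1938",
}


def parse_record(raw):
    """Return a copy of the record with typed values (hypothetical helper)."""
    parsed = dict(raw)
    parsed["Decimal latitude"] = float(raw["Decimal latitude"])
    parsed["Decimal longitude"] = float(raw["Decimal longitude"])
    # Drop the spaces around the slashes, then parse as day/month/year.
    compact = raw["Collection date"].replace(" ", "")
    parsed["Collection date"] = datetime.strptime(compact, "%d/%m/%Y").date()
    # Several collectors may share one field, separated by ';'.
    parsed["Recorded by"] = [
        name.strip() for name in raw["Recorded by"].split(";") if name.strip()
    ]
    return parsed


parsed = parse_record(record)
```

Splitting "Recorded by" keeps both spellings of the collector's name, so any merging of name variants remains a separate, explicit step.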
There are some issues…
(where is H. M. Edelsten!?)
7. Potential Challenges
How did collecting effort change over time?
Which collector collected from the most distinct localities? Can we make a ranking table and mash up the data with Wikipedia or other sources?
What can we learn about the collectors – who travelled the furthest or most regularly?
Were most specimens collected in rural areas? Is there collection bias in particular counties?
How can we make the data more attractive to different audiences?
How could we display the data in more engaging or informative ways?
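As a starting point for some of the questions above, here is a rough Python sketch of three building blocks: counting specimens per year (collecting effort over time), ranking collectors by number of distinct localities, and a great-circle distance for estimating how far collectors travelled. The flat record layout, the key names, and the function names are assumptions; only the first toy record comes from the slides, and the others are invented placeholders.

```python
from collections import Counter, defaultdict
from math import asin, cos, radians, sin, sqrt

# Toy records: the first is from the slides, the rest are invented placeholders.
records = [
    {"collector": "T G. Howarth", "locality": "Tilbury Docks",
     "year": 1938, "lat": 51.4605, "lon": 0.3449},
    {"collector": "Collector A", "locality": "Locality 1",
     "year": 1938, "lat": 51.0, "lon": 0.0},
    {"collector": "Collector A", "locality": "Locality 2",
     "year": 1950, "lat": 52.0, "lon": 0.0},
    {"collector": "Collector A", "locality": "Locality 2",
     "year": 1950, "lat": 52.0, "lon": 0.0},
]


def specimens_per_year(rows):
    """Count specimens per collection year as a proxy for collecting effort."""
    return Counter(row["year"] for row in rows)


def rank_by_distinct_localities(rows):
    """Return (collector, distinct locality count) pairs, highest count first."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["collector"]].add(row["locality"])
    pairs = [(name, len(places)) for name, places in seen.items()]
    # Sort by count descending, then by name so ties are stable.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth of radius 6371 km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


ranking = rank_by_distinct_localities(records)
per_year = specimens_per_year(records)
```

Collector names would need cleaning before a ranking like this is trustworthy, since one person can appear under several spellings in the data.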
Thanks for joining us today – to my knowledge this is the Natural History Museum’s first hackathon based on its specimen data!
80M awesome objects - aim to get them done in under a lifetime
Default policy is openness - data and images going on the portal
Hopefully 3D at some point!