The document discusses transcribing and making accessible the diaries of George Leslie Adkin through crowdsourcing. It describes the process used to digitize the diaries and photographs, create conceptual entities for each diary entry, and link them with related people, places and topics. While crowdsourcing could help with transcription, integrating the data and developing an effective display platform presents challenges given the large scope and complexity of the project. More work is needed to formalize the project and determine the most appropriate roles for crowdsourcing versus in-house work.
Choir attempted beautiful anthem “Oh, Radiant Morn” – made hash
1. “Choir attempted that beautiful
anthem “Oh, Radiant Morn” –
made a hash of it”
Making a hash of the Adkin Diary transcriptions
Adrian Kingston
Collections Information Manager, Digital Assets and Development
Museum of New Zealand Te Papa Tongarewa
@adriankingston
Crowdsourcing for the Digital Humanities and Cultural Heritage Sector
Victoria University of Wellington, 23 April 2013
2.
3. Wed. Apr. 23.
Worked at Swamp–Cow p[addock] fence. Bulliman took
48 heavy fat ewes at 15/-. In evening Father + I drove
down to Levin No L[icense] Democratic Vote Campaign
committee meeting. Father voted to chair + self
appointed secretary. Discussed campaign.
4. Background
George Leslie Adkin; Farmer, photographer, geologist, explorer,
archaeologist, ethnologist.
1 man, 41 diaries, 59 years, over 21,000 days
Thousands of negatives and prints, some albums
Initial deadline, launch of @life100yearsago, a project of
WW100
Did everything ourselves. We resourced most of this project
with a curator (Kirstie Ross) and a monkey with a keyboard
Figure out process (imaging, cropping, loading, transcription
guidelines)
Figure out content (data structure, quirks of Adkin, glossaries
etc.)
Project? What project?
Very early days.
5. Process
Assess album condition
Photograph album pages
Crop pages to days
Create narrative for day
Load “day” images to EMu “day” narrative
Transcribe
Add associated subjects, people, places (from authority files
and controlled vocabularies)
Add context to narrative entries for month
Some parts semi-automated, some completely manual; some
need no special skills, others do
6. Received a letter + referee’s report from Dr
Chilton, Editor “Trans[actions of the] NZ
Inst[itute], on my paper on Tararuas = “my
theories based on too slender evidence and
debatable evidence + also in part erroneous (?
GLA). I decided to withdraw the paper as it is
evidently unsuitable for publication in
“Transactions”
http://collections.tepapa.govt.nz/theme.aspx?irn=4294
7. Framework
Using existing framework; EMu, Collections Online
CIDOC CRM for building and expressing relationships
Days are conceptual entities, not physical. Framework allows
for this
Links to physical entities, diaries, photographs, albums
Links to people, places, topics
However, scale of content is really starting to highlight issues of
display in Collections Online.
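A minimal sketch of the idea on this slide: a "day" as a conceptual entity that links out to physical items and to people, places and topics. The class and field names here are hypothetical, chosen for illustration only; they are not EMu's schema or CIDOC CRM class names. The sample links are read off the 23 April entry quoted earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Authority:
    """A record from an authority file or controlled vocabulary (hypothetical shape)."""
    kind: str  # "person", "place" or "topic"
    name: str


@dataclass(frozen=True)
class PhysicalItem:
    """A physical object a day can point to (hypothetical shape)."""
    kind: str  # "diary", "photograph" or "album"
    label: str


@dataclass
class Day:
    """A day is conceptual, not physical: it only holds text and links."""
    label: str
    transcription: str = ""
    sources: List[PhysicalItem] = field(default_factory=list)
    links: List[Authority] = field(default_factory=list)

    def linked(self, kind: str) -> List[str]:
        """Names of all linked authority records of one kind."""
        return [a.name for a in self.links if a.kind == kind]


# Illustrative data only; the diary label is invented for the example.
day = Day(
    label="Wed. Apr. 23.",
    transcription="Worked at Swamp–Cow p[addock] fence.",
    sources=[PhysicalItem("diary", "example diary volume")],
    links=[
        Authority("person", "Bulliman"),
        Authority("place", "Levin"),
        Authority("topic", "farming"),
        Authority("topic", "politics"),
    ],
)

print(day.linked("place"))
print(day.linked("topic"))
```

Keeping the day separate from the diary page means the same day can link to a diary, a photograph and an album at once, which is the property the framework is being used for.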
8.
9.
10. What we’ve learnt
So much content, so much data
More than just one man’s story, a huge data source on NZ life
So much potential for a number of fields of research
Our existing data structure works really well
Transcription only one part
To get most out of the content, need the links, need the rich
conceptual model
Context needed, or at least useful, for the reader
Existing display not so hot
Enlivens the collection, a step beyond just digitisation and
transcription
11.
12. Issues
Size of the project is daunting, but the transcription seems
manageable to do through crowdsourcing
There are a number of existing platforms that look great, but
how to deal with matching to our structure, vocabularies,
authorities?
Could use automated in-text authority mining, but would need
to then match back to authorities and structure
Beyond scope of crowdsourcing? But does that diminish the
value of the “data”?
Could come later though, are we getting too hung up on
quality?
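A rough sketch of what "in-text authority mining" followed by matching back could look like, assuming the authority file can be exported as a simple term list. The record ids and the shape of the `AUTHORITIES` table are invented for illustration; a real authority file would need variant names, abbreviations and disambiguation, which this does not attempt.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical authority export: term -> (kind, record id). The ids are invented.
AUTHORITIES: Dict[str, Tuple[str, str]] = {
    "Levin": ("place", "place-1"),
    "Kirkcaldie + Stains": ("organisation", "org-1"),
    "Maud": ("person", "person-1"),
}


def mine_authorities(text: str) -> List[Tuple[str, str, str]]:
    """Return (term, kind, record id) for each authority term found in the text."""
    hits = []
    for term, (kind, record_id) in AUTHORITIES.items():
        # Match whole terms only, ignoring case; re.escape handles the "+".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append((term, kind, record_id))
    return hits


entry = (
    "In evening Father + I drove down to Levin No L[icense] "
    "Democratic Vote Campaign committee meeting."
)
print(mine_authorities(entry))
```

Exact matching like this only finds terms already in the authority file. New names, and bare references such as "Father", would still need a person to resolve them, which is where the quality question comes in.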
13.
14. Our potential crowd
By starting it ourselves, we have some content available to
promote the crowdsourcing.
Already had unsolicited volunteers
The content is interesting: NZ history, early 20th Century
courtship, farming, geology, religion, war, politics, weather…
Horowhenua locals interested in local history, and one of their
famous sons
History students and educators
Bring students closer to primary material, work with cursive
handwriting, highlight the importance of accuracy in relation to
data, personal biography
Learning history through a first-hand account
Plan B is to do the war years with interns
15. We decided to go into town to lunch so I piloted
the party to Kirkcaldie + Stains where we had a
good dinner… Will wanted to know if one could
have all the courses for 2/-. I told him it was not
customary to indulge in more than six but that if
he wanted to tackle the lot we would have to
leave him at it. Olive ordered dishes she did not
want + Alice also got a bit mixed up.
http://collections.tepapa.govt.nz/theme.aspx?irn=4095
16. Where to
Can’t do with existing (human) resource
Transcription only one part of the project
Need to figure out what parts can be crowdsourced, what can’t
Transcription will enable adding the contextual and semantic
relationships and links to other sources
Options for automating the above
Or, with a focussed crowd and a finite project, maybe we don’t need
a new platform, could provide training and use existing tools
Can’t crowdsource the display platform. Or can we? Crowdfund it?
Make data available for analysis, visualisation, research, fun
Need to formalise the project
Lots to figure out
17. In evening rode down to see Maud – showed
her some books but there seemed to be a lack
of sympathy between us + the evening was a
failure.
http://collections.tepapa.govt.nz/theme.aspx?irn=4080
18. See
Adkin diaries on Collections Online
@adkin_diary on Twitter
@life100yearsago on Twitter
Questions?
Kirstie Ross, Curator Modern New Zealand
Adrian Kingston, Collections Information Manager
Philip Edgar, Manager Digital Collections and Access