In this very short presentation I aim to give a very brief history of the progress of resource discovery in the archives domain; to talk about The National Archives' experience with using linked data; and to talk about some possible future directions.
As this is a mixed audience of librarians, archivists and curators, it may be helpful just to start with a few points about the nature of archival catalogues and how they operate in an online world.
In the beginning was the National Register of Archives (NRA), founded in 1945 as a paper-based national union catalogue of manuscripts, held in London and indexed by the ‘creators’ of manuscripts. It now holds information on over 300,000 collections. Its indexes were computerised from the 1970s and given additional place-name and subject access points; it was mounted online in the 1990s; linking to online catalogues began in 2005; and it was adapted as an ISAAR-compliant name authority file in 2007.
In 1998 the National Council on Archives published its seminal report, Archives Online – the creation of a National Archives Network. It articulated the concept (which remains a valid objective) of a single online point of access from which it would be possible to search and browse all the available catalogue descriptions of UK archives, linked to a name authority file. It identified the major tasks as retroconversion of existing paper-based catalogues and creating the technical infrastructure. The report recognised that funding silos might mean a range of different projects took this vision forwards, but that adherence to some simple rules on interoperability would enable them to be linked or joined up in future.
As a result, over the last decade or so, many flowers have bloomed. Some, like A2A and SCAN, have not taken on new content for some years, while others continue in active development and are indeed represented at this conference. There must be a concern, however, in the current funding environment, about how sustainable services can be when they depend on project or renewable funding, and about whether their funders will reward their continuing fulfilment of a core information function as well as supporting continuing development.
About five years ago there was a prevalent view that the future lay in each repository hosting and managing its own data. Considerable investment has been made by local authorities and universities in making this possible, using two main commercial products, CALM and ADLIB, and a variety of other commercial and bespoke approaches. My view, and I recognise that this is contentious, is that the results have been rather disappointing. Some of the reasons for this are on the screen. And again, there must be a question about whether, in the current funding environment, the range of repositories providing their own catalogues will continue to grow, and even whether those which have them will continue to afford the relevant licences and investment to maintain them.
I want to turn now to The National Archives’ own work with linked data. As an organisation we have a commitment to the principles of open data. We are the arm of the UK government responsible for implementing the PSI directive, and the UK is widely viewed across Europe as the most enthusiastic advocate of open data. We have seen a range of initiatives by the current government to take this agenda further: data.gov.uk and now proposals for a Public Data Corporation. We are fortunate in having, in the person of John Sheridan, one of the pioneers and greatest advocates of linked data, certainly in the UK and possibly internationally. He has built up a team which has delivered the legislation.gov.uk site, of which TNA is extremely proud. It has made possible a revolution in the accessibility of primary and secondary legislation, both in its original enacted form and as amended. It is possible to see the state of the legislation at any particular date in the past. Linked data has been critical in making this possible. John’s work has prompted us to explore how linked data can work in other contexts. We are producing a linked data version of PRONOM, which enables matching of the file format definitions it contains with those in similar registries across the world. And we are exploring the application of linked data to resource discovery.
The context for this is the beta launch of an improved resource discovery system, initially focusing on the Catalogue and some digitised resources, as our “Discovery” system. We are planning to explore the use of linked data in the form of the “Open Annotations” model, for linking catalogue records and related user-generated content (UGC). Our ambition is to enable UGC to enrich the catalogue but at the same time to make it easy to see at a glance what is authoritative TNA-sourced catalogue data and what is possibly less authoritative UGC.
In 2011 we will be seeking views on the future development of the NRA, with a renewed appetite for sector leadership. We aim to move to a new infrastructure platform in 2012-14 which will radically enhance the technical and collaborative possibilities. We are willing to consider rebuilding the data structure to facilitate new ways of working, and we will explore the potential of linked data, web crawling and other approaches to extending the effective data content. We are considering offering a hosting service for repository catalogues that allows you to edit them remotely, thus potentially removing the need for smaller services to build an online catalogue at all. And we are interested in enabling crowdsourcing, and possibly linking to online resources like Wikipedia, as solutions to rapidly building name authority content.
So, tell us what you want the NRA to do for you!
I hope there is enough there to stir up debate! Thank you.