This document summarizes a presentation about developing and deploying open source tools at the University of Virginia Library. It discusses the benefits of using open source software over proprietary software, including lower costs and greater flexibility. The presentation describes how UVa Library developed open source tools like Fedora, Blacklight, and Hydra to meet its needs. It emphasizes collaborating with other institutions on open source projects to share costs and benefits. The summary highlights how UVa built its institutional repository Libra using Hydra to provide open access to scholarly works while gathering user requirements and customizing the tools.
Developing and Deploying Open Source in the Library: Hydra, Blacklight, and Beyond
1. Developing and Deploying Open Source Tools in the Library: Hydra, Blacklight, and Beyond. Julie Meloni, University of Virginia Library. NYPL Brown Bag Lunch Talk // 26 August 2011. jcmeloni@virginia.edu // @jcmeloni
2. The Million-Dollar Question: Do you spend hundreds of thousands of dollars on proprietary software (licensing, maintenance contracts, support contracts, etc.) that performs one set of tasks, or do you hire a few good software developers and library professionals who can lead the design of applications and platforms specific to your needs?
3. The Answer … The people cost more. The people can also do more, especially when committed to open source wherever possible. In turn, other institutions benefit as well. This approach will not work for every institution. This approach does work for University of Virginia Library.
4. Problems with Proprietary Software: Expensive in terms of licensing, hardware, and maintenance. Vendor lock-in: dependencies make switching costs too great.
5. Problems with Open Source Software: Expensive in terms of human resources (learning, collaboration, and commitment to a community take a lot of time!). No vendor support. Reliance on internal resources and a community that may have different goals than your own.
6. Where Does that Leave Us? OSS is no panacea. Know what you're getting into. Philosophies are difficult to implement wholesale. Implementations must serve the greater goals of the library. The process of testing, implementing, and testing again, and working with a community to achieve goals, takes time but is worth the effort for stability and scalability.
7. OSS at UVa Library: Fedora (Flexible Extensible Digital Object Repository Architecture): a solid, modular architecture on which to build repositories, archives, and related systems; a 2001 Mellon grant to Cornell & UVa enabled development. Blacklight: creating, implementing, and maintaining an open source OPAC (& related collaborations); developed originally within the Scholars’ Lab and UVa Library as a skunkworks project. Embracing the Hydra philosophy: no single application can meet the full range of needs, and no single institution can handle development and maintenance; this requires a common repository infrastructure; flexible, atomic data models; and modular services and configurable components.
8. Up Next… The Hydra Project: what we do, what we get out of it, and what we contribute back to the community. How using an open source discovery interface has allowed us to quickly address the needs of our institution and its patrons. How working with open source has allowed more Library staff outside of the development team to have a say in the design, development, and deployment of our products.
9. The Hydra Project: Collaborative effort between University of Virginia, Stanford University, University of Hull, Fedora Commons/DuraSpace, and MediaShelf. Working group created in 2008 to fill a need to develop an end-to-end, flexible, extensible, workflow-driven Fedora application kit. Two parts: a Technical Framework and a Community Framework. No direct funding of the Hydra Project itself.
10. Hydra Project Assumption #1: No single application can meet the full range of digital asset management needs, but there are shared primitive functions. Deposit: simple or multipart objects, singly or in bulk. Manage: an object’s content, metadata, and permissions. Search: both full text and fielded search in support of user discovery and administration. Browse: objects sequentially by collection, attribute, or ad-hoc filtering. Deliver: objects for viewing, downloading, and dissemination through user and machine interfaces.
11. Hydra Project Response: One body, many heads. Hydra is designed to support tailored applications and workflows for different content types, contexts, and interactions by building from a common repository infrastructure; flexible, atomic data models; and modular services and configurable components.
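As a rough illustration of "one body, many heads", the sketch below models the five shared primitives (deposit, manage, search, browse, deliver) as one small in-memory repository, with a single tailored "head" built on top of it. This is not Hydra's or Fedora's API: the names `DigitalObject`, `Repository`, `ThesisHead`, and `submit_thesis` are hypothetical, and Python is used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A stored item: content plus metadata and permissions (hypothetical model)."""
    object_id: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    permissions: set = field(default_factory=set)


class Repository:
    """The shared 'body': an in-memory stand-in for a common repository layer."""

    def __init__(self):
        self._objects = {}

    def deposit(self, obj):
        """Store a new object under its identifier."""
        self._objects[obj.object_id] = obj

    def manage(self, object_id, **metadata):
        """Update metadata fields on an existing object."""
        self._objects[object_id].metadata.update(metadata)

    def search(self, term):
        """Case-insensitive substring search across all metadata values."""
        term = term.lower()
        return [o for o in self._objects.values()
                if any(term in str(v).lower() for v in o.metadata.values())]

    def browse(self, attribute, value):
        """List objects whose metadata attribute equals the given value."""
        return [o for o in self._objects.values()
                if o.metadata.get(attribute) == value]

    def deliver(self, object_id):
        """Return the raw content for viewing or download."""
        return self._objects[object_id].content


class ThesisHead:
    """One 'head': a tailored deposit workflow built on the shared primitives."""

    def __init__(self, repository):
        self.repository = repository

    def submit_thesis(self, object_id, pdf_bytes, author, title):
        obj = DigitalObject(object_id, pdf_bytes,
                            {"type": "thesis", "author": author, "title": title})
        self.repository.deposit(obj)
        return obj
```

A second head (say, for datasets) would reuse the same `Repository` and add only its own workflow, which is the point of sharing the body.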
12. Hydra Technical Framework: Fedora as repository layer for persisting and managing digital objects. An abstraction layer sits between Fedora and the Hydra heads, insulating applications from changes in the repository structure. ActiveFedora is a Ruby gem for creating and managing objects in Fedora. Solr indexes provide fast access to information. Blacklight for faceted searching, browsing, and tailored views on objects. The Hydra-Head plugin itself: a Ruby on Rails library that works with ActiveFedora to provide create, update, and delete actions against objects in the repository.
13. Hydra Project Assumption #2: No single institution or provider can resource the development or maintenance of a full set of solutions for the same needs. Problems with proprietary software include expense in terms of licensing, hardware, and maintenance, plus potential vendor lock-in. Problems with open source software include the expense of human resources and the lack of vendor support, which causes a reliance on internal resources and a community that may have different goals than your own.
14. Hydra Project Response: “If you want to go fast, go alone. If you want to go far, go together.” Hydra Steering Group: collaborative roadmapping, resource allocation and coordination, governance of the technology core. Hydra Managers: shape and fund work, commission “heads”, create functional requirements and specifications, UI/UX design, documentation, training, evangelism. Hydra Developers: define technical architecture, commit code, integration and release, testing, testing, testing.
15. Hydra Community Framework: Conceived and executed as a collaborative, open source effort from the start. An open architecture, with many contributors to the core. Collaboratively built “solution bundles” that can be adapted and modified to suit local needs. Hydra heads as reference implementations. The ultimate objective of the Hydra Project is to effectively intertwine its technical and community threads of development, producing a community-sourced, sustainable application framework. http://projecthydra.org/
16. Great, But… WHAT DID YOU BUILD??? We built Libra: an unmediated, self-deposit, institutional repository for scholarly material. http://libra.virginia.edu/
17. Why Did We Need Libra? In February 2010, the University of Virginia Faculty Senate passed an Open Access resolution: all faculty are encouraged to “reserve a nonexclusive, irrevocable, non-commercial, global license to exercise any and all rights under copyright relating to each of her or his scholarly articles in any medium, and to authorize others to do the same.” NSF requirements for preservation and access of data used in or resulting from researchers’ grant-funded projects. Discovery, access, and preservation of our students’ electronic theses and dissertations.
18. How Did We Get Libra? Given institutional commitment to these University-wide problems, resources were allocated from both the University Library and Information Technology & Communication. UVa was already committed to the Hydra Project, and to assisting in the development of an end-to-end, flexible, extensible, workflow-driven Fedora application kit. The solution to our problems clearly required such an application toolkit… good thing the Hydra Project had one in development. Hydra offerings ARE NOT turnkey institutional repository solutions, but frameworks for depositing, managing, searching, browsing, and delivering digital content. We built on that.
19. Libra Development Principles: Our solution should be unmediated; provide sustainable access to and discovery of scholarly materials; enable collection of depositor-designated metadata; and manage depositor-designated access permissions. Work with internal stakeholders to gather requirements and user stories, as this is their repository. Work with Hydra partners to move the common code base forward while still developing our own application in our own branch.
20. The Result: A Highly Customized Application http://libra.virginia.edu
25. Open Source in Practice: Blacklight is an open source discovery interface that can be used as a front end for a digital repository, or as a single-search interface to aggregate digital content that would otherwise be siloed. Customizable and removable for ultimate flexibility. Many core developers are part of the Hydra Project (Bess Sadler, now at Stanford; Bob Haschert at UVa; etc.). Continued development by a core group of committers governed by developer norms. http://projectblacklight.org/
29. Good, Broad Requirements Gathering: Functional requirements define the functionality of the system in terms of inputs, behaviors, and outputs. What is the system supposed to accomplish? Functional requirements come from stakeholders (users), not (necessarily) developers: stakeholder request -> feature -> use case -> business rule. Developers can/should/will help stakeholders work through functional requirements. Functional requirements should be written in a non-technical way.
30. Non-Technical Folk Write Epics and Stories: An epic is a long story that can be broken into smaller stories. It is a narrative; it describes interactions between people and a system: WHO the actors are, WHAT the actors are trying to accomplish, and the OUTPUT at the end. The narrative should be chronological; be complete (the who, what, AND the why); NOT reference specific software or other tools; and NOT describe a user interface.
31. Non-Technical Folk Write Epics and Stories: Stories are the pieces of an epic that begin to get to the heart of the matter. Still written in non-technical language, but move toward a technical structure. Given/When/Then scenarios: GIVEN the system is in a known state, WHEN an action is performed, THEN these outcomes should exist. EXAMPLE: GIVEN one thing AND another thing AND yet another thing, WHEN I open my eyes, THEN I see something BUT I don't see something else.
32. Actual Story Example: Scenario: User attempting to add an object. GIVEN I am logged in AND I have selected the “add” form AND I am attempting to upload a file, WHEN I invoke the file upload button, THEN validate file type on client side AND return alert message if not valid AND continue if valid; THEN validate file type on server side AND return alert message if not valid AND finish process if valid.
33. Writing Code From Stories: Developers are involved at the story level: writing stories, validating stories, throwing rocks at stories, and getting at the real nitty-gritty of the task request. Moving from story to actual code: stories written in step definitions become Ruby code; tests are part of this code; code is tested from the time it is written.
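To make the move from story to code concrete, here is a minimal sketch of how the server-side half of the file upload scenario might turn into a function plus tests that follow the Given/When/Then shape. The deck says the team's step definitions became Ruby code; this sketch uses Python purely for illustration, and everything in it is an assumption: the `ALLOWED_EXTENSIONS` list, the `validate_file_type` and `upload` functions, and the session dictionary are hypothetical and do not come from Libra.

```python
import os

# Hypothetical allow-list; the deck does not say which file types Libra accepts.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".csv"}


def validate_file_type(filename):
    """Server-side check: return (is_valid, alert_message)."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in ALLOWED_EXTENSIONS:
        return True, None
    return False, f"File type '{extension or 'none'}' is not allowed."


def upload(session, filename):
    """Perform the WHEN step and report the THEN outcomes as a dict."""
    if not session.get("logged_in"):
        return {"status": "rejected", "alert": "You must be logged in."}
    is_valid, alert = validate_file_type(filename)
    if not is_valid:
        return {"status": "rejected", "alert": alert}
    return {"status": "accepted", "alert": None}


def test_valid_file_is_accepted():
    # GIVEN I am logged in AND I am attempting to upload a file
    session = {"logged_in": True}
    # WHEN I invoke the file upload
    result = upload(session, "thesis.PDF")
    # THEN the server-side check passes AND the process finishes
    assert result == {"status": "accepted", "alert": None}


def test_invalid_file_returns_alert():
    # GIVEN I am logged in AND I am attempting to upload a file
    session = {"logged_in": True}
    # WHEN I invoke the file upload with a disallowed type
    result = upload(session, "malware.exe")
    # THEN an alert message is returned AND the process stops
    assert result["status"] == "rejected"
    assert ".exe" in result["alert"]
```

Each test reads as one scenario, so a non-technical author can still check that the code matches the story they wrote.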
34. Never Stop Communicating: Watch out for the butterfly effect: when one change in a complex system has large effects elsewhere, through a sensitive dependence on initial conditions. Epics and stories do not have to be golden, but changes should be carefully considered. Developers illuminate the potential effects of changes. The cycle of epic, story, coding begins again; this includes any story that touches the changed story.
35. We Never Think We’re Finished: Each release comes with a list of known issues and potential areas of improvement. We go through the cycle of epic, story, coding/testing, user testing, story editing, coding/testing (etc.) again and again. Products are organic and grow upward and outward… but if you want to lop off part of that tree, expect there will be systematic changes; developers are there to ensure the tree doesn’t fall on your house.
36. We Never Ignore the User Work closely with the UX team to ensure that wireframes and prototypes are put in front of users before we take action. Patrons vet the stakeholder requests just like developers do, but from a user’s perspective rather than a technical one. In some notable instances, patron desires have differed tremendously from what stakeholders believe they want. The story of integrating a discovery service: how and why we didn’t blend results. User testing produced clear requests, different from librarian assumptions. Open source flexibility allowed us to go from requirements gathering to user testing to requirements changing to development and deployment in four months.
37. We Will… NEVER return to using proprietary software and solutions (when we can help it). ALWAYS try to find an open source solution, or build one if it doesn’t exist. SHARE everything we possibly and legally can, with anyone who wants to use it. HOPE that any of you considering the use of open source versus proprietary software will consider it and ask questions…