Note: This presentation includes notes with some of the slides. Click on the "Speaker Notes" tab below the slides to see them or download the presentation to view them in PowerPoint.
Tightening materials budgets and physical space constraints have intensified the focus on usage data for decision making purposes. For print books, circulation is the most trusted indicator of use. Circulation data can drive decisions about everything from acquisitions and weeding to staffing levels and hours of operation. Modern integrated library systems record and maintain a rich and detailed array of circulation data, yet decisions are often based upon only the most rudimentary measures, such as total circulation, or average circulation per volume.
Taking advantage of the time and location data within circulation transactions, and combining it with demographics, acquisitions, and holdings data can provide a basis for more sophisticated analysis of book circulation that is better suited to strategic planning needs. This presentation examines some of the issues with the gathering and analysis of circulation data and also looks at different ways to measure circulation. It includes examples of time series and ratio analysis that can be applied to circulation data, and how the data and analysis can be used for decision making.
Strategic Use of Circulation Data: Moving Beyond the Basics
1. Strategic Use of Circulation Data
Moving Beyond the Basics
Richard Entlich
Collection Analyst
Cornell University Library
Charleston Conference, November 3, 2011
2. Value of Circulation Data in Collection
Management
• Recognized many years ago
• First proposed as a means to
help cope with overcrowded
stacks
• Move little used materials to
remote storage facility
Photo credit: zsrlibrary http://www.flickr.com/photos/zsrlibrary/5352092180
3. Quotes from an Early Proponent of the
Strategic Use of Circulation Data
“Completeness can no longer be the ideal of any library.”
“All signs indicate that the flood of printed material has by no
means reached its height.”
“It is not a good use of the educational resources of an institution
to enlarge its library building to make new space for books in
use, when books that are very seldom used can be stored in
inexpensive buildings on cheap land.”
4. Source
Charles William Eliot (President of Harvard University)
“The Division of a Library into Books in Use, and Books
Not in Use, with Different Storage Methods for the Two
Classes of Books.”
Library Journal 27, no. 7:51-56, 1902
5. More Wisdom from Charles Eliot
“The means of just discrimination between books in
use and books not in use are not easy to discern or to
apply; but I maintain, nevertheless, that the search for
these means should be diligently prosecuted, and that
every reasonable suggestion of means of
discrimination deserves careful attention.”
6. What is use and how do we measure it?
• We recognize two common classes of use for print
materials
• In-building use (“browsing”)
• Out of building use (“circulation”)
Photo credit: Brendan Murphy
Photo credit: zenobia_joy http://www.flickr.com/photos/sekihan/6255392036/ http://www.flickr.com/photos/29501884@N04/4552647815/
7. Browsing
• Evidence of browsing may be hard to detect
• As an activity, it is difficult to measure accurately and
consistently
• Measurement of browsing, if done at all, can vary
widely, even between libraries within an institution or
in the same library over time
8. Circulation
• The “gold standard” for measuring use of books
• It too suffers from a lack of standardization
• What transactions are included? (e.g., ILL)
• What are loan periods for different user groups and
materials?
• What materials don’t circulate at all?
• Is there a limit on renewals?
See “A Look at Circulation Statistics” by Jeff Luzius
Journal of Access Services, Vol. 2(4) 2004, pp.15-22
9. Measuring Use of Books: The Bottom Line
• Circulation is the best measure we have, though it has
limitations
• There is some evidence that circulation and browsing
are well-correlated
• Comparing circulation between institutions may be of
little value, except in the broadest terms
• Comparing circulation within an institution should be
meaningful, as long as policies and procedures are
fairly consistent across units and over time
10. A Challenge from Charleston Past
• QUESTIONS FOR LIBRARIANS WHO DO
COLLECTION DEVELOPMENT
• Do you get or use any circulation data in making
decisions about what books to buy?
• If you do get circulation data, does it specifically
pinpoint what's happening in your subject area?
• Do you have circulation data time series for your
subject(s)?
• Do you see any need for this information?
(From Charles Hamaker, Charleston Conf., 1994)
11. More Wisdom from Charles Hamaker
“We somehow assume the circulation librarian is
‘responsible’ for circulation data. Few of us have figured
out how to use these [automated library systems] to see
if they can help us make better decisions about what to
buy, or even to see if they tell us what kind of a job we
have done with what we did buy.”
12. Other Potential Strategic Applications of
Circulation Data
• For what titles should we buy multiple copies?
• What materials should be owned vs borrowed vs
rented?
• What loan policies and periods should apply to what
materials?
• In what library building should certain materials be
housed?
13. Current Challenges Facing Libraries Suggest
We Must Adopt a More Quantitative Approach
• Increased scrutiny of library budgets and demands for
accountability
• Widening gap between the universe of published
output and collection budgets
• Physical space constraints
• The shift from speculative (“just-in-case”) to demand-
driven (“just-in-time”) acquisitions
14. Early Generation Library Automation Systems
• Browse and circulation totals at the title level
• No individual transaction data
• Transactions were not date and time stamped
• Little or no user data
• Inability to distinguish internal library use or ILL from
community use
• Limited management reporting capability, often
requiring advanced programming skills
15. Recent Generation Library Automation
Systems
• Detailed circulation transaction records retained
• Time/location of all charges and discharges
• Patron groups (distinguish ‘pseudo-patrons’)
• Renewal counts
• More flexible (though not necessarily easy to
use) reporting functions
16. Basic Data Requirements for Circulation
Analysis
1. A set of bibliographic data pertaining to print
monographs owned by the library, usually subject to
certain selection criteria
2. The set of circulation transactions that correspond to
the items in set one
17. Caveats: Bibliographic Records
• Limited to books?
• Limited to print?
• When first available to circulate?
• Part of a circulating collection?
• Lost or withdrawn?
• Do records support analysis?
18. Caveats: Circulation Records
• Completed vs in process transactions
• Variable loan periods, esp. short-term reserves
• What to do about renewals?
• Identification of ‘pseudo-patron’ transactions
• Handling of ILL transactions
19. “Beyond Basic” Circulation Analysis
• Trend or Time Series Analysis
• Ratio Analysis
• Integration of non-ILS data
20. Time Series Study: Time to First Circulation
• Basic Recipe
• Gather data on a set of books acquired in a similar time
frame
• Gather circulation records for the above set
• Isolate the set of first circulations for each
• Compute time (in months) from acquisition to first circulation
• Analyze as desired
Photo credit: Patrick Gage Kelley http://www.flickr.com/photos/sekihan/6255392036/
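The recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the speaker notes say the original work was done in Microsoft Access, and the item IDs, dates, and function names below are hypothetical. The sketch also assumes that items which never circulated are left out of the average.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: item id -> (LC top class, date acquired)
items = {
    "i1": ("Q", date(2001, 3, 15)),
    "i2": ("Q", date(2001, 6, 1)),
    "i3": ("D", date(2001, 1, 10)),
    "i4": ("D", date(2001, 9, 5)),  # never circulates in this sample
}

# Hypothetical circulation transactions: (item id, charge date)
charges = [
    ("i1", date(2002, 1, 20)),
    ("i1", date(2001, 9, 15)),
    ("i2", date(2002, 6, 1)),
    ("i3", date(2004, 1, 10)),
]


def months_between(start, end):
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


def first_circulations(charges):
    """Isolate the earliest charge date for each item."""
    first = {}
    for item_id, charged in charges:
        if item_id not in first or charged < first[item_id]:
            first[item_id] = charged
    return first


def avg_months_to_first_circ(items, charges):
    """Average months from acquisition to first circulation, by LC class.

    Assumption: items with no circulation are excluded from the average.
    """
    by_class = defaultdict(list)
    for item_id, charged in first_circulations(charges).items():
        lc_class, acquired = items[item_id]
        by_class[lc_class].append(months_between(acquired, charged))
    return {c: sum(m) / len(m) for c, m in by_class.items()}


result = avg_months_to_first_circ(items, charges)
```

Whether to count never-circulated items (for example, by assigning them the full observation window) is a design choice that changes the averages considerably, so it should be stated alongside any table produced this way.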
21. LC Top Class - Description | Avg Months to First Circ | Total Items
A - General Works | 43.4 | 57
V - Naval Science | 43.2 | 60
Z - Library Science | 33.8 | 300
C - Auxiliary Sciences of History | 33.2 | 299
F - History: United States Local and Latin America | 32.8 | 814
D - History: General and Outside the Americas | 32.5 | 4459
K - Law | 32.3 | 1524
B - Philosophy, Psychology, Religion | 30.4 | 3859
U - Military Science | 28.6 | 280
P - Language and Literature | 28.6 | 8135
N - Fine Arts | 27.5 | 2585
J - Political Science | 25.5 | 1214
M - Music | 23.7 | 750
G - Geography | 23.2 | 1134
H - Social Sciences | 22.3 | 7138
L - Education | 21.9 | 718
E - History: United States | 21.0 | 925
R - Medicine | 18.8 | 1059
S - Agriculture | 18.3 | 1012
T - Technology | 18.2 | 1967
Q - Science | 15.7 | 3169
22. Time Series Study: Cumulative Volume
Circulation
• Recipe
• Gather data on a set of books acquired in a similar time
frame
• Gather circulation records for the above set
• Isolate the set of first circulations for each
• Count the number of first circulations for each year
• Calculate cumulative totals for each year
• Analyze as desired
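As a rough sketch of this recipe, the following Python counts first circulations per year and turns them into a cumulative percentage of the acquired set. The data and the function name are hypothetical, and the input is assumed to be the output of the "isolate first circulations" step, with None marking volumes that never circulated.

```python
from collections import Counter

# Hypothetical: year of first circulation for each item in one acquisition
# cohort; None means the item has not circulated.
first_circ_year = {"i1": 2001, "i2": 2002, "i3": 2002, "i4": 2004, "i5": None}


def cumulative_pct_circulated(first_circ_year, start_year, end_year):
    """Cumulative % of the cohort that has circulated at least once, by year."""
    total = len(first_circ_year)
    per_year = Counter(y for y in first_circ_year.values() if y is not None)
    running = 0
    curve = {}
    for year in range(start_year, end_year + 1):
        running += per_year.get(year, 0)  # first circulations in this year
        curve[year] = 100.0 * running / total
    return curve


curve = cumulative_pct_circulated(first_circ_year, 2001, 2004)
```

Running the same function separately for each subject grouping gives one curve per group, which is the form needed to compare how quickly different subjects level off.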
23.
24. Source: Use of Library Materials: The University of Pittsburgh Study by Kent, et al. 1979
25. Time Series Study: Circulation Consistency
• Recipe
• Gather data on a set of books acquired in a similar time
frame
• Gather circulation records for the above set
• Group records by year of circulation
• Cross tabulate desired parameters
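A minimal Python sketch of the grouping and cross-tabulation steps follows. It is an illustration only: the sample transactions, the class assignments, and both function names are hypothetical. Each item gets a row of ones and zeros, one per year, and the row sum is the number of distinct years in which the item circulated.

```python
from collections import defaultdict


def circ_year_matrix(charges, years):
    """Map each item to a 0/1 row: 1 if it circulated at all in that year."""
    years_by_item = defaultdict(set)
    for item_id, year in charges:
        years_by_item[item_id].add(year)  # repeat charges in a year collapse
    return {
        item_id: [1 if y in circ_years else 0 for y in years]
        for item_id, circ_years in years_by_item.items()
    }


def average_circ_years_by_class(matrix, item_class):
    """Average number of circulating years per item, grouped by LC class."""
    totals = defaultdict(list)
    for item_id, row in matrix.items():
        totals[item_class[item_id]].append(sum(row))
    return {c: sum(v) / len(v) for c, v in totals.items()}


# Hypothetical transactions: (item id, year of charge)
charges = [("i1", 2001), ("i1", 2001), ("i1", 2003), ("i2", 2002), ("i3", 2002)]
item_class = {"i1": "Q", "i2": "K", "i3": "Q"}  # hypothetical LC top classes
years = [2001, 2002, 2003]

matrix = circ_year_matrix(charges, years)
averages = average_circ_years_by_class(matrix, item_class)
```

Note that in this sketch items with no circulation at all never appear in the matrix, so the averages describe only items that circulated at least once.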
27. LC Top Class - Description | Average of Circ Years | Total Items
Q - Science | 3.58 | 3169
S - Agriculture | 3.48 | 1012
E - History: United States | 3.25 | 925
G - Geography | 3.18 | 1134
M - Music | 3.09 | 750
R - Medicine | 3.08 | 1059
T - Technology | 3.06 | 1967
N - Fine Arts | 3.04 | 2585
J - Political Science | 2.95 | 1214
H - Social Sciences | 2.86 | 7138
L - Education | 2.80 | 718
P - Language and Literature | 2.75 | 8135
B - Philosophy, Psychology, Religion | 2.57 | 3859
F - History: United States Local and Latin America | 2.46 | 814
C - Auxiliary Sciences of History | 2.37 | 299
D - History: General and Outside the Americas | 2.28 | 4459
Z - Library Science | 2.23 | 300
U - Military Science | 2.21 | 280
A - General Works | 2.19 | 57
K - Law | 2.09 | 1524
V - Naval Science | 1.77 | 60
28. Ratio Analysis: Circulation by Language
• From a study of historical circulation of books in a
particular LC subclass, by language
• At first glance, the data doesn’t seem very dramatic
29. [Chart: % of Historical Circulation for non-English Volumes with the Largest Holdings. Languages: French, German, Russian, Spanish, Italian, Portuguese, Greek. Vertical axis: 0.0% to 2.5%.]
[Chart: % of Volume Circulation for non-English Volumes with the Largest Holdings. Same languages. Vertical axis: 0.0% to 6.0%.]
Whether plotted as historical or volume circulation, several languages show similar levels of use, relative to the whole.
30. But the ratios tell a different story:
An “Enthusiasm Gap”
Language | %Holdings / %HistoricalCirc | %Holdings / %VolumeCirc
French | 4.50 | 1.65
German | 7.80 | 2.24
Russian | 35.05 | 7.94
Spanish | 9.58 | 2.89
Italian | 9.15 | 2.46
Portuguese | 7.64 | 2.26
Greek | 4.39 | 1.28
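To make the ratio concrete, here is a small Python sketch of how a %Holdings / %Circulation figure could be computed from raw counts. The counts are invented for illustration and are not the data behind the table; the function name is also hypothetical.

```python
def holdings_to_circ_ratio(holdings, circs):
    """For each language, divide its share of holdings by its share of
    circulation. A value well above 1 means the language is held far more
    heavily than it is used."""
    total_holdings = sum(holdings.values())
    total_circs = sum(circs.values())
    ratios = {}
    for language, held in holdings.items():
        pct_holdings = held / total_holdings
        pct_circ = circs.get(language, 0) / total_circs
        # No circulation at all: the ratio is undefined, so report None.
        ratios[language] = pct_holdings / pct_circ if pct_circ else None
    return ratios


# Hypothetical counts of volumes held and of circulations, by language
holdings = {"English": 800, "French": 100, "Russian": 100}
circs = {"English": 950, "French": 40, "Russian": 10}

ratios = holdings_to_circ_ratio(holdings, circs)
```

In this invented example French and Russian have identical holdings, but the Russian ratio is four times the French one because Russian volumes account for a quarter as much circulation. That is the kind of gap the raw percentages alone tend to hide.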
31. [Chart: % Holdings to % Total Circulation Ratio for non-English Volumes with the Largest Holdings. Languages: French, German, Russian, Spanish, Italian, Portuguese, Greek. Vertical axis: 0 to 40.]
[Chart: % Holdings to % Volume Circulation Ratio for non-English Volumes with the Largest Holdings. Same languages. Vertical axis: 0 to 9.]
32. External Data:
“Circulation Snapshot”
• A frozen moment in a
continuous stream of data
• Combines ILS data with human resources data for a much
richer demographic analysis of users
• Profiles the users of print
• Identifies relationships between users and materials
• impact of characteristics like status, department, field of study, and
college affiliation on borrowing habits
• breakdown of subjects, languages, dates of publication by user
groups
Photo credit: jeff_golden http://www.flickr.com/photos/jeffanddayna/5067383625/
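The idea of combining ILS data with human resources data can be sketched as a simple join on a patron identifier. Everything in this Python example is an assumption made for illustration: the field names, the patron IDs, and the shape of both data sets are hypothetical and do not describe the actual snapshot process.

```python
from collections import Counter

# Hypothetical ILS snapshot: items on loan at one moment,
# as (patron id, LC top class of the item)
open_charges = [
    ("p1", "Q"),
    ("p1", "T"),
    ("p2", "P"),
    ("p3", "Q"),
    ("p9", "D"),  # patron with no matching HR record
]

# Hypothetical HR extract: patron id -> demographic attributes
hr_records = {
    "p1": {"status": "Faculty", "department": "Physics"},
    "p2": {"status": "Grad", "department": "English"},
    "p3": {"status": "Grad", "department": "Physics"},
}


def charges_by_department_and_class(open_charges, hr_records):
    """Count charged items for each (department, LC class) pair."""
    counts = Counter()
    for patron_id, lc_class in open_charges:
        record = hr_records.get(patron_id, {})
        department = record.get("department", "Unknown")
        counts[(department, lc_class)] += 1
    return counts


snapshot = charges_by_department_and_class(open_charges, hr_records)
```

Keeping unmatched patrons under an "Unknown" label, rather than dropping them, makes it easy to see how much of the snapshot the external data fails to cover.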
33. Some Strategic Applications of Snapshot Data
• For Unit Library Review process
• From which depts/fields do borrowers of libraries come?
• Which libraries do members of affiliated depts/fields use?
• For Print Collection Usage Task Force review process
• LC class user analysis by department and graduate field
• Department/grad field usage breakdown by LC class
• Circulation time and renewals by patron status
• Other potential uses
• User breakdown by publication date (for off-site transfer
decision-making)
• Inform individual subject selectors about usage in their
domain
34. For a detailed description of the circulation snapshot
process and its use, see
Richard Entlich, “Focus on Circulation Snapshots: A Powerful
Tool for Print Collection Assessment” in
Proceedings of the 2010 Library Assessment Conference,
October 24–27, 2010, Baltimore, Maryland, pp. 703-13.
http://libraryassessment.org/bm~doc/proceedings-lac-2010.pdf
If a book circulates a lot, studies suggest it will be browsed a lot, and vice versa, which gives more credence to using circulation as a stand-in for all usage, although there is undoubtedly subject-specific variation.
How do we make good use of the circulation data that’s available?
We have data, but there is considerable variation in quality and completeness. Library automation systems have evolved significantly.
Even if your institution uses a “modern” ILS that retains detailed circulation data, getting data suitable for collection management decisions isn’t necessarily just a matter of “pushing a few buttons.” The fact that we can now quickly gather a large volume of data and manipulate it in various ways does not assure its accuracy or trustworthiness. In some respects, modern circulation analysis has more potential pitfalls than analysis done in the pre-automation era.
1) It can be surprisingly difficult to limit a set of bibliographic records to print books. There is no single cataloging designation for what we commonly think of as print books. Helpful filtering mechanisms:
• Bibliographic Format (positions 6 and 7 in the MARC Leader)
• Form of Item (008 - Fixed-Length Data Elements, position 23)
• GMD (General Material Designator, field 245 h)
• Call number prefix (852 k)
2) For meaningful circulation analysis, we have to know whether an item can circulate, and if so, for how long it has been available to do so. Needed for reliable analysis:
• An indicator of whether or not an item circulates
• A rough indicator of when the item became available to circulate (not the publication date!)
• A mechanism to filter out records for lost, missing, or withdrawn items that may still be in the system
3) There are certain kinds of parameters we may like to analyze for which standard MARC records may not be suitable. For example, we might want to analyze circulation by publisher, but the publisher field (260 b) is not authority controlled, leading to inconsistent spelling and abbreviation, which makes accurate analysis difficult.
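As an illustration of the first two filtering mechanisms, the following Python sketch tests the MARC Leader and 008 field as plain strings. It is a heuristic under stated assumptions, not a complete test: it treats Leader/06 = "a" (language material) with Leader/07 = "m" (monograph) as a book, and accepts only the 008/23 Form of Item codes blank, "d" (large print), and "r" (regular print reproduction) as print. The function and helper names are hypothetical, and local practice (GMD, call number prefixes, uncoded 008 values) would still need to be checked.

```python
# 008/23 Form of Item codes treated as print in this sketch:
# blank = none of the other forms, d = large print,
# r = regular print reproduction.
PRINT_FORMS = {" ", "d", "r"}


def is_print_book(leader, field_008):
    """Heuristic print-monograph test on raw Leader and 008 strings."""
    if len(leader) < 8 or len(field_008) < 24:
        return False  # too short to hold the positions we need
    record_type = leader[6]       # Leader/06: type of record
    bib_level = leader[7]         # Leader/07: bibliographic level
    form_of_item = field_008[23]  # 008/23: form of item (Books)
    return (
        record_type == "a"
        and bib_level == "m"
        and form_of_item in PRINT_FORMS
    )


def make_008(form_of_item):
    """Hypothetical helper: a 40-character 008 with only position 23 set."""
    return "0" * 23 + form_of_item + "0" * 16


book_leader = "00000nam a2200000 a 4500"    # language material, monograph
serial_leader = "00000nas a2200000 a 4500"  # language material, serial

print_book = is_print_book(book_leader, make_008(" "))
online_book = is_print_book(book_leader, make_008("o"))
print_serial = is_print_book(serial_leader, make_008(" "))
```

Microform ("a", "b", "c") and electronic ("o", "q", "s") codes are rejected here simply because they are absent from PRINT_FORMS, so extending or tightening the definition of "print" is a one-line change.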
1) Dramatic differences in loan period among materials can skew analysis. For example, short-term reserves (items on reserve that circulate for just a few hours at a time and are required reading for classes with hundreds of students) cannot be directly compared to items that are loaned for six months or a year, and which may take 2-3 weeks to transfer from one user to another, even if recalled.
2) Should renewals be counted as separate circulations? This may depend on local policy. Is there a limit on the number of times a borrower may renew?
3) Circulation that is not the result of user demand (placement on a new book shelf, digitization, conservation treatment) should be filtered out.
4) The ILS may reflect only one side of interlibrary loan transactions, usually the less interesting side from a collection building perspective. It will typically include loans of the library’s holdings to users at other institutions, but not our users’ borrowing from other institutions, which indicates demand for content that is unavailable in our own collection. Often this latter data resides in separate ILL systems, and may or may not be compatible with ILS data.
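The first three caveats amount to filtering rules that can be applied before any counting is done. The Python sketch below shows one way to express them. The patron group codes, the 24-hour threshold, and the record layout are all hypothetical; real values depend on how a given ILS is configured and on local policy.

```python
from dataclasses import dataclass


@dataclass
class Charge:
    item_id: str
    patron_group: str  # ILS patron group code
    loan_hours: float  # length of the loan period
    is_renewal: bool


# Hypothetical pseudo-patron codes for internal library activity
PSEUDO_PATRON_GROUPS = {"NEWBOOKS", "DIGITIZATION", "CONSERVATION"}
# Hypothetical cutoff separating short-term reserve loans from regular loans
MIN_LOAN_HOURS = 24


def demand_driven(charges, count_renewals=False):
    """Keep only transactions that plausibly reflect user demand."""
    kept = []
    for charge in charges:
        if charge.patron_group in PSEUDO_PATRON_GROUPS:
            continue  # internal processing, not user demand
        if charge.loan_hours < MIN_LOAN_HOURS:
            continue  # short-term reserve loan, not comparable
        if charge.is_renewal and not count_renewals:
            continue  # renewals counted only if local policy says so
        kept.append(charge)
    return kept


charges = [
    Charge("i1", "FACULTY", 24 * 365, False),
    Charge("i1", "FACULTY", 24 * 365, True),   # renewal of the loan above
    Charge("i2", "UNDERGRAD", 2, False),       # two-hour reserve loan
    Charge("i3", "DIGITIZATION", 24 * 30, False),
    Charge("i4", "GRAD", 24 * 180, False),
]

kept_default = demand_driven(charges)
kept_with_renewals = demand_driven(charges, count_renewals=True)
```

Making the renewal rule a parameter keeps that policy decision visible, so the same data can be analyzed both ways and the effect of the choice compared.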
Once you’ve gone to the effort to remove spurious and irrelevant data, and to avoid making apples-to-oranges comparisons, what can you do? We can do a lot more with this data than compile raw total circulation statistics, or compute average circulation per volume or per collection for inclusion in the library’s annual report, which is often all that’s done with available circulation data. Following are some examples of “beyond basic” circulation analysis in three categories. The intent is to show some of what’s possible to do with the circulation data from a modern ILS, but not to explain exactly how to conduct each type of analysis. Most of the work described here was done using the Microsoft Access relational database.
This analysis is based on nearly 93,000 print monographs acquired during calendar year 2001. Volumes published prior to 1996 were removed in order to limit the analysis to material both acquired and published in a similar time frame. Circulation for internal library purposes and very short-term loans were filtered out. STM materials were the quickest to reach first circulation. Some subject areas take 2-3 times as long, on average, to get noticed and circulate. This data might be taken into consideration in devising schedules for the migration of books from the book stacks to remote storage, and for other purposes.
Notice that many of the initial steps for this analysis are identical to the previous one.
There is extreme variability by subject in the percentage of volumes that circulated over a ten-year period following acquisition. Some of the variability may result from differences in how materials in different subjects are used. For example, certain law materials may be more likely to be consulted in-house for a quick citation, rather than to circulate, compared to materials in other subjects. The subjects that were fastest to reach first circulation (LC top classes Q, R, S, and T) were not always those with the greatest percentage uptake at the volume level over the decade. Also noteworthy is that while all of the curves have the same characteristic shape (rapid initial rise followed by gradual flattening), some subjects show more of an upward slope leading into 2011 than others. For example, US History (E) and Fine Arts (N) have significantly more upward trajectory in 2011 than Science (Q) or Technology (T). This data has many potential decision-making applications, depending on the goals of the collection program. These could include anything from reallocating resources for purchase of materials, to reallocating resources for promotion and discovery of materials.
The general shape of the curve for cumulative use of library materials acquired in the same time period has been noted in previous large-scale circulation studies. This same pattern of use was observed in the Kent study at the University of Pittsburgh for materials acquired in 1969 and then tracked for the next six years. Using predictive statistical analysis tools, the Kent study estimated that the likelihood that a volume that has not yet circulated will ever circulate drops to near zero at around 12-13 years after acquisition.
For this analysis, records are first grouped by year of circulation (this ensures that there is only one record for each year that an item circulated, regardless of how many times it circulated in that year). They are then cross-tabulated against desired parameters. In this example, it’s LC Top Class against year of circulation, populated with the count of circulation years. Since there is at most one record for any year of circulation, all of the values will be either a one or a zero.
The resulting table reveals many different patterns of circulation over time, some of which are highlighted here. Analysis of the common characteristics of titles exhibiting each pattern could provide valuable insights into the circulation behavior of different types of books.
By aggregating according to a particular parameter and taking the average of Total Circ Years in each class, we can profile the consistency of circulation by a variety of measures. In this example, LC top class is used again.
Ratio analysis is another “beyond basic” technique. At first glance, this data doesn’t seem to tell us much, other than that use of English language materials in this subject is responsible for the bulk of circulation and of the volumes that circulated, and that many of the non-English languages had similar levels of circulation, when compared to the whole.
Holdings relative to either circulation measure tell a different story. There is more enthusiasm for acquisition relative to circulation for one language in particular.
Regardless of which type of circulation one looks at, Russian is an outlier in this group. It would be up to the selector to decide whether this merited an adjustment in the selection profile.
Finally, data in the ILS can be enhanced by combining it with data from other sources.