2. What is Visual Analytics (VA)?
• Analytic reasoning enabled through
interactive data visualizations
– Explore data through the visual
– Synthesize large amounts of data, often from
various sources
– Discover relationships and relevancies amongst
data entities in a non-linear manner
3. “Detecting the expected,
and discovering the unexpected”
– Jim Thomas, Founding NVAC Director
Background photo: http://www.flickr.com/photos/luc/5418037955/
4. Storytelling with data isn’t just about the end
product…
- Involves exploring your data
- Uncovering new insights by interacting with
your data: building, questioning and changing
your visualizations
This slide was added to the presentation after the Summit to
cover points from my speaking notes.
5. VA Application for Library Data
• Analyzed circulation and in-house use data by
branch at UBC Library during an 8-month period
• Datasets used were not open, but could be…
– All anonymized and aggregated
• Project started in Dr. Lemieux’s VA class
(LIBR514F) @ SLAIS
• Afterwards, I incorporated VA into a UBC
Library stakeholder report for the In-House
Use pilot
6. VA Application for Library Data
• 10 UBC branches participated in the In-House
Use pilot – one branch logged upwards of
7000 instances of in-house use in just one
month
• Before this pilot, UBC Library was not
gathering in-house use in a manner that lent
itself to analysis
• Lots of potential for new discovery, and to
better understand how the collection is used
7. Application of VA
Analytic Tasks
1. Generate an overview of circulation and in-
house use
2. Examine in-house use by Library of Congress
class and subclass
3. Examine class-level in-house use based on
total circulation
8. VA Tool
Tableau
• Data visualization software that works well
with tabular data
• Interactive dashboard lets you drag and drop
to change your analysis of the data
• Easy to use and not limited
• Tableau = amazing.
*Tableau authorized the use of screenshots from their desktop program for this presentation.
9. Task 1: In-House Use and Circulation Overview
Circulation = Blue
In-House Use = Orange
10. Task 1: Area Chart
- This view displays peaks and valleys in the
data (higher and lower periods of in-house
use)
- Easy to compare rise and fall of use between
the two data types
- Reveals that by tracking in-house use, the
Library was able to capture an additional 25%
more use
- Difficult to compare totals in this visual…
12. Task 1: Continuous Line Graph
- In this visual, it is easier to compare in-house
to circulation use by week
- The cases where in-house use is greater than or
close to circulation totals are more evident
- This screenshot also shows you the dashboard
in Tableau, and how you can adjust the
measures and dimensions of your visual
16. Task 3: Continuous Line Graph
- My last analytic task was to visualize in-house
use based on whether an individual item has
also circulated (since 2004, when the current
ILS was implemented)
- From this visual, we can see that most
instances of in-house use were of items that
have little or zero circulation (reference and
journal use was removed from this visual)
18. Insight from VA Application applied…
http://api.imapbuilder.net/1.0/map/20873/?s=a7723831f962cd3c27aedb04f7b2c27c
- The heat map visual shows in-house use
mapped over a library floor plan
- The darker the shade of red, the greater the
number of instances of in-house use
- Hover over the heated areas for a display of the
percentage of in-house use that involves items
that have not circulated
- This is a sample of the areas with the greatest in-
house use, and does not reflect all in-house use
20. Visual Analysis of In-House Use Data
- When in-house use is analyzed alongside
circulation data, and data collected through
qualitative studies such as surveys and focus
groups, the Library can gain a better
understanding of how patrons use both library
materials and the physical library space.
- Library space can be redesigned to help
facilitate how patrons use the resources there.
Last January, Dr. Victoria Lemieux, an archival professor at UBC’s School of Library, Archival and Information Studies (SLAIS), developed and taught a course on information visualization and visual analytics – the first course of its kind taught at SLAIS. Today I’m going to talk a bit about the project I started in Dr. Lemieux’s class and ended up incorporating into my work as a student librarian at UBC. To start, I’ll also talk a bit about the application of visual analytics in general.
Visual Analytics is a method of exploratory analytic reasoning enabled through interactive data visualizations – the creation of the visualization and the analysis of the data go hand-in-hand. The aim is to explore the data through the visual, and to build a series of visualizations that represent the data and any existing structures, patterns, relationships and irregularities. Visual analytics also aims to synthesize information: you start with a visual that provides an overview of your data, then interact with the visualization and zoom into specific data clusters to derive insight.
Through VA, we are detecting the expected, and discovering the unexpected – confirming assumptions we may have, but also discovering new insight into the data. So I see storytelling with data as more than the story you can tell through a final visualization. It is also the exploration of your data, and what you learn through creating clear and insightful visualizations. It includes the unexpected perspectives you unearth in an early visualization that cause you to change your approach, and those visualizations that actually misrepresent your data and push you to alter how you have visualized it.
The datasets I used in my analysis were not open – I used circulation and in-house use data for UBC Library. However, both were sufficiently anonymized and aggregated to protect against ever reconstructing individual patron use – so these datasets represent library data that could be made open. While a student in Dr. Lemieux’s class, I was also working as a student librarian in UBC Technical Services for the Collections Management & Planning Librarian, Doug Brigham. At the time, I was managing the implementation of a new automated in-house use tracker – and I thought the use of visual analysis would be a great way to explore this new (at least, new to UBC) data. Although I started the analysis in class by looking at in-house use for a single branch, I ended up incorporating visual analysis of three branches into the stakeholder report on in-house use that I wrote for the UBC Library community once the pilot was complete. During the pilot, ten branches in the UBC system participated – one branch logged upwards of 7000 instances of in-house use in just one month – so, although the data collected over the 7 months does not constitute big data, it was still a lot of data with several dimensions.
I didn’t approach the data with a hypothesis, but instead set out with 3 analytic tasks: to generate an overview of circulation and in-house use, to examine in-house use by Library of Congress class and subclass, and to analyze in-house use based on total circulation. These three tasks were applied to circulation and in-house use data from the 3 branches that had been participating in the in-house use pilot the longest. In my examples today, I will show the data collected for one branch, between September 2011 and March 2012 – this timeline represents the length of the pilot.
I chose to use Tableau for my visual analysis because it works really well with tabular data. Chad has already talked about Tableau – it really is a fantastic tool. My visualizations are basic. My focus was on exploring the data and creating simple overviews that exposed outliers and provided insight.
I started with an area chart to display the peaks and valleys in the data – higher and lower periods of in-house use. Through this view, it is easy to compare the rise and fall of use between the two types. With the addition of in-house use data, UBC Library was able to track an additional 25% more collection use for this particular branch, and upwards of 30% more use in other branches. I should note that the area chart stacks the circulation data (which is in blue) on top of the in-house data (which is in orange). While this makes similarities in use trends more obvious, it reduces my ability to compare total in-house use with total circulation for each week.
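One way to read the "additional 25%" figure is as in-house use expressed as a share of circulation. The sketch below shows that arithmetic under this assumed reading; the function name and the weekly totals are hypothetical and are not taken from the UBC data.

```python
def additional_use_pct(circulation: int, in_house: int) -> float:
    """In-house use as a percentage of circulation (an assumed reading of 'additional use')."""
    if circulation <= 0:
        raise ValueError("circulation must be positive")
    return 100.0 * in_house / circulation


# Hypothetical weekly totals for one branch: (circulation, in-house use)
weekly = [(400, 90), (520, 140), (280, 70)]

total_circ = sum(circ for circ, _ in weekly)
total_in_house = sum(in_house for _, in_house in weekly)

pct = additional_use_pct(total_circ, total_in_house)
print(f"{pct:.1f}% additional use captured")  # prints "25.0% additional use captured"
```

Comparing the two weekly series side by side in this way is the comparison the stacked area chart makes hard to see.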
Here is my view of the VA tool as the analyst (you can see how I can filter the data, and set and change the dimensions and measures of my visualization) – I have moved to a continuous line graph, with the data collected in 2011 to the left, and data from the first few months of 2012 to the right. In this view, it is easier to compare in-house use to circulation by week, and to take note of the cases where in-house use is greater than or close to the circulation totals (as seen at the start and end of the first term, and over reading week).
For my second task, I looked at in-house use by Library of Congress class to see which classes are more heavily used in the library. Although classes of equal or similar use are obscured at the bottom of the visual, I chose not to change the visualization, because it is the outliers that are of interest. Classes N, Q and T are used the most in-house in this particular branch.
By removing the LC Class dimension and replacing it with the LC Subclass dimension, I am able to zoom in, or drill into a specific subclass – in this case, I have visualized subclass N. As with the previous visualization, I am only interested in the outliers, so the fact that the visual for lower-use subclasses is obscured does not matter. However, if this information were of interest, I could move the subclass to the horizontal axis.
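Outside Tableau, the same class and subclass roll-up can be sketched in a few lines. This is only an illustration, assuming each recorded in-house use carries an LC call number; the helper name and the sample call numbers are hypothetical.

```python
import re
from collections import Counter

# LC call numbers begin with one to three letters: the first letter is the
# class, and the full letter prefix is the subclass.
_PREFIX = re.compile(r"^\s*([A-Z]{1,3})")


def lc_class_and_subclass(call_number: str) -> tuple[str, str]:
    """Return (class, subclass) for an LC call number, e.g. 'NA2750' -> ('N', 'NA')."""
    match = _PREFIX.match(call_number.upper())
    if match is None:
        raise ValueError(f"not an LC call number: {call_number!r}")
    letters = match.group(1)
    return letters[0], letters


# Hypothetical in-house use log: one call number per recorded use.
log = ["NA2750 .C45", "ND553 .M7", "QA76.73 .P98", "N6490 .B3", "TK5105 .S7"]

# Overview by class, then a drill-down into the subclasses of one class.
by_class = Counter(lc_class_and_subclass(cn)[0] for cn in log)
by_subclass_in_n = Counter(
    sub for cls, sub in map(lc_class_and_subclass, log) if cls == "N"
)

print(by_class.most_common())
print(by_subclass_in_n.most_common())
```

Swapping the grouping key from class to subclass mirrors replacing the LC Class dimension with the LC Subclass dimension in the dashboard.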
My last task was to look at in-house use by circulation – I wanted to see what was being used in the library, based on whether it had been checked out as well. What I discovered is that most items being used in-house have not circulated, at least since the introduction of the current ILS in 2004. In this visual, the blue line represents instances of in-house use where the resources used have not been checked out (reference and journal materials were discounted). The lighter blue line represents instances of in-house use where the item was checked out once, and the orange line represents instances where the item has circulated 3 times.
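The split behind this view can be approximated by grouping each instance of in-house use by how often the item has circulated. A minimal sketch, assuming each use record carries the item's checkout count since 2004; the record layout and sample values are hypothetical.

```python
from collections import Counter

# Hypothetical records, one per instance of in-house use:
# (item id, number of checkouts since 2004)
uses = [("i1", 0), ("i2", 0), ("i3", 1), ("i1", 0), ("i4", 3)]


def share_by_circulation(records):
    """Percentage of in-house use instances at each circulation count."""
    counts = Counter(checkouts for _, checkouts in records)
    total = sum(counts.values())
    return {checkouts: 100.0 * n / total for checkouts, n in sorted(counts.items())}


shares = share_by_circulation(uses)
print(shares)  # {0: 60.0, 1: 20.0, 3: 20.0}
```

The share at a circulation count of 0 is the same kind of figure shown when hovering over the floor-plan heat map: the percentage of in-house use involving items that have not circulated.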
This is the data presented before, but it displays the data by LC class and how it has been used in the library, creating a heat-map effect for the areas used the most. Prior to the pilot, the Library did not collect in-house use in a way that lent itself to analysis. Although some branches counted how many books they picked up and re-shelved each day, UBC wasn’t able to say what was being used in the library, where and when. In its mission, UBC Library states that it will value stewardship of collections and meet the learning, research and teaching needs of its community. But the Library does not have limitless space. In-house use data, in combination with circulation data, could provide important insight into what should be kept in the library, what can be moved to onsite storage, and what can be a candidate for offsite storage.
The visualization with the heat map effect made me think about how I could apply the insight I gained from the visualizations in another context. So, I decided to create a heat map of in-house use over a library floor plan. Together with circulation data, and data collected through qualitative studies such as surveys and focus groups, the Library will be able to gain a better understanding of how patrons use both library materials and the physical library space – and library space can be redesigned to help facilitate how patrons use the resources there.