The document discusses developing a scalable strategy for gathering and reporting analytics. It recommends framing important questions, auditing all potential data collection points, and evaluating which data should be analyzed. An example question is tracking undergraduate research downloads to show the impact of new initiatives. While it is not possible to track individual student downloads, demographics can provide some insight. Developing an effective strategy includes prioritizing top questions, mapping related data points, determining which data is relevant, and creating visualizations that tell compelling stories with the data.
1. Separating the Wheat from the Chaff
Developing a Scalable Strategy for Gathering and Reporting Analytics
Suzanna Conrad
suzanna.conrad@csus.edu
Sacramento State
University Library
@tbytelibrarian
5. Scope creep of data – an example
• Goal was to measure every connection with the library where we captured an identifier that linked to a student; then assess student success
• Correlation issues: group study room reservations
• Collection issues: not comprehensive enough
• Analysis issues: too much data
6. Three steps to narrowing down what you want
1. Framing the questions that are important to answer
2. Auditing all potential points where data is collected
3. Evaluating which data should ultimately be considered for analysis and visualization
7. Answering specific questions – an example
• Question: Tracking downloads of undergraduate research to show the impact of new undergraduate research initiatives
• Question: Are students downloading work in the repository?
8. Answering specific questions – an example
Downloads of undergraduate research
• Making sure tracking is happening at the granularity of the type of content
• In this case, including custom dimensions in Google Analytics using Google Tag Manager to inject more metadata
[Screenshot: Downloads by Author in GA]
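As a sketch of what this setup enables: once a custom dimension such as the item's author is attached to download events, an exported events table can be aggregated locally. The column names (`event_action`, `author`) and the sample rows below are assumptions for illustration, not Google Analytics defaults.

```python
# Hypothetical sketch: aggregate download events by a custom "author"
# dimension from a CSV export. Column names are invented for illustration.
import csv
import io
from collections import Counter

def downloads_by_author(csv_text: str) -> Counter:
    """Count download events per author in a CSV event export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Only count rows recorded as downloads, not pageviews etc.
        if row["event_action"] == "download":
            counts[row["author"]] += 1
    return counts

sample = """event_action,author
download,Smith
download,Jones
download,Smith
pageview,Smith
"""
print(downloads_by_author(sample).most_common())  # [('Smith', 2), ('Jones', 1)]
```

The point is that the question ("downloads of undergraduate research, by author") dictates which dimension must be captured up front; without it, no amount of after-the-fact aggregation can recover the answer.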
9. Answering specific questions – an example
Tracking downloads of work by students
• Not possible to track individual downloads by students – you can’t correlate a Google Analytics session to a user without some sort of login somewhere
• BUT, demographics can be enabled
[Screenshot: Demographics in GA]
10. Hardest part: Get people to tell you what questions they want answered
• Who, what, when, where, why, in what way, by what means? (or whichever are relevant)
• What is the end game?
• If the analytics are in your “favor,” what do you want them to say?
11. Three steps to narrowing down what you want
1. Framing the questions that are important to answer
2. Auditing all potential points where data is collected
3. Evaluating which data should ultimately be considered for analysis and visualization
12. Auditing all potential points where data is collected
Example Question: When is the library busiest?
• Gate counts
• Service desk transactions
• Computer station logins
• Group study room reservation counts
• WiFi usage/saturation
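One hedged way to combine these collection points, assuming each can be reduced to a per-hour tally (every number below is invented for demonstration):

```python
# Illustrative sketch: sum per-hour counts from several collection points
# (gate counts, desk transactions, logins) to see when the building peaks.
from collections import Counter

def busiest_hours(*sources: dict) -> list:
    """Sum per-hour counts across sources; return hours sorted busiest-first."""
    total = Counter()
    for source in sources:
        total.update(source)  # adds counts hour by hour
    return [hour for hour, _ in total.most_common()]

# Invented hourly tallies (24-hour clock)
gate_counts = {10: 120, 14: 200, 19: 90}
desk_transactions = {10: 15, 14: 40, 19: 5}
computer_logins = {10: 60, 14: 85, 19: 30}

print(busiest_hours(gate_counts, desk_transactions, computer_logins)[0])  # 14
```

In practice the sources would first need to be normalized (a gate count and a desk transaction are not equivalent units), which is exactly the kind of caveat the audit stage should surface.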
13. Auditing all potential points where data is collected
Example Question: Do users like the new website?
• Top pages, sessions, etc.
• Comparisons of new vs. old page view data and pathways
• Heatmap analysis
• Surveys
• User feedback from testing (usability, focus groups, A/B, card sorting)
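A minimal before/after comparison might look like the sketch below. The metric names and values are invented, and a percentage change by itself does not show that users like the new site; it only flags where behavior shifted.

```python
# Hedged sketch: relative change in a few metrics between old and new sites.
# All metric names and values are made up for illustration.
def percent_change(before: float, after: float) -> float:
    """Relative change from the old value to the new, as a percentage."""
    return (after - before) / before * 100

old_site = {"sessions": 12000, "avg_pages_per_session": 3.2}
new_site = {"sessions": 13800, "avg_pages_per_session": 2.6}

for metric in old_site:
    print(metric, round(percent_change(old_site[metric], new_site[metric]), 1))
```

Note the interpretation problem: fewer pages per session could mean users find things faster, or that they gave up sooner. Quantitative shifts need the qualitative sources (surveys, usability testing) to be read correctly.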
14. Identifying the caveats of your data
• Correlation: Is there a real or suggested correlation between data points?
• Is your data actually measuring what you think it’s measuring?
• How many qualifiers do you need to make about the interpretation of your data?
• How much work is it to get the data?
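To make the correlation caveat concrete, here is a small sketch with invented numbers: even a very high Pearson coefficient between study-room reservations and mean GPA would not show that reservations cause better grades.

```python
# Sketch with invented data: monthly study-room reservations vs. a cohort's
# mean GPA. A high coefficient here is correlation only, not causation.
from math import sqrt

def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient, computed directly."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

reservations = [40, 55, 60, 72, 80, 95]   # invented monthly counts
mean_gpa = [2.9, 3.0, 3.0, 3.1, 3.1, 3.2]  # invented cohort means

r = pearson_r(reservations, mean_gpa)
print(round(r, 2))  # high, but proves nothing about causation
```

Both series could simply track a third factor (e.g. time in the semester), which is why the qualifier count matters: if explaining the number takes longer than the number is worth, leave it out.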
15. Three steps to narrowing down what you want
1. Framing the questions that are important to answer
2. Auditing all potential points where data is collected
3. Evaluating which data should ultimately be considered for analysis and visualization
16. Evaluating which data should ultimately be considered for analysis and visualization
• Impact of the data on your operation
• Compelling, but realistic data
• Unbiased analysis is best received
• Plotting data effectively for a more comprehensive picture
17. Impact of the data
• So much data requires prioritization
• Who in your organization would want to know about this data?
• Does the data affect your funding already, or could it?
• Does the data allow you to stay independent?
18. Pulling it together – developing a strategy
• Rank your top 1-5 questions
• Map out all the data points related to your question(s)
• Determine if any of the data points are not relevant, are too time-intensive to get/manage, or don’t add value
• Draw out what an effective visualization would be; draft for others if the request comes from someone else
• Determine the tools you’ll use for visualization
• Make time for developing your strategy
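The strategy steps above could be kept in a simple worksheet-style structure; everything below (questions, data points, drop reasons) is an invented example, not a prescribed schema:

```python
# Hypothetical worksheet for the strategy steps: rank questions, map their
# data points, and flag points that are irrelevant or too costly to manage.
questions = [
    {
        "rank": 1,
        "question": "When is the library busiest?",
        "data_points": ["gate counts", "desk transactions", "wifi saturation"],
        "drop": ["wifi saturation"],  # e.g. too time-intensive to manage
    },
    {
        "rank": 2,
        "question": "Are students downloading repository work?",
        "data_points": ["GA download events", "GA demographics"],
        "drop": [],
    },
]

def kept_data_points(entry: dict) -> list:
    """Data points that survive the relevance/cost screen."""
    return [p for p in entry["data_points"] if p not in entry["drop"]]

for q in sorted(questions, key=lambda e: e["rank"]):
    print(q["question"], "->", kept_data_points(q))
```

Even a plain spreadsheet serves the same purpose; the value is in forcing an explicit decision about what to keep, per question, before any dashboard is built.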
Talking about three steps and areas to develop an analytics strategy
So many points of access to track – physical usage, electronic usage of collections, usage of technology
If you don’t have the time to look at your data, how helpful is it? Don’t track just to track – think of a use case where the tracking might be helpful for you in the future.
One of my first projects at a previous institution. Someone had an idea about library impact and showing it through this kind of data. I was the person called upon to implement the technical collection of the data. It was overwhelming.
Talk through what the project was and the connection to storage, vs. IRAR (additional data) vs. analysis and reports
Talk through each of the types of reports we were pulling.
Talk about primary keys
Variables – is it helpful that they’re all different?
Correlation issues: group study rooms, workstation usage, workshops – are they using them to study? For meetings with fraternity or sorority? For social gatherings? Are they on Facebook or typing personal emails on the workstation – maybe they’re just killing time? Is a graduate workshop on how to submit to the IR really enhancing their GPAs and retention?
Collection issues: e-resources not capturing off-campus usage; group study not capturing all the people in the room; service desk captures only good if everyone is swiping cards when they’re asking questions; ILL reports only good if they have the same primary key (they didn’t – many created their own accounts and we did not know who they were); collecting all attendees at workshops and in-class instruction; are the books they’re checking out and renewing for personal reasons or academic ones?
In this diagram, the only data that did not have issues with correlation or collection was the tutorial usage and the LIB 150 participants
Analysis: Too much data, too many variables, too inconsistent collection, and correlation not always apparent. Trying to pull all of these data points together is daunting and not helpful.
Introduce three questions
Follow-up on one of the questions
Example of making sure you capture the data points that you need to answer the questions. This was one where we had to have this in place and start tracking before we could answer the question. But knowing the question existed allowed us to make sure we developed it.
Demographics – not a perfect correlation, but closer based on age and location. Wouldn’t account for distance access/learners. Another example where you know the question is coming, so you make sure you’re capturing.
Like a reference interview – find out what the person needing the data really needs.
End game: Is there a budgetary or staff implication? Does this justify work completed or mean you need to refocus goals?
Favor: be careful about bias though – it does help to know what you want the data to say, but don’t set it up to PROVE that point; rather, just collect what you need should you want to make that point later.
This isn’t just “when is the library busiest?” – it’s also “when is it busiest for staff?” Some of these may impact your staffing; some might be coincidental. This is more an “inventory” of what data exists than it is the moment to decide what’s relevant.
Do these data points actually say the users LIKE the website?
Quantitative vs. qualitative can help answer questions like these.
Checking in with before and after assessments can really determine if a new website has been successful.
Correlation: example of library usage project and lack of correlation. Do hits prove anything in website stats?
Measuring: Again, do website stats really measure anything? Are more hits better, or are fewer (i.e., a more efficient site)?
Qualifiers: If you have to explain five reasons the data may not be exact, maybe it’s best to leave it alone – i.e., if there are collection, correlation, or analysis issues, it might not be worth using.
Work: Hours and hours of citation analysis for two pages of text in a 200-page book. Library usage project – was it worth the time when there were so many issues with the data?
This is when you’re separating the wheat from the chaff. Figuring out exactly what is helpful and compelling and making something out of it that is visual and easy to comprehend.
Who: Does your Provost want to know? Your library dean? Your department head? The team working on it?
Funding: Will this increase funding or prevent a budget cut?
Independence: Example of ITC tracking help desk ticket requests to prove relevance IF discussions ever open where this is required.
1-5 depends on staffing. If you only have staffing to concentrate on one, then concentrate on one.
Effective visualization: Important to try to draw what you want or could foresee, otherwise it’s very hard to put something together or think about it.
Visualization tools: Excel or Tableau? How complicated?
Time: Carve out X hours a week to think about your analytics strategy. How will you pull these pieces together? What have you tracked that you haven’t analyzed? What aren’t you tracking that you need to track?