Presentation at Southeastern Library Assessment Conference 2017 in Atlanta, GA.
This session will outline how we planned and executed five simultaneous usability tests and what we learned from using this method. We'll also discuss how we approached analyzing the large amount of qualitative data that was gathered during testing via affinity diagrams and lots of post-it notes. The focus of this session is on our methodologies, though we'll briefly look at the results of each test.
4. Harvard at edUi, 2016
bit.ly/testfestivus
Amy Deschenes &
Shannon Rice
Test fest:
Running multiple tests at the same
time to decrease your overhead
5. Who are we?
What do we do?
We try to “...create a seamless
connection between the library’s
services, collections, physical spaces
and virtual presence.”
libux.web.unc.edu
P.S. We’re hiring:
library.unc.edu/personnel/employment/
6. Scheduling
• Issues in the past
• Recruiting
Image credit: https://pixabay.com/en/january-calendar-month-year-day-2290045/
10. What is a test fest?
A series of simultaneous usability tests, with the number of tests equal to the number of participants (see the rotation sketch after this list)
Tests...
• use a mix of methodologies
• can be moderated or unmoderated
• need to take approximately the same amount of time
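That last constraint is what makes the format work: with N participants and N equal-length tests, every participant can sit a different test in every time slot. A minimal sketch of that rotation, assuming a simple round-robin (the function and names are ours for illustration, not from the presentation):

```python
# Rotation for a test fest: N participants x N tests, one slot per test.
# In slot s, participant p sits test (p + s) % N, so every test is
# occupied in every slot and everyone sees every test exactly once.

def test_fest_rotation(participants, tests):
    assert len(participants) == len(tests), "needs as many tests as participants"
    n = len(tests)
    schedule = []
    for slot in range(n):
        assignments = {participants[p]: tests[(p + slot) % n] for p in range(n)}
        schedule.append(assignments)
    return schedule

if __name__ == "__main__":
    people = ["P1", "P2", "P3", "P4", "P5"]
    tests = ["Databases", "Videos", "Catalog", "Sketch", "Summon/EDS"]
    for slot, assignments in enumerate(test_fest_rotation(people, tests), start=1):
        print(f"Slot {slot}: {assignments}")
```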
11. [Full-slide photo with caption in the top left:]
What We Learned from Harvard
12. Logistics
Planning and scripting tests based on backlog
• Warm up and follow up questions
• Volunteers to help staff tests
Recruiting participants
• Incentives
Space to run tests
14. [Full-slide photo with caption in the top left:]
Temporary Usability Lab
15. Roles
1 Timekeeper & Greeter
4 Test Moderators
2 Notetakers
1 Escort
Volunteers for pilot test
16. Schedule
9:30 – Introduction & Consent/Financial form signatures
9:35 – First test begins
9:55 – Second test begins
10:15 – Third test begins
10:35 – Break
10:50 – Fourth test begins
11:10 – Fifth test begins
11:30 – Debrief of participants, incentive handout
18. Tests & Methods
1. Accessing databases from the catalog
2. Basic research skill videos
3. General catalog usability
4. Library homepage sketching
5. Summon vs. Ebsco Discovery Service
19. Testing Logistics
Two rounds of tests with 5 participants each
• 2 participants didn’t show up to the first round
Follow-up round with 2 participants for:
• Accessing databases from the catalog
• Summon vs. EDS
21. Test 1: Results
Intermediary pages are confusing
Users don’t read notes/special instructions even when highlighted
• Current user notes aren’t noticeable
• Move above the description if it’s important
Undergrad participants don’t access databases via the catalog
23. Test 2: Results
No participant had viewed our research skill videos before
Students found the videos useful, but wouldn’t watch them again
Comments included:
• “This seems useful for a first year” (multiple)
• “Could have saved lots of time” (multiple)
• Video is too long (specific to “Evaluating Resources”)
25. Test 3: Results
Confusion over differences between catalog and other search tools
Didn’t see “show more” option in facets
Majority used advanced search features, but confusion over different fields such as:
• Boolean search box
• Author, subject heading, etc. fields
28. Test 4: Results
Most important feature is a centralized search bar
• Followed closely by visibility of hours information
Design should be simple overall
Help and chat features should be more prominent
News and exhibits take up a lot of space
31. Test 5: Results
Overall, users preferred Summon over EDS, but only slightly
They want:
• Dynamic features
• Contextual help
• More information embedded within results
32. Lessons Learned
Run a pilot test beforehand
UNC basketball + UNC undergrads
• Only 3 participants for the first round of tests
• Rescheduled a smaller round for 2 tests to make up difference
Not all tests work for this format
34. Analyzing Results + Affinity Diagramming
Total: 44 tests to analyze
• 8 participants, 5 tests = 40 results
• 2 participants, 2 tests = 4 results
Process: Affinity diagramming
• Organizes a large amount of qualitative data into related categories based on relationships in the data
49. Problems with our approach
Too much data to analyze
Difficulty with cross-comparing elements from different tests
• More than one test looked at search results pages (catalog vs. Summon/EDS)
• Interface features like advanced search options
• Visual elements
51. Final (or Developing) Outcomes
Generated reports for each individual test
Used some of the data, but not all so far
• User note review with subject librarians
• New catalog platform in development
• Used sketch method for our special collections homepage
• Dropped EDS trial after statewide deal for Summon
Amy Deschenes (pronounced “Do-shane”)
It worked well for them, and we modified it only slightly
At its core, a test fest is about running multiple tests at the same time to cut down on the overhead of recruiting, etc.
A big part of this mission involves running usability tests: catalog, website, guides, etc.
Started running regular monthly tests in late 2016 and early 2017
Regular monthly usability tests, but scheduling and staffing were difficult…
Issue #1: Scheduling
Differing time commitments:
Guerrilla testing vs. one-on-one exploratory methods
Online testing vs. in-person testing
Purchasing policies
Issue #2: Staffing Limitations
Related to scheduling of staff
Other projects taking priority over regular testing
Result: Backlog
Difficulty scheduling tests and lack of staff led to a backlog
We had more things to test than time to test them
Image from 2012 Veterans Affairs office in
Going into planning our test fest, there were some things we already knew to be aware of from Amy and Shannon’s presentation at edUi:
Participants and moderators/notetakers get fatigued
Have short breaks between each test
Provide water and snacks
A longer break so participants can relax/go to the bathroom if needed
Consent form process should be simple
Centralizing the consent process + other paperwork with all participants at one time is easier than doing it test by test
Noise from each test session could be problematic
Having a location for each test that could be separate from the others is important
Provide clear signage for each test and schedule for the testing day
Setting expectations and informing participants of the entire testing process helps things run smoothly
Have a centralized timekeeper for all tests
Individual moderators/notetakers won’t keep track of their time as well as a central timekeeper
These were the solutions to our problems: backlog, staff vacancies, etc.
Davis Library construction allowed us to use this space for our testing
Snacks = granola bars, animal crackers, coffee, orange juice, water
Timekeeper + greeter = main contact for all tests and participants
Moderators and notetakers might vary depending on types of tests being run
An escort is necessary to be the first contact participants have and to show them where to go
By 9 AM, escort is downstairs and begins bringing participants up
Timekeeper gave warnings at 5 and 2 minutes
Done by approximately 11:35 AM
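The timekeeper’s job is mechanical enough to script. A hedged sketch that derives the day’s slot times from the parameters above (9:35 start, 20-minute slots, a 15-minute break after the third test); the 5- and 2-minute warning times fall out of the same arithmetic. Function and parameter names are ours:

```python
from datetime import datetime, timedelta

# Derive each test slot's start time and the timekeeper's warning times
# from the day's parameters (matches the schedule on the slide).
def slot_times(first_start="9:35", slot_minutes=20, break_after=3,
               break_minutes=15, n_slots=5):
    start = datetime.strptime(first_start, "%H:%M")
    slots = []
    for i in range(n_slots):
        warn_5 = start + timedelta(minutes=slot_minutes - 5)
        warn_2 = start + timedelta(minutes=slot_minutes - 2)
        slots.append((start, warn_5, warn_2))
        start += timedelta(minutes=slot_minutes)
        if i + 1 == break_after:
            start += timedelta(minutes=break_minutes)
    return slots

for n, (start, warn_5, warn_2) in enumerate(slot_times(), start=1):
    print(f"Test {n}: starts {start:%H:%M}, warnings at {warn_5:%H:%M} and {warn_2:%H:%M}")
```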
3 task-based usability tests = the more traditional format
1 unmoderated test
1 sketch test
Methods for each test:
1. Task-based usability test
2. Unmoderated, using Qualtrics to record responses
3. Task-based usability test
4. Sketch test
5. Task-based usability test
Link to the library’s databases from catalog
In some cases we drop users on an intermediary page with special instructions/information
Special instructions/information called user notes
In summer 2016 we migrated from a home-built database management system to the LibGuides A-Z Database List
This changed the way we’re able to display the intermediary pages, so we needed to find a new solution
Task Analysis: https://www.usability.gov/how-to-and-tools/methods/task-analysis.html
Results
Intermediary pages are confusing no matter what they look like
Unsurprisingly, users don’t read notes
Generally, as our analytics confirmed, users don’t access databases via the catalog
Wanted to assess the value and benefit of a video tutorial series that we’d created
Videos focus on basic research skills and are meant to be used by our First Year Writing courses as discussion starters
Used a variation of the Technology Acceptance Model (TAM)
Looks at perceived usefulness and perceived ease of use
Users watched 4 of the 5 videos and, after each viewing, agreed or disagreed with 10 statements about the video’s usefulness
Statements included:
Watching this video could have improved my completed research project.
I am confident that I have learned something from this video.
I would recommend this video to my friends.
Data from Qualtrics is very useful on a video-by-video basis for future development
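Because the TAM data came back as per-statement ratings per video, rolling it up video by video is straightforward once the responses are in a table. A minimal sketch, assuming responses exported as (video, statement, agreement on a 1-5 scale); the field names, the second video title, and the values are illustrative, not the actual Qualtrics export:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical TAM responses: (video, statement, agreement 1-5).
# "Evaluating Resources" is a real video title; the rest is mocked up.
responses = [
    ("Evaluating Resources", "Could have improved my project", 4),
    ("Evaluating Resources", "I learned something", 3),
    ("Finding Sources", "Could have improved my project", 5),
    ("Finding Sources", "I learned something", 4),
]

by_video = defaultdict(list)
for video, statement, agreement in responses:
    by_video[video].append(agreement)

# Mean agreement per video: perceived usefulness, video by video.
for video, scores in sorted(by_video.items()):
    print(f"{video}: mean agreement {mean(scores):.2f} (n={len(scores)})")
```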
Moving to a new catalog in the next year or so
Find out how students search in the library catalog using the main catalog search box, advanced search, and refine your search features
Tasks included:
Simple search from the main catalog search bar
Use of refine your search options/facets
Use of the advanced search option, to find out how they perceived it
Task Analysis: https://www.usability.gov/how-to-and-tools/methods/task-analysis.html
Learned how users search and refine their results
Identified potential usability issues we could address in the new catalog, such as confusion over what the catalog does compared to other searches on the library site
The idea for this test was adapted from methodologies used as part of a space use assessment at the University of Rochester by Foster and Gibbons.
As part of their Undergraduate Research Project, a series of design workshops was held at which students were asked to design their ideal library space and lay out its furniture
We applied those same principles to our web space
Process
Participants sketch their ideal library homepage
Then annotate a printout of our actual homepage
Once they’d annotated the library homepage, they were asked to sketch their new ideal library homepage
Unsurprisingly the search bar and library hours were stated to be the most important
News and exhibits seem more prominent, but were mentioned more as negative features than positive ones; they take up a lot of space on the homepage
The library was exploring moving from Summon to EDS for our discovery service
Used Western Carolina University’s EDS instance and UNC’s Summon
Wanted to compare basic usability and satisfaction of both services
Split participants randomly between the two services: 5 looked at Summon, 5 looked at EDS
Tasks included common search tasks with scenarios involving research for class assignments
Examples: Finding an ebook, finding the full text of a specific article
Users would rank their satisfaction with the results
Task Analysis: https://www.usability.gov/how-to-and-tools/methods/task-analysis.html
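Splitting ten participants evenly and at random between two interfaces is simple to make reproducible. A sketch under those assumptions (the participant IDs and fixed seed are ours, added so the assignment can be re-run identically):

```python
import random

# Randomly split 10 participants into two equal groups, one per interface.
participants = [f"P{i}" for i in range(1, 11)]  # placeholder IDs
rng = random.Random(42)  # fixed seed so the assignment is reproducible
rng.shuffle(participants)

groups = {"Summon": participants[:5], "EDS": participants[5:]}
for service, group in groups.items():
    print(service, group)
```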
In total, we had results from 44 tests to analyze
We wrote individual reports for each test, but also wanted to see if there were trends across 4 of the 5 tests
Excluded the unmoderated test because user comments were different from the other 4
To process the results, we used a method called affinity diagramming
Affinity diagramming: http://www.balancedscorecard.org/portals/0/pdf/affinity.pdf
Write out what participants said verbatim, or the general idea/feeling they expressed
We included each participant’s number + test day (e.g., 1B)
Examples
“footer is really repetitive”
“lacks confidence in search results due to look of bolded key words”
“I want to… is nice, but doesn’t fit with the others”
Green = Summon vs. Ebsco Discovery Service
Blue = General library catalog
Pink = Accessing databases from the catalog
“Non-swooshing” = our local term for intermediary pages
Yellow = Library homepage sketching
Our sticky notes!!
From here, we individually began grouping comments/notes together that were related
After that, we worked first individually and then collaboratively to come up with categories that matched each grouping
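The affinity notes themselves are just labeled records, which makes the bookkeeping easy to mock up: each note carries its verbatim text, its participant/day code (e.g., 1B), and its test (our sticky-note color). A sketch of that structure with a manual category-assignment step; the class and field names are ours, not from the presentation:

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Note:
    text: str           # verbatim quote or paraphrased idea
    participant: str    # participant number + test day, e.g. "1B"
    test: str           # which test the comment came from (sticky-note color)
    category: str = ""  # assigned during the group affinity step

notes = [
    Note("footer is really repetitive", "1B", "Library homepage sketching"),
    Note("lacks confidence in search results due to look of bolded key words",
         "3A", "Summon vs. EDS"),  # "3A" is a made-up code in the same format
]

# After the (human) grouping step, roll notes up by category for the report.
notes[0].category = "Layout complaints"
notes[1].category = "Trust in results"
by_category = defaultdict(list)
for note in notes:
    by_category[note.category].append(note)
for category, grouped in by_category.items():
    print(category, [n.text for n in grouped])
```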
Examples of the categories we came up with
Had 34 categories total between the 4 tests
Lots of feedback on specific platforms/tools like our catalog, discovery service, hours system, chat, etc.
Similar comments were combined or stacked on top of one another to indicate they were highlighting the same idea/topic
Comments included:
Glad there are lots of resources (across all 4 tests: accessing database via catalog, catalog, sketch test, and Summon vs. EDS)
Research Tools and “I Want To…” are helpful (sketch test)
Overall Articles+ experience was satisfactory (Summon vs. EDS)
Comments included:
Lots of clicking to do (accessing database via catalog)
Looking for resources, not notes (accessing database via catalog)
EDS not intuitive enough/frustrated with experience (Summon vs. EDS)
Comments included:
“What does the librarian help with?” in regards to librarian’s photo and name on a subject page (accessing database via catalog)
Subject bar at the top has too many options/expected it to be a search bar (accessing database via catalog)
“Does web resources include ebooks or is it the same thing?” (Summon vs. EDS)
Comments included:
Visit the library’s website at least a dozen times a semester (sketch test)
Use library catalog daily (catalog)
Use databases in class from the E-Research by Discipline page (accessing database via catalog)
Comments included:
Typically ignore a lot of stuff on the search results page (Summon vs. EDS)
Don’t use the advanced search options (catalog)
Never used the research assistance stuff like user notes or contact (accessing database via catalog, sketch test)
Comments included:
Like the footer on library homepage (sketch test)
Like the 3 boxes on library homepage (sketch test)
Like color scheme of our guides/A-Z Database pages (accessing database via catalog)
Comments included:
Layout feels cluttered (sketch test: homepage, Summon vs. EDS: search results page)
Old Well picture for librarian photo makes no sense (accessing database via catalog)
Footer is repetitive (sketch test)
All notes come from the general catalog usability test
Comments included:
Expects to find a little bit of everything in the catalog vs. Not sure what resources are in the catalog
Would like if search/advance search were more like Academic Search Premier
Layout is most important for catalog
Confusion/uncertainty over Advanced Search fields like subject headings, OCLC #
The category made sense when looking at all 4 tests together, but ended up with a conflicting mix of comments that could be broken down further within it
All notes come from the Summon vs. EDS test
Comments included:
Articles+ feels pretty good (re: reliability of results) vs. Unsure if “scholarly & peer reviewed” option gives relevant results
Confusing because people aren’t used to these tools
Didn’t know Articles+ had ebook content
Combination of responses related to the library catalog or to Summon and EDS
Difficult to compare because they describe 2 different interfaces
Comments included:
Confused by options on advanced search page (Summon vs. EDS)
Nice to pre-set filters on one page (Articles+ specifically)
Wants more options and search modes on advanced search page (catalog)
Surprised by number of language options (catalog)
We found that affinity diagramming would work better on a per-test basis