Analytics @ Lancaster University Library 
IGeLU 2014 
John Krug, Systems and Analytics Manager, Lancaster University Library 
http://www.slideshare.net/jhkrug/igelu-analytics-2014
Lancaster University, the Library 
and Alma 
• We are in Lancaster in the UK North West. 
• ~ 12,000 FTE students, ~ 2300 FTE Staff 
• Library has 55 FTE staff, building refurbishment in progress 
• University aims to be 10, 100 – Research, Teaching, Engagement 
• Global outlook with partnerships in Malaysia, India, Pakistan and 
a new Ghana campus 
• Alma implemented January 2013 as an early adopter. 
• I am Systems and Analytics Manager; I joined LUL in 2002 to 
implement Aleph. Systems background, not library 
• How can library analytics help?
Alma Analytics reporting and 
dashboards 
• Following implementation of Alma, analytics dashboards 
rapidly developed for common reporting tasks 
• Ongoing work in this area, refining existing and developing 
new reports
Results
Fun with BLISS 
B Floor 9AZ (B) 
347 lines of this!
Projects & Challenges 
• LDIV – Library Data, Information & Visualisation 
• ETL experiments done using PostgreSQL and Python 
• Data from Aleph, Alma, EZproxy, etc. 
• Smaller projects: 
• e.g. Re-shelving performance: requires combining Alma Analytics 
returns data with the number of trolleys re-shelved daily. 
• Challenges – Infrastructure, Skills, time 
• Lots of new skills/knowledge needed for Analytics. For us: 
Alma Analytics (OBIEE), Python, Django, PostgreSQL, Tableau, nginx, 
OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.
Alma analytics data extraction 
• Requires using a SOAP API (thankfully a RESTful API is now 
available for Analytics) 
• SOAP support for Python is not very good; REST is much better. 
Currently using the suds Python library with a few bug 
fixes for compression, ‘&’ encoding, etc. 
• A script get_analytics invokes the required report, 
manages collection of multiple ‘gets’ if the data is large and 
produces a single XML file result. 
• Needs porting from SOAP to REST. 
• Data extraction from Alma Analytics is straightforward, 
especially with REST
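A minimal sketch of what a REST port of get_analytics might look like: it requests the report, keeps fetching while the service says more rows remain, and merges everything into a single XML element. This is not the script described above. The endpoint in BASE_URL is a placeholder, and the parameter and element names (path, limit, token, apikey, ResumptionToken, IsFinished, Row) are assumptions about the Alma Analytics REST API that should be checked against the Ex Libris documentation.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder endpoint, not a real host: substitute your own Alma API base URL.
BASE_URL = "https://alma.example.invalid/almaws/v1/analytics/reports"


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def first_text(root, name):
    """Return the text of the first element called `name`, or None."""
    for element in root.iter():
        if local_name(element.tag) == name:
            return (element.text or "").strip()
    return None


def http_fetch(params):
    """Perform one GET against BASE_URL and return the raw response body."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return response.read()


def get_analytics(path, apikey, fetch=http_fetch, limit=1000):
    """Collect every page of a report into one <rows> element."""
    merged = ET.Element("rows")
    params = {"path": path, "limit": limit, "apikey": apikey}
    token = None
    while True:
        root = ET.fromstring(fetch(params))
        merged.extend([e for e in root.iter() if local_name(e.tag) == "Row"])
        # Assumed behaviour: the token appears on the first page and is reused.
        token = first_text(root, "ResumptionToken") or token
        finished = (first_text(root, "IsFinished") or "true").lower()
        if finished != "false" or not token:
            break
        params = {"token": token, "limit": limit, "apikey": apikey}
    return merged
```

The fetch argument exists so the paging logic can be exercised with canned responses; ET.tostring(get_analytics(path, apikey)) would then give the single XML result.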
Data from other places 
• Ezproxy logs 
• Enquiry/exit desk query statistics 
• Re-shelving performance data 
• Shibboleth logs, hopefully soon. We are dependent on central IT 
services 
• Library building usage counts 
• Library PC usage statistics 
• JUSP & USTAT aggregate usage data 
• University faculty and department data 
• Social networking 
• New Alma Analytics subject areas, especially uResolver data
Gaps in the electronic resource 
picture 
• Currently we have aggregate data from JUSP, USTAT 
• Partial off-campus picture from EZproxy, but web-oriented 
rather than resource-oriented 
• Really want the data from Shibboleth and uResolver 
• Why the demand for such low-level data about individuals?
The library and learner analytics 
• Learner analytics a growth field 
• Driven by a mass of data from VLEs and MOOCs …. and 
libraries 
• Student satisfaction & retention 
• Intervention(?) 
• if 
low(library borrowing) & low(eresource access) & 
high(rate of near late or late submissions) & 
low_to_middling(grades) 
then 
do_something() 
• The library can’t do all that, but the university could/can 
• Library can provide data
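The rule on this slide can be written out as a small runnable sketch. Everything concrete here is hypothetical: the field names, the thresholds, and the idea that a single boolean flag is the right output are illustrative assumptions, not anything the university has defined.

```python
from dataclasses import dataclass


@dataclass
class StudentActivity:
    loans: int                # physical loans in the period
    eresource_sessions: int   # e-resource accesses in the period
    late_rate: float          # share of submissions that were late or nearly late
    mean_grade: float         # mean grade as a percentage


def needs_attention(s, low_loans=2, low_sessions=5,
                    high_late_rate=0.5, grade_ceiling=55.0):
    """Return True when all four signals from the slide's rule are present.

    The thresholds are hypothetical placeholders.
    """
    return (
        s.loans <= low_loans
        and s.eresource_sessions <= low_sessions
        and s.late_rate >= high_late_rate
        and s.mean_grade <= grade_ceiling
    )
```

Only the first two inputs come from library data, which is why the library's role here is to supply data rather than to run the rule.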
The library as data provider 
• LAMP – Library Analytics & Metrics 
Project from JISC 
• http://jisclamp.mimas.ac.uk 
• We will be exporting loan and anonymised 
student data for use by LAMP. 
• They are experimenting with dashboards 
and applications 
• Prototype application later this year. 
• Overlap with our own project LDIV 
• The Library API 
• For use by analytics projects within the university 
• Planning office, Student Services and others
The Library API 
• Built using OpenResty, nginx, Lua 
• REST-like API interface 
• e.g. Retrieve physical loans for a patron 
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json) 
<?xml version="1.0" encoding="UTF-8"?> 
<response> 
<record> 
<call_no>AZKF.S75 (H)</call_no> 
<loan_date>2014-07-10 15:44:00</loan_date> 
<num_renewals>0</num_renewals> 
<bor_status>03</bor_status> 
<rowid>3212</rowid> 
<returned_date>2014-08-15 10:16:00</returned_date> 
<collection>MAIN</collection> 
<rownum>1</rownum> 
<material>BOOK</material> 
<patron>b3ea5253dd4877c94fa9fac9</patron> 
<item_status>01</item_status> 
<call_no_2>B Floor Red Zone</call_no_2> 
<bor_type>34</bor_type> 
<key>000473908000010-200208151016173</key> 
<due_date>2015-06-19 19:00:00</due_date> 
</record> 
</response> 
[{ 
"rownum": 1, 
"key": "000473908000010-200208151016173", 
"patron": "b3ea5253dd4877c94fa9fac9", 
"loan_date": "2014-07-10 15:44:00", 
"due_date": "2015-06-19 19:00:00", 
"returned_date": "2014-08-15 10:16:00", 
"item_status": "01", 
"num_renewals": 0, 
"material": "BOOK", 
"bor_status": "03", 
"bor_type": "34", 
"call_no": "AZKF.S75 (H)", 
"call_no_2": "B Floor Red Zone", 
"collection": "MAIN", 
"rowid": 3212 
}]
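A sketch of how a client inside the university might use this endpoint from Python, assuming the URL pattern and JSON fields shown above. The helper names and the default values for start and number are hypothetical, and the sample record is a trimmed copy of the response on the slide, so nothing is fetched over the network.

```python
import json
from datetime import datetime
from urllib.parse import urlencode

API_ROOT = "http://lib-ldiv.lancs.ac.uk:8080"  # host and port as shown on the slide


def ploans_url(patron, start=1, number=10, fmt="json"):
    """Build the request URL for a patron's physical loans."""
    query = urlencode({"start": start, "number": number, "format": fmt})
    return f"{API_ROOT}/ploans/{patron}?{query}"


def loan_days(record):
    """Whole days between loan_date and returned_date for one loan record."""
    layout = "%Y-%m-%d %H:%M:%S"
    loaned = datetime.strptime(record["loan_date"], layout)
    returned = datetime.strptime(record["returned_date"], layout)
    return (returned - loaned).days


# Trimmed copy of the JSON response shown on the slide.
SAMPLE = """[{"rownum": 1, "loan_date": "2014-07-10 15:44:00",
  "returned_date": "2014-08-15 10:16:00", "material": "BOOK",
  "collection": "MAIN", "num_renewals": 0}]"""

records = json.loads(SAMPLE)
durations = [loan_days(r) for r in records]
```

In real use the body would come from an HTTP GET on ploans_url(...) instead of SAMPLE.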
How does it work? 
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml 
• nginx configuration maps the REST URL to a database query 
location ~ /ploans/(?<patron>\w+) { 
    ## collect and/or set default parameters 
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt; 
} 
location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json { 
    postgres_pass database; 
    rds_json on; 
    postgres_query HEAD GET " 
        select * from ploans where patron = $patron 
        and row >= $start and row < $start + $nrows"; 
}
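The same URL-to-query mapping can be sketched in Python to make the two steps explicit: match the path, fill in defaults for missing parameters, then produce a paged query. This is an illustration of the idea only, not the nginx setup itself. The default values are hypothetical, and the sketch returns the SQL with placeholders plus a separate parameter tuple, which is the usual way to keep request values out of the SQL text.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Same shape as the first nginx location: /ploans/<patron>
ROUTE = re.compile(r"^/ploans/(?P<patron>\w+)$")

# Query text follows the slide; %s marks values bound by the database driver.
SQL = "select * from ploans where patron = %s and row >= %s and row < %s"


def route(url, default_start=1, default_rows=10, default_format="json"):
    """Map a request URL to (sql, params, format), or None if it does not match.

    The default values are hypothetical.
    """
    parts = urlsplit(url)
    match = ROUTE.match(parts.path)
    if match is None:
        return None
    query = parse_qs(parts.query)
    start = int(query.get("start", [default_start])[0])
    nrows = int(query.get("number", [default_rows])[0])
    fmt = query.get("format", [default_format])[0]
    return SQL, (match["patron"], start, start + nrows), fmt
```

For example, route("http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml") yields the query together with the patron identifier and the row bounds 45 and 46.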
Proxy for making Alma Analytics 
API requests 
• e.g. Analytics report which produces 
• nginx configuration 
location /aa/patron_count { 
    set $b "api-na.hosted.exlibri … lytics/reports"; 
    set $p "path=%2Fshared%2FLancas … tron_count"; 
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73"; 
    proxy_pass https://$b?$p&$k; 
} 
• So users of our API can get data 
directly from Alma Analytics and 
we manage the interface they use 
and shield them from any API 
changes at Ex Libris.
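The shielding idea can be sketched as a small lookup: callers only ever see a local path, and the upstream report path and API key stay on the server. The upstream host, the report path, and the function name below are all hypothetical placeholders, since the real values are truncated on the slide.

```python
from urllib.parse import urlencode

# Placeholder upstream, not a real host.
UPSTREAM = "https://analytics.example.invalid/reports"

# Local path -> report path. The report path is a made-up example.
ROUTES = {
    "/aa/patron_count": "/shared/Example/patron_count",
}


def upstream_url(local_path, apikey):
    """Translate a local API path into the upstream request URL, or None."""
    report = ROUTES.get(local_path)
    if report is None:
        return None
    return UPSTREAM + "?" + urlencode({"path": report, "apikey": apikey})
```

If the vendor changes its API, only UPSTREAM and ROUTES need to change; the local paths that users call stay the same.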
Re-thinking approaches 
• Requirements workshops 
• Application development 
• Data provider via API interfaces 
• RDF/SPARQL capability 
• LDIV – Library Data, Information and Visualisation 
• Still experimenting 
• Imported data from EZproxy logs, GeoIP databases, student 
data, Primo logs, a small amount of Alma data 
• Really need Shibboleth and uResolver data 
• Tableau as the dashboard to these data sets
Preliminary results 
More at http://public.tableausoftware.com/profile/john.krug#!/
• First UK Analytics SIG meeting Oct 14 following EPUG-UKI AGM 
• Questions?
