Crushing, Blending, and
       Stretching Data
Data Warehousing and Mining Data
  from Voyager and Other Library
     and University Systems for
 Assessment of Library Operations
      ELUNA Conference 2008, Long Beach, CA,
              Friday, August 1, 2008

                    Ray Schwartz,
             Systems Specialist Librarian
      Cheng Library, William Paterson University,
              Wayne, New Jersey, USA
               schwartzr2 @ wpunj.edu
Outline
• Why Assessment and Why Now?
• What is Data Mining and Data
  Warehousing and Why Do We Do It?
• Our Context
• Groups and Services
• Steps
• Reporting

Have We Always Assessed?

• Anecdotally—Yes.
• Systematically—Not usually.
  – Large-scale assessment of manual systems
    (such as serials check-in, card catalogs, and
    circulation files) is not practical.
  – Smaller scale and directed assessment is
    possible.



What changed since the days
    of manual systems?
• For many institutions in the West, the
  Integrated Library System has been in use
  for over 20 years.
• Larger scale assessment is now possible
  with the electronic systems.




What is different now?
• New services have come into existence.
  – Inside libraries
     • Full-Text Databases
     • Link Resolvers
  – Outside of libraries
     • Google
     • Amazon




What is Data Mining and Data
        Warehousing?
• Extracting data from legacy systems and other
  resources;
• cleaning, scrubbing and preparing data for decision
  support;
• maintaining data in appropriate data stores;
• accessing and analysing data using a variety of end
  user tools;
• and mining data for significant relationships.


 •   Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing:
     Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall.

• The primary purpose of these efforts
  is to provide easy access to specifically
  prepared data that can be used with
  decision support applications such as
  management reports, queries,
  decision support systems,
  executive information systems and
  data mining.



•   Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing:
    Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall.

Of course there are many
    ways to measure
            –
    Scott Nicholson’s
  Measurement Model


Measurement Matrix with
               methodologies

Perspective          Topic: Library System            Topic: Use

Internal             Procedures and Standards         Recorded interactions with
(Library System)     • Staff survey and interviews    interface & materials
                     • Audits of collections,         • Bibliomining
                       systems, or staff              • Transaction/Web Log Analysis
                                                      • Observation of User Behavior

External             Aboutness and Usability          Knowledge states and User
(User)               • Surveys and interviews         citations to materials
                     • Talk-alouds and in-process     • Surveys and interviews
                       feedback mechanisms            • Focus groups
                     • Focus groups                   • User Citation tracking

    Nicholson, Scott (2004). A Conceptual framework for the holistic measurement and cumulative evaluation
    of library services. Journal of Documentation 60(2) p.164-181
Our Context




Our University
•   9000 undergraduates
•   1000 graduates (mostly education majors)
•   400 faculty
•   800 adjuncts
•   1000 staff




Our Library
•   19 librarians and 26 library staff
•   350,000 volumes
•   18,000 audiovisual items
•   22,000 print and electronic periodicals
•   100 general and subject specific databases




Our Systems circa 2005
•   Voyager ILS – Cheng Server
•   Online Periodical Database (OPD)
•   Clio ILL Software
•   EZProxy Server - Zeus
•   Banner – University ERP
•   University Networked Drive K:
•   University Email Server
•   University Web Server
Vendor Services
•   Serials Solutions
•   OCLC
•   Blackwell
•   Ebsco
•   Marcive
•   Database Vendors


The Question


Which categories of patrons are
  accessing which services?




First Step – Patron Statistical
          Categories




• Voyager Patron Database allows a maximum
  of 10 statistical categories per patron record.

• Decide which statistical categories are needed
  for each patron group defined.

• Work with your University Information Systems
  Department to extract the relevant data from
  the relevant sources.
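
A minimal sketch of this step, written in Python for illustration: it builds the list of statistical categories for one extracted patron record and checks it against the 10-category limit. The record keys (`major`, `department`, `status`, and so on) are hypothetical, not Banner or Voyager field names; the `M-` and `S-` prefixes are borrowed from the major and department labels used in the reports later in this deck.

```python
MAX_STAT_CATEGORIES = 10  # Voyager limit per patron record, as stated above


def build_stat_categories(record):
    """Return the statistical categories for one extracted patron record.

    `record` is a hypothetical dict from the university extract; the keys
    used here are assumptions made for this sketch.
    """
    categories = []
    if record.get("major"):
        categories.append("M- " + record["major"])
    if record.get("department"):
        categories.append("S- " + record["department"])
    for key in ("status", "college", "degree", "year_of_study", "campus"):
        if record.get(key):
            categories.append(record[key])
    if len(categories) > MAX_STAT_CATEGORIES:
        raise ValueError("too many statistical categories for one patron")
    return categories


# Made-up example record.
student = {"major": "History", "status": "Undergrad", "college": "Humanities"}
print(build_stat_categories(student))
```

Keeping the limit in one named constant makes it easy to see which patron groups come close to it as categories are added.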




Groups and Services

Groups
• Major
• Status
     – Undergrad or Grad
     – Faculty, Adjunct Faculty or Staff
• Department
• College
• Degree
• No. of Credits
• Year of Study
• Campus Location

Services
• Circulation
     – Books
     – Media
     – Reserve
     – By Fund Code
     – Location
• ILL / Document Delivery
• Databases
• Library Web Pages
     – Subject Area Resource Guides
     – Reference Requests
• Catalog
• Other Vendor Services
     – Serials Solutions
History Department - 12 months - Feb. 2008

PATRON STATUS            BOOK CIRC  MEDIA CIRC  EQUIP CIRC  TOTAL CIRC  MEMBERS  BORROWERS  % BORROWING  CIRC/MEMBER  CIRC/BORROWER
UNDERGRADUATE STUDENTS       2,715         250         698       3,663      238        186          78%        15.39          19.69
GRADUATE STUDENTS              419          13          76         508       14         13          93%        36.29          39.08
ADJUNCT FACULTY                100          65          20         185       32         20          63%         5.78           9.25
FULL-TIME FACULTY              159         115         194         468       24         23          96%        19.50          20.35
HISTORY TOTALS               3,393         443         988       4,824      308        242          79%        15.66          19.93
LIBRARY TOTALS              23,370       8,713      20,703      52,756    7,418      4,981          67%         7.11          10.59

DEFINITIONS:
BOOK CIRCULATION = books, book disks, maps, oversize, Curriculum materials, reserve books, NJ History, Leisure Lounge
MEDIA CIRCULATION = audio & video materials, including media reserves
EQUIPMENT CIRCULATION = camcorders, overhead & data projectors, laptops, easels, DVD players, etc.
MEMBER = declared major or department member
BORROWER = any member who borrowed materials
Library Total = declared undergrad & grad majors, adjuncts & full time faculty borrowers
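
The three derived columns in the report appear to follow directly from the definitions: percent borrowing is borrowers over members, and the two ratios divide total circulation by members and by borrowers. A short Python sketch, assuming that reading, reproduces the undergraduate and graduate rows; the function name is made up for illustration.

```python
def borrowing_measures(total_circ, members, borrowers):
    """Return (% borrowing, circ per member, circ per borrower)."""
    pct_borrowing = round(100 * borrowers / members)
    circ_per_member = round(total_circ / members, 2)
    circ_per_borrower = round(total_circ / borrowers, 2)
    return pct_borrowing, circ_per_member, circ_per_borrower


# Undergraduate row of the History report: 3,663 circs, 238 members, 186 borrowers.
print(borrowing_measures(3663, 238, 186))
# Graduate row: 508 circs, 14 members, 13 borrowers.
print(borrowing_measures(508, 14, 13))
```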
Problems with Configuration of
          Services
• Little to no linkage of data
• Need to search multiple services to
  get complete picture of serial holdings
• Multiple user IDs for authentication




Systems Chart – ca. 2005
[Diagram. Components shown:]
• Cheng Server: Perl, Web Server, Voyager (Circulation, Media Scheduling, Patrons, Searches)
• Online Periodicals Database: Oracle (Materials, Patrons)
• www.wpunj.edu: ColdFusion, Web Server (ILL Form, Microform Form, Serials Form, ER Page)
• Zeus: Off Campus Dbase Hits & ILL Form (EZProxy Log)
• Banner (University ERP System): SIS, HRS
• University Email Server
• University Networked Drive K: ILL (Cliodata) with Patrons and Materials
• Serials Solutions: A to Z
• OCLC: WorldCat, ILL
• Other Vendors' Database Services & Usage Reports
• Legend: Current Relationships; Internal only WPUNJ Server; Externally accessible WPUNJ Server; Non WPUNJ Server
Retirement of the OPD
• Serials holdings data was extracted
  from the OPD and added to
  Voyager catalog
• From Voyager catalog, serials
  holdings data is extracted and added
  to Serials Solutions A to Z list




Retirement of the OPD cont.
• Authentication of ILL form is routed
  through the EZProxy server
• A web bug is placed in the microform
  request page to record submission in the
  Voyager server's web logfile.
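
As a rough illustration of how such a web bug can be counted afterwards, the Python sketch below scans web server log lines for requests of the bug image. The image path and the sample log lines are hypothetical; the slides do not give the actual file name or log format.

```python
def count_web_bug_hits(log_lines, bug_path):
    """Count log lines that record a GET request for the web bug image."""
    needle = '"GET ' + bug_path
    return sum(1 for line in log_lines if needle in line)


# Hypothetical image path and made-up log lines in Common Log Format.
BUG_PATH = "/images/microform_bug.gif"
sample_log = [
    '10.0.0.5 - - [01/Jan/2008:04:25:15 -0500] "GET /images/microform_bug.gif HTTP/1.1" 200 43',
    '10.0.0.5 - - [01/Jan/2008:04:25:16 -0500] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.9 - - [02/Jan/2008:10:02:41 -0500] "GET /images/microform_bug.gif HTTP/1.1" 200 43',
]
print(count_web_bug_hits(sample_log, BUG_PATH))
```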




Systems Chart – ca. 2005 – Retiring the OPD
[Diagram. Components shown:]
• Cheng Server: Perl, Web Server, Voyager (Circulation, Media Scheduling, Patrons, Searches)
• Online Periodicals Database: Oracle (Materials, Patrons)
• www.wpunj.edu: ColdFusion, Web Server (ILL Form, Microform Form, Serials Form, ER Page)
• Zeus: Off Campus Dbase Hits & ILL Form (EZProxy Log)
• Banner (University ERP System): SIS, HRS
• University Email Server
• University Networked Drive K: ILL with Patrons and Materials
• Serials Solutions: A to Z
• OCLC: WorldCat, ILL
• Other Vendors' Database Services & Usage Reports
• Legend: Current Relationships; Internal only WPUNJ Server; Externally accessible WPUNJ Server; Non WPUNJ Server
New Services Added
• Serials Solutions MARC Record Service
• Serials Solutions Link Resolver
• OCLC Worldcat Collection Analysis




Systems Chart – ca. 2005 – New Services Added
[Diagram. Components shown:]
• Cheng Server: Perl, Web Server, Voyager (Circulation, Media Scheduling, Patrons, Searches)
• www.wpunj.edu: ColdFusion, Web Server (ILL Form, Microform Form, Serials Form, ER Page)
• Zeus: Off Campus Dbase Hits & ILL Form (EZProxy Log)
• Banner (University ERP System): SIS, HRS
• University Email Server
• University Networked Drive K: ILL (Cliodata) with Patrons and Materials
• Serials Solutions: A to Z, MARC Records, Link Resolver
• OCLC: WCA, WorldCat, ILL
• Other Vendors' Database Services & Usage Reports
• Legend: Current Relationships; Internal only WPUNJ Server; Externally accessible WPUNJ Server; Non WPUNJ Server
Our Systems in 2008
•   Voyager ILS – Cheng Server
•   Shared Application Server
•   Clio ILL Software
•   EZProxy Server - Zeus
•   Banner – University ERP
•   University Networked Drive K:
•   University Email Server
•   University Web Server
Systems Chart - 2008
[Diagram. Components shown:]
• Cheng Server: Perl, Web Server, Voyager (Circulation, Media Scheduling, Patrons, Searches)
• Application Server: ColdFusion, Web Server, DBMS (OffCampus Dbase Usage by Patron Groups; ILL Patrons/Materials Requested; ILL Patrons/Materials Received)
• www.wpunj.edu: ColdFusion, Web Server (ILL Form, Microform Form, Serials Form, ER Page)
• Zeus: Off Campus Dbase Hits & ILL Form (EZProxy Log)
• Banner (University ERP System): SIS, HRS
• University Email Server
• University Networked Drive K: ILL (Cliodata) with Patrons and Materials
• Serials Solutions: A to Z, MARC Records, Link Resolver
• OCLC: WCA, WorldCat, ILL
• Other Vendors' Database Services & Usage Reports
• Legend: Current Relationships; Internal only WPUNJ Server; Externally accessible WPUNJ Server; Non WPUNJ Server
Second Step – Set Up an Application
             Server




What is an Application Server?
• A machine or its software that works in
  conjunction with a web server to deliver
  application services such as the dynamic
  creation of a webpage from content stored in a
  database. From http://www.webtools.ca.gov/help/Glossary.asp

• Web Server Software (Apache or IIS)
• Database Management System – DBMS (MySQL,
  Oracle, MS SQL Server)
• Scripting Language (Perl, PHP, ColdFusion, ASP)

Why an Application Server?
• Relevant data in logfiles needs to be in
  a database to be analyzed.

• Need your own DBMS to create new
  tables and queries.




• Decide how you will use the
  Application Server.

• Decide on the best and most plausible
  configuration.




The Projects
• Mining EZProxy logfiles and linking to
  patron statistical categories from the
  Voyager Patron Database

  – What majors and departments are accessing
    which database services?

  – What majors and departments are accessing
    the ILL services?
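
The linking idea behind both questions can be sketched in a few lines of Python: look up each log row's user ID in a table of patron statistical categories and count hits per category and host. This is only an illustration under simple assumptions; the user IDs, categories and rows below are made up, and in the setup described here the join would be done inside the DBMS.

```python
from collections import Counter


def usage_by_category(log_rows, patron_categories):
    """Count hits per (category, host) for log rows whose user is known."""
    counts = Counter()
    for user, host in log_rows:
        # Users with no patron record contribute nothing.
        for category in patron_categories.get(user, []):
            counts[(category, host)] += 1
    return counts


# Made-up lookup table: user ID -> statistical categories.
patron_categories = {
    "theuser": ["M- Nursing"],
    "otheruser": ["M- History", "S- History Dept."],
}
# Made-up (user ID, host) pairs as they might come out of a parsed proxy log.
log_rows = [
    ("theuser", "ebscohost.com"),
    ("theuser", "ebscohost.com"),
    ("otheruser", "serials.abc-clio.com"),
    ("unknown", "csa.com"),
]
for (category, host), n in sorted(usage_by_category(log_rows, patron_categories).items()):
    print(category, n, host)
```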



Systems Chart - 2008
[Diagram, with generic component names. Components shown:]
• Integrated Library System: Scripting Language, Web Server, Voyager (Circulation, Media Scheduling, Patrons, Searches)
• Application Server: Scripting Language, Web Server, DBMS (OffCampus Dbase Usage by Patron Groups; ILL Patrons/Materials Requested; ILL Patrons/Materials Received)
• www.wpunj.edu: Scripting Language, Web Server (ILL Form, Microform Form, Serials Form, ER Page)
• Proxy Server: Off Campus Dbase Hits & ILL Form (EZProxy Log)
• Banner (University ERP System): SIS, HRS
• University Email Server
• University Networked Drive K: ILL (Cliodata) with Patrons and Materials
• Serials Solutions: A to Z, MARC Records, Link Resolver
• OCLC: WCA, WorldCat, ILL
• Other Vendors' Database Services & Usage Reports
• Legend: Current Relationships; ILL Collection and Patron Group Analyses; Off Campus Database Hits by Patron Group; Internal only WPUNJ Server; Externally accessible WPUNJ Server; Non WPUNJ Server
ILL request form authentications by major –
                Academic year 07/08

Article                               Book
Count  Major                          Count  Major
   62  M- Psychology                     90  M- History
   60  M- Sociology                      28  M- Non-Degree
   42  M- Applied Clinical Psych         25  M- Pub Pol & Intl Affairs
   35  M- Education                      20  M- Spanish
   31  M- History                        18  M- English
   30  M- Spanish                        16  M- Undecided
   29  M- Nursing                        14  M- Art
   19  M- Communication Disorders        14  M- Education
   19  M- Communication                  11  M- Sociology
   14  M- Biotechnology                  10  M- Biology
   14  M- Counseling                      9  M- Music
   14  M- English                         9  M- Special Programs
   12  M- Non-Degree                      8  M- Psychology
   10  M- Community/Sch Health            7  M- Biotechnology
    7  M- Biology                         7  M- Political Science
    7  M- Political Science               6  M- Anthropology
    6  M- Undecided                       6  M- Music - Jazz Studies
    5  M- Comm Media Studies              4  M- Business
    5  M- Reading                         4  M- Communication
    4  M- Business                        4  M- Nursing
Which Databases are
     accessed by Majors and
         Departments?




07/29/08
By Major and Host
  Major                       Count Host
  M- Nursing                    3377 ebscohost.com
  M- Non-Degree                 3010 ebscohost.com
  M- Psychology                 2303 ebscohost.com
  M- Counseling                 1487 ebscohost.com
  M- Communication              1359 ebscohost.com
  M- Education                  1267 ebscohost.com
  M- Business                   1246 proquest.umi.com
  M- Sociology                  1152 ebscohost.com
  M- Business                   1145 lexis-nexis.com
  M- Undecided                  1100 ebscohost.com
  M- Applied Clinical Psych     1075 ebscohost.com
  M- English                    1034 ebscohost.com
  M- Sociology                   916 csa.com
  M- Business                    794 ebscohost.com
  M- Accounting                  738 lexis-nexis.com
  M- Reading                     683 ebscohost.com
  M- Physical Education          653 ebscohost.com
  M- Special Programs            600 ebscohost.com
  M- Non-Degree                  463 ereserve.wpunj.edu

By Dept and Host
Department               Count Host
S- Information Systems     933 webscript.exe?fs.scr
S- Psychology Dept.        742 ebscohost.com
S- Accounting and Law      559 lexis-nexis.com
S- Political Sci Dept.     308 lexis-nexis.com
S- Nursing Dept.           204 ebscohost.com
S- Market & Mgt. Dept.     175 proquest.umi.com
S- Library                 167 ebscohost.com
S- Sociology Dept.         151 ebscohost.com
S- Sociology Dept.         134 csa.com
S- History Dept.           121 serials.abc-clio.com
S- Exercise & Mov Sci      110 ebscohost.com
S- Political Sci Dept.     104 ebscohost.com
S- Library                 103 ILL_article.cfm
S- Library                 100 webscript.exe?fs.scr
S- History Dept.             94 webscript.exe?fs.scr

By Dept and Service

Department                Count Service
S- Information Systems       933 http://www.wpunj.edu/scripts/webscript.exe?fs.scr
S- Accounting and Law        549 http://www.lexis-nexis.com/universe
S- Psychology Dept.          364 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=psych
S- Nursing Dept.             114 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=c8h
S- Sociology Dept.            96 http://www.csa.com/htbin/dbrng.cgi?&db=socioabs-set-c&adv=1
S- Sociology Dept.            75 http://search.ebscohost.com/login.asp?profile=asp
S- Philosophy Dept.           74 http://webspirs4.silverplatter.com:8900/c119646?sp.form.first.p=srchmain.htm&sp.dbid.p=S(PHIL
S- Library                    65 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=asp
S- Anthropology Dept.         62 http://www.sciencedirect.com/
S- History Dept.              61 http://serials.abc-clio.com/active/start?_appname=serials&initialdb=AHL
S- Psychology Dept.           61 http://search.ebscohost.com/login.asp?profile=psyart
S- History Dept.              58 http://serials.abc-clio.com/active/start?_appname=serials&initialdb=HA
S- Psychology Dept.           54 http://search.ebscohost.com/login.asp?profile=psych
S- Psychology Dept.           42 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=psyart
S- English Dept.              42 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=mzh

Some concerns

Patron Privacy and Standards




Using Voyager as the model
      for Patron Privacy




• Active Circ transactions are stored in a
  table with patron ID and statistical
  categories.

• Completed Circ transactions are stored
  in a table without the patron ID, but still
  with the patron statistical categories.
• The Patron Table contains the total
  counts of transactions for each patron,
  but no link to which transactions they
  are.

• EZProxy transactions would be stored in
  one table with patron statistical
  categories, but without the user ID.

• User IDs would be stored in another
  table with counts for each service divided
  by academic year.

• Logs are collected monthly, loaded into
  the database, and then deleted.
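
A minimal Python sketch of this two-table split, under stated assumptions: the row layout, the function names and the September start of the academic year are all hypothetical and only serve to illustrate the idea. Transactions keep the statistical categories but drop the user ID, while a separate counter keeps per-user totals by service and academic year with no link back to individual transactions.

```python
from collections import Counter
from datetime import date


def academic_year(day, start_month=9):
    """Label such as '07/08'; the September start is an assumption."""
    first = day.year if day.month >= start_month else day.year - 1
    return f"{first % 100:02d}/{(first + 1) % 100:02d}"


def split_for_privacy(rows, patron_categories):
    """Split (user, day, service) rows into the two privacy-preserving tables."""
    transactions = []        # statistical categories only, no user ID
    user_counts = Counter()  # (user, service, academic year) -> count
    for user, day, service in rows:
        categories = tuple(patron_categories.get(user, ()))
        transactions.append((day, service, categories))
        user_counts[(user, service, academic_year(day))] += 1
    return transactions, user_counts


# Made-up data for illustration.
categories_by_user = {"theuser": ["M- Nursing"]}
rows = [
    ("theuser", date(2008, 1, 1), "ebscohost.com"),
    ("theuser", date(2008, 2, 3), "ebscohost.com"),
]
transactions, user_counts = split_for_privacy(rows, categories_by_user)
print(transactions[0])
print(user_counts[("theuser", "ebscohost.com", "07/08")])
```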


Example of EZProxy log entry
•   IP address       nj.dhcp.embarqhsd.net
•   (Not used)       -
•   User ID          theuser
•   Date/time        1/1/2008 4:25:15 AM
•   Method           GET
•   Page retrieved   http://ezproxy.wpunj.edu:2048/connect?session=sGHMbeSss121YxZa&url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr
•   Version          HTTP/1.1
•   Response code    302
•   No. of bytes     537
•   Referring URL    http://ezproxy.wpunj.edu:2048/login?url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr
•   User agent       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)
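
The service a patron was heading for sits in the `url=` parameter of the page retrieved. A minimal Python sketch of pulling out that target and its host follows. The function name `target_of` is hypothetical, and taking everything after `url=` as the target is an assumption made so that `&` characters inside the target URL are kept.

```python
import re
from urllib.parse import urlsplit

def target_of(page_url):
    """Return (target_url, target_host), or (None, None) if no url= parameter."""
    m = re.search(r"[?&]url=(.+)$", page_url)
    if m is None:
        return None, None
    target = m.group(1)
    return target, urlsplit(target).hostname

# The page retrieved from the example log entry.
PAGE = ("http://ezproxy.wpunj.edu:2048/connect?session=sGHMbeSss121YxZa"
        "&url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr")

target, host = target_of(PAGE)
print(target)
print(host)
```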



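Perl Script for loading EZProxy log into MySQL
use strict;

# Map month abbreviations in the log timestamp to two-digit numbers.
my %month=(Jan=>'01',Feb=>'02',Mar=>'03',Apr=>'04',May=>'05',Jun=>'06',
           Jul=>'07',Aug=>'08',Sep=>'09',Oct=>'10',Nov=>'11',Dec=>'12');

# Four leading tokens, [dd/Mon/yyyy:hh:mm:ss zone], "method url version",
# response code, bytes (or -), "referrer", "user agent".
my $pattern =
       '^(\S*) (\S*) (\S*) (\S*) '.
       '\[(..)/(...)/(....):(..):(..):(..) .....\]'.
       ' "(\S*) (\S*) (\S*)" '.
       '(\d*) (-|\d*) "([^"]*)" "([^"]*)"';

while (<>){
     if (m/$pattern/){
            my ($tgt,$ref,$agt) = (esc($12),esc($16),esc($17));
            # A byte count of "-" is loaded as NULL.
            my $byt = $15 eq '-' ? 'NULL' : $15;
            print "INSERT INTO ezproxylogs VALUES ('$1','$2','$3',".
                    " TIMESTAMP '$7/$month{$6}/$5 $8:$9:$10','$11','$tgt',".
                    "'$13',$14,$byt,'$ref','$agt');\n";
     }else{
            # MySQL needs a space after "--" to treat the line as a comment.
            print "-- Skipped line $.\n";
     }
}

# Double any single quotes so the value is safe inside an SQL string literal.
sub esc{
     my ($p) = @_;
     $p =~ s/'/''/g;
     return $p;
}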
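
As an alternative to generating INSERT statements as text, the same parse can feed parameterized inserts, which removes the need to escape quotes by hand. The sketch below is in Python with SQLite standing in for MySQL, so it is a swapped-in technique rather than the presenter's method. The sample line is reconstructed from the example entry: its time zone and fourth token are assumptions, and the column names of `ezproxylogs` are hypothetical.

```python
import re
import sqlite3

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# Same field layout as the Perl pattern: four tokens, bracketed timestamp,
# quoted request, response code, bytes (or -), quoted referrer and agent.
LINE_RE = re.compile(
    r'^(\S+) (\S+) (\S+) (\S+) '
    r'\[(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) [^\]]*\]'
    r' "(\S+) (\S+) (\S+)" '
    r'(\d+) (-|\d+) "([^"]*)" "([^"]*)"'
)

def parse_line(line):
    """Return a tuple ready for insertion, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    (host, _ident, user, _field4, day, mon, year, hh, mm, ss,
     method, url, version, status, size, referrer, agent) = m.groups()
    if mon not in MONTHS:
        return None
    logged_at = f"{year}-{MONTHS[mon]:02d}-{day} {hh}:{mm}:{ss}"
    return (host, user, logged_at, method, url, version, int(status),
            None if size == "-" else int(size), referrer, agent)

def load(conn, lines):
    """Insert every parsable line and return the number of skipped lines."""
    skipped = 0
    for line in lines:
        row = parse_line(line)
        if row is None:
            skipped += 1
            continue
        conn.execute("INSERT INTO ezproxylogs VALUES (?,?,?,?,?,?,?,?,?,?)", row)
    return skipped

# Hypothetical raw line assembled from the example entry's fields.
SAMPLE = ('nj.dhcp.embarqhsd.net - theuser - [01/Jan/2008:04:25:15 -0500] '
          '"GET http://ezproxy.wpunj.edu:2048/connect?session=sGHMbeSss121YxZa'
          '&url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr HTTP/1.1" 302 537 '
          '"http://ezproxy.wpunj.edu:2048/login?url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr" '
          '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"')

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ezproxylogs (host TEXT, user_id TEXT, logged_at TEXT, "
    "method TEXT, url TEXT, version TEXT, status INTEGER, bytes INTEGER, "
    "referrer TEXT, agent TEXT)"
)
skipped = load(conn, [SAMPLE, "not a log line"])
print("skipped:", skipped)
```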
Created table to assist the linking
SELECT PATRON_ADDRESS.ADDRESS_TYPE,
       Left([ADDRESS_LINE1],InStr([ADDRESS_LINE1],"@")-1) AS usr,
       PATRON_ADDRESS.PATRON_ID,
       PATRON_ADDRESS.ADDRESS_STATUS,
       PATRON_ADDRESS.EFFECT_DATE,
       PATRON_ADDRESS.EXPIRE_DATE,
       PATRON_ADDRESS.MODIFY_DATE,
       PATRON_ADDRESS.MODIFY_OPERATOR_ID
INTO emailprefix
FROM PATRON_ADDRESS
WHERE (((PATRON_ADDRESS.ADDRESS_TYPE)="3"));
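
The query above keeps the part of each e-mail address before the "@" as `usr`, which can then be matched against the user ID in the proxy log. A minimal Python sketch of that lookup follows, with SQLite standing in for the Access query. It assumes the proxy login name equals the e-mail prefix; the sample rows and the helper `patron_for` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PATRON_ADDRESS (
        PATRON_ID     INTEGER,
        ADDRESS_TYPE  TEXT,
        ADDRESS_LINE1 TEXT
    );
    -- Made-up rows: address type 3 holds the e-mail address.
    INSERT INTO PATRON_ADDRESS VALUES (101, '3', 'theuser@student.example.edu');
    INSERT INTO PATRON_ADDRESS VALUES (101, '1', '300 Example Road');
    INSERT INTO PATRON_ADDRESS VALUES (102, '3', 'otheruser@example.edu');

    -- SQLite equivalent of Left(x, InStr(x, "@") - 1).
    CREATE TABLE emailprefix AS
        SELECT PATRON_ID,
               substr(ADDRESS_LINE1, 1, instr(ADDRESS_LINE1, '@') - 1) AS usr
        FROM PATRON_ADDRESS
        WHERE ADDRESS_TYPE = '3' AND instr(ADDRESS_LINE1, '@') > 1;
    """
)

def patron_for(conn, proxy_user):
    """Return the patron ID whose e-mail prefix matches the proxy user ID."""
    row = conn.execute(
        "SELECT PATRON_ID FROM emailprefix WHERE usr = ?", (proxy_user,)
    ).fetchone()
    return row[0] if row else None

print(patron_for(conn, "theuser"))
```

The extra `instr(...) > 1` condition skips addresses without an "@", a case the original query does not guard against.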
The question of standards


Need standards to share data for comparative research




Types of Reporting
Email Reports
Periodic - e.g., Daily Dossiers
Event Triggered
On Demand
Email, web or print
Use by Dept/Major
Use by Fund Code Purchases




Questions?


             Ray Schwartz,
      Systems Specialist Librarian
Cheng Library, William Paterson University,
       Wayne, New Jersey, USA
        schwartzr2 @ wpunj.edu





Crushing, Blending, and Stretching Data

  • 1. Crushing, Blending, and Stretching Data Data Warehousing and Mining Data from Voyager and Other Library and University Systems for Assessment of Library Operations ELUNA Conference 2008, Long Beach, CA, Friday, August 1, 2008 Ray Schwartz, Systems Specialist Librarian Cheng Library, William Paterson University, Wayne, New Jersey, USA schwartzr2 @ wpunj.edu
  • 2. Outline • Why Assessment and Why Now? • What is Data Mining and Data Warehousing and Why Do We Do It? • Our Context • Groups and Services • Steps • Reporting 2
  • 3. Outline • What is Data Mining and Data Warehousing? • Our Context • Groups and Services • Steps • Reporting 3
  • 4. Have We Always Assessed? • Anecdotally—Yes. • Systematically—Not usually. – Large scale assessment of manual systems (such as serials check-in, and card catalogs, circulation files) are not practical. – Smaller scale and directed assessment is possible. 4
  • 5. What changed since the days of manual systems? • For many institutions in the West, the Integrated Library System has been in use for over 20 years. • Larger scale assessment is now possible with the electronic systems. 5
  • 6. 6
  • 7. 7
  • 8. What is different now? • New services have come into existence. – Inside libraries • Full-Text Databases • Link Resolvers – Outside of libraries • Google • Amazon 8
  • 9. 9
  • 10. What is Data Mining and Data Warehousing • Extracting data from legacy systems and other resources; • cleaning, scrubbing and preparing data for decision support; • maintaining data in appropriate data stores; • accessing and analysing data using a variety of end user tools; • and mining data for significant relationships. • Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing: Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall. 10
  • 11. • The primary purpose of these efforts is to provide easy access to specifically prepared data that can be used with decision support applications such as management reports, queries, decision support systems , executive information systems and data mining. • Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing: Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall. 11
  • 12. Of course there are many ways to measure – Scott Nicholson’s Measurement Model 12
  • 13. Measurement Matrix with methodologies Topic Perspective Library System Use Procedures and Standards Recorded interactions with Internal (Library •Staff survey and interviews interface & materials System) •Audits of collections, systems, •Bibliomining or staff •Transaction/Web Log Analysis •Observation of User Behavior Aboutness and Usability Knowledge states and User External •Surveys and interviews citations to materials •Talk-alouds and inprocess •Surveys and interviews (User) feedback mechanisms •Focus groups •Focus groups •User Citation tracking 13 Nicholson, Scott (2004). A Conceptual framework for the holistic measurement and cumulative evaluation of library services. Journal of Documentation 60(2) p.164-181
  • 15. Our University • 9000 undergraduates • 1000 graduates (mostly education majors) • 400 faculty • 800 adjuncts • 1000 staff 15
  • 16. Our Library • 19 librarians and 26 library staff • 350,000 volumes • 18,000 audiovisual items • 22,000 print and electronic periodicals • 100 general and subject specific databases 16
  • 17. Our Systems circa 2005 • Voyager ILS – Cheng Server • Online Periodical Database (OPD) • Clio ILL Software • EZProxy Server - Zeus • Banner – University ERP • University Networked Drive K: • University Email Server • University Web Server 17
  • 18. Vendor Services • Serials Solutions • OCLC • Blackwell • Ebsco • Marcive • Database Vendors 18
  • 19. The Question Which categories of patrons are accessing which services? 19
  • 20. First Step – Patron Statistical Categories 20
  • 21. • Voyager Patron Database allows a maximum of 10 statistical categories per patron record. • Decide which statistical categories are needed for each patron group defined. • Work with your University Information Systems Department to extract the relevant data from the relevant sources. 21
  • 22. Groups and Services • Major • Circulation • Status – Books – Media – Undergrad or Grad – Reserve – Faculty, Adjunct Faculty or – By Fund Code Staff – Location • Department • ILL / Document Delivery • College • Databases • Degree • Library Web Pages • – Subject Area Resource Guides No. of Credits – Reference Requests • Year of Study • Catalog • Campus Location • Other Vendor Services – Serials Solutions 22
  • 23. History Department - 12 months - Feb. 2008 % BORROW CIRC/ CIRC/ PATRON STATUS BOOK CIRC MEDIA CIRC EQUIP CIRC TOTAL CIRC MEMBERS BORROWERS ING MEMBER BORROWER UNDERGRADUATE STUDENTS 2,715 250 698 3,663 238 186 78% 15.39 19.69 GRADUATE STUDENTS 419 13 76 508 14 13 93% 36.29 39.08 ADJUNCT FACULTY 100 65 20 185 32 20 63% 5.78 9.25 FULL-TIME FACULTY 159 115 194 468 24 23 96% 19.50 20.35 HISTORY TOTALS 3,393 443 988 4,824 308 242 79% 15.66 19.93 LIBRARY TOTALS 23,370 8,713 20,703 52,756 7,418 4,981 67% 7.11 10.59 DEFINITIONS: BOOK CIRCULATION = books, book disks, maps, oversize, Curriculum materials, reserve books, NJ History, Leisure Lounge MEDIA CIRCULATION = audio & video materials, including media reserves EQUIPMENT CIRCULATION = camcorders, overhead & data projectors, laptops, easels, DVD players, etc. MEMBER = declared major or department member BORROWER = any member who borrowed materials Library Total = declared undergrad & grad majors, adjuncts & full time faculty borrowers 23
  • 24. Problems with Configuration of Services • Little to no linkage of data • Need to search multiple services to get complete picture of serial holdings • Multiple user IDs for authentication 24
  • 25. Systems Chart – ca. 2005 Cheng Server www.wpunj.edu Online Periodicals Serials Form Perl Database ColdFusion ILL Form Web Server ER Micro Pag Web Server Oracle Form e Voyager Materials Zeus Circulation Media Scheduling Off Campus Dbase Hits Patrons Patrons Searches & ILL Form ( EZProxy Log ) Banner SIS HRS University Networked Drive K: ( University ERP System ) University Email Server Patrons Materials ILL ( Cliodata ) Serials Solutions OCLC A to Z WorldCat ILL Other Vendors‘ Database Services Current Relationships Internal Externally & Usage Reports only accessible Non WPUNJ WPUNJ WPUNJ 25 Server Server Server
  • 26. Retirement the the OPD • Serials holdings data was extracted from the OPD and added to Voyager catalog • From Voyager catalog, serials holdings data is extracted and added to Serials Solutions A to Z list 26
  • 27. Retirement of the OPD cont. • Authentication of ILL form is routed through the EZProxy server • A web bug is placed in the microform request page to record submission in the Voyager server's web logfile. 27
  • 28. Systems Chart – ca. 2005 – Retiring the OPD Cheng Server www.wpunj.edu Online Periodicals Serials Form Perl Database ColdFusion ILL Form Web Server ER Micro Pag Web Server Oracle Form e Voyager Materials Zeus Circulation Media Scheduling Off Campus Dbase Hits Patrons Patrons Searches & ILL Form ( EZProxy Log ) Banner SIS HRS University Networked Drive K: ( University ERP System ) University Email Server Patrons Materials ILL Serials Solutions OCLC A to Z WorldCat ILL Other Vendors‘ Database Services Current Relationships Internal Externally & Usage Reports only accessible Non WPUNJ WPUNJ WPUNJ 28 Server Server Server
  • 29. New Services Added • Serials Solutions MARC Record Service • Serials Solutions Link Resolver • OCLC Worldcat Collection Analysis 29
  • 30. Systems Chart – ca. 2005 – New Services Added Cheng Server www.wpunj.edu Serials Form Perl ColdFusion ILL Form Web Server ER Micro Pag Web Server Form e Voyager Zeus Circulation Media Scheduling Off Campus Dbase Hits Patrons Searches & ILL Form ( EZProxy Log ) Banner SIS HRS University Networked Drive K: ( University ERP System ) University Email Server Patrons Materials ILL ( Cliodata ) Serials Solutions OCLC A to Z W WorldCat MARC Records C Link Resolver A ILL Other Vendors‘ Database Services Current Relationships Internal Externally & Usage Reports only accessible Non WPUNJ WPUNJ WPUNJ 30 Server Server Server
  • 31. Our Systems in 2008 • Voyager ILS – Cheng Server • Shared Application Server • Clio ILL Software • EZProxy Server - Zeus • Banner – University ERP • University Networked Drive K: • University Email Server • University Web Server 31
  • 32. Systems Chart - 2008 Cheng Server Application Server www.wpunj.edu Serials Form Perl ColdFusion ILL Form ColdFusion Web Server ER Micro Pag Web Server Form e Voyager Web Server Zeus Circulation Media Scheduling DBMS Off Campus Dbase Hits Patrons Searches & ILL Form OffCampus ILL ILL Dbase Patrons/ Patrons/ ( EZProxy Log ) Usage by Materials Materials Patron Requested Groups Received Banner SIS HRS University Networked ( University ERP System ) University Email Server Drive K: Patrons Materials Serials Solutions OCLC ILL ( Cliodata ) A to Z W WorldCat MARC Records C Link Resolver A ILL Other Vendors‘ Database Services & Usage Reports Current Relationships Internal Externally only accessible Non WPUNJ WPUNJ WPUNJ 32 Server Server Server
  • 33. Second Step – Setup an Application Server 33
  • 34. What is an Application Server? • A machine or its software that works in conjunction with a web server to deliver application services such as the dynamic creation of a webpage from content stored in a database. From http://www.webtools.ca.gov/help/Glossary.asp • Web Server Software (Apache or IIS) • Database Management System – DBMS (MySQL, Oracle, MS SQL Server) • Scripting Language (Perl, PHP, ColdFusion, ASP) 34
  • 35. Why an Application Server? • Relevant data in logfiles need to be in a database to be analyze. • Need your own DBMS to create new tables and queries. 35
  • 36. • Decide how you will use the Application Server. • Decide on the best and most plausible configuration. 36
  • 37. The Projects • Mining EZProxy logfiles and linking to patron statistical categories from the Voyager Patron Database – What majors and departments are accessing which database services? – What majors and departments are accessing the ILL services? 37
  • 38. Systems Chart - 2008 Integrated Library System Application Server www.wpunj.edu Serials Form Scripting Language Scripting Language Scripting Language ILL Form Web Server ER Micro Pag Web Server Form e Voyager Web Server Proxy Server Circulation Media Scheduling DBMS Off Campus Dbase Hits Patrons Searches & ILL Form OffCampus ILL ILL Dbase Patrons/ Patrons/ ( EZProxy Log ) Usage by Materials Materials Patron Requested Groups Received Banner SIS HRS University Networked ( University ERP System ) University Email Server Drive K: Patrons Materials Serials Solutions OCLC ILL ( Cliodata ) A to Z W WorldCat MARC Records C Link Resolver A ILL Other Vendors‘ Database Services & Usage Reports Current Relationships Internal Externally ILL Collection and Patron Group Analyses only accessible Non WPUNJ WPUNJ WPUNJ 38 Off Campus Database Hits by Patron Group Server Server Server
  • 39. ILL request form authentications by major – Academic year 07/08 Article Book Count Major Count Major 62 M- Psychology 90 M- History 60 M- Sociology 28 M- Non-Degree 42 M- Applied Clinical Psych 25 M- Pub Pol & Intl Affairs 35 M- Education 20 M- Spanish 31 M- History 18 M- English 30 M- Spanish 16 M- Undecided 29 M- Nursing 14 M- Art M- Communication 14 M- Education 19 Disorders 11 M- Sociology 19 M- Communication 10 M- Biology 14 M- Biotechnology 9 M- Music 14 M- Counseling 9 M- Special Programs 14 M- English 8 M- Psychology 12 M- Non-Degree 7 M- Biotechnology 10 M- Community/Sch Health 7 M- Political Science 7 M- Biology 6 M- Anthropology 7 M- Political Science 6 M- Music - Jazz Studies 6 M- Undecided 4 M- Business 5 M- Comm Media Studies 4 M- Communication 5 M- Reading 4 M- Nursing 4 M- Business 39
  • 40. Which Databases are accessed by Majors and Departments? 07/29/08
  • 41. By Major and Host Major Count Host M- Nursing 3377 ebscohost.com M- Non-Degree 3010 ebscohost.com M- Psychology 2303 ebscohost.com M- Counseling 1487 ebscohost.com M- Communication 1359 ebscohost.com M- Education 1267 ebscohost.com M- Business 1246 proquest.umi.com M- Sociology 1152 ebscohost.com M- Business 1145 lexis-nexis.com M- Undecided 1100 ebscohost.com M- Applied Clinical Psych 1075 ebscohost.com M- English 1034 ebscohost.com M- Sociology 916 csa.com M- Business 794 ebscohost.com M- Accounting 738 lexis-nexis.com M- Reading 683 ebscohost.com M- Physical Education 653 ebscohost.com M- Special Programs 600 ebscohost.com M- Non-Degree 463 ereserve.wpunj.edu 07/29/08
  • 42. By Dept and Host Department Count Host S- Information Systems 933 webscript.exe?fs.scr S- Psychology Dept. 742 ebscohost.com S- Accounting and Law 559 lexis-nexis.com S- Political Sci Dept. 308 lexis-nexis.com S- Nursing Dept. 204 ebscohost.com S- Market & Mgt. Dept. 175 proquest.umi.com S- Library 167 ebscohost.com S- Sociology Dept. 151 ebscohost.com S- Sociology Dept. 134 csa.com S- History Dept. 121 serials.abc-clio.com S- Exercise & Mov Sci 110 ebscohost.com S- Political Sci Dept. 104 ebscohost.com S- Library 103 ILL_article.cfm S- Library 100 webscript.exe?fs.scr S- History Dept. 94 webscript.exe?fs.scr 07/29/08
  • 43. By Dept and Service Department Count Service S- Information Systems 933 http://www.wpunj.edu/scripts/webscript.exe?fs.scr S- Accounting and Law 549 http://www.lexis-nexis.com/universe S- Psychology Dept. 364 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=psych S- Nursing Dept. 114 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=c8h S- Sociology Dept. 96 http://www.csa.com/htbin/dbrng.cgi?&db=socioabs-set-c&adv=1 S- Sociology Dept. 75 http://search.ebscohost.com/login.asp?profile=asp http://webspirs4.silverplatter.com:8900/c119646? S- Philosophy Dept. 74 sp.form.first.p=srchmain.htm&sp.dbid.p=S(PHIL S- Library 65 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=asp S- Anthropology Dept. 62 http://www.sciencedirect.com/ S- History Dept. 61 http://serials.abc-clio.com/active/start?_appname=serials&initialdb=AHL S- Psychology Dept. 61 http://search.ebscohost.com/login.asp?profile=psyart S- History Dept. 58 http://serials.abc-clio.com/active/start?_appname=serials&initialdb=HA S- Psychology Dept. 54 http://search.ebscohost.com/login.asp?profile=psych S- Psychology Dept. 42 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=psyart S- English Dept. 42 http://search.ebscohost.com/login.aspx?authtype=ip,uid&profile=mzh 07/29/08
  • 44. Some concerns: Patron Privacy and Standards
  • 45. Using Voyager as the model for Patron Privacy
  • 46.
      • Active Circ transactions are stored in a table with the patron ID and the patron statistical categories.
      • Completed Circ transactions are stored in a table without the patron ID, but still with the patron statistical categories.
      • The Patron Table contains the total count of transactions for each patron, but no link to the individual transactions.
  • 47.
      • EZProxy transactions would be stored in one table with patron statistical categories, but without the user ID.
      • User IDs would be stored in another table with counts for each service, divided by academic year.
      • Logs are collected, loaded, and deleted monthly.
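The two-table design described above might look like the following. It is a minimal sketch using SQLite from Python for illustration only (the deck loads into MySQL); the table names, column names, and the `load_entry` helper are hypothetical, and the academic year is assumed to start on 1 July, which the slides do not state.

```python
import sqlite3
from datetime import datetime


def academic_year(when):
    """Label a date with its academic year (assumption: the year starts on 1 July)."""
    start = when.year if when.month >= 7 else when.year - 1
    return f"{start}-{start + 1}"


def create_tables(db):
    """Create one table of transactions without user IDs and one of per-user counts."""
    db.executescript("""
        CREATE TABLE IF NOT EXISTS proxy_use (
            used_at TEXT, service TEXT, stat_category TEXT
        );
        CREATE TABLE IF NOT EXISTS user_counts (
            user_id TEXT, service TEXT, academic_year TEXT, uses INTEGER,
            PRIMARY KEY (user_id, service, academic_year)
        );
    """)


def load_entry(db, user_id, stat_category, service, when):
    """Record one transaction; the user ID only increments a yearly counter."""
    db.execute("INSERT INTO proxy_use VALUES (?, ?, ?)",
               (when.isoformat(sep=" "), service, stat_category))
    key = (user_id, service, academic_year(when))
    db.execute("INSERT OR IGNORE INTO user_counts VALUES (?, ?, ?, 0)", key)
    db.execute("UPDATE user_counts SET uses = uses + 1 "
               "WHERE user_id = ? AND service = ? AND academic_year = ?", key)
```

As with Voyager's completed circulation transactions, nothing in `proxy_use` points back to a row in `user_counts`.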
  • 48. Example of EZProxy log entry
      • IP address: nj.dhcp.embarqhsd.net
      • (Not used): -
      • User ID: theuser
      • Date/time: 1/1/2008 4:25:15 AM
      • Method: GET
      • Page retrieved: http://ezproxy.wpunj.edu:2048/connect?session=sGHMbeSss121YxZa&url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr
      • Version: HTTP/1.1
      • Response code: 302
      • No. of bytes: 537
      • Referring URL: http://ezproxy.wpunj.edu:2048/login?url=http://www.wpunj.edu/scripts/webscript.exe?fs.scr
      • User agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)
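  • 49. Perl Script for loading ezproxy log into MySQL
      use strict;
      # Map the month abbreviations used in the log timestamp to two-digit numbers.
      my %month=(Jan=>'01',Feb=>'02',Mar=>'03',Apr=>'04',May=>'05',Jun=>'06',Jul=>'07',
                 Aug=>'08',Sep=>'09',Oct=>'10',Nov=>'11',Dec=>'12');
      while (<>){
        # Four leading fields, bracketed timestamp, quoted request,
        # status, byte count, quoted referrer, quoted user agent.
        my $pattern = '^(\S*) (\S*) (\S*) (\S*) '.
                      '\[(..)/(...)/(....):(..):(..):(..) .....\]'.
                      ' "(\S*) (\S*) (\S*)" '.
                      '(\d*) (-|\d*) "([^"]*)" "([^"]*)"';
        if (m/$pattern/){
          my ($tgt,$ref,$agt) = (esc($12),esc($16),esc($17));
          # A byte count logged as '-' becomes SQL NULL.
          my $byt = $15 eq '-'?'NULL':$15;
          print "INSERT INTO ezproxylogs VALUES ('$1','$2','$3',".
                " TIMESTAMP '$7/$month{$6}/$5 $8:$9:$10','$11','$tgt',".
                "'$13',$14,$byt,'$ref','$agt');\n";
        }else{
          # MySQL needs a space after '--' for the line to count as a comment.
          print "-- Skipped line $.\n";
        }
      }
      # Double any single quotes so the value is safe inside an SQL string literal.
      sub esc{
        my ($p) = @_;
        $p =~ s/'/''/g;
        return $p;
      }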
  • 50. Created table to assist the linking
      SELECT PATRON_ADDRESS.ADDRESS_TYPE,
             Left([ADDRESS_LINE1],InStr([ADDRESS_LINE1],"@")-1) AS usr,
             PATRON_ADDRESS.PATRON_ID,
             PATRON_ADDRESS.ADDRESS_STATUS,
             PATRON_ADDRESS.EFFECT_DATE,
             PATRON_ADDRESS.EXPIRE_DATE,
             PATRON_ADDRESS.MODIFY_DATE,
             PATRON_ADDRESS.MODIFY_OPERATOR_ID
      INTO emailprefix
      FROM PATRON_ADDRESS
      WHERE (((PATRON_ADDRESS.ADDRESS_TYPE)="3"));
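The query above keeps the part of each type "3" address before the "@". A minimal Python sketch of the same extraction follows, assuming (the slide does not say so explicitly) that this prefix is what gets matched against the EZProxy user ID. The function names are hypothetical, and lower-casing the prefix is an added assumption.

```python
def email_prefix(address):
    """Return the text before the first '@', or None when there is no '@'."""
    prefix, sep, _ = address.partition("@")
    return prefix if sep else None


def build_prefix_index(patron_addresses):
    """Map email prefix to PATRON_ID for rows of (patron_id, address_type, address_line1)."""
    index = {}
    for patron_id, address_type, address_line1 in patron_addresses:
        if address_type != "3":
            continue  # same filter as the WHERE clause on ADDRESS_TYPE
        prefix = email_prefix(address_line1)
        if prefix:
            index[prefix.lower()] = patron_id  # lower-casing is an assumption
    return index
```

Rows whose address has no "@" are skipped here rather than producing a prefix.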
  • 51. The question of standards: standards are needed to share data for comparative research.
  • 52. Types of Reporting: Email Reports Periodic - e.g., Daily Dossiers Event Triggered On Demand Email, web or print Use by Dept/Major Use by Fund Code Purchases
  • 53. Questions? Ray Schwartz, Systems Specialist Librarian, Cheng Library, William Paterson University, Wayne, New Jersey, USA. schwartzr2 @ wpunj.edu