Invisible Institutional Repositories: Addressing the Low Indexing Ratio of IRs in Google Scholar by Transforming Metadata Schema

Kenning Arlitsch & Patrick OBrien
October 31, 2011
2011 Fall DLF, Baltimore, MD

Today’s Objectives

- Discuss Marriott Library SEO program
  - Program Priorities & Results
  - Issues & Opportunity
  - Google Scholar

Marriott Library SEO program priorities

- Digital repositories vs. general websites
  - Millions of objects in databases
  - Include IR
- Priority 1 – Increase Reach
  - Get objects indexed in search engines
- Priority 2 – Increase Visibility
  - Provide robust descriptive content

Collection Google Index Ratios have increased across the board…

Google Index Ratio - All Collections*

            07/05/10    04/04/11    10/16/11
Average     12%         51%         74%
High**      37%         87%         100%

 * Google Index Ratio = URLs indexed by Google / URLs submitted, for about 150 collections containing ~170,00 URLs
 ** Highest index ratio achieved for collections with over 500 URLs submitted to Google

…increasing Google referrals by 200% and total visitors by 79% (12-week year-over-year comparison).

However, Google Scholar Index Ratios ??

Google Scholar Index Ratio: 0%

You can find Marriott IR papers in Google now, but cannot find them in Google Scholar. Why?

College Students Begin Research - 2005

College Students Begin Research - 2010

DeRosa, Cathy, et al. “Perceptions of Libraries, 2010: Context and Community: A Report to the OCLC Membership”, OCLC, 2010.

Start with the 800 pound gorilla – Google.

Marriott Library Management Experiences

- Large digital collections built over a decade
  - 1.3+ million items
- Why weren’t we getting indexed?
  - Harvesting/indexing rates as low as 8%
  - Non-existent IR showing in Google Scholar
- Sitemaps generated for Google

MWDL Repositories Survey: % w/ Indirect URL (October 2010)

[Bar chart, axis 0% to 100%; bar values not recoverable. Repositories in chart order:]
- Utah Digital Newspapers Repository
- University of Nevada, Reno
- University of Utah
- Southern Utah University
- Brigham Young University
- Utah State University
- Utah State Archives
- Utah State University
- Utah Valley University
- Weber State University
- Health Education Assets Library
- University of Nevada, Las Vegas
- Utah State Library

MWDL Repositories Survey: % w/ Direct URL (October 2010)

[Bar chart, axis 0% to 100%; bar values not recoverable. Repositories in chart order:]
- University of Nevada, Reno
- Utah State University
- University of Utah
- Utah State University
- University of Nevada, Las Vegas
- Utah Valley University
- Brigham Young University
- Weber State University
- Health Education Assets Library
- Southern Utah University
- Utah State Library
- Utah State Archives
- Utah Digital Newspapers Repository

Literature Lessons

- Most are dated
- Most deal with general websites
- Few deal with digital collections in db’s
- Some suggest duplicating the content outside the database

Why does Google Scholar Matter ??

- “researchers find Google and Google Scholar to be amazingly effective” and accept the results as “good enough in many cases” (Kroll & Forsman 2010)
- “broader awareness of specialized Google tools such as Google Scholar and Google Book among faculty members and graduate students” (Rieger 2009)
- “the amount of qualified scholarly content has increased considerably in Google Scholar since it was launched in 2004” (Mikki 2009)
- 4% - 27% use increase in four-year U Miss study (Herrera 2010)

USpace IR Google Index Ratios baseline

Google Index Ratio

                    07/05/10
ETD 1               12%
ETD 2               0%
UScholar Works      23%
Board of Regents    4%

*Weighted Average Google Index Ratio = 18.33% (1,188/6,482)
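
The weighted average in the footnote comes from dividing total indexed URLs by total submitted URLs, not from averaging the four per-collection percentages. A minimal Python sketch of that arithmetic, using only the totals given on the slide (1,188 indexed of 6,482 submitted); the function name is illustrative and not from the presentation:

```python
def weighted_index_ratio(collections):
    """Return total indexed / total submitted as a percentage.

    `collections` is a list of (urls_indexed, urls_submitted) pairs,
    one per collection. Summing before dividing weights each
    collection by its size.
    """
    indexed = sum(i for i, _ in collections)
    submitted = sum(s for _, s in collections)
    if submitted == 0:
        raise ValueError("no URLs submitted")
    return 100.0 * indexed / submitted


# Only the totals appear on the slide, so they are passed as one pair.
baseline = weighted_index_ratio([(1188, 6482)])
print(f"{baseline:.2f}%")  # 18.33%
```

If per-collection counts were available, they could be passed as separate pairs and the result would be the same weighted figure.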
USpace IR Google Index Ratios baseline

Google Scholar Index Ratio: 0%

Low GS indexing ratios cut across institutions

Google Scholar Indexing Ratio for Selected Institutional and Disciplinary Repositories, October 2011

Baylor U - BearDocs                 89%
Digital Commons@UNLincoln           60%
Virginia Tech - CS Tech Reports     60%
Aquatic Commons                     56%
Cornell - arXiv                     47%
Cornell - Digital Commons@ILR       40%
IUPUI Scholar                       38%
BYU Scholars Archive                34%
Michigan - Deep Blue                34%
Univ of Oregon - Scholars Bank      29%
Harvard Univ - DASH                 28%
eCommons@Cornell                    18%
UW Madison - Minds@UW               17%
Texas A&M Repository                16%
IU Scholarworks                     13%
Columbia Univ - Academic            13%
D-Scholarship@Pitt                  12%
CaltechAuthors                      10%
Univ of Rochester Research          6%
UW - ResearchWorks Archive          3%

Survey Methodology Key Points

- Selected from OpenDOAR
  - Only IRs from the U.S.
    - “Pure” institutional or disciplinary repositories
  - Different software types
    - DSpace, Digital Commons, EPrints, IR+, CONTENTdm, DigiTool, arXiv
- Calculated total items in each repository
- Site operator search
  - Site:repositoryURL
  - Shows Approximation
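
The site operator search can be expressed as a query URL. The sketch below is a hedged illustration only: it assumes the public `scholar.google.com/scholar?q=` search address, the function name is hypothetical, and the approximate result count still has to be read from the results page by hand.

```python
from urllib.parse import urlencode


def scholar_site_query(repository_url):
    """Build a Google Scholar search URL for `site:<repository_url>`."""
    return "https://scholar.google.com/scholar?" + urlencode(
        {"q": f"site:{repository_url}"}
    )


# dash.harvard.edu is one of the repository URLs listed in the survey.
url = scholar_site_query("dash.harvard.edu")
print(url)
```

Dividing the reported result count by the repository's total item count gives the approximate indexing ratio used in the survey.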
  
GS “site” operator provides a close approximation for indexing ratio

Repository software does not appear to be the deciding factor

Repository Name                   Repository Software  Repository URL                     Repository items  Items in Google Scholar  Indexing Ratio
Boston College - eScholarship@BC  DigiTool             dcollections.bc.edu                1,635             1                        0%
UW - ResearchWorks Archive        DSpace               digital.lib.washington.edu/dspace  11,285            304                      3%
Univ of Rochester Research        IR+                  urresearch.rochester.edu           16,184            983                      6%
CaltechAuthors                    EPrints              authors.library.caltech.edu        22,000            2,290                    10%
D-Scholarship@Pitt                EPrints              d-scholarship.pitt.edu             5,888             686                      12%
Columbia Univ - Academic Commons  Digital Commons      academiccommons.columbia.edu       4,631             586                      13%
IU Scholarworks                   DSpace               scholarworks.iu.edu/dspace         7,782             1,030                    13%
Texas A&M Repository              DSpace               repository.tamu.edu                46,324            7,250                    16%
UW Madison - Minds@UW             DSpace               minds.wisconsin.edu                15,078            2,520                    17%
eCommons@Cornell                  DSpace               ecommons.library.cornell.edu       18,544            3,410                    18%
Harvard Univ - DASH               DSpace               dash.harvard.edu                   6,193             1,710                    28%
Univ of Oregon - Scholars Bank    DSpace               scholarsbank.uoregon.edu/xmlui     9,740             2,840                    29%
Michigan - Deep Blue              DSpace               deepblue.lib.umich.edu             66,038            22,200                   34%
BYU Scholars Archive              CONTENTdm            scholarsarchive.lib.byu.edu        7,421             2,520                    34%
IUPUI Scholar                     DSpace               scholarworks.iupui.edu             2,109             800                      38%
Cornell - Digital Commons@ILR     Digital Commons      digitalcommons.ilr.cornell.edu     14,669            5,880                    40%
Cornell - arXiv                   Other (arXiv)        arxiv.org                          706,906           330,000                  47%
Aquatic Commons                   EPrints              aquaticcommons.org                 5,722             3,230                    56%
Virginia Tech - CS Tech Reports   EPrints              eprints.cs.vt.edu                  983               586                      60%
Digital Commons@UNLincoln         Digital Commons      digitalcommons.unl.edu             50,657            30,200                   60%
Baylor U - BearDocs               DSpace               beardocs.baylor.edu                928               829                      89%
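
One way to check the claim that software is not the deciding factor is to recompute each ratio from the two count columns and look at the spread within each software platform. The following Python sketch does that with the counts transcribed from the slide's table; the grouping approach and all names are illustrative assumptions, not the presenters' method.

```python
from collections import defaultdict

# (repository, software, repository items, items in Google Scholar),
# transcribed from the slide's table.
ROWS = [
    ("Boston College - eScholarship@BC", "DigiTool", 1635, 1),
    ("UW - ResearchWorks Archive", "DSpace", 11285, 304),
    ("Univ of Rochester Research", "IR+", 16184, 983),
    ("CaltechAuthors", "EPrints", 22000, 2290),
    ("D-Scholarship@Pitt", "EPrints", 5888, 686),
    ("Columbia Univ - Academic Commons", "Digital Commons", 4631, 586),
    ("IU Scholarworks", "DSpace", 7782, 1030),
    ("Texas A&M Repository", "DSpace", 46324, 7250),
    ("UW Madison - Minds@UW", "DSpace", 15078, 2520),
    ("eCommons@Cornell", "DSpace", 18544, 3410),
    ("Harvard Univ - DASH", "DSpace", 6193, 1710),
    ("Univ of Oregon - Scholars Bank", "DSpace", 9740, 2840),
    ("Michigan - Deep Blue", "DSpace", 66038, 22200),
    ("BYU Scholars Archive", "CONTENTdm", 7421, 2520),
    ("IUPUI Scholar", "DSpace", 2109, 800),
    ("Cornell - Digital Commons@ILR", "Digital Commons", 14669, 5880),
    ("Cornell - arXiv", "Other (arXiv)", 706906, 330000),
    ("Aquatic Commons", "EPrints", 5722, 3230),
    ("Virginia Tech - CS Tech Reports", "EPrints", 983, 586),
    ("Digital Commons@UNLincoln", "Digital Commons", 50657, 30200),
    ("Baylor U - BearDocs", "DSpace", 928, 829),
]


def indexing_ratio(items_in_gs, repository_items):
    """Items found in Google Scholar as a whole-number percentage."""
    return round(100 * items_in_gs / repository_items)


# Collect the per-repository ratios under each software platform.
ratios_by_software = defaultdict(list)
for name, software, total, in_gs in ROWS:
    ratios_by_software[software].append(indexing_ratio(in_gs, total))

for software, ratios in sorted(ratios_by_software.items()):
    print(f"{software}: {min(ratios)}% to {max(ratios)}% "
          f"({len(ratios)} repositories)")
```

With these rows, the ten DSpace repositories alone span 3% to 89%, which is consistent with the slide's conclusion.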
  
Google Scholar wants the right metadata tags used consistently and accurately.

“Use Dublin Core tags (e.g., DC.title) as a last resort - they work poorly for journal papers...”
    - Google Scholar Inclusion Guidelines for Webmasters

“… there's a good chance that many of your papers aren't included at all, because documents with the same title are often considered duplicates.”
    - Google Scholar Inclusion Guidelines for Webmasters

“… incorrect identification of references could lead to exclusion of your papers from Google Scholar or to low ranking of your papers in the search results.”
    - Google Scholar Inclusion Guidelines for Webmasters

“…the most common cause of indexing problems is incorrect extraction of bibliographic data by the automated parser software.”
    - Google Scholar Inclusion Guidelines for Webmasters

Challenge is presenting bibliographic citations GS can identify, parse and digest


     Title                                     Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982
     University of Utah creator                Wolfinger, Nicholas H.
     Other Creator                             McKeever, Matthew
     Subject.Keyword                           Motherhood; Single Mothers; Income; Population surveys;
     Subject.LCSH                              Single mothers
                                               Income
     Description                               This study examines whether the changing social and economic characteristics of
                                               women who give birth out of wedlock have led to higher family incomes. Using Current
                                               Population Survey data collected between 1982 and 2002, we find that never-married
                                               mothers remain poor. They have made modest economic gains, but these have disproportionately
                                               occurred at the top of the income distribution. Yet there is no evidence of
                                               a burgeoning class of "Murphy Browns" middle-class professional women who give
                                               birth out of wedlock. Surprisingly, never-married mothers' incomes have stagnated in
                                               spite of impressive gains in education and other personal and vocational characteristics
                                               that should have resulted in greater economic progress than has been the case.
                                               These gains cast doubt on various stereotypes about women who give birth out of
                                               wedlock.
     Publisher                                 University of Utah
     Date.Original                             2006-07-26
     Type                                      Text
     Format.Extent                             370,155 Bytes
     Format.Medium                             application/pdf
     Resource Identifier                       ir-main,824
     Language                                  eng
     Series                                    Institute of Public and International Affairs Working Papers
     Relation                                  McKeever, M. & Wolfinger, N.H. (2006). Thanks for Nothing: Changes in Income and Labor Force Participation
                                               Never-Married Mothers since 1982. Institute of Public & International Affairs (IPIA), 4, 1-43.
     Rights Management                         (c) Matthew McKeever and Nicholas H. Wolfinger
     Research Institute                        Institute of Public and International Affairs (IPIA)
     Department                                Family & Consumer Studies
                                               Sociology
     School / College                          College of Social & Behavioral Science
     Contributing Institution                  University of Utah
     Publication Type                          working paper
First step was to begin aligning Highwire Press with existing Dublin Core fields

Google Scholar HTML speak
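
As a rough illustration of what the transformed metadata might look like in a page's HTML head, the Python sketch below turns a few fields of the record shown earlier into `citation_*` meta tags. The field-to-tag mapping is a hypothetical assumption (the slide's actual mapping is not reproduced in this transcript), the two creator fields are merged into a single `Creator` list for simplicity, and the date is rewritten with slashes to match the slash-separated dates in the tag tables later in the deck.

```python
from html import escape

# Hypothetical mapping from Dublin Core style field labels to
# citation_* tag names; not taken from the presentation.
FIELD_TO_TAG = {
    "Title": "citation_title",
    "Creator": "citation_author",
    "Date.Original": "citation_date",
    "Publisher": "citation_publisher",
}


def citation_meta_tags(record):
    """Return HTML <meta> lines for the mapped fields of `record`.

    List values (e.g. several creators) produce one tag per entry.
    """
    lines = []
    for field, tag in FIELD_TO_TAG.items():
        value = record.get(field)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            if tag == "citation_date":
                v = v.replace("-", "/")  # 2006-07-26 -> 2006/07/26
            content = escape(v, quote=True)
            lines.append(f'<meta name="{tag}" content="{content}">')
    return lines


record = {
    "Title": "Thanks for nothing: changes in income and labor force "
             "participation for never-married mothers since 1982",
    "Creator": ["Wolfinger, Nicholas H.", "McKeever, Matthew"],
    "Date.Original": "2006-07-26",
    "Publisher": "University of Utah",
}
tags = citation_meta_tags(record)
print("\n".join(tags))
```

Escaping the content values keeps titles containing quotes or ampersands from breaking the generated HTML.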
  
Google Scholar Pilot 1 tested importance of Metadata model

- 6,482 URLs in Sitemaps submitted via Google Webmaster Tools.
- Errors generated during Google crawls were analyzed and addressed.
- Updated & corrected metadata for 20 pilot articles
  - Ensured full-text PDF met GS inclusion guideline requirements.
  - Provided a “landing page” per GS inclusion guidelines, containing links to the 20 IR pilot papers, that was within a few clicks of the home page.
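
The Sitemap step above can be sketched in a few lines of Python. This is a minimal illustration using the standard sitemaps.org namespace; the item URLs are hypothetical placeholders, and the presentation does not describe how the Marriott Library actually generated its Sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical item URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://repository.example.edu/item/1",
    "https://repository.example.edu/item/2",
])
print(sitemap_xml)
```

The resulting file would then be submitted through the search engine's webmaster console, as the first bullet describes.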
  	
  
USpace IR Google Index Ratios increased

Google Index Ratio

                    07/05/10    11/19/10    10/16/11
ETD 1               12%         69%         97%
ETD 2               0%          68%         98%
UScholar Works      23%         51%         98%
Board of Regents    4%          47%         97%

*October 16, 2011 Weighted Average Google Index Ratio = 97.82% (10,306/10,536).

USpace IR Google Index Ratios increased

Google Scholar Index Ratio: 0%

GS Pilot 2 Utilized OCLC’s relationship with Google Scholar

- 19 Papers in GS Pilot 2
  - 6 of 7 GS paper types represented
  - 19 Full Text PDFs
- Augmented CONTENTdm v.6
  - Highwire Press Meta tags
  - Browse By Year
  - Recently Added
  - College & Department

Google Scholar Index Ratio: 62%

A Pre-Print Author Manuscript is not the Journal Article.

Meta tags compared: Pre-Print vs. Journal Article

1 - citation_author
    Pre-Print:       Maloney, Krisellen; Antelman, Kristin; Arlitsch, Kenning; Butler, John
    Journal Article: Maloney, Krisellen; Antelman, Kristin; Arlitsch, Kenning; Butler, John
2 - citation_date
    Pre-Print:       2009
    Journal Article: 2010
3 - citation_title
    Pre-Print:       Future leaders' views on organizational culture
    Journal Article: Future leaders' views on organizational culture
4 - citation_publisher
    Pre-Print:       N/A
    Journal Article: Association of College & Research Libraries
5 - citation_journal_title
    Pre-Print:       N/A
    Journal Article: College and Research Libraries
6 - citation_volume
    Journal Article: 71
7 - citation_issue
    Journal Article: 4
8 - citation_firstpage
    Pre-Print:       1
    Journal Article: 322
9 - citation_lastpage
    Pre-Print:       56
    Journal Article: 347
10 - citation_doi
11 - citation_issn
12 - citation_isbn
13 - citation_keywords
    Pre-Print:       Organizational culture
    Journal Article: Organizational culture
16 - citation_technical_report_institution
    Pre-Print:       USpace Institutional Repository, University of Utah
    Journal Article: N/A
17 - citation_technical_report_number
    Journal Article: N/A
18 - citation_language
    Pre-Print:       en
    Journal Article: en
21 - citation_pdf_url
    Pre-Print:       http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/10/filename/3.pdf
    Journal Article: http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/16/filename/17.pdf
22 - citation_abstract_html_url
    Pre-Print:       http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/10/rec/1
    Journal Article: http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/16/rec/2

Not Relevant
      14 - citation_dissertation_institution
      15 - citation_dissertation_name
      19 - citation_conference_title
      20 - citation_inbook_title
A	
  minor	
  nuance	
  is	
  the	
  difference	
  between	
  
  Books	
  and	
  Book	
  Chapters	
  
Meta Tag | Book Chapter | Book
1 - citation_author | Riloff, Ellen M. | Ram, Ashwin
2 - citation_date | 1999 | 1999
3 - citation_title | Information extraction as a stepping stone toward story understanding | Understanding Language: Understanding Computational Models of Reading
4 - citation_publisher | MIT Press | MIT Press
8 - citation_firstpage | 435 | 1
9 - citation_lastpage | 460 | 519
12 - citation_isbn | 0-262-18192-4 | 0-262-18192-4
13 - citation_keywords | Information extraction; Story understanding; | Information extraction; Story understanding;
18 - citation_language | en | en
20 - citation_inbook_title | Understanding Language: Understanding Computational Models of Reading | N/A
21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/9/filename/5.pdf
22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/9/rec/1

Not Relevant
5 - citation_journal_title
6 - citation_volume
7 - citation_issue
10 - citation_doi
11 - citation_issn
14 - citation_dissertation_institution
15 - citation_dissertation_name
16 - citation_technical_report_institution
17 - citation_technical_report_number
19 - citation_conference_title
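
To make the Book Chapter column concrete, here is a minimal Python sketch of how such tag/value pairs could be written into a page head as `<meta name="..." content="...">` elements. The function name `render_citation_meta` is hypothetical and is not part of CONTENTdm or of the authors' implementation; the values are copied from the Book Chapter column above.

```python
from html import escape


def render_citation_meta(tags):
    """Render a dict of Highwire Press style tags as HTML <meta> lines.

    A list value yields one <meta> element per entry (for example, one per author).
    """
    lines = []
    for name, value in tags.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            lines.append(
                '<meta name="{}" content="{}">'.format(
                    escape(name, quote=True), escape(str(item), quote=True)
                )
            )
    return "\n".join(lines)


# Values copied from the Book Chapter column of the slide.
book_chapter = {
    "citation_author": "Riloff, Ellen M.",
    "citation_date": "1999",
    "citation_title": "Information extraction as a stepping stone toward story understanding",
    "citation_publisher": "MIT Press",
    "citation_firstpage": "435",
    "citation_lastpage": "460",
    "citation_isbn": "0-262-18192-4",
    "citation_language": "en",
    "citation_inbook_title": "Understanding Language: Understanding Computational Models of Reading",
}

head_html = render_citation_meta(book_chapter)
print(head_html)
```

Escaping the attribute values matters because titles and keywords may contain quotes or ampersands, which would otherwise break the markup a parser has to read.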
ETDs use very different metadata tags

Meta Tag | PhD | Masters
1 - citation_author | Rague, Brian William | Wu, Shangduan
2 - citation_date | 2010/08 | 2010/07
3 - citation_title | A CS1 pedagogical approach to parallel thinking | Electronic structure and transport property of disordered graphene
8 - citation_firstpage | 1 | 1
9 - citation_lastpage | 234 | 84
13 - citation_keywords | Computer; CS1; Education; Parallel; Programming; | Disorder; Electronic structure; Graphene; Transport property; Electronic structure;
14 - citation_dissertation_institution | University of Utah, College of Engineering | University of Utah, College of Science
15 - citation_dissertation_name | PhD | MS
18 - citation_language | en | en
21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/5/filename/19.pdf | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/0/filename/4.pdf
22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/5/rec/1 | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/0/rec/1

Not Relevant
4 - citation_publisher
5 - citation_journal_title
6 - citation_volume
7 - citation_issue
10 - citation_doi
11 - citation_issn
12 - citation_isbn
16 - citation_technical_report_institution
17 - citation_technical_report_number
19 - citation_conference_title
20 - citation_inbook_title
Working papers have a unique combination of metadata tags.

Meta Tag | Working Paper
1 - citation_author | Wolfinger, Nicholas H.; McKeever, Matthew
2 - citation_date | 2006-07-26
3 - citation_title | Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982
6 - citation_volume | (blank)
7 - citation_issue | (blank)
8 - citation_firstpage | 1
9 - citation_lastpage | 43
10 - citation_doi | (blank)
13 - citation_keywords | Motherhood; Single Mothers; Income; Population surveys;
16 - citation_technical_report_institution | Institute of Public & International Affairs (IPIA), University of Utah
17 - citation_technical_report_number | 2006-07-04
18 - citation_language | en
19 - citation_conference_title | 101st American Sociological Association (ASA) Annual Meeting; 2006 Aug 11-14; Montreal, Canada
21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/7/filename/21.pdf
22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/7/rec/1

Not Relevant
4 - citation_publisher
5 - citation_journal_title
11 - citation_issn
12 - citation_isbn
14 - citation_dissertation_institution
15 - citation_dissertation_name
20 - citation_inbook_title
Conference Articles may or may not have published proceedings

Meta Tag | Conference Article
1 - citation_author | Balasubramonian, Rajeev; Awasthi, Manu; Sudan, Kshitij; Carter, John
2 - citation_date | 2009/02/14
3 - citation_title | Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches
4 - citation_publisher | Institute of Electrical and Electronics Engineers (IEEE)
5 - citation_journal_title | High Performance Computer Architecture, 2009. HPCA 2009. IEEE 15th International Symposium on
6 - citation_volume | (blank)
7 - citation_issue | (blank)
8 - citation_firstpage | 250
9 - citation_lastpage | 261
10 - citation_doi | 10.1109/HPCA.2009.4798260
11 - citation_issn | 1530-0897
12 - citation_isbn | 978-1-4244-2932-5
13 - citation_keywords | Page coloring; Shadow-memory addresses; Cache capacity allocation; Data/page migration
18 - citation_language | en
19 - citation_conference_title | 15th International Symposium on High Performance Computer Architecture (HPCA-15 2009) [14-18 Feb. 2009, Raleigh, NC, USA]
21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/1/filename/11.pdf
22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/ref/collection/uspace/id/1

Not Relevant
14 - citation_dissertation_institution
15 - citation_dissertation_name
16 - citation_technical_report_institution
17 - citation_technical_report_number
20 - citation_inbook_title
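
The "Not Relevant" lists on the four document-type slides can be read as a per-type tag profile. The sketch below, in Python, encodes them as sets of tag numbers and derives which tags apply to each type. The type keys (`book_or_chapter`, `etd`, and so on) and both function names are hypothetical labels chosen for illustration; only the tag names, numbers, and exclusions come from the slides.

```python
# Tag numbers and names as they are numbered on the slides.
ALL_TAGS = {
    1: "citation_author", 2: "citation_date", 3: "citation_title",
    4: "citation_publisher", 5: "citation_journal_title", 6: "citation_volume",
    7: "citation_issue", 8: "citation_firstpage", 9: "citation_lastpage",
    10: "citation_doi", 11: "citation_issn", 12: "citation_isbn",
    13: "citation_keywords", 14: "citation_dissertation_institution",
    15: "citation_dissertation_name", 16: "citation_technical_report_institution",
    17: "citation_technical_report_number", 18: "citation_language",
    19: "citation_conference_title", 20: "citation_inbook_title",
    21: "citation_pdf_url", 22: "citation_abstract_html_url",
}

# "Not Relevant" lists copied from the slides; the dictionary keys are assumed names.
NOT_RELEVANT = {
    "book_or_chapter": {5, 6, 7, 10, 11, 14, 15, 16, 17, 19},
    "etd": {4, 5, 6, 7, 10, 11, 12, 16, 17, 19, 20},
    "working_paper": {4, 5, 11, 12, 14, 15, 20},
    "conference_article": {14, 15, 16, 17, 20},
}


def relevant_tags(doc_type):
    """Return the tag names that apply to a document type, in slide order."""
    excluded = NOT_RELEVANT[doc_type]
    return [ALL_TAGS[n] for n in sorted(ALL_TAGS) if n not in excluded]


def unexpected_tags(doc_type, record):
    """Return tags present in a record that the slides mark as not relevant."""
    return sorted(set(record) - set(relevant_tags(doc_type)))


print(relevant_tags("etd"))
print(unexpected_tags("etd", {"citation_author": "Wu, Shangduan", "citation_doi": "x"}))
```

A check like `unexpected_tags` is one way to catch records where a tag from the wrong profile has been populated, which is the kind of inconsistency the deck warns about.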
Questions?

Kenning Arlitsch
kenning.arlitsch@utah.edu

Patrick OBrien
www.RevXcorp.com
Patrick.OBrien@utah.edu
805.509.2586
  


Improving Institutional Repository Search Engine Visibility in Google and Google Scholar

  • 1. Invisible Institutional Repositories: Addressing the Low Indexing Ratio of IRs in Google Scholar by Transforming Metadata Schema. Kenning Arlitsch & Patrick OBrien. October 31, 2011. 2011 Fall DLF, Baltimore, MD
  • 2. Today’s Objectives
      - Discuss Marriott Library SEO program
        - Program Priorities & Results
        - Issues & Opportunity
        - Google Scholar
  • 3. Marriott Library SEO program priorities
      - Digital repositories vs. general websites
        - Millions of objects in databases
        - Include IR
      - Priority 1 – Increase Reach
        - Get objects indexed in search engines
      - Priority 2 – Increase Visibility
        - Provide robust descriptive content
  • 4. Collection Google Index Ratios have increased across the board…
      [Bar chart: Google Index Ratio - All Collections*; series 07/05/10, 04/04/11, 10/16/11; axis 0% to 100%]
      Average: 12%, 51%, 74%
      High**: 37%, 87%, 100%
      * Google Index Ratio = URLs indexed by Google / URLs submitted, for about 150 collections containing ~170,00 URLs
      ** Highest index ratio achieved for Collections with over 500 URLs submitted to Google
  • 5. …increasing Google referrals by 200% and total visitors by 79%. (12 week year-over-year)
  • 6. However, Google Scholar Index Ratios?? Google Scholar Index Ratio: 0%. You can find Marriott IR papers in Google now, but cannot find them in Google Scholar. Why?
  • 7. Today’s Objectives
      - Discuss Marriott Library SEO program
        - Program Priorities & Results
        - Issues & Opportunity
        - Google Scholar
  • 8. College Students Begin Research - 2005
  • 9. College Students Begin Research - 2010. DeRosa, Cathy, et al. “Perceptions of Libraries, 2010: Context and Community: A Report to the OCLC Membership”, OCLC, 2010.
  • 10. Start with the 800 pound gorilla – Google.
  • 11. Marriott Library Management Experiences
      - Large digital collections built over a decade
        - 1.3+ million items
      - Why weren’t we getting indexed?
        - Harvesting/indexing rates as low as 8%
        - Non-existent IR showing in Google Scholar
      - Sitemaps generated for Google
  • 12. MWDL Repositories Survey: % w/ Indirect URL (October 2010). [Bar chart, 0% to 100%; bar values not recoverable.] Repositories in chart order: Utah Digital Newspapers Repository; University of Nevada, Reno; University of Utah; Southern Utah University; Brigham Young University; Utah State University; Utah State Archives; Utah State University; Utah Valley University; Weber State University; Health Education Assets Library; University of Nevada, Las Vegas; Utah State Library
  • 13. MWDL Repositories Survey: % w/ Direct URL (October 2010). [Bar chart, 0% to 100%; bar values not recoverable.] Repositories in chart order: University of Nevada, Reno; Utah State University; University of Utah; Utah State University; University of Nevada, Las Vegas; Utah Valley University; Brigham Young University; Weber State University; Health Education Assets Library; Southern Utah University; Utah State Library; Utah State Archives; Utah Digital Newspapers Repository
  • 14. Literature Lessons
      - Most are dated
      - Most deal with general websites
      - Few deal with digital collections in db’s
      - Some suggest duplicating the content outside the database
  • 15. Today’s Objectives
      - Discuss Marriott Library SEO program
        - Program Priorities & Results
        - Issues & Opportunity
        - Google Scholar
  • 16. Why does Google Scholar Matter??
      - “researchers find Google and Google Scholar to be amazingly effective” and accept the results as “good enough in many cases” (Kroll & Forsman 2010)
      - “broader awareness of specialized Google tools such as Google Scholar and Google Book among faculty members and graduate students” (Rieger 2009)
      - “the amount of qualified scholarly content has increased considerably in Google Scholar since it was launched in 2004” (Mikki 2009)
      - 4% - 27% use increase in four-year U Miss study (Herrera 2010)
  • 17. USpace IR Google Index Ratios baseline
      [Bar chart: Google Index Ratio; legend 07/05/10, 11/19/10, 10/16/11; axis 0% to 100%]
      ETD 1: 12%
      ETD 2: 0%
      UScholar Works: 23%
      Board of Regents: 4%
      * Weighted Average Google Index Ratio = 18.33% (1,188/6,482)
  • 18. USpace IR Google Index Ratios baseline
      [Same chart as slide 17, with an added callout: Google Scholar Index Ratio 0%]
      * Weighted Average Google Index Ratio = 18.33% (1,188/6,482)
  • 19. Low GS indexing ratios cut across institutions
      [Bar chart: Google Scholar Indexing Ratio for Selected Institutional and Disciplinary Repositories, October 2011; axis 0% to 100%]
      Baylor U - BearDocs: 89%
      Digital Commons@UNLincoln: 60%
      Virginia Tech - CS Tech Reports: 60%
      Aquatic Commons: 56%
      Cornell - arXiv: 47%
      Cornell - Digital Commons@ILR: 40%
      IUPUI Scholar: 38%
      BYU Scholars Archive: 34%
      Michigan - Deep Blue: 34%
      Univ of Oregon - Scholars Bank: 29%
      Harvard Univ - DASH: 28%
      eCommons@Cornell: 18%
      UW Madison - Minds@UW: 17%
      Texas A&M Repository: 16%
      IU Scholarworks: 13%
      Columbia Univ - Academic: 13%
      D-Scholarship@Pitt: 12%
      CaltechAuthors: 10%
      Univ of Rochester Research: 6%
      UW - ResearchWorks Archive: 3%
  • 20. Survey Methodology Key Points
      - Selected from OpenDOAR
        - Only IRs from the U.S.
          - “Pure” institutional or disciplinary repositories
        - Different software types
          - DSpace, Digital Commons, EPrints, IR+, CONTENTdm, DigiTool, arXiv
      - Calculated total items in each repository
      - Site operator search
        - Site:repositoryURL
        - Shows Approximation
  • 21. GS “site” operator provides a close approximation for indexing ratio
  • 22. Repository software does not appear to be the deciding factor
      Repository Name | Repository Software | Repository URL | Repository items | Items in Google Scholar | Indexing Ratio
      Boston College - eScholarship@BC | DigiTool | dcollections.bc.edu | 1,635 | 1 | 0%
      UW - ResearchWorks Archive | Dspace | digital.lib.washington.edu/dspace | 11,285 | 304 | 3%
      Univ of Rochester Research | IR+ | urresearch.rochester.edu | 16,184 | 983 | 6%
      CaltechAuthors | Eprints | authors.library.caltech.edu | 22,000 | 2,290 | 10%
      D-Scholarship@Pitt | Eprints | d-scholarship.pitt.edu | 5,888 | 686 | 12%
      Columbia Univ - Academic Commons | Digital Commons | academiccommons.columbia.edu | 4,631 | 586 | 13%
      IU Scholarworks | Dspace | scholarworks.iu.edu/dspace | 7,782 | 1,030 | 13%
      Texas A&M Repository | Dspace | repository.tamu.edu | 46,324 | 7,250 | 16%
      UW Madison - Minds@UW | Dspace | minds.wisconsin.edu | 15,078 | 2,520 | 17%
      eCommons@Cornell | Dspace | ecommons.library.cornell.edu | 18,544 | 3,410 | 18%
      Harvard Univ - DASH | Dspace | dash.harvard.edu | 6,193 | 1,710 | 28%
      Univ of Oregon - Scholars Bank | Dspace | scholarsbank.uoregon.edu/xmlui | 9,740 | 2,840 | 29%
      Michigan - Deep Blue | Dspace | deepblue.lib.umich.edu | 66,038 | 22,200 | 34%
      BYU Scholars Archive | CONTENTdm | scholarsarchive.lib.byu.edu | 7,421 | 2,520 | 34%
      IUPUI Scholar | Dspace | scholarworks.iupui.edu | 2,109 | 800 | 38%
      Cornell - Digital Commons@ILR | Digital Commons | digitalcommons.ilr.cornell.edu | 14,669 | 5,880 | 40%
      Cornell - arXiv | Other (arXiv) | arxiv.org | 706,906 | 330,000 | 47%
      Aquatic Commons | Eprints | aquaticcommons.org | 5,722 | 3,230 | 56%
      Virginia Tech - CS Tech Reports | Eprints | eprints.cs.vt.edu | 983 | 586 | 60%
      Digital Commons@UNLincoln | Digital Commons | digitalcommons.unl.edu | 50,657 | 30,200 | 60%
      Baylor U - BearDocs | Dspace | beardocs.baylor.edu | 928 | 829 | 89%
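
The indexing ratios on slides 17 to 22 are simple quotients: items found (in Google, or in Google Scholar via the `site:` operator) divided by items submitted or held. A small Python sketch of that arithmetic follows, using figures copied from the slides; the function name `indexing_ratio` is an assumption made for illustration.

```python
def indexing_ratio(indexed, total):
    """Return indexed / total as a fraction between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return indexed / total


# (items in Google Scholar, repository items), copied from the slide 22 table.
samples = {
    "Boston College - eScholarship@BC": (1, 1635),
    "UW - ResearchWorks Archive": (304, 11285),
    "Michigan - Deep Blue": (22200, 66038),
    "Baylor U - BearDocs": (829, 928),
}

for name, (indexed, total) in samples.items():
    print(f"{name}: {indexing_ratio(indexed, total):.0%}")

# USpace baseline from slide 17: 1,188 of 6,482 submitted URLs indexed by Google.
uspace_baseline = indexing_ratio(1188, 6482)
print(f"USpace baseline: {uspace_baseline:.2%}")
```

Because the Google Scholar counts come from a `site:` search, which the deck itself describes as an approximation, the resulting percentages are best read as estimates rather than exact coverage figures.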
  • 23. Google Scholar wants the right metadata tags used consistently and accurately.
      - “Use Dublin Core tags (e.g., DC.title) as a last resort - they work poorly for journal papers...” (Google Scholar Inclusion Guidelines for Webmasters)
      - “… there's a good chance that many of your papers aren't included at all, because documents with the same title are often considered duplicates.” (Google Scholar Inclusion Guidelines for Webmasters)
      - “… incorrect identification of references could lead to exclusion of your papers from Google Scholar or to low ranking of your papers in the search results.” (Google Scholar Inclusion Guidelines for Webmasters)
      - “… the most common cause of indexing problems is incorrect extraction of bibliographic data by the automated parser software.” (Google Scholar Inclusion Guidelines for Webmasters)
  • 24. Challenge is presenting bibliographic citations GS can identify, parse and digest
      Example USpace record (10/31/11): Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982
      Title: Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982
      University of Utah creator: Wolfinger, Nicholas H.
      Other Creator: McKeever, Matthew
      Subject.Keyword: Motherhood; Single Mothers; Income; Population surveys;
      Subject.LCSH: Single mothers Income
      Description: This study examines whether the changing social and economic characteristics of women who give birth out of wedlock have led to higher family incomes. Using Current Population Survey data collected between 1982 and 2002, we find that never-married mothers remain poor. They have made modest economic gains, but these have disproportionately occurred at the top of the income distribution. Yet there is no evidence of a burgeoning class of "Murphy Browns" middle-class professional women who give birth out of wedlock. Surprisingly, never-married mothers' incomes have stagnated in spite of impressive gains in education and other personal and vocational characteristics that should have resulted in greater economic progress than has been the case. These gains cast doubt on various stereotypes about women who give birth out of wedlock.
      Publisher: University of Utah
      Date.Original: 2006-07-26
      Type: Text
      Format.Extent: 370,155 Bytes
      Format.Medium: application/pdf
      Resource Identifier: ir-main,824
      Language: eng
      Series: Institute of Public and International Affairs Working Papers
      Relation: McKeever, M. & Wolfinger, N.H. (2006). Thanks for Nothing: Changes in Income and Labor Force Participation Never-Married Mothers since 1982. Institute of Public & International Affairs (IPIA), 4, 1-43.
      Rights Management: (c) Matthew McKeever and Nicholas H. Wolfinger
      Research Institute: Institute of Public and International Affairs (IPIA)
      Department: Family & Consumer Studies Sociology
      School / College: College of Social & Behavioral Science
      Contributing Institution: University of Utah
      Publication Type: working paper
  • 25. First step was to begin aligning Highwire Press meta tags with existing Dublin Core fields
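
The deck does not show the actual crosswalk used, so the following Python sketch is only an assumed illustration of what aligning the two schemas could look like. The field labels come from the slide 24 record and the target values match the working paper slide; the mapping table `CROSSWALK`, the language-code table, and the function `to_highwire` are hypothetical.

```python
# Hypothetical crosswalk: repository field label -> Highwire Press tag name.
CROSSWALK = {
    "Title": "citation_title",
    "Date.Original": "citation_date",
    "Subject.Keyword": "citation_keywords",
    "Language": "citation_language",
}

# Creator fields are merged into one author value, as on the working paper slide.
AUTHOR_FIELDS = ["University of Utah creator", "Other Creator"]

# Assumed conversion from the record's three-letter code to the two-letter code.
LANGUAGE_CODES = {"eng": "en"}


def to_highwire(record):
    """Map a Dublin Core style record (dict) to Highwire Press style tags."""
    tags = {}
    authors = [record[field] for field in AUTHOR_FIELDS if record.get(field)]
    if authors:
        tags["citation_author"] = "; ".join(authors)
    for field, tag in CROSSWALK.items():
        value = record.get(field)
        if not value:
            continue  # skip fields the record does not populate
        if tag == "citation_language":
            value = LANGUAGE_CODES.get(value, value)
        tags[tag] = value
    return tags


# Field values copied from the slide 24 record.
record = {
    "Title": "Thanks for nothing: changes in income and labor force "
             "participation for never-married mothers since 1982",
    "University of Utah creator": "Wolfinger, Nicholas H.",
    "Other Creator": "McKeever, Matthew",
    "Subject.Keyword": "Motherhood; Single Mothers; Income; Population surveys;",
    "Date.Original": "2006-07-26",
    "Language": "eng",
    "Publisher": "University of Utah",
}

tags = to_highwire(record)
for name, value in tags.items():
    print(name, "=", value)
```

Note that `Publisher` is deliberately left unmapped here, since the working paper slide lists `citation_publisher` as not relevant for that document type.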
  • 27. Google Scholar Pilot 1 tested importance of Metadata model
      - 6,482 URLs in Sitemaps submitted via Google Webmaster Tools.
      - Errors generated during Google crawls were analyzed and addressed.
      - Updated & corrected metadata for 20 pilot articles
        - Ensured full-text PDF met GS inclusion guideline requirements.
        - Provided a “landing page” per GS inclusion guidelines, containing links to the 20 IR pilot papers that was within a few clicks of the home page.
  • 28. USpace IR Google Index Ratios increased
      Google Index Ratio (bar chart, axis 0% to 100%)
      Collection | 07/05/10 | 11/19/10 | 10/16/11
      ETD 1 | 12% | 69% | 97%
      ETD 2 | 0% | 68% | 98%
      UScholar Works | 23% | 51% | 98%
      Board of Regents | 4% | 47% | 97%
      * October 16, 2011 Weighted Average Google Index Ratio = 97.82% (10,306/10,536).
  • 29. USpace IR Google Index Ratios increased
      Google Index Ratio (bar chart, axis 0% to 100%)
      Collection | 07/05/10 | 11/19/10 | 10/16/11
      ETD 1 | 12% | 69% | 97%
      ETD 2 | 0% | 68% | 98%
      UScholar Works | 23% | 51% | 98%
      Board of Regents | 4% | 47% | 97%
      Callout: Google Scholar Index Ratio 0%
      * October 16, 2011 Weighted Average Google Index Ratio = 97.82% (10,306/10,536).
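The footnote's weighted average is the total number of indexed URLs divided by the total number submitted, rather than the mean of the per-collection percentages. A short Python sketch of that arithmetic follows. Only the totals 10,306 and 10,536 come from the slide; the two-collection example uses made-up counts to show why the two averages differ, and the function names are hypothetical.

```python
def weighted_index_ratio(collections):
    """collections: list of (indexed, submitted) pairs; returns total indexed / total submitted."""
    total_indexed = sum(indexed for indexed, _ in collections)
    total_submitted = sum(submitted for _, submitted in collections)
    return total_indexed / total_submitted

def simple_mean_ratio(collections):
    """Unweighted mean of the per-collection ratios, for comparison."""
    return sum(indexed / submitted for indexed, submitted in collections) / len(collections)

# Totals from the slide footnote (per-collection counts are not given there).
overall = weighted_index_ratio([(10_306, 10_536)])
print(f"{overall:.2%}")  # 97.82%

# Hypothetical counts: a small and a large collection.
example = [(90, 100), (990, 1000)]
print(f"simple mean: {simple_mean_ratio(example):.2%}")       # 94.50%
print(f"weighted:    {weighted_index_ratio(example):.2%}")    # 98.18%
```

Weighting by collection size keeps a small, poorly indexed collection from dominating the overall figure.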
  • 30. GS Pilot 2 utilized OCLC’s relationship with Google Scholar
      - 19 Papers in GS Pilot 2
          - 6 of 7 GS paper types represented
          - 19 Full Text PDFs
      - Augmented CONTENTdm v.6
          - Highwire Press Meta tags
          - Browse By Year
          - Recently Added
          - College & Department
      Callout: Google Scholar Index Ratio 62%
  • 31. A Pre-Print Author Manuscript is not the Journal Article.
      Meta Tag | Pre-Print | Journal Article
      1 - citation_author | Maloney, Krisellen; Antelman, Kristin; Arlitsch, Kenning; Butler, John | Maloney, Krisellen; Antelman, Kristin; Arlitsch, Kenning; Butler, John
      2 - citation_date | 2009 | 2010
      3 - citation_title | Future leaders' views on organizational culture | Future leaders' views on organizational culture
      4 - citation_publisher | N/A | Association of College & Research Libraries
      5 - citation_journal_title | N/A | College and Research Libraries
      6 - citation_volume | | 71
      7 - citation_issue | | 4
      8 - citation_firstpage | 1 | 322
      9 - citation_lastpage | 56 | 347
      10 - citation_doi | |
      11 - citation_issn | |
      12 - citation_isbn | |
      13 - citation_keywords | Organizational culture | Organizational culture
      16 - citation_technical_report_institution | Uspace Institutional Repository, University of Utah | N/A
      17 - citation_technical_report_number | N/A |
      18 - citation_language | en | en
      21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/10/filename/3.pdf | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/16/filename/17.pdf
      22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/10/rec/1 | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/16/rec/2
      Not Relevant: 14 - citation_dissertation_institution; 15 - citation_dissertation_name; 19 - citation_conference_title; 20 - citation_inbook_title
  • 32. A minor nuance is the difference between Books and Book Chapters
      Meta Tag | Book Chapter | Book
      1 - citation_author | Riloff, Ellen M. | Ram, Ashwin
      2 - citation_date | 1999 | 1999
      3 - citation_title | Information extraction as a stepping stone toward story understanding | Understanding Language: Understanding Computational Models of Reading
      4 - citation_publisher | MIT Press | MIT Press
      8 - citation_firstpage | 435 | 1
      9 - citation_lastpage | 460 | 519
      12 - citation_isbn | 0-262-18192-4 | 0-262-18192-4
      13 - citation_keywords | Information extraction; Story understanding; | Information extraction; Story understanding;
      18 - citation_language | en | en
      20 - citation_inbook_title | Understanding Language: Understanding Computational Models of Reading | N/A
      21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/9/filename/5.pdf |
      22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/9/rec/1 |
      Not Relevant: 5 - citation_journal_title; 6 - citation_volume; 7 - citation_issue; 10 - citation_doi; 11 - citation_issn; 14 - citation_dissertation_institution; 15 - citation_dissertation_name; 16 - citation_technical_report_institution; 17 - citation_technical_report_number; 19 - citation_conference_title
  • 33. ETDs use very different metadata tags
      Meta Tag | PhD | Masters
      1 - citation_author | Rague, Brian William | Wu, Shangduan
      2 - citation_date | 2010/08 | 2010/07
      3 - citation_title | A CS1 pedagogical approach to parallel thinking | Electronic structure and transport property of disordered graphene
      8 - citation_firstpage | 1 | 1
      9 - citation_lastpage | 234 | 84
      13 - citation_keywords | Computer; CS1; Education; Parallel; Programming; | Disorder; Electronic structure; Graphene; Transport property; Electronic structure;
      14 - citation_dissertation_institution | University of Utah, College of Engineering | University of Utah, College of Science
      15 - citation_dissertation_name | PhD | MS
      18 - citation_language | en | en
      21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/5/filename/19.pdf | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/0/filename/4.pdf
      22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/5/rec/1 | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/0/rec/1
      Not Relevant: 4 - citation_publisher; 5 - citation_journal_title; 6 - citation_volume; 7 - citation_issue; 10 - citation_doi; 11 - citation_issn; 12 - citation_isbn; 16 - citation_technical_report_institution; 17 - citation_technical_report_number; 19 - citation_conference_title; 20 - citation_inbook_title
  • 34. Working papers have a unique combination of metadata tags.
      Meta Tag | Working Paper
      1 - citation_author | Wolfinger, Nicholas H.; McKeever, Matthew
      2 - citation_date | 2006-07-26
      3 - citation_title | Thanks for nothing: changes in income and labor force participation for never-married mothers since 1982
      6 - citation_volume |
      7 - citation_issue |
      8 - citation_firstpage | 1
      9 - citation_lastpage | 43
      10 - citation_doi |
      13 - citation_keywords | Motherhood; Single Mothers; Income; Population surveys;
      16 - citation_technical_report_institution | Institute of Public & International Affairs (IPIA), University of Utah
      17 - citation_technical_report_number | 2006-07-04
      18 - citation_language | en
      19 - citation_conference_title | 101st American Sociological Association (ASA) Annual Meeting; 2006 Aug 11-14; Montreal, Canada
      21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/7/filename/21.pdf
      22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/singleitem/collection/uspace/id/7/rec/1
      Not Relevant: 4 - citation_publisher; 5 - citation_journal_title; 11 - citation_issn; 12 - citation_isbn; 14 - citation_dissertation_institution; 15 - citation_dissertation_name; 20 - citation_inbook_title
  • 35. Conference Articles may or may not have published proceedings
      Meta Tag | Conference Article
      1 - citation_author | Balasubramonian, Rajeev; Awasthi, Manu; Sudan, Kshitij; Carter, John
      2 - citation_date | 2009/02/14
      3 - citation_title | Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches
      4 - citation_publisher | Institute of Electrical and Electronics Engineers (IEEE)
      5 - citation_journal_title | High Performance Computer Architecture, 2009. HPCA 2009. IEEE 15th International Symposium on
      6 - citation_volume |
      7 - citation_issue |
      8 - citation_firstpage | 250
      9 - citation_lastpage | 261
      10 - citation_doi | 10.1109/HPCA.2009.4798260
      11 - citation_issn | 1530-0897
      12 - citation_isbn | 978-1-4244-2932-5
      13 - citation_keywords | Page coloring; Shadow-memory addresses; Cache capacity allocation; Data/page migration
      18 - citation_language | en
      19 - citation_conference_title | 15th International Symposium on High Performance Computer Architecture (HPCA-15 2009) [14-18 Feb. 2009, Raleigh, NC, USA]
      21 - citation_pdf_url | http://cdm6gs.lib.utah.edu/utils/getfile/collection/uspace/id/1/filename/11.pdf
      22 - citation_abstract_html_url | http://cdm6gs.lib.utah.edu/cdm/ref/collection/uspace/id/1
      Not Relevant: 14 - citation_dissertation_institution; 15 - citation_dissertation_name; 16 - citation_technical_report_institution; 17 - citation_technical_report_number; 20 - citation_inbook_title
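The "Not Relevant" lists on slides 31 through 35 amount to a per-type rule set, which could be checked mechanically before meta tags are published. The Python sketch below transcribes those lists and flags populated tags that a given publication type should not carry. The dictionary keys, the `misplaced_tags` helper, and the sample thesis record with its stray publisher value are hypothetical and only illustrate the idea.

```python
# Tags marked "Not Relevant" per publication type, transcribed from slides 31-35.
# The type keys and the helper below are hypothetical names for illustration.
NOT_RELEVANT = {
    "journal_article": {
        "citation_dissertation_institution", "citation_dissertation_name",
        "citation_conference_title", "citation_inbook_title",
    },
    "book_chapter": {
        "citation_journal_title", "citation_volume", "citation_issue",
        "citation_doi", "citation_issn",
        "citation_dissertation_institution", "citation_dissertation_name",
        "citation_technical_report_institution",
        "citation_technical_report_number", "citation_conference_title",
    },
    "etd": {
        "citation_publisher", "citation_journal_title", "citation_volume",
        "citation_issue", "citation_doi", "citation_issn", "citation_isbn",
        "citation_technical_report_institution",
        "citation_technical_report_number", "citation_conference_title",
        "citation_inbook_title",
    },
    "working_paper": {
        "citation_publisher", "citation_journal_title", "citation_issn",
        "citation_isbn", "citation_dissertation_institution",
        "citation_dissertation_name", "citation_inbook_title",
    },
    "conference_article": {
        "citation_dissertation_institution", "citation_dissertation_name",
        "citation_technical_report_institution",
        "citation_technical_report_number", "citation_inbook_title",
    },
}

def misplaced_tags(pub_type, metadata):
    """Return populated tags that the slides mark Not Relevant for pub_type."""
    populated = {tag for tag, value in metadata.items() if value}
    return sorted(populated & NOT_RELEVANT[pub_type])

# Sample thesis record; the publisher value is a made-up stray entry.
thesis = {
    "citation_author": "Rague, Brian William",
    "citation_dissertation_name": "PhD",
    "citation_publisher": "University of Utah",
    "citation_volume": "",  # empty values are ignored
}
print(misplaced_tags("etd", thesis))  # ['citation_publisher']
```

Keeping the rules in one table makes it easy to adjust them per type without touching the checking logic.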
  • 36. Questions?
      Kenning Arlitsch, kenning.arlitsch@utah.edu
      Patrick OBrien, www.RevXcorp.com, Patrick.OBrien@utah.edu, 805.509.2586