Digital Libraries: History, Technology, R&D
Edward A. Fox
Professor, Computer Science, Virginia Tech
Blacksburg, VA 24061 USA
fox@vt.edu
http://fox.cs.vt.edu
6 Jan. 2014
  
Outline
  Acknowledgments
  Introduction
  History
  Technology
  Research
  Development
  Summary and Discussion
  
Sponsored by Qatar University & Qatar National Library
  HTTP://qnl.qa
  HTTP://WWW.QU.EDU.QA/

Funding provided through the ELISQ project:
Electronic Library Institute - SeerQ
  HTTP://WWW.VT.EDU/
  HTTP://WWW.PSU.EDU/
  HTTP://WWW.TAMU.EDU/
  
ELISQ Project Team

Qatar University, Qatar:
Mohammed Samaka (Ph.D., Co-Lead PI)
Sumaya Ali S A Al-Maadeed (Ph.D., PI)
Myrna Tabet
Asad Nafees
Tahseena Moideen

Qatar National Library, Qatar:
Claudia Lux (PI)
Krishna Roy Chowdhury
Postdoc - TBA

Virginia Tech, USA:
Edward Fox (Ph.D., Lead-PI)
Tarek Kanan

Penn. State University, USA:
C. Lee Giles (Ph.D., PI)
Sagnik Ray Choudhury

Texas A&M, USA:
Richard Furuta (Ph.D., PI)
Hamed Alhoori

Consultants:
John Impagliazzo (Ph.D., Key Investigator)
Susan Lukesh (Ph.D.)
Carole Thompson

This project was made possible by NPRP Grant # 4-029-1-007 from the Qatar National Research Fund (a member of Qatar Foundation).
  
Acknowledgements
  Dr. Mazen Hasna, VP and Chief Academic Officer, Qatar University
  Dr. Rashid Alammari, Dean, College of Engineering, Qatar University
  Dr. Moumen Hasnah, Director of Academic Research, Qatar University
  Dr. Claudia Lux, Qatar National Library Director
  Dr. Imad Bachir, Qatar University Library Director
  Dr. Munir Tag, Acting Director Technical, ICT Program Manager (QNRF)
  Ms. Krishna Roy Chowdhury, Associate Director for Library IT, Qatar National Library
  Prof. Sebti Foufou, Head of Department of Computer Science and Engineering, Qatar University
  	
  
Additional Thanks
Qscience - providing collection:
Christopher J. Leonard, Editorial Director
Paul Coyne, CTO
US National Science Foundation (recent and current grants to Fox):
  IIS-1319578
  IIS-0916733
  DUE-0840719
  OCI-1032677
  plus those to PSU, TAMU
  
Introduction
  Reasons to be here
  o  Interested
  o  Find what to do with your content
  o  Find how to help your user community
  http://www.morganclaypool.com/toc/icr/1/1
  1. DL Introduction, 5S framework (2012)
  2. DL Quality, Integration (2013)
  3. DL Technologies (in press)
  4. DL Applications (in press)
  
DLs Shorten the Chain to
[Diagram labels: Author, Teacher, Reader, Editor, Reviewer, Learner, Digital Library, Librarian]
  
Digital Library Content
Content types, with examples:
  Text documents: articles, reports, books
  Video, audio: speech, music
  Geographic information: (aerial) photos
  Software, programs: models, simulations
  Bio information: genome (human, animal, plant)
  Images and graphics: 2D, 3D, VR, CAT
  
Content Based Information Retrieval
  
Digital Library Reference Model 1.0, p. 30 of 234
  
Informal 5S DL Definitions
DLs are complex systems that:
  help satisfy info needs of users (societies)
  provide info services (scenarios)
  organize info in usable ways (structures)
  present info in usable ways (spaces)
  communicate info with users (streams)
  
Information Life Cycle
Authoring
Modifying
Using
Creating

Organizing
Indexing

Retention
/ Mining

Storing
Retrieving

Accessing
Filtering

Distributing
Networking
  
Infrastructure Services
  Repository-Building
  o  Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
  o  Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
  Add Value
  o  Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
  o  Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
  
SeerSuite is Not Google
  Metadata (as in library catalogs) as well as content
  Sets of collections, rather than the Web as a whole
  o  Provided by a curator (e.g., publisher, museum)
  o  Provided by user submissions
  o  Or collected by focused ‘crawling’
  Tailored services, rather than the same for everyone
  o  Browsing using categories, preserving, adding value
  o  Based on studying user requirements, e.g., chemists
  Working with entities, rather than just words
  o  Citations, tables, figures, names, chemical formulae
  o  Using knowledge bases, machine learning, artificial intelligence
  
History Overview
  Since 1991, esp. from Information Retrieval
  Connecting computer, library, and information science communities
  NSF DL Initiative 1 in 1994 included funding for Stanford, where Google was prototyped
  International conferences in the Americas (JCDL), as well as Europe (TPDL, by DELOS), Asia (ICADL)
  Publishers: ACM, …
  DOIs, (Institutional) Repositories
  Spinoffs: content & courseware management systems
  Recently including (linked) data
  
www.nsdl.org
  
Institutional Repositories
  “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.”
  Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA
  www.arl.org/sparc/IR/IR_Guide_v1.pdf
  
NDLTD: www.ndltd.org
  Networked Digital Library of Theses and Dissertations (NDLTD)
  Vision: Every thesis and dissertation in the world is:
  o  Devised to take advantage of the most helpful electronic publishing methods
  o  Shared globally and easily found
  o  Supported by a suite of digital library services to aid authors, researchers, learners, universities
  o  Preserved and migrated permanently
  
Crisis, Tragedy, and Recovery (CTR) Network /
Integrated Digital Event Archive & Library (IDEAL)
  Human tragedies that result from man-made and natural events affect humans and communities significantly.
  During and after a tragic event, there is a series of needs that have to be addressed.
  o  Compounded by communication failures and a confusing plethora of data and information

CTRnet (Crisis, Tragedy & Recovery Net)
  Disaster Locations

CTRnet (Crisis, Tragedy & Recovery Net)
  Word Clouds of Japan Earthquake and Libya Revolution (using tweets)
  Japan Earthquake, Tsunami Disaster
  Updated every 10 minutes
  Libya Revolution
  
CTR stakeholders

CINET: Network Science Middleware
  Netviz: Course project aims to develop a visualization component for CINET, which contains large network graphs. The visualization service will get networks from CINET, convert them from Galib to GEXF format, then visualize the graphs using Gephi.
  CINET network displayed using Gephi
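The conversion step in the Netviz description can be sketched as follows. This is a minimal illustration, not the course project's code: the slides do not describe the Galib file format, so the input is assumed to be a plain edge list with one "source target" pair per line, and the function name edges_to_gexf is hypothetical. The output follows the GEXF 1.2 layout (a graph element holding nodes and edges), which Gephi can open.

```python
import xml.etree.ElementTree as ET


def edges_to_gexf(edge_lines):
    """Build a GEXF 1.2 document (as a string) from 'source target' lines.

    Assumption: each non-blank line holds exactly two whitespace-separated
    node ids. The real Galib format may differ.
    """
    edges = [tuple(line.split()) for line in edge_lines if line.strip()]
    nodes = sorted({node for edge in edges for node in edge})

    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")

    nodes_el = ET.SubElement(graph, "nodes")
    for node in nodes:
        ET.SubElement(nodes_el, "node", id=node, label=node)

    edges_el = ET.SubElement(graph, "edges")
    for i, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i), source=source, target=target)

    return ET.tostring(gexf, encoding="unicode")


# A three-node triangle as a toy network.
gexf_text = edges_to_gexf(["a b", "b c", "a c"])
print(gexf_text)
```

Writing the returned string to a file with a .gexf extension gives something Gephi can load. Large graphs would call for streaming output rather than building the whole tree in memory.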
  
Web Archiving
  Introduction: Web archiving is the process of gathering up data recorded on the World Wide Web,
  o  storing it,
  o  ensuring the data is preserved in an archive, and
  o  making the collected data available for future research.
  The Internet Archive and several national libraries initiated Web archiving practices in 1996.
  
Crawler (Heritrix)
(for search engines & Web archives)
  A Web crawler starts with a list of URLs to visit, called the seeds.
  o  On those pages, it identifies all the hyperlinks,
  o  adds them to the list of URLs to visit, and
  o  recursively visits the pages pointed to,
  o  according to a set of policies.
  Prioritizes its downloads - some pages change often.
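The crawl loop described above can be sketched in a few lines. This is a generic illustration using only the Python standard library, not Heritrix's implementation. The fetch function, the host allow-list, and the page budget are assumptions standing in for the "set of policies", and the pages live in an in-memory dictionary so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, allowed_hosts, max_pages=100):
    """Breadth-first crawl from the seeds; returns URLs in visit order.

    fetch(url) returns an HTML string or None (hypothetical, caller-supplied).
    Policies in this sketch: a host allow-list and a page budget.
    """
    frontier = deque(seeds)      # the list of URLs still to visit
    seen = set(seeds)            # avoids queueing the same URL twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if urlparse(link).hostname in allowed_hosts and link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Tiny in-memory "Web" used in place of real HTTP requests.
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="http://example.com/x">X</a>',
    "http://example.org/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.org/b": "no links here",
}

order = crawl(["http://example.org/"], PAGES.get, {"example.org"})
print(order)
```

The frontier here is a plain first-in, first-out queue. Prioritizing downloads, for instance revisiting pages that change often, would mean replacing it with a priority queue keyed on a freshness or importance score.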
  
Focused Crawlers
  For a particular topic or event
  o  to build a Web collection focused in that area
  Start with URLs of interest, viewed as seeds to grow from
  Expand in a ‘smart’ way to get all and only what is relevant
  Use information retrieval / artificial intelligence / machine learning
  o  Require ‘knowledge bases’ and/or human training examples
  Nevertheless, there is a tradeoff between the resulting
  o  Recall (i.e., coverage of what is out there)
  o  Precision (i.e., freedom from noise in what is collected)
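The recall/precision tradeoff can be made concrete with a small calculation. The sketch below uses the standard set-based definitions; the page identifiers and the two crawls are made-up illustrative data, not results from any real crawl.

```python
def precision_recall(collected, relevant):
    """Return (precision, recall) for a crawl, given collections of page ids."""
    collected, relevant = set(collected), set(relevant)
    hits = len(collected & relevant)
    precision = hits / len(collected) if collected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: pages a human judged relevant, and two crawl strategies.
relevant = {"p1", "p2", "p3", "p4"}
narrow_crawl = {"p1", "p2"}                                      # cautious expansion
broad_crawl = {"p1", "p2", "p3", "p4", "n1", "n2", "n3", "n4"}   # aggressive expansion

narrow = precision_recall(narrow_crawl, relevant)
broad = precision_recall(broad_crawl, relevant)
print(narrow)  # (1.0, 0.5)
print(broad)   # (0.5, 1.0)
```

The cautious crawl collects no noise but misses half of what is out there, while the aggressive crawl finds everything at the cost of collecting as much noise as relevant material.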
  
SeerSuite Instantiations
  CiteSeerX

  http://citeseerx.ist.psu.edu
  A scientific literature digital library and search engine

  ChemXSeer

  http://chemxseer.ist.psu.edu
  Portal for researchers in environmental chemistry
integrating the scientific literature with experimental,
analytical, and simulation results and tools

  ArchSeer

  http://archseer.ist.psu.edu/
  Archeology literature

  TableSeer
  ANY fields with tables

  
CiteSeerX
http://citeseerx.ist.psu.edu
  CiteSeerX crawls researcher homepages on the web for scholarly papers, formerly in computer science
  Converts PDF to text
  Automatically extracts OAI metadata and other data
  Automatic citation indexing, links to cited documents, creation of document page, author disambiguation
  Software open source - can be used to build other such tools

  3 M documents
  Millions of files
  60 M citations
  3 to 6 M authors
  2 to 4 M hits per day
  100K documents added monthly
  800K individual users
  several Tbytes
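At its core, automatic citation indexing means inverting the reference lists of papers so that each document can list the documents that cite it. The sketch below shows only that inversion step, under the assumption that reference strings have already been parsed and matched to document ids (in practice the harder part). The ids and the function name build_citation_index are hypothetical and are not taken from the CiteSeerX code base.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert {citing_doc: [cited_doc, ...]} into {cited_doc: sorted citing docs}."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return {doc: sorted(citers) for doc, citers in cited_by.items()}


# Hypothetical reference lists keyed by document id.
references = {
    "paperA": ["paperC", "paperD"],
    "paperB": ["paperC"],
    "paperC": ["paperD"],
}

index = build_citation_index(references)
print(index["paperC"])  # ['paperA', 'paperB']
print(index["paperD"])  # ['paperA', 'paperC']
```

The length of each list is the citation count for that document, and the lists themselves supply the "cited by" links on a document page.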
  
SeerSuite
  Tool kit used to build search engines and digital libraries
  o  CiteSeerX, MyCiteSeerX, ChemXSeer, ArchSeer, AlgoSeer, AckSeer, BizSeer, CSSeer, CollabSeer, RefSeer, GrantSeer, SeerSeer, YouSeer, etc.
  Built on commercial grade open source tools (Solr/Lucene)
  Penn State expertise - automated specialized metadata extraction
  Supports research in
  o  Indexing and search
  o  Data mining & structures
  o  Information and knowledge extraction
  o  Social networks: Name/entity disambiguation
  o  Scientometrics/infometrics
  o  Systems engineering
  o  User interface design (HCI = human-computer interaction)
  o  Software engineering and management

ChemXSeer Highlights

Portal for academic researchers in chemistry which integrates the scientific
literature with experimental, analytical and simulation results and tools
Provides unique metadata extraction, indexing and searching pertinent to the
chemical literature by using heuristics combined with machine learning
Chemical formulae and names
Tables
Figures
Publication functions as in CiteSeerX
Expert and expertise search.

After extraction, data is stored as XML that users can access through an API.
Hybrid repository: Serves as a federated information interoperational system
Scientific papers crawled and indexed from the web
User submitted papers and datasets (e.g. excel worksheets, Gaussian and CHARMM
toolkit outputs)
Scientific documents and metadata from publishers, web or archives.

Access control for proprietary provided content and user-submitted
experiment data

Takes advantage of in-house open source projects such as CiteSeerX/SeerSuite.
Example Formula Search
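A formula search differs from a keyword search in that it matches on composition rather than on the exact string. The sketch below is a simplified heuristic for illustration only and does not reproduce ChemXSeer's extraction, which the slides describe as heuristics combined with machine learning. It handles flat formulae without parentheses or charges, does not check that symbols are real elements, and uses made-up document ids.

```python
import re
from collections import Counter

# An element symbol (capital letter, optional lowercase letter) and an optional count.
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Parse a flat formula such as 'C6H12O6' into a dict of element counts."""
    counts = Counter()
    for symbol, digits in ELEMENT.findall(formula):
        counts[symbol] += int(digits) if digits else 1
    return dict(counts)


def formula_search(query, documents):
    """Return ids of documents whose formula has the same composition as the query."""
    target = parse_formula(query)
    return [doc_id for doc_id, formula in documents.items()
            if parse_formula(formula) == target]


# Hypothetical index of formulae already extracted from papers.
documents = {"doc1": "C2H6O", "doc2": "CH3CH2OH", "doc3": "H2O"}

glucose = parse_formula("C6H12O6")
matches = formula_search("C2H6O", documents)
print(glucose)
print(matches)
```

Both "C2H6O" and "CH3CH2OH" describe the same composition, so a query for one retrieves documents that use the other notation, which a plain string match would miss.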
Users - TAMU
  Requirements (content, services)
  Practices (scholarly, information seeking)
  Social framework (collaboration, recommendation)
  Interviews, surveys
  Evaluations: usability, benefits
  
Infrastructure - PSU
  Computers, software, launching infrastructure at:
  o  QU: powerful server, now crawling
     + ready to help any group interested in curating a collection
  o  VT, QNL (postdoc), QCRI (Prof. Mitra), …
  Adapt to disciplines, interesting parts of documents
  Adapt to each collection
  o  Develop knowledge base and heuristics for the coll.
  o  Change document parser
  o  Change database to match what occurs
  o  Change extractors: document -> database
  
Arabic - VT
  Handle Arabic text documents
  o  Obtain a suitable category/classification system
  o  Have people provide ‘training set’
  o  Use machine learning to automatically classify future Arabic text documents
  Support cross-language information retrieval
  o  Arabic question against English documents
  o  English question against Arabic documents
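The workflow above (category system, human-provided training set, machine learning for future documents) can be illustrated with a small multinomial naive Bayes classifier. Naive Bayes is chosen here only because it fits in a few lines; the slides do not say which learning method the project uses. The two categories and the four training texts are made-up examples, and tokens are split on whitespace, whereas real Arabic text would likely need normalization and stemming first.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (text, label). Returns a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)   # label -> token frequencies
    label_counts = Counter()             # label -> number of training documents
    vocabulary = set()
    for text, label in examples:
        tokens = text.split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the most probable label, using add-one (Laplace) smoothing."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in text.split():
            score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set: (Arabic text, category) pairs supplied by people.
training_set = [
    ("كرة القدم مباراة فريق", "sports"),
    ("فريق لاعب مباراة هدف", "sports"),
    ("اقتصاد سوق بنك استثمار", "economy"),
    ("بنك سوق أسهم اقتصاد", "economy"),
]

model = train(training_set)
sports_guess = classify(model, "مباراة فريق هدف")
economy_guess = classify(model, "سوق بنك")
print(sports_guess, economy_guess)
```

With a realistic category system the training set would need many documents per category, and accuracy should be measured on held-out documents before the classifier is applied to new material.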
  
Arabic Handwriting - QU
  Images of historic documents
  o  Arabic text extracted
  o  Mapping from a part of the text to the corresponding part of the image
  Special tools for
  o  Those processing the original documents
  o  Those doing research with the collection
  Will allow work on non-textual collections too, e.g., museum images, set of photos for teaching architecture
  
Accessible Collections in Qatar - QNL
  What collections have the highest priority?
  What special handling is needed for each class, for each subclass of collection type?
  How do DLs best fit into the activities of the National Library?
  Can .qa be fully archived for Wayback Machine use?
  
RELATED
TOPICS

CORE DL
TOPICS

COURSE
STRUCTURE

DL Curriculum Framework
Semester 1:
DL collections:
development/creation

Digitization
Storage
Interchange

Metadata
Cataloging
Author
submission

Digital objects
Composites
Packages

Semester 2:
DL services and
sustainability

Architectures
(agents, buses,
wrappers/mediators)
Interoperability

Spaces
(conceptual,
geographic,
2/3D, VR)

Documents
E-publishing
Markup

Multimedia
streams/structures
Capture/representation
Compression/coding

Bibliographic
information
Bibliometrics
Citations

Content-based
analysis
Multimedia
indexing

Naming
Repositories
Archives

Services
(searching,
linking,
browsing, etc.)

Archiving and
preservation
Integrity

Architectures
(agents, buses,
wrappers/mediators)
Interoperability

Thesauri
Ontologies
Classification
Categorization

Info. Needs
Relevance
Evaluation
Effectiveness

Intellectual property
rights mgmt.
Privacy
Protection (watermarking)

Routing
Filtering
Community
filtering

Search & search strategy
Info seeking behavior
User modeling
Feedback

Multimedia
presentation,
rendering


Info
summarization
Visualization

Modules
  http://en.wikiversity.org/wiki/Curriculum_on_Digital_Libraries
  o  Table 1: Core DL Modules
  o  Table 2: Information Retrieval Packages
  o  Table 3: Big Data
  o  Table 4: Multimedia Software
  Like lesson plans, for a training session or lecture
  Can be used for self-study, refreshers
  
http://curric.dlib.vt.edu/modDev/modDev.html

http://elisq.qu.edu.qa/
ELISQ Audience (Users)
  Primary:
  o  Librarians and libraries in Qatar
  o  Researchers and academics
  o  Government organizations
  o  Non-Governmental organizations (such as http://www.fsd.org.qa/)
  Secondary:
  o  University / School Students
  o  Teachers / Faculty
  o  Managers
  o  Qatari citizens
  o  Other stakeholders
  
ELISQ Project (1 of 2)
Project Objectives/Aims
A.  Research and prototype digital library systems and infrastructure for Qatar, focusing initially on Qatari information related to government and scholarly activities.

Leverage the crawling engine from Penn State's SeerSuite software infrastructure, and extend it beyond its current focus on English to support Arabic-English collections, and to cover a broad range of scholarly disciplines, and all types of government information.
  
ELISQ Project (2 of 2)
Project Objectives/Aims (continued)
B.  Research and build the digital library community in Qatar, supporting digital library use, services, collection development, tailored systems, and advancing toward a Knowledge Society.

Study scholarly activities, and engage in community building in Qatar, so DLs can be tailored to specific domains and to the unique needs of Qatar. Through workshops, a consulting center at the proposed Institute, and collaborative efforts with libraries and museums in Qatar, we will identify particular needs and uses, and tailor collections, systems, and services, to lead toward the Qatari Knowledge Society.
  
Significance to Librarians, Corporations, and Governmental Agencies
  The need to preserve cultural and historical heritage =>
  o  Collections of fragile and precious artifacts =>
  o  Libraries, museums, and archives developing digital collections =>
  o  Users from all over the world accessing and studying
  A one stop search of:
  o  Information about Qatar
  o  Information to preserve the culture of Qatar
  Deep indexing, analysis, and retrieval of:
  o  Resources, reports, statistics, and other types of information
  o  Information in the Arabic language as well as in English
  
ELISQ Content
  Metadata, data, and many types of documents (including full text)
  Qatari resources that first appeared in digital form - ‘born’ digital
  At a later stage the project will include:
  o  Digital versions of material already existing in print
  o  Multimedia (image, audio, video) forms
  Free and open as well as content with limited access
  
ELISQ Focus
Community in Qatar
  Identify interested stakeholders, to tailor to needs
  Train next generation of digital librarians, archivists, and curators
  Partners helping with additional collection development
Advanced Technology for Enhanced Access
  “Low hanging fruit” by crawling Qatar-related Web
  Improved analysis (citations, tables, chemicals, …)
  Support for both Arabic and English
  
Summary (some highlights)
  Introduction to digital libraries: 5S, any content
  History: since 1991, Google, repositories
  Technology: SeerSuite, Heritrix, Solr, HCI
  Initial collections: Qscience, news, …
  Research: extend SeerSuite; Arabic
  Adapt other tools for handwriting collection, non-text collections
  Development: consulting center (addressing needs)
  
Questions for You
  What communities should be served?
  What collections should be made accessible?
  What services are required?
  What are the priorities in the above?
  Can you help us find suitable partners, content owners, curators, user groups?
  
Questions for Us?
  http://elisq.qu.edu.qa/
  fox@vt.edu
  http://fox.cs.vt.edu
  

From local to global: sharing information literacy teaching as open education...
 
WWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & EducationWWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & Education
 
Webquest presentation, December 2011
Webquest presentation, December 2011Webquest presentation, December 2011
Webquest presentation, December 2011
 
From Open Data to Open Science, by Geoffrey Boulton
 From Open Data to Open Science, by Geoffrey Boulton From Open Data to Open Science, by Geoffrey Boulton
From Open Data to Open Science, by Geoffrey Boulton
 
[DCSB] Simon Mahony (UCL) "Open Education, Open Educational Resources, and th...
[DCSB] Simon Mahony (UCL) "Open Education, Open Educational Resources, and th...[DCSB] Simon Mahony (UCL) "Open Education, Open Educational Resources, and th...
[DCSB] Simon Mahony (UCL) "Open Education, Open Educational Resources, and th...
 

20140106 qu seminar

  • 1. Digital Libraries: History, Technology, R&D. Edward A. Fox, Professor, Computer Science, Virginia Tech, Blacksburg, VA 24061 USA. fox@vt.edu, http://fox.cs.vt.edu. 6 Jan. 2014
  • 2. Outline: Acknowledgments, Introduction, History, Technology, Research, Development, Summary and Discussion
  • 3. Sponsored by Qatar University & Qatar National Library (HTTP://qnl.qa, HTTP://WWW.QU.EDU.QA/). Funding provided through the ELISQ project: Electronic Library Institute - SeerQ (HTTP://WWW.VT.EDU/, HTTP://WWW.PSU.EDU/, HTTP://WWW.TAMU.EDU/)
  • 4. ELISQ Project Team
      - Qatar University, Qatar: Mohammed Samaka (Ph.D., Co-Lead PI), Sumaya Ali S A Al-Maadeed (Ph.D., PI), Myrna Tabet, Asad Nafees, Tahseena Moideen
      - Qatar National Library, Qatar: Claudia Lux (PI), Krishna Roy Chowdhury, Postdoc - TBA
      - Virginia Tech, USA: Edward Fox (Ph.D., Lead-PI), Tarek Kanan
      - Penn. State University, USA: C. Lee Giles (Ph.D., PI), Sagnik Ray Choudhury
      - Texas A&M, USA: Richard Furuta (Ph.D., PI), Hamed Alhoori
      - Consultants: John Impagliazzo (Ph.D., Key Investigator), Susan Lukesh (Ph.D.), Carole Thompson
      This project was made possible by NPRP Grant # 4-029-1-007 from the Qatar National Research Fund (a member of Qatar Foundation).
  • 5. Acknowledgements
      - Dr. Mazen Hasna, VP and Chief Academic Officer, Qatar University
      - Dr. Rashid Alammari, Dean, College of Engineering, Qatar University
      - Dr. Moumen Hasnah, Director of Academic Research, Qatar University
      - Dr. Claudia Lux, Qatar National Library Director
      - Dr. Imad Bachir, Qatar University Library Director
      - Dr. Munir Tag, Acting Director Technical, ICT Program Manager (QNRF)
      - Ms. Krishna Roy Chowdhury, Associate Director for Library IT, Qatar National Library
      - Prof. Sebti Foufou, Head of Department of Computer Science and Engineering, Qatar University
  • 6. Additional Thanks
      - Qscience, providing collection: Christopher J. Leonard, Editorial Director; Paul Coyne, CTO
      - US National Science Foundation (recent and current grants to Fox): IIS-1319578, IIS-0916733, DUE-0840719, OCI-1032677, plus those to PSU, TAMU
  • 7. Outline: Acknowledgments, Introduction, History, Technology, Research, Development, Summary and Discussion
  • 8. Introduction
      Reasons to be here:
      - Interested
      - Find what to do with your content
      - Find how to help your user community
      http://www.morganclaypool.com/toc/icr/1/1
      1. DL Introduction, 5S framework (2012)
      2. DL Quality, Integration (2013)
      3. DL Technologies (in press)
      4. DL Applications (in press)
  • 13. DLs Shorten the Chain to (diagram labels: Author, Teacher, Reader, Editor, Reviewer, Learner, Digital Library, Librarian)
  • 14. Digital Library Content
      - Content types: Text Documents; Video; Audio; Geographic Information; Software, Programs; Bio Information; Images and Graphics
      - Examples: Articles, Reports, Books; Speech, Music; (Aerial) Photos; Models, Simulations; Genome (human, animal, plant); 2D, 3D, VR, CAT
  • 17. Digital Library Reference Model 1.0, p. 30 of 234
  • 18. Informal 5S DL Definitions
      DLs are complex systems that:
      - help satisfy info needs of users (societies)
      - provide info services (scenarios)
      - organize info in usable ways (structures)
      - present info in usable ways (spaces)
      - communicate info with users (streams)
  • 19. Information Life Cycle (diagram labels: Authoring, Modifying, Using, Creating, Organizing, Indexing, Retention / Mining, Storing, Retrieving, Accessing, Filtering, Distributing, Networking)
  • 20. Infrastructure Services
      - Repository-Building
        o Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
        o Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
      - Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
      Information Satisfaction Services
      - Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
  • 21. SeerSuite is Not Google
      - Metadata (as in library catalogs) as well as content
      - Sets of collections, rather than the Web as a whole
      - Provided by a curator (e.g., publisher, museum)
      - Provided by user submissions
      - Or collected by focused 'crawling'
      - Tailored services, rather than the same for everyone
      - Browsing using categories, preserving, adding value
      - Based on studying user requirements, e.g., chemists
      - Working with entities, rather than just words
      - Citations, tables, figures, names, chemical formulas
      - Using knowledge bases, machine learning, artificial intelligence
  • 22. Outline: Acknowledgments, Introduction, History, Technology, Research, Development, Summary and Discussion
  • 23. History Overview
      - 1991, esp. from Information Retrieval
      - Connecting computer, library, and information science communities
      - NSF DL Initiative 1 in 1994 included funding for Stanford, where Google was prototyped
      - International conferences in the Americas (JCDL), as well as Europe (TPDL, by DELOS), Asia (ICADL)
      - Publishers: ACM, …
      - DOIs, (Institutional) Repositories
      - Spinoffs: content & courseware management systems
      - Recently including (linked) data
  • 24. www.nsdl.org
  • 26. Institutional Repositories
      "Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities."
      Crow, R. "Institutional repository checklist and resource guide", SPARC, Washington, D.C., USA. www.arl.org/sparc/IR/IR_Guide_v1.pdf
  • 27. NDLTD: www.ndltd.org
      Networked Digital Library of Theses and Dissertations (NDLTD)
      Vision: Every thesis and dissertation in the world is:
        o Devised to take advantage of the most helpful electronic publishing methods
        o Shared globally and easily found
        o Supported by a suite of digital library services to aid authors, researchers, learners, universities
        o Preserved and migrated permanently
  • 28. Crisis, Tragedy, and Recovery (CTR) Network / Integrated Digital Event Archive & Library (IDEAL)
      - Human tragedies that result from man-made and natural events affect humans and communities significantly.
      - During and after a tragic event, there is a series of needs that have to be addressed.
        o Compounded by communication failures and a confusing plethora of data and information
  • 29. CTRnet (Crisis, Tragedy & Recovery Net): Disaster Locations
  • 30. CTRnet (Crisis, Tragedy & Recovery Net): Word Clouds of Japan Earthquake and Libya Revolution (using tweets)
      - Japan Earthquake, Tsunami Disaster
      - Libya Revolution
      - Updated every 10 minutes
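
A word cloud is driven by term frequencies, so the tweet-based clouds above can be illustrated with a minimal counting sketch. This is not CTRnet's actual pipeline; the stop-word list, the tokenizer, and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "in", "of", "to", "and", "is", "for", "on", "at", "rt"}


def word_frequencies(tweets, top_n=5):
    """Return the top_n most frequent non-stop-words across the tweets."""
    counts = Counter()
    for tweet in tweets:
        # Drop URLs, then keep lowercase alphabetic tokens (hashtags lose '#').
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        for token in re.findall(r"[a-z']+", text):
            if token not in STOP_WORDS and len(token) > 2:
                counts[token] += 1
    return counts.most_common(top_n)


# Invented sample tweets, for illustration only.
sample_tweets = [
    "Earthquake hits Japan, tsunami warning issued http://example.org/a",
    "RT Tsunami warning for the coast of Japan",
    "Japan earthquake: relief efforts begin",
]
print(word_frequencies(sample_tweets, top_n=4))
```

A periodic refresh (the slide mentions every 10 minutes) would simply rerun this count over the most recent tweets and redraw the cloud with font size proportional to frequency.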
  • 31. CTR stakeholders
  • 32. CINET: Network Science Middleware
  • 33. CINET: Network Science Middleware
      - Netviz: Course project that aims to develop a visualization component for CINET, which contains large network graphs. The visualization service will get networks from CINET, convert them from Galib to GEXF format, then visualize the graphs using Gephi.
      - Figure: CINET network displayed using Gephi
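The conversion step on this slide can be sketched roughly as follows. This is a minimal illustration, not the Netviz code: the slide does not describe the Galib layout, so the input here is assumed to be a plain list of (source, target) pairs, and the function name `edges_to_gexf` is hypothetical. The output follows the basic GEXF 1.2 structure that Gephi can open.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def edges_to_gexf(edges):
    """Serialize an undirected edge list as a minimal GEXF 1.2 document.

    `edges` is an assumed stand-in for whatever the Galib reader yields:
    an iterable of (source, target) node-name pairs.
    """
    edges = list(edges)
    gexf = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        gexf, "graph", {"mode": "static", "defaultedgetype": "undirected"}
    )
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")

    # Every endpoint becomes one node; names double as labels.
    for name in sorted({n for edge in edges for n in edge}):
        ET.SubElement(nodes_el, "node", {"id": str(name), "label": str(name)})

    # GEXF edges need their own ids in addition to source and target.
    for i, (source, target) in enumerate(edges):
        ET.SubElement(
            edges_el, "edge",
            {"id": str(i), "source": str(source), "target": str(target)},
        )
    return ET.tostring(gexf, encoding="unicode")

xml_text = edges_to_gexf([("a", "b"), ("b", "c")])
print(xml_text)
```

Writing `xml_text` to a file with a `.gexf` extension should be enough for Gephi to load it; node attributes, weights, and layout hints are left out of this sketch.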
  • 34. Outline: Acknowledgments; Introduction; History; Technology; Research; Development; Summary and Discussion
  • 35. Web Archiving
      - Introduction: Web archiving is the process of
          o gathering up data recorded on the World Wide Web,
          o storing it,
          o ensuring the data is preserved in an archive, and
          o making the collected data available for future research.
      - The Internet Archive and several national libraries initiated Web archiving practices in 1996.
  • 36. Crawler (Heritrix) (for search engines & Web archives)
      - A Web crawler starts with a list of URLs to visit, called the seeds.
      - On those pages, it
          o identifies all the hyperlinks,
          o adds them to the list of URLs to visit, and
          o recursively visits the pages pointed to,
          o according to a set of policies.
      - It prioritizes its downloads, since some pages change often.
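The crawl loop described on this slide can be sketched as below. It is an illustration only, not Heritrix code: the "Web" is a hypothetical in-memory dictionary (`TOY_WEB`), and the single policy is an assumed same-site scope rule. A real crawler would fetch and parse pages, respect robots.txt, and schedule revisits for pages that change often.

```python
from collections import deque

# Hypothetical in-memory "Web": page URL -> URLs that page links to.
TOY_WEB = {
    "http://example.org/a": ["http://example.org/b", "http://example.org/c"],
    "http://example.org/b": ["http://example.org/c", "http://other.example/x"],
    "http://example.org/c": ["http://example.org/a"],
    "http://other.example/x": [],
}

def crawl(seeds, get_links, in_scope, max_pages=100):
    """Breadth-first crawl from the seeds; returns URLs in visit order."""
    frontier = deque(seeds)   # the list of URLs still to visit
    seen = set(seeds)         # avoids queueing the same URL twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            # The scope check stands in for the crawler's "set of policies".
            if link not in seen and in_scope(link):
                seen.add(link)
                frontier.append(link)
    return visited

def same_site(url):
    # Assumed policy: stay on example.org.
    return url.startswith("http://example.org/")

order = crawl(["http://example.org/a"], lambda u: TOY_WEB.get(u, []), same_site)
print(order)
```

Replacing the FIFO queue with a priority queue is one way to model the prioritization mentioned on the slide.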
  • 37. Focused Crawlers
      - For a particular topic or event, to build a Web collection focused in that area
      - Start with URLs of interest, viewed as seeds to grow from
      - Expand in a ‘smart’ way to get all and only what is relevant
      - Use information retrieval / artificial intelligence / machine learning
          o Require ‘knowledge bases’ and/or human training examples
      - Nevertheless, there is a tradeoff between the resulting
          o Recall (i.e., coverage of what is out there)
          o Precision (i.e., freedom from noise in what is collected)
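The recall/precision tradeoff named on this slide can be made concrete with a small calculation. The page identifiers below are made up for illustration; in practice the set of truly relevant pages is unknown and has to be estimated, for example from human judgments on a sample.

```python
def precision_recall(collected, relevant):
    """Return (precision, recall) for a crawl result.

    precision = relevant pages collected / all pages collected
    recall    = relevant pages collected / all relevant pages
    """
    collected, relevant = set(collected), set(relevant)
    hits = len(collected & relevant)
    precision = hits / len(collected) if collected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical crawl: 4 pages collected, 5 pages truly relevant, 2 in common.
collected = {"p1", "p2", "p3", "p4"}
relevant = {"p1", "p2", "p5", "p6", "p7"}
p, r = precision_recall(collected, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Loosening the crawler's relevance threshold usually collects more of the relevant pages (higher recall) at the cost of more noise (lower precision), and tightening it does the opposite.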
  • 38. SeerSuite Instantiations
      - CiteSeerx, http://citeseerx.ist.psu.edu: A scientific literature digital library and search engine
      - ChemXSeer, http://chemxseer.ist.psu.edu: Portal for researchers in environmental chemistry integrating the scientific literature with experimental, analytical, and simulation results and tools
      - ArchSeer, http://archseer.ist.psu.edu/: Archeology literature
      - TableSeer: ANY fields with tables
  • 39. CiteSeerX, http://citeseerx.ist.psu.edu
      - CiteSeerX crawls researcher homepages on the web for scholarly papers, formerly in computer science
      - Converts PDF to text
      - Automatically extracts OAI metadata and other data
      - Automatic citation indexing, links to cited documents, creation of document page, author disambiguation
      - Software is open source and can be used to build other such tools
      - Statistics:
          o 3 M documents
          o Ms of files
          o 60 M citations
          o 3 to 6 M authors
          o 2 to 4 M hits per day
          o 100K documents added monthly
          o 800K individual users
          o several Tbytes
  • 42. SeerSuite
      - Tool kit used to build search engines and digital libraries
          o CiteSeerX, MyCiteSeerX, ChemXSeer, ArchSeer, AlgoSeer, AckSeer, BizSeer, CSSeer, CollabSeer, RefSeer, GrantSeer, SeerSeer, YouSeer, etc.
      - Built on commercial grade open source tools (Solr/Lucene)
      - Penn State expertise: automated specialized metadata extraction
      - Supports research in
          o Indexing and search
          o Data mining & structures
          o Information and knowledge extraction
          o Social networks: Name/entity disambiguation
          o Scientometrics/informetrics
          o Systems engineering
          o User interface design (HCI = human-computer interaction)
          o Software engineering and management
  • 43. ChemXSeer Highlights
      - Portal for academic researchers in chemistry which integrates the scientific literature with experimental, analytical and simulation results and tools
      - Provides unique metadata extraction, indexing and searching pertinent to the chemical literature by using heuristics combined with machine learning
          o Chemical formulae and names
          o Tables
          o Figures
      - Publication functions as in CiteSeerX
      - Expert and expertise search
      - After extraction, data is stored as XML that users can access through an API
      - Hybrid repository: serves as a federated, interoperable information system
          o Scientific papers crawled and indexed from the web
          o User-submitted papers and datasets (e.g., Excel worksheets, Gaussian and CHARMM toolkit outputs)
          o Scientific documents and metadata from publishers, web or archives
      - Access control for proprietary provided content and user-submitted experiment data
      - Takes advantage of in-house open source projects such as CiteSeerX/SeerSuite
  • 45. Outline: Acknowledgments; Introduction; History; Technology; Research; Development; Summary and Discussion
  • 46. Users - TAMU
      - Requirements (content, services)
      - Practices (scholarly, information seeking)
      - Social framework (collaboration, recommendation)
      - Interviews, surveys
      - Evaluations: usability, benefits
  • 47. Infrastructure - PSU
      - Computers, software, launching infrastructure at:
          o QU: powerful server, now crawling, and ready to help any group interested in curating a collection
          o VT, QNL (postdoc), QCRI (Prof. Mitra), …
      - Adapt to disciplines, interesting parts of documents
      - Adapt to each collection
          o Develop knowledge base and heuristics for the collection
          o Change document parser
          o Change database to match what occurs
          o Change extractors: document -> database
  • 48. Arabic - VT
      - Handle Arabic text documents
          o Obtain a suitable category/classification system
          o Have people provide ‘training set’
          o Use machine learning to automatically classify future Arabic text documents
      - Support cross-language information retrieval
          o Arabic question against English documents
          o English question against Arabic documents
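The train-then-classify workflow on this slide can be illustrated with a tiny multinomial naive Bayes classifier. This is a sketch under stated assumptions, not the project's method: the slide does not name an algorithm, the two categories and four training sentences are invented, and tokens are split on whitespace with no Arabic normalization or stemming, which a real system would need.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count documents per class and words per class from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for tok in text.split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in text.split():
            if tok in vocab:  # words never seen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical human-labeled training set (two categories).
training_set = [
    ("فاز الفريق في مباراة كرة القدم", "sports"),
    ("سجل اللاعب هدفا في المباراة", "sports"),
    ("ارتفع سعر النفط في السوق", "economy"),
    ("اعلن البنك عن ارباح السوق", "economy"),
]
model = train(training_set)
print(classify("فاز الفريق في المباراة", model))
print(classify("سعر النفط في البنك", model))
```

With a real category system and a larger training set, the same two steps apply: people label examples once, and the trained model labels new documents automatically.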
  • 49. Arabic Handwriting - QU
      - Images of historic documents
      - Arabic text extracted
      - Mapping from a part of the text to the corresponding part of the image
      - Special tools for
          o Those processing the original documents
          o Those doing research with the collection
      - Will allow work on non-textual collections too, e.g., museum images, set of photos for teaching architecture
  • 50. Accessible Collections in Qatar - QNL
      - What collections have the highest priority?
      - What special handling is needed for each class, for each subclass of collection type?
      - How do DLs best fit into the activities of the National Library?
      - Can .qa be fully archived for Wayback Machine use?
  • 51. Outline: Acknowledgments; Introduction; History; Technology; Research; Development; Summary and Discussion
  • 52. DL Curriculum Framework
      - Column labels: Core DL topics; Related topics; Course structure
      - Course structure:
          o Semester 1: DL collections: development/creation
          o Semester 2: DL services and sustainability
      - Topic groups:
          o Digitization, Storage, Interchange
          o Metadata, Cataloging, Author submission
          o Digital objects, Composites, Packages
          o Architectures (agents, buses, wrappers/mediators), Interoperability
          o Spaces (conceptual, geographic, 2/3D, VR)
          o Documents, E-publishing, Markup
          o Multimedia streams/structures, Capture/representation, Compression/coding
          o Bibliographic information, Bibliometrics, Citations
          o Content-based analysis, Multimedia indexing
          o Naming, Repositories, Archives
          o Services (searching, linking, browsing, etc.)
          o Archiving and preservation, Integrity
          o Thesauri, Ontologies, Classification, Categorization
          o Info. needs, Relevance, Evaluation, Effectiveness
          o Intellectual property rights mgmt., Privacy, Protection (watermarking)
          o Routing, Filtering, Community filtering
          o Search & search strategy, Info seeking behavior, User modeling, Feedback
          o Multimedia presentation, rendering
          o Info summarization, Visualization
  • 53. Modules
      - http://en.wikiversity.org/wiki/Curriculum_on_Digital_Libraries
          o Table 1: Core DL Modules
          o Table 2: Information Retrieval Packages
          o Table 3: Big Data
          o Table 4: Multimedia Software
      - Like lesson plans, for a training session or lecture
      - Can be used for self-study, refreshers
  • 55. ELISQ Audience (Users)
      - http://elisq.qu.edu.qa/
      - Primary:
          o Librarians and libraries in Qatar
          o Researchers and academics
          o Government organizations
          o Non-Governmental organizations (such as http://www.fsd.org.qa/)
      - Secondary:
          o University / School Students
          o Teachers / Faculty
          o Managers
          o Qatari citizens
          o Other stakeholders
  • 56. ELISQ Project (1 of 2)
      - Project Objectives/Aims
      - A. Research and prototype digital library systems and infrastructure for Qatar, focusing initially on Qatari information related to government and scholarly activities.
          o Leverage the crawling engine from Penn State's SeerSuite software infrastructure, and extend it beyond its current focus on English to support Arabic-English collections, and to cover a broad range of scholarly disciplines, and all types of government information.
  • 57. ELISQ Project (2 of 2)
      - Project Objectives/Aims (continued)
      - B. Research and build the digital library community in Qatar, supporting digital library use, services, collection development, tailored systems, and advancing toward a Knowledge Society.
          o Study scholarly activities, and engage in community building in Qatar, so DLs can be tailored to specific domains and to the unique needs of Qatar. Through workshops, a consulting center at the proposed Institute, and collaborative efforts with libraries and museums in Qatar, we will identify particular needs and uses, and tailor collections, systems, and services, to lead toward the Qatari Knowledge Society.
  • 58. Significance to Librarians, Corporations, and Governmental Agencies
      - The need to preserve cultural and historical heritage =>
          o Collections of fragile and precious artifacts =>
          o Libraries, museums, and archives developing digital collections =>
          o Users from all over the world accessing and studying
      - A one stop search of:
          o Information about Qatar
          o Information to preserve the culture of Qatar
      - Deep indexing, analysis, and retrieval of:
          o Resources, reports, statistics, and other types of information
          o Information in the Arabic language as well as in English
  • 59. ELISQ Content
      - Metadata, data, and many types of documents (including full text)
      - Qatari resources that first appeared in digital form: ‘born’ digital
      - At a later stage the project will include:
          o Digital versions of material already existing in print
          o Multimedia (image, audio, video) forms
      - Free and open as well as content with limited access
  • 60. ELISQ Focus
      - Community in Qatar
          o Identify interested stakeholders, to tailor to needs
          o Train next generation of digital librarians, archivists, and curators
          o Partners helping with additional collection development
      - Advanced Technology for Enhanced Access
          o “Low hanging fruit” by crawling Qatar-related Web
          o Improved analysis (citations, tables, chemicals, …)
          o Support for both Arabic and English
  • 61. Outline: Acknowledgments; Introduction; History; Technology; Research; Development; Summary and Discussion
  • 62. Summary (some highlights)
      - Introduction to digital libraries: 5S, any content
      - History: since 1991, Google, repositories
      - Technology: SeerSuite, Heritrix, Solr, HCI
          o Initial collections: Qscience, news, …
      - Research: extend SeerSuite; Arabic
          o Adapt other tools for handwriting collection, non-text collections
      - Development: consulting center (addressing needs)
  • 63. Questions for You
      - What communities should be served?
      - What collections should be made accessible?
      - What services are required?
      - What are the priorities in the above?
      - Can you help us find suitable partners, content owners, curators, user groups?
  • 64. Questions for Us?
      - http://elisq.qu.edu.qa/
      - fox@vt.edu
      - http://fox.cs.vt.edu