Metadata harvesting

Metadata harvesting is the automatic collection of metadata from individual repositories using metadata extraction systems or generators. It occurs through analyzing tags and elements like Dublin Core to gather descriptive, technical, and administrative information without human intervention. However, inconsistencies in metadata practices across repositories can cause confusion and insufficient data for service providers harvesting metadata through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Improving guidelines, local standards, evaluation, communication, and data quality can help address these harvesting problems.

Metadata Harvesting and the OAI-PMH
Andrew Schenck
Pamela Russell
LIS 688

What is Metadata Harvesting?
  • An automatic metadata generating method
  • Occurs when metadata is automatically collected from META tags
  • Automatically gathers metadata from individual repositories

Example Metadata Generators
  • Metadata generators are also known as metadata extraction systems
  • Sample metadata extraction systems available for libraries include:
      • DC-dot
      • MarcEdit
      • Metaextract
      • IBM Magic System
  • Some are available via open source

DC-dot
  • DC-dot is open source and it can be redistributed or modified
  • DC-dot creates Dublin Core metadata
  • Metadata creation is initiated by submitting a URL
  • Generates keywords by analyzing hyperlinked concepts and presentation encoding
  • Does not produce description metadata
  • Generates type, format and date metadata

MarcEdit
  • MarcEdit is open source
  • MarcEdit was initially conceived as a graphical user interface designed as a batch MARC editing tool
  • An application suite of metadata editing tools that includes character set conversion, XML crosswalking, and metadata harvesting
  • It allows users to:
      • Customize the existing data conversion rules or create new data conversion rules
      • Harvest metadata from a supported metadata format
      • Create conversion templates for additional metadata formats
      • Customize existing conversion templates to reflect many variations in best practices used among projects

Metaextract
  • Designed for metadata extraction in the domain of math and science education for K-12
  • Also designed to extract Dublin Core and Gateway to Educational Materials metadata on both the item and collection levels
  • Collection-level metadata is generated based on a collection-specific configuration
  • Item-level metadata is extracted from the content of educational documents using three extraction modules:
      • eQuery
      • HTML-based modules
      • Keyword generator module

Editor's notes

  1. Metadata harvesting and the Open Archives Initiative Protocol for Metadata Harvesting by Andrew Schenck and Pamela Russell
  2. Metadata harvesting is an automatic metadata generating method. Harvesting occurs when metadata is automatically collected from META tags found in the “header” source code of an HTML resource or encoded from another resource format. Metadata harvesting automatically gathers metadata from individual repositories where it has been produced by either automatic or manual approaches.
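As a rough illustration of collecting metadata from META tags in the header of an HTML resource, the sketch below uses Python's standard `html.parser` module. The class name `MetaTagHarvester`, the helper `harvest_meta`, and the sample page are hypothetical and are not taken from any of the tools discussed in these slides; the `DC.creator` naming follows the common convention for embedding Dublin Core in HTML.

```python
from html.parser import HTMLParser


class MetaTagHarvester(HTMLParser):
    """Collects <title> text and <meta name=... content=...> pairs found in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.metadata = {}  # field name -> list of values

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif tag == "title" and self.in_head:
            self.in_title = True
        elif tag == "meta" and self.in_head:
            name = attrs.get("name")
            content = attrs.get("content")
            if name and content:
                # Repeated fields (e.g. two creators) are kept as a list.
                self.metadata.setdefault(name.lower(), []).append(content.strip())

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.metadata.setdefault("title", []).append(data.strip())


def harvest_meta(html_text):
    """Return a dict of metadata harvested from the header of one HTML page."""
    parser = MetaTagHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Hypothetical sample page, used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<title>Metadata Harvesting and the OAI-PMH</title>
<meta name="DC.creator" content="Andrew Schenck">
<meta name="DC.creator" content="Pamela Russell">
<meta name="keywords" content="metadata, harvesting, OAI-PMH">
</head><body><p>Slides</p></body></html>"""

if __name__ == "__main__":
    for field, values in harvest_meta(SAMPLE_PAGE).items():
        print(field, values)
```

A real extraction system would also need to fetch the page, handle character encodings, and map the harvested names onto a target schema; none of that is attempted here.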
  3. As with other automated tasks, a multitude of metadata generators is available. These generators, also known as metadata extraction systems, can be extremely helpful for libraries wishing to extract metadata from various repositories. Metadata extraction systems available for libraries include DC-dot, MarcEdit, Metaextract, and IBM Magic System. Some of these systems are open source and free, although the staff needed to run them must usually be paid. Many of the systems were created to harvest all types of metadata, and some were created to harvest metadata for very specific objects or areas of study.
  4. DC-dot was developed by Andy Powell at UKOLN at the University of Bath. DC-dot is open source and it can be redistributed or modified under the terms of the GNU General Public License as published by the Free Software Foundation. DC-dot creates Dublin Core metadata and can format output according to a number of different metadata schemas. In DC-dot, metadata creation is initiated by submitting a URL. The resource identifier metadata from the Web browser’s address prompt is copied, and metadata included in the title, keywords, description, and type fields is then harvested from the resource META tags. DC-dot will automatically generate keywords by analyzing hyperlinked concepts and presentation encoding (bolding and font size), but will not produce description metadata. DC-dot also automatically generates type, format, and date metadata.
  5. MarcEdit was created by Terry Reese in 1998 and was initially conceived as a graphical user interface designed as a batch MARC editing tool. Currently, MarcEdit is an application suite of metadata editing tools that includes character set conversion, XML crosswalking, and metadata harvesting. Unlike other metadata extraction systems, MarcEdit allows users to customize the existing data conversion rules or create new data conversion rules. This allows users to harvest metadata from a supported metadata format as well as create conversion templates for additional metadata formats. It also allows users to customize existing conversion templates to reflect many variations in best practices used among projects.
  6. Metaextract is an extraction system that was designed for metadata extraction in the domain of math and science education for K-12. It was designed to extract Dublin Core and Gateway to Educational Materials metadata on both the item and collection levels using natural language processing techniques. The collection-level metadata is generated based on a collection-specific configuration, and the item-level metadata is extracted from the content of educational documents using three extraction modules: eQuery, HTML-based modules, and a keyword generator module.
  7. IBM Magic System was presented in 2005 and includes various content analytic modules for metadata generation. Its audiovisual analysis modules recognize semantic sound categories and identify narrators and informative text segments, and its text analysis modules extract the title, keywords, and summary from text documents. The IBM Magic System can facilitate content reuse and repurposing, improve interoperability, and create more timely registration of content by course developers and authors.
  8. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) provides an application-independent interoperability framework that is based on metadata harvesting. There are two levels of participants in the OAI-PMH: data providers and service providers. Data providers administer the systems that support the OAI-PMH as a means of supplying metadata. Service providers use the metadata harvested through the OAI-PMH to help build their digital collections.
  9. Two other key terms necessary to understand OAI-PMH are harvester and repository. A harvester is a client application that issues OAI-PMH requests. The harvester is operated by a service provider as a way to collect metadata from a repository. A repository is a network-accessible server that is able to process OAI-PMH requests. A repository is managed by the data provider to allow harvesters access to its metadata.
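To make the harvester and repository roles more concrete, here is a minimal sketch of what a harvester does: it builds a request URL from one of the six OAI-PMH verbs and reads Dublin Core fields out of the repository's XML response. The base URL `https://repository.example.org/oai`, the function names, and the sample response are assumptions made for illustration; only the verb names, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH 2.0 specification. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_request(base_url, verb, params=None):
    """Harvester side: build the URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(params or {})
    return base_url + "?" + urlencode(query)


def parse_records(xml_text):
    """Extract identifiers and Dublin Core fields from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        for element in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[-1]  # drop the namespace part
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    # A non-empty resumption token means the repository has more records to send.
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, token.strip()


# Hypothetical response from a repository, trimmed to the parts used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
          <dc:type>Text</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_request("https://repository.example.org/oai", "ListRecords",
                        {"metadataPrefix": "oai_dc"}))
    print(parse_records(SAMPLE_RESPONSE))
```

In practice the harvester would fetch the built URL (for example with `urllib.request.urlopen`) and repeat the request with the returned resumption token until the repository reports no more records.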
  10. The most common problem with harvested metadata is a lack of consistency. For example, inconsistencies across collections can occur when data providers use some Dublin Core elements and controlled vocabularies in one collection but not in another. On a larger scale, some data providers use different Dublin Core elements in different ways throughout their repository. This can lead to similar kinds of metadata ending up in different fields when harvested. The metadata harvested through OAI-PMH has other significant problems. Many repositories have missing data within their metadata. For example, if an entire collection consisted of materials of the same format or type, the repository may decline to fill out the “format” or “type” element in Dublin Core because the information would be deemed unnecessary for the collection’s local purposes. Every item is the same type, so why fill out that field? This causes problems when an OAI-PMH service provider wants to limit a search. If they wanted to limit the search using the format or type element, they wouldn’t be able to do so because that particular field had been left empty by the repository. An example of incorrect data in a repository would be creator names repeated in the language element, or the identifier for the metadata record repeated in the Dublin Core identifier element. Incorrect data also includes misspelled words and stray characters such as dashes or hyphens. Another problem with harvested metadata is that it can be confusing. Strings of names can be ordered in an inconsistent manner or ambiguously separated with commas instead of semicolons. This type of confusing data can occur when the entries are dumped without revision into a metadata record, which may happen when records are cut and pasted from Web HTML text. Insufficient data can also cause problems with harvesting because the metadata present in the repositories is not useful when trying to limit searches and retrieve specific information.
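Some of these problems can be screened for automatically. The sketch below is one possible check over a single harvested Dublin Core record, represented as a dict from element name to a list of values. The function name `audit_record`, the list of expected fields, and the comma-counting rule for ambiguous name lists are assumptions chosen for illustration, not rules from the OAI-PMH or from any guideline cited in these slides.

```python
# Values that say nothing but look like data to a harvester (assumed list).
PLACEHOLDERS = {"unknown", "n/a"}

# Elements a service provider might want for limiting searches (assumed list).
EXPECTED_FIELDS = ("title", "creator", "type", "format")


def audit_record(fields):
    """Return a list of human-readable problems found in one Dublin Core record.

    `fields` maps an element name to a list of string values.
    """
    problems = []

    # Missing data: the element is absent or has only blank values.
    for name in EXPECTED_FIELDS:
        values = [v for v in fields.get(name, []) if v.strip()]
        if not values:
            problems.append(f"missing: {name}")

    # Placeholder values that a harvester could mistake for real data.
    for name, values in fields.items():
        for value in values:
            if value.strip().lower() in PLACEHOLDERS:
                problems.append(f"placeholder in {name}: {value!r}")

    # Incorrect data: a creator name repeated in the language element.
    creators = set(fields.get("creator", []))
    for value in fields.get("language", []):
        if value in creators:
            problems.append(f"creator repeated in language: {value!r}")

    # Confusing data: several names run together with commas only.
    for value in fields.get("creator", []):
        if value.count(",") > 1 and ";" not in value:
            problems.append(f"ambiguous name list: {value!r}")

    return problems


if __name__ == "__main__":
    record = {
        "title": ["Harvest report"],
        "creator": ["Schenck, Andrew, Russell, Pamela"],
        "language": ["Schenck, Andrew, Russell, Pamela"],
        "format": ["N/A"],
    }
    for problem in audit_record(record):
        print(problem)
```

A check like this only flags candidates for review; deciding whether a flagged value is actually wrong still requires someone who knows the collection.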
  11. Recommendations for improving harvesting: As a repository, use established guidelines and develop local standards. Either use a guideline and best practices resource that already exists or develop and document standards to meet your local needs. Evaluate your metadata to determine if there is some that you do not want or need to share. Check to see if there are certain elements where you have local metadata that would not be useful in an aggregated environment. If you find that there are some unnecessary elements, unmap the fields before allowing them to be harvested. While checking for necessary and unnecessary fields, check to see if any fields are populated with "unknown" or "N/A". In an aggregated environment this should not be done. It is better to leave a field blank than to use "unknown" or "N/A" in fields where harvesters might interpret them as meaningful data. Most importantly, communicate with the service provider who is harvesting your records. Review your metadata and determine if there are ways to make it cleaner and easier to understand.
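Two of these recommendations, unmapping local-only fields and leaving a field blank rather than filling it with "unknown" or "N/A", can be sketched as a small preparation step run before records are exposed for harvesting. The function name `prepare_for_harvest`, the `shelfLocation` field, and the sample record are hypothetical.

```python
# Values that should be left blank rather than exposed (from the recommendation above).
PLACEHOLDER_VALUES = {"unknown", "n/a"}


def prepare_for_harvest(fields, unmapped=()):
    """Return a copy of one record that is safe to expose to harvesters.

    `fields` maps an element name to a list of string values.
    `unmapped` names local-only elements that should not be shared.
    """
    skip = {name.lower() for name in unmapped}
    cleaned = {}
    for name, values in fields.items():
        if name.lower() in skip:
            continue  # local metadata that is not useful in an aggregation
        kept = [
            value.strip()
            for value in values
            if value.strip() and value.strip().lower() not in PLACEHOLDER_VALUES
        ]
        if kept:  # a field with no real values is simply left out
            cleaned[name] = kept
    return cleaned


if __name__ == "__main__":
    local_record = {
        "title": ["Map of Bath "],
        "format": ["N/A"],
        "shelfLocation": ["Room 4"],
        "type": ["Image", "unknown"],
    }
    print(prepare_for_harvest(local_record, unmapped=["shelfLocation"]))
```

Which fields count as local-only is a decision for the data provider, ideally made in consultation with the service provider doing the harvesting.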
  12. Although the OAI-PMH is far from perfect, there is ample evidence to suggest that it is a successful endeavor. The number of repositories that make their metadata available through OAI-PMH has grown since the initial release in January of 2001. Another way to gauge success is the level of attention garnered from funding agencies. Some examples of funded projects and programs that promote or are based on the OAI are eprints.org, the Metadata Harvesting Initiative of the Mellon Foundation, and the NSF National Science Digital Library (NSDL). The importance of metadata is one of the reasons that the Open Archives Initiative created the Protocol for Metadata Harvesting. Although it is not a perfect process, it has been very successful in helping many libraries of all types, both large and small, to create and offer Web access to digital collections.