Metadata 101
   An introduction to metadata and data management

                           By Dominique Gerald M. Cimafranca
                              villageidiotsavant.com




This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Philippines
License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ph/ or send
a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
What is Metadata?

Data that provides information
about other data
     A collection of structured information
     about a document or a piece of content

     For example: Author, Title, Subject, Issue
     Date, Publisher
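
As a rough illustration of "structured information about a document", the example fields above can be written down as a small record type. This is only a sketch: the class name, the ISO date convention, and the sample values (a hypothetical report by "Jane Doe") are assumptions for illustration, not part of any standard.

```python
from dataclasses import dataclass, asdict


@dataclass
class DocumentMetadata:
    """The example fields from the slide, one attribute per field."""
    author: str
    title: str
    subject: str
    issue_date: str  # assumed convention: ISO 8601 date string
    publisher: str


# Hypothetical sample values, for illustration only.
record = DocumentMetadata(
    author="Jane Doe",
    title="Annual Report",
    subject="Finance",
    issue_date="2010-01-31",
    publisher="Example Press",
)

# The record describes the document without containing the document itself.
print(asdict(record))
```

Because every document carries the same named fields, a collection of such records can be sorted, filtered, and compared in ways the raw documents cannot.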
Metadata isn't new...




           Image from http://www.meerimage.com/
In fact, the stuff you have may
       already have it...




             ...and you just didn't know it.
Purpose of Metadata
●   To identify content
    ●   Capture fields and distinguish each document from all others
●   Manage content
    ●   Version numbers, archive date, security and access permissions
●   Retrieval of content
    ●   Taxonomy topics, subject keywords, document type
●   Connect content to other content
    ●   Behavioral metadata captured in transaction (e.g., Amazon)
●   Business processes
    ●   Authored by whom? Reviewed by whom and when? Approved by whom and
        when?
●   Support records management
    ●   Retention periods, disposition cycle
In a nutshell...



...to make information easy to
find, manage, and contextualize.
However...


Metadata is most useful in collections

Metadata is most useful when shared

Metadata is most useful in collaboration
Issues with information access today

●   Tons of content from disparate sources
●   Cumbersome navigation
●   Keyword search assumes that you know what you are
    looking for
●   Large number of search results – most of them
    irrelevant
●   Lack of context in search results
●   Search engines rely on mathematical algorithms to
    determine relevance and ranking of search results
Types of Metadata
●   Descriptive
    ●   Describes a resource for discovery and
        identification, e.g., abstract, author, keywords
●   Structural
    ●   Indicates how the parts of the resource are
        arranged, e.g., chapters in a book
●   Administrative
    ●   Provides information on how to manage a resource,
        e.g., when it was created, who has access to it
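
One way to picture the three types is to sort the fields of a single record into them. The sketch below assumes a hand-written lookup table (FIELD_TYPES) and a hypothetical book record; real schemas define these groupings themselves.

```python
# Assumed assignment of example fields to the three metadata types.
FIELD_TYPES = {
    "abstract": "descriptive",
    "author": "descriptive",
    "keywords": "descriptive",
    "chapters": "structural",
    "created": "administrative",
    "access": "administrative",
}


def group_by_type(record):
    """Split a flat record into descriptive, structural, and administrative parts."""
    grouped = {"descriptive": {}, "structural": {}, "administrative": {}}
    for field, value in record.items():
        kind = FIELD_TYPES.get(field)
        if kind is None:
            raise KeyError(f"no metadata type assigned to field '{field}'")
        grouped[kind][field] = value
    return grouped


# Hypothetical book record, for illustration only.
book = {
    "author": "Jane Doe",
    "keywords": ["metadata", "taxonomy"],
    "chapters": ["Introduction", "Standards"],
    "created": "2010-05-01",
    "access": ["staff"],
}

grouped = group_by_type(book)
print(grouped["structural"])
```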
Structural Metadata
●   Structural metadata defines the relationship
    between whole and parts.
●   Structural metadata can also be used for
    navigational purposes, e.g., links to related
    files.
Administrative Metadata
●   Administrative metadata provides information
    to help manage a resource, such as when and
    how it was created, file type, and other
    technical information, and who can access it.
●   Most common subsets
    ●   Rights management metadata
    ●   Preservation metadata
Why Metadata?
●   Resource discovery
●   Organizing electronic resources
●   Interoperability
●   Digital identification
●   Archiving and preservation
Metadata alone isn't enough...



...even Metadata has to be properly
thought out and properly used.
Planning Metadata
●   Whose requirements are you trying to meet?
    ●   Who are your users and what are their
        requirements?
●   What is your business case?
    ●   Why should you undertake this project?
●   What is your business model?
    ●   How will this project be worthwhile?
Metadata Structure
●   Recognized standards
●   Local specifications
●   Social tagging systems
Metadata Quality
●   Technical Quality
    ●   Adherence to local or international standards,
        specifications, and application profiles
●   Semantic Quality
    ●   Proper use of controlled vocabularies and semantic
        standards
●   Value Quality
    ●   Populating metadata fields appropriately for describing
        the resource and its relationships for the benefit of the
        user community and other stakeholders
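
Two of these quality dimensions lend themselves to simple automated checks. The sketch below is an assumption-laden illustration: check_record, the required fields, and the three-term vocabulary are all hypothetical. It flags empty required fields (value quality) and terms outside a controlled vocabulary (semantic quality).

```python
def check_record(record, required_fields, controlled_vocab):
    """Return a list of quality problems found in a record (empty if none)."""
    problems = []
    # Value quality: every required field must be present and non-empty.
    for field in required_fields:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing value for '{field}'")
    # Semantic quality: restricted fields may only use vocabulary terms.
    for field, allowed in controlled_vocab.items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"'{value}' is not an allowed term for '{field}'")
    return problems


# Hypothetical local specification.
REQUIRED = ("title", "author", "document_type")
VOCAB = {"document_type": {"report", "memo", "policy"}}

good = {"title": "Annual Report", "author": "Jane Doe", "document_type": "report"}
bad = {"title": "", "author": "Jane Doe", "document_type": "Report"}

print(check_record(good, REQUIRED, VOCAB))
print(check_record(bad, REQUIRED, VOCAB))
```

Note that "Report" fails only because of its capital letter, which is exactly the kind of inconsistency a controlled vocabulary is meant to catch.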
“Accurate, consistent, sufficient, and
thus reliable.”
                   --Greenberg & Robertson, 2002
Nine Guiding Questions
●   Who will be using the collection?
●   Who is the collection cataloger?
●   How much time and money do you have?
●   How will your collection be accessed?
●   How is your collection related to other collections?
●   What is the scope of your collection?
●   Will your metadata be harvested?
●   Do you want your collection to work with other collections?
●   How much maintenance and quality control do you wish?


                   http://journals.tdl.org/jodi/article/viewArticle/226/205
Use cases for Metadata
●   Resource discovery
●   Resource selection
●   Resource aggregation and manipulation
●   Intellectual property rights
●   Digital preservation
●   Marketing
●   Accessibility
●   Interoperability
●   Workflow identification
●   Reputation (of individuals and organizations)
How is Metadata created? By humans...
●   Created by resource authors
●   Added by resource depositors
●   Created, checked, augmented by professionals
    ●   Catalogers
    ●   Subject Experts
    ●   Designated IPR keepers
●   Enriched by resource users
    ●   Additional description, comments, annotations, descriptions of usage
    ●   Corrections
    ●   Enrichment (additional subject description)
    ●   Social tagging
    ●   Ratings and recommendations
...or by machines

●   Extraction from resource files
●   Inferred from resource relationships
●   Creation according to system settings
●   Generation of default values
●   Extraction via text mining
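
"Extraction from resource files" can be as simple as asking the file system what it already knows. The following sketch uses only the Python standard library; the function name and the choice of output fields are assumptions, and the fallback MIME type is a default value generated by the system rather than read from the file.

```python
import mimetypes
import os
from datetime import datetime, timezone


def extract_file_metadata(path):
    """Collect basic administrative metadata that the file system already holds."""
    info = os.stat(path)
    mime_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": modified.isoformat(),
        # Default value used when the type cannot be guessed from the name.
        "format": mime_type or "application/octet-stream",
    }


if __name__ == "__main__":
    # Describe this script itself as a quick demonstration.
    print(extract_file_metadata(__file__))
```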
The need for Metadata standards
●   Different information providers using different
    metadata schemas
●   Even metadata schemas of groups within
    organizations are different or out of sync
●   Result
    ●   Inconsistent search results
    ●   Lack of interoperability
    ●   Information silos
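
To make the problem concrete, the sketch below imagines two providers that describe the same book with different field names, and a "crosswalk" table that maps each local schema onto shared names. The provider names, field names, and sample records are all hypothetical.

```python
# Hypothetical crosswalks: local field name -> shared field name.
CROSSWALKS = {
    "library": {"writer": "creator", "name": "title", "published": "date"},
    "archive": {"author": "creator", "heading": "title", "issued": "date"},
}


def to_common(record, source):
    """Translate a provider-specific record into the shared field names."""
    mapping = CROSSWALKS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


library_record = {"writer": "Jane Doe", "name": "Annual Report", "published": "2010"}
archive_record = {"author": "Jane Doe", "heading": "Annual Report", "issued": "2010"}

# Without the crosswalk, a search on "creator" finds neither record.
print(to_common(library_record, "library"))
print(to_common(archive_record, "archive"))
```

A shared standard removes the need to maintain one crosswalk per provider.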
Dublin Core
●   General purpose metadata standard for use across domains
●   15 core elements
●   Element qualifiers to narrow the meaning of elements
    ●   E.g., Date Created vs Date Modified
●   Encoding schemes: controlled vocabularies or parsing rules
    to refine the interpretation of an element
●   Can be represented in HTML and XML (RDF)
●   See http://dublincore.org
Dublin Core Metadata Elements
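
As an illustration of the HTML representation, the sketch below lists the 15 elements of the Dublin Core Metadata Element Set and renders a record as meta tags using the common "DC." name prefix. The helper name dc_meta_tags and the sample record are assumptions; see http://dublincore.org for the authoritative encoding guidelines.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dc_meta_tags(record):
    """Render a dict of Dublin Core values as HTML link and meta tags."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element in DC_ELEMENTS:
        if element in record:
            content = escape(record[element], quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


tags = dc_meta_tags({
    "title": "Metadata 101",
    "creator": "Dominique Gerald M. Cimafranca",
    "language": "en",
})
print(tags)
```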
Taxonomy
●   A classification scheme
    ●   Designed to group related things together
●   Semantic
    ●   Fixed vocabulary that is meaningful to its users
●   A knowledge map
    ●   Should give the user a grasp of the structure of the
        knowledge domain
    ●   Establishes relationships between objects
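
A small taxonomy can be sketched as a nested dictionary, where the path from the root to a term shows how that term relates to the rest of the domain. The terms below are invented for illustration and are not taken from any published thesaurus.

```python
# Hypothetical three-level taxonomy: each key is a term, each value its children.
TAXONOMY = {
    "Government": {
        "Health": {"Hospitals": {}, "Public health": {}},
        "Education": {"Schools": {}},
    }
}


def find_path(tree, term, trail=()):
    """Return the tuple of terms leading from the root to `term`, or None."""
    for name, children in tree.items():
        path = trail + (name,)
        if name == term:
            return path
        found = find_path(children, term, path)
        if found:
            return found
    return None


print(find_path(TAXONOMY, "Schools"))
print(find_path(TAXONOMY, "Roads"))
```

The fixed vocabulary is the set of keys; the knowledge map is the nesting.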
Government-related taxonomies
●   Australian Governments' Interactive Functions Thesaurus
    ●   Three-level hierarchical thesaurus that describes business functions
        carried out through Australian government units
    ●   25 high-level functions with second and third level terms
    ●   Purpose: to aid online discovery of government information and
        services
●   Functions of New Zealand and Subjects of New Zealand
    ●   Thesauri for NZ government resources
    ●   Classification at the all-of-government level


                    http://www.naa.gov.au/records-management/create-capture-describe/describe/agift/index.aspx

                    http://www.e.govt.nz/standards/nzgls/thesauri/
Actually, we don't need to look
           outside...
Remember!


Metadata is most useful in collections

Metadata is most useful when shared

Metadata is most useful in collaboration
Sources
●   Metadata Primer (http://www.slideshare.net/selvats/metadata-primer)
●   AGIFT (http://www.naa.gov.au/records-management/create-capture-describe/describe/classification/agift/index.htm)
●   Taxonomy and Metadata (http://www.slideshare.net/dchampeau/taxonomy-and-metadata)
●   Understanding Metadata (http://www.niso.org/standards/resources/UnderstandingMetadata.pdf)
●   An Introduction to Metadata (http://www.library.uq.edu.au/iad/ctmeta4.html)
●   NZGLS thesauri (http://www.e.govt.nz/standards/nzgls/thesauri/downloads.html)
●   If you tag it, will they come? (Sarah Currier)
●   Nine questions to guide you in choosing a metadata schema
    (http://journals.tdl.org/jodi/article/viewArticle/226/205)
