SlideShare a Scribd company logo
1 of 3
Download to read offline
Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015 
Best Practice Poster:  
Gateway to Oklahoma History Case Study:  
Structured Data and Metadata Evaluation for Improved Image 
Resource Findability on the Web 
 
  Emily Ann Kolvitz 
University of Oklahoma, 
USA 
kolvitz1@gmail.com 
 
 
 
Keywords: structured data; metadata; Semantic Web; online search; information retrieval; image                     
findability 
1. Introduction 
Image Resource Findability on the World Wide Web is still very much a land­grab. For the                               
Semantic Web to become a reality online businesses and individuals have to get their hands dirty                               
and also come face­to­face with the realization that search engine giants are increasingly                         
becoming the go­to tool for information resource retrieval. “Increasingly, students use Web                       
search engines such as Google to locate information resources rather than seek out library online                             
catalogs or databases of scholarly journal articles” (Lippincott 2013). This puts the search engine                           
giant in a unique position to dictate how the future of search will work on the Web ­ and                                     
therefore, your organization’s future presence (or lack thereof) on the Web. Search Engine                         
Optimization (SEO) techniques change frequently and remain much a mystery to many                       
companies. The one variable in the equation of Web findability that remains a staple is good                               
quality metadata under the hood of the Website. In this case study, a methodology is applied to                                 
the Gateway to Oklahoma History’s Website. This study can be generalized to organizations                         
looking to benchmark their own findability maturity on the Web from an image­centric                         
viewpoint.  
2. Purpose 
Image search and retrieval is a more difficult area than text search and retrieval because                             
accessibility to the image content is largely dependent on the context presented in and around the                               
image resource. The future of Semantic Web technologies relies very much on the idea that                             
organizations are fluent in structured data and have devoted resources to exposing valuable data                           
to the web. The W3C (World Wide Web Consortium) was founded over two decades ago and the                                 
widespread adoption of Schema.org and structured data on the Web did not gain traction until the                               
big four search engines (Google, Bing, Yahoo, and Yandex ) agreed that a standard was needed to                                 
pave the way forward. “On­page markup helps search engines understand the information on                         
web pages and provide richer search results. A shared markup vocabulary makes easier for                           
webmasters to decide on a markup schema and get the maximum benefit for their efforts.”                             
(Schema.org 2014) Many organizations still lag behind on their implementation of any type of                           
structured data. Structured data is only one piece of the findability algorithm. Metadata near                           
content, embedded within content, or listed in the alt text of an html document all tell machines                                 
something about the content inside of the record as well. There are no guardians of the Web,                                 
ensuring structured data is uniformly applied to all records with equal attention and care and there                               
is no standard, mandated requirement for records on the Web to provide context for image                             
 
Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015 
resource findability. Most search engines do not crawl embedded XMP data or the invisible Web,                             
leaving text near images, file names or text in the alt­text in html markup as the only context for                                     
image resources. The search algorithms for image retrieval are subject to change frequently                         
(Kritzinger 2013) and additionally, social media sites and organizations strip embedded data from                         
images (Embedded Metadata Manifesto 2014). Embedded metadata provides context and                   
provenance for image resources. Even with the dramatic adoption of structured data markup                         
utilizing schema.org vocabularies, there still remains metadata opportunities on the Web. Reicks                       
recommends embedded metadata as a strategy for online findability by showcasing examples of                         
applications that parse embedded data into structured data around images on the Web such as                             
PhotoShelter and LicenseStream (2010).  
2. Research Methods 
 The following research question informed this project: What are the types and quality of 
structured data, XMP, and metadata records available for image resources appearing on the 
website? Utilizing the Structured Data Linter Tool and Phil Harvey’s ExifTool, information was 
gathered to quantify these research questions.  Image records on the Gateway to Oklahoma 
History’s website were investigated for the types, quality and quantity of embedded metadata and 
structured data.  
3.  Results 
The Gateway to Oklahoma History’s Website has a wealth of structured data and metadata                           
pertaining to its image resources. Search queries utilizing structured data markup tags and/or                         
embedded metadata yielded relevant and accurate results during a normal web search, but did not                             
yield relevant and/or accurate image resources during an image search. Descriptive filenames                       
were not used for image resources, which is an important part of image retrieval through web                               
search engines. Adding Schema.org tags to the on­page markup, to accompany the structured                         
data already present is another area for improvement. An interesting finding from this research                           
was that embedded metadata was only found on the largest, original version of the image                             
resource, and never on smaller derivative images. Structured data included in the on­page                         
mark­up included Open Graph Protocol and Dublin Core. IPTC was the primarily type of                           
embedded metadata present for the image resources.   
4. Conclusion 
The results and methodology for this research can help GLAM institutions (Galleries,                         
Libraries, Archives & Museums) by bringing awareness to the state of structured data and image                             
resource findability for cultural heritage institutions on the Web. GLAMs must be active in the                             
SEO space, support machine­readable language in the markup of their sites, and utilize                         
Schema.org vocabularies and descriptive filenames for relevancy in search engine results. The                       
Digital Library Federation, which is a program of the Council on Library and Information                           
Resources, concludes that “​Getting found means repository objects must be included in the                         
indexes of major search engines because most students and faculty now begin their research with                             
Internet search engines. Digital repositories created by libraries will be largely invisible to users if                             
their contents are not indexed in these search engines”  (Digital Library Foundation 2014).   
 
References 
Digital Library Federation.  Last accessed October 20, 2014.   “SEO for Digital Libraries.”   
     ​http://www.diglib.org/community/groups/seo­for­digital­libraries/ 
 
Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015 
International Business, Times. 0006. "Bing,Google and Yahoo merge to make search easier with schema.org."                           
International Business Times, April. 
IPTC International Press Telecommunications Council, 2014. “Embedded Metadata Manifesto” Last accessed                     
November 20, 2014. ​http://www.embeddedmetadata.org/social­media­test­results.php​(Embedded Metadata         
Manifesto 2014).  
Kritzinger, W. T. "Search Engine Optimization and Pay­per­Click Marketing Strategies." Journal of Organizational                         
Computing and Electronic Commerce, no. 3 (2013): 273­86. 
Lippincott, Joan K. “Net Generation Students and Libraries,” EDUCAUSE (2005), accessed November 19, 2014,                           
http://www.educause.edu/research­and­publications/books/educating­net­generation/net­generation­students­and­lib
raries 
Reicks, David. 2010. “Why Embedded Metadata Won’t Help Your SEO,” Last Updated December 30, 2013. Last                               
Accessed November 23, 2014. ​http://www.controlledvocabulary.com/blog/embedded­metadata­wont­help­seo.html 
Schema.org. 2015.  “About Schema.org”  Last Updated Unknown.  ​https://schema.org/docs/faq.html 
 

More Related Content

Recently uploaded

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Recently uploaded (20)

Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Best Practice Poster: Gateway to Oklahoma History Case Study: Structured Data and Metadata Evaluation for Improved Image Resource Findability on the Web

  • 1. Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015  Best Practice Poster:   Gateway to Oklahoma History Case Study:   Structured Data and Metadata Evaluation for Improved Image  Resource Findability on the Web      Emily Ann Kolvitz  University of Oklahoma,  USA  kolvitz1@gmail.com        Keywords: structured data; metadata; Semantic Web; online search; information retrieval; image                      findability  1. Introduction  Image Resource Findability on the World Wide Web is still very much a land­grab. For the                                Semantic Web to become a reality online businesses and individuals have to get their hands dirty                                and also come face­to­face with the realization that search engine giants are increasingly                          becoming the go­to tool for information resource retrieval. “Increasingly, students use Web                        search engines such as Google to locate information resources rather than seek out library online                              catalogs or databases of scholarly journal articles” (Lippincott 2013). This puts the search engine                            giant in a unique position to dictate how the future of search will work on the Web ­ and                                      therefore, your organization’s future presence (or lack thereof) on the Web. Search Engine                          Optimization (SEO) techniques change frequently and remain much a mystery to many                        companies. The one variable in the equation of Web findability that remains a staple is good                                quality metadata under the hood of the Website. In this case study, a methodology is applied to                                  the Gateway to Oklahoma History’s Website. This study can be generalized to organizations                          looking to benchmark their own findability maturity on the Web from an image­centric                          viewpoint.   2. Purpose  Image search and retrieval is a more difficult area than text search and retrieval because                              accessibility to the image content is largely dependent on the context presented in and around the                                image resource. The future of Semantic Web technologies relies very much on the idea that                              organizations are fluent in structured data and have devoted resources to exposing valuable data                            to the web. The W3C (World Wide Web Consortium) was founded over two decades ago and the                                  widespread adoption of Schema.org and structured data on the Web did not gain traction until the                                big four search engines (Google, Bing, Yahoo, and Yandex ) agreed that a standard was needed to                                  pave the way forward. “On­page markup helps search engines understand the information on                          web pages and provide richer search results. A shared markup vocabulary makes easier for                            webmasters to decide on a markup schema and get the maximum benefit for their efforts.”                              (Schema.org 2014) Many organizations still lag behind on their implementation of any type of                            structured data. Structured data is only one piece of the findability algorithm. Metadata near                            content, embedded within content, or listed in the alt text of an html document all tell machines                                  something about the content inside of the record as well. There are no guardians of the Web,                                  ensuring structured data is uniformly applied to all records with equal attention and care and there                                is no standard, mandated requirement for records on the Web to provide context for image                               
  • 2. Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015  resource findability. Most search engines do not crawl embedded XMP data or the invisible Web,                              leaving text near images, file names or text in the alt­text in html markup as the only context for                                      image resources. The search algorithms for image retrieval are subject to change frequently                          (Kritzinger 2013) and additionally, social media sites and organizations strip embedded data from                          images (Embedded Metadata Manifesto 2014). Embedded metadata provides context and                    provenance for image resources. Even with the dramatic adoption of structured data markup                          utilizing schema.org vocabularies, there still remains metadata opportunities on the Web. Reicks                        recommends embedded metadata as a strategy for online findability by showcasing examples of                          applications that parse embedded data into structured data around images on the Web such as                              PhotoShelter and LicenseStream (2010).   2. Research Methods   The following research question informed this project: What are the types and quality of  structured data, XMP, and metadata records available for image resources appearing on the  website? Utilizing the Structured Data Linter Tool and Phil Harvey’s ExifTool, information was  gathered to quantify these research questions.  Image records on the Gateway to Oklahoma  History’s website were investigated for the types, quality and quantity of embedded metadata and  structured data.   3.  Results  The Gateway to Oklahoma History’s Website has a wealth of structured data and metadata                            pertaining to its image resources. Search queries utilizing structured data markup tags and/or                          embedded metadata yielded relevant and accurate results during a normal web search, but did not                              yield relevant and/or accurate image resources during an image search. Descriptive filenames                        were not used for image resources, which is an important part of image retrieval through web                                search engines. Adding Schema.org tags to the on­page markup, to accompany the structured                          data already present is another area for improvement. An interesting finding from this research                            was that embedded metadata was only found on the largest, original version of the image                              resource, and never on smaller derivative images. Structured data included in the on­page                          mark­up included Open Graph Protocol and Dublin Core. IPTC was the primarily type of                            embedded metadata present for the image resources.    4. Conclusion  The results and methodology for this research can help GLAM institutions (Galleries,                          Libraries, Archives & Museums) by bringing awareness to the state of structured data and image                              resource findability for cultural heritage institutions on the Web. GLAMs must be active in the                              SEO space, support machine­readable language in the markup of their sites, and utilize                          Schema.org vocabularies and descriptive filenames for relevancy in search engine results. The                        Digital Library Federation, which is a program of the Council on Library and Information                            Resources, concludes that “​Getting found means repository objects must be included in the                          indexes of major search engines because most students and faculty now begin their research with                              Internet search engines. Digital repositories created by libraries will be largely invisible to users if                              their contents are not indexed in these search engines”  (Digital Library Foundation 2014).      References  Digital Library Federation.  Last accessed October 20, 2014.   “SEO for Digital Libraries.”         ​http://www.diglib.org/community/groups/seo­for­digital­libraries/   
  • 3. Proc. Int’l Conf. on Dublin Core and Metadata Applications 2015  International Business, Times. 0006. "Bing,Google and Yahoo merge to make search easier with schema.org."                            International Business Times, April.  IPTC International Press Telecommunications Council, 2014. “Embedded Metadata Manifesto” Last accessed                      November 20, 2014. ​http://www.embeddedmetadata.org/social­media­test­results.php​(Embedded Metadata          Manifesto 2014).   Kritzinger, W. T. "Search Engine Optimization and Pay­per­Click Marketing Strategies." Journal of Organizational                          Computing and Electronic Commerce, no. 3 (2013): 273­86.  Lippincott, Joan K. “Net Generation Students and Libraries,” EDUCAUSE (2005), accessed November 19, 2014,                            http://www.educause.edu/research­and­publications/books/educating­net­generation/net­generation­students­and­lib raries  Reicks, David. 2010. “Why Embedded Metadata Won’t Help Your SEO,” Last Updated December 30, 2013. Last                                Accessed November 23, 2014. ​http://www.controlledvocabulary.com/blog/embedded­metadata­wont­help­seo.html  Schema.org. 2015.  “About Schema.org”  Last Updated Unknown.  ​https://schema.org/docs/faq.html