Three taxonomy proposals, presented at the IPTC Spring 2019 meeting:
1. Document how to use 3rd party entity schemes
2. Develop taxonomies for “perceived” metadata - for photo, video and audio items
3. Develop a way to “delegate” to Wikidata, to extend IPTC Media Topics into more granular topics
2. Three Taxonomy Proposals
1. Document how to use 3rd party entity schemes
2. Develop taxonomies for “perceived” metadata - for photo, video and audio items
3. Develop a way to “delegate” to Wikidata, to extend IPTC Media Topics into more granular topics
www.iptc.org 2
3. Adding Entities
• A frequent request is to add entities to IPTC’s NewsCodes
– People, places, organizations, companies, events, works of art, buildings…
• News and media often discuss or depict specific entities
• Licensing news and media typically involves particular entities
– Model releases, photographers, video producers, news publishers, media archives, locations, etc.
4. Adding Entities?
• So, given the clear need for entity lists, particularly interoperable ones, why doesn’t IPTC maintain standard lists?
• Creating, disambiguating, and updating a list of every newsworthy person, organization or thing mentioned in the news
• … is too much work!
• So, how can IPTC help?
5. Entity Lists
• Luckily, lots of other organizations maintain entity lists
• ISNI http://www.isni.org/
• ORCID https://orcid.org/
• OpenCorporates https://opencorporates.com/
• Wikidata https://www.wikidata.org/
• PermID https://permid.org/
• APMS https://developer.ap.org/ap-metadata-services
• Photographer Identity Catalog https://pic.nypl.org/
• To name just a few
6. Proposal: Design and Documentation
• Proposal: document how to use 3rd party entity lists, rather than have IPTC maintain them
– Maybe this is already documented? But I can’t find it
– Also, I assume that this will require some design work, to figure out exactly how 3rd party lists could be used in a way that is compatible with IPTC standards
• Bonus proposal: consider reviving the IPTC Semex work
– A standard way to document how to reconcile different taxonomy schemes
– https://www.slideshare.net/hledwards/iptc-ncdsummer2015semantic-exchangehistorycorrected
– We could even apply for grant funding to support this type of work
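One possible shape for such documentation is a reference that pairs a scheme with an identifier inside that scheme, instead of copying the entity itself. The Python sketch below is illustrative only: the `EntityRef` class, the scheme keys and the URI templates are assumptions made for this example, not something defined by IPTC or by the maintainers of those lists.

```python
from dataclasses import dataclass

# Hypothetical URI templates, keyed by an invented scheme name.
# Check each maintainer's own documentation for the real patterns.
SCHEME_TEMPLATES = {
    "wikidata": "http://www.wikidata.org/entity/{id}",
    "orcid": "https://orcid.org/{id}",
    "isni": "https://isni.org/isni/{id}",
}


@dataclass(frozen=True)
class EntityRef:
    """A pointer to an entity in a 3rd party list (hypothetical structure)."""
    scheme: str         # key into SCHEME_TEMPLATES
    local_id: str       # identifier within that scheme
    label: str = ""     # optional human-readable name, not used for matching

    def uri(self) -> str:
        """Build a full URI from the scheme template and the local identifier."""
        template = SCHEME_TEMPLATES.get(self.scheme)
        if template is None:
            raise ValueError(f"unknown entity scheme: {self.scheme!r}")
        return template.format(id=self.local_id)


def same_entity(a: EntityRef, b: EntityRef) -> bool:
    """Two references match when scheme and identifier match; labels are ignored."""
    return (a.scheme, a.local_id) == (b.scheme, b.local_id)
```

Matching on scheme and identifier, and not on the label, is what makes such references interoperable between suppliers that spell names differently.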
7. Image Tags
• Media Topics are suitable for tagging the subject matter of visual imagery
• However, Media Topics are not designed for properties such as
– Emotions conveyed - happy, sad
– Artistic elements - line, colour, texture
– Tone - essential, useful, entertaining, comedic
– Actions - celebrate, dance, hug
– Setting - city, exterior
– Purpose of the picture - editorial, creative
• These types of perceived qualities of images are useful for search, amongst other use cases
– Many IPTC members use manual keywording to capture these types of properties
– Many are working on / experimenting with automated image and video tagging
– https://www.slideshare.net/smyles/image-tagging-at-the-associated-press
• Should IPTC create controlled vocabularies for these terms?
8. Image CV
• In fact, the idea that IPTC should create an “Image CV” is not new
• In 2010, a taxonomy for still and, perhaps, moving images was created
• The draft was presented at the Photo Metadata Conference
• Two types of properties were identified, beyond those covered by Media Topics:
– Perceived facets, like emotions, activities, colours, visible relationships of things
– Entities
• Two stumbling blocks were identified:
– Would the Image CV be useful without granular entities, and could IPTC realistically manage granular entities?
– Who would use this taxonomy, given that many photo businesses have their own already?
9. Proposal: Perceived Metadata
• Proposal: Create vocabularies for what may be perceived about an item
– Initially, this would be visual perceptions (i.e. for still and moving images)
– Later we could add audio and, if VR/AR/XR takes hold, other senses
• We could use the 2010-era Image CV as a starting point, as well as any other relevant schemes
– We should not include entities, but should document how 3rd party entity lists can be used
• The perceived metadata standard would be useful for
– News and media organizations who wish to describe their content in a standard way
– Clients who wish to integrate the content offerings of different suppliers
– Image recognition vendors who don’t have a documented taxonomy of their own
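To make the idea concrete, a perceived-metadata vocabulary could be organised as a small set of facets, each with its own list of allowed terms. The following Python sketch is only an illustration: the facet keys are invented here, the terms are the examples from the Image Tags slide, and `validate_tags` is a hypothetical helper, not part of any IPTC standard.

```python
from typing import Dict, List

# Facet keys are invented for this sketch; the terms are the examples from the
# Image Tags slide. A real vocabulary would be larger and would need stable IDs.
PERCEIVED_FACETS = {
    "emotion": {"happy", "sad"},
    "artistic_element": {"line", "colour", "texture"},
    "tone": {"essential", "useful", "entertaining", "comedic"},
    "action": {"celebrate", "dance", "hug"},
    "setting": {"city", "exterior"},
    "purpose": {"editorial", "creative"},
}


def validate_tags(tags: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means every tag is in the vocabulary."""
    problems = []
    for facet, terms in tags.items():
        allowed = PERCEIVED_FACETS.get(facet)
        if allowed is None:
            problems.append(f"unknown facet: {facet}")
            continue
        for term in terms:
            # Terms are compared case-insensitively.
            if term.lower() not in allowed:
                problems.append(f"unknown term for {facet}: {term}")
    return problems
```

A client integrating content from several suppliers could run a check like this on incoming items, whether the tags came from manual keywording or from an image recognition service.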
10. Adding Granularity to Media Topics
• IPTC Media Topics are great.
• However, they don’t cover all possible news topics
– There are no entities, as we have discussed
– There is a topic of “animals”, but there are no “lions”, “tigers” or “bears”
• Some news and media organizations require more granularity
• Often different organizations require more granularity in some areas than in others
– For example, in finance or sports
• Can we provide a standard way to extend IPTC Media Topics that
– Preserves the benefits of IPTC Media Topics as a standard
– Doesn’t require news and media organizations to create their own alternative
– Doesn’t overwhelm the IPTC NewsCodes committee with work?
11. Wikidata
• “Wikidata is a collaboratively edited knowledge base hosted by the Wikimedia Foundation. It is a common source of open data that Wikimedia projects such as Wikipedia can use, and anyone else, under a public domain license.”
• “In Wikidata, items are used to represent all the things in human knowledge, including topics, concepts, and objects. For example, the "1988 Summer Olympics", "love", "Elvis Presley", and "gorilla" are all items in Wikidata.”
• https://www.wikidata.org/wiki/Wikidata:Main_Page
• IPTC has mapped all the Media Topics to Wikidata
• https://iptc.org/news/wikidata/
12. Proposal: Delegate to Wikidata
• Proposal: rather than extend Media Topics more deeply, develop a way to delegate to Wikidata for more granular items
• For example, rather than adding “house cats” to Media Topics, use the Wikidata item https://www.wikidata.org/wiki/Q146
• It may involve news providers adding missing terms to Wikidata
– And worrying about notability
• It likely requires documentation on how to process a mix of Media Topics and Wikidata URLs
• It is an opportunity to more tightly bind into the Linked Data world
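As a rough illustration of what processing a mix of Media Topics and Wikidata URLs might involve, the Python sketch below sorts subject URIs by scheme. It is a minimal sketch under stated assumptions: the `classify_subject` function is hypothetical, the Media Topic prefix is assumed to be the `cv.iptc.org` NewsCodes form, and both the `/wiki/` page URL and the `/entity/` URI forms are accepted for Wikidata.

```python
import re

# Assumed prefix for Media Topic concept URIs; verify against the IPTC NewsCodes docs.
MEDIATOPIC_PREFIX = "http://cv.iptc.org/newscodes/mediatopic/"

# Accepts both the page URL (/wiki/Q146) and the entity URI (/entity/Q146).
WIKIDATA_RE = re.compile(r"^https?://www\.wikidata\.org/(?:wiki|entity)/(Q[1-9]\d*)$")


def classify_subject(uri: str):
    """Return (scheme, local_id); scheme is 'mediatopic', 'wikidata' or 'unknown'."""
    if uri.startswith(MEDIATOPIC_PREFIX):
        return ("mediatopic", uri[len(MEDIATOPIC_PREFIX):])
    match = WIKIDATA_RE.match(uri)
    if match:
        return ("wikidata", match.group(1))
    # Anything else is passed through untouched so the caller can decide what to do.
    return ("unknown", uri)


subjects = [
    "http://cv.iptc.org/newscodes/mediatopic/01000000",
    "https://www.wikidata.org/wiki/Q146",
]
for subject in subjects:
    print(classify_subject(subject))
```

A consumer that only understands Media Topics could use the existing Media Topic to Wikidata mapping as a fallback for the Wikidata items it does not recognise; that step is not shown here.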
13. Recap: Three Taxonomy Proposals
1. Document how to use 3rd party entity schemes
2. Develop taxonomies for “perceived” metadata - for photo, video and audio items
3. Develop a way to “delegate” to Wikidata, to extend IPTC Media Topics into more granular topics
Discussion?