The Essentials of Metadata and Taxonomy - Henry Stewart Event
The Next Wave: Using Wikipedia as a Controlled Vocabulary
* Leveraging an online resource for internal use
* Integrating pre-existing unique identification numbers (UIDs)
* Inherited relations
* Capturing and cataloging
* Risks and remedies
Chris Sizemore, BBC Future Technology & Media, and Silver Oliver, BBC Future Technology & Media
3. BBC Topic Page: I'm about "Victorians". Outside the BBC; BBC silo #1; BBC silo #3; BBC silo #2
4. BBC Topic Page: I'm about "Victorians" (viktorianisch, Ελληνικά). NY Times, flickr, wikipedia. Outside the BBC; BBC silo #1; BBC silo #3; BBC silo #2
15. Story of Wikipedia-as-CV: personal origins So of course we built a categorisation system from scratch -- including its own controlled vocab
16. Story of Wikipedia-as-CV: personal origins And when people saw the system, they always said: "Hey, that reminds me of Internet Movie Database…"
18. Story of Wikipedia-as-CV: personal origins It struck me that the way Internet Movie Database is set up isn't dissimilar to the structure of a thesaurus or a very flat taxonomy…
19. Story of Wikipedia-as-CV: personal origins But it's one where the emphasis is on "related to", not broader/narrower, synonym, antonym, etc.
21. Story of Wikipedia-as-CV: personal origins From then, I couldn't help but be drawn to websites where the structure is clearly: "a single primary Concept per page -- and pages for related Concepts link to each other"
22. Story of Wikipedia-as-CV: personal origins Could those "one Concept per page" webpages be used as "terms" as in a controlled vocabulary?
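To make the "one Concept per page" idea concrete, here is a minimal sketch of how such pages could be modelled as vocabulary terms. The `Concept` class, the `link` helper, and the two example pages are illustrative assumptions, not part of the system described in the talk. The page URL acts as the unique identifier, and the only relation recorded is "related to".

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One vocabulary term: a single web page about a single Concept."""
    uri: str    # the page URL doubles as the unique identifier
    label: str  # human-readable label, e.g. the page title
    related: set = field(default_factory=set)  # URIs of related Concept pages


def link(a: Concept, b: Concept) -> None:
    """Record a flat "related to" relation in both directions."""
    a.related.add(b.uri)
    b.related.add(a.uri)


# Illustrative terms only; any "one Concept per page" site would work the same way.
victorian_era = Concept("https://en.wikipedia.org/wiki/Victorian_era", "Victorian era")
queen_victoria = Concept("https://en.wikipedia.org/wiki/Queen_Victoria", "Queen Victoria")
link(victorian_era, queen_victoria)
```

Note that there is deliberately no broader/narrower hierarchy here: the structure stays as flat as the page-to-page links it mirrors.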
29. Demo of conText -- a Wikipedia-as-CV auto-categoriser prototype: Take text from audience!
31. Wikipedia is already being used across the Web as a form of subject identification & disambiguation, in a grassroots way: in the form of hyperlinks embedded by authors in blog posts, news articles, music reviews, etc everywhere!
34. These days, by convention, when you link to Wikipedia from your webpage, more than saying "go and have a look at this other page", you are more likely giving a definition to a concept referred to in your content… Also used in this way for specific domains are Internet Movie Database (for films & TV programmes), MySpace (for bands), Amazon (for books), etc.
35. For general knowledge, though, Wikipedia is becoming the Web's de facto controlled vocabulary
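The grassroots convention described above can be read back out of a page mechanically. The following is a rough sketch, assuming that any hyperlink to a `/wiki/` path on a wikipedia.org host counts as a subject tag. The class and function names are hypothetical, and the sketch does not filter out non-article namespaces such as `File:` or `Talk:`.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class WikipediaLinkCollector(HTMLParser):
    """Collects the article titles of Wikipedia links found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.subjects = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        host = parsed.netloc.lower()
        on_wikipedia = host == "wikipedia.org" or host.endswith(".wikipedia.org")
        if on_wikipedia and parsed.path.startswith("/wiki/"):
            # Turn "/wiki/Victorian_era" into "Victorian era".
            title = unquote(parsed.path[len("/wiki/"):]).replace("_", " ")
            if title and title not in self.subjects:
                self.subjects.append(title)


def wikipedia_subjects(html: str) -> list:
    """Return the Wikipedia article titles an author linked to, in document order."""
    collector = WikipediaLinkCollector()
    collector.feed(html)
    return collector.subjects


sample = (
    '<p>The <a href="https://en.wikipedia.org/wiki/Victorian_era">Victorians</a> '
    'built <a href="https://example.com/rail">railways</a>.</p>'
)
subjects = wikipedia_subjects(sample)
```

Only the first link in `sample` is treated as a subject identifier; the second points elsewhere and is ignored.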
40. Wikipedia pages provide the best scope notes in the world. Wikipedia-as-CV benefits from being developed through a social process, maintained and kept current by the Wikipedia community. Each concept represents a consensus view, and its meaning can be understood simply by reading the associated Wikipedia page
41. Wikipedia pages provide the best scope notes in the world. For each Concept, the document edit history, discussion around concept definition, & debate is important here…
45. So, we can tag pretty accurately semi-automatically with globally unique subject identifiers using this approach… So what? Un-silo your content repository quickly and cheaply, by connecting it to the Web via Wikipedia
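As a rough illustration of what "un-siloing" could look like once items carry shared subject identifiers, the sketch below groups items from separate repositories under the Wikipedia URL they were tagged with. The silo names, item IDs, and tags are made-up placeholders, and `build_topic_index` is a hypothetical helper rather than anything from the BBC prototype.

```python
from collections import defaultdict

VICTORIAN_ERA = "https://en.wikipedia.org/wiki/Victorian_era"
QUEEN_VICTORIA = "https://en.wikipedia.org/wiki/Queen_Victoria"

# Placeholder data: each silo maps its own item IDs to the subject URIs they were tagged with.
silos = {
    "silo 1": {"programme-42": [VICTORIAN_ERA]},
    "silo 2": {"article-7": [VICTORIAN_ERA, QUEEN_VICTORIA]},
    "silo 3": {"clip-3": [QUEEN_VICTORIA]},
}


def build_topic_index(silos: dict) -> dict:
    """Map each subject URI to the (silo, item) pairs tagged with it."""
    index = defaultdict(list)
    for silo_name, items in silos.items():
        for item_id, subject_uris in items.items():
            for uri in subject_uris:
                index[uri].append((silo_name, item_id))
    return dict(index)


topic_index = build_topic_index(silos)
victorian_items = topic_index[VICTORIAN_ERA]
```

Because the key is a public URL rather than an internal ID, the same lookup could in principle also pull in material from outside the organisation that links to the same page.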
53. Now playing vs. the Web: Why not bring BBC Archive materials into this service via Wikipedia-as-CV tagging and a linked data bridge between Wikipedia & MusicBrainz?
64. Chris Sizemore, Silver Oliver, BBC. Wikipedia as controlled vocabulary. Wikipedia is a controlled vocabulary
67. Chris Sizemore, Silver Oliver, BBC. Wikipedia as controlled vocabulary. Wikipedia is a controlled vocabulary. Much thanks! Questions, comments, & constructive criticism?
68. Chris Sizemore, Silver Oliver, BBC. Wikipedia as controlled vocabulary. http://flickr.com/photos/deniscollette/1817034358/