17. metadata is. . .
a descriptive assertion about something
an access point into a content record
a signal to aggregators, browsers, search engines
metadata is the adhesive that provides structure and the
common language that promotes interoperability
20. how the web has changed
routes to content
Search is the home page
The web is the CMS
Devices, Robots, Developers get our content first and decide where it appears
25. content: pack it for solo travel
Is it something you search by? sort by?
How could it relate to other content?
How might it be extracted or displayed?
How may it be reprioritized or resized?
26. metadata is. . .
a descriptive assertion about something
an access point into a content record
a signal to aggregators, browsers, search engines
metadata is the adhesive that provides structure and the common language
that promotes interoperability
that allows content to travel alone
27. The irony of control
You must have control to relinquish it. Utilize and
contribute to existing, extensible schemas and standards.
Schema.org
managed by Google, Bing, Yahoo and Yandex
Dublin Core
MODS (Federal Register)
28. Fly! Be free!
Common Language
Outgrow WYSIWYG
Pages to Parts
Content as atoms/molecules
Recognize Relationships
Rule the Robots
30. semantic web thinking changes the way we work:
• we focus on things and the relationships between them as opposed to the documents
• we introduce a culture of building with open vocabularies to add context and links and
• we maximize the value we get out of our tagging of content.
– BBC Digital
31. The content and the audience must connect
find the audience
reaching audiences
making connections
viral knowledge
colleagues not competitors
32. Customers don't know — and don't care to know — how
government is organized.
So why make them go from agency to agency to get the
full picture of what gov't has to offer on any subject?
-Participant in the National Dialogue in improving government websites
Editor's Notes
Metadata needs an image upgrade. Every major content project I’ve been a part of includes ‘Metadata Strategy’ as part of the original proposal. But somewhere along the production cycle (the workplan, the implementation, or the maintenance) the enthusiasm for a well-considered metadata strategy wanes exponentially.
It’s a tedious, painful time suck that many think requires group consensus… and ultimately its value won’t be realized until sometime in the future… when everything will have changed… or so the thinking goes. So a true metadata strategy often dies a slow death. We step over the body saying we’ll come back for it… but we move on.
But metadata, in the form of structured content and data, is sloughing off its image as the dull historian or librarian cataloguing every possible detail of an object… and becoming an action hero. With the right phone booth, metadata might change from Clark Kent, the steady, dull, there-if-you-remember-him hard worker, into Superman… and he will take our stuff… our content… on journeys we hadn’t even planned for it.
NB: the concepts presented here have a long history in information science and data management. Content Strategists have not invented anything new… we are refashioning these concepts for clients and content owners.
Metadata is one of those words that means everything and nothing. Project teams bandy it about when, ironically, each member has a different definition for it. The archivist, historian, or librarian considers metadata the descriptive or administrative information that catalogs items for storage and retrieval. Web developers and designers consider it part of their presentation markup, the structural metadata that gives browsers and devices instructions. Data developers consider metadata the individual pieces of their data model. Content strategists think of it mostly in terms of how it moves content efficiently around the content management system.
What brings us together is that our jobs, generally, revolve around connecting people to knowledge. Without an audience, whether that be humans or, I might add, other machines, we are irrelevant.
We may use metadata to help organize ourselves in the backend and in the content management system… to help streamline some of the decision making. Metadata helps us make order out of chaos. But metadata is also a signal or access point for others, so they can know our meaning without us there holding their hand. The call number on a library book means I can find something without the librarian’s help.
So browsers know what to do on the fly. So rules can be applied based on format, rights, or size, without constant supervision.
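To make "rules applied without supervision" concrete, here is a minimal sketch in Python. The field names (`format`, `rights`, `size_kb`), the threshold, and the decision strings are all invented for illustration; they do not come from any real CMS or standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Asset:
    """A piece of content described only by its metadata (hypothetical fields)."""
    title: str
    format: str    # a MIME-style type, e.g. "image/jpeg" or "video/mp4"
    rights: str    # e.g. "public-domain" or "licensed"
    size_kb: int


def decide(asset: Asset, max_kb: int = 500) -> List[str]:
    """Return the decisions a machine could make from the metadata alone."""
    decisions = []
    if asset.rights != "public-domain":
        decisions.append("do not syndicate")
    if asset.format.startswith("video/"):
        decisions.append("show poster image on mobile")
    if asset.size_kb > max_kb:
        decisions.append("serve resized version")
    # With nothing to act on, the content goes out untouched.
    return decisions or ["publish as-is"]


print(decide(Asset("Flu clinic map", "image/jpeg", "public-domain", 120)))
print(decide(Asset("Vaccine explainer", "video/mp4", "licensed", 9000)))
```

Note that an asset with missing or wrong metadata still gets a decision; it is just not one anybody chose.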
So here is my first stab at a meaningful definition.
It’s a descriptive assertion – not necessarily objective.
It’s an access point for its users to discover more about that which it describes.
And it’s a signal out to users – to give directions or provide a beacon drawing them to that access point.
Most importantly, metadata makes decisions.
It’s why it’s difficult to create: committing to a decision is hard.
And it’s why it’s powerful. This is what metadata means to all of us. It helps us make decisions. It helps browsers make decisions. It helps machines make decisions. And more and more, machines, robots, code… these things are making decisions before we even get a chance to weigh in on what we want.
I hope at the end of this presentation you will agree that our metadata – even, maybe especially its absence - affects the decisions that our users make about our content. And more and more decisions are being made for the user before they even see the content – by devices and third party intermediaries between us and the knowledge we are trying to impart.
I reject the ‘data about data’ definition because it suggests that metadata is objective – quantifiable fact.
Sometimes that is the case, but often judgments are made in ascribing metadata that may have far-reaching ramifications for a piece of content.
In all of our disciplines we are accustomed to thinking of metadata in hierarchical terms. But where does Billie Holiday fall in music genre? Lady Sings the Blues, after all, but would you find her under Blues? You can make the argument that individual songs fall under many genres, and maybe as a whole you consider her a jazz singer. You may discover her because your heart is broken and you go on an iTunes or Pandora torch song bender.
Metadata - like the content it describes - is often subjective, and that has ramifications. So we need to start thinking in terms of all the different relationships our content may have to the world at large.
Metadata is the access point, the road sign, that puts you on different paths.
And the path that you go down is likely to be curated by others before you. Because the flip side of the access point is the signal. Our content, we’ve discovered, will sneak off and become part of someone else’s construct. As it moves around, it’s likely to lose some of its original source information unless metadata is in place. This movie theatre in Key West may end up on someone else’s art deco Pinterest board, and those viewers won’t know its location.
Metadata, and again even its absence, makes decisions for the content’s journey, regardless of the original publisher’s intent.
It’s not as if we need to convince ourselves or our clients that metadata is important. Most content projects begin with grand plans regarding metadata strategy. But these smaller implementations of metadata tend to ultimately appear as guest-star functionality as opposed to a foundational piece of a project. We certainly don’t consider it or apply it as much as any of us would like. Primarily, both we as project teams and our clients would like to use it more to make our own lives easier: to cut down on duplication, to ensure content stays fresh and accurate, to end the need for 100% manual content maintenance.
But metadata is hard.
There are many good reasons why we slowly lose enthusiasm for metadata strategy:
departments argue over the words: who gets control of the controlled vocabulary
content managers are burdened with excessive administration: the need to be exhaustive means you exhaust your content editors
authors and experts are often the wrong people to choose metadata
training and maintenance are required
PLUS metadata is hidden: there is little immediate gratification and no glaring errors… you can move on without it.
So that metadata section of your project becomes an easy casualty when time and budget get short. We step over the body and move on.
So what? Is metadata optional? Yes, obviously, because we move on without it. And I’m not suggesting that every project needs a ginormous metadata implementation or controlled vocabulary.
But a project team has to consider the ramifications of whatever it does… including doing nothing. And all teams (content, design, development, client) must speak the same language as to what metadata means… and how it will affect the content.
We are familiar with the acronym that NPR coined in regard to their content: COPE (Create Once, Publish Everywhere).
Before the rest of us had to deal with moving content around multiple screens and devices, content and media creators like NPR and the BBC had to make their content available to subsidiaries and affiliates: BBC Scotland, World Service, BBC America, NPR member stations.
They have been early adopters – by necessity – of designing content for a Shared Space.
What’s key in this NPR diagram is that they recognize they need to feed not only multiple iterations of their content internally, but they need to be aware of how third parties will use their content.
They are also sometimes the third party themselves. During the London Olympics, BBC Sport realized it could not produce and maintain country and competitor information for over 10,000 athletes, 200 countries, 304 events and 30 venues, so it relied on linked data from other institutions. It has embraced the semantic web and the concept of linked data as both a consumer and a well-respected content provider.
But to do that they need metadata… to create structure… so that rules can be applied… making it possible for users, human and machine, to make decisions.
So I add this to the definition to make sure that we don’t only think of how metadata helps us as content owners… but how metadata makes our content available to the web in a far more efficient way than big blobs of words and images, video or audio without context applied.
But why should we care about interoperability?
You should care if you care about Google (not everyone does or needs to).
Last year Amit Singhal introduced Google’s Knowledge Graph, and with it Google’s move toward semantic thinking. Amit coined the phrase ‘Things, not strings’: no longer focusing on strings of keywords, but on the relationships between things.
In this same introduction he said that the goal of Google Search is to Answer, Converse and Anticipate.
The important part of this quote is: information comes to you without you seeking it.
The only way Google can do this is to continue to learn language – and to do that it needs context.
But we know that Google is succeeding, and one can expect they will continue to improve. So this third party matters, because our routes to content are changing.
Our home pages mean less and less. Search is the home page. The web is the CMS – it is the web itself that is moving around our content trying to give it to the person exactly when they need it.
And to make this possible, robots, developers and devices get our content first and decide where it appears for the user, before the user has a chance to weigh in.
The decisions metadata makes now go (way) beyond your CMS. Presumably people want their content to reach the intended audience and not rot in cyberspace. Explain how the knowledge between flu and vaccine differ… and why.
Page source of an NLM PubMed page
The decisions metadata makes now go (way) beyond your CMS. Presumably people want their content to reach the intended audience and not rot in cyberspace. Explain the ‘Recipe’ markup from schema.org
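As a rough illustration of what Recipe markup carries, here is a short Python sketch that builds a schema.org `Recipe` description and wraps it in a JSON-LD script tag. JSON-LD is only one of the syntaxes schema.org supports, and the slide itself may well have shown microdata in the page source instead. The recipe values and the `to_jsonld_script` helper are made up for the example; the property names (`name`, `recipeIngredient`, `cookTime`, `recipeYield`) are real schema.org Recipe properties.

```python
import json

# A made-up recipe described with schema.org's Recipe type.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Chicken soup",
    "recipeIngredient": ["1 whole chicken", "2 carrots", "1 onion"],
    "cookTime": "PT1H30M",          # ISO 8601 duration: 1 hour 30 minutes
    "recipeYield": "6 servings",
}


def to_jsonld_script(data: dict) -> str:
    """Wrap a metadata dictionary in the script tag used to embed JSON-LD in a page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_jsonld_script(recipe))
```

A robot that reads this no longer has to guess which string on the page is the cooking time and which is the yield; the content has been told what it is.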
Who is going to interpret your content? If you want your content to live outside of your web page (and you do)… you have to give some hints and tips to the robots who ingest it.
Mark P. uses the metaphor ‘Free the content from CMS jail’. I will go further and say we need to launch our knowledge babies into the world by packing them carefully for travel.
But I’m not suggesting, again, that an entire mega-site needs to be considered this way… maybe it’s just about response to the flu, or vaccination, or an emergency response, or a new law… say. I don’t advocate boiling the ocean… but I do think you should recognize where value can be created.
So content can travel alone. Just like the call number on a book gives you instructions so you don’t need a librarian to go get it for you, metadata is the manual for the content, allowing users to mash it up in new ways.
To relinquish control…you must first have it.
One of the major challenges of metadata and taxonomy is the political nature of choosing the terms. Well, you narrow this somewhat by adopting schemas, taxonomies and standards that are already in place. Nearly all of them are extensible, most are industry specific.
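Here is a small sketch of what "adopt, then extend" might look like, using the 15 elements of the Dublin Core element set as the shared vocabulary. The `dc:` prefix convention, the `example:audience` extension field, the sample record, and the `split_record` helper are all assumptions made for illustration, not part of any agency's actual schema.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def split_record(record):
    """Separate standard Dublin Core fields from local extensions."""
    core, extensions = {}, {}
    for key, value in record.items():
        prefix, _, element = key.partition(":")
        if prefix == "dc" and element in DC_ELEMENTS:
            core[key] = value
        else:
            extensions[key] = value
    return core, extensions


# A hypothetical content record: shared terms first, one local addition.
record = {
    "dc:title": "Seasonal flu vaccination guidance",
    "dc:subject": ["influenza", "vaccination"],
    "dc:publisher": "Example Health Agency",
    "dc:date": "2013-10-01",
    "example:audience": "clinicians",
}

core, extensions = split_record(record)
print(sorted(core))    # the fields any Dublin Core consumer can understand
print(extensions)      # the fields only we have agreed on
```

The political argument shrinks to the extension fields; the core terms were settled by someone else long ago.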
This is a group sport where it is in everyone’s interest to work together. Note the creators of schema.org: it’s a joint effort to improve all of their products.
How do we launch the content on its journey? These look easy in a wee bulleted list. But each is very difficult because they mean changing content culture.
This is a little side road away from metadata exactly but gives an idea about how we as content owners need to help our clients start thinking in parts not pages.
Developers have long used a domain model. Domain-driven design is borrowed from software engineering: you define the individual entities in a subject and the relationships between them. What does my content talk about, and how does it join up in the ‘real world’? This is a major shift from hierarchical taxonomic thinking, drilling down from broad to narrow… now we want to think in mental models, maps of relationships. This shift alone will help content owners think outside the page and consider where and how their content might travel… where they want it to travel, how they might be able to pull content together quickly across agencies, or how media and search engines might find this information more quickly.
The BBC is a publicly funded public service broadcaster.
The web has always been about sharing knowledge, making new connections, reaching new audiences, viral information. The content and its audience have always had to ‘connect’. The difference now is that other agents are making the connections for the user as opposed to them actively seeking. Content is being ‘presented’ as an answer, a dialogue, an anticipated need.
They won’t bother going from agency to agency if they are confident in the information they get from other sources (Wikipedia, WebMD, Mayo Clinic). Adopting a metadata strategy that focuses on the ‘shared space’ of the web, regardless of backend technology or content management system, would be a major step toward cross-agency coordination.
Equally, other trusted sites, like Google, will be able to use .gov content to answer users’ questions and / or direct them to the most appropriate .gov content without users ever knowing the home page URL for the agency. If we wanted to coordinate some kind of response, in an emergency situation for example, we’d have many of the pieces in place simply by having adopted this metadata-driven semantic web thinking.
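A toy version of a domain model, sketched in Python, shows the difference between a hierarchy and a map of relationships. The entity names, the `kind` labels, and the predicates are invented for the example; a real model would come out of conversations with the subject experts.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Entity:
    """A 'thing' the content talks about, independent of any page."""
    name: str
    kind: str


@dataclass
class DomainModel:
    # Each relationship is a (subject, predicate, object) triple.
    relations: List[Tuple[Entity, str, Entity]] = field(default_factory=list)

    def relate(self, subject: Entity, predicate: str, obj: Entity) -> None:
        self.relations.append((subject, predicate, obj))

    def neighbors(self, entity: Entity) -> List[str]:
        """Names of everything directly connected to an entity, in either direction."""
        found = set()
        for subject, _, obj in self.relations:
            if subject == entity:
                found.add(obj.name)
            elif obj == entity:
                found.add(subject.name)
        return sorted(found)


flu = Entity("Influenza", "Condition")
vaccine = Entity("Flu vaccine", "Treatment")
agency = Entity("Example Health Agency", "Organization")

model = DomainModel()
model.relate(vaccine, "prevents", flu)
model.relate(agency, "publishes guidance on", vaccine)

print(model.neighbors(vaccine))
```

Nothing here says which page the vaccine content lives on or which agency's site owns it. The model only records what the things are and how they connect, which is what lets the content be pulled together from anywhere.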
We don’t have to do this of course. As we’ve seen, we can move forward without metadata strategy. But the knowledge graph suggests very soon our users may no longer be coming to us, but instead be expecting content to come to them. If that happens, and we aren’t using structured content, it’s possible metadata will make its decisions without us.