who we are: we work for BBCi Search in the New Media department. we are going to talk about search, search engines and metadata, and how we can help you generate more traffic for your sites via search engines
what are search engines? there are two types. 1. a spider (also called a crawler or robot) indexes the pages it visits on the web, then follows links on those pages to find other pages to index. in that way, because of the interconnected nature of the web, the index can in theory be very thorough - some cover over 3 billion documents. examples include google, altavista, teoma & lycos
2. the second type are directories. human editors decide whether to include your site, and where to put it within a hierarchy of subjects organised by topic. we do this ourselves at BBCi with WebGuide, and with the taxonomy that governs the BBCi Best Links appearing on search. increasingly, search engines are struggling to make money from advertising, and have introduced programs to monetise search results by charging for inclusion, placement and click-throughs. it is strict BBC policy never to pay for inclusion or placement.
so what is the user experience of search? here are some comments from people we saw in a recent round of competitive user testing - people get confused, frustrated, and have that nagging feeling that the information they want is out there, but that search engines can sometimes be an obstacle rather than a help
this is a quote from one of the chief scientists at a major search corporation, inktomi. and it's quite mind-boggling when you think about it - users sit down in front of a computer wanting information about something specific, type in one or two words, and expect the search engine to understand the context.
why is this important to BBCi? around one and a half million visits a week to the BBCi site come from search engines. very often these are people who would not have thought of coming to the BBC site to look for the information they want
it's our job to make sure that your content is being presented to them when they are typing those one or two words. so now i am going to talk a little bit about the key metadata elements - the title, the description, and the less useful keywords tag
here is the BBCi Communicate homepage - you can see at the top the source code for the <title> tag. It is not immediately obvious where it appears on the page
it is also displayed somewhere much more important - as this screen grab from Google shows, the <title> tag is the headline nearly every search engine will give your pages. for that reason every page published on the BBCi site should have a unique title tag. the required format is BBC hyphen site name or section hyphen unique detail.
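as a sketch, a title tag following that format might look like this (the site name and detail here are invented for illustration):

```html
<!-- required format: BBC - site name or section - unique detail -->
<title>BBC - Communicate - Chat guidelines and house rules</title>
```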
the next metadata element is the description tag. this is where you should write a snappy 25-30 word description of your site. every page must have a description. it is perfectly acceptable for all the pages from a section to share the same description.
it's very easy to overlook the description tag, as it is never visible to the human eye on the page. however, many search engines (like MSN and AltaVista) display the content of the tag underneath the title of the page in their results. that is why it is so important for the description to be both informative and appealing to the user.
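as a sketch, a description tag might look like this (the wording is invented for illustration):

```html
<!-- a snappy 25-30 word description, shown under the title in many search results -->
<meta name="description" content="Get in touch with the BBC and with other viewers and listeners: chat, message boards and email from BBCi Communicate, your space to have your say.">
```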
if you look at a search engine results page you will see that you are competing for attention with banner ads, proprietary content, and sites that have paid for placement.
all you have are those 25-30 words to sell your site and make the user click on the link
the last piece of the metadata puzzle is the keywords tag. and i’ve left it last because it is very much least. people can spend a lot of time cramming their keywords tag with every conceivable word related to their site, with mis-spellings, capitalisation and a host of variations. but ask yourself: if you were running a search engine, what would you use to index the material? the words that appear on the page? the title of the document? the filename? or a bunch of arbitrary words stuffed into a metadata tag with the sole aim of boosting rankings? hardly any search engines pay attention to the keywords tag any more, and you are more likely to cause harm by doing it badly than to achieve higher rankings by doing it well - as the following example shows.
so the ideal we are working towards is this: a user types the keywords relevant to your content into a search engine, and your content is returned. what will be displayed is your title tag, your metadata description tag (on most search engines), and your url. all those things are within your control, so to get that traffic you need to make sure the result is attractive to the user and makes them want to click through to your site
but metadata is not the only factor that makes a site search friendly. the way a site is structured or built is crucial to whether it can be reached by search engines, and consequently, whether it can be reached by our audience
for example search engines will not index certain types of content:

dynamic content - search engines do not fill in forms or choose options. if your content is only accessible via an interactive choice, it will not be indexed. you need to make a static site map of the dynamic content for it to be indexed. someone pointed out to me that this kind of defeats the point of making dynamic content. fair enough, but making content that can never be found defeats the point of making content, full stop. and the process of generating a static section is child's play to automate compared to building a dynamic content management system

drop-down menus - search engines do not follow links from drop-down menus. if the only way to get at your news archive is via a drop-down menu then the archive material is effectively invisible to search, internally and externally

no incoming links - similarly, if you have content on the live servers but nothing links to the pages, there is no way for any search engine, internal or external, to reach the content. search engines do not know or guess urls, they follow links

headlines written in graphics - all graphics are invisible to search. if you have made a wonderful graphic banner for the notting hill carnival and use it on every carnival-related page, but never mention the words notting hill carnival in text on those pages, they will not be deemed relevant to the notting hill carnival by a search engine, because the text doesn't mention them

flash - search engines do not index flash. the trick here is to make HTML containers or interstitials that have indexable content. for example, if you have a flash version of the arcade classic asteroids, what you want is an HTML page with the blurb and instructions and a launch game button, rather than simply having a launch game button on the homepage with all the blurb and instructions contained within the flash movie.
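as a sketch, an HTML container page for that flash game might look like this (the filenames and wording are invented for illustration):

```html
<html>
<head>
  <title>BBC - Games - Asteroids</title>
  <meta name="description" content="Play our flash version of the arcade classic Asteroids. Pilot your ship, dodge the rocks and rack up a high score.">
</head>
<body>
  <h1>Asteroids</h1>
  <!-- the blurb and instructions live in indexable HTML text, not inside the flash movie -->
  <p>Blast the asteroids before they smash your ship. Use the arrow keys to steer and the space bar to fire.</p>
  <!-- the flash movie itself is launched from here -->
  <a href="asteroids_game.html">Launch game</a>
</body>
</html>
```

the point is that everything a search engine needs to judge relevance sits in plain text on the container page; the flash movie is just one click further in.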
it's a good rule of thumb that if it is good for search it is good for usability, and therefore good for your users. effectively search engines are the dumbest, most disadvantaged, least experienced users on the internet. all they do is render a page as text only, index the contents, then move on, following the links. they don't look at images. they don't process javascript or flash. they don't press buttons or fill in forms. so:

use simple text navigation - it is also good for humans on a slower connection, or people with visual impairment

link to all your content - you might know you have a news archive going back to 2000, but unless you clue in your users they won't even think it is possible to search for archive material. site maps may not be aesthetically appealing, but if you have a link from your homepage to a site map, and your site map is comprehensive, within two clicks both your users and search engines have access to all of your content

use obvious titles, not branding - "up the latics!" may make sense as consistent branding across your news and messageboards, but it isn't going to be picked up by people searching for "oldham athletic news", or by novices to your site. unless people know the branding they won't be getting the pages in their search engine results

text browsers and html - put your pages through a text-only browser or an HTML validator and see if they pass. not only will you be complying with equal opportunities legislation, you will be future-proofing your archives
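as a sketch, simple text navigation with a site map link might look like this (the page names are invented for illustration):

```html
<!-- plain text links: readable by search engines, text-only browsers and screen readers alike -->
<p>
  <a href="index.html">Home</a> |
  <a href="news.html">News</a> |
  <a href="archive.html">News archive</a> |
  <a href="sitemap.html">Site map</a>
</p>
```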