Setting the right keywords

Picking the right keywords is important. Don't just use the terms you think people are likely to search for if they aren't actually relevant to your page content.

Spiders will cross-reference everything they read in a page to see how well things match up. If they don't match, your page won't score highly, and it may be marked down further if it looks like you've attempted to cook the books.

Look through the text intended for your pages and check that it's clear and descriptive. Then see which words seem to be natural keywords that you could pick out for use elsewhere. Once you have a set of candidates, see whether your preferred search engine sites have ways of checking the popularity of keyword searches. If only a handful of people ever search for a specific word, it might not be worth promoting it specifically. On the other hand, if searching for that word doesn't bring up many results, then including it could mean that you feature prominently when someone does look for that word. There are also commercial keyword tools: services that use a number of processes, including thesaurus lookups and lateral-connection tricks, to come up with sets of keywords likely to work hard for your site.

You can put simple strings of keywords into the 'keywords' meta tag, separated by commas. The 'description' meta tag should read more smoothly, as it may be shown to users in the search results display. If it reads as a puffed-up bit of PR, however, it will cost you dearly in directory listings and hurt the click-through rate from search result listings. A measure of promotion isn't fatal, but make sure you describe the page objectively, clearly and concisely; 25 words is a recommended maximum. Do note, however, that meta tags aren't all-powerful: some search engines do look at them, but Google's spiders almost entirely ignore them, whether they're present or not.
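As an illustration, both tags sit in the head section of a page; the site and keyword values here are hypothetical:

```html
<head>
  <title>Font Foundry - Free and Commercial Typefaces</title>
  <!-- Simple comma-separated keyword list (hypothetical values) -->
  <meta name="keywords" content="fonts, typefaces, typography, free fonts">
  <!-- A concise, objective description; 25 words or fewer is recommended -->
  <meta name="description" content="A library of free and commercial
    typefaces, with previews, licensing details and download links.">
</head>
```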

Once you're armed with your ideal keywords, you may want to rethink some of the content for your pages. Having relevant keywords close together in the text as well as not too far from the top of the page can improve your standing. You shouldn't mess up the grammar and structure of your text just to move keywords about, but careful tweaks can prove helpful.

While looking at the content of your pages, whether you're still in the planning stages or working with an existing site, remember that the upper area of a page (sometimes referred to as the 'above the fold' section) is the most effective in terms of human viewing. This also goes for spiders to an extent; although the level of importance given to this will vary, the text in the earlier parts of pages is regarded as being more important than what comes later.

There's another area spiders pay attention to but which many people overlook: the text included within the link tags in your pages. Put simply, if you have 'click here for information on fonts' and the words 'click here' are the actual link, then the link text is, in a word, uninspiring. Remember that the text wrapped in the link code is effectively highlighted for the spiders; picking the right words for the link will give them more to work with.
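A quick sketch of the difference (the URL is hypothetical):

```html
<!-- Uninspiring: the link text says nothing about the destination -->
<a href="/fonts/info.html">Click here</a> for information on fonts.

<!-- Better: the link text itself carries the relevant keywords -->
Read our guide to <a href="/fonts/info.html">choosing and using fonts</a>.
```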

You can also set a separate title attribute for links (for example, title="title data"), images and other elements, another useful place to store relevant keywords and descriptions. Some browsers show this as tooltips, but it's also useful for spiders. The result of this work should be slightly better results in the index, and every bit helps.
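For example, a title attribute can be added to links and images alike; the file names and attribute values here are illustrative:

```html
<a href="/fonts/serif.html" title="Serif typefaces for print and screen">
  Serif fonts</a>
<img src="garamond.gif" alt="Garamond sample"
  title="Garamond, a classic serif typeface">
```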

Many people believe frame-based sites can't be indexed. While this kind of site causes serious problems for some spidering techniques, most of the big search sites will take things in their stride - to an extent, at least. However, if you don't have valid content in the 'noframes' section of your frameset documents (the HTML documents that define the frame areas that show your pages), then you're unlikely to get very far. Spiders coming into the top level of your site will be served the frameset page. Some will then go on to pull down the referenced frame pages, but others will be stymied by the unhelpful message that the viewer needs to 'use a frames-capable browser to view the site'. Instead, use your site production tool to edit the noframes tag to contain useful information. Feel free to design a complete alternative page with graphics and so on, but be aware that any human visitor using a browser not capable of handling frames won't be able to do much in the way of CSS or JavaScript either.
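A sketch of a frameset document with useful noframes content, rather than the usual unhelpful message (file names are hypothetical):

```html
<frameset cols="25%,75%">
  <frame src="nav.html" name="navigation">
  <frame src="main.html" name="content">
  <noframes>
    <body>
      <h1>Font Foundry</h1>
      <!-- Real, indexable content plus links the spider can follow -->
      <p>Browse our <a href="main.html">typeface library</a> or use the
         <a href="nav.html">site index</a> to find fonts, licensing
         details and downloads.</p>
    </body>
  </noframes>
</frameset>
```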


FAQ pages

Because spiders can have problems with some site content, it's worth adding a number of static information pages that sit on the server, linked to from key pages, waiting to be indexed. Consider what sort of content you can offer that doesn't have to be built on the fly all the time. A set of frequently asked questions is a good starting point, as it's useful to the human reader and content-rich for indexes. Even if everything else is database-driven, keep the heart of your FAQ pages static, and your search engine ranking should improve.


Using the Robot protocols

If you have parts of your site that you don't want indexed for some reason - for example, your cgi-bin folder or any equivalent - then you can use the robots exclusion protocol to let these programs know where not to go. This is done in a text file, which must be called robots.txt, and placed at the root level of your site directory, where your home index page lives. The first line in this file should begin User-agent:. In most cases, this would be followed with an asterisk, meaning all robots rather than just named ones. The next lines will list each directory or specific page you want left alone. Begin each line with Disallow: then the relative address - for example Disallow: /cgi-bin/.
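Putting those rules together, a complete robots.txt file might read as follows (the directory and page names beyond cgi-bin are examples):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
Disallow: /drafts/old-page.html
```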

The same effect can be achieved by putting a 'robots' meta tag into the head of a page. Use a paired combination of index or noindex with follow or nofollow in the content data. For example, noindex,nofollow tells spiders not to index the page and not to follow any links they might find. The full tag would look like this: <meta name="robots" content="noindex,nofollow">. You don't need top-level admin access to the server to set this. However, not all spiders pay attention to this meta tag instruction.

Even if you don't want to exclude search engines from any part of your site, you should still consider adding a robots.txt file at the top level. Spiders request one when they arrive at a site, and while they don't mind not finding it, each failed request will show up as a 'page not found' error in your web server logs.

Robots file


As long as you have admin access to the top level of your Web site's directory structure, you can upload your own robots.txt file to instruct search site spiders where they are and are not allowed to go. This is the preferred approach, although it isn't available if your site lives within a subdirectory of a domain.

Robots meta tag


The alternative approach, putting robot instructions into a 'robots' meta tag, is more flexible, although not all robots pay attention to this. Use index or noindex paired with follow or nofollow to tell robots what to do with each page as they get there.

Be Found: Designing Findable Sites (3)