Commercial options

Should you consider spending money to get into search listings, or should you save your cash and stick to optimisation techniques?

If your site contains useful information, services, products or resources, and if it's well constructed, with accurate, relevant meta tags, titles, text content, link structures and so on, then by rights it should be included in search engine indexes with no more than the standard prompting.

However, many indexes are so swamped with submission requests that it can take weeks, and in many cases months, for those requests to be processed. This delay is especially galling when you're told that, for a fee, the process can be shortened to just a few days. As a general rule your goal should be to avoid paying for inclusion in search engine directories and indexes, but should you ever go ahead and fork out for speedier inclusion?

This depends entirely on which service is asking for your cash. If it's a key search engine or directory, particularly one whose content influences other services, and if being found easily in this way is important to you or your client, then it's worth considering. In many cases, paying also means your site is checked more frequently. If your content changes regularly and you're concerned about search indexes holding out-of-date copies of your pages, this can be a clear benefit.

However, be wary if you're asked to pay for a service that makes submissions to indexes and directories for you. Virtually all search site organisations dislike automated URL submissions, and many will reject or even block your site if your submission appears to arrive via such methods. The reason is that these are 'spammer' methods, used to send in thousands of site addresses in order to skew search engine results. Because such efforts undermine the usefulness of the search engine indexes, this sort of behaviour is taken very seriously. The Open Directory Project at www.dmoz.org, which is one of the most important directories on the Internet thanks to content-sharing agreements with Google and others, deletes such submissions without warning. They also warn that 'persistent automatic submission may force us to ban you from the dmoz site, so we can provide resources to real human beings'.

Pay-per-click services are offered by a number of companies, with Google's being the best known. Here you set up small ads that are shown when certain keywords are included in a search, and you pay a small fee every time someone clicks through to your site from the host's search result listings. The more popular a keyword is, the more you'll pay for each click. If you hope to generate revenue in some way from your visitors, then this may be worth trying, but do your sums very carefully before you take it on. To take a simple illustration, if a click costs 20p and one visitor in fifty goes on to buy, each sale costs you £10 in click fees alone. You won't generally risk being presented with an unexpectedly large bill, as the total you wish to spend is established up front, but you might find the click-to-sale conversion rate isn't high enough to justify the cost. Have clear, established ways to make a profit from your site before you pay for visits in this way.

The bottom line is that the site must be ready for the spiders before you start inviting them over. You can spend hours submitting your URL to search engine and directory sites around the world, but if the pages aren't spider-friendly, then they're unlikely to rank well at all - and if they aren't particularly human-friendly, then they won't be listed in directories, and real visitors won't stay long even if they do find you in a page of search results.

If you come up with a method you think will trick your favourite search engine spider into ranking you higher than normal, don't use it. The chances are it has already been tried out, spotted and countered. More importantly, if a URL is associated with spamming tricks in any way, its ranking will suffer immediately, and the site may even be dropped entirely. It simply isn't worth the risk.


Dynamic pages

If your site is entirely (or even just largely) dynamic - that is, served via a database rather than being a collection of ready-made HTML documents in folders on the server - then you can run into problems getting it indexed by a spider. This is a particular problem when multiple parameters are used, pulling data from more than one source to create the final result. It's best to avoid referencing multiple data sources for dynamic pages if possible. Another solution is to redesign the way queries are sent to the server and handled there so they don't contain elements such as ?, $ or similar codes, working instead with ordinary URLs (or what appear to be ordinary URLs but are still recognised by the server as carrying query parameters). Finally, consider providing static content where possible to make life easier for the spider-driven indexing process.
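One common way to serve what appear to be ordinary URLs on an Apache server is the mod_rewrite module. The following is a minimal sketch only, assuming mod_rewrite is enabled and a hypothetical products.php script that expects an id query parameter; the lines would go in the same .htaccess file covered in the next section.

# Sketch: map the clean URL /products/123 onto the real
# query-string version /products.php?id=123, so visitors and
# spiders only ever see the clean form.
RewriteEngine On
RewriteRule ^products/([0-9]+)$ /products.php?id=$1 [L]

Links within the site would then use the /products/123 form throughout, so both spiders and visitors work with plain, static-looking paths.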

Using htaccess to redirect visitors properly

Here, we'll show how to control where people go in your site without disturbing automated indexing tools. If you're working with an existing site, it may already have been visited by search engine spiders, so it's best to avoid moving files and folders around or renaming things. Sometimes, however, this is unavoidable. If you simply move or rename things, people trying to follow search links to your site will hit dead links. Spiders will get the same result when they check back, at least until the new site structure has been found.

To avoid this, you need to redirect people automatically when they try to visit the old pages - not by using the 'refresh' meta tag, which is a crude method frowned on by spiders. The professional method uses the standard Apache server redirect command. This goes in a text file called .htaccess (the full-stop at the front of the name is important), which contains one or more lines telling the server what to do when someone asks for certain files. This only works with Apache-driven sites, but that's what runs the vast majority of sites around the world. Work out what redirections you'll need. If you want to point requests for specific files to other ones, list these by name, one per line. If you want to redirect all requests for a directory to another one, list the old directory first, then the directory (or single file) to be used instead.

Don't try to name the file .htaccess on your Mac before uploading; the full-stop at the beginning tells Unix-based systems, including the Mac, that the file should be invisible. Instead, name it htaccess.txt and rename it once it's on the server. For full, if admittedly fairly technical, details on how to use the Apache redirect command and more, visit httpd.apache.org/docs/mod/mod_alias.html#redirect

Set up htaccess instructions


Here, we set up simple htaccess instructions to redirect users. Begin with redirect permanent. This tells the Apache server what to do with the data that follows, and to tell visitors that the change is a permanent one. Then comes the URL someone might request, listed in local form with a leading slash, as in /file.html or /folder. Finally, supply the full replacement URL, including http://. So, to redirect a request for file.html at the top level of your site, use redirect permanent /file.html http://www.site.com/newfile.html, and to redirect a whole folder (called 'folder' here) use redirect permanent /folder http://www.site.com/newfolder.

Upload the htaccess file


Now we'll upload the file and name it correctly. Save the text file as htaccess, then use an FTP tool to upload it to the top level of your Web site's directory structure, alongside your home page's index.html file. Once there, change the file's name to include the full-stop at the beginning (and remove any .txt filename suffix). The Apache server will now check here before serving any pages, and requests for the named items will be sent to the correct replacements immediately.

redirect permanent /file.html http://www.site.com/newfile.html

redirect permanent /folder http://www.site.com/newfolder
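Putting the two steps together, the finished .htaccess file might look like the following sketch; lines beginning with # are comments, which Apache ignores, and www.site.com stands in for your own domain, as in the examples above.

# A single page that has moved: browsers and spiders are told
# the change is permanent, so search indexes can update their links.
redirect permanent /file.html http://www.site.com/newfile.html

# A whole folder that has moved: anything after /folder is appended
# to the new address, so a request for /folder/page.html ends up at
# http://www.site.com/newfolder/page.html
redirect permanent /folder http://www.site.com/newfolder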
