Until recently I was under the impression that search engine spiders conformed to “sane” standards. If a page returns an HTTP status code of 404, you would think that the page or link would be removed from the index.
Seemingly this is not the case…
I’ve now been advised that I should try using a 410 status code instead.
There are some interesting differences between the two HTTP responses:
10.4.11 410 Gone
The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.
The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server’s site. It is not necessary to mark all permanently unavailable resources as “gone” or to keep the mark for any length of time — that is left to the discretion of the server owner.
The good old 404 response code, meanwhile, is defined as:
10.4.5 404 Not Found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
That’s a bit of a head wrecker.
I suppose if you know exactly which pages you want to remove, then a 410 response code is easy enough to generate, but in my case it isn’t. I’d either have to change the 404 page to a 410 for the entire site, or simply wait it out.
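For what it's worth, here is a rough sketch of the "know exactly which pages" case: keep a list of deliberately removed URLs and answer 410 for those, falling back to 404 for everything else. The paths in `GONE_PATHS`, the helper name `status_for_missing` and the tiny WSGI app are all made up for illustration, not how my site actually works.

```python
from http import HTTPStatus

# Hypothetical list of URLs that were removed on purpose.
GONE_PATHS = {"/old-promo", "/staff/former-employee"}


def status_for_missing(path, gone_paths=GONE_PATHS):
    """Pick the status code for a path that no longer resolves to a page."""
    # Ignore any query string and trailing slash before comparing.
    normalized = path.split("?", 1)[0].rstrip("/") or "/"
    if normalized in gone_paths:
        return HTTPStatus.GONE       # 410: known to be permanently removed
    return HTTPStatus.NOT_FOUND      # 404: nothing is known about this path


def app(environ, start_response):
    """Minimal WSGI app that treats every request as a missing page."""
    status = status_for_missing(environ.get("PATH_INFO", "/"))
    start_response(
        f"{status.value} {status.phrase}",
        [("Content-Type", "text/plain")],
    )
    return [status.phrase.encode("utf-8")]
```

The catch is the same as above: this only helps if the list of removed URLs can actually be written down. Anything not on the list still gets a plain 404.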






