Removing Information from Search Engines
Removing a web page from the local host is only the first step. Some search engines also cache web pages, keeping copies of entire pages or portions of them.
See the Wikipedia article on How Search Engines Work for an explanation of indexing vs. caching.
- Check the major search engines for cached copies of web pages
- Request removal from the cache. Several search engines rely on updates rather than an explicit removal service. That is, a cached page is removed automatically only when the server returns a "404 -- File Not Found" error for it. It can take 7-14 days for material to be removed.
- Follow up to verify that the material has been removed. Metasearch engines (sites that send a search to several search engines at once and report the results) are especially helpful for this step.
- META tags protect content only if a search engine honors them. Do not trust META tags to protect sensitive information.
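The removal and follow-up steps above depend on the server answering with a 404 for the removed page, which can be checked before waiting on the search engines. The following is a minimal sketch, not a definitive tool: the function names are hypothetical, and treating "410 Gone" the same as 404 is an assumption (the text above mentions only 404).

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code the server sends for url."""
    # A HEAD request asks for headers only; some servers reject HEAD,
    # in which case a GET request would be needed instead.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises an exception for 4xx and 5xx responses;
        # the status code is carried on the exception.
        return err.code


def looks_removed(status):
    """True when the status tells a crawler the page no longer exists."""
    # 404 is the code named in the text; 410 is included as an assumption.
    return status in (404, 410)
```

Calling `looks_removed(fetch_status(url))` with the address of the removed page should return True before a removal request is made. A result of False usually means the server is still serving the page or is redirecting to another page.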
- Altavista
- See entry for Yahoo below.
- Archive.org
- See Removing Documents From the Wayback Machine.
- Ask.com
- This site seems to rely on updates to indexing rather than providing a removal mechanism. Some (not all) pages are listed as cached.
- GigaBlast
- Historic cache is listed as a feature. No instructions on content removal were found. Consider emailing Customer Support for assistance.
- Google
Insert META tags, as described in the appropriate section of How can I remove content from Google's index.
Use the URL Removal Request Tool. See How do I use the URL removal request tool?
For this automated process to work, the webmaster must first insert the appropriate META tags into the page's HTML code.
Note: The references in How can I remove content from Google's index explain META tags and suggest that all search engines always honor them. This is not the case. META tags protect content only if a search engine honors them.
- Wayback
- Removing Documents From the Wayback Machine
- Wisenut
- The terms of service state that Wisenut "assumes no responsibility or liability for Third Party Content". Service problems can be brought to their attention via US Mail or a telephone call.
625 Second Street
San Francisco, CA 94107
415-348-7000
- Yahoo! Search
- The help page listed below explains how to use META tags to prevent indexing by search engines that follow these conventions. It claims that material is removed from the cache the next time the site is indexed.
How can I have my web site or web pages removed from the search engine?
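
Several entries above depend on META tags being present in a page's HTML. Before submitting a removal request, it can help to confirm that the tags are actually in the page. The sketch below uses Python's standard `html.parser` module to list the directives found in `<meta name="robots">` tags. The class and function names are hypothetical, and `noindex` and `noarchive` are shown as the conventional directive values. Finding the tags confirms only that they are present, not that any search engine honors them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when written without a value.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for item in attributes.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html):
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample page, for illustration only.
sample = '<html><head><meta name="robots" content="NOINDEX, noarchive"></head></html>'
print(sorted(robots_directives(sample)))  # ['noarchive', 'noindex']
```

An empty result means the page declares no robots META tag, so an automated removal process that depends on the tags is unlikely to succeed.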