I was asked by a visitor how he could hide certain content (or "data" in his words) from the search engines. This article addresses that question.
The visitor didn't give any reason why he wanted to prevent the search engines from indexing some of his content, but from experience with other people who have asked similar questions, it is probably because he wants some way to share information (eg photos and updates) privately with family and close friends.
Here are some ways to do that, in order of decreasing effectiveness.
The most obvious way of not letting the search engines index your data is of course not to publish it on a website. For example, you can use some other methods to update your family, such as the old traditional method of doing it in person or by phone. Another way is to use email.
That said, the email method may not be appropriate for everyone. For example, not every family member or friend will appreciate receiving a daily dose of photos from you, overflowing their mailbox. You will probably end up on their spam blocklist.
The most practical method is probably to set up a password protected directory (ie, folder) on your website, where only people who know the password can access it. Then put all your private content in that directory.
Since Google and Bing do not have the password to that directory, they will not be able to index its contents or put them in their search engine results.
If you want to do this, you can find instructions on how to password-protect a directory here.
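To see why a password keeps the search engines out, here is a simplified model of the check a web server makes. It assumes the directory is protected with HTTP Basic authentication, which is one common way of doing it; your web host may use something else. The directory name `/private/`, the username, the password and the function names are all made up for illustration, and this is a sketch of the idea rather than something to install on a real site.

```python
import base64
import hmac

# Hypothetical values, chosen only for this illustration.
PROTECTED_PREFIX = "/private/"
USERNAME = "family"
PASSWORD = "change-me"


def basic_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def status_for(path, authorization=None):
    """Return the HTTP status this model server would send for a request."""
    if not path.startswith(PROTECTED_PREFIX):
        return 200  # public content: anyone, including crawlers, can read it
    expected = basic_header(USERNAME, PASSWORD)
    if authorization is not None and hmac.compare_digest(
        authorization.encode("utf-8"), expected.encode("utf-8")
    ):
        return 200  # correct password supplied
    return 401  # no password or wrong password: content is not sent


if __name__ == "__main__":
    # A search engine crawler arrives without any credentials.
    print(status_for("/private/photos.html"))
    # A family member's browser sends the username and password.
    print(status_for("/private/photos.html", basic_header(USERNAME, PASSWORD)))
```

A crawler that requests anything under the protected directory gets a 401 status instead of the page, so there is nothing for it to index.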
If you don't mind any random person on the Internet reading your content, but merely don't want the two big search engines (Google and Bing) displaying it in their results, then another method is to put all the private data in a directory and create a robots.txt file that forbids search engines from indexing that directory.
For example, robots.txt instructions can tell the search engines that they should not access or index anything in a particular directory.
Note that the robots.txt file does not actually stop people or software from accessing your files. The instructions there are merely advisory. However, as far as I know, respectable search engines like Google and Bing will heed them, and not index anything they are asked not to.
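As a rough illustration of such rules, the sketch below places a small robots.txt file in a Python string and uses the standard library's `urllib.robotparser` module to show how a search engine that obeys the file would treat two addresses. The directory name `/private/`, the `example.com` addresses and the function name `is_allowed` are placeholders chosen for this example, not names from the original instructions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "/private/" is a placeholder directory name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if a crawler that obeys ROBOTS_TXT may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/private/photos.html",
        "https://www.example.com/index.html",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

The `User-agent: *` line addresses every crawler, and the `Disallow` line names the directory they are asked to stay out of. Nothing here enforces the request; the code only shows what a well-behaved crawler would decide.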
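You can find details on how to create a robots.txt file here. And if your content is displayed on a web page, you can even take the additional (though probably unnecessary) step of using a robots META tag on that page to forbid indexing. Bear in mind, though, that a search engine that obeys your robots.txt file will not fetch a page in a blocked directory, so it will never see a META tag placed there. The tag only has an effect on pages that the search engines are allowed to read.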
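For reference, a robots META tag is a single line in the HEAD section of a page. The sketch below puts one in a made-up page and uses Python's standard `html.parser` module to check for it, much as you might check your own pages before publishing them. The page content, the class name `RobotsMetaFinder` and the function name `asks_not_to_be_indexed` are invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical page. The META tag in the HEAD asks search engines not to index it.
PAGE = """\
<!DOCTYPE html>
<html>
<head>
<title>Family photos</title>
<meta name="robots" content="noindex">
</head>
<body><p>Private updates go here.</p></body>
</html>
"""


class RobotsMetaFinder(HTMLParser):
    """Collect the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def asks_not_to_be_indexed(html):
    """Return True if the page carries a robots META tag with "noindex"."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    print(asks_not_to_be_indexed(PAGE))
```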
There are other ways as well, such as installing some sort of content management system and creating a private area that is accessible only to people who have an account. However, that requires far more work and technical knowledge than simply password-protecting a directory.
Another thing to note is that the above methods will only prevent the search engines from indexing content that is hosted on your own website. They do not prevent them from getting it from elsewhere. So if your parents create a website, and proudly display your photos there, Google and Bing will happily index them from that source.
This article is copyrighted. Please do not reproduce or distribute this article in whole or part, in any form.