What is robots.txt?
robots.txt is not an HTML file but a plain text file containing a set of instructions that crawlers and search engine bots understand. Search engine crawlers are also called robots, hence the name robots.txt. The purpose of robots.txt is not to stop search engines from crawling your website altogether, but to give the robots instructions to follow. When certain pages of the website should be skipped by search engine robots, we put those instructions in the robots.txt file.
Location of robots.txt
Where should the robots.txt file be placed on your website? This is a common and important question. The robots.txt file goes in the main (root) directory, like this: http://websitename.com/robots.txt
For Blogger: http://blogname.blogspot.com/robots.txt
Blogger by Google generates the robots.txt file automatically, so you need not worry about creating or adding it. Just type http://yourblogname.blogspot.com/robots.txt in your browser to view the contents of your robots.txt file.
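Since robots.txt always sits in the main directory, its address can be worked out from any page of the site. Here is a minimal Python sketch of that idea. The function name robots_txt_url and the sample post URL are made up for illustration; they are not part of Blogger.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep only the scheme and host; robots.txt lives at the root,
    # whatever the path of the page itself is.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical post URL, used only as an example.
    print(robots_txt_url("http://yourblogname.blogspot.com/2013/01/some-post.html"))
```

Pasting the printed address into the browser should show the same file described above.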
The other way to view your robots.txt is through Google Webmaster Tools. Log in there and follow the numbered steps below:
How to generate a fresh copy of the Blogger robots.txt for your Blogspot blog?
This can be done the same way, by logging in to Google Webmaster Tools and following the numbers in the picture:
In the above picture, you can add the rules you would like to apply and then create the robots.txt file as desired.
The robots.txt file structure for Blogger will look like:
User-agent names the search engine bot that the rules apply to, that is, the crawler that will go through your site's pages.
The Disallow option lists the files, URLs, or patterns that the crawler should skip. By default, the /search pages of a Blogger blog are skipped by the crawlers.
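To see how User-agent and Disallow work together, here is a small Python sketch using the standard urllib.robotparser module. The sample file content is an assumption modeled on what Blogger commonly serves (a separate group for the AdSense crawler Mediapartners-Google, plus a general group that disallows /search); your own blog's file may differ. The helper is_allowed and the example URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed sample of a Blogger-style robots.txt; check your own blog's
# file, because the real content may not match this exactly.
SAMPLE_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = SAMPLE_ROBOTS_TXT) -> bool:
    """Return True if the given rules let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blog = "http://yourblogname.blogspot.com"
    # A label page under /search versus an ordinary post (both hypothetical).
    print(is_allowed("Googlebot", blog + "/search/label/seo"))
    print(is_allowed("Googlebot", blog + "/2013/01/some-post.html"))
    # An empty Disallow line means nothing is blocked for that bot.
    print(is_allowed("Mediapartners-Google", blog + "/search/label/seo"))
```

With these sample rules, a general crawler is kept out of the /search pages but may fetch normal posts, while the bot with the empty Disallow line is not blocked anywhere.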
To check which pages Google has already indexed, search for site:yourblogname.blogspot.com on Google.com.
To check the validity of your robots.txt, run it through the Motoricerca robots.txt validator.
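For a quick local check before using an online validator, a few common mistakes can be caught with a short script. The Python sketch below is only a rough illustration and not a replacement for a real validator: the list of known fields and the checks themselves are assumptions chosen for this example, and lint_robots_txt is a made-up name.

```python
# Fields this sketch accepts; an assumption, not an official list.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}
# Fields that only make sense after a User-agent line.
GROUP_FIELDS = {"disallow", "allow", "crawl-delay"}


def lint_robots_txt(text: str) -> list:
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field {field!r}"))
        elif field == "user-agent":
            if not value:
                problems.append((number, "User-agent has no value"))
            seen_user_agent = True
        elif field in GROUP_FIELDS:
            if not seen_user_agent:
                problems.append((number, "rule appears before any User-agent line"))
            if field != "crawl-delay" and value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/'"))
    return problems


if __name__ == "__main__":
    broken = "Disallow: /search\nUser-agent *\nDisalow: /x\nUser-agent: *\nDisallow: search"
    for number, message in lint_robots_txt(broken):
        print(f"line {number}: {message}")
```

An empty result only means none of these simple checks failed, so it is still worth running the file through the validator mentioned above.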