The Magento configuration includes settings to generate and manage instructions for web crawlers and bots that index your site. The instructions are saved in a file called robots.txt that resides in the root of your Magento installation. These directives are recognized and followed by most search engines.
By default, the robots.txt file that Magento generates instructs web crawlers to avoid indexing parts of the site that contain files used internally by the system. You can use the default settings, or define custom instructions for all search engines or for specific ones. There are many articles online that explore the subject in detail.
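To illustrate the difference between instructions for all crawlers and instructions for a specific one, here is a minimal sketch that uses Python's standard urllib.robotparser module. The rules and the crawler name "ExampleBot" are made up for illustration and are not part of Magento's defaults.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every crawler is kept out of /customer/, and one
# named crawler ("ExampleBot", a made-up name) is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /customer/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory text, no network
    return parser.can_fetch(user_agent, path)


for agent in ("ExampleBot", "OtherBot"):
    for path in ("/catalog/", "/customer/account/"):
        print(agent, path, is_allowed(ROBOTS_TXT, agent, path))
```

A crawler uses the group that names it if one exists, and otherwise falls back to the `User-agent: *` group.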
Example: Custom Instructions
Allows Full Access
User-agent: *
Disallow:

Disallows Access to All Folders
User-agent: *
Disallow: /

Default Instructions
Disallow: /lib/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /var/
Disallow: /catalog/
Disallow: /customer/
Disallow: /sendfriend/
Disallow: /review/
Disallow: /*SID=
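The effect of the custom-instruction cases named above can be checked offline. The sketch below is an illustration only: it assumes Python's standard urllib.robotparser module, a made-up crawler name ("AnyBot"), and a shortened set of folder rules. The wildcard lines (/*.php$ and /*SID=) are left out because this parser does plain prefix matching and does not interpret * or $.

```python
from urllib.robotparser import RobotFileParser

RULE_SETS = {
    # An empty Disallow value blocks nothing.
    "full access": "User-agent: *\nDisallow:\n",
    # A single slash blocks every path on the site.
    "no access": "User-agent: *\nDisallow: /\n",
    # A few folder rules; wildcard rules are deliberately omitted here.
    "folder rules": (
        "User-agent: *\n"
        "Disallow: /lib/\n"
        "Disallow: /var/\n"
        "Disallow: /customer/\n"
    ),
}


def can_crawl(rule_set: str, path: str, user_agent: str = "AnyBot") -> bool:
    """Return True if the named rule set lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(RULE_SETS[rule_set].splitlines())
    return parser.can_fetch(user_agent, path)


for name in RULE_SETS:
    for path in ("/about-us", "/var/log/"):
        print(f"{name:12} {path:10} {can_crawl(name, path)}")
```

Each Disallow value is treated as a path prefix, so `/var/` also covers everything beneath that folder.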
On the Admin sidebar, go to Content > Design > Configuration.
Find the Global configuration in the first row of the grid and click Edit.
Global Design Configuration
Scroll down, expand the Search Engine Robots section, and do the following:
Search Engine Robots
Set Default Robots to one of the following:
INDEX, FOLLOW: Instructs web crawlers to index the site and to follow the links on its pages.
NOINDEX, FOLLOW: Instructs web crawlers to avoid indexing the site, but to follow the links on its pages.
INDEX, NOFOLLOW: Instructs web crawlers to index the site, but not to follow the links on its pages.
NOINDEX, NOFOLLOW: Instructs web crawlers to avoid indexing the site, and not to follow the links on its pages.
If needed, enter custom instructions into the Edit Custom instruction of robots.txt file box. For example, while a site is in development, you might want to disallow access to all folders.
To restore the default instructions, click Reset to Default.
When complete, click Save Configuration.
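After the configuration is saved, the Default Robots value can be checked against a rendered storefront page. The following is a minimal sketch, assuming the setting is emitted as a standard robots meta tag in the page head. In that tag, INDEX/NOINDEX controls whether a page may be indexed, and FOLLOW/NOFOLLOW controls whether the links on it are followed. The class name, function name, and sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr_map.get("content", "").split(","):
            if token.strip():
                self.directives.add(token.strip().upper())


def robots_policy(html: str) -> dict:
    """Return whether the page may be indexed and its links followed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    # With no directive present, crawlers default to indexing and following.
    return {
        "index": "NOINDEX" not in parser.directives,
        "follow": "NOFOLLOW" not in parser.directives,
    }


# Hypothetical page head, as it might look with NOINDEX, FOLLOW selected.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="NOINDEX,FOLLOW"/></head>'
    "<body></body></html>"
)
print(robots_policy(SAMPLE_PAGE))
```

In practice the HTML would come from a real storefront page. The sample string keeps the sketch runnable without network access.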