How to Implement Robots txt in Your Website

Robots.txt is a text file webmasters create to instruct web robots how to crawl pages on their website. The robots.txt file is part of the robots exclusion protocol (REP), a group of web standards that regulate how robots crawl the web, access and index content, and serve that content up to users. The REP also includes directives like meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as “follow” or “nofollow”).

In practice, robots.txt files indicate whether certain user agents can or cannot crawl parts of a website. These crawl instructions are specified by “disallowing” or “allowing” the behavior of certain (or all) user agents.

BASIC FORMAT:
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
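As a concrete sketch of that format, a minimal robots.txt might block every crawler from one section of the site while leaving the rest open (the `/private/` directory here is a hypothetical example):

```
# Applies to all crawlers
User-agent: *
# Do not crawl anything under /private/
Disallow: /private/
```

An empty `Disallow:` line, by contrast, would permit crawling of the entire site. The file must live at the root of the host (e.g. `/robots.txt`) to be found by crawlers.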

How does robots.txt work?

Search engines have two main jobs:
1. Crawling the web to discover content.
2. Indexing that content so that it can be served up to searchers who are looking for information.
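To see how a well-behaved crawler interprets these rules, you can test them with Python's standard-library robots.txt parser. This is a minimal sketch; the rules and the example.com URLs are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
rules = """
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file's lines; in production you would point
# set_url() at https://your-site/robots.txt and call read() instead.
parser.parse(rules.splitlines())

# Public pages may be fetched; anything under /private/ may not.
print(parser.can_fetch("*", "https://example.com/index.html"))  # True
print(parser.can_fetch("*", "https://example.com/private/a"))   # False
```

`can_fetch()` takes a user-agent string and a URL and answers whether those rules allow that agent to crawl that URL, which is exactly the decision a compliant crawler makes before its first request to a page.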