parm530
5/8/2019 - 6:07 PM

Index, Crawl and List

Differences

  • Crawl: a bot recursively follows every link it finds on your web pages
  • Index: the bot then indexes the links, i.e. saves/caches them so they can be reused in other areas of a website, such as "similar items" or "others have viewed"
  • List: the bot adds the crawled links to the pages it displays in search results

robots.txt

  • In your Rails app, you can tell bots not to crawl and index certain links by adding the links you do not want crawled or indexed to the file public/robots.txt
  • User-agent names the bot a rule applies to (Google, Bing, Yahoo, etc.); to select them all, use *
  • Disallow: / means ignore all links!
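
To see how a bot might read these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name allowed and the /admin and /products paths are made up for illustration; only the "User-agent: *" and "Disallow: /" lines come from the notes above.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Block every link for every bot (the rule from the notes above).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Block only one section; /admin is a hypothetical path.
BLOCK_ADMIN = """\
User-agent: *
Disallow: /admin
"""

if __name__ == "__main__":
    print(allowed(BLOCK_ALL, "Googlebot", "/products/1"))     # False
    print(allowed(BLOCK_ADMIN, "Googlebot", "/admin/users"))  # False
    print(allowed(BLOCK_ADMIN, "Googlebot", "/products/1"))   # True
```

The text inside either string is what would go into public/robots.txt. Keep in mind that the file is only a request: well-behaved bots honor it, but nothing forces a bot to.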