# robots.txt file for www.wellho.info
# This sample file is written to the "robots exclusion protocol"
# or "robots exclusion standard". Well-behaved robots (that's
# all the important ones!) use this file to check where they are
# unwelcome ... and they should then only crawl / use your other
# pages.
#
# robots.txt file for www.wellho.net and www.wellho.co.uk
# See
#   http://en.wikipedia.org/wiki/Robots.txt
#   http://www.robotstxt.org/robotstxt.html
# and checker at
#   http://tool.motoricerca.info/robots-checker.phtml
#
# Why do you want to exclude certain URLs when the whole point
# of having a web site is to give the public access to the
# information it contains? You'll see in my example that I've
# put a note beside each of the URLs listed.
#
# * I do NOT want search results within our site indexed, as they
#   would just hide the real pages
#
# * There is no point in the search engines trying to index all
#   possible accessibility combinations
#
# * CGI program outputs differ every time - no point in indexing them
#
# * The "happens" directory is our staff short cuts - not really a place
#   for new visitors to land!
#
# * The unique.html file is automatically generated from all the other
#   pages on our site and contains a list of possible spelling mistakes on
#   other pages - NOT what we want to be indexed under!
User-agent: *
Disallow: /docdir/
Disallow: /cgi-bin/                 # Disallow CGI programs
Disallow: /net/unique.html          # Unique words
Disallow: /happens/                 # Our Staff Short Cuts
Disallow: /resources/mywellho.html  # Accessibility Options
Disallow: /net/search.php4          # Searches
Disallow: /demo/poc01.php?item      # Also searches
Disallow: /illust/                  # Short cuts to images
Disallow: /net/recents.html         # Pointless to index (and variants)
Disallow: /resources/recents.html   # Pointless to index
Disallow: /net/recents.htm          # Pointless to index
Disallow: /resources/recents.htm    # Pointless to index
Disallow: /net//                    # Suppress recursive pages in /net
Disallow: /resources//              # Suppress recursive pages in /resources
Disallow: /solutions//              # Suppress recursive pages in /solutions
Disallow: /demo//                   # Suppress recursive pages in /demo
Disallow: /short/                   # Suppress Short Tags
Disallow: /share/e107
Disallow: /share/skin
Disallow: /share/zboard
Disallow: /share/component
Disallow: /share/bbs
Disallow: /share/admin
Disallow: /share/modules
Disallow: /resources/maps.htm
Disallow: /net/maps.htm
Disallow: /wiltshire/index.php4
Disallow: /share/index.php4
Disallow: /overview
Disallow: /demo/pflog
Disallow: /archive
Disallow: /wall
Disallow: /pix/x19_                 # Christmas quiz 2019 images
Disallow: /mouth/3272_
# Disallow: /net//maps.html
# Disallow: /net///maps.html
# Disallow: /net////maps.html
# Disallow: /net/////maps.html
# Disallow: /net//////maps.html
# Disallow: /resources//maps.html
# Disallow: /resources///maps.html

User-agent: TurnitinBot
Disallow: /

# Not interested in selling courses or hotel rooms in
# Russian (Yandex) or Chinese (Baidu) ... and these robots
# can impose significant load

User-agent: Yandex
Disallow: /

User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: /

User-agent: AhrefsBot
Disallow: /

# Note that blank lines are NOT allowed within the block!