Wanting to get an idea of where our traffic comes from on a typical Friday evening, I took a look at all visiting IPs to one of our websites over a 15-minute period and found the IP address 72.5.230.111, which I didn't recognise. A reverse DNS lookup points me to a domain called site24x7.com:
Dorothy:jan09 grahamellis$ host 72.5.230.111
111.230.5.72.in-addr.arpa is an alias for 111.65-123.230.5.72.in-addr.arpa.
111.65-123.230.5.72.in-addr.arpa domain name pointer monitor.site24x7.com.
Dorothy:jan09 grahamellis$
and a further look at my log file showed me it had been visiting, routinely every 5 minutes, all day. Indeed a look back in my history file showed me that it had been grabbing a copy of the home page at this interval for at least the last couple of weeks - and probably a lot longer, but I went back no further.
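For anyone wanting to run the same check on their own logs, here is a minimal sketch in Python. It assumes the server writes the Apache common or combined log format, which may not match your setup, and the sample lines (times, second IP address, response sizes) are made up purely for illustration. The function names are my own.

```python
import re
from collections import defaultdict
from datetime import datetime
from statistics import median

# Assumption: Apache common/combined log format, for example
# 72.5.230.111 - - [09/Jan/2009:18:00:02 +0000] "GET / HTTP/1.1" 200 5120
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')

def visits_by_ip(lines):
    """Map each client IP to the sorted list of its request times."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ip, stamp = match.group(1), match.group(2)
        visits[ip].append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))
    return {ip: sorted(times) for ip, times in visits.items()}

def median_gap_seconds(times):
    """Median number of seconds between consecutive requests, or None."""
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return median(gaps) if gaps else None

# Invented sample data, not taken from the real log file.
sample = [
    '72.5.230.111 - - [09/Jan/2009:18:00:02 +0000] "GET / HTTP/1.1" 200 5120',
    '10.0.0.7 - - [09/Jan/2009:18:01:40 +0000] "GET /about HTTP/1.1" 200 900',
    '72.5.230.111 - - [09/Jan/2009:18:05:01 +0000] "GET / HTTP/1.1" 200 5120',
    '72.5.230.111 - - [09/Jan/2009:18:10:03 +0000] "GET / HTTP/1.1" 200 5120',
]

for ip, times in visits_by_ip(sample).items():
    print(ip, len(times), median_gap_seconds(times))
```

A visitor whose median gap sits very close to a round figure such as 300 seconds, hour after hour, is almost certainly a robot rather than a person.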
The web site of the 24 x 7 people says "Site24x7 is the easier, faster and more effective way to monitor the uptime and performance of your websites, online services and servers." In essence, their servers run robots that check your machine from a remote location, and gather stats. They offer a free account to monitor one or two sites at intervals of an hour (or less frequently), and paid options for more frequent monitoring.
It's a good idea for the admin of a server to have it monitored in this way.
The problem that I have in this case is that I am the admin and I have not asked for the monitoring to take place.
Automata such as the monitor program are supposed to read the robots.txt file (see robots exclusion standard) from time to time to check that they are welcome, and automata can - in theory - be excluded on a case by case basis. I say in theory because it's a voluntary code. It seems that the Site24x7 computer has enough time on its hands to monitor my site every 5 minutes, but hasn't bothered to read robots.txt for at least a month, so there's little point in me even trying to ask it to desist via that file!
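To show what a well-behaved robot is expected to do, here is a small sketch using Python's standard urllib.robotparser module. The robot name "ExampleMonitor" and the example.com addresses are placeholders of my own; I don't know what user-agent name the Site24x7 robot would respond to, if any.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that shuts out one named robot completely and keeps
# everyone else out of /private/ only. "ExampleMonitor" is a made-up name.
ROBOTS_TXT = """\
User-agent: ExampleMonitor
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the text directly, no network needed

# What a polite robot would ask before each fetch.
for agent, url in [
    ("ExampleMonitor", "http://www.example.com/"),
    ("SomeOtherBot", "http://www.example.com/"),
    ("SomeOtherBot", "http://www.example.com/private/report.html"),
]:
    print(agent, url, parser.can_fetch(agent, url))
```

The check only works if the robot chooses to make it, which is exactly the weakness of a voluntary code.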
Being a voluntary code, there's nothing illegal about what the site24x7 people are doing - but their system is being very rude in putting a small but repeated load on my server without even asking "may I" every so often. And someone's using the service in a way it isn't being sold for. It's great to buy a tool that you can use to look for issues on your own site, and rather less clever to be able to buy a tool that lets you find fault with and analyse someone else's site. I rather dislike the ability that site24x7 is giving person(s) unknown to monitor me.
Would it be practical for the folks who provide this service to check that they're only monitoring sites where they are wanted by the site owner? Yes - they could so easily ask the site owner to put up a specific file with specific content that they could check to validate ownership - Google does this for some services (and, as ever, "Well done Google"). They could certainly check the robots.txt file from time to time (and then respect it), and they could have easy to find information on their web site telling webmasters who come across them unexpectedly, as I did, how they can disable the snooping on their server.
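The "specific file with specific content" idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how Google or Site24x7 actually implement anything; the file naming scheme, the function names and the example.com address are all my own inventions.

```python
import hmac
import secrets
from urllib.request import urlopen

def new_challenge():
    """Return a (filename, token) pair for the site owner to publish."""
    token = secrets.token_hex(16)
    # Hypothetical naming scheme; any hard-to-guess name would do.
    return f"monitor-verify-{token[:8]}.txt", token

def fetch_over_http(url):
    """Fetch a URL body as text; network and HTTP errors propagate."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def owner_has_consented(site_root, filename, token, fetch=fetch_over_http):
    """True only if the published file contains exactly the expected token."""
    try:
        body = fetch(f"{site_root.rstrip('/')}/{filename}")
    except OSError:
        return False  # a missing file or unreachable site counts as "no"
    return hmac.compare_digest(body.strip().encode(), token.encode())

# Demonstration with a stand-in fetcher, so nothing touches the network.
filename, token = new_challenge()
print(owner_has_consented("http://www.example.com", filename, token,
                          fetch=lambda url: token + "\n"))
print(owner_has_consented("http://www.example.com", filename, token,
                          fetch=lambda url: "some other page"))
```

Only someone who can write files to the web server can publish the token, so a monitoring service that ran a check like this before starting would know that the owner, or someone acting for them, had agreed.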
I sent an email to their support line asking (a) how to exclude their service from my server and (b) who has asked for us to be monitored. Since the company specialises in 24 x 7 monitoring, it seems ironic to me that they haven't got back to me yet, and it's been some 12 hours since I wrote.
But secretly, I'm slightly flattered that someone feels it's worth paying even a tiny amount of money to be told when our server's not running.
(written 2009-01-10)