As well as running server-side scripts (through CGI, mod_perl or ASP), and its many uses that have nothing to do with networks, Perl can be used to write server processes, and client processes too.
"Why do I want to write my own browser?" you will ask. The answer, of course, is that you don't ... but you might well want to write an application to mimic a browser as it collects information from a server using HTTP.
Let's say, for example, that I wish to write a Perl program to convert a price in one currency into another currency, using the current "spot" exchange rates. There's a suitable table available at:
http://www.ecb.int/home/eurofxref.htm
and it's updated daily.
[Like all websites, the ECB site has since changed ;-( ... they now provide an XML file that's specifically intended for applications like this one :-) - it's at
http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml]
Although we could write a program using our own connections and sockets, it's much easier to use Perl's LWP module to do so - here is the sort of result we might get:
$ ecbgrab
AUD 1.693 Australian dollar
BGN 1.951 Bulgarian lev
CAD 1.371 Canadian dollar
CHF 1.478 Swiss franc
CYP 0.576 Cyprus pound
CZK 31.910 Czech koruna
DKK 7.428 Danish krone
EEK 15.647 Estonian kroon
EUR 1.000 European euros
GBP 0.610 Pound sterling
HKD 6.732 Hong Kong dollar
HUF 243.000 Hungarian forint
ISK 88.650 Icelandic krona
JPY 115.660 Japanese yen
KRW 1133.040 South Korean won
LTL 3.453 Lithuanian litas
LVL 0.555 Latvian lat
MTL 0.398 Maltese lira
NOK 7.840 Norwegian krone
NZD 2.068 New Zealand dollar
PLN 3.585 Polish zloty
ROL 27700.000 Romanian leu
SEK 9.188 Swedish krona
SGD 1.585 Singaporean dollar
SIT 222.598 Slovenian tolar
SKK 42.470 Slovakian koruna
TRL 1133000.000 Turkish lira
USD 0.863 US dollar
ZAR 9.931 South African rand
Please enter an amount to convert (e.g. 290.00 GBP) ... 19.99 GBP
into what currency ... NOK
19.99 Pound sterling converts to 256.94 Norwegian krone (GBP to NOK at 12.8533)
$
Here's the program:
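#!/usr/bin/perl
# ecbgrab - fetch the ECB reference rates and convert between currencies
use LWP::UserAgent;
use HTTP::Request;

# Name ourselves to the server; the trailing space makes LWP append
# its own libwww-perl/x.y token
$agent = LWP::UserAgent->new;
$agent->agent("Well House Consultants/$0 ");

# Make up the GET request, submit it, and give up if it fails
$req = HTTP::Request->new(GET => "http://www.ecb.int/home/eurofxref.htm");
$res = $agent->request($req);
die ("could not fetch rates: ".$res->status_line."\n") unless ($res->is_success);
$page = $res->content;

# A global match in list context returns code, value, code, value ...
# which is just what's needed to load a hash
%currencies = ("EUR","European euros",($page =~ />([A-Z]{3})<.*?>\s*<.*?>(.*?)</gs));
%rates = ("EUR",1,($page =~ />([A-Z]{3})<.*?>\s*<.*?>.*?<.*?>\s*<.*?>(.*?)</gs));

# List every currency with its rate against the euro
foreach $c (sort keys %currencies) {
    printf ("%4s %12.3f %s\n",$c,$rates{$c},$currencies{$c});
}

print "Please enter an amount to convert (e.g. 290.00 GBP) ... ";
chomp ($yousaid = <STDIN>);
if ($yousaid) {
    die ("invalid entry\n") unless(($amount,$incurr) = ($yousaid =~ /(.*?)\s*([A-Z]{3})$/)) ;
    die ("not a known currency\n") unless ($rates{$incurr});
    print "into what currency ... ";
    chomp ($yousaid = <STDIN>);
    die ("invalid entry\n") unless(($outcurr) = ($yousaid =~ /([A-Z]{3})$/)) ;
    die ("not a known currency\n") unless ($rates{$outcurr});

    # All rates are per euro, so convert via the euro: divide by the
    # incoming currency's rate, multiply by the outgoing currency's rate
    $becomes = $amount / $rates{$incurr} * $rates{$outcurr};
    $erate = 1.0 / $rates{$incurr} * $rates{$outcurr};
    printf ("%.2f %s converts to %.2f %s (%s to %s at %.4f)\n",
        $amount, $currencies{$incurr},
        $becomes, $currencies{$outcurr},
        $incurr, $outcurr, $erate);
}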
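The conversion step works because every rate in the table is quoted per euro: dividing by the incoming currency's rate turns the amount into euros, and multiplying by the outgoing currency's rate turns those euros into the currency you want. Here is a minimal sketch of that arithmetic on its own, using the rounded GBP and NOK figures from the sample listing. The convert subroutine is our own name, used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Units of each currency per one euro, copied from the rounded
# three-decimal figures in the sample listing
my %rates = (EUR => 1, GBP => 0.610, NOK => 7.840);

# convert() is a hypothetical helper: amount -> euros -> target currency
sub convert {
    my ($amount, $from, $to) = @_;
    die "unknown currency\n" unless $rates{$from} && $rates{$to};
    return $amount / $rates{$from} * $rates{$to};
}

printf "%.2f GBP is %.2f NOK\n", 19.99, convert(19.99, 'GBP', 'NOK');
printf "cross rate GBP to NOK: %.4f\n", convert(1, 'GBP', 'NOK');
```

With these rounded rates the answer comes out at 256.92 rather than the 256.94 in the sample run, presumably because the program itself works from the unrounded figures on the page.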
We've used the LWP::UserAgent module from the CPAN. Browsers are known as "User Agents", in case you were wondering. We've given the User Agent a program name so that the server knows what type of agent (i.e. what make or model of browser) we are, and we've then made up a GET request and submitted it.
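A request can fail for all sorts of reasons (server down, page moved, network timeout), so it is worth checking the response before using its content. The sketch below wraps the same steps in a subroutine that does so. The agent name "ExampleGrabber/0.1", the 15 second timeout and the fetch_page subroutine are our own assumptions for illustration, not part of the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# "ExampleGrabber/0.1" is a made-up product name; the trailing space
# asks LWP to append its own libwww-perl/x.y token
my $agent = LWP::UserAgent->new;
$agent->agent("ExampleGrabber/0.1 ");
$agent->timeout(15);    # give up on a silent server after 15 seconds

# fetch_page() is a hypothetical helper: it returns the page body, or
# dies with the server's status line if the request failed
sub fetch_page {
    my ($url) = @_;
    my $req = HTTP::Request->new(GET => $url);
    my $res = $agent->request($req);
    die "GET $url failed: " . $res->status_line . "\n"
        unless $res->is_success;
    return $res->decoded_content;
}

# Only touch the network when a URL is given on the command line
if (@ARGV) {
    my $page = fetch_page($ARGV[0]);
    print length($page), " characters received\n";
}
```

Run it with a URL as its only argument; with no argument it sets up the agent and exits without contacting any server.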
There are other modules, such as HTML::Parser, available to help you parse the response, although in this particular case the response is in a straightforward enough format that we've just used regular expressions for the job.
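The same regular expression approach carries over to the XML feed mentioned earlier. The sketch below assumes the feed lists each rate as a Cube element with currency and rate attributes. The sample text is hand-written in that assumed shape, with a made-up date and rates borrowed from the listing above, so treat it as an illustration rather than real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-written sample in the assumed shape of the feed; the date and
# the rates are illustrative only
my $xml = <<'END_XML';
<Cube>
  <Cube time='2000-01-01'>
    <Cube currency='USD' rate='0.863'/>
    <Cube currency='GBP' rate='0.610'/>
    <Cube currency='NOK' rate='7.840'/>
  </Cube>
</Cube>
END_XML

# As in the main program, a global match in list context returns
# (code, rate, code, rate, ...), which loads a hash directly
my %rates = (EUR => 1,
    $xml =~ /<Cube\s+currency=['"]([A-Z]{3})['"]\s+rate=['"]([\d.]+)['"]/g);

printf "%s %.3f\n", $_, $rates{$_} for sort keys %rates;
```

A regular expression is only reasonable here because the feed is machine-generated and regular; for less predictable markup a proper parser is the safer choice.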
A NOTE OF CAUTION
Web servers were designed to supply information to browsers at the request of human users. Such accesses are relatively sporadic in computing terms, even for an enthusiastic user, so that lots of users can all be accessing the same web site in the same period of time and the web server can cope.
If you write a client program that reaps a very large number of pages as fast as it can, it's quite possible that you'll overload the server or the internet connection to it, and your effort may be seen as unwelcome - it may even be classified as a "denial of service" attack.
There are two rules to note if you are going to be looking for many pages, or if you are going to be looking regularly.
First, you should look at a file called robots.txt, which should be at the top level of the web server (for example http://www.example.com/robots.txt); this file contains information placed there by the web site administrator, and tells robots where they are not welcome.
Second, if you require more than one page from a server, you should pause between pages to give other users a chance of a look in. Chances are that if you have a major robotic program, you'll be looking at pages on many different sites, so you don't actually have to slow your program down - just grab pages from each site in turn.
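Both rules can be sketched in a few lines. The example below uses the WWW::RobotRules and URI modules from the CPAN to filter a list of pages against robots.txt, then orders what is left so that the sites are visited in turn. The robot name, the example.com and example.org addresses, and the robots.txt contents are all made up for illustration; a real robot would fetch each site's robots.txt before parsing it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::RobotRules;
use URI;

# Hypothetical robot name and hand-written robots.txt content
my $rules = WWW::RobotRules->new('ExampleGrabber/0.1');
$rules->parse('http://www.example.com/robots.txt',
              "User-agent: *\nDisallow: /private/\n");
$rules->parse('http://www.example.org/robots.txt',
              "User-agent: *\nDisallow: /cgi-bin/\n");

my @wanted = qw(
    http://www.example.com/a.html
    http://www.example.com/private/b.html
    http://www.example.com/c.html
    http://www.example.org/d.html
    http://www.example.org/e.html
);

# Rule one: drop anything the site administrator has ruled out
my @permitted = grep { $rules->allowed($_) } @wanted;

# Rule two: queue the pages per host, then take one from each host in
# turn so that no server sees back-to-back requests
my (%queue, @hosts);
for my $url (@permitted) {
    my $host = URI->new($url)->host;
    push @hosts, $host unless $queue{$host};
    push @{ $queue{$host} }, $url;
}
my @plan;
while (grep { scalar @{ $queue{$_} } } @hosts) {
    for my $host (@hosts) {
        push @plan, shift @{ $queue{$host} } if @{ $queue{$host} };
    }
}

print "$_\n" for @plan;
```

A real robot would then fetch the pages in the order given by @plan, calling sleep between requests whenever the same host comes up twice running. LWP::RobotUA, a relative of LWP::UserAgent, is worth a look too: it consults robots.txt and enforces a delay between requests to the same server for you.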