Finding CGI Scripts

What's wrong with most free CGI scripts

by Dave Cross,
Director, Magnum Solutions Ltd

Many people's first experience of Perl comes when they download a free CGI script from the web. In this article Dave Cross discusses why that might be a bad way to start.

This article was originally the lead article on perl.com in January 2002.

Introduction

No matter how much we try to convince people that Perl is a multi-purpose programming language, we'd be deluding ourselves if we didn't admit that the majority of programmers first come into contact with Perl through their experience with CGI programs. People have a small Web site and one day they decide that they need a guest book, a form mail script or a hit counter. Because these people aren't programmers, they go out onto the Web to see what pre-written scripts they can find.

And there are plenty to choose from. Try searching on "CGI scripts" at Google. I found about 2 million hits. The first two were those well-known sites - Matt's Script Archive and the CGI Resource Index. Our Web site owner will visit one of these sites, find the required scripts and install them on his site. What could be simpler? See, the Web is as easy as people make it out to be.

In this article, I'll take a closer look at this scenario and show that all is not as rosy as I've portrayed it above.

CGI Script Quality

An important factor that Google takes into account when displaying search results is the number of links to a given site. Google assumes that if there are a large number of links to a given Web page, then it must be a well-known page and that Google's visitors will want to visit that site first.

Notice that I said "well-known" in that previous paragraph. Not "useful" or "valuable." Think about this for a second. The types of people that I described in the introduction are not programmers. They certainly aren't Perl programmers. Therefore, they are in no position to make value judgments on the Perl code that they download from the Internet.

This means that the "most popular" site becomes a self-fulfilling prophecy. The best known site is listed first on the search engines. More people download scripts from that site, assuming that the most popular site must have the highest quality scripts, and so the popular sites end up becoming more popular.

At no point does any kind of quality control enter into the process.

OK, so that's not strictly true. If the scripts from a particular site just didn't work at all, then word would soon get out and that site's scripts would become unpopular. But what if the problems were more subtle and didn't manifest themselves on all sites? Here is a list of some potential problems:

The fact is, unfortunately, that these kinds of problems are commonplace in the scripts that you can download from many popular CGI script archives. That's not to say that the authors of these scripts are deliberately trying to give crackers access to your servers. It's simply evidence that Perl has moved on a great deal since the introduction of Perl 5 in 1994 and many of the CGI script authors haven't kept their scripts up to date with current practices. In other cases, the authors know only too well how out of date their scripts are and have produced newer, improved versions, but other people are still distributing the older versions.

Setting a Good Example

Although the people who are downloading these scripts aren't usually programmers, there often comes a time when they want to start changing the way a program works and perhaps even writing their own CGI programs. When this time comes, they will go to the scripts they already have for examples of how to write them. If the original script contained bad programming practices, then these will be copied in the new scripts. This is the way that many bad programming practices have become so common among Perl scripts. I, therefore, think that it's a good idea for any publicly distributed programs to follow best programming practices as much as possible.

Script Quality - A Checklist

So now we have an obvious problem. I said before that the people who are downloading and installing these scripts aren't qualified to make judgments on the quality of the code. Given that there are some problematic scripts out there, how are they supposed to know whether they should be using a particular script that they find on the Web?

It's a difficult question to answer, but there are some clues that you can look for that give an idea of how well-written a script is. Here's a brief checklist:

Of course, these rules will have exceptions, but if a script scores badly on most of them, then you might have second thoughts on whether you should be using the script.

nms - A New CGI Program Archive

Having spent most of this article being quite negative about existing CGI program archives, let's now get a bit more positive. In the summer of 2001, a group of London Perl Mongers started to wonder what would be involved in writing a set of new CGI programs that could act as replacements for the ones in common use. After some discussion, the nms project was born. The name nms originally stood for a disparaging remark about one of the existing archives, but we decided that we didn't want that kind of negativity in the name. By that time, however, the abbreviated name was in common usage so we decided to keep it - but it no longer stands for anything.

The objectives for nms were quite simple. We wanted to provide a set of CGI programs which fulfilled the following:

We decided that we would base our programs on the ones found in Matt's Script Archive. This wasn't because Matt Wright's scripts were the worst out there, but simply that they were the most commonly used. We made a rule that our scripts would be drop-in replacements for Matt's scripts. That meant that anyone who had existing data from using one of Matt's scripts would be able to take our replacement and simply put it in place of the old script. This, of course, meant that we had to become familiar with the inner workings of Matt's scripts. This actually turned out not to be as hard as I expected. The majority of Matt's scripts are simple. It's only really formmail, guestbook and wwwboard that are complex.

Sometimes our objectives contradicted one another. We decided early on that part of making the scripts as easy to use as possible meant not relying on any CPAN modules. We forced ourselves to use only modules that came as part of the standard Perl distribution. The reason for this is that our target audience probably doesn't know anything about CPAN modules and wouldn't find it easy to install them. A large part of our audience is probably operating a Web site on a hosted server where they may not be able to install new modules and in many cases won't have telnet access to their server. We felt that asking them to install extra modules would make them far less likely to use our programs. This, of course, goes against our objective of using best programming practices, as in many cases there is a CPAN module that implements functionality that we use. The best example of this is in formmail, where we resort to sending e-mails by talking directly to sendmail rather than using one of the e-mail modules. In these cases, we decided that getting people to use the scripts (by not relying on CPAN) was more important to us than following best practices.
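
To make that trade-off concrete, here is a minimal sketch of what "talking directly to sendmail" can look like using nothing beyond core Perl. It is not the nms formmail code: the sub names (clean_header, build_message, send_message) and the sendmail path are assumptions made for illustration, and a real script would also need to validate the addresses it is given.

```perl
use strict;
use warnings;

# Assumed location; the sendmail binary lives in different places on
# different hosts, so treat this path as a placeholder.
my $SENDMAIL = '/usr/lib/sendmail';

# Replace CR/LF with a space so that form input cannot smuggle
# extra header lines into the message.
sub clean_header {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/[\r\n]+/ /g;
    return $value;
}

# Assemble the headers, a blank line, then the body.
sub build_message {
    my (%arg) = @_;
    my $message = '';
    for my $name (qw(To From Subject)) {
        $message .= "$name: " . clean_header($arg{$name}) . "\n";
    }
    return $message . "\n" . $arg{Body} . "\n";
}

# Pipe the message to sendmail. The list form of open does not go
# through a shell. -t tells sendmail to read the recipients from the
# headers; -oi stops a line containing only "." from ending the message.
sub send_message {
    my ($message) = @_;
    local $ENV{PATH} = '/bin:/usr/bin';    # fixed PATH for the child process
    open my $mail, '|-', $SENDMAIL, '-oi', '-t'
        or die "Cannot run $SENDMAIL: $!";
    print {$mail} $message;
    close $mail or die "sendmail failed: $! (exit status $?)";
    return 1;
}
```

A CGI program would call build_message with values taken from the submitted form and hand the result to send_message. The line-break stripping in clean_header is the part that is easy to forget when working at this level: without it, a visitor could put a newline in the subject field and add headers of their own, which is one of the details an e-mail module from CPAN would normally take care of.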

nms is a SourceForge project. You can get the latest released versions of the scripts from http://nms-cgi.sourceforge.net or, if you're feeling braver, then you can get the leading edge versions from CVS at the project page at http://sourceforge.net/projects/nms-cgi/. Both of those pages also have links to the nms mailing lists. We have two lists, one for developers and one for support questions. There is also a FAQ that will hopefully answer any further questions that you have about the project.

Here is a list of the scripts available from nms:

I should point out that this is very much a "work in progress." While we're happy with the way that they work, we can always use more people looking at the code. The one advantage that Matt's scripts have over ours is that they've had many years of testing on a large number of Web sites.

A Plea for Help

So now we have a source of well-written CGI programs that we can point users to. What more needs to be done? Well, the whole point of writing this article was to ask more people to help. There's always more work to do :-)

While I don't pretend for a minute that these are the only well-written and secure CGI programs available, I do think that the Perl community needs a well-known and trusted set of CGI programs that we can point people to. With your help, that's what I want nms to become.

Dave Cross is the Owner and Managing Director of Magnum Solutions Limited.
He lives in Balham, South West London within walking distance of the best Comedy Club in London. This is a good thing.
You can email him at dave@mag-sol.com

This article is copyright, © 2002, Magnum Solutions Ltd. All rights reserved.