Oh man, that's a long one. But, to be brief, here's the way it works:
Each CGI application on the site, like the one which runs the User's Rides, works with information that it stores using some method. Some web applications work with a real database, like MySQL, Oracle, SQL Server, or DB2. Others that are going to handle much less information might be written to store and work with information in "flat files". Flat files are basically just text files with data stored in them in a predefined format. The application knows how to read that format one chunk at a time and break each chunk into its separate parts as it reads. The User's Rides uses flat files to store everyone's information.
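To make "read a chunk at a time and break it into parts" concrete, here is a minimal sketch in Python (not the Perl the site actually runs). The layout is an assumption for illustration only: one record per line, fields separated by "|", and the field names username, car, and views are made up. The real User's Rides format is not described here.

```python
# Hypothetical flat-file layout: one record per line, fields separated by "|".
# The field names are invented for this example.
FIELDS = ("username", "car", "views")

def parse_record(line):
    """Split one line (chunk) of the flat file into a dict of named fields."""
    parts = line.rstrip("\n").split("|")
    record = dict(zip(FIELDS, parts))
    record["views"] = int(record["views"])  # the view counter is stored as text
    return record

def format_record(record):
    """Turn a record dict back into one line of the flat file."""
    return "|".join(str(record[name]) for name in FIELDS) + "\n"

def read_records(path):
    """Read the whole flat file, one record at a time, skipping blank lines."""
    with open(path) as handle:
        return [parse_record(line) for line in handle if line.strip()]
```

Bumping a view count would then mean reading every record, changing one, and writing the whole file back out, which is exactly the kind of write that causes trouble when two copies do it at once.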
The hurdle with using flat files is that you can only have one instance of the application writing new data, or updating existing data, in the files at once. A potential problem comes into play when a program like this (in this case the User's Rides, which is an application written in Perl5) is run multiple times simultaneously, and each instance tries to change the data file in some way while another instance already has the file open.
For example: Say cad614 is editing his User's Ride by updating his specs, and while he is doing that you (JL1314) are clicking on someone's listing to view their car. Now, as you know, the User's Rides application keeps track of how many times your listing has been viewed. To do this, it updates the main data file containing an index of basic information about each person's listing. Now, let's also say that you click on the link to the ride you want to view, which invokes the program, which in turn runs and wants to open and modify the data file. And, at the same time, cad614 clicks the button to save his new specs. Now, although each car's specs are kept in a unique file for each listing, saving them still updates a few basic items in the main index data file too.
Now here's the dilemma. You've run the program to do something that updates the index file, and at the same time so has cad614. There are now two copies of the User's Rides program running at the exact same time which both want to WRITE information to the main data file. Unfortunately, this CAN NOT HAPPEN. If it does, the data file(s) become what we call "corrupted". Basically what that means is that it really SCREWS it: all data in the file is lost, and the file is now ZERO bytes in size. It essentially contains nothing. OOPS.
Now, in order to prevent this situation from occurring, which would actually be VERY common for the User's Rides otherwise, you have to program the application to use a file locking technique. File locking is simply a method employed to prevent more than one instance of the program from using the application's data files at the same time. In effect, it lets the program know if the data files are already in use by another instance of itself.
So, here are the typical steps that happen with the more common file locking techniques when the program executes:
- The program runs
- The program looks to see if a predefined filename exists (for example: rides.lock).
- If the file DOES exist then the program simply waits, or sits there, and rechecks to see if the file exists once each second. Once it sees the file DOES NOT exist it knows it can now use the data file.
- Now that the data file is not in use, the program must first lock the data file for itself. So, it creates the file "rides.lock", which is just an empty file. But, because it exists, it lets other copies of the program that get run while this copy is still using the data file know that it's in use.
- It now updates the data files as needed.
- It now ERASES the "rides.lock" file.
- The program now finishes outputting data to the user's browser and exits.
- And, of course, while it was doing all this, if another copy of this program was invoked by a web surfer and is waiting till this one is done, it now sees the lock file is gone and starts to do this whole process for itself too. And so on.
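The steps above can be sketched roughly like this in Python (an illustration only, not the actual Perl code). One deliberate difference from the list: instead of checking for "rides.lock" and then creating it as two separate steps, the sketch asks the operating system to create the file only if it does not already exist, so two copies cannot both pass the check in the same instant. The function names and the update callback are hypothetical.

```python
import os
import time

LOCK_PATH = "rides.lock"  # lock file name from the example steps

def acquire_lock(lock_path=LOCK_PATH, poll_seconds=1.0):
    """Wait until the lock file can be created, then create it."""
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so the
            # "does it exist?" check and the creation happen as one step.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            # Another copy holds the lock; recheck after a pause.
            time.sleep(poll_seconds)

def release_lock(lock_path=LOCK_PATH):
    """Erase the lock file so a waiting copy can proceed."""
    os.remove(lock_path)

def update_with_lock(update, lock_path=LOCK_PATH):
    """Run update() while holding the lock, releasing it even on error."""
    acquire_lock(lock_path)
    try:
        return update()
    finally:
        release_lock(lock_path)
```

One caveat with any lock-file scheme like this: if a copy dies before it erases the lock file, every other copy waits forever, so a real version would usually also need a timeout or a stale-lock check.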
Now, maybe you can understand how this process works. However, my example was with just two copies running. Really, on Mustang Works, DOZENS of copies of the User's Rides may be running at any given second. In fact, the message board system, which is also written in Perl and uses flat files, does the same thing. And, I've done a process listing on our main UNIX server running it in the past during PEAK usage times of the day and found over 100 copies of the main message board program (the one which displays the listings of posted threads in forums) at any given second!! Yikes! So, you can see why on a very busy site like MustangWorks.com this could, and WOULD, happen easily without using such file locking techniques.
Furthermore, in all the apps I've programmed myself to run features on the site (most of them), I also use my own methods and tricks that I've developed over the years. One is a method where the program backs up the data files it uses each time data is changed in them. That way, whenever an instance of the program runs, there is always a current backup copy of the data files. The first thing it does when it gets its turn to use the data files is do a couple of checks on the regular data files to make sure they are not corrupted. Now, if they are, it will simply delete the corrupted versions and automatically rename the backup versions to the regular versions, and then go ahead and do its business. Then it makes new backup versions as usual after it changes data in them.
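Here is a rough Python sketch of that backup-and-restore idea. It is not the actual code, and the corruption check is an assumption: the post only says "a couple checks", so this version just treats a missing or zero-byte file as corrupted, since that is the failure described earlier. The function names and file paths are hypothetical.

```python
import os
import shutil

def looks_corrupted(path):
    """Assumed check: a missing or zero-byte data file counts as corrupted."""
    return not os.path.exists(path) or os.path.getsize(path) == 0

def restore_if_corrupted(data_path, backup_path):
    """Swap the backup in when the data file fails the check.

    Returns True if a restore happened, False otherwise.
    """
    if looks_corrupted(data_path) and os.path.exists(backup_path):
        if os.path.exists(data_path):
            os.remove(data_path)            # delete the corrupted version
        os.rename(backup_path, data_path)   # backup becomes the regular file
        return True
    return False

def save_with_backup(data_path, backup_path, new_contents):
    """Write the changed data, then make a fresh backup copy of it."""
    with open(data_path, "w") as handle:
        handle.write(new_contents)
    shutil.copyfile(data_path, backup_path)
```

Both functions are meant to run while the program holds the lock: restore_if_corrupted first, when it gets its turn, and save_with_backup after each change.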
I know I've probably lost you. But, these are some basics of knowledgeable CGI programming. I tried to explain them in non-technical terms. Unfortunately, when you have a site like this that gets several hundred thousand hits each day, no matter how well you cover your bases with techniques like this, you are bound to have a collision sooner or later with a very heavily used app that relies on flat files. So, for the rare instances where the file locking technique fails, and the self-replicating backup technique I've developed also doesn't come through, I also have the program write each data change it makes to a log file. And, like I just had to do the other day to restore the Rides, I can as a last resort rebuild the main data file from the data trail in the log. A little bit of a hassle, but no sweat.
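For the curious, rebuilding from a change log could look something like the Python sketch below. The log format is purely an assumption for illustration: each log line holds a key (say, the username), a tab, and the full record as it stood after the change. It also assumes changes are only additions and updates; the real log format and rebuild procedure are not described here.

```python
def log_change(log_path, key, record_line):
    """Append one change to the log; the log is only ever appended to."""
    with open(log_path, "a") as log:
        log.write(key + "\t" + record_line.rstrip("\n") + "\n")

def rebuild_from_log(log_path, data_path):
    """Replay the log in order; the last logged record for each key wins.

    Returns the number of records written to the rebuilt data file.
    """
    latest = {}
    with open(log_path) as log:
        for entry in log:
            if not entry.strip():
                continue  # ignore blank lines
            key, record_line = entry.rstrip("\n").split("\t", 1)
            latest[key] = record_line
    with open(data_path, "w") as data:
        for record_line in latest.values():
            data.write(record_line + "\n")
    return len(latest)
```

Because the log is only appended to and never rewritten, a collision that wipes out the main data file leaves the trail of changes intact.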
Some time in the future, I may very well reprogram the User's Rides system as a Cold Fusion application that will store everything in SQL Server. We already have this capability, because one of the Mustang Works servers is a Win2K box with Cold Fusion 4.5.1 and SQL Server 6.5...
------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton