Users rides glitch?

jonnyk · 09-03-2000, 05:28 PM

Can't seem to view any of the cars here...could just be a temporary thing but I thought I'd mention it anyways.

------------------
1991 LX Hatch 5.0L

cad614 · 09-03-2000, 06:07 PM

Looks like everything got deleted again.

Let us know if we're going to have to repost everything.

StangFlyer · 09-04-2000, 12:37 AM

Yes, the main data file became corrupted. That happens once in a blue moon despite several counter measures I've built in to the application. However, it keeps a backup, and a log file of EVERY update made. Therefore, I can either restore the backup. If that's not possible I can rebuild the data from the log. I've got all the bases covered!

It's all better now.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

cad614 · 09-04-2000, 03:37 AM

Good to know you have it covered. After the Users Rides section was all lost shortly after being created I remember having to post everything again, and it was not fun. Glad there's a backup now!

joe4speed · 09-04-2000, 11:30 PM

How do things become "corrupted" anyways...? I was always curious about that!

StangFlyer · 09-05-2000, 02:45 AM

Oh man, that's a long one. But, to be brief, here's the way it works:

Each CGI application on the site, like the one which runs the User's Rides, works with information that it stores using some method. Some web applications work with a real database, like MySQL, Oracle, Sequel Server, or DB2. Others that are going to handle much less information might be written to store and work with information in "flat files". Flat files are just text files basically with data stored in them in a predefined format which the application knows how to read a chunk at a time of and break each chunk into separate parts as it read them. The User's Rides uses flat files to store everyone's information.

The hurtle with using flat files is that you can only have one instance of the application accessing the files for the purpose of writing new data, or updating existing data, to the file at once. However, a potential problem comes into play when a program like this (in this case it's the User's Rides, which is a application written in Perl5) is run multiple times simultaneously, and each instance of that program tries to change the data file in some way at while another instance of that application already has the file open.

For example: Say cad614 is editing his User's Ride by updating his specs, and while he is doing that you (JL1314) are clicking on someone's listing to view their car. Now, as you know the User's Rides application keeps track of how many times your listing has been viewed. To do this, it updates the main data file containing an index listing of basic information about each person's listing. Now, let's also say that you click on the link to the ride you want to view, which invokes the program, which in turn runs and wants to open and modify the data file. And, at the same time cad614 clicks the button to save his new specs. Now although each car's specs are kept in a unique file for each listing, it still updates a few basic information items in the main index data file also.

Now here's the dilemma. You've run the program to do something that updates the index file, and at the same time so has cad614. There are now two copies of the User's Rides program running at the same exact time which both want to WRITE information to the main data file. Unfortunately, this CAN NOT HAPPEN. If it does, the data file(s) become what we call "corrupted". Basically what that means is that it really SCREWS it, all data in the file is lost, and the file is now ZERO bytes in size. It essentially contains nothing. OOPS.

Now in order to prevent this situation from occurring, which would actually be VERY common for the User's Rides otherwise, you have to program the application utilizing a file locking technique. File locking is simply a method employed to prevent more than one instance of the program from using the application's data files at the same time. Or, in effect, allows the program to know if the data files are already in use by another instance of itself.

So, here are the common steps that happen concerning the more common file locking techniques when the program executes:

The program runs
The program looks to see if a predefined filename exists (for example: rides.lock).
If the file DOES exist than the program simply waits, or sits there, and rechecks to see if the file exists once each second. Once it sees the file DOES NOT exist it knows it can now use the data file.
Now that the data file is not in use, the program must first lock the data file for itself. So, it creates the files "rides.lock", which is just an empty file. But, because it exists, it lets other copies of the program that get run while this copy is still using the data file that it's in use.
It now updates the data files as needed.
It now ERASES the "rides.lock" file.
The program now finishes outputting data to the users browser and exits.
And, of course, while it was doing all this if another copy of this program was invoked by a web surfer and is waiting till this one is done, it now sees the lock file is gone and starts to do this whole process for itself too. And, so one.

Now, maybe you can understand how this process works. However, my example was with just two copies running. Really, on Mustang Works, DOZENS of copies of the User's Rides may be running at any given second. In fact, the message board system, which is also written in Perl and uses flat files, does the same thing. And, I've done a process listing on our main UNIX server running it in the past during PEAK usage times of the day and found over 100 copies of the main message board program (the ones which displays the listings of posted threads in forums) at any given second!! Yikes! So, you can see why on a very busy site like MustangWorks.com this could, and WOULD, happen easily with out using such file locking techniques.

Furthermore, in all the apps I've programed myself to run features on the site (most of them), I also use my own methods and tricks that I've developed over the years too. One is a method where the program self backs up the data files it uses each time data is changed in them. Therefore, before an instance of the program is executed it always has a current backup copy of the data files. The first thing it does when it gets its turn to use the data files is do a couple checks on the regular data files to make sure they are not corrupted. Now, if they are, it will simply delete the corrupted versions and automatically rename the backup versions to the regular versions, and then go ahead and do its business. Then, make new backup versions as usual after it changes data in them.

I know I've probably lost you. But, these are some basics to knowledgeable CGI programming. I tried to explain them in non technical terms. Unfortunately, when you have a site like this that gets several hundred thousand hits each day no matter how well you cover your bases with techniques like this, you are bound to have a collision sooner or later with an app that is very heavily used utilizing flat files. So, for the rare instances where both the file locking techniques fail, and the self replicating backup technique I've developed also doesn't come through, I also have the program write each data change it makes to a log file. And, like I just had to do the other day to restore the Rides, I can as a last resort rebuild the main data file from the data trail in the log. A little bit of a hassle, but no sweat.

Some time in the future, I may very well reprogram the User's Rides system as a Cold Fusion application that will store everything in Sequel Server. We already have this capability, because one of the Mustang Works servers is a Win2K box with Cold Fusion 4.5.1 and SQL6.5...

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

Mr 5 0 · 09-05-2000, 08:27 AM

JL1314:

Aren't you glad you asked?

Dan obviously knows his business and is always working to keep Mustang Works Online as useful, efficient and ahead-of-the-pack in every way possible. It takes a lot of expertise to keep this site running as smoothly as it does and Dan's programming and formatting work is a big part of that, as his 'simple' explanation above shows us.

------------------
Mr. 5.0
Messageboard Administrator

joe4speed · 09-05-2000, 12:06 PM

Wow! That's some serious stuff! I do understand some of what you're sayin though, I have a site so I understand about the CGI and Perl, but yours is WAY more advanced than mine, and I am very impressed how you explained that! I wouldn't have even known where to start! Thanks for taking the time to write all that down for me! Now when someone says corrupted, I know!!!

Joe

[This message has been edited by JL1314 (edited 09-05-2000).]

StangFlyer · 09-05-2000, 04:19 PM

Hey no problem. I've had to learn to verbalize all of this tech mumbo-jumbo, as I've taught classes for my company in the past where I've schooled our other IT professionals in Web Development and CGI / application programming techniques just like that... As a matter of fact, I've used the source code to The Mustang Works guestbook program on occasion to demonstrate basic dynamic CGI / page generation, along with the file locking techniques I described.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

jimberg · 09-10-2000, 11:49 AM

Dan, I ran into some similar issues a while back. When dealing with a lot of simultaneous users it is most beneficial to process the request as quickly as possible to free up resources. Rather than waiting for a lock to be released on the file, I would queue the information somewhere else and let the next thread that obtains a lock deal with the queued requests then deal with its own requests. This way the program would never sit there waiting for a lock release.

In my case, I just put the requests in a list maintained in a common memory area. In your case, you may just want to write out little files with an incrementing name that contains the modification request. When the next thread gets a lock, it would read in those little files if they exist, fold them into the main data file, delete the files, and then process any of its own reads and writes. This will keep all the read and write requests linear when they need to be. The format in which the requests are stored obviously have to be independent of the data file that they are going to change.

I suspect that you may have so many instances of a single program running because of that arbitrary delay. Say that one thread obtains a lock on the data file and another thread then sees that lock and goes into its 1 sec wait almost at the same time. If the first lock is released in .5 secs and and another thread comes requesting a fresh lock right after, it obtains the lock and then the one that was waiting for 1 sec will then still see a lock and wait once again. Threads will continually trip over themselves causing temporary deadlocks, which in turn will keep instances of the program running.

Just some ideas from one programmer to another.

StangFlyer · 09-11-2000, 07:32 AM

It really doesn't matter. If a program is waiting for it's turn at the data files because they are locked by another instance, they are using virtually no CPU time. That is what matters. I think trying to build a que with this type of CGI would only make it more complex than it needs to be, and in the end it's still reading and writing the same exact things. But, in your case, even more. With a CGI, file access is what takes the most processing time. Keep that to the bare minimum, which is hopefully just one file, beyond the lock file, it has to open, use, and close.

There is a trade off in everything, and the trade off with building a que to minimize the number of "waiting" instances on a heavily used CGI app isn't worth the extended processing time using such a strategy would create. Especially when memory and disk size is so cheap, and plentiful on a server.

After years of developing MW's CGI applications in Perl, however, I'm now going to be developing all future features in Cold Fusion on our Win2K server with a Sequel Server back end.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

jimberg · 09-12-2000, 04:32 PM

Maintaining 100 threads consumes a considerable amount of resources regardless to whether or not they are waiting. The context switch, the OS overhead to determine which thread to run next, the amount of memory necessary to keep an instance running, etc. What's the size of the PERL interpreter? Does a new interpreter get loaded for each object instance? Just the OS overhead involved is significant enough and will always be a problem no matter what you use for development. The overhead will vary from OS to OS. BeOS and Linux are probably more efficient than Windows 2000 at the way they handle threads.

If you had access to system objects and were able to create a mutex, I would suggest using that instead of a lock file. This way, the waiting thread would go active as soon as the mutex was released rather than waiting for 1 sec. I don't think you have those as an option in PERL, though.

Your data corruption problems probably occur in that minute span of time between when one thread is creating the lock file and another thread isn't finding the lock file. (Thread 1 sees that the lock file isn't there. A context switch occurs. Thread 2 sees that the lock file is not there. Thread 2 Creates lock file. Context switch. Thread 1 creates lock file.) Both threads probably think they have it locked and then blast the data. You might try simply opening the file for exclusive access and go into your wait loop if you get a sharing violation error on the other thread. Continue waiting and attempting to open the file until you succeed. Otherwise, do what you need to do with the file. This way, the OS is handling synchronization and not you. Doing this should at least resolve your data corruption problems and should be fairly easy to implement.

I still don't think it is a good idea to ever wait for an arbitrary period. The queue I'm suggesting would consist of tiny little files that contain very little overhead. They could contain just enough information to perform the update that was requested. e.g. The update that would occur the most is incrementing the number of times a user's ride was viewed. That could be represented by something as simple as "jimberg+". You have the lookup key and the action. The filename could have a counter that keeps requests in the proper order. I know that in practice this works.

I think the queueing solution would actually reduce overhead by a significant amount and improve reliability. That's irrelevant though, since you're planning to move to SQL Server and Cold Fusion. Does this mean you are rewriting everything or just developing new stuff? Or do you already have everything rewritten and you're gonna spring it on is sometime in the near future?

Looking forward to seeing what you come up with.

I hope you don't mind my suggestions, but the only thing I like to discuss more than Mustangs is programming. I've been developing real-time systems, including the multi-tasking kernels they use, since 1986. My first was for a multi-user BBS.

StangFlyer · 09-13-2000, 01:26 AM

Jim, I don't mind your suggestions and I do understand where you're coming from, believe me. Of course, having 100 threads does use resources, there's no doubt about it. However, on a server that has lots of memory, cache, and drive storage, using these resources is not nearly as large of a concern as the amount of CPU time being used. Since only 1 process of "X" application is running at a time really, and the others just waiting, you have one using CPU time and the others are virtually at zero usage. This is visible when monitoring processes/threads on the box in real time and taking note of their CPU usage, along with overall system load. The main box running MustangWorks.com is a Cobalt RaQ3i-512. If you aren't very familiar with it, it's a low cost but high end server with a 550 Mhz Intel based processor with a large cache, 512 Mb Ram, and a 20 Gig Ultra DMA 66 7200 RPM drive that runs the latest version of Red Hat Linux. As far as speed, it blows Win2K away and is EXTREMELY stable. In fact, MW's has gone for over a YEAR in the past on Linux without being rebooted, and was only rebooted because I was forced to after installing OS updates. So I worry less about sucking some memory and more about the load the app puts on the box when each instance does its thing. And, from experience I can tell you that the more file handling you do, the heavier the CPU load and the higher the load.

However, the Perl interpreter is like 100K, and yes it is loaded for every instance. Unless you are using either Mod_Perl on Unix or PerlEX on Windows. Mod_Perl is a module for Apache that is loaded into memory when the web server loads. PerlEX is a service for Windows NT or Win2K that is essentially the Perl interpreter running like Cold Fusion does. By utilizing one of these you eliminated launching the interpreter for each instance and the resource load associated with it. In addition, once a Perl app is loaded, compiled, and run, it is not dumped from memory either. It is kept compiled and reutilized each subsequent time it is invoked. This increasing a CGI's execution time by up to 50%. On MW's main server we DO have Mod_Perl.

Your observations about the scenario that corrupts a data file is probably close to the money. However, it is to be expected, and as I've stated I take precautions so that even in this case no data is lost. The locking system, in combination with my self backup technique, really keeps it from happening on all but the blue moon occasion. However, your suggestion about opening the data files with exclusive locking would be a good idea, but unfortunately is irrelevant because were on Linux. That's a feature of the Windows OS.

Something that I don't get, however, is how will your queuing method get around having to perform the same file locking technique that I am now? Even if you have each instance perform the actual file updates the last instance recorded in the queue file, you still have to make a lock file to protect multiple instances from writing to the queue file now. In the end, you're just doing the same thing, but with more steps and more complexity. And, you still have processes that have to wait in line for their turn.

Really, to do what you're suggesting correctly, you'd really need to create a daemon (like a service on Windows), that would be running in the background all the time. Then an instance of the regular app would just pass string commands to the daemon that contain the update to perform. The daemon would have a queue in memory that it keeps the commands in and just do them one at a time as quickly as possible, removing each command string from memory after completion. Now doing this would both eliminate the file locking system all together and eliminate waiting processes of the Perl app.

However, it is all irrelevant since I'm doing all future apps in CF. And, if I were going to redo any of the apps we have now, I would also do them in CF, as well with a SQL Server back end eliminates all these issues any ways.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

joe4speed · 09-13-2000, 09:17 AM

Now my head is really spinning!!!! LOL, You guys are just chock full of information!

jimberg · 09-13-2000, 12:21 PM

Creating a daemon would work, of course, but I think it would be overkill. Like you said, you don't want to implement something that is too complex. Given the limitations of the file system you are using (I would think that Linux would allow an exclusive access lock. I'm not a Linux programmer yet.

) your process could look something like this (in pseudocode):

Code:

if Create lock file with O_CREAT | O_EXCL flags succeeds then
    // At this point, we know we're good to go and have exclusive access to data file
  Get list of files that match the queue files pattern  e.g. Q*.UPD or whatever.
  Sort the list according to the order that they occur
  Read and apply the updates from the files and then delete them.
  Process the update of the current process
  Delete lock file
else
  Get unique ID 
  Create update file with name of Q&lt;unique id&gt;.UPD
  Write update request to file
  Close update file
end if

I don't know how much information you have available to you in PERL, but if you can get the thread id or a high resolution timer value, use that for the unique id. A timer value would be helpful for maintaining the order of the updates.

The beauty of this process is that it will always complete without a wait. Since you process the updates in the order that they occur it will be seamless to the user. No lock is necessary on the UPD files since they are unique. In the specific case of User Rides, whenever someone looks up a ride the counter is incremented so this process will pretty much always occur.

Your autobackup strategy must consume some time. Removing the necessity for that should certainly help.

As far as the threads not consuming CPU time, you are correct. The load is occurring in the task manager. Each time a thread uses its time, the task manager has to determine if the next thread is ready to run. I'm assuming that you are using some sort SLEEP function as opposed to YIELD, CHECK TIME, YIELD process. A SLEEP function probably just suspends a thread and sets a counter that decrements each time the task manager is called until it goes to zero.

Also, don't forget the initial scenario I mentioned. Since you are waiting for 1 sec, it's possible for one thread to give up a lock and another to take it before the waiting thread completes its wait. When it finally gets around to checking again, it will see that it is still locked and wait again. As load increases (the number of threads), this race condition is more likely to occur. If timed right, some threads could be starved for access for many seconds while only a few get to actually do anything. This is why your thread count goes so high.

These types of synchonization problems will still occur with Cold Fusion and SQL server, just in different areas.

[This message has been edited by jimberg (edited 09-13-2000).]

StangFlyer · 09-15-2000, 05:58 AM

OK, now I understand a little bit better what you are saying. From your original message I thought what you meant was to have a central (i.e. one) que file. Of course, if that were the case each instance of the CGI would be reading and writing to it, you have to have a lock file for that, and you'd still have to use the locking file to protect the main files. Which, of course, made no sense because you're just doing what I already am now, and more. Plus you're still working on the main data files. In essence, it would just make it MORE complex and make you work with more files. However, what you are actually suggesting is that you make multiple que files. One for each update that needs to occur to the main data file. When an instance of the CGI is launched, it would read in (and delete) each que file in order of their time stamps (basically) and update whatever data each que file directed in the main data file. You would still have to use the lock file, however, because you'd still have to make sure that more than one instance is not trying to do this procedure on the main data file at the same time. Right?

So, you wouldn't really totally eliminate the locking file system, but it would significantly reduce waiting processes because a process would only have to wait if que files existed. This wouldn't be nearly as often since not every instance needs to write information, the majority of the time it just needs to read it. And, you can have multiple instances (or programs) reading a file at the same time, you just can't have them writing at the same time (or reading and writing at the same time).

I still think the only way to totally eliminate the locking file and need for any instances to wait would be through a daemon working as I described. Unless I'm still not seeing something in your plan.

Now, in other notes, if the CGI needs to wait it uses a "sleep" command in Perl. What's giving me a chuckle is that you keep sighting Window's stuff. Unlike Windows, where as soon as you just boot to the desktop it's running dozens of darn DLL's and service processes in the background, Unix (Linux in this case) is nothing more than one central "kernel". The OS Kernal is actually very small and uses little resources. Upon boot up, the only thing running is the main Kernal and whatever daemons (services) you define and actually need. This for example on our server for MW's would be a daemon for Apache, FTP, sendmail, MySql, among a few others. Most other functions are completely autonomous modules that do everything. Such as renaming a file, moving a file, editing a file, changing file permission, X-Windows, whatever. When you type the command to move a file called "readme.txt", for example, to another directory from its currently location, which is say "mv /home/readme.txt /etc/readme.txt", it invokes an external program called "mv" which resides in the "/bin" directory. That module loads, does what you directed, and exits. In other words, all the functions of the Unix OS only run as needed and thus Unix has virtually no overhead. And, because you don't have all that crap running all the time at once, the thing is EXTREMELY stable. And this is why Linux, verses Windows on the same box, is way faster.

Yeah, it's not the thing for a regular home user's desktop (yet), but it's by far the most inexpensive and ideal server solution. I'm particularly happy that Allaire finally came out with Cold Fusion for Linux now...

As for these issues still occurring with a Cold Fusion and SQL Server solution, yes they are somewhat still there, but at that point its irrelevant to you because you have a heavy hitting database engine that handles it for you.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

[This message has been edited by Dan McClain (edited 09-15-2000).]

jimberg · 09-15-2000, 07:04 PM

I don't think I even mentioned Windows stuff in my last message.

Most of my major multithreaded programming was with kernels and schedulers that I wrote. When I did programs for IIS that were fairly similar to User Rides and pounded it with 120 simultaneous users, the highest thread count I ever got was 7 threads. If everything positive about Linux that I have read is true, you should be able to get your max count to at most half of that.

Regardless of how efficient the scheduler is in the OS, however, having tasks vying for resources by having them wait for an arbitrary, predetermined period of time is still a major problem because of the issue creating a race condition.

I think you pretty much have what I am saying, except for the part about waiting only if queue files exist. The process will be extended a little, but would still be continuous. The program will never wait.

As far as getting rid of the locking file, I went to PERL.ORG to see if I could find some info on file locking. Does the FLOCK <file handle>, 2 not work with your implementation of PERL? That would give you exclusive access to write to the file without needing to create a lock file. In the FAQ, their example was a page hit counter, so I'm pretty sure that is what I was looking for. If the FLOCK fails, run the ELSE code in the above pseudo code.

If you use SQL Server 7, yes, it will be much easier.

StangFlyer · 09-15-2000, 07:46 PM

Quote:

The load is occurring in the task manager.

FLOCK is supported by the Linux OS, and has nothing to do with Perl per say. Yes, I have it and can us it, but I'm not. It isn't really like file sharing control on Windows. It's essentially what I am doing on my own, but the OS handles the lock. It is more reliable and efficient though, but I don't use it because it would make the app non-portable to Windows. The great thing about writing something in Perl is that as long as you don't use external Unix commands, you can simply put the code on a Win box or a Unix box and it will run the same. If you use FLOCK, that is not the case and it would not function on Windows. Since one of our boxes is a Win2k server, I want the app to run on either. And, in the past it DID run on the Win box for a while.

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

jimberg · 09-15-2000, 08:23 PM

I was using the term task manager instead of scheduler, sorry. I still believe that there will be a significant load in the scheduler. I guess I have Windows on the brain.

Anyhoo...

I thought I read that flock was a portable command based on the Unix flock, not necessarily a direct pass-through. If you use it on Windows NT/2000 it should work (according to the documentation that I read). The difference, though, is that flock on Linux is only advisory while on NT/2000 it would enforced. Your code would still be portable.

StangFlyer · 09-16-2000, 06:49 AM

I'd have to check into it then. But, to my knowledge, it was not supported on Windows (in the past).

------------------
Dan McClain, Editor
The Mustang Works Magazine
1991 Mustang GT - NOVI Supercharged 377 Stroker
1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-03-2000, 05:28 PM	#1
jonnyk Being stroked is great Join Date: Oct 1998 Location: Alberta, Canada Posts: 772	Users rides glitch? Can't seem to view any of the cars here...could just be a temporary thing but I thought I'd mention it anyways. ------------------ 1991 LX Hatch 5.0L

09-05-2000, 02:45 AM	#6
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	Oh man, that's a long one. But, to be brief, here's the way it works: Each CGI application on the site, like the one which runs the User's Rides, works with information that it stores using some method. Some web applications work with a real database, like MySQL, Oracle, Sequel Server, or DB2. Others that are going to handle much less information might be written to store and work with information in "flat files". Flat files are just text files basically with data stored in them in a predefined format which the application knows how to read a chunk at a time of and break each chunk into separate parts as it read them. The User's Rides uses flat files to store everyone's information. The hurtle with using flat files is that you can only have one instance of the application accessing the files for the purpose of writing new data, or updating existing data, to the file at once. However, a potential problem comes into play when a program like this (in this case it's the User's Rides, which is a application written in Perl5) is run multiple times simultaneously, and each instance of that program tries to change the data file in some way at while another instance of that application already has the file open. For example: Say cad614 is editing his User's Ride by updating his specs, and while he is doing that you (JL1314) are clicking on someone's listing to view their car. Now, as you know the User's Rides application keeps track of how many times your listing has been viewed. To do this, it updates the main data file containing an index listing of basic information about each person's listing. Now, let's also say that you click on the link to the ride you want to view, which invokes the program, which in turn runs and wants to open and modify the data file. And, at the same time cad614 clicks the button to save his new specs. Now although each car's specs are kept in a unique file for each listing, it still updates a few basic information items in the main index data file also. Now here's the dilemma. You've run the program to do something that updates the index file, and at the same time so has cad614. There are now two copies of the User's Rides program running at the same exact time which both want to WRITE information to the main data file. Unfortunately, this CAN NOT HAPPEN. If it does, the data file(s) become what we call "corrupted". Basically what that means is that it really SCREWS it, all data in the file is lost, and the file is now ZERO bytes in size. It essentially contains nothing. OOPS. Now in order to prevent this situation from occurring, which would actually be VERY common for the User's Rides otherwise, you have to program the application utilizing a file locking technique. File locking is simply a method employed to prevent more than one instance of the program from using the application's data files at the same time. Or, in effect, allows the program to know if the data files are already in use by another instance of itself. So, here are the common steps that happen concerning the more common file locking techniques when the program executes: The program runs The program looks to see if a predefined filename exists (for example: rides.lock). If the file DOES exist than the program simply waits, or sits there, and rechecks to see if the file exists once each second. Once it sees the file DOES NOT exist it knows it can now use the data file. Now that the data file is not in use, the program must first lock the data file for itself. So, it creates the files "rides.lock", which is just an empty file. But, because it exists, it lets other copies of the program that get run while this copy is still using the data file that it's in use. It now updates the data files as needed. It now ERASES the "rides.lock" file. The program now finishes outputting data to the users browser and exits. And, of course, while it was doing all this if another copy of this program was invoked by a web surfer and is waiting till this one is done, it now sees the lock file is gone and starts to do this whole process for itself too. And, so one. Now, maybe you can understand how this process works. However, my example was with just two copies running. Really, on Mustang Works, DOZENS of copies of the User's Rides may be running at any given second. In fact, the message board system, which is also written in Perl and uses flat files, does the same thing. And, I've done a process listing on our main UNIX server running it in the past during PEAK usage times of the day and found over 100 copies of the main message board program (the ones which displays the listings of posted threads in forums) at any given second!! Yikes! So, you can see why on a very busy site like MustangWorks.com this could, and WOULD, happen easily with out using such file locking techniques. Furthermore, in all the apps I've programed myself to run features on the site (most of them), I also use my own methods and tricks that I've developed over the years too. One is a method where the program self backs up the data files it uses each time data is changed in them. Therefore, before an instance of the program is executed it always has a current backup copy of the data files. The first thing it does when it gets its turn to use the data files is do a couple checks on the regular data files to make sure they are not corrupted. Now, if they are, it will simply delete the corrupted versions and automatically rename the backup versions to the regular versions, and then go ahead and do its business. Then, make new backup versions as usual after it changes data in them. I know I've probably lost you. But, these are some basics to knowledgeable CGI programming. I tried to explain them in non technical terms. Unfortunately, when you have a site like this that gets several hundred thousand hits each day no matter how well you cover your bases with techniques like this, you are bound to have a collision sooner or later with an app that is very heavily used utilizing flat files. So, for the rare instances where both the file locking techniques fail, and the self replicating backup technique I've developed also doesn't come through, I also have the program write each data change it makes to a log file. And, like I just had to do the other day to restore the Rides, I can as a last resort rebuild the main data file from the data trail in the log. A little bit of a hassle, but no sweat. Some time in the future, I may very well reprogram the User's Rides system as a Cold Fusion application that will store everything in Sequel Server. We already have this capability, because one of the Mustang Works servers is a Win2K box with Cold Fusion 4.5.1 and SQL6.5... ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-13-2000, 12:21 PM	#15
jimberg Registered Member Join Date: Oct 1998 Location: Rogers, MN Posts: 2,089	Creating a daemon would work, of course, but I think it would be overkill. Like you said, you don't want to implement something that is too complex. Given the limitations of the file system you are using (I would think that Linux would allow an exclusive access lock. I'm not a Linux programmer yet. ) your process could look something like this (in pseudocode): Code: if Create lock file with O_CREAT \| O_EXCL flags succeeds then // At this point, we know we're good to go and have exclusive access to data file Get list of files that match the queue files pattern e.g. Q*.UPD or whatever. Sort the list according to the order that they occur Read and apply the updates from the files and then delete them. Process the update of the current process Delete lock file else Get unique ID Create update file with name of Q<unique id>.UPD Write update request to file Close update file end if I don't know how much information you have available to you in PERL, but if you can get the thread id or a high resolution timer value, use that for the unique id. A timer value would be helpful for maintaining the order of the updates. The beauty of this process is that it will always complete without a wait. Since you process the updates in the order that they occur it will be seamless to the user. No lock is necessary on the UPD files since they are unique. In the specific case of User Rides, whenever someone looks up a ride the counter is incremented so this process will pretty much always occur. Your autobackup strategy must consume some time. Removing the necessity for that should certainly help. As far as the threads not consuming CPU time, you are correct. The load is occurring in the task manager. Each time a thread uses its time, the task manager has to determine if the next thread is ready to run. I'm assuming that you are using some sort SLEEP function as opposed to YIELD, CHECK TIME, YIELD process. A SLEEP function probably just suspends a thread and sets a counter that decrements each time the task manager is called until it goes to zero. Also, don't forget the initial scenario I mentioned. Since you are waiting for 1 sec, it's possible for one thread to give up a lock and another to take it before the waiting thread completes its wait. When it finally gets around to checking again, it will see that it is still locked and wait again. As load increases (the number of threads), this race condition is more likely to occur. If timed right, some threads could be starved for access for many seconds while only a few get to actually do anything. This is why your thread count goes so high. These types of synchonization problems will still occur with Cold Fusion and SQL server, just in different areas. [This message has been edited by jimberg (edited 09-13-2000).]

Similar Threads
Thread	Thread Starter	Forum	Replies	Last Post
User's Rides... Think about it....	Hammer	Modular Madness	2	07-23-2002 06:20 PM
Attn: Dan. "Users Rides" section only shows 2 users.	Fox Body	Blue Oval Lounge	2	02-11-2001 01:11 AM

09-03-2000, 06:07 PM	#2
cad614 Senior Member Join Date: Apr 1999 Location: San Angelo, TX Posts: 377	Looks like everything got deleted again. Let us know if we're going to have to repost everything.

09-04-2000, 12:37 AM	#3
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	Yes, the main data file became corrupted. That happens once in a blue moon despite several counter measures I've built in to the application. However, it keeps a backup, and a log file of EVERY update made. Therefore, I can either restore the backup. If that's not possible I can rebuild the data from the log. I've got all the bases covered! It's all better now. ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-04-2000, 03:37 AM	#4
cad614 Senior Member Join Date: Apr 1999 Location: San Angelo, TX Posts: 377	Good to know you have it covered. After the Users Rides section was all lost shortly after being created I remember having to post everything again, and it was not fun. Glad there's a backup now!

09-04-2000, 11:30 PM	#5
joe4speed He said Member...heh, heh Join Date: Sep 1999 Location: Jupiter, Florida U.S.A. Posts: 3,718	How do things become "corrupted" anyways...? I was always curious about that!

09-05-2000, 08:27 AM	#7
Mr 5 0 Conservative Individualist Join Date: May 1997 Location: Wherever I need to be Posts: 7,487	JL1314: Aren't you glad you asked? Dan obviously knows his business and is always working to keep Mustang Works Online as useful, efficient and ahead-of-the-pack in every way possible. It takes a lot of expertise to keep this site running as smoothly as it does and Dan's programming and formatting work is a big part of that, as his 'simple' explanation above shows us. ------------------ Mr. 5.0 Messageboard Administrator

09-05-2000, 12:06 PM	#8
joe4speed He said Member...heh, heh Join Date: Sep 1999 Location: Jupiter, Florida U.S.A. Posts: 3,718	Wow! That's some serious stuff! I do understand some of what you're sayin though, I have a site so I understand about the CGI and Perl, but yours is WAY more advanced than mine, and I am very impressed how you explained that! I wouldn't have even known where to start! Thanks for taking the time to write all that down for me! Now when someone says corrupted, I know!!! Joe [This message has been edited by JL1314 (edited 09-05-2000).]

09-05-2000, 04:19 PM	#9
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	Hey no problem. I've had to learn to verbalize all of this tech mumbo-jumbo, as I've taught classes for my company in the past where I've schooled our other IT professionals in Web Development and CGI / application programming techniques just like that... As a matter of fact, I've used the source code to The Mustang Works guestbook program on occasion to demonstrate basic dynamic CGI / page generation, along with the file locking techniques I described. ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-10-2000, 11:49 AM	#10
jimberg Registered Member Join Date: Oct 1998 Location: Rogers, MN Posts: 2,089	Dan, I ran into some similar issues a while back. When dealing with a lot of simultaneous users it is most beneficial to process the request as quickly as possible to free up resources. Rather than waiting for a lock to be released on the file, I would queue the information somewhere else and let the next thread that obtains a lock deal with the queued requests then deal with its own requests. This way the program would never sit there waiting for a lock release. In my case, I just put the requests in a list maintained in a common memory area. In your case, you may just want to write out little files with an incrementing name that contains the modification request. When the next thread gets a lock, it would read in those little files if they exist, fold them into the main data file, delete the files, and then process any of its own reads and writes. This will keep all the read and write requests linear when they need to be. The format in which the requests are stored obviously have to be independent of the data file that they are going to change. I suspect that you may have so many instances of a single program running because of that arbitrary delay. Say that one thread obtains a lock on the data file and another thread then sees that lock and goes into its 1 sec wait almost at the same time. If the first lock is released in .5 secs and and another thread comes requesting a fresh lock right after, it obtains the lock and then the one that was waiting for 1 sec will then still see a lock and wait once again. Threads will continually trip over themselves causing temporary deadlocks, which in turn will keep instances of the program running. Just some ideas from one programmer to another.

09-11-2000, 07:32 AM	#11
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	It really doesn't matter. If a program is waiting for it's turn at the data files because they are locked by another instance, they are using virtually no CPU time. That is what matters. I think trying to build a que with this type of CGI would only make it more complex than it needs to be, and in the end it's still reading and writing the same exact things. But, in your case, even more. With a CGI, file access is what takes the most processing time. Keep that to the bare minimum, which is hopefully just one file, beyond the lock file, it has to open, use, and close. There is a trade off in everything, and the trade off with building a que to minimize the number of "waiting" instances on a heavily used CGI app isn't worth the extended processing time using such a strategy would create. Especially when memory and disk size is so cheap, and plentiful on a server. After years of developing MW's CGI applications in Perl, however, I'm now going to be developing all future features in Cold Fusion on our Win2K server with a Sequel Server back end. ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-12-2000, 04:32 PM	#12
jimberg Registered Member Join Date: Oct 1998 Location: Rogers, MN Posts: 2,089	Maintaining 100 threads consumes a considerable amount of resources regardless to whether or not they are waiting. The context switch, the OS overhead to determine which thread to run next, the amount of memory necessary to keep an instance running, etc. What's the size of the PERL interpreter? Does a new interpreter get loaded for each object instance? Just the OS overhead involved is significant enough and will always be a problem no matter what you use for development. The overhead will vary from OS to OS. BeOS and Linux are probably more efficient than Windows 2000 at the way they handle threads. If you had access to system objects and were able to create a mutex, I would suggest using that instead of a lock file. This way, the waiting thread would go active as soon as the mutex was released rather than waiting for 1 sec. I don't think you have those as an option in PERL, though. Your data corruption problems probably occur in that minute span of time between when one thread is creating the lock file and another thread isn't finding the lock file. (Thread 1 sees that the lock file isn't there. A context switch occurs. Thread 2 sees that the lock file is not there. Thread 2 Creates lock file. Context switch. Thread 1 creates lock file.) Both threads probably think they have it locked and then blast the data. You might try simply opening the file for exclusive access and go into your wait loop if you get a sharing violation error on the other thread. Continue waiting and attempting to open the file until you succeed. Otherwise, do what you need to do with the file. This way, the OS is handling synchronization and not you. Doing this should at least resolve your data corruption problems and should be fairly easy to implement. I still don't think it is a good idea to ever wait for an arbitrary period. The queue I'm suggesting would consist of tiny little files that contain very little overhead. They could contain just enough information to perform the update that was requested. e.g. The update that would occur the most is incrementing the number of times a user's ride was viewed. That could be represented by something as simple as "jimberg+". You have the lookup key and the action. The filename could have a counter that keeps requests in the proper order. I know that in practice this works. I think the queueing solution would actually reduce overhead by a significant amount and improve reliability. That's irrelevant though, since you're planning to move to SQL Server and Cold Fusion. Does this mean you are rewriting everything or just developing new stuff? Or do you already have everything rewritten and you're gonna spring it on is sometime in the near future? Looking forward to seeing what you come up with. I hope you don't mind my suggestions, but the only thing I like to discuss more than Mustangs is programming. I've been developing real-time systems, including the multi-tasking kernels they use, since 1986. My first was for a multi-user BBS.

09-13-2000, 01:26 AM	#13
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	Jim, I don't mind your suggestions and I do understand where you're coming from, believe me. Of course, having 100 threads does use resources, there's no doubt about it. However, on a server that has lots of memory, cache, and drive storage, using these resources is not nearly as large of a concern as the amount of CPU time being used. Since only 1 process of "X" application is running at a time really, and the others just waiting, you have one using CPU time and the others are virtually at zero usage. This is visible when monitoring processes/threads on the box in real time and taking note of their CPU usage, along with overall system load. The main box running MustangWorks.com is a Cobalt RaQ3i-512. If you aren't very familiar with it, it's a low cost but high end server with a 550 Mhz Intel based processor with a large cache, 512 Mb Ram, and a 20 Gig Ultra DMA 66 7200 RPM drive that runs the latest version of Red Hat Linux. As far as speed, it blows Win2K away and is EXTREMELY stable. In fact, MW's has gone for over a YEAR in the past on Linux without being rebooted, and was only rebooted because I was forced to after installing OS updates. So I worry less about sucking some memory and more about the load the app puts on the box when each instance does its thing. And, from experience I can tell you that the more file handling you do, the heavier the CPU load and the higher the load. However, the Perl interpreter is like 100K, and yes it is loaded for every instance. Unless you are using either Mod_Perl on Unix or PerlEX on Windows. Mod_Perl is a module for Apache that is loaded into memory when the web server loads. PerlEX is a service for Windows NT or Win2K that is essentially the Perl interpreter running like Cold Fusion does. By utilizing one of these you eliminated launching the interpreter for each instance and the resource load associated with it. In addition, once a Perl app is loaded, compiled, and run, it is not dumped from memory either. It is kept compiled and reutilized each subsequent time it is invoked. This increasing a CGI's execution time by up to 50%. On MW's main server we DO have Mod_Perl. Your observations about the scenario that corrupts a data file is probably close to the money. However, it is to be expected, and as I've stated I take precautions so that even in this case no data is lost. The locking system, in combination with my self backup technique, really keeps it from happening on all but the blue moon occasion. However, your suggestion about opening the data files with exclusive locking would be a good idea, but unfortunately is irrelevant because were on Linux. That's a feature of the Windows OS. Something that I don't get, however, is how will your queuing method get around having to perform the same file locking technique that I am now? Even if you have each instance perform the actual file updates the last instance recorded in the queue file, you still have to make a lock file to protect multiple instances from writing to the queue file now. In the end, you're just doing the same thing, but with more steps and more complexity. And, you still have processes that have to wait in line for their turn. Really, to do what you're suggesting correctly, you'd really need to create a daemon (like a service on Windows), that would be running in the background all the time. Then an instance of the regular app would just pass string commands to the daemon that contain the update to perform. The daemon would have a queue in memory that it keeps the commands in and just do them one at a time as quickly as possible, removing each command string from memory after completion. Now doing this would both eliminate the file locking system all together and eliminate waiting processes of the Perl app. However, it is all irrelevant since I'm doing all future apps in CF. And, if I were going to redo any of the apps we have now, I would also do them in CF, as well with a SQL Server back end eliminates all these issues any ways. ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton

09-13-2000, 09:17 AM	#14
joe4speed He said Member...heh, heh Join Date: Sep 1999 Location: Jupiter, Florida U.S.A. Posts: 3,718	Now my head is really spinning!!!! LOL, You guys are just chock full of information!

09-15-2000, 05:58 AM	#16
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	OK, now I understand a little bit better what you are saying. From your original message I thought what you meant was to have a central (i.e. one) que file. Of course, if that were the case each instance of the CGI would be reading and writing to it, you have to have a lock file for that, and you'd still have to use the locking file to protect the main files. Which, of course, made no sense because you're just doing what I already am now, and more. Plus you're still working on the main data files. In essence, it would just make it MORE complex and make you work with more files. However, what you are actually suggesting is that you make multiple que files. One for each update that needs to occur to the main data file. When an instance of the CGI is launched, it would read in (and delete) each que file in order of their time stamps (basically) and update whatever data each que file directed in the main data file. You would still have to use the lock file, however, because you'd still have to make sure that more than one instance is not trying to do this procedure on the main data file at the same time. Right? So, you wouldn't really totally eliminate the locking file system, but it would significantly reduce waiting processes because a process would only have to wait if que files existed. This wouldn't be nearly as often since not every instance needs to write information, the majority of the time it just needs to read it. And, you can have multiple instances (or programs) reading a file at the same time, you just can't have them writing at the same time (or reading and writing at the same time). I still think the only way to totally eliminate the locking file and need for any instances to wait would be through a daemon working as I described. Unless I'm still not seeing something in your plan. Now, in other notes, if the CGI needs to wait it uses a "sleep" command in Perl. What's giving me a chuckle is that you keep sighting Window's stuff. Unlike Windows, where as soon as you just boot to the desktop it's running dozens of darn DLL's and service processes in the background, Unix (Linux in this case) is nothing more than one central "kernel". The OS Kernal is actually very small and uses little resources. Upon boot up, the only thing running is the main Kernal and whatever daemons (services) you define and actually need. This for example on our server for MW's would be a daemon for Apache, FTP, sendmail, MySql, among a few others. Most other functions are completely autonomous modules that do everything. Such as renaming a file, moving a file, editing a file, changing file permission, X-Windows, whatever. When you type the command to move a file called "readme.txt", for example, to another directory from its currently location, which is say "mv /home/readme.txt /etc/readme.txt", it invokes an external program called "mv" which resides in the "/bin" directory. That module loads, does what you directed, and exits. In other words, all the functions of the Unix OS only run as needed and thus Unix has virtually no overhead. And, because you don't have all that crap running all the time at once, the thing is EXTREMELY stable. And this is why Linux, verses Windows on the same box, is way faster. Yeah, it's not the thing for a regular home user's desktop (yet), but it's by far the most inexpensive and ideal server solution. I'm particularly happy that Allaire finally came out with Cold Fusion for Linux now... As for these issues still occurring with a Cold Fusion and SQL Server solution, yes they are somewhat still there, but at that point its irrelevant to you because you have a heavy hitting database engine that handles it for you. ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton [This message has been edited by Dan McClain (edited 09-15-2000).]

09-15-2000, 07:04 PM	#17
jimberg Registered Member Join Date: Oct 1998 Location: Rogers, MN Posts: 2,089	I don't think I even mentioned Windows stuff in my last message. Most of my major multithreaded programming was with kernels and schedulers that I wrote. When I did programs for IIS that were fairly similar to User Rides and pounded it with 120 simultaneous users, the highest thread count I ever got was 7 threads. If everything positive about Linux that I have read is true, you should be able to get your max count to at most half of that. Regardless of how efficient the scheduler is in the OS, however, having tasks vying for resources by having them wait for an arbitrary, predetermined period of time is still a major problem because of the issue creating a race condition. I think you pretty much have what I am saying, except for the part about waiting only if queue files exist. The process will be extended a little, but would still be continuous. The program will never wait. As far as getting rid of the locking file, I went to PERL.ORG to see if I could find some info on file locking. Does the FLOCK <file handle>, 2 not work with your implementation of PERL? That would give you exclusive access to write to the file without needing to create a lock file. In the FAQ, their example was a page hit counter, so I'm pretty sure that is what I was looking for. If the FLOCK fails, run the ELSE code in the above pseudo code. If you use SQL Server 7, yes, it will be much easier.

09-15-2000, 08:23 PM	#19
jimberg Registered Member Join Date: Oct 1998 Location: Rogers, MN Posts: 2,089	I was using the term task manager instead of scheduler, sorry. I still believe that there will be a significant load in the scheduler. I guess I have Windows on the brain. Anyhoo... I thought I read that flock was a portable command based on the Unix flock, not necessarily a direct pass-through. If you use it on Windows NT/2000 it should work (according to the documentation that I read). The difference, though, is that flock on Linux is only advisory while on NT/2000 it would enforced. Your code would still be portable.

09-16-2000, 06:49 AM	#20
StangFlyer Founder Join Date: Jun 1995 Location: Michigan Posts: 19,326	I'd have to check into it then. But, to my knowledge, it was not supported on Windows (in the past). ------------------ Dan McClain, Editor The Mustang Works Magazine 1991 Mustang GT - NOVI Supercharged 377 Stroker 1999 Ford Lightning SVT - Supercharged 5.4L Triton