Maintaining 100 threads consumes a considerable amount of resources regardless of whether or not they are waiting: the context switching, the OS overhead of deciding which thread to run next, the memory needed to keep each instance alive, and so on. What's the size of the PERL interpreter? Does a new interpreter get loaded for each object instance? The OS overhead alone is significant and will always be a problem no matter what you use for development. The overhead will vary from OS to OS. BeOS and Linux are probably more efficient than Windows 2000 in the way they handle threads.
If you had access to system objects and could create a mutex, I would suggest using that instead of a lock file. That way, the waiting thread would go active as soon as the mutex was released rather than sleeping for a full second. I don't think you have those as an option in PERL, though.
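(I could be wrong about that last part. If your server happens to have the Win32::Mutex module installed, the idea would look roughly like this; the mutex name and the module's availability are just my assumptions.)

    use Win32::Mutex;

    # One named mutex shared by every CGI process/thread on the box.
    my $mutex = Win32::Mutex->new(0, "RideDataLock");

    $mutex->wait();        # blocks until the mutex is free, with no polling
    # ... update the shared data file here ...
    $mutex->release();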
Your data corruption problems probably occur in that tiny span of time between when one thread checks for the lock file and when it actually creates it. (Thread 1 sees that the lock file isn't there. A context switch occurs. Thread 2 sees that the lock file isn't there. Thread 2 creates the lock file. Context switch. Thread 1 creates the lock file.) Both threads think they have it locked, and then they blast the data. You might try simply opening the file for exclusive access and going into your wait loop if you get a sharing violation on the other thread. Keep waiting and retrying the open until you succeed, then do what you need to do with the file. This way the OS is handling the synchronization, not you. Doing this should at least resolve your data corruption problems and should be fairly easy to implement.
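In Perl, the closest equivalent I know of is flock(). Something along these lines should work (the file name and the update step are just placeholders):

    use Fcntl qw(:flock);

    open(RIDES, "+< rides.dat") or die "Can't open rides.dat: $!";

    # Ask the OS for an exclusive lock. LOCK_NB makes flock return
    # immediately instead of blocking, so we can retry in a short loop.
    until (flock(RIDES, LOCK_EX | LOCK_NB)) {
        select(undef, undef, undef, 0.1);   # brief pause, then try again
    }

    # ... read and update the file here ...

    flock(RIDES, LOCK_UN);
    close(RIDES);

You could also drop LOCK_NB and let flock() block until the lock is granted, which eliminates the polling entirely.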
I still don't think it's ever a good idea to wait for an arbitrary period. The queue I'm suggesting would consist of tiny little files with very little overhead. They'd contain just enough information to perform the requested update. For example, the most frequent update is incrementing the number of times a user's ride has been viewed; that could be represented by something as simple as "jimberg+". You have the lookup key and the action. The filename could carry a counter that keeps the requests in the proper order. I know from experience that this works in practice.
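A rough sketch of what I mean (the directory layout, counter file, and record format are just my guesses at how you might lay it out):

    use Fcntl qw(:DEFAULT :flock);

    # Writer side: drop one tiny request file into the queue directory.
    sub queue_update {
        my ($key, $action) = @_;              # e.g. ("jimberg", "+")

        # Bump a shared counter so the filenames sort in request order.
        sysopen(SEQ, "queue/seq.cnt", O_RDWR | O_CREAT) or die "seq: $!";
        flock(SEQ, LOCK_EX);
        my $n = <SEQ> || 0;
        seek(SEQ, 0, 0);
        print SEQ $n + 1;
        close(SEQ);                           # releases the lock

        # Zero-padded name keeps a plain sort in request order.
        open(REQ, "> " . sprintf("queue/%010d.req", $n)) or die "req: $!";
        print REQ "$key$action";
        close(REQ);
    }

    # Reader side: one dedicated process applies requests in filename order.
    foreach my $file (sort glob("queue/*.req")) {
        open(REQ, "< $file") or next;
        my $line = <REQ>;
        close(REQ);
        # ... parse the key and action, apply the update ...
        unlink($file);
    }

Since only the reader ever touches the real data file, the writers never step on each other and the corruption problem goes away.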
I think the queueing solution would actually reduce overhead by a significant amount and improve reliability. That's irrelevant, though, since you're planning to move to SQL Server and Cold Fusion. Does that mean you're rewriting everything, or just developing the new stuff that way? Or do you already have everything rewritten and you're gonna spring it on us sometime in the near future?

Looking forward to seeing what you come up with.
I hope you don't mind my suggestions, but the only thing I like to discuss more than Mustangs is programming. I've been developing real-time systems, including the multi-tasking kernels they use, since 1986. My first was for a multi-user BBS.