19.3 Flawed Locking Methods
The suggested locking methods in the first and second editions of the book Programming Perl (O'Reilly) and the DB_File manpage (before Version 1.72, fixed in 1.73) are flawed. If you use them in an environment where more than one process can modify the DBM file, it can be corrupted. The following is an explanation of why this happens.
You cannot safely use a tied file's filehandle for locking, because you obtain the filehandle only after the file has been tied, and by then it is too late: the database file has already been opened before it is locked. When the database is opened, the first 4 KB (for the Berkeley DB library, at least) are read and then cached in memory. Therefore, a process can open the database file, cache the first 4 KB, and then block while another process writes to the file. If the second process modifies the first 4 KB of the file, the original process has an inconsistent view of the database by the time it gets the lock. If it writes using this view, it may easily corrupt the database on disk.
This problem can be difficult to trace because it does not cause corruption every time a process has to wait for a lock. One can do quite a bit of writing to a database file without actually changing the first 4 KB. But once you suspect this problem, you can easily reproduce it by making your program modify the records in the first 4 KB of the DBM file.
It is better to rely on the standard locking modules than to invent your own scheme.
If your DBM file is used only in read-only mode, there is generally no need for locking at all. If you access the DBM file in read/write mode, the safest method is to acquire an external lock before tying the DBM file, and to untie the file before releasing the lock. So to access the file in shared mode (LOCK_SH, as exported by the Fcntl module), follow this pseudocode:

    flock $fh, LOCK_SH    <=== start critical section
    tie...
    read...
    untie...
    flock $fh, LOCK_UN    <=== end critical section
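The shared-lock pseudocode might translate into Perl roughly as follows. This is a minimal sketch, not the only way to do it: it assumes SDBM_File (chosen only because it ships with every Perl; the same ordering applies to DB_File), and the paths and the read_value( ) helper are hypothetical names made up for the illustration. The point to notice is that the lock is taken on a separate lock file, never on the tied DBM file itself.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDONLY O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical demo paths; a real application would use fixed locations.
my $dir       = tempdir(CLEANUP => 1);
my $dbm_path  = "$dir/demo";        # SDBM_File creates demo.pag and demo.dir
my $lock_path = "$dir/demo.lock";   # external lock file, never tied

# Read one value: lock first, then tie; untie before unlocking.
sub read_value {
    my ($key) = @_;
    open my $lock_fh, '+>>', $lock_path or die "open $lock_path: $!";
    flock($lock_fh, LOCK_SH) or die "flock LOCK_SH: $!";   # start critical section
    tie my %dbm, 'SDBM_File', $dbm_path, O_RDONLY, 0644
        or die "tie $dbm_path: $!";
    my $value = $dbm{$key};
    untie %dbm;
    flock($lock_fh, LOCK_UN) or die "flock LOCK_UN: $!";   # end critical section
    close $lock_fh;
    return $value;
}

# Demo setup only: seed the file (under an exclusive lock) so there is data to read.
{
    open my $lock_fh, '+>>', $lock_path or die "open $lock_path: $!";
    flock($lock_fh, LOCK_EX) or die "flock LOCK_EX: $!";
    tie my %dbm, 'SDBM_File', $dbm_path, O_RDWR|O_CREAT, 0644
        or die "tie $dbm_path: $!";
    $dbm{colour} = 'blue';
    untie %dbm;
    flock($lock_fh, LOCK_UN) or die "flock LOCK_UN: $!";
    close $lock_fh;
}

print read_value('colour'), "\n";   # prints "blue"
```

Because the tie happens inside the function, every call sees the file as it is after the lock has been granted, never a copy cached before the lock.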
Similarly, for exclusive (LOCK_EX) write access:

    flock $fh, LOCK_EX    <=== start critical section
    tie...
    write...
    sync...
    untie...
    flock $fh, LOCK_UN    <=== end critical section
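A write under an exclusive lock could look something like the sketch below. As before, SDBM_File, the temporary paths, and the write_value( ) helper are assumptions made for the sake of a runnable illustration. The sketch has no explicit sync step because it does not keep the tied object around; with DB_File you would keep the object returned by tie( ) and call its sync( ) method before untying.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical demo paths; a real application would use fixed locations.
my $dir       = tempdir(CLEANUP => 1);
my $dbm_path  = "$dir/demo";
my $lock_path = "$dir/demo.lock";   # external lock file, never tied

# Store one value: exclusive lock, tie, write, untie, unlock, in that order.
sub write_value {
    my ($key, $value) = @_;
    open my $lock_fh, '+>>', $lock_path or die "open $lock_path: $!";
    flock($lock_fh, LOCK_EX) or die "flock LOCK_EX: $!";   # start critical section
    tie my %dbm, 'SDBM_File', $dbm_path, O_RDWR|O_CREAT, 0644
        or die "tie $dbm_path: $!";
    $dbm{$key} = $value;
    # With DB_File, this is where sync( ) would be called on the tied object.
    untie %dbm;                                            # closes the DBM file
    flock($lock_fh, LOCK_UN) or die "flock LOCK_UN: $!";   # end critical section
    close $lock_fh;
    return;
}

write_value(colour => 'blue');
write_value(colour => 'green');   # a later writer simply overwrites the value
```

Each call pays for its own tie( ) and untie( ), which is the price of never holding a view of the file that predates the lock.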
You might want to save a few tie( )/untie( ) calls if the same request accesses the DBM file more than once. Be careful, though. Based on the caching effect explained above, a process can perform an atomic downgrade of an exclusive lock to a shared one without retying the file, because it already has the updated data in its cache:

    flock $fh, LOCK_EX    <=== start critical section
    tie...
    write...
    sync...
                          <=== end critical section
    flock $fh, LOCK_SH    <=== start critical section
    read...
    untie...
    flock $fh, LOCK_UN    <=== end critical section

By atomic, we mean that the lock status is guaranteed to change without any other process getting exclusive access in between. Not every flock( ) implementation gives this guarantee, so check your platform's documentation before relying on it.
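The downgrade sequence can be sketched in Perl as shown here, assuming a platform on which converting an exclusive flock( ) to a shared one is atomic. SDBM_File and the temporary paths are again assumptions chosen only to keep the example runnable; with DB_File you would call sync( ) on the tied object just before the downgrade.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical demo paths; a real application would use fixed locations.
my $dir       = tempdir(CLEANUP => 1);
my $dbm_path  = "$dir/demo";
my $lock_path = "$dir/demo.lock";   # external lock file, never tied

open my $lock_fh, '+>>', $lock_path or die "open $lock_path: $!";

flock($lock_fh, LOCK_EX) or die "flock LOCK_EX: $!";   # start exclusive section
tie my %dbm, 'SDBM_File', $dbm_path, O_RDWR|O_CREAT, 0644
    or die "tie $dbm_path: $!";
$dbm{counter} = ($dbm{counter} // 0) + 1;              # write
# With DB_File, sync( ) would be called here so readers see the change.

# Downgrade without untying: our view already contains our own update.
flock($lock_fh, LOCK_SH) or die "flock LOCK_SH: $!";   # start shared section
my $seen = $dbm{counter};                              # read only from here on

untie %dbm;                                            # untie before unlocking
flock($lock_fh, LOCK_UN) or die "flock LOCK_UN: $!";   # end shared section
close $lock_fh;

print "counter is $seen\n";   # prints "counter is 1" on a fresh file
```

After the downgrade the process must restrict itself to reads, since other processes may now hold shared locks and read the file at the same time.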
If you can ensure that one process safely upgrades a shared lock to an exclusive lock, you can save the overhead of doing the extra tie( ) and untie( ). But this operation might lead to a deadlock if two processes try to upgrade from shared to exclusive locks at the same time: to acquire an exclusive lock, a process must wait until all other processes have released all their locks, so each of the two processes keeps its shared lock while waiting for the other to release its own. If your OS's locking implementation resolves this deadlock by denying one of the upgrade requests, make sure your program handles that appropriately. The process that was denied has to untie the DBM file and then ask for an exclusive lock.
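One way to handle a denied upgrade is to request it in non-blocking mode and fall back to the full untie, unlock, relock, retie sequence when the request fails. The following is only a sketch under several assumptions: SDBM_File and the temporary paths stand in for a real setup, increment( ) is a hypothetical helper, and LOCK_NB is used so that a contended upgrade fails instead of blocking. On some platforms a failed conversion may also leave the process without its original shared lock; the fallback path copes with either case because it starts again from an unlocked, untied state.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical demo paths; a real application would use fixed locations.
my $dir       = tempdir(CLEANUP => 1);
my $dbm_path  = "$dir/demo";
my $lock_path = "$dir/demo.lock";   # external lock file, never tied

# Demo setup only: create an empty DBM file under an exclusive lock.
{
    open my $fh, '+>>', $lock_path or die "open $lock_path: $!";
    flock($fh, LOCK_EX) or die "flock LOCK_EX: $!";
    tie my %new, 'SDBM_File', $dbm_path, O_RDWR|O_CREAT, 0644
        or die "tie $dbm_path: $!";
    untie %new;
    flock($fh, LOCK_UN) or die "flock LOCK_UN: $!";
    close $fh;
}

# Start with a shared lock, then try to upgrade in place before writing.
sub increment {
    my ($key) = @_;
    open my $lock_fh, '+>>', $lock_path or die "open $lock_path: $!";
    flock($lock_fh, LOCK_SH) or die "flock LOCK_SH: $!";
    tie my %dbm, 'SDBM_File', $dbm_path, O_RDWR, 0644
        or die "tie $dbm_path: $!";

    my $how = 'upgraded';
    unless (flock($lock_fh, LOCK_EX | LOCK_NB)) {
        # Upgrade denied: drop everything, wait for the lock, then retie.
        untie %dbm;
        flock($lock_fh, LOCK_UN);
        flock($lock_fh, LOCK_EX) or die "flock LOCK_EX: $!";
        tie %dbm, 'SDBM_File', $dbm_path, O_RDWR, 0644
            or die "tie $dbm_path: $!";
        $how = 'retied';
    }

    my $new = ($dbm{$key} // 0) + 1;   # read and write under the exclusive lock
    $dbm{$key} = $new;
    untie %dbm;                        # untie before unlocking
    flock($lock_fh, LOCK_UN) or die "flock LOCK_UN: $!";
    close $lock_fh;
    return ($new, $how);
}

my ($count, $how) = increment('hits');
print "hits=$count ($how)\n";
```

The value is read only after the exclusive lock is held, so the fallback path never acts on data fetched through the earlier tie.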
A DBM file always has to be untied before the lock is released (unless you do an atomic downgrade from exclusive to shared, as we have just explained). Remember that whenever a process acquires a lock in order to access the DBM file, it has to retie the file if it was already tied. If this is not done, the integrity of the DBM file is not ensured.