Why is file locking necessary?

When reading and writing to a file, an operating system maintains a file pointer that is the offset from the start of the file where the next read or write operation will occur. There can be separate file pointers for reading and writing.

Each time a process opens a file, the operating system associates a file pointer for reading and writing with that process. If another process opens the same file, then another set of file pointers is created. Every opened file has its own file pointers.

Since the file pointers are not shared, two processes can write to the same area of a file, and the last process to write wins. The data written by the first process is lost.
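
The effect of independent file pointers can be sketched inside a single program by opening the same file twice. This is only an illustration: the class name IndependentPointers and the method overlap are made up for this sketch, it assumes Java 7 or later (try-with-resources), and it uses two handles in one process as a stand-in for two separate processes.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class IndependentPointers {
    // Writes through two separate handles on the same (empty) file and
    // returns the final file contents. "overlap" is a hypothetical helper.
    static String overlap(File file) throws IOException {
        try (RandomAccessFile first = new RandomAccessFile(file, "rw");
             RandomAccessFile second = new RandomAccessFile(file, "rw")) {
            // Both handles seek to the current end of the file, offset 0.
            first.seek(first.length());
            second.seek(second.length());

            // The first handle writes two bytes; its pointer moves to 2.
            first.write("aa".getBytes(StandardCharsets.US_ASCII));

            // The second handle's pointer is still 0, so this write
            // lands on top of the first 'a'.
            second.write("b".getBytes(StandardCharsets.US_ASCII));
        }
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("pointers", ".txt");
        file.deleteOnExit();
        System.out.println(overlap(file)); // prints "ba"
    }
}
```

The first byte written through the first handle is overwritten because the second handle never saw the first handle's pointer move.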

Loss of data demonstration

The NoFileLocking class illustrates how data can be lost when multiple processes write to the same file at the same time.

NoFileLocking.java
import java.io.*;

public class NoFileLocking {
    public static void main( String[] args ) {
	int delay = 1000;
	int len = 5;
	String id = args[0];
	String file = args[1];
	String what = args[2];
	try {
	    RandomAccessFile fw = new RandomAccessFile( file, "rws" );
	    fw.seek( fw.length() );

	    for( int i = 0; i < len ; i++ ) {
		Thread.sleep( delay );
		for( int j = 0; j < what.length(); j++ ) {
		    fw.write( (byte)what.charAt(j) );
		}
		System.out.println( id + " " + fw.getFilePointer() );
	    }
	    fw.close();
	}
	catch( Exception ex ) {
	}
    }
}

Thread.sleep slows each process down so that, when both are started within a few seconds of each other, their writes to the same file overlap in time.

Loss of data demonstration (1)

Two terminal windows are needed to demonstrate the data loss. In one window, type the command:

java NoFileLocking A foo a

In the second window, type the following command after two seconds.

java NoFileLocking B foo b

Loss of data demonstration (2)

The output of the first command is:

A 1
A 2
A 3
A 4
A 5

The second command produces:

B 3
B 4
B 5
B 6
B 7

Notice that process A and process B overlap at file pointers 3, 4, and 5. The contents of foo are:

foo.save
aabbbbb

Which process won? Why? If the processes did not overlap, what should the contents of foo be?

Preventing shared access

The data is lost because two processes are writing to the same file using different sets of file pointers. This problem can be avoided by preventing shared access. Java provides file locking, which ensures that only one process can access a file at a time. The modified example with file locking is:

FileLocking.java
import java.io.*;
import java.nio.channels.*;

public class FileLocking {
    public static void main( String[] args ) {
	int delay = 1000;
	int len = 5;
	String id = args[0];
	String file = args[1];
	String what = args[2];
	try {
	    RandomAccessFile fw = new RandomAccessFile( file, "rws" );
	    FileChannel chan = fw.getChannel();
	    FileLock lock = chan.lock();
	    fw.seek( fw.length() );

	    for( int i = 0; i < len ; i++ ) {
		Thread.sleep( delay );
		for( int j = 0; j < what.length(); j++ ) {
		    fw.write( (byte)what.charAt(j) );
		}
		System.out.println( id + " " + fw.getFilePointer() );
	    }
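	    // Release the lock while the channel is still open;
	    // release() on a closed channel throws ClosedChannelException.
	    lock.release();
	    fw.close();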
	}
	catch( Exception ex ) {
	    // XXX what is wrong here
	}
    }
}

Repeating the previous experiment

The output from process A with the command java FileLocking A bar a is:

A 1
A 2
A 3
A 4
A 5

The output for the command java FileLocking B bar b is:

B 6
B 7
B 8
B 9
B 10

The output from B does not start until A has finished and released the lock. The contents of bar are:

bar.save
aaaaabbbbb

No data has been lost, but are there any problems with this approach?

Java file locking API

Since JDK 1.4, Java has provided file locking in the java.nio.channels package. A FileChannel object is required to perform file locking. A file channel object can be obtained with the getChannel method defined in classes such as RandomAccessFile (as used in FileLocking above), FileInputStream, and FileOutputStream.

A file is locked with FileLock lock(). If lock() is invoked while another process holds the lock, lock() blocks until the lock is released.

FileLock tryLock() attempts to lock a file and will return null if the file is already locked. tryLock is used if the process should not be blocked.
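
A minimal sketch of the non-blocking pattern follows. The class name TryLockSketch and the helper appendIfFree are hypothetical, and Java 7 or later is assumed. Note that the null result applies when another process holds the lock; if the same Java virtual machine already holds an overlapping lock, tryLock throws OverlappingFileLockException instead.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;

public class TryLockSketch {
    // Appends text only if the lock can be obtained immediately.
    // Returns true if the write happened, false if the file was locked
    // by another process. "appendIfFree" is a hypothetical helper.
    static boolean appendIfFree(File file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) {
                return false; // someone else holds the lock; do not wait
            }
            try {
                raf.seek(raf.length());
                raf.write(text.getBytes(StandardCharsets.US_ASCII));
                return true;
            } finally {
                lock.release(); // released before the file is closed
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("trylock", ".txt");
        file.deleteOnExit();
        System.out.println("written: " + appendIfFree(file, "a"));
    }
}
```

A caller that gets false back can do other work and retry later, rather than sitting blocked in lock().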

The FileLock class provides the release() method to release the lock. Once the lock is released, one of the processes (if any) waiting for the lock will be awoken and given the lock.
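
One way to make sure release() always runs, and runs before the file is closed, is a try/finally block. The sketch below is an assumption-laden illustration, not part of the original examples: ReleaseSketch and appendLocked are hypothetical names, and Java 7 or later is assumed. It also uses FileLock.isValid() to show that the lock is no longer held after release().

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;

public class ReleaseSketch {
    // Appends text while holding an exclusive lock and reports whether
    // the lock is still valid after release(). "appendLocked" is a
    // hypothetical helper.
    static boolean appendLocked(File file, String text) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            FileLock lock = raf.getChannel().lock(); // blocks until granted
            try {
                raf.seek(raf.length());
                raf.write(text.getBytes(StandardCharsets.US_ASCII));
            } finally {
                // Runs even if the write fails, and before raf is closed.
                lock.release();
            }
            return lock.isValid(); // false once the lock has been released
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("release", ".txt");
        file.deleteOnExit();
        System.out.println("valid after release: " + appendLocked(file, "a"));
    }
}
```

Releasing in finally means an exception during the write cannot leave other processes waiting on a lock that is only freed when the process exits.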