Monday, April 07, 2003

Manipulating Files and Directories & Process Management (S&P -- Chapters 13 and 14)

Consider the following problem -- we have a large collection of files (they could be MP3s, spreadsheet files, telemetry data, etc.) spread across multiple nested subdirectories of some top-level directory, and we want to copy all these files to a new directory hierarchy. The new directory hierarchy is relatively flat -- there are only 26 directories, named A, B ... Z. The directory to which a file is copied depends upon the first letter that appears in the file's name. Therefore, the file named some_music.mp3 would be copied to the directory named S, and the file 03_Hello.mp3 would be copied to the directory named H.
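
The naming rule can be tried on its own before looking at the full script. The following is only a minimal sketch; target_dir_for is a hypothetical helper name that does not appear in the script below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# target_dir_for is a hypothetical helper, used here for illustration only.
# It returns the uppercased first ASCII letter found in a file name,
# or undef if the name contains no letter at all.
sub target_dir_for {
    my ($name) = @_;
    my ($first) = $name =~ /([a-z])/i;
    return defined $first ? uc $first : undef;
}

for my $name ('some_music.mp3', '03_Hello.mp3') {
    printf "%s -> %s\n", $name, target_dir_for($name);
}
# prints:
#   some_music.mp3 -> S
#   03_Hello.mp3 -> H
```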

This problem can be solved with the Perl script given below. The script does some minor directory management and also invokes an external process (namely, the find command) to list the files. It also demonstrates a couple of routines for filename manipulation provided by Perl's File::Basename and File::Spec modules. Note that the script creates hard links (with link) rather than duplicate copies, so the new hierarchy must reside on the same filesystem as the original files.
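
Before the full listing, the two filename routines can be tried in isolation. This is a small sketch, assuming a Unix-style system; the path is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

my $path = '/music/rock/03_Hello.mp3';       # made-up example path

# basename drops the directory part of a path.
my $name = basename($path);

# catfile joins a directory and a file name using the
# separator of the current platform.
my $dest = File::Spec->catfile('H', $name);

print "$name\n";   # 03_Hello.mp3
print "$dest\n";   # H/03_Hello.mp3 on Unix
```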


#!/usr/bin/perl -w

use strict;

use File::Basename;
use File::Spec;

my $pattern = '\.mp3$';

defined(my $topdir = shift @ARGV) or die "Must specify directory!\n";
die "Must specify absolute path!\n" if substr($topdir, 0, 1) ne "/";

# \Q...\E backslash-escapes shell metacharacters (spaces, quotes, ...) in $topdir.
for (split /\x0/, `find \Q$topdir\E -type f -print0`) {
	my $basename = basename $_;
	if ($basename !~ /$pattern/i) {
		warn "File not matched '$_'!\n";
		next;
	}
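	# Search only the part before the extension; otherwise the "m" of
	# ".mp3" would always match and the check below could never fail.
	(my $stem = $basename) =~ s/$pattern//i;
	my ($first) = ($stem =~ m/([a-z])/i);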
	if (!defined $first) {
		warn "No alphabetic character to use for file '$_'!\n";
		next;
	}
	my $dir = uc $first;
	if (! -d $dir) {
		mkdir $dir, 0700 or die "Cannot make directory '$dir': $!\n";
	}
	my $fname = File::Spec->catfile($dir, $basename);
	link $_, $fname or die "Cannot create link from '$_' to '$fname': $!\n";
}
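
The script above runs find through the shell using backticks. As a variation (not the approach taken in the script), the same file list can be read from a pipe opened in list form, which bypasses the shell entirely. The sketch below assumes a find that supports -print0 (GNU and BSD versions do) and builds a throwaway directory tree with File::Temp so that it can be run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a small scratch tree; the file names are made up for the example.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "Cannot make directory '$top/sub': $!\n";
for my $f ("$top/a b.mp3", "$top/sub/c.mp3") {
    open(my $fh, '>', $f) or die "Cannot create '$f': $!\n";
    close $fh;
}

# List-form pipe open: find is run directly, without a shell,
# so spaces and other odd characters in $top need no quoting.
open(my $find, '-|', 'find', $top, '-type', 'f', '-print0')
    or die "Cannot run find: $!\n";

# Read NUL-terminated records and strip the trailing NUL from each.
my @files = do { local $/ = "\0"; map { chomp; $_ } <$find> };
close $find or die "find failed\n";

print "$_\n" for sort @files;
```

Each element of @files is a complete path, so the loop body of the script could process these names unchanged.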


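The main points of the above script can be summarized as follows:

- The top-level directory is taken from the command line and must be given as an absolute path.
- The external find command is run to list every regular file below that directory; its NUL-separated output (-print0) is split into a list of path names.
- basename (from File::Basename) strips the directory part of each path, and files whose names do not match $pattern are skipped with a warning.
- The first letter found in the file name, uppercased, names the target directory, which is created in the current working directory with mkdir if it does not already exist.
- File::Spec->catfile builds the new path name, and link creates a hard link to the original file under that name.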

Last modified: Tue Apr 8 00:10:38 2003