Automating Simple Tasks with Scheme (Competing with Perl, Python and Ruby)
Compare that with the types of simple programs we see in Perl and Python. “I have a bunch of files, and I want to rename them all according to some pattern.” Common problem, easy solution. “I’ve got a log file full of email addresses, I need to strip them out from the log entries, remove duplicates, and add them to a database.” Again, fairly simple, fairly small, really useful. When Haskell can compete on those types of problems, it’ll be easier to induce people to learn it. (Same with CL, my fav language….)
So here is a Scheme program that does this. It is written for MzScheme because that’s the only Scheme I have installed on Windows at the moment. Thus, it takes advantage of PLaneT and the other libraries that come with MzScheme.
Hopefully this can convince others that Scheme is a good language for common tasks.
I have a bunch of images I’ve uploaded from a digital camera and they all have the horrendous DSCxxxx.JPG pattern. Let’s rename them to include an image collection name and the date they were taken. The common way of including meta-data with an image is by using EXIF. I won’t go into all of the details of the format since that’s unimportant to the example. All we need to know is that the meta-data is at the top of the file, the file is binary, and we’re looking for the date and time stamp.
The date and time stamp comes after the word “Exif” at some point in the file, and is in the form of:
Thus our regular-expression will look like this:
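As a minimal sketch, assuming the time stamp uses the standard EXIF layout YYYY:MM:DD HH:MM:SS, such an expression could be written as follows in current Racket (the successor to MzScheme). The names exif-date-rx and find-exif-date are made up for illustration, and the sample line is invented rather than taken from a real image.

```racket
#lang racket

;; Assumption: the time stamp follows the standard EXIF layout
;; "YYYY:MM:DD HH:MM:SS". Each field gets its own capture group.
;; exif-date-rx and find-exif-date are hypothetical names.
(define exif-date-rx
  #px"(\\d{4}):(\\d{2}):(\\d{2}) (\\d{2}):(\\d{2}):(\\d{2})")

;; Returns the match list (whole match first, then year, month, day,
;; hour, minute, second), or #f when the line has no time stamp.
(define (find-exif-date line)
  (regexp-match exif-date-rx line))

;; Invented sample line standing in for a line read from an image file.
(define winner (find-exif-date "Exif..SONY..2007:08:01 14:05:33.."))

(list-ref winner 1) ; the year, "2007"
(list-ref winner 2) ; the month, "08"
```

Because the year and month are the first two capture groups, they sit at positions 1 and 2 of the match list, with the whole match at position 0.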
We’ll rename the files to look like this:
collectionName [year_month] imageId
For example, the file DSC02484.JPG would be renamed to Canada Day [2007_08] DSC02484.JPG.
When looping through files, we’ll need to make sure we’re renaming the right ones. So we need yet another regular expression to check for DSC at the beginning and the JPG file extension at the end. It looks like this:
The prefix for images uploaded from your camera may be different, but I use a Sony cam and “DSC” is the typical prefix.
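A check like this could be sketched as follows in current Racket. This is only an illustration under my own assumptions: the name camera-file? is hypothetical, the match is case-sensitive, and the sample file names are invented.

```racket
#lang racket

;; Assumption: camera files start with "DSC" and end with ".JPG"
;; (case-sensitive). camera-file? is a hypothetical name.
(define camera-file-rx #px"^DSC.*\\.JPG$")

(define (camera-file? name)
  (regexp-match? camera-file-rx name))

;; In a real run the names would come from the directory listing, e.g.
;;   (filter camera-file? (map path->string (directory-list)))
;; Here an invented list keeps the example self-contained.
(define to-rename
  (filter camera-file? '("DSC02484.JPG" "notes.txt" "DSC0001.JPG.bak")))

to-rename ; only "DSC02484.JPG" passes the check
```

Anchoring the pattern at both ends is what keeps a file such as DSC0001.JPG.bak out of the list.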
The resulting code is: [gist 5942564]
The first line imports the necessary libraries: pregexp.ss (the Perl-compatible regular expressions library), list.ss, which contains list manipulation functions, and file.ss, which has useful functions for directory and file manipulation. One of the useful functions used in our script is directory-list, which returns a list of all files and directories in a path (the current directory is used if the path is omitted).
The first two functions are simply utility functions that I did not want to place in another file. They read all lines from a file and turn them into a list for use with for-each and other list functions.
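Utility functions of this kind might look roughly like the sketch below, written for current Racket. The names read-lines and file-lines are my own placeholders, not necessarily the ones used in the script.

```racket
#lang racket

;; read-lines: collect every line from an input port into a list.
;; (Hypothetical name; the script's own helpers may differ.)
(define (read-lines in)
  (let loop ([acc '()])
    (define line (read-line in))
    (if (eof-object? line)
        (reverse acc)
        (loop (cons line acc)))))

;; file-lines: open a file, read all of its lines, close it again.
(define (file-lines path)
  (call-with-input-file path read-lines))

;; A string port stands in for a file so the example runs anywhere.
(define sample (read-lines (open-input-string "one\ntwo\nthree")))

(for-each displayln sample)
```

Once the lines are in a list, for-each, filter and the other list functions can be applied to them directly.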
The next function was explained above and is used with the filter function to filter out the files that we don’t need to rename. After this function, we set the name of the collection to whatever we want and then grab a list of files that pass the
After this, we loop through the files, and then loop through the lines looking for the EXIF date/time format. If we find the format, we rename the file. We have to use the build-path function to ensure that the file name separators work on the platform we’re on. The function list-ref is used to select the parts of the date/time format that we would like to use: (list-ref winner 1) selects the year, and (list-ref winner 2) selects the month.
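The renaming step can be sketched like this in current Racket. It is a rough illustration rather than the script itself: new-image-name and rename-image! are hypothetical names, and the winner list is written out by hand in the shape a successful match would have.

```racket
#lang racket

;; Build "collectionName [year_month] imageId" from a match list.
;; new-image-name is a hypothetical helper name.
(define (new-image-name collection winner old-name)
  (format "~a [~a_~a] ~a"
          collection
          (list-ref winner 1)   ; year
          (list-ref winner 2)   ; month
          old-name))

;; Rename a file inside dir; build-path supplies the separator
;; that is right for the current platform. Defined but not called here.
(define (rename-image! dir collection winner old-name)
  (rename-file-or-directory
   (build-path dir old-name)
   (build-path dir (new-image-name collection winner old-name))))

;; Hand-written match list: whole match first, then the six fields.
(define winner '("2007:08:01 14:05:33" "2007" "08" "01" "14" "05" "33"))

(define new-name (new-image-name "Canada Day" winner "DSC02484.JPG"))

new-name ; "Canada Day [2007_08] DSC02484.JPG"
```

Keeping the name building separate from the actual rename makes it easy to check the new names before touching any files.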
Clarifications will be provided if asked for. This code seems self-explanatory, but I’m sure Perl/C++/Java developers feel the same way about their most obfuscated code.
It Was Like So, But Wasn’t
Languages such as Perl, Python and Ruby enable you to write throwaway scripts. Perl has been called a write-only language because you write a script to fix something and then toss it away because you can easily re-write it the next time it’s needed. The philosophy of those languages is to enable you to write simple, common programs as easily as possible. I noticed that when I wrote the Scheme script, I looked for utility functions that I could re-use in the future when needed.
The other day, I wrote a small Python script to replace specific lines in certain files. When I wrote it, all I could think about was fixing the damned files. It was so easy that I didn’t think about how I would do this in the future for other types of files if I needed to. The answer is simple: make a function that takes a filename and a list of lists where each sub-list contains the string to match and the string to replace it with. This answer came easily after the pressure was off. But with Scheme, I am always looking for macros and functions to create to ease potential problems I may have.
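For comparison, here is how that reusable function might look in Scheme (current Racket) rather than Python. It is a sketch under assumptions of my own: the names replace-line and replace-in-file! are hypothetical, and each match string is replaced wherever it occurs in a line.

```racket
#lang racket

;; replacements is a list of sub-lists: (string-to-match replacement).
;; Apply every pair, in order, to one line.
(define (replace-line line replacements)
  (for/fold ([result line])
            ([pair (in-list replacements)])
    (string-replace result (first pair) (second pair))))

;; Rewrite a file in place with the replacements applied to each line.
;; Defined but not called here, so the example touches no files.
(define (replace-in-file! filename replacements)
  (define new-lines
    (for/list ([line (in-list (file->lines filename))])
      (replace-line line replacements)))
  (display-lines-to-file new-lines filename #:exists 'truncate))

(define fixed
  (replace-line "host=old port=80" '(("old" "new") ("80" "8080"))))

fixed ; "host=new port=8080"
```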
In conclusion, the philosophies and communities surrounding the Lisp and the Perl/Python/Ruby families are different, and this can be seen clearly in the weak attempts at emulating CPAN and the lack of common tasks being done with Lisp. But as I’ve shown, it can be done…sorta 😛