Automating Simple Tasks with Scheme (Competing with Perl, Python and Ruby)


A reddit user by the name of alanshutko described what it would take to make Scheme, Common Lisp, Haskell, and other non-mainstream languages more appealing to the average programmer:

Compare that with the types of simple programs we see in Perl and Python. “I have a bunch of files, and I want to rename them all according to some pattern.” Common problem, easy solution. “I’ve got a log file full of email addresses, I need to strip them out from the log entries, remove duplicates, and add them to a database.” Again, fairly simple, fairly small, really useful. When Haskell can compete on those types of problems, it’ll be easier to induce people to learn it. (Same with CL, my fav language….)

So here is a Scheme program that handles the first of those tasks, renaming files. It is written for MzScheme because that’s the only Scheme I have installed on Windows at the moment. Thus, it takes advantage of PLaneT and the other libraries that come with MzScheme.

Hopefully this can convince others that Scheme is a good language for common tasks.

Renaming files in Scheme

I have a bunch of images I’ve uploaded from a digital camera and they all have the horrendous DSCxxxx.JPG pattern. Let’s rename them to include an image collection name and the date they were taken. The common way of including meta-data with an image is by using EXIF. I won’t go into all of the detail of the format since that’s unimportant to the example. All we need to know is that the meta-data is at the top of the file, the file is binary, and we’re looking for the date and time stamp.

Regular Expressions

The date and time stamp comes after the word “Exif” at some point in the file, and is in the form of:

YYYY:MM:DD hh:mm:ss

Thus our regular expression will look like this:

Exif.+(\d{4}):(\d{2}):(\d{2}).(\d{2}):(\d{2}):(\d{2})

Here is a good reference that will help when deciphering the above regular expression.
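As a minimal sketch of how that pattern behaves, here it is written as a byte-regexp literal in current Racket (the successor to MzScheme) and run against a made-up byte string. The sample data is invented for illustration and is not a real EXIF header; the name winner is borrowed from the walkthrough later in this post.

```racket
#lang racket

;; The EXIF pattern from above as a byte regexp, since the file is binary.
;; Backslashes are doubled because this is a Racket literal.
(define exif-date-rx
  #px#"Exif.+(\\d{4}):(\\d{2}):(\\d{2}).(\\d{2}):(\\d{2}):(\\d{2})")

;; Invented sample bytes: some binary noise, the word "Exif", a date stamp.
(define sample
  (bytes-append (bytes 255 216 255 225 0 24)
                #"Exif" (bytes 0 0)
                #"camera-model "
                #"2007:08:14 16:45:09" (bytes 0)))

;; regexp-match returns the whole match followed by the six groups,
;; or #f when nothing matches.
(define winner (regexp-match exif-date-rx sample))

(map bytes->string/utf-8 (cdr winner))
;; → '("2007" "08" "14" "16" "45" "09")
```

Because .+ is greedy, a file containing several date stamps would yield the last one, not the first.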

We’ll rename the files to look like this:

collectionName [year_month] imageId

For example, the file DSC02484.JPG would be renamed to Canada Day [2007_08] DSC02484.JPG.
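Building that name is a single call to format. The helper name new-image-name below is hypothetical, chosen only for this sketch:

```racket
#lang racket

;; Hypothetical helper: builds "collectionName [year_month] imageId".
(define (new-image-name collection year month image-id)
  (format "~a [~a_~a] ~a" collection year month image-id))

(new-image-name "Canada Day" "2007" "08" "DSC02484.JPG")
;; → "Canada Day [2007_08] DSC02484.JPG"
```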

When looping through files, we’ll need to make sure we’re renaming the right ones. So we need yet another regular expression to check for DSC at the beginning and the JPG file extension at the end. It looks like this:

^DSC.+(jpg|JPG)$

The prefix for images uploaded from your camera may be different, but I use a Sony cam and “DSC” is the typical prefix.
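A short sketch of that check in current Racket, using the predicate name image? that appears later in this post. The list of file names is made up for the example:

```racket
#lang racket

;; True when the name starts with DSC and ends with jpg or JPG.
(define (image? name)
  (regexp-match? #px"^DSC.+(jpg|JPG)$" name))

(filter image? '("DSC02484.JPG" "DSC02485.jpg" "notes.txt" "IMG_0001.JPG"))
;; → '("DSC02484.JPG" "DSC02485.jpg")
```

Note that directory-list returns path objects in Racket, so a real script would convert each one with path->string before comparing names like this.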

The Result in Scheme

The resulting code is:

The first line imports the necessary libraries: pregexp.ss (the Perl-compatible regular expressions library), list.ss (list manipulation functions), and file.ss (useful functions for directory and file manipulation). One of the useful functions from file.ss that our script uses is directory-list, which returns a list of all files and directories in a path (the current directory is used if the path is omitted).

The first two functions are simply utility functions that I did not want to place in another file. They read all lines from a file and turn them into a list for use with map, for-each and other list functions.
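Utilities of that kind might look like the following sketch. The names read-all-lines and file->line-list are hypothetical, not the ones from the original script, and current Racket already ships a built-in file->lines that does the same job.

```racket
#lang racket

;; Read every remaining line from an input port into a list.
;; 'any treats \n, \r and \r\n all as line separators.
(define (read-all-lines in)
  (let loop ([acc '()])
    (define line (read-line in 'any))
    (if (eof-object? line)
        (reverse acc)
        (loop (cons line acc)))))

;; Open a file, read all of its lines, and close it again.
(define (file->line-list path)
  (call-with-input-file path read-all-lines))

;; Demonstrated on a string port so that no file is needed.
(read-all-lines (open-input-string "first\nsecond\r\nthird"))
;; → '("first" "second" "third")
```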

The next function is the image? test, which applies the file-name regular expression explained above. It is used with the filter function to filter out the files that we don’t need to rename. After this function, we set the name of the collection to whatever we want, then grab a list of the files that pass the image? test.

After this, we loop through the files, and then loop through the lines looking for the EXIF date/time format. If we find the format, we rename the file. We have to use the build-path function to ensure that the file name separators work on the platform we’re on. The function list-ref is used to select the parts of the date/time format that we would like to use: (list-ref winner 1) selects the year, and (list-ref winner 2) selects the month.
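The steps described above can be sketched in current Racket as follows. This is a reconstruction under assumptions, not the original MzScheme listing: the names collection, new-name and rename-images! are hypothetical, and the sketch searches each file's bytes in one pass rather than looping over lines.

```racket
#lang racket

;; Assumed collection name, taken from the example earlier in the post.
(define collection "Canada Day")

(define exif-date-rx
  #px#"Exif.+(\\d{4}):(\\d{2}):(\\d{2}).(\\d{2}):(\\d{2}):(\\d{2})")

(define (image? name)
  (regexp-match? #px"^DSC.+(jpg|JPG)$" name))

;; Pure helper: returns the new file name, or #f if no EXIF date is found.
(define (new-name collection name data)
  (define winner (regexp-match exif-date-rx data))
  (and winner
       (format "~a [~a_~a] ~a"
               collection
               (bytes->string/utf-8 (list-ref winner 1))   ; year
               (bytes->string/utf-8 (list-ref winner 2))   ; month
               name)))

;; Rename every matching image in dir. build-path keeps the separators
;; correct on whatever platform the script runs on.
(define (rename-images! dir)
  (for ([p (in-list (directory-list dir))])
    (define name (path->string p))
    (when (image? name)
      (define source (build-path dir p))
      (define target (new-name collection name (file->bytes source)))
      (when target
        (rename-file-or-directory source (build-path dir target))))))

;; Usage (commented out because it renames real files):
;; (rename-images! (current-directory))
```

Keeping new-name free of file system calls makes it easy to try on sample bytes before letting the script touch any real photos.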

Clarifications will be provided if asked for. This code seems self-explanatory, but I’m sure Perl/C++/Java developers feel the same way about their most obfuscated code.

It Was Like So, But Wasn’t

Languages such as Perl, Python and Ruby enable you to write throwaway scripts. Perl has been called a write-only language because you write a script to fix something and then toss it away because you can easily re-write it the next time it’s needed. The philosophy of those languages is to enable you to write simple, common programs as easily as possible. By contrast, I noticed that when I wrote the Scheme script, I looked for utility functions that I could re-use in the future when needed.

The other day, I wrote a small Python script to replace specific lines in certain files. When I wrote it, all I could think about was fixing the damned files. It was so easy that I didn’t think about how I would do this in the future for other types of files. The answer is simple: make a function that takes a filename and a list of lists where each sub-list contains the string to match and the string to replace it with. This answer came easily after the pressure was off. But with Scheme, I am always looking for macros and functions to create to ease potential problems I may have.
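For illustration, here is one way that reusable function could look, written in Racket to match the rest of this post rather than in Python. It is a sketch under assumptions: the names replace-in-line and replace-in-file! are hypothetical, and "match" is read here as plain substring replacement within each line.

```racket
#lang racket

;; Apply each (match replacement) pair, in order, to a single line.
(define (replace-in-line line pairs)
  (for/fold ([acc line]) ([pair (in-list pairs)])
    (string-replace acc (first pair) (second pair))))

;; Rewrite a file in place, applying every pair to every line.
(define (replace-in-file! filename pairs)
  (define lines (file->lines filename))
  (display-lines-to-file
   (for/list ([line (in-list lines)])
     (replace-in-line line pairs))
   filename
   #:exists 'truncate))

(replace-in-line "foo bar" '(("foo" "baz") ("bar" "qux")))
;; → "baz qux"
```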

In conclusion, the philosophies and communities surrounding the Lisp family and the Perl/Python/Ruby family are different. This can be seen clearly in the weak attempts to emulate CPAN, and in how rarely common tasks are done with Lisp. But as I’ve shown, it can be done…sorta 😛