/ domm

I hack Perl for fun and
profit.

Follow me on twitter!
Atom Icom ... on Atom!
<<<<<<<<<<
25.10.2012: "Essential Modern Perl" Track at Austrian Perl Workshop
11.10.2012: new laptop
02.10.2012: Vienna.pm TechMeet: DBIx::SchemaChecksum
28.09.2012: Call for Papers: Austrian Perl Workshops 2012
20.09.2012: Generating PDF contact sheets with PDF::Create

A few weeks ago my father celebrated a decadal birthday and as part of a present I did a portrait picture of each guest. I needed to generate some small hard copies of the collected portraits, i.e. I needed to generate something like a contact sheet from a bunch (49) of pictures. Doing this by hand of course violates the First Rule of Perl Club ("Laziness"), so I hacked up a quick script, which I cleaned up a bit and hereby present to you, dear Internet.

The script assumes that all images are of the same size, and a rather small size (something like 200 pixels width). I of course have another script to resize a directory full of images to some other size, but I'm not going to bore you with that (at least for now..)

So, here's the script, broken into a few chunks:

#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;
use PDF::Create;

my $image_dir = $ARGV[0] || '.';
my $outfile   = $ARGV[1] || 'out.pdf';

I'm using PDF::Create because it provides a high enough layer of abstraction to the gruesome horror that is PDF. For such a small script I'm also too lazy to use any command line parsing modules and just fall back to good, old 90ies style @ARGV-munging.

After getting the options, we need to get some objects:

my $pdf        = PDF::Create->new( 'filename' => $outfile, );
my $pdf_format = $pdf->get_page_size('A4');
my $pdf_page   = $pdf->new_page( 'MediaBox' => $pdf_format );

I initiate a new PDF::Create object using the A4 format (I could of course get the format from the command line, but than I'd definitely need to use some proper option parsing module. Besides, A4 is the only sane format for paper...). $pdf_page will be the template from which further pages will be created.

Now we read all files in the specified directory:

my ( %images, $width, $height );
opendir( my $dh, $image_dir ) || die "Cannot read dir $image_dir: $!";
while ( my $file = readdir($dh) ) {
    next unless $file =~ /\.jpe?g$/;
    my $path = $image_dir . '/' . $file;
    my $img_data = $images{$path} = $pdf->image($path);
    unless ( $width && $height ) {
        $width  = $img_data->{width};
        $height = $img_data->{height};
    }
}

Again using 90ies style, because I'm in the mood for retro programming, and using Path::Class or similar is a bit overkill for a 47-liner running on my machine... We skip all non-jpegs and call $pdf->image on the path. image does some PDF magic I don't care about (abstraction!), but also returns a data structure containing the width and height of the image. Which we store for future reference.

Next step: putting the images into the PDF.

my $current_page = $pdf_page->new_page;
my $ypos         = $pdf_format->[3] - $height;
my $xpos         = 0;
foreach my $imgpath ( sort keys %images ) {
    my $img = $images{$imgpath};
    $current_page->image(
        image => $img,
        xpos => $xpos,
        ypos => $ypos
    );
    $xpos += $width;
    if ( $xpos > $pdf_format->[2] ) {
        $xpos = 0;
        $ypos -= $height;
    }
    if ( $ypos < 0 ) {
        $xpos         = 0;
        $ypos         = $pdf_format->[3] - $height;
        $current_page = $pdf_page->new_page;
    }
}

The most complex part of the script is figuring out where to put the images. It doesn't help that PDF defines 0/0 to be the bottom left corner of a page, and also uses the bottom left corner of an image to position it.

So, to put an image right to the top of the page, we have to get the maximum page size from the data structure $pdf_format (which we got from $pdf->get_page_size earlier) and subtract the height of the image. Now we can loop through all the images and put them on the page. Should the xpos get to big, we start a new row (with $xpos =0 and $ypos -= $height). Should the next row start below the page bottom (i.e $ypos < 0), start a new page and again set $ypos to the top of the page.

The last step is to close the PDF, so it gets written to disk:

$pdf->close;

Done!

Here's an example contact sheet spanning two pages and using images of a recent bike trip

And here's the whole script (which I might put on CPAN and/or github later):

#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;
use PDF::Create;

my $image_dir = $ARGV[0] || '.';
my $outfile   = $ARGV[1] || 'out.pdf';

# set up PDF
my $pdf        = PDF::Create->new( 'filename' => $outfile, );
my $pdf_format = $pdf->get_page_size('A4');
my $pdf_page   = $pdf->new_page( 'MediaBox' => $pdf_format );

# get all images
my ( %images, $width, $height );
opendir( my $dh, $image_dir ) || die "Cannot read dir $image_dir: $!";
while ( my $file = readdir($dh) ) {
    next unless $file =~ /\.jpe?g$/;
    my $path = $image_dir . '/' . $file;
    my $img_data = $images{$path} = $pdf->image($path);
    unless ( $width && $height ) {
        $width  = $img_data->{width};
        $height = $img_data->{height};
    }
}

# put images in PDF
my $current_page = $pdf_page->new_page;
my $ypos         = $pdf_format->[3] - $height;
my $xpos         = 0;
foreach my $imgpath ( sort keys %images ) {
    my $img = $images{$imgpath};
    $current_page->image(
        image => $img,
        xpos => $xpos,
        ypos => $ypos
    );
    $xpos += $width;
    if ( $xpos > $pdf_format->[2] ) {
        $xpos = 0;
        $ypos -= $height;
    }
    if ( $ypos < 0 ) {
        $xpos         = 0;
        $ypos         = $pdf_format->[3] - $height;
        $current_page = $pdf_page->new_page;
    }
}

$pdf->close;
Tags: pdf, app, script, howto, ~/bin

Comments (via disqus)

11.09.2012: Some new Blio features
04.09.2012: Vienna.pm September 2012 Techmeet
25.08.2012: Which tag to use for YAPCs?
23.08.2012: Things I learned at YAPC::Europe 2012
09.08.2012: Blio - my blogging "engine"
>>>>>>>>>>