Using generate_series and daterange for fun and profit

While working on a project for a company that's renting out professional movie equipment, I needed to display a calendar view of the availability status of a given thing (eg a camera). The availability status is stored in a table a bit like this:

CREATE TABLE schedule (
    id integer NOT NULL,
    type integer NOT NULL REFERENCES type(id),
    thing integer REFERENCES thing(id),
    begin timestamp with time zone NOT NULL,
    until timestamp with time zone NOT NULL,
    state text NOT NULL
);

For example:

 type | thing |   begin    |   until    |  state   
------+-------+------------+------------+----------
    1 |     2 | 2019-07-08 | 2019-07-31 | rented
    1 |       | 2019-07-12 | 2019-07-23 | reserved
    1 |       | 2019-07-16 | 2019-08-05 | reserved
    1 |     3 | 2019-07-25 | 2019-07-30 | rented
    1 |     3 | 2019-07-31 | 2019-08-01 | returned
    1 |     3 | 2019-08-02 | infinity   | damaged

Here we can see that we have three different cameras (thing) of the same camera type (type) with IDs 1, 2 and 3. We can also see that camera #2 is rented from 2019-07-08 until 2019-07-31, and camera #3 is rented for a few days from 2019-07-25 until 2019-07-30. There are also two reservations, for as yet unspecified cameras (which camera a team will get is decided during "checkout").

Camera #3 was returned on 2019-07-31 (i.e. back at the company, but not yet ready for further use, as each camera has to be inspected after being away). It was then declared damaged on 2019-08-02, and will stay in this state forever (infinity) or until a different state (hopefully repaired) will be set.

Screenshot time

This currently looks a bit like this:

I'm using vuetify calendar here to render the data, showing on each day the number of available, rented, reserved etc items.

Data to display

Now vuetify calendar needs the data in a completely different format. It wants a hash containing a key for each day with an arrayref of "events" as the values. In JSON (as returned by our Perl backend API):

 "calendar" : {
      "2019-07-31" : [
         {
            "count" : 1,
            "state" : "rented"
         },
         {
            "count" : 1,
            "state" : "reserved"
         },
         {
            "count" : 1,
            "state" : "returned"
         }
      ],
      "2019-08-01" : [
         {
            "count" : 1,
            "state" : "available"
         },
         {
            "count" : 1,
            "state" : "damaged"
         },
         {
            "count" : 1,
            "state" : "reserved"
         }
      ],
  }

So how can I convert the data I have in my DB into this rather verbose data structure? I could probably muck around in my app code, but I much more prefer to use my database engine (Postgres, what else..) to do the hard lifting.

SQL Time!

WITH bounds AS (SELECT (?)::date as lower, (?)::date as upper)
SELECT
  generate_series(
    greatest( bounds.lower, s.begin ),
    least(    bounds.upper, s.until ),
    '1 day'::interval
  )::date as day,
  s.state,
  count(s.state) as count
FROM
  schedule s, bounds
WHERE
      s.type = ?
  AND daterange( s.begin::date, s.until::date, '[]')
      &&
      daterange( bounds.lower,  bounds.upper,  '[]')
GROUP BY
  day, s.state
ORDER BY day, state

As this might be a bit hard to grok, let's break it down a bit. We have three interesting parts here. A CTE at the beginning, generate_series in the SELECT list and a daterange query in the WHERE clause.

CTE: Common Table Expression

WITH bounds AS (SELECT (?)::date as lower, (?)::date as upper)

We use this simple Common Table Expression (CTE) to alias the query parameters and convert them to proper date objects. In our code we execute the query like so:

$sth->execute( $start_date->ymd, $end_date->ymd, $thing);

$start_date->ymd will be something like '2019-07-01'.

Using this technique we can refer to the same parameter more than once. And it makes manual testing of the query easier...

WHERE daterange

Let's jump down to the daterange in the WHERE clause first:

WHERE
      s.type = ?
  AND daterange( s.begin::date, s.until::date, '[]')
      &&
      daterange( bounds.lower,  bounds.upper,  '[]')

daterange generates a date range (good naming there…) between two dates. The third parameter (in this case '[]') specifies if the start/end dates should be inclusive or exclusive (in this case inclusive; for exclusive use round bracket (;).

So we construct two date ranges, one between the begin and until date of the stored events, and another one between the lower and upper dates specified by the user.

Using the && operator we tell postgres to select all rows where those two date ranges overlap. (See here for more operators).

So the WHERE query selects all events of a given type (s.type = ?) which fall partly or completely into the requested date range. Even if one of the event dates is infinity. Yay!

Convert to a list of days: generate_series

But we now need to "blow up" the date ranges into distinct days to display them in the calendar. Easy:

 generate_series(
    greatest( bounds.lower, s.begin ),
    least(    bounds.upper, s.until ),
    '1 day'::interval
  )::date as day,

generate_series generates a series (again, good naming!) of values between the start and the stop value. You can also pass in the step interval as the third value, which is what we're doing here via '1 day'::interval. greatest and least return the biggest or smallest value of the list you pass in. We're using this to limit the list of days to the "tightest" amount of days, while still generating all days needed to display the calendar.

Run it!

Here's the full query with some params:

WITH bounds AS (SELECT ('2019-07-01')::date as lower, ('2019-08-05')::date as upper)
SELECT
  generate_series(
    greatest( bounds.lower, s.begin ),
    least(    bounds.upper, s.until ),
    '1 day'::interval
  )::date as day,
  s.state,
  count(s.state) as count
FROM
  schedule s, bounds
WHERE
      s.type = 1
  AND daterange( s.begin::date, s.until::date, '[]')
      &&
      daterange( bounds.lower,  bounds.upper,  '[]')
GROUP BY
  day, s.state
ORDER BY day, state
;

Which returns

    day     |  state   | count 
------------+----------+-------
 2019-07-08 | rented   |     1
 2019-07-09 | rented   |     1
 2019-07-10 | rented   |     1
 ...
 2019-07-31 | rented   |     1
 2019-07-31 | reserved |     1
 2019-07-31 | returned |     1
 2019-08-01 | reserved |     1
 2019-08-01 | returned |     1
 2019-08-02 | damaged  |     1
 2019-08-02 | reserved |     1

Nearly there!

Finally, some Perl

If you've been very attentive, you might have noticed one thing: We are nowhere storing the actual available items! And we don't need to: If an item does not have a state in a given time frame it is considered available. But we still want to display the count of available items in the calendar. To fill these gaps, we need to iterate over the result returned from the DB in the model, and do a tiny bit of data munging:

$sth->execute( $start_date->ymd, $end_date->ymd, $type );
my $calendar = {};
my $current_day   = 0;
my $day_available = $total;
while ( my $r = $sth->fetchrow_hashref ) {
    if ( $r->{day} ne $current_day ) {
        $self->_fill_available($calendar, $current_day, $day_available);
        $current_day   = $r->{day};
        $day_available = $total;
    }
    push(
        $calendar->{ $r->{day} }->@*,
        {   state => $r->{state},
            count => $r->{count},
        }
    );
    $day_available -= $r->{count};
}
# clean up after last iteration
$self->_fill_available($calendar, $current_day, $day_available);

return $calendar;

$total contains the total number of available items for a given type (in this example 3, because we have 3 of these specific cameras). We start with an invalid $current_day and loop through the data we got from the DB. If the current_day changed, we call _fill_available which pushes a "fake" state available onto the start of the current day's list of events. We push the state and count of the current DB row onto the list of events for the current day. And we decrement the number of items available on this day.

After the loop we have to call _fill_available one more time, to add the last day's data.

And that's it!

For making it this far, here's another screenshot as a reward: