Chapter 9: Guaranteeing Complex Relationships with Triggers

Back to Table of Contents.

Guaranteeing Complex Relationships with Triggers

In this chapter, we’ll go over an advanced technique for mapping tricky real-world referential integrity constraints to the data layer. This technique uses database triggers, which allow you to run arbitrary code to check that new data is valid before inserting or updating rows. We’ll write a stored procedure that ensures an auditorium isn’t double-booked, and create a trigger that runs our procedure whenever a movie showtime is added or changed. I’ll describe the basic structure of a function in Postgtres’s procedural language PL/pgSQL, so you can start building your own functions for a variety of uses, and cover the special properties of trigger functions.

Constraint Checking with Triggers

As noted above, stored procedures that can be triggered when certain conditions in the database are met can allow you to check for and guarantee arbitrarily complex relationships. A number of times throughout this book, the problem of double-booking an auditorium has come up. This problem is not easy to avoid with simple foreign key or check constraints, because the problem spans multiple tables. Auditorium bookings are recorded in the movie_showtimes table, but the length of a movie is recorded in the movies table (Figure 9-1).

er_0901Figure 9-1. Auditorium bookings in movie_showtimes; length of movie in movies

There are also a number of cases to be considered to determine if one showtime overlaps with another. Figure 9-2 shows these cases. The first case, Case 0, is the throwaway case: the showtimes do not overlap. In Case 1, the movies do overlap: Movie 2 starts after Movie 1, but before Movie 1 finishes. Case 2 is the same case with the movies reversed: Movie 1 starts after Movie 2, but before Movie 2 finishes. In Case 3, Movie 2 starts and ends while Movie 1 is playing. Case 4 is the opposite of Case 3: Movie 1 starts and ends while Movie 2 is playing. Finally, we also want to treat the edge cases of a movie starting exactly when another movie ends, which may be okay for a double feature with no intermission. Therefore, if a start time and end time are the same, it’s okay, but if two start times are the same, it’s not okay.

er_0902Figure 9-2. Five cases for checking if two time intervals overlap

Examining all of these cases, we find the following statements hold true: if either the start or end time of one movie is between the start or end time of another movie, then the movies overlap. That is, unless the start time of one movie is equal to the end time of the other movie.

Rather than attacking this whole problem all at once, we work in stages. We begin by creating a small function called do_times_overlap, shown in Example 9-1. This method takes four arguments: start and end times for two intervals, respectively, and evaluates the expression we came up with above. The function also checks for invalid inputs, which can occur if an end time is before a start time.

Example 9-1. A PL/pgSQL function to determine if two intervals overlap

create or replace function do_times_overlap(
  start_1 timestamp with time zone,
  end_1 timestamp with time zone,
  start_2 timestamp with time zone,
  end_2 timestamp with time zone
) returns boolean as $$
begin
  if end_1 < start_1 then
    raise exception 'First end time is before first start time.';
  end if;
  if end_2 < start_2 then
    raise exception 'Second end time is before second start time.';
  end if;
  if start_1 = end_2 or start_2 = end_1 then
    return false;
  else
    return (start_1 between start_2 and end_2)
        or (end_1 between start_2 and end_2)
        or (start_2 between start_1 and end_1)
        or (end_2 between start_1 and end_1);
  end if;
end
$$ language plpgSQL;

The next order of business is to test this function. Does it work for all of these cases? Example 9-2 shows our test case class and all our tests, including tests that check that exceptions are thrown on invalid inputs. It may seem onerous or even wasteful to test such little pieces of code—indeed, the code to write the tests takes up twice as much space as the code we are testing—but it’s worth it. In the interest of full disclosure, neither the PL/pgSQL function nor the set of test cases came out right the first time through when they were added to this example. It took a few iterations to get it all right, and writing the tests was instrumental in that process. I could have just tested the procedure ad-hoc, which is how much code is often tested, but as I changed the code to cover all cases or improve readability, I was able to quickly rerun the test cases, guaranteeing that after each change, all scenarios were covered. The result is not only a more complete test bed, but also a correct and more complete function, which accurately accounts for all cases, even those I didn’t think of right away.

Example 9-2. Tests for the do_times_overlap PL/pgSQL function, placed in test/unit/physical/do_times_overlap_test_case.rb

require File.dirname(__FILE__) + '/../test_helper'

module Physical
  class DoTimesOverlapTestCase < Test::Unit::TestCase

    def setup
      @start_1 = Time.new
      @end_1 = @start_1 + 1.hour
      
      @before   = @start_1 - 15.minutes
      @between  = @start_1 + 15.minutes
      @between2 = @start_1 + 30.minutes 
      @after    = @end_1 + 15.minutes
      @after2   = @end_1 + 30.minutes
    end

    def do_times_overlap?(s1, e1, s2, e2)
      result = ActiveRecord::Base.connection.select_value("
        select do_times_overlap('#{s1}', '#{e1}', '#{s2}', '#{e2}')
      ")
      return (result == "t")
    end

    def test_do_times_overlap
      # case 0 - non-overlapping
      assert !do_times_overlap?(@start_1, @end_1, @after, @after2)
      # case 1 - movie 2 starts in the middle of movie 1
      assert do_times_overlap?(@start_1, @end_1, @between, @after)
      # case 2 - movie 2 ends in the middle of movie 1
      assert do_times_overlap?(@start_1, @end_1, @before, @between)
      # case 3 - movie 2 starts and ends during movie 1
      assert do_times_overlap?(@start_1, @end_1, @between, @between2)
      # case 4 - movie 1 starts adn ends during movie 2
      assert do_times_overlap?(@start_1, @end_1, @before, @after)
      # case 5 - non-overlapping - movie 2 starts right after movie 1
      assert !do_times_overlap?(@start_1, @end_1, @end_1, @after)
      # case 6 - non-overlapping - movie 1 starts right after movie 2
      assert !do_times_overlap?(@start_1, @end_1, @before, @start_1)
    end

    def test_exceptional_cases_caught
      test_for_db_error("Nonsense data was allowed") do 
        do_times_overlap?(@end_1, @start_1, @before, @start_1)
      end
    end

    def test_exceptional_cases_caught_2
      test_for_db_error("Nonsense data was allowed") do 
        do_times_overlap?(@start_1, @end_1, @after, @before)
      end
    end

  end
end

Anatomy of a PL/pgSQL function

Next we will write the function that will be triggered on inserts and updates, check_movie_showtime_overlaps. The actual definition of this function is shown in Example 9-3, but we’ll start by going over the structure of a PL/pgSQL function in detail, so that you can apply the lessons of this chapter to your own future, unknown situations.

Below is the basic structure of a PL/pgSQL stored procedure. Replaceable portions are in italics, and the rest of the code is common to all procedures:

CREATE FUNCTION myfunc(myparam sometype) RETURNS sometype AS $$
DECLARE
  local_variable sometype := default_value;
BEGIN
  -- procedure body goes here
  RETURN local_variable;
END;
$$ LANGUAGE plpgsql;

Aside from the basic structure, we see that a procedure has a name—our procedure is called myfunc—which we use to identify and also to execute it.

Procedures can also take parameters in a comma-separated list. Here we have a single parameter, myparam. Parameters also must have their type specified. Here we have used sometype as a placeholder for a real Postgres type. In practice we would specify an actual type here: integer, varchar, etc.

Special types are available in PL/pgSQL procedures as well, which allow us to deal with entire rows of data. To specify that a type should have the structure of a particular table’s row, you say tablename%ROWTYPE. If the structure does not match any particular table’s row definition—for example, if the row is a result of a query that joins multiple tables—you can specify the type record, which is a generic container for a set of columns returned from a query.

We can also add comments to PL/pgSQL procedures. Comments are either SQL style, beginning with –, or C style:

/* This is a C style PL/pgSQL comment. */

Notice that we end the function by specifying the language. Postgres supports stored procedures written in a number of languages. Included in the core distribution is language support for stored procedures written in Tcl, Perl, and Python, in addition to the SQL procedural language PL/pgSQL. There are also a number of languages developed outside the Postgres core, which can be downloaded and set up alongside the built-in languages. These include PHP, Java, and also Ruby. Although writing our procedures in Ruby is tempting, I don’t do so in this book because PL/pgSQL is the most portable; it matches Oracle’s PL/SQL rather closely, so you won’t have to do much relearning if you’re writing your website atop Oracle rather than Postgres.

Regardless of which language you do choose, before you can write a procedure in that language, you must tell Postgres you are going to use it within your the database. You do so by issuing the create language command:

create language plpgsql;

You only need to create a language once for each database. A good place to put any language definitions you might have is at the start of your database install SQL script.

It’s all strings

The entire body of a PL/pgSQL procedure is a string. In our example above, the string is delimited with double dollar signs, $$. We could have used the standard SQL string delimiter of an apostrophe, “’”, but then any apostrophes denoting strings within the procedure body would need to be escaped. This can get ugly, and it’s also an easy way to write bugs. Therefore, in PL/pgSQL procedures, you can create as many unique string delimiters as you want, as long as they start and end with a dollar sign. $$, $F$, and $FUNCTION$ are all valid delimiters. It’s a good practice to avoid quoting your quotes, using single apostrophes only for innermost strings. The following shows assignment of a string containing an apostrophe to a variable:

CREATE FUNCTION myfunc() RETURNS void AS $FUNC$
DECLARE
  book text := $$'Scaling to Enterprise' by Dan Chak$$;
BEGIN
END;
$FUNC$ language plpgsql;

Local variables and assignment

Any local variables used within a PL/pgSQL procedure must be declared in the declare section. The syntax is:

  varname  [:= default_value];

Once a variable has been declared, there are two mechanisms for assignment. One we’ve already seen: you specify := after a variable name. You can follow := with a literal, or with a SQL select statement. However, you drop the keyword select; it’s implied when using this syntax:

varname := foo from bar where baz = 1;

The second assignment mechanism is select into. You specify a target variable followed by the query:

select into varname foo from bar where baz = 1;

The two methods are largely equivalent. The distinction occurs when you are selecting into a variable of type record or %rowtype; in these situations, you must use the select into syntax.

Blocks

PL/pgSQL is a block-level language. The structure of a block is:

DECLARE
  -- declarations
BEGIN
  -- code
END;

This looks much like our

do_times_overlap

method itself, and indeed, entire functions do comprise a block. You can embed blocks within other blocks if you are writing complex procedures and want to keep variables local to certain chunks of code.

Trigger-specific features

When writing a procedure that is to be used as a trigger, there are some additional things to keep in mind. First, such procedures must have trigger as their return type.

Next, in an insert or update operation, a special local variable NEW is available, which contains the row being inserted or updated. In updates, the variable OLD contains the original row that is about to be changed.

Triggers: fail safe versus fail fast

The return value of a trigger procedure determines what is ultimately stored in the database. If NULL is returned, no insert or update occurs. Otherwise, a record of the same structure as the row to be inserted should be returned, and that will be the record that is inserted. You can create a new record, or modify NEW in place and return it. These are ways to fail safe in the presence of some bad data. You know you don’t want to commit the data to the database, but it’s not critical enough in nature to put a halt to the application trying to do the inserts.

Most often, if you are using triggers to guarantee referential integrity, you’ll want a hard stop that lets callers know explicitly that they’re attempting something bad. Usually that signals a bug in software code that should be fixed. In these cases, you want to fail fast. To do so, you throw an exception, which will be rethrown to the application layer:

raise exception 'Exception thrown at %', now();

You can customize exception strings with substitutions. Within a string, you insert a % wherever you want to substitute text, and then add additional parameters for each substitution. We failed fast in our do_times_overlap function when end times were before start times. Indeed, this indicates some type of programming error, and we want to catch this as soon as possible. Otherwise we will get unpredictable results from our procedure, and the bug may go undetected for some time, causing user frustration and compromising our data.

Conditionals

PL/pgSQL supports branching on conditional statements. The most common is an IF-THEN-ELSE structure, which is nestable. IFs must end with a matching END IF:

IF [conditional] THEN
  -- code
ELSE
  -- code
END IF;

Of course, there are additional conditionals in PL/pgSQL, as well as other control structures, and other language features in general. For complete documentation, see http://www.postgresql.org/docs/8.2/static/plpgsql.html. However, what we’ve explained above is enough to write and fully understand the procedure we’ll write to guarantee movie showtimes do not overlap, shown in Example 9-3.

Example 9-3. The check_for_movie_showtime_overlaps PL/pgSQL trigger function
create or replace function check_movie_showtime_overlaps()
  returns trigger as $F$
declare
  new_end_time timestamp with time zone;
  conflicting_showtime record;
begin
  new_end_time := NEW.start_time + 
       ((select length_minutes
          from movies
         where id = NEW.movie_id) || ' minutes')::interval;
  select into conflicting_showtime
         ms.*, m.*
    from movie_showtimes ms,
         movies m
   where ms.id != NEW.id
     and ms.theatre_id = NEW.theatre_id
     and ms.room = NEW.room
     and ms.movie_id = m.id
     and do_times_overlap(
           NEW.start_time,
           new_end_time,
           ms.start_time,
           ms.start_time + (m.length_minutes || ' minutes')::interval)
   limit 1;
  if conflicting_showtime is not null then
    raise exception $$This showtime overlaps with another showtime in the same auditorium: '%' starting at %$$,
      conflicting_showtime.name, conflicting_showtime.start_time;
  else
    return NEW;
  end if;
end
$F$ language plpgsql;

To specify that a procedure should be executed as a trigger, we have to create the trigger on a particular table. We create a trigger as follows:

create trigger check_movie_showtime_overlaps_iu_trigger 
  before insert or update
  on movie_showtimes 
  for each row
  execute procedure check_movie_showtime_overlaps();

The final step would be to create unit tests—positive and negative tests, and also tests that check for resilience against bogus inputs—for the movie_showtimes class. By now you should be a pro at writing unit tests, so doing so is left as an exercise for the reader.

Chapter 8 : Composite Keys and Domain Key/Normal Form
Chapter 10 : Multiple Table Inheritance