Chapter 19: Caching End to End


The purpose of caching is two-fold: first, to speed up access to a specific resource that, for whatever reason, might be slower than we desire; and second, to tread lightly on that resource. These goals go hand in hand. It is a sort of Catch-22; often it is the act of saturating a resource with requests that can make it slow for future requests.

In Chapter 12, we saw an example of caching at the database layer, in our implementation of a materialized view. Indeed, we saw dramatic performance improvements, on the order of 100 to 1,000 times, with even a very small dataset and an unloaded system. The 99 or 99.9% of time that was spent idle instead of processing our requests was freed up for other requests, making them faster as well.

Caching seems like a marvelous tool that should be used anywhere and everywhere. And it should. Caching can be the difference between having a problematic bottleneck on your hands and having a site or service that hums along without a hitch.

So why do so many caution against what they describe as “premature optimization” when it comes to caching? If you take Twitter as a cautionary tale, you will agree that by the time it’s obvious that you need to optimize, it’s already too late. The technorati will not wait for your upgrades and performance enhancements before they declare you dead, or worse, irrelevant.

What the worrywarts are really afraid of is that caching is hard. It’s also error-prone. As we saw in Chapter 12, maintaining the “correctness” of a cache can be quite an involved process. Worse, if the caching mechanism is not well devised, it can lock you into a model or convention you had not intended to marry yourself to.

But all of these cautions are actually reasons you should think about caching from the start. If you have not been thinking about caching, you could be writing something that is not easily cacheable. What does that code look like? And more importantly, what does cacheable code look like? Just as the act of writing unit tests can cause your code to look different—suddenly you must write “testable code,” where each function does one and only one thing—so too would your code look different if you were writing it with caching in mind. Indeed, if you write for caching from Day One, caching itself becomes a much simpler task, and far less forbidding. It is a Herculean effort to refactor a data model in situ, and so too is it terrifying to transform working code into something that can be properly cached. Caching really should be thought of from the start.

To help understand the challenges involved with caching, let’s take another look at the caching we already performed with the database layer, where, as was noted in Chapter 12, caching is somewhat straightforward to implement correctly because the toolset is well established. This time, we’ll treat each piece as a generic concept. Then we’ll examine caching of the logical model layer and see if we can come up with a caching scheme that is just as powerful and just as correct. We’ll also take a tour of other places where caching is possible, and examine potential pitfalls.

Data Layer Caching, Revisited

In Chapter 12, I described view materialization as the wax on, wax off of caching. As complex as it may have seemed, it was easy compared to what lies ahead. But it was “correct,” which is a distinction not to be taken lightly. The database—our fortress, the layer of our application built by the giants before us—has all of the mechanisms necessary to make correct caching possible. If you didn’t read Chapter 12, or you didn’t understand it fully, but you plan to cache at other layers, now is a good time to review that chapter again before we deconstruct it. We’ll now look at each part of our implementation in turn to understand its function in the greater scheme of cache correctness.

The Snapshot

The snapshot was our original, cache-complete version of our base data, transformed into the format the client wanted. First, it was complete. If a record was not present in the cache, then we know that the record did not exist. Requests for invalid data do not result in load on the bottleneck resource we are trying to conserve. In some cases—such as when using Memcache—this is not possible. If confronted with an LRU cache, we must take the hit of having to defer to the original data store when our cache doesn’t have the requested record, even if it turns out the record does not exist at all.

Second, the data is in the format that the client wants it in after joins and denormalization have been taken into account. If you are going to the trouble of caching, you should cache data after all of the hard work has been done, not before. We could have cached all of the base data from the tables that made up our view, but it was the process of computing the view itself that was time-consuming and challenging.

A good caching scheme requires a way of preparing the preliminary set of data to be placed in the cache. This is often termed cache warming. It should also put that data in the cache in the closest possible format to what the client ultimately wants.
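
As a concrete sketch of what warming might look like for the logical model cache we build later in this chapter (the rake task name is arbitrary, and this assumes the Logical::Movie wrapper class defined below), we can simply iterate over the primary keys and rebuild each element:

namespace :cache do
  task :warm_movies => :environment do
    # Load every movie into the cache, in its final, post-join format
    Physical::Movie.find(:all, :select => 'id').each do |movie|
      Logical::Movie.rebuild(movie.id)
    end
  end
end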

The Refresh Function

The element to be cached should be thought of as a new, atomic piece of data, with utility in its own right. We need a way to generate these pieces of data, which often will not correspond neatly to a database row. We need to know just how big or small this unit of data is. Our view, which was created by putting a name to a frequently requested query, defined the element for us; it was a row of the new view. Elsewhere, the process may not be so obvious, but it is the same in nature.

In our materialized view implementation, the view itself performed the computation that produced records, but it was the refresh function that actually moved that data into our cache after the initial warming provided by the snapshot.

A complete caching scheme needs a way to recompute elements for the cache. They should be of a predefined shape and size. Elements built to replace original elements from the cache-warming process should match in structure and meaning.

Invalidation Triggers

In our materialized view implementation, we made use of database triggers to catch all moments when data underneath our cache was changing. Whether it was an insert, update, or delete, and whether the operation was targeted at a single item or multiple items, our triggers caught the event and called our refresh function. Triggers were crucial, because without them, we would not have known that our cache was slowly becoming a stale collection of garbage. The benefit of the triggers’ pervasive visibility cannot be overstated, either; regardless of how the data came to change, the triggers caught the change.

To build our triggers, we also created a table on paper in which we tabulated, for our own peace of mind, which tables and which actions on those tables warranted some kind of action on the cache. This served two purposes. First, it ensured we were not churning the cache more than necessary, losing some of the performance benefit achieved through caching. Second, it ensured that we were aware of each and every moment where a cache invalidation was occurring. True, database triggers are infallible when it comes to detecting changes in database tables, but they only do this if you are wise enough to add the trigger in the first place.

A correct caching scheme must be able to detect all changes that would invalidate the cache, and take action at those moments. The cache designer, by extension, must be aware of all of the events that have to be monitored in order to ensure the cache does not become stale.

Indexing

In our materialized view implementation, we made short work of adding indexes to our cache using the create index command. Most often a cache will have some kind of key that lets you get at the data. In a database table, this would be the primary key of the table itself, and in a hash-like cache, such as Memcache, the primary key is the cache key itself.

But you don’t always want a single element at a time. While sometimes querying by the primary key is enough, often you want multiple objects at once, or you are looking for objects that satisfy some other property. Storing objects multiple times under different keys is wasteful, so where such a practice can be avoided, it should be. On the other hand, searching through the entire cache to find some element by anything other than the primary key also defeats the purpose of caching.

A good caching scheme may need to support multiple indexes on the data being cached. That, or it must allow for loose querying against the primary index to simulate multiple indexes.

Logical Model Caching

Your logical models sit atop the physical models, often transforming them in some way before returning them as the result of a service request. Later in this chapter, we’ll discuss how you can cache physical models; however, even with the majority of your physical models cached, the transformations required to turn them into logical model objects can be quite costly, too. Rather than recompute your logical model objects on each request, they too can be cached.

However, because existing tutorials do not treat logical and physical models differently, Rails websites today are not built with this principle in mind. As a consequence, there are no plugins currently available for easing logical model caching. Luckily, it’s easy to accomplish even without a plugin. In this section we’ll see how to go about building a complete caching layer on our own.

To implement our caching, we’ll use Memcache. If you’re not already using it, download it, install it, install the memcache-client gem, and add the following to your environment.rb file:

CACHE = MemCache.new \
  :c_threshold => 10_000,
  :compression => false,
  :debug => false,
  :namespace => RAILS_ENV,
  :readonly => false,
  :urlencode => false

CACHE.servers = '127.0.0.1:11211'
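
Throughout this chapter, we’ll use the memcache-client API through the CACHE constant as if it were a hash, along with its add and delete methods. The following quick sketch (the keys and values here are purely illustrative) shows the handful of calls we’ll rely on:

# Hash-style access: a read returns nil on a miss
CACHE['movie_5'] = 'some value'
CACHE['movie_5']      # => "some value"
CACHE['no_such_key']  # => nil

# add stores a value only if the key is not already present;
# delete removes the key outright
CACHE.add('lock:movie_5', Time.new)
CACHE.delete('lock:movie_5')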

To help illustrate how our caching scheme will work, we’ll return to the logical model for the Movie class from Chapter 16, shown in Example 16-3. Just as with our materialized view from Chapter 12, we’d like our interface to cached data to be a drop-in replacement for our original data. Therefore, the first thing we do is push our original class out of the way, renaming it with the prefix Uncached, so that the cached version can be accessible via the original name (Example 19-1).

Example 19-1. Logical model for a movie, app/models/logical/movie.rb

module Logical
  class UncachedMovie < ActionWebService::Struct  
    def self.get(physical_movie_id)
      return nil if !(m = Physical::Movie.find_by_id(physical_movie_id))
      Movie.new(:id => m.id,
                :name => m.name,
                :length_minutes => m.length_minutes,
                :rating_id => m.rating.id,
                :rating_description => m.rating.description)
    end    
  end
end

Our caching wrapper classes will do most of the work, but this work is essentially the same for all wrapper classes we write. Therefore, we’ll start by building a base class, CachedObject, which will define and handle the common tasks. Example 19-2 shows this class, which exists within the Logical module. We’ll subclass CachedObject once for every class we plan to cache.

Example 19-2. Our base class for caching wrapper classes, CachedObject

module Logical
  class CachedObject < ActionWebService::Struct
    
    def self.uncached_class
      eval("Uncached" + self.name.split('::')[-1])
    end

    def self.cache_key(*params)
      return "#{self.name}_#{self::VERSION}_#{params.join('_')}"
    end

    def self.get(*params)
      key = cache_key(params)
      CACHE[key] ||= self.uncached_class.get(params)
    end

    def self.rebuild(*params)
      key = cache_key(params)
      CACHE[key] = self.uncached_class.get(params)      
    end
    
    def self.clear(*params)
      key = cache_key(params)
      CACHE[key] = nil
    end

    class Sweeper < ActiveRecord::Observer
      #observe ActiveRecord::Base
    end
    
  end
end

Let’s examine each piece of the CachedObject class, one item at a time.

The first method, uncached_class, defines our naming convention: all of our original classes are renamed to have a prefix Uncached, and the new caching wrapper classes take on the old class names. This method first peels off any modules our class might be in, and then adds Uncached to the front of what is left. This method will be used in most of our other methods.

The next method, cache_key, as its name suggests, builds a key that is unique to the object being requested. It concatenates the class name of the object being cached, a version number, and any additional parameters passed in as the primary key to create a unique identifier.

Note that the idea of a version number is new. Because objects placed in the cache persist across a software upgrade, we need to be careful to invalidate cached objects if we change the structure of those objects. For example, if we added a field to the logical movie class to represent the average reviewer rating, we would need to avoid retrieving older objects out of the cache that do not have that field. When the new code tries to manipulate an old object, an error occurs because the field is not present in the old object retrieved from the cache.

To solve this, we add a VERSION constant to each class, and take care to bump this number up whenever the structure of the class changes. All of the old cache keys, which included the old version number, never get requested and slowly get purged from the cache.

To make this work, we rewrite the Movie class definition in our movies_service.rb file to look like this:

class Movie < CachedObject
  VERSION = 1
  member :id,                    :integer
  member :name,                  :string
  member :length_minutes,        :integer
  member :rating_id,             :string
  member :rating_description,    :string
end

Note also that we are now defining this class to inherit from CachedObject rather than directly from ActionWebService::Struct.
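
To make the versioning concrete, here is what cache_key from Example 19-2 produces for this class, assuming its fully qualified name is Logical::Movie and an arbitrary movie id of 5:

Logical::Movie.cache_key(5)   # => "Logical::Movie_1_5"

# After a structural change, we bump the constant to VERSION = 2;
# the same movie is now looked up under a brand-new key, and the
# old entry is simply never requested again.
Logical::Movie.cache_key(5)   # => "Logical::Movie_2_5"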

Next, we define a get function. This function has the same purpose as the get function we defined for Movie, but it first checks to see if the object exists in the cache. If it does, the get function returns that object. If the object is not found in the cache, get will be called on the uncached version of this class. The result is stored in the cache under the appropriate key, and is also returned to the caller.

The rebuild method is very similar to the get method, but it reloads the uncached version of the object regardless of whether it exists in the cache or not. It then puts the newly built item in the cache, regardless of whether it was there before or not. This method would be used when we detect that an invalidation has occurred and we need to replace a stale object.

The final method in CachedObject, clear, gives us an easy way to clear an item from the cache. If passed the same parameters as passed to the get method, this method removes the cached item from our cache. This is useful when we detect an object has been deleted.

We also define an inner class, Sweeper, based on ActiveRecord::Observer. In our caching wrapper classes, we’ll subclass this class to watch for events that would require an invalidation or rebuild, much like our materialized view triggers.

Because we changed the superclass of our class definition to inherit from CachedObject in the shared “header” file, we need to define in that shared code that CachedObject is simply a subclass of ActionWebService::Struct. We do that by placing the following definition in movies_service.rb:

class CachedObject < ActionWebService::Struct ; end

Just as with our materialized view, we may not always want to rebuild objects as soon as they are invalidated. In N:1 relationships, this could result in a lot of rebuilding that essentially goes to waste. Therefore, we’d like a way to push off rebuilds until the end of a request. We’ll give ourselves this ability by creating a cache manager class, CacheManager, shown in Example 19-3.

Example 19-3. The logical model cache manager

require 'singleton'

module Logical
  class CacheManager
    include Singleton
  
    def initialize
      @@objects_to_rebuild = {}
    end
  
    def schedule_rebuild(klass, *get_key)
      @@objects_to_rebuild[[klass, get_key]] = true
    end
  
    def rebuild
      keys = @@objects_to_rebuild.keys.clone
      @@objects_to_rebuild.clear
      keys.each {|k, v| k.rebuild(*v)}
    end
  end
end

First, note that this class is a singleton. This means that we’ll access its methods via CacheManager.instance. The cache manager initializes a hash, which stores the classes and keys that have a deferred rebuild. When we want to defer a rebuild, rather than call rebuild on a caching wrapper class directly, we instead call schedule_rebuild on the CacheManager singleton instance. The schedule_rebuild function accepts as its parameters the caching wrapper class and the parameters that would be passed to the rebuild method. The values are stored as keys of the @@objects_to_rebuild hash, rather than appended to an array, to ensure that each value appears only once.

The final method, rebuild, rebuilds each object identified by the keys of the @@objects_to_rebuild hash, and clears the hash for next time. Note that we clone the hash and clear it before we begin rebuilding. Why we do this will become apparent soon: rebuilding one object can itself invalidate other objects, scheduling new entries in the hash.
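
A quick sketch of the intended usage, with an arbitrary movie id: because scheduled rebuilds are stored as hash keys, scheduling the same object twice within a request still results in only one rebuild.

manager = Logical::CacheManager.instance

# Two invalidations of the same movie within a single request...
manager.schedule_rebuild(Logical::Movie, 5)
manager.schedule_rebuild(Logical::Movie, 5)

# ...result in a single call to Logical::Movie.rebuild(5) here
manager.rebuild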

The cache manager’s rebuild method needs to be called explicitly when we’re ready to rebuild invalidated objects. In the materialized view implementation, this happened within the reconciler view. At the application layer, we can call this method after each request has been processed. Example 19-4 shows how to set this up with an after filter. We also do the work in a thread, so the request response can be returned immediately, and no user actually has to wait for the rebuild process to finish.

Example 19-4. CacheManager hooks added to application.rb

after_filter do 
  Thread.new do 
    Logical::CacheManager.instance.rebuild
  end
end

We now have enough infrastructure in place to create our first caching wrapper class. In Example 19-5, we define the Movie class again, but this time it is the class that manages caching for the UncachedMovie class. Most of the work of this class has already been taken care of within the base class. All we need to do now is define a Sweeper subclass, which will monitor the physical layer for changes.

Example 19-5. The caching wrapper for the Logical::Movie class

module Logical
  class Movie < CachedObject
    class MovieSweeper < Sweeper
      observe Physical::Movie, Physical::Rating

      def after_save(obj)
        if obj.kind_of?(Physical::Movie)
          CacheManager.instance.schedule_rebuild(Movie, obj.id)
        end
        if obj.kind_of?(Physical::Rating)
          Physical::Movie.find_all_by_rating_id(
            obj.id, :select => 'id'
          ).each do |movie|
            Movie.clear(movie.id)
          end
        end
      end

      def after_destroy(obj)
        if obj.kind_of?(Physical::Movie)
          Movie.clear(obj.id)
        end
      end
    end
    MovieSweeper.instance
  end
end

In Example 19-5, we define the MovieSweeper class, which observes the Physical::Movie class and the Physical::Rating class. Note that we can observe as many physical classes as we need, depending on the complexity of our logical model class.

Next, we define callback methods for after_save and after_destroy. In each, we check the type of the object being observed, as it can be either a Physical::Movie or Physical::Rating. We take different steps for each action and each type. In after_save, we defer a rebuild when a movie object changes, though we could have rebuilt immediately by calling Movie.rebuild(obj.id) directly. Instead, we call the CacheManager instance’s schedule_rebuild function. If a rating object changes, many movies can be affected, so rather than rebuild them, we clear them all. They will be rebuilt piecemeal in future requests.

In the after_destroy method, we clear the cache of the invalid movie object right away if it is a movie that was destroyed. We don’t bother taking any action when a rating is newly created or destroyed; our referential integrity guarantees that no movies can exist for a rating that is just now being inserted or deleted.

We can now test logical model layer caching. Example 19-6 shows these tests. We create a setup method that creates a new physical layer movie item. Then we clear the cache of any item that may have been left over from a previous test.

In the first test, test_logical_caching, we retrieve a local movie object through the Movie class, which now refers to our caching wrapper. We assert that the movie names are the same. After updating the physical model, because we deferred rebuild, initially the movie names do not match when comparing the physical and logical models. However, after calling the cache manager’s rebuild function, they once again match.

In the second test, test_dependent_obj_invalidation, we are testing that altering a physical rating object will propagate up through the caching layer of the logical objects as well. First we retrieve the cached logical model object for the movie we created in the setup method. After changing the movie rating description, the physical and already loaded logical models do not match. However, when we request the logical model object again, it does match.

Example 19-6. A unit test for the cache manager

require File.dirname(__FILE__) + '/../../test_helper'

class CacheManagerTestCase < Test::Unit::TestCase
  def setup
    @p = Physical::Movie.create!(
      :name => 'When Harry Met Sally',
      :length_minutes => 120,
      :rating => Physical::Rating::PG13)

    # memcache persists between application restarts
    Logical::Movie.clear(@p.id)
  end

  def test_logical_caching
    # on the first get, the objects should match
    l = Logical::Movie.get(@p.id)
    assert l.name == @p.name

    # after an update to the physical model, the cached
    # value will not match
    @p.update_attribute(:name, 'new name')
    assert l.name != @p.name

    # after issuing a rebuild, the values will
    # again match
    Logical::CacheManager.instance.rebuild
    l = Logical::Movie.get(@p.id)
    assert l.name == @p.name
  end
  
  def test_dependent_obj_invalidation
    # initially the descriptions should match
    l = Logical::Movie.get(@p.id)
    assert l.rating_description == Physical::Rating::PG13.description

    # after updating the rating description,
    # the cached value will not match
    Physical::Rating::PG13.update_attribute(:description, 'new desc')
    assert l.rating_description != Physical::Rating::PG13.description
    
    # invalidation is not deferred, so logical model will pick
    # up changes immediately
    l = Logical::Movie.get(@p.id)
    assert l.rating_description == Physical::Rating::PG13.description
  end
end

Example 19-7 shows the results of running these two tests. They pass, as we expect.

Example 19-7. Results of running the cache manager unit test

ChakBookPro: chak$ ruby test/unit/logical/cache_manager_test.rb 
Loaded suite test/unit/logical/cache_manager_test
Started
..
Finished in 0.033217 seconds.

2 tests, 6 assertions, 0 failures, 0 errors

Considerations

In a high-traffic website, clearing an object rather than rebuilding it can be much more costly than you might initially expect. For example, if you receive 10 requests for an object per second, and it takes one second to rebuild that object, then you may pay for up to 10 rebuilds if you clear rather than rebuild. All of the requests between the clear action and the conclusion of the first rebuild action see a cache miss, so each one kicks off an additional, unneeded rebuild. This is shown in Figure 19-1.


Figure 19-1. Clearing the cache after an invalidation can incur a large penalty in excess rebuilds

Previous to time –0.1, requests are processed quickly because the data requested comes out of the cache. At time –0.1, some request occurs that causes the cache to be cleared. The next request, at time 0, does not find the requested data in the cache, so the rebuild process, which takes 0.4 seconds, begins. The next three requests still do not find the data in the cache, so each one kicks off another (unnecessary) rebuild process. Finally, at time 0.4, the first rebuild process has completed, so requests starting at this time are served from the cache. Meanwhile, three rebuild processes are still working away, sapping precious computing resources for no real purpose. In our caching mechanism, this is what would happen if clear is called on data that still exists, i.e., data that has not been deleted.

There are two alternatives. One, pictured in Figure 19-2, is to continue to serve the stale object while the new object is being rebuilt. This method has the benefit of not wasting any resources rebuilding the same data multiple times. It also returns data to the caller quickly, although the data is slightly out of date. This is the behavior implemented in the examples above, where schedule_rebuild is called for data that has been detected to be invalid. While the data is being rebuilt in the background at the end of the request, subsequent requests continue to return results based on the original value stored in the cache.


Figure 19-2. A background rebuild wastes no resources, but temporarily returns stale data

A third option is a compromise between the first two, and it’s depicted in Figure 19-3. If the rebuild acquires a read lock on the data, then subsequent requests wait until the rebuild is complete before returning data. This wait time is always shorter than the rebuild time itself, and the data returned from all requests will be fresh.


Figure 19-3. Locking cached data during a rebuild

Unfortunately, this option is difficult to implement efficiently using Memcache, which does not provide native locking support. A lock can be simulated by adding a key to the cache that means “this data is locked”; readers check for that key before reading, and if it exists, they wait and poll until the lock has disappeared. Code to implement this behavior is shown in Example 19-8. This sort of implementation requires frequent polling to return data quickly after a rebuild is complete; you can reduce the polling overhead by lengthening the sleep between poll attempts, but doing so adds latency before results come back.

Another drawback of this approach is that each get request also incurs an additional round-trip penalty to ensure that the data being requested is not currently locked.

Example 19-8 shows our CachedObject class, modified to incorporate a locking mechanism that prevents a get request from completing until a simultaneous rebuild request has finished. Note that only the get and rebuild methods, along with new helper methods, are shown here, although the rest of the methods from Example 19-2 would still be part of this class. Additions are shown in bold.

Example 19-8. A locking mechanism built upon Memcache, with polling

module Logical
  class CachedObject

    def self.lock_key(key)
      "lock:#{key}"
    end

    def self.locked?(key, timeout_seconds = 10)
      start = CACHE[lock_key(key)]
      start ? Time.new - start < timeout_seconds : false
    end

    def self.with_lock(key, timeout_seconds = 10, &block)
      start = Time.new
      acquired_lock = false
      while (Time.new - start < timeout_seconds) && !acquired_lock
        acquired_lock = !CACHE.add(lock_key(key), Time.new).index("NOT_STORED")
        sleep 0.1 if !acquired_lock
      end
      yield
      CACHE.delete(lock_key(key))
    end
        
    def self.get(*params)
      key = cache_key(params)
      sleep 0.1 while locked?(key)
      CACHE[key] ||= self.uncached_class.get(params)
    end

    def self.rebuild(*params)
      key = cache_key(params)
      with_lock(key) do
        CACHE[key] = self.uncached_class.get(params)      
      end
    end

  end
end

Since we are simulating native locking by putting a semaphore in the cache, we need a method to generate a semaphore key unique to each item we might rebuild. The lock_key method accepts the key created by the cache_key method from Example 19-2, which should be unique, and prepends lock:. This utility method will be used in our other methods.

The next method, locked?, checks the cache for the presence of the semaphore. Since it’s possible that there was an error in the method doing the locking, or a power outage, or some other event beyond our control, locked? also takes an optional parameter to specify a timeout, after which it will return false even if the semaphore exists, giving the caller permission to take over the lock.

The next method, with_lock, takes a block as a parameter, and only executes that block once the lock has been acquired or a timeout has passed. The add method of the Memcache API will only add an item if it does not already exist. When an item is successfully added, “STORED” is returned; otherwise “NOT_STORED” is returned. The variable acquired_lock only becomes true when “NOT_STORED” is not found in the return string.

The get method has been modified so that it blocks while a key is locked. Again, if the default timeout, set at 10 seconds, passes, locked? will return false and the get will proceed.

Finally, the rebuild method has been modified so that the get call on the uncached class is wrapped by the new with_lock method, which ensures that calls to get on the cached class are blocked until the rebuild process is complete. Of course, this blocks simultaneous calls to rebuild as well.

Avoiding Rebuilding with Stale Data

If your logical model objects are composed of other logical model objects, rebuilding invalid items can be tricky. Take, for example, the simple case shown in Figure 19-4. Here, a solid line indicates a composition, and a dotted line indicates an observer relationship. The logical Movie class observes the physical Movie class, and will correctly get rebuilt using the data from the physical class when the physical movie object changes.

However, the logical Showtime class gets its data from the logical Movie class. Yet it also must observe the physical Movie class to detect an invalidation, because there is no easy way to observe a non-ActiveRecord type class. But there’s a problem here. We have not provided a mechanism to order rebuilds. We have no way to ensure that the Movie object gets rebuilt before the Showtime object, and if the Showtime object does get rebuilt before the Movie object, it will be rebuilt with stale data (Figure 19-4).

Certainly, we can get around this by always relying on physical models for rebuilding logical ones, but this sacrifices a lot of flexibility, and is also very inelegant. The whole purpose of the logical models is that they more closely resemble the way we naturally think about our application’s problem domain, so we’d like to use them as much as possible.


Figure 19-4. The logical Showtime object

A better solution, then, is to devise a mechanism to observe logical model rebuilds as well as physical model invalidations. That way, only the logical Movie class need observe the physical Movie class, and the Showtime class can observe the logical Movie class. In general, the rule would be that you rebuild only from objects you observe.

We can accomplish this by creating a pseudo-class called LogicalModelInvalidation that inherits from ActiveRecord::Base. We’ll never save records of this class, but we’ll piggyback off the built-in ActiveRecord observers to simulate logical model observers.

First, we create a dummy table in the database with the attributes we want our invalidation pseudoclass to have, as shown in Example 19-9.

Example 19-9. A dummy table with the attributes for the invalidation pseudoclass

create table logical_model_invalidations (
  id integer,
  type text,
  object_id text
);

We need an id column because Rails expects a primary key. However, Rails does not let you easily set the value of an id primary key column, so we’ll also add an object_id column to simulate the real primary key of the object being invalidated. Finally, we have a type column, because we will take advantage of Rails single table inheritance. We’ll subclass this class for every class that can be invalidated.
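
If you prefer to create this table with a migration rather than raw SQL, a minimal sketch might look like the following (the migration class name is arbitrary; note that create_table adds the id column for us):

class CreateLogicalModelInvalidations < ActiveRecord::Migration
  def self.up
    create_table :logical_model_invalidations do |t|
      t.text :type       # single table inheritance discriminator
      t.text :object_id  # primary key of the invalidated object
    end
  end

  def self.down
    drop_table :logical_model_invalidations
  end
end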

Then, within the CachedObject class, we define the inner class shown in Example 19-10.

Example 19-10. Inner class within the CachedObject class

module Logical
  class CachedObject < ActionWebService::Struct
    class LogicalModelInvalidation < ActiveRecord::Base
      def validate
        errors.add_to_base("invalid")
      end
    end
  end
end

This is essentially an empty ActiveRecord class, but with one special property: it cannot be saved. The validate method has been redefined to always fail by adding a dummy error message whenever it is run. We don’t actually want to save objects of this type; we just want to gain the functionality that comes for free in ActiveRecord objects in that they can be observed by sweeper classes. Although instances of this class won’t ever be saved, the before_validation hook can be observed.

Each caching wrapper class should subclass this class, with the name Invalidation. For example, in the wrapper Movie logical model class, inside of the class definition itself, we add the declaration shown in Example 19-11.

Example 19-11. Declaration inside the class definition

module Logical
  class Movie < CachedObject
    class Invalidation < LogicalModelInvalidation; end
  end
end

It’s fine for every class to name this class with the same name, because the scoping within the Logical module, and further within the CachedObject subclass, gives us the context we need to take appropriate action. The Invalidation subclass we created in Example 19-11 is actually named Logical::Movie::Invalidation, and other subclasses would be similarly scoped for their logical model class type.
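
A quick console sketch of the behavior we are relying on (assuming an arbitrary movie id of 5): create runs validation, which our sweepers can observe, but because validation always fails, nothing is ever written to the table.

inv = Logical::Movie::Invalidation.create(:object_id => '5')
inv.new_record?   # => true; the save failed, so no row was written,
                  #    but observers of this class still saw the event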

It’s actually preferable that the name of the subclass is always the same; this lets us trigger invalidations programmatically. In Example 19-12, we make use of this when we upgrade our rebuild and clear methods to trigger an observable logical model invalidation. The additions are in bold.

Example 19-12. Upgrading the rebuild and clear methods

module Logical
  class CachedObject < ActionWebService::Struct
    def self.rebuild(*params)
      key = cache_key(params)
      with_lock(key) do
        CACHE[key] = self.uncached_class.get(params)
      end
      eval("#{self.name}::Invalidation").create(:object_id => params)
    end

    def self.clear(*params)
      key = cache_key(params)
      CACHE[key] = nil
      eval("#{self.name}::Invalidation").create(:object_id => params)
    end
  end    
end

Whenever an object is cleared or rebuilt, an instance of the corresponding Invalidation class is created. The create method tries to save the object immediately, which calls validate. Although validate adds an error and the create operation fails, we nonetheless have an observable event to track.

Now, other models can observe logical model invalidations. To observe the logical Movie class for invalidations, a sweeper would observe Logical::Movie::Invalidation. An example of what the caching wrapper for the logical Showtime class might look like is shown in Example 19-13.

Example 19-13. A logical model observing another logical model

module Logical
  class Showtime < CachedObject
    
    class ShowtimeSweeper < Sweeper
      observe Logical::Movie::Invalidation

      def before_validation(obj)
        Physical::MovieShowtime.find_all_by_movie_id(
          obj.object_id, :select => 'id'
        ).each do |ms|
          Logical::CacheManager.instance.schedule_rebuild(Showtime, ms.id)
        end
      end
    end
    ShowtimeSweeper.instance
    
    class Invalidation < LogicalModelInvalidation; end
  end
end

Based on Example 19-13, we now have implemented a logical model class that is composed of, and observes, another logical model object. We are no longer constrained to the left side of Figure 19-4, but can be confident that our implementation affords the cache correctness provided by the scenario on the right side of the figure.

Cache Indexes

In all of our examples of invalidation, it was necessary to go back to the database to find dependent objects. For example, when a movie was invalidated, we had to query to get the IDs of the showtimes that were for that movie. While doing so is not the worst thing in the world—ideally, invalidations that require database access are infrequent compared to reads that do not—this does impose some additional burden on the database.

It would be ideal if the caching layer itself could maintain indexes that could be used to find associated objects without resorting to queries in the data layer. Unfortunately, Memcache does not provide any native indexing support. We could build our own and store the indexes as regular objects in Memcache, but doing so would be tedious, inefficient, and error-prone. Modifying an index would involve acquiring a lock, which we have already seen is not Memcache’s forte; it would require transferring the entire index to the client, modifying it in place, and then setting it back in Memcache. The costs in developer and processor time are likely to outweigh the time spent querying indexes in the database.
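
To see why, consider what merely adding one showtime to a hand-rolled, per-movie index might look like; every writer must hold a simulated lock, pull the whole list down, modify it, and push the whole list back up. This is a sketch only, using a hypothetical index key and the with_lock helper from Example 19-8:

def add_showtime_to_movie_index(movie_id, showtime_id)
  key = "showtimes_for_movie_#{movie_id}"
  Logical::CachedObject.with_lock(key) do
    ids = CACHE[key] || []
    ids << showtime_id unless ids.include?(showtime_id)
    CACHE[key] = ids   # the entire index crosses the wire on every update
  end
end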

However, it should be noted that Memcache is not the only choice for logical model caching. SimpleDB, an Amazon web service, is even better-suited than Memcache for this task. SimpleDB, although it is a database of sorts, is not a relational database, but rather it is hash-based storage on steroids. Unlike Memcache, which provides simple key-value pair storage, SimpleDB provides key to key-value pair storage. This allows it to automatically index your data on all attributes. With SimpleDB, you can easily request all of the keys for showtimes that correspond to a particular movie.

Although SimpleDB’s users will define what it can do (rare is the tool that is used only as intended), cache-complete storage atop a relational database is what SimpleDB is really for. Unlike Memcache, which is LRU only, SimpleDB’s data is persistent, meaning you can create cache-complete copies of your data, and know definitively that if an item is not in the cache, it is not in the database either.

Of course, it is impractical to use SimpleDB as a cache unless your application is running on Amazon’s EC2 (Elastic Compute Cloud). Accessed from other places, the latency is likely to outweigh any other benefits.

A Ruby gem that provides a client interface to SimpleDB is available at http://rightaws.rubyforge.org.

Other Caching

There are a number of places in an application, both within as well as at the edges, for caching. Each has its own considerations and difficulties. The rest of this chapter provides an overview of these locations, and attempts to point out the most worrisome drawbacks and pitfalls, and what steps may be taken from day one to help get around them. Unfortunately, at some layers there is little recourse when the infrastructure falls short of our hopes. We will start at the bottom and work our way up.

Query Plan Caching

Whenever you execute a database query, the database first must come up with a plan for how to retrieve the data you requested. There may be many ways to go about executing your query, based on the number of tables involved in the query and the number of indexes that may be used.

For large queries, the time spent planning a query generally pales in comparison to the time spent executing the plan. However, any time spent planning queries is time not spent doing something else. And as query complexity rises, so too does the planning time. The number of possible paths to take for N tables joined in a single query is N!, even before you factor in the different ways to treat those tables due to indexes. Finding the very best plan could take an eternity. Because of this, Postgres has a cut-off point for the number of tables after which it will avoid an exhaustive search and instead do a heuristic search for the best query plan so that planning time does not blow up out of control.

For simple queries, while the plans may be quick to generate, the time spent in the query planner begins to take up a larger and larger percentage of the overall time needed to execute the query.

Luckily, these plans do not need to be created each time a query is executed. While you may execute millions of queries per day, most of them are slightly different versions of one another; the structure of the query is the same, but the values in the query change. If you can make it clear to the database that you have a certain number of “template” queries, and only the parameters are changing, then the database need only create a plan once per template. So if you execute one million queries per day, but all of these queries fall into a pattern of 100 query templates, you can eliminate 999,900 runs through the query planner by registering these templates.

You don’t have to take special action to register your query templates. However, when you execute your queries, you do have to separate out the template from the parameters when passing the query to the database. These templated variables are known as bind variables.
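
To make the idea concrete, here is a sketch of what true bind variables look like when talking to Postgres directly through the pg gem, outside of ActiveRecord (the connection settings, statement name, and query are illustrative). The query is planned once, at prepare time, and then executed repeatedly with different parameters:

require 'pg'

conn = PG.connect(:dbname => 'movies_development')

# Register the template once; the database plans it a single time
conn.prepare('movies_by_rating', 'select name from movies where rating_id = $1')

# Execute it as often as we like, passing only the parameters
['R', 'PG-13'].each do |rating|
  conn.exec_prepared('movies_by_rating', [rating]).each do |row|
    puts row['name']
  end
end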

Unfortunately, as of this writing, bind variables present a problem for Rails users. The Rails framework appears to support bind variables. Indeed, when you write a query, you can pass a template with question marks to denote bind variables. Example 19-14 shows a snippet where we define and then pass bind variables to an ActiveRecord query, with the segments relating to bind variables in bold.

Example 19-14. Snippets of code from Example 17-17, highlighting usage of bind variables

...
if !zip_code.empty?
  conditions_sql << "miles_between_lat_long(
    (select latitude from zip_codes where zip = ?),
    (select longitude from zip_codes where zip = ?),
    latitude, longitude) < 15"
  conditions_vars.concat [zip_code]*2
end  
...
psts = Physical::MovieShowtimeWithCurrentAndSoldOut.find(:all,
  :select => [:id, :movie_id, :theatre_id, :latitude, :longitude],
  :include => [:movie, :theatre],
  :conditions => [conditions_sql.join(" and "), *conditions_vars])

However, ActiveRecord does not support passing these templates and bind variables through to the database drivers that actually communicate with your database. This means that although you are going through the motions of using bind variables, currently, with Rails, there is no benefit. Even if you have 100 templates for your one million query executions, currently with Rails, your database actually computes one million query plans.

Clearly this is a bad thing from a performance perspective, and there are other causes for concern as well. As of this writing, work is in progress for some database drivers to make ActiveRecord respect bind variables all the way through to the database. I encourage any interested readers to check the status of this work, and to lend a hand if you can.

Database Query Caching

MySQL has a feature called the query cache, which memorizes the results of each select query you execute. If you execute some query, and then execute the same query again, MySQL simply looks up the result from the first run and returns it.

If your application is exclusively read-based, this can give you a large performance boost when you are evaluating the same queries over and over again. However, since cache correctness is hard to maintain, as we saw in Chapter 12 in our discussion of materialized views, as well as in this chapter with logical model caching, MySQL does not attempt to maintain correctness within an existing query cache. Instead, to keep things as simple as possible, MySQL simply flushes the entire query cache for a table whenever that table changes in any way, be it via an insert, update, delete, or any DDL operation that modifies the table.

Certainly, having a query cache like MySQL’s is not generally harmful. Even in the rare situation where your application sees only cache misses, the overhead incurred to maintain the cache is measured at 13% in the MySQL documentation.

So while the query cache can be extremely helpful in a read-only context, in practice it does not provide the boost you might expect on an active website. Because of the modest gains a query cache provides in real-world situations, this feature is not found in most databases, and even in MySQL, it should not be relied upon as your primary caching mechanism. An architecture that relies on a query cache raises the following question, which will come up again in the discussion of the Rails query cache: Why does your application request the same data over and over again?

If you know you need the same data repeatedly, why isn’t it being cached closer to the user? Not only is the database the furthest point from the original request, but it is also the likeliest to become your bottleneck. Even if the queries you execute are fast, avoiding them altogether is still faster. In this chapter, we’ve already seen how we can avoid most queries with logical model caching.

Rails Query Caching

In Rails 2.0, an application layer query cache was introduced, which, for the period of a single request, caches the results of each select query you execute.

This feature is actually quite puzzling. If you don’t have a functional logical model cache, it makes sense that successive requests might need to access the same data anew from the database. But it is perplexing why, within a single request, you would need to request the exact same information twice, rather than store the result in a variable and reuse the data stored there throughout the request.

Rather than providing any performance gain you could not achieve through good programming practice, the Rails query cache actually encourages you to write bad code. Example 19-15 shows the same process, once without utilizing the cache, and a second time relying on it.

Example 19-15. Avoiding and using the Rails query cache

# avoiding cache, good style
m = Movie.find(5)
if m.rating == Rating::R
  # do something
end
if m.length_minutes > 90
  # do something else
end

# relying on cache, bad style
if Movie.find(5).rating == Rating::R
  # do something
end
if Movie.find(5).length_minutes > 90
  # do something else
end

Of course, this is a contrived example. The real benefit would be seen if multiple functions took a movie_id argument and proceeded to look the movie up in the database for each call to the function. But that, too, is bad practice; it’s just harder to see it.

Instead, you can define functions that accept objects as arguments and depend on the caller to do the database lookup. Example 19-16 shows two versions of the same method, needs_id_check?, which returns true if the movie rating is R.

Example 19-16. Two versions of the same method, one intended to avoid the Rails query cache

# relies on query cache, bad style
def needs_id_check?(movie_id)
  return Movie.find(movie_id).rating == Rating::R
end

# avoid cache, good style
def needs_id_check?(movie)
  return movie.rating == Rating::R
end

Of course, the second method is preferable, and it outperforms any query cache that could be built, because the object is immediately at hand—it need not be looked up in a database or query cache.

Fragment, Action, and Page Caching

Rails has a number of built-in mechanisms for caching the result of page rendering, either the entire page by URL, an action, or a fragment of a view. These mechanisms can speed up your page rendering times dramatically, dropping them to nearly zero, but maintaining cache correctness at the granularity of an entire page is a large challenge.

If your goal is scaling, and if you have maintained cache correctness at your logical model layer, you don’t need fragment, action, or page caching. While it is true that they improve the speed of page rendering, they do not do more for you with respect to scaling than a cache-complete logical model layer does.

Remember, the definition of scaling is the ability to serve a linear growth of users by adding hardware linearly. To scale, we need only to ensure that we have no bottlenecks that cannot be eliminated with additional dollars and hardware. In almost all cases, the challenge is with squeezing more and more queries through the database layer. However, if you are cache-complete, you won’t execute database queries to render your pages. An increase in requests can be handled by adding more application servers. Rather than request data from the database, it will be requested from the cache. The best case scenario for your users would be if you maintained one server per visitor, so each user feels as if she is the only one browsing your website. Of course, this is not practical, but it is possible with a complete and shared cache. With this in mind, it should be understood that adding additional fragment, action, or page caching on top of these other caches gives you an improvement in speed, not an improvement in your ability to scale.

