Chapter 3: Organizing with Modules

Like most projects, the first Rails web site I worked on started with just a handful of tables and a handful of classes. In those early days, we couldn’t imagine the features or reports we would need in our third year, which is where we are at the time of this writing. We only knew what we knew about the business at the current moment.

When you just have a handful of tables to deal with, there doesn’t seem to be much reason to impose an organizational strategy. At that time, Rails was at version 0.13, and there weren’t many big Rails sites around whose teams could offer their expert advice on Day One organization either. Such advice can often only be dispensed in hindsight, and back then—and even frequently today—most new Rails users were new to Ruby as well.

So we plodded along, developing our site, and along with it our own 20/20 hindsight. Today, our hindsight is really good, but the organization of our original core application is not so good. As of present writing, the models directory of that core application contains 188 classes.

Such a pile of classes is overwhelming for new employees and even some veterans. It can be quite a challenge to remember where everything is, or what effects a change in one class might have on the other 187 classes. The advice from the original developer of a “big Rails site,” in hindsight, is to organize into modules from Day One.

Even in the initial development of our application, at around class number 40 we sensed something wasn’t right, and that some kind of namespacing and organization was necessary. But even with 40 classes, the instinct to move forward as opposed to laterally was strong, and it seemed that there wasn’t ever time to refactor. If 40 classes represents inertia, then 188 is simply far too late to start organizing. An early investment of time to set up some organization will pay big dividends when your site has grown an order of magnitude or two in complexity.

Files and Directories

When you first create your Rails application, a number of directories are created. Example 3-1 highlights the four we are concerned with in this chapter, namely the controllers, helpers, models, and views directories under the app directory.

All but the simplest projects will eventually need to be broken up into modules to minimize the number of classes a developer needs to be concerned with at any given time, but the skeleton created by the rails command doesn’t set us up to start working within modules from Day One.

Example 3-1. Abbreviated output from the rails command to create a new project

bash-3.2$ rails example_app
      create
      create  app/controllers
      create  app/helpers
      create  app/models
      create  app/views/layouts
      ...

Luckily, even without knowing a single thing about our application, we can start organizing our model, controller, and helper files into three categories that will pave the way for a well-organized application down the road: physical, logical and service, and utility. Not every one of the top-level directories will need each of these subdirectories, and you may find some will need others, as well. These three are a good start. Below we’ll see what each directory is for and where it belongs.

The first category, physical, corresponds to the models, controllers, helpers, and views normally associated with a Rails application. The models in this category descend from ActiveRecord::Base and correspond directly to physical database tables.

The next category, the pair of logical and service, comes into play when your application is large or complex enough that it is ready to evolve to a service-oriented architecture (SOA). At this point you will define an API for clients that should remain relatively fixed. To prevent your service API from changing every time you tweak your database design, an abstraction layer that’s not tied directly to database tables is necessary. Under the models directory, add a directory logical, where the logical or domain model classes will go. Under controllers, create a directory called service, which we’ll use later in this book when we break our application into services. Although you won’t have anything to put in these directories right now, create them anyway. The mere presence of this hierarchy will remind you that something does go here, and that it’s of a different sort than what we put in the physical directories.

The third category is for utility scripts intended to be run with script/runner. These are background processes that send out emails or do various other tasks. Usually they run on a schedule or are run by hand. These classes don’t have helpers, and the controller is cron or you, the operator.

You may find that your application has additional categories. You will also no doubt find the need, within each category, to further subdivide in order to maintain your own sanity. The main point to understand is that if you don’t lay out a framework for managing different types of classes from the start, you will end up with a mess that is hard to tame late in the game.

Therefore, it’s strongly recommended that right at the beginning you expand the initial set of directories created for you under app. Even if you don’t use all of the directories right away, having them serves as a reminder that files should be organized up front. In the early days of your application, you will only fill the directories related to your physical models, but having them pre-organized into their own directories makes it much easier to add other types of classes later.

A proposed generic directory hierarchy that can be used as a starting point is shown in Example 3-2. The additions to the basic set are the physical, logical, and service subdirectories. You can pick whatever names you like for these. The rest of this chapter will be devoted to organization within a given top-level module. The focus will primarily be on the physical models, since that’s the first part of the application to be written. We’ll also see how interactions from one module to another are possible. To do so, we’ll present a standard way to write a utility model that interacts with physical models.

Example 3-2. Directory structure organized from Day One

app/controllers
app/controllers/physical
app/controllers/service
app/helpers
app/helpers/physical
app/models
app/models/physical
app/models/logical
app/views/layouts
app/views/physical
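
If you would rather script the extra directories than create them by hand, a minimal sketch along the following lines could work. It assumes only Ruby’s standard FileUtils library; the helper name create_organized_directories and the constant ORGANIZED_DIRECTORIES are made up for this illustration and are not part of Rails.

```ruby
require 'fileutils'

# Subdirectories added in Example 3-2, relative to the application root.
ORGANIZED_DIRECTORIES = %w(
  app/controllers/physical
  app/controllers/service
  app/helpers/physical
  app/models/physical
  app/models/logical
  app/views/physical
).freeze

# Hypothetical helper: creates any missing directories under app_root
# and returns the full path of each one.
def create_organized_directories(app_root)
  ORGANIZED_DIRECTORIES.map do |relative|
    path = File.join(app_root, relative)
    FileUtils.mkdir_p(path) # does nothing if the directory already exists
    path
  end
end
```

Calling create_organized_directories('example_app') right after running the rails command would lay out the hierarchy; because mkdir_p tolerates existing directories, running it a second time is harmless.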

Module Boundaries for Namespacing

Namespaces are a feature of many computer programming languages, and have great benefits for large projects, especially those with multiple programmers working simultaneously. At the simplest level, a namespace is a way to group related classes together, and at the same time separate those classes from other, unrelated classes.

A project that uses namespaces has three big wins over a project that doesn’t:

  1. One developer can work on a feature within the confines of one namespace while another developer works on a different feature within the confines of its own namespace. They don’t risk stepping on each other’s toes.
  2. If it makes sense for the overall project, two classes can have the same name, as long as they are in different namespaces. For example, a Clothing::Boot can exist alongside an Automobile::Boot (as in the trunk of the car) without any problem. The namespace provides the context, and there are no naming collisions.
  3. Namespaces can provide an abstraction barrier between large, disparate sets of code. If documentation is published describing what’s “public” in the namespace, everything else within it can be changed safely as long as classes in other namespaces restrict themselves to using the published API.

Namespaces do for classes what classes do for data and methods.

In Ruby, namespaces are implemented with modules. Around every class in the module, or around a set of classes, you specify the start and end of the module:

module MyModule
  class Foo
  end
end

There can now be another class called Foo in another module:

module YourModule
  class Foo
  end
end

From within each module, to access the Foo class, you just say Foo. From another module, you prefix the class name with the module name: ::MyModule::Foo.
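
The following sketch puts those lookup rules to work using the Clothing::Boot and Automobile::Boot names mentioned earlier. The method names new_boot and new_footwear, and the description strings, are invented for the illustration.

```ruby
module Clothing
  class Boot
    def description
      'footwear'
    end
  end

  # Inside Clothing, the bare name Boot resolves to Clothing::Boot.
  def self.new_boot
    Boot.new
  end
end

module Automobile
  class Boot
    def description
      'luggage compartment'
    end
  end

  # Inside Automobile, the same bare name resolves to Automobile::Boot.
  def self.new_boot
    Boot.new
  end

  # Crossing the module boundary requires the qualified name.
  def self.new_footwear
    ::Clothing::Boot.new
  end
end
```

Both modules say Boot.new, yet each gets its own class. Only the call that reaches into the other module has to spell out the full name.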

ActiveRecord Associations Between Modules

As shown earlier, to place a class within a module, you simply open and close the module around the class. Within a module, you define relationships between ActiveRecord classes no differently than you would when there are no modules at all.

However, because modules provide namespacing and scoping, when crossing a module boundary, you need to tell ActiveRecord where to look for the class being referenced. In this book, we will build a movie ticket application that contains information about movies and also ticket sales. Ticket sales depend on the movies that exist, but not vice versa. Thus, let’s take a very simple set of classes (they won’t be the final classes we arrive at later in this book) to illustrate how you define an order in one module that depends on a movie in another. Example 3-3 shows these two classes; the additional ActiveRecord code needed to make it work is the :class_name option on the association.

Example 3-3. ActiveRecord relationships between modules

# models/physical/movies/movie.rb
module Physical
  module Movies
    class Movie < ActiveRecord::Base
    end
  end
end

# models/physical/orders/order.rb
module Physical
  module Orders
    class Order < ActiveRecord::Base
      belongs_to :movie,
                 :class_name => '::Physical::Movies::Movie'
    end
  end
end
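
To see why the fully qualified name matters, here is a plain-Ruby sketch of looking up a class from a string. It is not ActiveRecord’s actual implementation; resolve_class_name is a hypothetical helper, and the two classes are empty stand-ins that do not descend from ActiveRecord::Base.

```ruby
module Physical
  module Movies
    class Movie; end
  end

  module Orders
    class Order; end
  end
end

# Hypothetical helper: turns a fully qualified name such as
# '::Physical::Movies::Movie' into the class it names by walking
# down from the top-level namespace one segment at a time.
def resolve_class_name(name)
  name.sub(/\A::/, '').split('::').inject(Object) do |scope, part|
    scope.const_get(part)
  end
end

movie_class = resolve_class_name('::Physical::Movies::Movie')

# The bare name Movie is not defined inside Physical::Orders, which is
# why a name inferred from the association alone would not be enough.
found_locally = Physical::Orders.const_defined?(:Movie, false)
```

Here movie_class ends up as Physical::Movies::Movie, while found_locally is false: nothing named Movie lives in the Orders module itself.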

Certainly, it would be more elegant if ActiveRecord’s association methods could take a separate :module parameter, so that the class name itself need not be repeated if it could be inferred from the name of the association itself, but at the present time ActiveRecord does not have this support. In a sense, only three years into the explosion of the Rails phenomenon, we are blazing the trail toward Enterprise with each new day. Expect more features needed for large enterprise systems to emerge as more and more Rails sites grow to need them.

Reciprocal Relationships

In the previous example, our Order class knows about Movie, but not vice versa.

Most Rails tutorials encourage you to create cross-dependencies where none exist. Indeed, if you examine Figure 3-1, you will see the table structure for the Movies and Orders classes.

Figure 3-1. Two tables with a foreign key reference; is the relationship reciprocal?

ActiveRecord tutorials often encourage you to create the following classes to represent these tables:

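class Movie < ActiveRecord::Base
  has_many :orders
end

# The orders table holds the foreign key, so Order uses belongs_to.
class Order < ActiveRecord::Base
  belongs_to :movie
end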

Suddenly, between two classes in the application layer, we have a cross-dependency that did not exist between the tables in the data layer. Defining the interrelationship provides convenient methods you may wish to use later, such as movie.orders and order.movie, but if you don’t expect that you will ever need to access the relationship in both directions—in this case, movie.orders seems like a reporting rather than operational concept—it’s better to leave out the reciprocal relationship definition. When a developer tries to access the relationship in reverse, he’ll get an error, indicating the method does not exist, and that moment will provide an opportunity to examine the design. Either the access was inappropriate and the developer’s goals could be achieved in another way, or the reverse relationship should be created.
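
The effect of leaving out the reverse direction can be sketched without a database. The Struct classes below are plain-Ruby stand-ins rather than ActiveRecord models, and the module name OneWayExample and the sample data are invented for the illustration.

```ruby
module OneWayExample
  Movie = Struct.new(:title)          # knows nothing about orders
  Order = Struct.new(:movie, :seats)  # holds the reference, like a foreign key
end

movie = OneWayExample::Movie.new('Casablanca')
order = OneWayExample::Order.new(movie, 2)

forward_title = order.movie.title # the defined direction works

# The reverse direction was never defined, so Ruby raises NoMethodError.
begin
  movie.orders
rescue NoMethodError => error
  reverse_error = error
end
```

The rescued error is the prompt described above: the moment to decide whether the reverse access is really needed or whether the goal can be reached another way.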

None of this is to say that the vast majority of relationships between classes are or should be one-way relationships. In fact, the majority in your application probably will be reciprocal. There will be many more classes than there are modules. The art of design is to recognize the clusters of one-way relationships that do exist, because doing so opens up great possibilities down the road: it decreases developer coordination overhead, and it makes the application more flexible and open to being split into separate services where and when appropriate.

Modules Presage Services

The ideas above about reciprocal relationships may seem like making a big fuss about something rather inconsequential. So why bother? In fact, being careful about class dependencies can help you identify borders for modules. And modules can provide guidance for an application split that comes along with a move to a service-oriented architecture.

For example, imagine if we added a third class, Popcorn, to represent our foray into selling snacks at the movie theatre. If we were defining reciprocal relationships de rigueur, without much thought to actual needs, we would have a reciprocal relationship between movies and orders, and another reciprocal relationship between popcorn and orders.

If we could recognize early on that movies and popcorn are completely unrelated, we could put them in separate modules from the start. This also necessitates that the Orders class be in its own separate module as well; certainly it doesn’t belong in one of the Movies or Popcorn modules.

Figure 3-2 shows our three classes with interdependencies in Frame 1. With the names of the classes present, it’s easy to gloss over the relationships because you know what the inherent relationships really are. Frame 2 shows the relationships again, but without class names. Would you imagine mapping Movies, Popcorn, and Orders onto this diagram? Frame 3 shows the classes, with only the relationship we want. An order can be for a movie or for popcorn.

Figure 3-2. Adding reciprocal relationships that do not really exist

Figure 3-2 then goes on to show the progression we might see in our software over time. The number of classes around each topic area increases. This is shown in Frame 4, where we’ve placed a module boundary around each set of classes. For example, the Movies module might also contain ratings, theatres, and showtimes. The Orders module might contain classes for credit cards and billing addresses. And the Snacks module could contain candy, soda, etc. From the start, you want to ensure that the defined relationships allow you to easily put boundaries around each set. Eventually, these boundaries can become physical ones—in a service-oriented architecture—rather than simply suggestions implied by module namespacing. Frame 5 shows the potential set of services and their interconnections. In effect, we’re back full-circle to Frame 3. Abstractions allow us to keep complex relationships and processes simple to deal with.

Remember that good fences make good neighbors, and so too does modularization make for good software.

Ensuring Proper Load Order

Rails does a pretty good job of locating and loading classes when they are first accessed. As a convention over configuration issue, this means that everything will work fine unless you are pushing the envelope of convention too far.

There are two scenarios when the autoloader can become confused. In the simplest case, it just doesn’t know what directory to find your files in. In that case, you may see an error like the following one:

NameError: uninitialized constant Physical::Movie

For this type of error, you simply need to add the new path to the autoloader’s list of directories it searches. In environment.rb, in the Rails::Initializer.run section, add or modify the following statement to add your new load paths:

config.load_paths += %W(
    #{RAILS_ROOT}/app/models/logical
    #{RAILS_ROOT}/app/controllers/service
)

In the above example, the directories for logical models and service controllers were added.
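
It can help to picture how a constant name turns into a file name. The sketch below is a simplified approximation of that mapping, not the autoloader’s real code; constant_to_path and candidate_files are hypothetical helpers written for this illustration.

```ruby
# Hypothetical helper: converts a constant name into the relative file
# name an autoloader would look for, e.g. "Physical::Movie" becomes
# "physical/movie.rb".
def constant_to_path(const_name)
  path = const_name.gsub('::', '/')
  path = path.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split runs of capitals
  path = path.gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split camel-case words
  "#{path.downcase}.rb"
end

# Hypothetical helper: lists the file that would be tried in each load path.
def candidate_files(const_name, load_paths)
  load_paths.map { |dir| File.join(dir, constant_to_path(const_name)) }
end
```

Under this approximation, a reference to Physical::Movie with app/models in the load paths points at app/models/physical/movie.rb. If no load path contains a file at that relative location, the result is the uninitialized constant error shown earlier.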

The next type of autoloader problem occurs when you define parts of your class in one file, and other parts of the class in another file. In Ruby, it’s perfectly acceptable to reopen classes. You may have some basic class definition that should always be loaded, and then additional functionality that is loaded only under certain circumstances. Or, as we’ll see in Chapter 16, you may need to split your code up for some other reason. When we break our app up into services, we’ll break our logical model classes up into two components: one shared and one not. The autoloader only goes looking for a file when a constant is not yet defined, so once one piece of code is loaded and the class exists, the other part will never be loaded automatically. The errors you see when the autoloader has not loaded all parts of your class will be specific to your classes; the class itself will be defined, but some method or piece of data inside the class will not be.
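
The behavior can be imitated in plain Ruby with const_missing, the hook Ruby calls when an undefined constant is referenced. This is only a stand-in for the autoloader, and the names LogicalSketch, shared_part, and local_part are invented for the illustration.

```ruby
module LogicalSketch
  @lookups = 0

  class << self
    attr_reader :lookups
  end

  # Stand-in for an autoloader: Ruby calls const_missing only when a
  # constant is referenced before it has been defined.
  def self.const_missing(name)
    @lookups += 1
    const_set(name, Class.new { def shared_part; :shared; end })
  end
end

first = LogicalSketch::Movie.new # constant missing: the hook defines the class
LogicalSketch::Movie.new         # constant now defined: the hook is skipped

lookups_after_two_references = LogicalSketch.lookups
has_local_part_before = first.respond_to?(:local_part)

# The second half of the class arrives only when something loads it explicitly.
module LogicalSketch
  class Movie
    def local_part; :local; end
  end
end

has_local_part_after = first.respond_to?(:local_part)
```

The hook runs once, for the first reference only. Until the reopened definition is loaded by hand, the class exists but local_part does not, which is the kind of half-defined class described above.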

To deal with this type of issue, you actually need to explicitly load your class files, disregarding the autoloader altogether. For us, this will come up for our models/logical directory. In general, to force loading of an entire directory, the code snippet in Example 3-4 can be placed in application.rb.

Example 3-4. Code to load an entire directory of files

Dir["#{RAILS_ROOT}/app/models/logical/*.rb"].each { |file|
  require_dependency "logical/#{file[file.rindex('/') + 1...-3]}"
}

Here, the directory logical was used, but it can be replaced with any directory for which you want to force loading. The code snippet works in the following way:

  1. The directory listing of files ending in .rb is placed in an array to be iterated over.
  2. For each item in the array, we construct the appropriate name to pass to the require_dependency method.
  3. We construct the name by taking all characters after the last “/” character and up to the .rb. rindex('/') returns the index of the last slash, and ...-3 as the end of the range means “up to, but not including, the third-to-last character,” which drops the trailing .rb. Put together, these become a range, which returns a substring of the filename. For example, app/models/logical/foo.rb becomes foo.
  4. The filename is appended back onto the part of the path that is below one of the load paths, as defined above.
  5. We then pass this to require_dependency, which loads a class from its expected location even if the class is already defined.
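
The string handling in steps 3 and 4 can be tried on its own, outside of Rails. The sketch below uses the sample path from step 3; the variable names are made up, and File.basename is shown only as an equivalent standard-library way to get the same substring.

```ruby
file = 'app/models/logical/foo.rb'

last_slash = file.rindex('/')         # index of the final "/"
name = file[last_slash + 1...-3]      # everything after it, minus ".rb"
dependency = "logical/#{name}"        # the string handed to require_dependency

# File.basename with a suffix argument produces the same file name.
same_name = File.basename(file, '.rb')
```

Here name and same_name are both "foo", and dependency is "logical/foo".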

Exercises

  1. Examine the classes in your problem space. Draw a dependency graph and find clusters of classes that can be broken up into modules.
  2. For each module, determine the dependency type: cascading, independent, or container/contained.

Getting There: Refactor Steps

Refactoring to modules can be a big process. If you haven’t had modules in mind while developing your application, your list of classes may be messy. Two “big refactors” are detailed here. Step one is simply to get your physical classes (those derived from ActiveRecord::Base) into a single module. Step two is to detangle “utility” functions intended to be run with script/runner into their own classes, which belong in the Utility module.

High-Level Module Refactor

  1. Create the directories physical, logical, and utility under the models directory.
  2. Move all of your classes that descend from ActiveRecord::Base (likely all of them) into the models/physical directory.
  3. Wrap each class with the following:
    module Physical
      # original code
    end
    
  4. Create the directories physical and service under the controllers directory.
  5. Move all of your classes that descend from ActionController::Base (likely all of them) into the controllers/physical directory.
  6. For each class, repeat step 3.
  7. Adjust all routes in routes.rb to take the module name into account:
    map.connect 'foo/:action/:id', :controller => 'physical/foo'
    

  8. Make analogous changes in your test directory for unit and functional tests. Remember to repeat step 3 for each set of files.
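
With many files to convert, step 3 may be worth scripting. The following is a rough sketch of the text transformation only; wrap_in_module is a hypothetical helper, and it assumes the source has no heredocs or other constructs that are sensitive to indentation.

```ruby
# Hypothetical helper: wraps Ruby source text in "module <name> ... end",
# indenting every non-blank line by two spaces.
def wrap_in_module(source, module_name)
  source = "#{source}\n" unless source.end_with?("\n")
  indented = source.lines.map do |line|
    line.strip.empty? ? line : "  #{line}"
  end
  "module #{module_name}\n#{indented.join}end\n"
end

original = "class Movie < ActiveRecord::Base\nend\n"
wrapped = wrap_in_module(original, 'Physical')
```

Reading each file, passing its contents through a helper like this, and writing the result to the new location would cover steps 2 and 3 together. Review the output by hand before committing it.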

Detangling Utility Methods

Utility methods are those that you never intend to be run while executing a user request. They are run exclusively via script/runner, either from the command line or through a crontab process. Often these methods will be mixed into your ActiveRecord models as class methods, but they don’t belong there. It is often surprising to many beginning Rails programmers that you can create classes that don’t descend from ActiveRecord::Base or ActionController::Base. You certainly can. Follow these steps to deconvolve utility methods from other classes:

  1. Locate all of your script/runner processes. A good place to start looking is in your crontab settings.
  2. For each method found, create a new file under the models/utility directory named after the process, e.g., emailer.rb.
  3. Structure the file based on the following template:
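    module Physical
      module Utility
        class Emailer
          # params defaults to an empty hash so that the method can be
          # invoked without arguments from script/runner.
          def self.run(params = {})
            # original code goes here
          end
        end
      end
    end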

  4. Anywhere that the original code referenced the original class with self, replace it with the original class’s name. In this case, it would likely be Email.
  5. Wherever you run these scripts, alter the way the process is invoked. So:

./script/runner "Email.send_emails"

becomes:

./script/runner "Physical::Utility::Emailer.run"
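
To make the end result concrete, here is a self-contained sketch of a detangled utility class. The Email class is a plain-Ruby stand-in for what would really be an ActiveRecord model, and its queue, unsent, and deliver! methods, along with the :limit parameter, are all assumptions made for this illustration.

```ruby
module Physical
  # Hypothetical stand-in for an ActiveRecord model; a real model would
  # read its records from the database instead of an in-memory queue.
  class Email
    attr_reader :recipient

    def initialize(recipient)
      @recipient = recipient
      @sent = false
    end

    def sent?
      @sent
    end

    def deliver!
      @sent = true
    end

    def self.queue
      @queue ||= []
    end

    def self.unsent
      queue.reject(&:sent?)
    end
  end

  module Utility
    class Emailer
      # The utility class refers to Email by name rather than self (step 4).
      # Returns the number of emails delivered.
      def self.run(params = {})
        pending = Email.unsent
        pending = pending.first(params[:limit]) if params[:limit]
        pending.each(&:deliver!)
        pending.size
      end
    end
  end
end
```

The model keeps only what describes an email, while the scheduled work lives in Emailer, where cron or an operator can invoke it with or without parameters.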