Tracks: Part 4
This is the fourth and final article in the series reviewing the GTD application Tracks.
Rails is a wonderful productivity enhancer. It can sometimes seem almost like magic: you write a few lines of code here and there, and Rails just “does the right thing.” What’s not to love about that?
Unfortunately, that magic often tricks newcomers to Rails into thinking that the right thing will be done, even when there is no way for Rails to do so. I’m speaking, specifically, of database indexes (or rather, the lack thereof).
The Tracks application is certainly not the only one to do this, so this is in no way directed solely at them. I’ve encountered far too many Rails applications that seem to assume that defining the database tables is sufficient, and that Rails will magically take care of the rest. This, sadly, is not true, and has resulted in a whole generation of web applications with terrible database performance. The only indexes that Rails adds are on your tables’ primary keys. All other indexes must be defined explicitly, by you.
Even veterans can make this mistake. At 37signals, we were missing indexes on four (smaller) tables in Campfire until recently. It wasn’t until we started wondering why the load on that database was so high that I went looking, and discovered some full-table scans occuring in some frequent queries. Adding the necessary indexes caused the load to drop by over half!
I’ve addressed the question of when to add indexes in a post on my own blog. Rather than repost that article here, you can read it there: Indexing for DB performance.
In general, you’ll want indexes on your foreign keys, but if it were only that simple, Rails could figure that out for you. Sadly, the indexes you need depend largely upon how your application will query the database. Consider the following table definition:
1 2 3 4 5 6 7 |
create_table "notes", :force => true do |t| t.column "user_id", :integer, :default => 0, :null => false t.column "project_id", :integer, :default => 0, :null => false t.column "body", :text t.column "created_at", :datetime t.column "updated_at", :datetime end |
The foreign keys here are user_id and project_id. Should there be indexes on those? It depends.
Consider the SQL generated for the following call:
note = Note.find(1234, :include => :user) |
You should see SQL something like the following in your log:
1 2 3 4 |
SELECT * FROM notes, users WHERE users.id = notes.user_id AND notes.id = 1234 |
The database, upon receiving this query, has to determine how to satisfy it. Without getting into the nitty-gritty (mostly because the nitty-gritty differs between databases, but also because I’m only superficially familiar with the nitty-gritty, anyway), the DB decides that it will start with grabbing the note, since it was given a constant for that primary key.
So, it knows the note. That means it knows the user_id, and can then satisfy that part of the query via the primary key on users. After that, it’s done, since it has found all the necessary rows.
In this case, it was able to satisfy the query via the primary keys on the two tables. No other indexes were necessary, and since Rails defines our primary keys for us, that kind of query works efficiently “out of the box”.
Now, consider the SQL generated for something like this:
1 2 |
# user = User.find(1234)
notes = user.notes |
You’ll see something like this in your logs:
1 2 3 |
SELECT * FROM notes WHERE user_id = 1234 |
Ok, in this case, we know the user_id, since we know the user the notes belong to. We want to find all notes with that user_id. The database, however, cannot satisfy that query using any of the defined indexes, so it has to resort to the dreaded “full table scan”. It has to look at every row in the notes table. For small tables, even up to a few thousand rows, that won’t be noticable. But when you start getting tens or hundreds of thousands of rows, the difference becomes significant. If you’re on a shared host, people will start hating you.
So, let’s assume you recognize that your application is querying like this, looking records up by foreign key. Rails, fortunately, makes it simple to add new indexes:
1 2 3 4 5 6 7 8 9 10 |
# script/generate migration AddIndexToUser class AddIndexToUser < ActiveRecord::Migration def self.up add_index :notes, :user_id end def self.down remove_index :notes, :user_id end end |
In general, you’ll want an index on any combination of columns that you are using to look up records. For Tracks, “Find all notes for this user” needs an index on the foreign key, but “Find the user for this note” does not.
I really cannot stress enough the importance of getting to know how your DBMS of choice works. I’ve only scratched the surface in this article, and in the post on my own blog. Find a good book on SQL and indexes. Learn specifically how your database uses indexes to satisfy queries. You cannot hope to scale your application, in any framework, if your database queries are doing full table scans all the time.
Tracks: Part 2
Here we are, #2 in the series of articles exploring the GTD application Tracks. In part 1, Koz discussed the finer points of code reuse. Here, I’ll talk about action filters in your controllers.
I’m focusing primarily on the ContextController and ProjectController in this article—both have many similarities. Their actual purpose is not really relevant here, so I won’t describe what they actually do. Rather, let me just point out one of the first things I noticed when I started reading through the code. This pattern was startlingly prevalent:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
def list init ... end def show init init_todos ... end def create ... end def add_item self.init ... end |
Note that the same method gets called at the start of many of these actions. The intent here, is to do some kind of common setup, or initialization, prior to the meat of each action. Rails provides a cleaner way to do this, via filters. Filters are simply methods that get called before or after an action, and this code is a textbook example of when you ought to be using them.
In general, filters that you want executed before an action are defined via the before_filter macro:
before_filter :init, :init_todos |
Defining the filter like that would cause both init and init_todos to be called prior to every action in the current controller (and its subclasses, since filters in a superclass are inherited by subclasses).
What we want in this instance is something a bit more sophisticated. We don’t want the filters to trigger on every action, only on some of them. In fact, there are only one or two actions that need the init_todos filter. Fortunately, before_filter (and the filter macros in general) provide a very simple, but very powerful, mechanism for including or excluding actions:
1 2 |
before_filter :init, :except => :create before_filter :init_todos, :only => :show |
The above specifies that the init method should be called prior to every action, except create, and init_todos should be called prior only to the show action. Both except and only allow arrays of action names to be given, as well.
Using filters like this can help you to reduce the “noise” in your actions, leaving you free to implement only the tasty bits within the action itself. It really makes your code easier to read, and to maintain.
Another (related) thing I noticed, this time in the ApplicationController, was the following:
1 2 3 4 5 6 7 8 |
before_filter :set_session_expiration # ... def set_session_expiration return if @controller_name == 'feed' || ... # ... end |
The top-level ApplicationController specifies a before filter, set_session_expiration, that will trigger on every single action in all subclasses. It has some logic to determine what the session’s expiration should be. Note, though, that if the controller is determined to be the FeedController (the name of the controller is “feed”), that it just returns.
In general, if you need to refer to the current controller or action name explicitly in your code, you could probably refactor it to be cleaner. In this case, you can let Rails’ filtration code do the conditional work for you. In ApplicationController, simply omit that condition altogether. Then, in the FeedController, do:
1 2 3 4 5 |
class FeedController < ApplicationController skip_before_filter :set_session_expiration # ... end |
session :off. This will disable sessions for that controller, and stop your sessions table from filling up with empty, aggregator-generated sessions.
The skip_before_filter macro allows subclasses to override filter behavior configured in their parent class. Here, we tell Rails that we want to “skip”, or ignore, the set_session_expiration filter for all actions in this controller. (You can use the only and except parameters here, as well, if you need finer tuning.) By using the skip_before_filter macro, the code that applies to the FeedController is in the FeedController, and the ApplicationController remains blissfully unaware of its subclasses.
(While we’re on the topic of filters, I might just add that filters in edge rails have one minor drawback: they cause your stacktrace to become ridiculously long, because they are processed recursively. A way to process them without making recursive calls would be a wonderful patch. Hint, hint!)
Tracks: Part 1
This is part 1 in our multi-part series evaluating the funky GTD app Tracks.
Tracks has two files in their lib directory which got my attention.
/lib/acts_as_namepart_finder.rb which lets you find an object by its ‘name’, and falls back to a partial match; and/lib/acts_as_todo_container.rb which contains a lot of the logic for Todo List management.
These both illustrate a common anti-pattern I see with rails programmers: premature extraction. Just because rails has a bunch of meta programming magic with names like acts_as_list, doesn’t mean you need it.
We’ll start with acts_as_name_part_finder as it’s the simplest case. As the generated method is always the same, there’s no need for the indirection and complication which comes from using an ‘acts_as’ macro.
1 2 3 4 5 6 |
module NamePartFinder def find_by_namepart(namepart) find_by_name(namepart) || find(:first, :conditions => ["name LIKE ?", namepart + '%']) end end |
You just have to call extend in your class definition. Once you’ve done that, you’ll get the nice find_by_namepart methods you’re used to using.
1 2 3 |
class Project < ActiveRecord::Base extend NamePartFinder end |
If you find yourself wanting this same behaviour for fields other than name, this solution won’t help at all. An alternative is to use a simple macro to generate some static methods for you. You can just stick this at the bottom of environment.rb, or somewhere else that gets required.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
# config/environment.rb def partial_finders_for(*attribute_names) attribute_names.each do |attribute_name| class_eval <<-EOS def self.find_by_partial_#{attribute_name}(attribute_value) find_by_#{attribute_name}(attribute_value) || find(:first, :conditions=>["#{attribute_name} LIKE ?", attribute_value + '%']) end EOS end end # app/models/project.rb class Project < ActiveRecord::Base partial_finders_for :name, :nickname end # app/controllers/project_controller.rb def index @projects = Project.find_by_partial_name("jo") end |
But the more fundamental question here is whether to bother with a module at all. It seems the partial_name code is only ever called from one action, so my advice to you is, put it back inline and don’t worry about the duplication until you’re actually using it in multiple places. The only thing worse than repetitive code, is prematurely-extracted code.
Replacing a simple, working solution with a creaking abstraction is much worse than a few duplicated lines. DRY is about avoiding bugs, it’s not an absolute rule.
Now acts_as_todo_container is a slightly more interesting case, unlike name_part_finder. Because the current implementation requires a parameter, you can’t just use a simple module. Looking at the implementation, there are two things which jump out at me.
First, there’s a lot of boilerplate code to support ‘find_todos_include’, which is only needed to prevent the views from triggering multiple queries. Thankfully, has_many takes an :include option which can save us a lot of duplication.
1 2 3 4 5 6 7 8 9 |
class Context < ActiveRecord::Base has_many :todos, :dependent => :delete_all, :include=>:project #... end class Project < ActiveRecord::Base has_many :todos, :dependent => :delete_all, :include=>:context #... end |
Now whenever you access the ‘todos’ collection, AR will include the project or context as needed. With this refactoring in place we can simplify acts_as_todo_container to just one module.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
module TodoList def not_done_todos @not_done_todos ||= find_not_done_todos end def done_todos @done_todos ||= find_done_todos end def find_not_done_todos self.todos.find(:all, :conditions => ["todos.type = ? and todos.done = ?", "Immediate", false], :order => "todos.due IS NULL, todos.due ASC, todos.created_at ASC") end def find_done_todos self.todos.find(:all, :conditions => ["todos.type = ? AND todos.done = ?", "Immediate", true], :order => "completed DESC", :limit => @user.preference.show_number_completed) end end class Project < ActiveRecord::Base has_many :todos, :dependent => :delete_all, :include=>:context include TodoList end |
Another option would be to use composition and extract the Todo list behaviour into a specific class TodoList. Once you’ve done that you can probably simplify the rendering code a little as all the partials could focus on rendering a ‘todo list’.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
class Project < ActiveRecord::Base has_one :todo_list #... end class Context < ActiveRecord::Base has_one :todo_list # ... end class TodoList < ActiveRecord::Base has_many :todos, :dependent => :delete_all # ... end |
The main reason this would prove problematic is that Todo instances need to have references to both their context, and their project. By adding another class here, you’d need to manually ensure that you moved them to the relevant lists whenever you changed the belongs_to associations. That’s probably going to be a lot more code than just using the module method outlined above.

