Arguments In And About Ruby

When I first started writing Ruby code a few years ago, I had a hard time wrapping my head around Rails’ DSL methods. They looked nothing like the Java methods I’d been used to reading and writing; declarative and high-level, they accept a dizzying variety of options and switches. And they’re idiomatic within the community; Ruby is full of, and famous for, its DSL-friendly features, and programmers aren’t shy about using them. These features run counter to common wisdom about appropriate method signatures, yet option hashes remain popular among both library authors and users. The pattern isn’t restricted to Ruby these days either; it’s common in JavaScript to pass objects of options into all sorts of places.

So what gives?

As an example of that wisdom, no less an authority than Uncle Bob argues in Clean Code (which I recommend) for a conservative approach to function arguments.

The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided wherever possible. More than three (polyadic) requires very special justification—and then shouldn’t be used anyway.

Following this advice results in libraries with lots of small methods that accept small numbers of arguments, although constructors are sometimes used to populate objects with their dependencies in various DI frameworks.

But three additional signatures are common (and even acceptable) in various DSLs, and you can find all three in ActiveRecord:

# def initialize(attributes = {})
Pirate.new :name => "Greenbeard", :ship => "The Queen Anne's Renegotiation"

# def has_many(of_what, options = {})
has_many :crew_members, :class_name => "Pirate", :foreign_key => "captain_id"

# def validates_presence_of(*arrrrrgs)
validates_presence_of :parrot, :peg_leg, :blank => false

These are, in order: a hash; a single argument followed by a hash; and an open list of arguments used to model a homogeneous list, followed by a hash of options that apply to all objects in the preceding list. This seems almost fatally flexible compared with the far narrower first three, but their use is generally more constrained than it first appears.
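To see how each shape receives its arguments, here is a minimal sketch in plain Ruby. The method names are hypothetical stand-ins, not ActiveRecord's implementation; they only hand back what they were given so the split between positional arguments and options is visible.

```ruby
# Hypothetical stand-ins for the three signature shapes (not ActiveRecord code).

# 1. A lone hash of attributes.
def build_pirate(attributes = {})
  attributes
end

# 2. One required argument followed by a hash of options.
def crew_association(of_what, options = {})
  [of_what, options]
end

# 3. An open list of names with an optional trailing hash of options.
def require_presence_of(*args)
  # Peel the options hash off the end of the list if the caller supplied one.
  options = args.last.is_a?(Hash) ? args.pop : {}
  [args, options]
end

names, options = require_presence_of :parrot, :peg_leg, :blank => false
# names is [:parrot, :peg_leg]; options is { :blank => false }
```

The third form is the only one that needs any work: the trailing hash arrives as the last element of the splat and has to be separated by hand.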

Time for a Breakdown

Named arguments in Ruby (again, JavaScript applies here too) fall into roughly two use cases:

  1. As options that modify the behavior of the method; this appears most often in DSL form, where the method in question has a relatively simple signature but a wide variety of modifiers.
  2. As data passed into a constructor; these serve to “rehydrate” objects that consist essentially of empty slots and behavior.

Note that these can overlap; DSL methods often turn around and immediately construct objects from their arguments, which means that these options become the “data” of that object; likewise, data given to a Ruby object often modifies the behavior of that object, as with ActiveRecord classes.

What has always struck me most about #1 is its resemblance to a Unix command: commands often accept flags that modify their behavior in dramatic ways while sticking to the idea of doing one main thing well. Internally, however, those flags are broken down into more manageable, modular components. The same idea applies here: a method which takes a large number of arguments often serves only as an entry point into the underlying library; from an organizational standpoint, these options get broken down into smaller and smaller method calls, at which point the same rules of method organization apply that Uncle Bob suggests above.

An interesting analogy is Git’s “plumbing” vs “porcelain” command set: small commands get built into big ones, much the same way that large DSL methods get decomposed into methods with far narrower scope.
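A small sketch of that decomposition, with every name invented for illustration: one wide "porcelain" method accepts the option hash, then hands each option to a "plumbing" method that takes only one or two arguments.

```ruby
# Hypothetical "porcelain" method: one entry point, many optional modifiers.
def list_crew(crew, options = {})
  crew = select_rank(crew, options[:rank]) if options[:rank]
  crew = sort_by_key(crew, options[:sort_by]) if options[:sort_by]
  crew = crew.first(options[:limit]) if options[:limit]
  format_names(crew, options.fetch(:separator, ", "))
end

# Hypothetical "plumbing" methods: each does one thing with a narrow signature.
def select_rank(crew, rank)
  crew.select { |member| member[:rank] == rank }
end

def sort_by_key(crew, key)
  crew.sort_by { |member| member[key] }
end

def format_names(crew, separator)
  crew.map { |member| member[:name] }.join(separator)
end

crew = [
  { :name => "Smee",  :rank => :mate },
  { :name => "Bones", :rank => :mate },
  { :name => "Hook",  :rank => :captain }
]

list_crew(crew, :rank => :mate, :sort_by => :name, :separator => " / ")
# => "Bones / Smee"
```

Only `list_crew` breaks the small-signature rule; everything beneath it can be read and tested on its own.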

The most common use case for #2 is found in constructors which mass-assign attributes based on the results of database loads or HTTP requests. Note that its use shouldn’t be confined to ActiveRecord; indeed, it is sometimes useful precisely as a way to avoid pushing attributes indiscriminately into one’s model. Building these types of mapper objects yourself is a good way to avoid burdening models or controllers with too much complexity.
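As a rough sketch of such a mapper, assuming a made-up `PirateParams` class and attribute list: the constructor takes a hash, keeps only the attributes it knows about, and ignores everything else.

```ruby
# Hypothetical mapper object; the class name and attribute list are assumptions.
class PirateParams
  ATTRIBUTES = [:name, :ship].freeze

  attr_reader(*ATTRIBUTES)

  def initialize(attributes = {})
    ATTRIBUTES.each do |key|
      # Accept symbol or string keys, since request params usually arrive as strings.
      value = attributes.fetch(key) { attributes[key.to_s] }
      instance_variable_set("@#{key}", value)
    end
  end

  # Only the known attributes come back out, ready to hand to a model.
  def to_h
    ATTRIBUTES.each_with_object({}) { |key, hash| hash[key] = public_send(key) }
  end
end

params = PirateParams.new("name" => "Greenbeard", "admin" => true)
params.name  # "Greenbeard"
params.to_h  # holds only :name and :ship; the "admin" key is dropped
```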

Benefits and Tips

Significantly, following these patterns and opening high-level methods up to a multiplicity of arguments makes code easier to maintain and understand, not harder. This is due at least in part to weak automated refactoring tools; I’ve seen the pain caused by the need to manually modify the arity of a method used in dozens of spots in an application. Likewise, it’s often easier to remember one big method, and its most important options, than to hunt and peck through an API for the one little method that you need.

If you do choose to use option hashes, here are several things you can do to help keep them maintainable, and not ruin the fun for the rest of us:

  • Document your defaults. Having a hash of default arguments serves both as effective documentation, allowing you to see at a glance what options a method knows about, and as a fallback for when the user doesn’t supply an option that you need. JavaScript libraries generally do a good job here.
  • Validate option keys. ActiveSupport supplies an assert_valid_keys method to help you provide feedback to your users when they fat-finger a hash key.
  • Build an object. When it comes time to process those option hashes, many frameworks use the options to populate an object that understands the goal of the method but is easier to test in small pieces.
  • Don’t go nuts. Remember, smaller is still better most of the time.
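The first three tips can be sketched together in plain Ruby. Everything here is hypothetical (the `CrewAssociation` class, its defaults, the `crew_of` method), and the key check is a hand-rolled stand-in for ActiveSupport's assert_valid_keys so the example runs without Rails.

```ruby
# Hypothetical option object; class name, options, and defaults are assumptions.
class CrewAssociation
  # The defaults double as a list of every option the method understands.
  DEFAULTS = { :class_name => "Pirate", :foreign_key => "captain_id" }.freeze

  attr_reader :name, :class_name, :foreign_key

  def initialize(name, options = {})
    # Hand-rolled stand-in for ActiveSupport's assert_valid_keys.
    unknown = options.keys - DEFAULTS.keys
    unless unknown.empty?
      raise ArgumentError, "Unknown option(s): #{unknown.join(', ')}"
    end

    settings = DEFAULTS.merge(options)
    @name = name
    @class_name = settings[:class_name]
    @foreign_key = settings[:foreign_key]
  end
end

# The DSL-style method stays tiny: it only builds the object.
def crew_of(name, options = {})
  CrewAssociation.new(name, options)
end

association = crew_of :crew_members, :foreign_key => "skipper_id"
association.class_name   # "Pirate", taken from DEFAULTS
association.foreign_key  # "skipper_id", supplied by the caller

begin
  crew_of :crew_members, :foreign_ky => "skipper_id"
rescue ArgumentError => error
  error.message  # names the mistyped key
end
```

Because the options end up on an ordinary object, the defaulting and validation can be tested without going through the DSL method at all.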

Got any more tips to add? Leave ’em in the comments or follow up on Twitter.
