Gears Within Gears

Seek simplicity, and distrust it.

Rails and REST: ActiveResource doesn't play nice with domain-driven resource identifiers

Posted by Brian Guthrie Fri, 29 Aug 2008 00:39:00 GMT

An update to my post from a few days ago about REST and domain identifiers. This is the entirety of the code from ActiveResource::Base#save:

  def save
    new? ? create : update
  end

Here’s the punchline:

  def new?
    id.nil?
  end

ActiveResource doesn’t document it, but you can override the primary key that it uses to communicate with your backend system (self.primary_key = :username), and indeed this is the most effective way I know of to ensure that it performs deletes and updates with the appropriate key. The trouble is that if you attempt to create a resource remotely and pre-populate the username (which you’d want to be able to do; it is, after all, a domain identifier), ActiveResource chokes and tries to update the remote resource instead of creating it.

To put it another way: ActiveResource thinks that if your resource has an ID it’s not new. It cannot separate the concepts of auto-increment database-generated ID and resource identifier.

The correct way to solve this problem is to flag any resources retrieved from a remote system as “not new” and consider any other created resources as new. Until that’s fixed I recommend using a conditional approach in which your controller checks incoming IDs and finds on the appropriate column based on whether or not the ID looks like a resource identifier or a database integer. More on that next post.
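A minimal sketch of that flag-based idea, in plain Ruby rather than as a patch to ActiveResource. The class name RemoteResource, the from_remote factory, and the placeholder create and update methods are all hypothetical; the only point is that new? consults a flag instead of checking whether an ID is present.

```ruby
# Hypothetical stand-in for a remote resource; this is not ActiveResource itself.
class RemoteResource
  attr_reader :attributes

  # Resources built locally start out new, whatever identifier they carry.
  def initialize(attributes = {})
    @attributes = attributes
    @persisted = false
  end

  # Hypothetical factory for anything loaded from the remote system.
  def self.from_remote(attributes)
    new(attributes).mark_persisted!
  end

  # Flag the resource as already existing on the remote side.
  def mark_persisted!
    @persisted = true
    self
  end

  # Newness depends on the flag, not on whether an identifier is set.
  def new?
    !@persisted
  end

  # Same shape as ActiveResource::Base#save, but driven by the flag.
  def save
    result = new? ? create : update
    mark_persisted!
    result
  end

  private

  # Placeholders for the POST and PUT requests a real client would make.
  def create
    :created
  end

  def update
    :updated
  end
end
```

With this in place, RemoteResource.new(:username => "gthreepwood") reports new? as true even though its domain identifier is already populated, so the first save goes down the create path and later saves go down the update path.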

This is part of a series of posts on Rails and REST. Read the rest.


Rails and REST: Don't use auto-incremented database IDs as your resource identifier

Posted by Brian Guthrie Mon, 25 Aug 2008 05:21:00 GMT

By default, Rails expects your resource to be identified by its model’s primary key ID. Don’t give in to temptation. There are a lot of good reasons to avoid going that route, and avoiding it will save you pain and trouble later. This isn’t news in the Rails community, as the framework has always had excellent routing support for custom URLs. But the default XML serialization code isn’t as forgiving, and it’s worth ticking off a few reasons to take the extra step.

They aren’t human-friendly

Database IDs mean nothing to the human beings consuming your resource. This matters even when all you’re providing is a pure REST service with no UI component, as with the project I’m working on now, because although it may be other applications consuming the service, human beings are writing the code. One of the great advantages of RESTful APIs is that they’re pretty easy for even casual programmers, maybe technically-minded domain experts, to consume with a simple, hacky script. Anything you can do to encourage those people and make their lives easier will come back to you in user happiness and cool, unexpected new uses.

This isn’t just useful for the warm fuzzies, though. Most importantly, it allows users to self-identify the resource, saving you an unnecessary search. More on that later.

They’re (usually) generated arbitrarily

My current project involves importing a large amount of data from a third-party system. For a long time, whenever we re-ran that import any system that relied on a link to our resource broke because the IDs got regenerated. Of course you can fix this with a smarter import (and we have) but a better solution is simply to use the data we’re given. A URL-friendly ID that’s inherently tied to the data is much less fragile.

They don’t allow you to model concepts

Some resources don’t map cleanly to database tables, or it doesn’t make any sense to map them that way. Consider a resource that exposes the weather on a given day, and retains a history of that weather. Your historical data is probably stored in a database somewhere, but the concept you’re modeling is the date.

Exposing the resource with database IDs forces the user to perform a search if they want to find out what the weather was five years ago from this Tuesday. Without a URL representation you end up with a query of the form /weather?date=2003-08-26, which is more appropriately a /show/ than it is an /index/ call. /weather/2003-08-26 is not only a cleaner URL but retains cleaner semantics; an index should give you a list of things back, so the former URL will deserialize into an array with a single Weather resource (courtesy of ActiveResource) whereas the latter is simply a single object.
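As a rough illustration of a date-identified resource, here is a plain-Ruby sketch. WeatherReport, its in-memory list of reports, and the find_by_param helper are hypothetical; in a Rails app the same idea would typically hang off to_param on the model and a finder on the date column.

```ruby
require "date"

# Hypothetical in-memory model; a real app would back this with a database.
class WeatherReport
  attr_reader :date, :summary

  def initialize(date, summary)
    @date = date
    @summary = summary
  end

  # The URL segment for this resource: the ISO 8601 date, not a row ID.
  def to_param
    date.strftime("%Y-%m-%d")
  end

  # Hypothetical finder: look a report up by the date segment of the URL.
  # Returns nil when no report matches or the segment does not parse as a date.
  def self.find_by_param(reports, param)
    date = Date.strptime(param, "%Y-%m-%d")
    reports.find { |report| report.date == date }
  rescue ArgumentError
    nil
  end
end
```

A request for /weather/2003-08-26 would then pass "2003-08-26" to find_by_param and get back a single report or nil, which maps naturally onto a show action and a 404.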

In a future post I’ll talk about ways to minimize the amount of trouble involved in moving over to a human-friendly resource identifier.


Rails and REST: A reference to commonly-used HTTP status codes and their use in REST APIs

Posted by Brian Guthrie Fri, 22 Aug 2008 16:19:00 GMT

I see a fair amount of confusion among developers of RESTful services, who often return incorrect or incomplete HTTP failure codes from their controller actions, so here’s a quick reference to some of the most commonly-used codes in the 400 range and their specific use in a Rails context.

| Code | Rails symbol | Use |
|------|--------------|-----|
| 401 | :unauthorized | The requester has not presented valid credentials for this resource. Useful if you’re trying to roll some form of RESTful authentication (e.g. see Amazon’s S3 authentication). |
| 404 | :not_found | The request is trying to access a resource that does not exist. Use this when your find method raises ActiveRecord::RecordNotFound, or your find_by returns nil. |
| 405 | :method_not_allowed | The request is asking to perform a CRUD operation that your resource doesn’t support. It’s certainly much friendlier than bombing out because you haven’t defined a create method on your controller. |
| 406 | :not_acceptable | The request is trying to access a resource in a format that your server doesn’t support. If a request doesn’t match any format you’ve provided in your respond_to block, Rails will automatically respond with this and an empty body (see ActionController::MimeResponds). |
| 408 | :request_timeout | Strictly, HTTP defines this as the server timing out while waiting for the request. I also use it when the timestamp on a request doesn’t match up to the server, which is useful for rejecting requests that an attacker plays back at a later time. |
| 409 | :conflict | The POST or PUT performed on your resource comes into conflict with an existing resource. Use this when your model fails a validates_uniqueness_of validation. The easiest way to discover this is to compare the contents of the model error messages against ActiveRecord::Errors.default_error_messages[:taken]. |
| 422 | :unprocessable_entity | The request was well-formed but contains invalid data, as with a regular model validation failure on a PUT or POST. |
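To make the 409 versus 422 distinction concrete, here is a small plain-Ruby sketch. The STATUS_CODES hash simply restates the table; the status_for_failed_save helper and the TAKEN_MESSAGE constant are hypothetical, and the message text is an assumption standing in for whatever ActiveRecord::Errors.default_error_messages[:taken] returns in your Rails version.

```ruby
# Symbols and numeric codes from the table above.
STATUS_CODES = {
  :unauthorized         => 401,
  :not_found            => 404,
  :method_not_allowed   => 405,
  :not_acceptable       => 406,
  :request_timeout      => 408,
  :conflict             => 409,
  :unprocessable_entity => 422
}.freeze

# Assumed wording of the uniqueness failure message; a Rails app would read
# it from the framework rather than hard-coding it.
TAKEN_MESSAGE = "has already been taken"

# Hypothetical helper: choose a status symbol for a failed save by scanning
# the model's error messages for the uniqueness failure text.
def status_for_failed_save(error_messages, taken_message = TAKEN_MESSAGE)
  if error_messages.any? { |message| message.include?(taken_message) }
    :conflict
  else
    :unprocessable_entity
  end
end
```

In a controller the returned symbol could be passed along as the :status option when rendering the model’s errors.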

These are all the codes I’ve used in my RESTful services, but if you know of any others that I’ve missed, post them in the comments or send me an email and I’ll incorporate them into the table above.

Also, it’s worth noting that REST, based as it is on the HTTP we know and love, is supposed to be reasonably human-friendly. In my opinion it’s not necessary to adhere strictly to the above conventions (although I try to) as long as the human beings trying to use your service understand why their requests failed and what they can do to fix them.


Rails and REST: Nested XML and the Law of Demeter

Posted by Brian Guthrie Fri, 22 Aug 2008 01:43:00 GMT

When you’re writing a RESTful web service designed to expose a resource to the outside world, the case for including first-order associations is pretty strong. ActiveRecord’s default serialization method, to_xml, makes it pretty easy: simply pass in the :include key and the list of associations you’d like to nest, and the XML is automatically generated for you. But it can kill performance pretty quickly if you’re not careful. Consider a User resource, which includes, as a first-order association, the Address of that user. The XML will look something like this:

<user>
  <first-name>Guybrush</first-name>
  <last-name>Threepwood</last-name>
  <job-title>Fearsome Pirate</job-title>
  <address>
    <street-address>1 Governors Mansion</street-address>
    <city>Melee Island</city>
    <state>The Caribbean</state>
    <zip-code type="integer">12345</zip-code>
  </address>
</user>

This is fine for a single resource, and probably what I want to see when I ask for /users/gthreepwood.xml. But when I ask for /users.xml?job_title=Pirate, and I’m getting back a list of resources that can stretch into the dozens or hundreds (there are a lot of pirates on Melee Island) it can start to drag quickly. You’re asking for more data from your database server (be sure you’re including the associations not only on the to_xml call but in the initial find call as well), it takes longer to serialize each object, and it takes longer (often much, much longer) for the client to parse that XML.

Worst of all, your client probably doesn’t need the full address. There’s plenty of precedent in the web world for this; when you’re displaying a list of things you probably want an abbreviated view, but when you’re looking at a specific resource you probably want a much richer view. You’re also designing an API that forces the client to reach through multiple layers of an object in order to dig out the one piece of information they need.

Instead, follow the Law of Demeter. If all the client of your resource needs when displaying a list of users is a text description of their street address, consider adding a method to your model that provides it:

def full_address
  "#{address.street_address}, #{address.city}, #{address.state} #{address.zip_code}" 
end

And use it in your controller as follows:

respond_to do |format| 
  format.xml do 
    render :xml => @users.to_xml(:methods => [:full_address])
  end
end

By my count, the former representation comes to 282 bytes and the latter to 200—a savings of almost one third. Multiplied over a large XML document with multiple resources that translates into a big win for both network bandwidth and XML parsing speed.

There’s one hitch, and this applies anytime you include a method as part of your XML. If the client modifies that resource and tries to update it by sending it back to your server, and you handle it by passing params[:user] straight into the update_attributes method of the relevant User object, your model will complain that it doesn’t know anything about a full_address= method. You’ve provided a getter but no setter.

There are two ways around this: either remove the offending parameter from the attributes you pass into the User model in your controller, or create a no-op setter on your model. I personally prefer the latter, simply because I like my controllers to do as little as possible (and for the record, I think that models should either know how to serialize themselves to XML or that responsibility for doing so should be offloaded to an ERB template, but that’s another post). But either works as long as you’re aware of it and handling it appropriately.
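Both options can be sketched in plain Ruby. The User and Address classes below are hypothetical stand-ins, and update_attributes here is a simplified imitation of mass assignment (one setter call per key), not ActiveRecord’s implementation. The without_derived_attributes helper is likewise an illustrative name for the controller-side approach.

```ruby
# Hypothetical stand-ins for the post's models; a real app would use ActiveRecord.
Address = Struct.new(:street_address, :city, :state, :zip_code)

class User
  attr_accessor :first_name, :address

  # Derived, read-only view of the address, as in the post.
  def full_address
    "#{address.street_address}, #{address.city}, #{address.state} #{address.zip_code}"
  end

  # Option two: a no-op setter that accepts and discards full_address when a
  # client sends the serialized resource back.
  def full_address=(_value)
  end

  # Simplified imitation of mass assignment: call one setter per key.
  def update_attributes(attributes)
    attributes.each { |name, value| send("#{name}=", value) }
    true
  end
end

# Option one: strip derived attributes in the controller before assignment.
def without_derived_attributes(params, derived = ["full_address"])
  params.reject { |key, _value| derived.include?(key.to_s) }
end
```

Without the no-op setter, the send call in this sketch would raise NoMethodError for full_address=, which mirrors the complaint described above.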
