API Evolution - What does coupling mean to you?

When talking about REST, the conversation usually boils down to coupling. REST enables distributed components to evolve independently over long periods of time, and it does this by constraining where coupling can occur.

I’ve made this pitch a thousand times with the blind assumption that everyone shares the same understanding of the term coupling. If I were to define coupling abstractly, I would say that it is a measure of how changes in one thing might cause changes in another. The confusion arises when you start to consider the best place to put the things that might change.

An example

Consider the following example HTTP requests:

a) GET /server/WidgetsThatAreFunky
b) GET /server/Widgets?foo=34&bar=98&sortBy=baz

In both examples, a client is making a request to the server to get a list of widgets that meet the criterion “Funky”.

My perspective

In my definition of coupling, the first example has very low coupling between the client and the server because the client is simply expressing WHAT it wants and not HOW to find the widgets that meet that criterion. If the way we define “Funky” changes, the server can make that change and the client just works.
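To make this concrete, here is a minimal sketch of option (a) in Python. Everything in it is an assumption for illustration: the `Widget` fields, the `is_funky` rule, and the `handle` function that stands in for a real HTTP server. The only point is that the definition of “Funky” lives on the server, and the client knows nothing but the resource URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Widget:
    name: str
    foo: int
    bar: int
    baz: int


# Hypothetical data held by the server.
WIDGETS = [
    Widget("a", foo=34, bar=98, baz=3),
    Widget("b", foo=34, bar=98, baz=1),
    Widget("c", foo=10, bar=98, baz=2),
]


def is_funky(w: Widget) -> bool:
    # Assumed definition of "Funky". Only the server knows it,
    # so it can change without any client being touched.
    return w.foo == 34 and w.bar == 98


def handle(path: str) -> list[Widget]:
    # Stand-in for the server's request dispatch.
    if path == "/server/WidgetsThatAreFunky":
        return sorted((w for w in WIDGETS if is_funky(w)), key=lambda w: w.baz)
    raise LookupError(path)


def client_fetch_funky() -> list[str]:
    # The client states WHAT it wants, nothing about HOW.
    return [w.name for w in handle("/server/WidgetsThatAreFunky")]
```

If `is_funky` is rewritten tomorrow, `client_fetch_funky` stays exactly as it is.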

The alternative

However, I have now run into a number of people who perceive option (b) as having less coupling because the server does not require any knowledge of what the client might want to do with the data. It is the client that determines how the “Funky” criterion is defined. This perspective opens the door to building a generic API for the server and allowing the clients to come up with all kinds of unexpected behaviour. I believe this is the essence of why people are building APIs for their web sites. Build a generic API and let the crowds do their magic.

In both cases, a change to the “Funky” criterion requires changing only one component. In the first example, the server needs to change. In the second example, the client needs to change.
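Option (b) can be sketched the same way. Again, this is only an illustration under assumptions: the widget data, the `handle` function standing in for the server, and the equality-only filtering are all made up. What it shows is that the server is now generic, and the definition of “Funky” has moved into the client as a set of query parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical data held by the server.
WIDGETS = [
    {"name": "a", "foo": 34, "bar": 98, "baz": 3},
    {"name": "b", "foo": 34, "bar": 98, "baz": 1},
    {"name": "c", "foo": 10, "bar": 98, "baz": 2},
]


def handle(url: str) -> list[dict]:
    # Generic server: filters on whatever properties the client names.
    parts = urlsplit(url)
    if parts.path != "/server/Widgets":
        raise LookupError(parts.path)
    params = dict(parse_qsl(parts.query))
    sort_by = params.pop("sortBy", None)
    rows = [
        w for w in WIDGETS
        if all(str(w.get(key)) == value for key, value in params.items())
    ]
    if sort_by is not None:
        rows.sort(key=lambda w: w[sort_by])
    return rows


# The client owns the definition of "Funky" (assumed values).
FUNKY = {"foo": 34, "bar": 98}


def funky_url() -> str:
    return "/server/Widgets?" + urlencode({**FUNKY, "sortBy": "baz"})
```

Here a change to what “Funky” means is a change to `FUNKY` in every deployed client, while `handle` stays untouched.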

The problem

Here’s the problem. Although option (b) sounds like the most flexible option, it does have a significant number of disadvantages:

If anything about the properties foo, bar or baz changes, then the client’s “funky” definition may become invalid.
The server cannot optimize for delivering “funky” widgets, because it has no knowledge of the definition of “funky”.
Updating clients is normally riskier and more difficult than updating servers.
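The first of these disadvantages is easy to demonstrate. The sketch below is hypothetical throughout: it imagines a server release that renames `foo` to `fooLevel`, and a generic query function that silently matches nothing on an unknown property (a real server might return an error instead).

```python
def generic_query(widgets: list[dict], criteria: dict) -> list[dict]:
    # Option (b) style filter: an unknown property simply matches nothing here.
    return [
        w for w in widgets
        if all(w.get(key) == value for key, value in criteria.items())
    ]


# Server data before the change.
V1 = [
    {"name": "a", "foo": 34, "bar": 98},
    {"name": "b", "foo": 10, "bar": 98},
]

# Hypothetical server release that renames the property foo to fooLevel.
V2 = [{"name": w["name"], "fooLevel": w["foo"], "bar": w["bar"]} for w in V1]

# Option (b): the definition of "funky" was shipped inside the client.
CLIENT_FUNKY = {"foo": 34, "bar": 98}


def server_funky_v2(widgets: list[dict]) -> list[dict]:
    # Option (a): the server updates its own definition in the same release
    # as the rename, so clients of the named resource never notice.
    return generic_query(widgets, {"fooLevel": 34, "bar": 98})


before = [w["name"] for w in generic_query(V1, CLIENT_FUNKY)]
after = [w["name"] for w in generic_query(V2, CLIENT_FUNKY)]
named_resource = [w["name"] for w in server_funky_v2(V2)]
```

With this data, the client-side definition finds widget "a" before the rename and nothing after it, while the server-side definition keeps returning "a".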

Option (b) exposes a much larger API surface area than option (a), which means there are many more dependencies between the client and the server. However, with option (a) the client is far more limited in what it can do with the data.

Two worlds

These two options represent to me two fundamentally different approaches on how to build distributed applications.

From my perspective, option (a) is what REST is all about: encapsulating the behaviour of a service behind resources, so that clients are simply expressing their intent. I believe the idea that REST services should be generically reusable by many different clients has been skewed from the original intent. REST services should be usable by clients on many different platforms; however, those clients should be trying to achieve primarily the same goals. Where clients are able to do completely different things with services, it is largely accidental, a by-product of the uniform interface and of media types that apply the principle of least power. That’s why it’s called “serendipitous reuse” and not “planned reuse”.

Option (b) is about exposing data. If you carry it to its logical extreme, you get the Semantic Web. You need intelligent clients and you need servers that can withstand high volumes of unpredictable requests. I see this style as being appropriate for clients that crawl data in batch mode, rather than for clients that respond interactively to user requests.

I think the question you need to ask when building a distributed system is: what am I trying to achieve? Am I trying to expose semantically rich data to the world for people to explore and create things from, and do I have enough computing horsepower and financial resources to support that? Or am I trying to build the most efficient distributed application that allows a user to achieve a goal, no matter what client platform they are running on?

I’m not convinced that a middle ground makes sense in this case. I think both types of systems have their place, but they serve very different purposes. What do you think?