We are currently seeing a significant amount of discussion about building hypermedia APIs. However, the server side plays only part of the role in a hypermedia-driven system. To take full advantage of the benefits of hypermedia, the client must allow the server to take the lead and drive the state of the client. As I like to say, it takes two to Tango.
So you think you can dance?
Soon after I was married, my wife convinced me to take dance lessons with her. Over the couple of years we spent taking lessons, I learned there were three types of people who join a dance studio. There are people who want to get better at dancing, there are couples about to get married who don't want to look like idiots during their 'first dance', and there are divorcees. I'll leave it to you to figure out why the divorcees are there, as I'll just be focusing on the other two groups.
The couples who are preparing for their weddings usually are under time and budgetary constraints, so they usually opt to learn a choreographed sequence of steps to a particular song. They both learn the sequence of steps and, to an extent, dance their own independent steps whilst hanging on to each other. It is a process that serves a purpose. It meets the goals of the couple, but it is a vastly inferior result to the approach taken when people's goal is simply to learn how to dance.
In order to learn to dance there are a number of fundamentals that are required. It is essential to be able to follow the beat of whatever music you are trying to dance to. There is a set of basic dance primitives that can be combined to make up dance sequences. It is also important to understand the role of the man vs that of the woman when dancing (Note: these are traditional role names, and no longer necessarily correlate with gender). The man leads the dance, the woman follows.
As the dance progresses the man chooses the sequences of primitives to perform and uses hand signals, body position and weight change to communicate to the woman what steps are coming next. There is no predefined, choreographed set of sequences. The man basically does whatever he wants within the constraints of the dance style. The woman follows.
When done right, it looks like magic
A talented couple dancing freestyle like this is often indistinguishable from a couple performing a choreographed dance. When people learn to dance this way, they can dance to any piece of music as long as the beat matches a style of dance they know, and they can dance with any partner. Couples who learn only their wedding dance, by contrast, know one sequence to one piece of music and can only dance it with one partner.
The Client-Server dance
Building a client that can consume an HTTP API can be done in different ways. You can build your application to be like a choreographed dance, where both the client and server know in advance what is going to happen. When the client makes an HTTP request to a particular resource it knows in advance how the server will respond. The challenge with this approach is that both parties need to have knowledge of the sequences and, more importantly, of where they are up to in the sequence. If someone decides to make any changes, the other party is likely to get confused by the unplanned change.
A choreographed client
The last twenty years of building clients for distributed applications have taught us how to build highly choreographed clients. We first learn the API that the server exposes and then teach our clients an intricate pattern of interactions in order to achieve our desired goals. Our application is the dance we perform.
Frequently when building clients like this we will create a facade over the remote service and a view model to manage the state of the sequence of interactions. Consider the following example of a distributed application that is designed to perform the dance of turning on and off a light switch.
The service facade:
    using System.Net.Http;
    using System.Threading.Tasks;

    public class SwitchService
    {
        // The client hard-codes every resource it needs; relative URLs are
        // resolved against the HttpClient's BaseAddress.
        private const string SwitchStateResource = "switch/state";
        private const string SwitchOnResource = "switch/on";
        private const string SwitchOffResource = "switch/off";

        private readonly HttpClient _client;

        public SwitchService(HttpClient client)
        {
            _client = client;
        }

        public async Task<bool> GetSwitchStateAsync()
        {
            var result = await _client.GetStringAsync(SwitchStateResource).ConfigureAwait(false);
            return bool.Parse(result);
        }

        public Task SetSwitchStateAsync(bool newstate)
        {
            if (newstate)
            {
                return _client.PostAsync(SwitchOnResource, null);
            }
            else
            {
                return _client.PostAsync(SwitchOffResource, null);
            }
        }
    }
The view model:
    using System.ComponentModel;
    using System.Runtime.CompilerServices;

    public class SwitchViewModel : INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;

        private readonly SwitchService _service;
        private bool _switchState;

        public SwitchViewModel(SwitchService service)
        {
            _service = service;
            // Blocks until the initial state has been fetched from the server.
            _switchState = service.GetSwitchStateAsync().Result;
        }

        // The client keeps its own copy of the switch state.
        private bool SwitchState
        {
            get { return _switchState; }
            set
            {
                _service.SetSwitchStateAsync(value).Wait();
                _switchState = value;
                // Raise notifications for the public properties the view binds to.
                // (A parameterless call here would report the private "SwitchState".)
                OnPropertyChanged("On");
                OnPropertyChanged("CanTurnOn");
                OnPropertyChanged("CanTurnOff");
            }
        }

        public bool On
        {
            get { return SwitchState; }
        }

        public void TurnOff()
        {
            SwitchState = false;
        }

        public void TurnOn()
        {
            SwitchState = true;
        }

        // Client-side rules: what is allowed is derived from the local state.
        public bool CanTurnOn
        {
            get { return SwitchState == false; }
        }

        public bool CanTurnOff
        {
            get { return SwitchState; }
        }

        protected virtual void OnPropertyChanged([CallerMemberName] string propertyName = null)
        {
            var handler = PropertyChanged;
            if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
In our client view model we maintain the current SwitchState. The client needs to know at any point in time, whether the switch is on or off. This information will be provided to the View to present a visual representation to the user and it is also used to drive the application logic that determines if we are allowed to turn the switch on or off again. Our application wishes to prevent someone from trying to turn on the switch if it is already on and turn off the switch if it is already off. This is an extremely simple example but will be sufficient to illustrate differences between the two approaches.
The important point to note is that, just like our engaged couple doing their dance, both the client and server must keep track of the current application state in order to know what they can and must do next.
Sometimes you just have to let go
In this next example, we take away the responsibility from the client of keeping track of state that is already being tracked by the server. The client simply follows the lead of the server and trusts the server to provide it the necessary guidance.
We no longer need to provide a facade over the server API and instead we focus on understanding the messages communicated by the server. For that we have created a class called SwitchDocument that allows the client to parse and interpret the message.

    using System;
    using System.IO;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    public class SwitchDocument
    {
        // Parses a switch state message; links that are absent stay null.
        public static SwitchDocument Load(Stream stream)
        {
            var switchStateDocument = new SwitchDocument();
            var jObject = JObject.Load(new JsonTextReader(new StreamReader(stream)));
            foreach (var jProp in jObject.Properties())
            {
                switch (jProp.Name)
                {
                    case "On":
                        switchStateDocument.On = (bool)jProp.Value;
                        break;
                    case "TurnOnLink":
                        switchStateDocument.TurnOnLink = new Uri((string)jProp.Value, UriKind.RelativeOrAbsolute);
                        break;
                    case "TurnOffLink":
                        switchStateDocument.TurnOffLink = new Uri((string)jProp.Value, UriKind.RelativeOrAbsolute);
                        break;
                }
            }
            return switchStateDocument;
        }

        public bool On { get; private set; }
        public Uri TurnOnLink { get; set; }
        public Uri TurnOffLink { get; set; }

        // The only URL the client knows in advance: the entry point.
        public static Uri SelfLink
        {
            get { return new Uri("switch/state", UriKind.Relative); }
        }
    }
Our view model now has the reduced role of simply presenting the information contained in the SwitchDocument to the view and providing a way to interact with the affordances described in the SwitchDocument.
    using System.ComponentModel;
    using System.Net;
    using System.Net.Http;
    using System.Net.Http.Headers;
    using System.Runtime.CompilerServices;

    public class SwitchHyperViewModel : INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;

        private readonly HttpClient _client;
        private SwitchDocument _switchStateDocument = new SwitchDocument();

        public SwitchHyperViewModel(HttpClient client)
        {
            _client = client;
            _client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/switchstate+json"));
            // Blocks until the initial representation has been loaded.
            _client.GetAsync(SwitchDocument.SelfLink).ContinueWith(t => UpdateState(t.Result)).Wait();
        }

        public bool On
        {
            get { return _switchStateDocument.On; }
        }

        // What the client may do is decided by which links the server sent.
        public bool CanTurnOn
        {
            get { return _switchStateDocument.TurnOnLink != null; }
        }

        public bool CanTurnOff
        {
            get { return _switchStateDocument.TurnOffLink != null; }
        }

        public void TurnOff()
        {
            _client.PostAsync(_switchStateDocument.TurnOffLink, null).ContinueWith(t => UpdateState(t.Result));
        }

        public void TurnOn()
        {
            _client.PostAsync(_switchStateDocument.TurnOnLink, null).ContinueWith(t => UpdateState(t.Result));
        }

        // Every successful response replaces the whole client state.
        private void UpdateState(HttpResponseMessage httpResponseMessage)
        {
            if (httpResponseMessage.StatusCode == HttpStatusCode.OK)
            {
                _switchStateDocument = SwitchDocument.Load(httpResponseMessage.Content.ReadAsStreamAsync().Result);
                // Name the bound property explicitly; a parameterless call here
                // would report "UpdateState" as the changed property.
                OnPropertyChanged("On");
                OnPropertyChanged("CanTurnOn");
                OnPropertyChanged("CanTurnOff");
            }
        }

        protected virtual void OnPropertyChanged([CallerMemberName] string propertyName = null)
        {
            var handler = PropertyChanged;
            if (handler != null) handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
This new hypermedia-driven view model has the same interface as the choreographed one and can be easily connected to the same simple user interface to display the state of the light switch and provide controls that can change the light switch. The difference is in the way the application state is managed. In this case, the view model determines if the switch can be turned on or off based on the presence of links that will turn on and off the switch. Attempting to turn the switch on or off involves making an HTTP request to the server and using the response as a completely new state for the view model.
You can find a complete WPF example in the GitHub repository.
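The messages themselves are not shown here, so the following is a minimal sketch of what two application/switchstate+json representations might look like and how link presence translates into affordances. The property names are the ones SwitchDocument reads; the link targets switch/on and switch/off, the SwitchMessages class and its Affordances helper are assumptions made for illustration, and System.Text.Json is used instead of Json.NET only to keep the sketch free of external dependencies.

```csharp
using System.Text.Json;

public static class SwitchMessages
{
    // Hypothetical message for a switch that is off: only a TurnOnLink is offered.
    public const string OffMessage = "{ \"On\": false, \"TurnOnLink\": \"switch/on\" }";

    // Hypothetical message for a switch that is on: only a TurnOffLink is offered.
    public const string OnMessage = "{ \"On\": true, \"TurnOffLink\": \"switch/off\" }";

    // Mirrors the view model's rule: an action is available only if its link is present.
    public static (bool On, bool CanTurnOn, bool CanTurnOff) Affordances(string json)
    {
        using (var doc = JsonDocument.Parse(json))
        {
            var root = doc.RootElement;
            return (
                root.GetProperty("On").GetBoolean(),
                root.TryGetProperty("TurnOnLink", out JsonElement _),
                root.TryGetProperty("TurnOffLink", out JsonElement _));
        }
    }
}
```

Under these assumptions, Affordances(SwitchMessages.OffMessage) reports that the switch is off, can be turned on and cannot be turned off, without the client comparing any state of its own.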
The similarity is only skin deep
On the surface it appears that the two different approaches produce pretty much the same results. It is almost the same amount of code, with a similar level of complexity. The question has to be asked: what are the benefits?
If you watch a couple dance who have learned a choreographed dance, you may think they are very capable dancers. You may not even be able to tell the difference between them and others doing the same dance who have a much more fundamental understanding of how to dance. The differences only begin to appear when you introduce change. Changing dance partners, changing music or adding new steps will quickly reveal the differences.
The impact of change
The same is true with our sample application. Consider the scenario where new requirements are introduced where a switch could only be turned on during a certain time period, or only users with certain permissions could turn on the switch.
In the choreographed application, we would need to add a number of other server resources that would allow a client to inquire if a user has permission, or if the time of day permits turning on or off the switch. The client must call those resources to retrieve the information, which in itself is not terribly complex. However, deciding when to make those requests can be tricky. Calling them frequently adds a significant performance hit, but caching the values locally can introduce problems with keeping the local state consistent with the server-based resource state.
In the hypermedia-driven client, neither of our new business requirements requires additional resources to be created or extra server roundtrips. In fact the client code does not need to change. All the logic that is used to determine if a client can turn on or off a switch can be embedded into the server logic for determining whether to include a "TurnOn" link or a "TurnOff" link.
The links are always refreshed along with the state of the switch so the client state is always consistent. The state may be stale, but that is fine because HTTP has all kinds of mechanisms for refreshing stale state. The key thing is the client does not need to deal with the complexity of the permissions being an hour old, the timing schedule being ten minutes old and the state of the switch being ten seconds old.
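The server side of this is not part of the example, so the following is only a sketch of how that link decision might look, assuming the rules apply to turning the switch on. The SwitchRepresentation class, its Build method, all of its parameters and the link targets are hypothetical names chosen for illustration; only the property names On, TurnOnLink and TurnOffLink come from the client code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class SwitchRepresentation
{
    // Builds the message for one request. All inputs are hypothetical stand-ins
    // for whatever the server knows about the switch, the user and the schedule.
    public static string Build(bool isOn, bool userMayTurnOn,
                               TimeSpan now, TimeSpan allowedFrom, TimeSpan allowedUntil)
    {
        var doc = new Dictionary<string, object> { ["On"] = isOn };

        bool withinWindow = now >= allowedFrom && now < allowedUntil;

        // New business rules live here: the link is simply left out
        // when turning on is not permitted right now.
        if (!isOn && userMayTurnOn && withinWindow)
        {
            doc["TurnOnLink"] = "switch/on";
        }

        // Turning off keeps its original rule: offered whenever the switch is on.
        if (isOn)
        {
            doc["TurnOffLink"] = "switch/off";
        }

        return JsonSerializer.Serialize(doc);
    }
}
```

A client that only checks for the presence of TurnOnLink picks up both rules the next time it receives a representation, which is why its code can stay as it is.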
Some constraints can lead to unimagined possibilities
The fact that our client application does not need to change to accommodate these new requirements is far more significant than our analogy might lead us to believe. In ballroom dancing there is usually just one man and one woman. The implications of making changes to the dance are limited. In distributed systems, it is not uncommon for a single API to have thousands of client instances and perhaps multiple different types of clients. The client applications are often created by different teams, in different countries, with completely different time constraints.
Being able to make logic changes on the server that would normally be embedded into the client can potentially have huge benefits. The example I have shown only scratches the surface of the techniques that can be applied using hypermedia, but hopefully it hints at the possibilities.
Image Credit: Tango https://flic.kr/p/7BY638
Image Credit: First Dance https://flic.kr/p/d3vMxG
Image Credit: Rumba Steps https://flic.kr/p/dMrKk
Image Credit: Skipping Rope https://flic.kr/p/6pZSGX
Image Credit: Base jump https://flic.kr/p/df5Nwd
Image Credit: Fake Gucci purse: https://flic.kr/p/9w5Qji
Image Credit: Windmills https://flic.kr/p/96iTMv
Image Credit: Hang Glider https://flic.kr/p/6jfG9m