A Fresh Coat Of REST Paint On A SOAP Stack

Published on November 9, 2015

In my experience, once a SOAP API gets into production and is working, nobody wants to touch it. They can be very finicky beasts. Sometimes the most innocuous update can stop a client application in its tracks. Exposing a SOAP API to external customers just raises the risk level. One approach I frequently see is putting an HTTP or REST API in front of the SOAP stack. This adds a layer of indirection that leaves some wiggle room in the exposed API. This blog post talks about some of the issues to consider when trying to build a resource-oriented HTTP API as a façade to a SOAP API.

Anatomy Of A SOAP Call

Here is an example of a SOAP request:

image

Most SOAP requests you will see in the wild are HTTP POST requests. Although the SOAP specification did define a WebMethod feature to allow use of other HTTP methods, it was rarely implemented. There is value in changing the request method to more accurately reflect the characteristics of the request.

SOAP requests are usually targeted at a single URL that represents the entire service. In our example case, the service is identified by http://www.webservicex.com/globalweather.asmx . Different operations in the service are identified by the contents of the request payload. In an HTTP-oriented design, each interesting concept, aka resource, is identified by a URL, and the HTTP methods are used to perform a variety of operations on that resource.

The operation parameters are also part of the request payload, whereas with HTTP style APIs, the parameters are part of either the URL path or query string.
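To make the contrast concrete, here is a minimal sketch of how a client might assemble a SOAP 1.1 style call for the example operation. The operation and parameter names (GetWeather, CityName, CountryName) come from this article; the XML namespace and the helper name are placeholders I made up for illustration, and the real values would come from the service's WSDL.

```python
from xml.sax.saxutils import escape

SERVICE_URL = "http://www.webservicex.com/globalweather.asmx"
# Assumed namespace, for illustration only; the real one is defined by the WSDL.
SERVICE_NS = "http://example.com/globalweather"


def build_soap_request(city_name, country_name):
    """Return (method, url, headers, body) for a SOAP 1.1 style GetWeather call."""
    # Both the operation name and its parameters travel inside the body.
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        f'<GetWeather xmlns="{SERVICE_NS}">'
        f"<CityName>{escape(city_name)}</CityName>"
        f"<CountryName>{escape(country_name)}</CountryName>"
        "</GetWeather>"
        "</soap:Body>"
        "</soap:Envelope>"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": f'"{SERVICE_NS}/GetWeather"',
        "Content-Length": str(len(body)),
    }
    # Every operation is POSTed to the same service URL.
    return "POST", SERVICE_URL, headers, body
```

Notice that the URL and the method say nothing about what is being asked for. Everything interesting is buried in the payload.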

The corresponding response to the HTTP request is shown below:

image

The response body from a SOAP request always uses an XML envelope to surround the actual response body.  In this example, the actual XML response is escaped using entity names to avoid any potential clashes with the envelope.  The success or failure of the response is difficult to determine in any standardized way because result codes are often defined in ways that are specific to the particular API or operation.

Let The Resources Multiply

Possibly the first and easiest change that can be made is to lift the action out of the request body and into the URL. The most mechanical thing we could do for our example request is:

http://www.webservicex.com/globalweather.asmx/GetWeather{?CityName,CountryName}
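The {?CityName,CountryName} part is URI Template notation for query string parameters. As a rough sketch of what expanding it means, assuming a hypothetical helper name and simple string values only, something like this would do:

```python
from urllib.parse import quote, urlencode

BASE = "http://www.webservicex.com/globalweather.asmx/GetWeather"


def expand_query_template(base, **params):
    """Expand a {?a,b} style template: undefined (None) values are left out."""
    defined = {name: value for name, value in params.items() if value is not None}
    if not defined:
        return base
    # quote_via=quote percent-encodes spaces as %20 rather than '+'.
    return f"{base}?{urlencode(defined, quote_via=quote)}"


url = expand_query_template(BASE, CityName="New York", CountryName="United States")
```

With those arguments the result ends in ?CityName=New%20York&CountryName=United%20States.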

This would work and is a perfectly valid resource identifier. However, there are a few improvements that are possible.

Legacy Cruft

The .asmx extension is a remnant of the days when web resources were commonly stored as files on the disk of a web server. The convention continued into the services world as a way of routing requests to particular handling code. However, most modern web frameworks don't need us to expose these implementation details anymore.

Redundancy and Contradiction

The second change we can make is to remove the Get prefix. Developers that are using HTTP to make requests to an API will learn there is a predefined set of methods that can be used with a resource. The GET method is one of those methods and maps most closely to the read-only nature of this operation. Once we are using the GET HTTP method, having Get in the resource identifier is redundant. What is worse, if the API decides to implement one of the other HTTP methods on the resource, then Get becomes contradictory. Doing a PUT to a GetWeather resource becomes very confusing!

Our new resource identifier becomes:

http://www.webservicex.com/globalweather/Weather{?CityName,CountryName}

Service Boundaries

There seems to be some additional redundancy between GlobalWeather and Weather. In the SOAP-oriented world, it was common to group a set of operations into a logical service that has a corresponding implementation with the same boundary. This practice has followed into the HTTP world through the use of the term API, but it is not actually necessary. As long as a resource identifier can be mapped to its corresponding implementation, there is no need to expose the implementation's service boundaries to the outside world.

So, we could just drop the globalweather path segment:

http://www.webservicex.com/Weather{?CityName,CountryName}

Sometimes it does make sense to logically group resources, and using path segments is the ideal way to do this.  However, don't assume that the most logical grouping to the consumer of the resources maps identically to the API implementation boundaries.

Parameterized Resource Space

It is common to translate parameters to service methods into query string parameters, and in many cases that is the optimum solution. There is, however, a scenario where it makes more sense to move the parameters into the path. The one characteristic that distinguishes path segments from query string parameters is that path segments define a hierarchical resource space, whereas query string parameters are independent dimensions.

In our example, country name and city name are a natural hierarchy. Our resource identifier could look like:

http://www.webservicex.com/Weather/{CountryName}/{CityName}
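On the façade side, something has to turn that hierarchical path back into the parameters the SOAP operation expects. Here is one way that mapping might look. The function name and the shape of the returned dictionary are my own assumptions for illustration, not part of any framework.

```python
from urllib.parse import unquote, urlsplit


def weather_params_from_url(url):
    """Map /Weather/{CountryName}/{CityName} to SOAP-style operation parameters."""
    # Split on "/" before decoding so an encoded slash stays inside its segment.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) != 3 or segments[0] != "Weather":
        return None  # not a weather resource; let another route handle it
    country, city = (unquote(s) for s in segments[1:])
    return {"operation": "GetWeather", "CityName": city, "CountryName": country}
```

A request for /Weather/Canada/Montreal would then be handed to the SOAP stack as a GetWeather call with CityName "Montreal" and CountryName "Canada", and the client never needs to know that.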

Find A Better Method

For most operations that are read-only, the decision to use the HTTP GET method instead of the SOAP POST should be an easy one. One of the main reasons that SOAP did not allow GET, in early versions, is the fact that GET requests are not expected to carry request bodies. SOAP relied on the body to communicate the action name and parameters, but we have seen how we can move all that information into the URL and remove the need to send a request body.

Safe Operations

There are only two reasons that we should not map the GET method to an underlying SOAP call:

The first is if the size or format of a parameter makes it too difficult to put into a URL. For example, the purpose of a call might be for the client to upload an image, which the server processes and then returns in updated form. Encoding image data into a URL query string is likely to be ugly and bloated, and will probably run into physical URL length limitations imposed by network component implementations.

The second reason is if the client is intentionally trying to change some data in the system; in that case GET is not the right method. If, while retrieving some data, internal system updates cause timestamps to be updated or counts to be incremented, then GET is still fine.

Unsafe Operations

If GET cannot be used, then the next methods to consider are DELETE and PUT.  If a resource is being removed conceptually from the system, then DELETE is most appropriate.    

If a resource is being replaced then PUT is the best option.  If just a portion of an existing resource is being changed then PATCH is the first choice.

If a new resource is being created and the client is able to assign the resource identifier itself, then PUT works well. 

And The Rest

If none of the above cases fit, then POST is the fallback method.
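The decision process described in this section can be summarized as a small function. This is just my reading of the rules above written down as code; the function and flag names are invented for illustration, and real operations will sometimes need more judgment than a handful of booleans.

```python
def choose_method(*, changes_data, fits_in_url=True, removes=False,
                  replaces_whole=False, partial_update=False,
                  client_assigns_id=False):
    """Pick an HTTP method for a SOAP operation, following the order above."""
    # Safe operations: the client is not trying to change anything,
    # and the parameters are small enough to live in the URL.
    if not changes_data and fits_in_url:
        return "GET"
    if removes:
        return "DELETE"
    # Full replacement, or creation where the client picks the identifier.
    if replaces_whole or client_assigns_id:
        return "PUT"
    if partial_update:
        return "PATCH"
    # Everything else, including read-only calls with large payloads.
    return "POST"
```

For example, the weather lookup comes out as GET, while the image processing call from earlier (read-only, but too big for a URL) falls all the way through to POST.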

Changing the Response

The response is significantly easier to deal with than the request. 

Unwrap The Body

SOAP responses are always wrapped inside an XML envelope payload. The soap:Envelope and soap:Body elements should be removed, and if any kind of encoding has been applied to the body, as in our example, it should be undone.
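As a sketch of what the unwrapping might look like, the function below strips a SOAP 1.1 envelope and returns the inner payload with its entity escaping undone. It assumes the common layout of a single response element containing a single result element holding the escaped XML as text; the function name is hypothetical and other services may nest things differently.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def unwrap_soap_body(envelope_xml):
    """Return the inner payload of a SOAP response as a plain XML string."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("no SOAP Body payload found")
    response = body[0]  # assumed: one response element per Body
    # Assumed: the escaped payload sits in the first child of the response element.
    result = response[0] if len(response) else response
    # The XML parser has already turned &lt; and &gt; back into < and >.
    return (result.text or "").strip()
```

The string that comes back is what the façade would send to the client as the response body.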

Correct The Response Description

The Content-Length header should be adjusted to reflect the new body size, and the Content-Type header should be updated to accurately describe the semantics of the message being returned.
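A small sketch of that header fix-up follows. The helper name is made up, and application/xml is only a stand-in default; a more specific media type would describe the message better if one exists for your data.

```python
def describe_response(unwrapped_xml, media_type="application/xml"):
    """Return (headers, body) for the unwrapped payload."""
    body = unwrapped_xml.encode("utf-8")
    headers = {
        "Content-Type": f"{media_type}; charset=utf-8",
        # Content-Length counts bytes, not characters.
        "Content-Length": str(len(body)),
    }
    return headers, body
```

The one trap here is measuring the length of the string rather than the encoded bytes, which gives the wrong answer as soon as a city name contains a non-ASCII character.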

But What About The REST?

Following the suggestions that I have laid out so far will enable converting a SOAP API into a resource-oriented HTTP API that may or may not satisfy the "self-descriptive" constraint of REST, depending on the media types chosen. However, my experience is that most SOAP APIs are built to expose a different set of data than you would if you were building an API based on the principles of REST, as described by Roy Fielding. There is really no way to do a mechanical translation of a SOAP API into a REST API that is designed to maximize evolvability through the use of self-describing messages and hypermedia. If extremely loose coupling and independently evolvable components are important objectives, then it is unlikely that there will be a one-to-one mapping between SOAP operations and HTTP resources, and therefore the guidance in this article will be of limited use.

So yes, the title of the article is a misnomer, but I'd rather see more HTTP-based resource oriented APIs than argue about the imprecision of commonly used terms!