In my experience, once a SOAP API gets into production and is working, nobody wants to touch it. SOAP APIs can be very finicky beasts. Sometimes the most innocuous update can stop a client application in its tracks. Exposing a SOAP API to external customers just raises the risk level. One approach I frequently see is putting an HTTP or REST API in front of the SOAP stack. This adds a layer of indirection that allows some wiggle room in the exposed API. This blog post talks about some of the issues to consider when trying to build a resource oriented HTTP API as a façade to a SOAP API.
Anatomy Of A SOAP Call
Here is an example of a SOAP request:
Most SOAP requests you will see in the wild are HTTP POST requests. Although the SOAP specification did define a WebMethod feature to allow use of other HTTP methods, it was rarely implemented. There is value in changing the request method to more accurately reflect the characteristics of the request.
SOAP requests are usually
targeted at a single URL that represents the entire service. In our
example case, the service is identified by
Different operations in the service are identified by the contents of the
request payload. In an HTTP-oriented design, each interesting
concept, aka resource, is identified by a URL and the HTTP methods are
used to perform a variety of operations on that resource.
The operation parameters are also part of the request payload, whereas with HTTP style APIs, the parameters are part of either the URL path or query string.
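To make the contrast concrete, here is a minimal sketch that builds both kinds of request without sending them. The host `example.com`, the namespace, and the helper names are assumptions for illustration only; the operation and parameter names follow the weather example.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Hypothetical endpoint and namespace, used only for illustration.
SERVICE_URL = "https://example.com/globalweather.asmx"
NAMESPACE = "http://example.com/weather"


def soap_request(operation: str, params: dict[str, str]) -> dict:
    """SOAP 1.1 style: one URL for the whole service, operation and parameters in the body."""
    fields = "".join(f"<{k}>{escape(v)}</{k}>" for k, v in params.items())
    body = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        f'<soap:Body><{operation} xmlns="{NAMESPACE}">{fields}</{operation}></soap:Body>'
        "</soap:Envelope>"
    )
    return {
        "method": "POST",
        "url": SERVICE_URL,
        "headers": {
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{NAMESPACE}/{operation}"',
        },
        "body": body,
    }


def http_request(resource: str, params: dict[str, str]) -> dict:
    """HTTP style: the URL identifies the resource and carries the parameters; no body."""
    return {
        "method": "GET",
        "url": f"https://example.com/{resource}?{urlencode(params)}",
        "headers": {"Accept": "application/xml"},
        "body": None,
    }
```

In the SOAP version everything interesting lives in the body, so an intermediary that only sees the method and URL cannot tell one operation from another. In the HTTP version the same information is visible in the request line.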
The corresponding response to the HTTP request is shown below:
The response body from a SOAP request always uses an XML envelope to surround the actual response body. In this example, the actual XML response is escaped using entity names to avoid any potential clashes with the envelope. The success or failure of the response is difficult to determine in any standardized way because result codes are often defined in ways that are specific to the particular API or operation.
Let The Resources Multiply
Possibly the first and easiest change that can be made is to lift the action out of the request body and into the URL. The most mechanical thing we could do for our example request is,
This would work and is a perfectly valid resource identifier. However, there are a few improvements that are possible.
The .asmx extension is a remnant of the days when web resources were commonly stored as files on the disk of a web server. The convention continued into the services world as a way of routing requests to particular handling code. However, most modern web frameworks don't need us to expose these implementation details anymore.
Redundancy and Contradiction
The second change we can make is to remove the Get prefix. Developers who use HTTP to make requests to an API will learn that there is a predefined set of methods that can be used with a resource. The GET method is one of those methods and maps most closely to the read-only nature of this operation. Once we are using the GET HTTP method, having Get in the resource identifier is redundant. What is worse, if the API decides to implement one of the other HTTP methods on the resource, the Get prefix becomes contradictory. Doing a PUT to a GetWeather resource becomes very confusing!
Our new resource identifier becomes:
There seems to be some additional redundancy between GlobalWeather and Weather. In the SOAP-oriented world, it was common to group a set of operations into a logical service that has a corresponding implementation with the same boundary. This practice has followed into the HTTP world through the use of the term API, but it is not actually necessary. As long as a resource identifier can be mapped to its corresponding implementation, there is no need to expose the implementation's service boundaries to the outside world.
So, we could just drop the globalweather path segment:
Sometimes it does make sense to logically group resources, and using path segments is the ideal way to do this. However, don't assume that the most logical grouping to the consumer of the resources maps identically to the API implementation boundaries.
Parameterized Resource Space
It is common to translate parameters to service methods into query string parameters and in many cases that is the optimum solution. There is however a scenario where it makes more sense to move the parameters into the path. The one characteristic that distinguishes path segments from query string parameters is that path segments define a hierarchical resource space, whereas query string parameters are independent dimensions.
In our example, country name and city name are a natural hierarchy. Our resource identifier could look like,
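As a rough sketch of the steps so far, the identifier can be derived mechanically from the operation name and its hierarchical parameters. The function names, the lowercasing of the resource name, and the exact path layout below are assumptions for illustration, not the original service's URLs.

```python
from urllib.parse import quote


def mechanical_path(service_file: str, operation: str) -> str:
    """First step: lift the operation name out of the body and into the URL."""
    return f"/{service_file}/{operation}"


def resource_name(operation: str) -> str:
    """Drop a leading 'Get' so the name neither repeats nor contradicts the HTTP method."""
    if operation.startswith("Get") and len(operation) > 3:
        operation = operation[3:]
    return operation.lower()


def hierarchical_path(operation: str, *segments: str) -> str:
    """Put hierarchical parameters (country, then city) in the path, percent-encoded."""
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return f"/{resource_name(operation)}/{encoded}"


print(mechanical_path("globalweather.asmx", "GetWeather"))
print(hierarchical_path("GetWeather", "Canada", "Montreal"))
```

Passing `safe=""` to `quote` means a slash inside a country or city name is encoded rather than being mistaken for a path separator.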
Find A Better Method
For most read-only operations, the decision to use the HTTP GET method instead of the SOAP POST should be an easy one. One of the main reasons that early versions of SOAP did not allow GET is that GET requests are not expected to carry a request body. SOAP relied on the body to communicate the action name and parameters, but we have seen how we can move all that information into the URL and remove the need to send a request body.
There are only two reasons that we should not map the GET method to an underlying SOAP call:
The first is if the size or format of a parameter makes it too difficult to put into a URL. For example, the purpose of a call might be for the client to upload an image, which the server processes before returning the updated image. Encoding image data into a URL query string is likely to be ugly and bloated, and will probably run into URL length limitations imposed by network component implementations.
The second is if the client is intentionally trying to change some data in the system. In that case GET is not the right method. If retrieving some data merely causes internal system updates, such as timestamps being updated and counts being incremented, then GET is still fine.
If GET cannot be used, then the next methods to consider are DELETE and PUT. If a resource is being removed conceptually from the system, then DELETE is most appropriate.
If a resource is being replaced then PUT is the best option. If just a portion of an existing resource is being changed then PATCH is the first choice.
If a new resource is being created and the client is able to assign the resource identifier itself, then PUT works well.
And The Rest
If none of the above cases fit, then POST is the fallback method.
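The decision process above can be summarised as a small function. This is only a sketch of the rules of thumb in this section; the `Operation` fields are hypothetical flags that someone would fill in by hand for each SOAP operation, since none of this can be detected automatically from a WSDL.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    changes_data: bool = False            # the client intends to modify state
    fits_in_url: bool = True              # parameters are small and simple enough for a URL
    removes_resource: bool = False
    replaces_resource: bool = False
    partial_update: bool = False
    creates_with_client_id: bool = False  # the client assigns the new resource's identifier


def choose_method(op: Operation) -> str:
    """Pick an HTTP method using the order of preference described in the text."""
    if not op.changes_data and op.fits_in_url:
        return "GET"
    if op.removes_resource:
        return "DELETE"
    if op.replaces_resource or op.creates_with_client_id:
        return "PUT"
    if op.partial_update:
        return "PATCH"
    return "POST"  # fallback when nothing else fits


print(choose_method(Operation()))                   # read-only lookup
print(choose_method(Operation(fits_in_url=False)))  # e.g. the image-processing call
```

Note that the image-processing example falls through to POST even though it changes nothing on the server, simply because its input does not fit in a URL.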
Changing the Response
The response is significantly easier to deal with than the request.
Unwrap The Body
SOAP responses are always wrapped inside an XML envelope. The envelope and its soap:body wrapper should be removed, and if any kind of encoding has been applied to the body, as in our example, that encoding should be undone as well.
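A minimal sketch of the unwrapping step, using Python's standard `xml.etree.ElementTree`. It assumes a SOAP 1.1 envelope and a single payload element inside the body; the `GetWeatherResponse` and `GetWeatherResult` element names in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def unwrap(soap_response: str) -> str:
    """Return the payload inside soap:Body, undoing entity escaping if present."""
    envelope = ET.fromstring(soap_response)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("no soap:Body payload found")
    payload = body[0]  # the operation's response element
    # The parser has already decoded entities such as &lt;, so an escaped
    # XML document comes back as ordinary markup in the element text.
    text = "".join(payload.itertext()).strip()
    if text.startswith("<"):
        return text
    # Otherwise the payload was real XML; serialise it without the envelope.
    return ET.tostring(payload, encoding="unicode")


sample = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><GetWeatherResponse><GetWeatherResult>"
    "&lt;CurrentWeather&gt;&lt;Temperature&gt;5 C&lt;/Temperature&gt;&lt;/CurrentWeather&gt;"
    "</GetWeatherResult></GetWeatherResponse></soap:Body></soap:Envelope>"
)
print(unwrap(sample))
```

A real façade would also need to recognise soap:Fault elements and translate them into an appropriate HTTP status code, which this sketch does not attempt.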
Correct The Response Description
The Content-Length header should be adjusted to reflect the new body size, and the Content-Type header should be updated to accurately describe the semantics of the message being returned.
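A small sketch of recomputing those headers for the unwrapped body instead of copying them from the SOAP response. The function name and the default media type are assumptions; a more specific media type is preferable where one exists.

```python
def facade_headers(body: str, media_type: str = "application/xml") -> dict[str, str]:
    """Compute response headers from the body the façade actually sends."""
    encoded = body.encode("utf-8")
    return {
        "Content-Type": f"{media_type}; charset=utf-8",
        # Content-Length counts bytes on the wire, not characters.
        "Content-Length": str(len(encoded)),
    }


print(facade_headers("<Temperature>5 C</Temperature>"))
```

Measuring the encoded bytes matters as soon as the body contains non-ASCII characters, such as a degree sign in a temperature reading.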
But What About The REST?
Following the suggestions that I have laid out so far will let you convert a SOAP API into a resource oriented HTTP API that may or may not satisfy the "self-descriptive" constraint of REST, depending on the media types chosen. However, my experience is that most SOAP APIs are built to expose a different set of data than you would expose if you were building an API based on the principles of REST, as described by Roy Fielding. There is really no way to do a mechanical translation of a SOAP API into a REST API that is designed to maximize evolvability through the use of self-describing messages and hypermedia. If extremely loose coupling and independently evolvable components are important objectives, then it is unlikely that there will be a one-to-one mapping between SOAP operations and HTTP resources, and therefore the guidance in this article will be of limited use.
So yes, the title of the article is a misnomer, but I'd rather see more HTTP-based resource oriented APIs than argue about the imprecision of commonly used terms!