In my last post I discussed what I perceive to be a misuse of the term self-description. I made the claim that one of the negative side-effects of this misuse is the development of more API metadata description languages. I believe this is a path we have travelled before and it is a dead end.
Those who ignore history are destined to re-invent it
We have a tendency in this industry to look back on past efforts that have failed and believe that either the people who failed were misguided or that "things are so different now that those past experiences are not relevant". We mock SOAP and WSDL with the disdain a seventeen-year-old has for their "out of touch" parents. Those teenagers forget their parents were seventeen once too.
The efforts of the past few years to take the simple JSON format and build JSON Schema, JSON Pointer and JSON Path are a hint that maybe the complexity of XML wasn't completely without value. The even more recent efforts to build API description formats like Swagger, Mashery's format and API Blueprint show that we are heading back in the direction of WSDL. The most recent efforts to create an API directory and API search mechanism make the letters UDDI spring to my lips.
In no way am I saying that these services are not valiant attempts to solve real problems. They are the same real problems that distributed systems designers were trying to solve 10 years ago. I also believe that, because these services are truer to the design of HTTP, they will be incrementally better than what we tried to make work 10 years ago. However, the web didn't become the phenomenon that it has through incremental improvement. Web browsers and web servers took a fundamentally different approach to building distributed applications than had been taken before. The web works differently, but we keep trying to turn it back into the model we are familiar with, and in the process we lose its advantages.
The problem, in my humble opinion, comes down to what it really means to be self-descriptive. An API implementation should describe itself. The implementation itself should be the source of truth. Having a metadata document describe the API makes the design document the source of truth and the actual HTTP implementation a secondary artifact. That might work for a while, on a new product, in a tightly controlled team. However, five years down the road, when a bug needs to be fixed, or a new feature needs to be added by developers who were never involved in the initial design, that metadata document describing the API is going to be forgotten and the real-world API will diverge from it. The WWW works because it was designed to survive the test of time, and its evolution is measured in years, not in days until the financing runs out.
The other problem with API design documents is that it is highly unlikely there will ever be a winner. There will be fragmentation of the industry and there will be incompatibility. HTTP is pretty much universal; we need to take advantage of that. I also assert that any sufficiently sophisticated API description language that is capable of fully describing an HTTP API will be more complex than the HTTP specification itself. As a point of reference, it took the HTTPbis team of HTTP experts and industry veterans more than five years just to clarify the wording of the existing HTTP specification.
The cynic in me would also like to point out that a significant amount of the effort going into creating these API design languages comes from vendors who want to build tooling that can consume those design documents. We've been down this path before. Let's not make the same mistakes this time.
History provides the answers
There are two web clients that have made the web successful. The most obvious one is the web browser; without it, end users would not be able to consume the content of the web. However, just as important is the web crawler. Without the web crawler, the search engines would not know what content is on the web and finding web content would be so much more difficult. For those of you old enough to remember having to use dmoz.org to find web sites, I'm sure you will agree. Web crawlers work because they understand many of the common media types of the web and they know how to follow links. When a web crawler follows a link, it has no idea what the semantics of the response are going to be until it processes the Content-Type header of the response.
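To make that concrete, here is a minimal sketch in Python, using the requests library, of a crawler-style client that fetches a link and decides how to interpret the body purely from the Content-Type of the response. The URL, the media types handled and the handler functions are placeholders for illustration, not a real crawler.

```python
# Sketch of the crawler idea: fetch a link, then pick a handler based solely
# on the Content-Type of the response, not on the shape of the URL.
import requests

def handle_html(body):
    print("treating response as HTML, extracting anchors...")

def handle_hal(body):
    print("treating response as application/hal+json, following _links...")

def crawl(url):
    response = requests.get(url)
    # The media type tells the client what it has received.
    media_type = response.headers.get("Content-Type", "").split(";")[0].strip()
    if media_type == "text/html":
        handle_html(response.text)
    elif media_type == "application/hal+json":
        handle_hal(response.text)
    else:
        print(f"no handler registered for {media_type}, skipping")

crawl("https://example.org/")
```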
The vendors who are currently trying to build tooling that will generate API documentation, create client proxy libraries, catalog APIs and manage APIs need to recognize that the solution is sitting right in front of us. HTTP is the universal interface, and it already supports introspective requests. HTTP methods like HEAD can discover all kinds of valuable meta information about actual HTTP resources. Link relations like "describedby" can be used to point to either machine-readable or human-readable documentation about particular resources.
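As a hedged sketch of that kind of introspection, assuming a hypothetical https://api.example.org resource: a HEAD request returns the same metadata headers a GET would, and the Link header can carry a "describedby" relation pointing at documentation for the resource.

```python
# Introspecting a resource with plain HTTP: HEAD surfaces representation
# metadata without transferring the body, and the Link header can advertise
# where the documentation for this resource lives.
import requests

response = requests.head("https://api.example.org/orders/42")

# Metadata about the representation, straight from the implementation.
print(response.headers.get("Content-Type"))
print(response.headers.get("Last-Modified"))

# requests parses the Link header into a dict keyed by relation type.
docs_link = response.links.get("describedby")
if docs_link:
    print("documentation lives at", docs_link["url"])
```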
I don't have anywhere close to all the answers. I think as a community we need to do a significant amount of experimentation and research on how to annotate our API implementations with small chunks of metadata, so that they can be crawled to generate the meta artifacts we need to support the growth of the web API infrastructure.
We don't need a single all-encompassing Web API description language; we need a few dozen little specifications that can be mixed and matched to describe the features of our API implementations. These annotations need to be attached to our APIs using standard HTTP mechanisms. This will allow us to retrofit existing APIs and automatically support all platforms.
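As a rough illustration of what attaching annotations with standard HTTP mechanisms might look like on the server side, here is a sketch of a tiny WSGI application that decorates an existing JSON response with "describedby" and "profile" Link headers. The link targets and payload are hypothetical; the point is that the running API itself carries the metadata, so a crawler can discover it without a separate design document.

```python
# A plain WSGI app whose responses advertise their own metadata via
# standard Link headers, no out-of-band description document required.
from wsgiref.simple_server import make_server

def app(environ, start_response):
    headers = [
        ("Content-Type", "application/hal+json"),
        # Machine- or human-readable documentation for this resource.
        ('Link', '<https://api.example.org/docs/order>; rel="describedby"'),
        # A profile link naming the conventions the representation follows.
        ('Link', '<https://example.org/profiles/order>; rel="profile"'),
    ]
    start_response("200 OK", headers)
    return [b'{"id": 42, "status": "shipped"}']

if __name__ == "__main__":
    make_server("localhost", 8000, app).serve_forever()
```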
Show me
In an upcoming post I plan on exploring existing HTTP mechanisms that can be used to describe an actual API.