With the recent surge of interest in hypermedia APIs I am beginning to see the term “self-descriptive” thrown around quite frequently. Unfortunately, the meaning of self-descriptive is not exactly self-descriptive, leading to misuse of the term.
Consider the following two HTTP requests and their responses:
Example 1:
GET /address/99
=> 200 OK
Content-Type: application/json
Content-Length: 508

{
  "_meta" : {
    "street" : { "type" : "string", "length" : 80, "description" : "Street address" },
    "city" : { "type" : "string", "length" : 20, "description" : "City name" },
    "postcode" : { "type" : "string", "length" : 11, "description" : "Postal Code / Zip Code" },
    "country" : { "type" : "string", "length" : 15, "description" : "Country name" }
  },
  "address" : {
    "street" : "1 youville way",
    "city" : "Mysteryville",
    "postcode" : "H3P 2Z9",
    "country" : "Canada"
  }
}
Example 2:
GET /address/99
=> 200 OK
Content-Type: application/vnd.gabba.berg
Content-Length: 90

<berg>
  <blurp filk="iggy">ababa</blurp>
  <bop>
    <bip>yerk</bip>
  </bop>
</berg>
I suspect a fair number of people will be surprised when I make the claim that from the perspective of self-descriptive HTTP messages, the first message is not self-descriptive and the second one is.
The first may contain more descriptive content, but it doesn't use the standardized methods provided to us by HTTP to identify the semantics of the content. The client is forced to make assumptions. The second one is explicit about identifying the meaning of the payload.
Identify yourself
Self-descriptive in HTTP does not mean the message describes itself. It means that the message depends on semantic identifiers, using mechanisms defined by HTTP (e.g. media types and link relations), to convey the complete meaning of the message. This allows a client application to know whether it can understand the incoming message.
The first example contains all kinds of metadata that attempts to describe the actual data in the message. However, how can the client know if it is able to interpret the metadata? It reminds me of the first French language course I took in Quebec, where the teacher started providing instruction in French! Fortunately, humans are pretty intelligent creatures. Software applications, not so much.
Declare your semantics
In example 1, the media type in the Content-Type header is declared as "application/json". Unfortunately that tells me nothing about the meaning of the information in the message body. The client can process the content as JSON, but the message tells it nothing else about what the content means. Allowing a client to assume that, because it retrieved the representation from /address/99, the response will contain information about an address is a violation of the self-descriptive constraint.
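To make the distinction concrete, here is a minimal Python sketch, assuming a client that receives a shortened version of the body from Example 1. The function name parse_by_media_type is made up for illustration. The declared media type licenses parsing the body as JSON and nothing more; the final lookup only works because of an assumption made outside the message.

```python
import json

# Shortened body from Example 1 (the "_meta" block is omitted for brevity).
body = '{"address": {"street": "1 youville way", "city": "Mysteryville"}}'
content_type = "application/json"


def parse_by_media_type(content_type: str, body: str):
    """Do only what the declared media type tells the client it can do."""
    if content_type == "application/json":
        # application/json defines syntax only: objects, arrays, strings, numbers.
        return json.loads(body)
    raise ValueError(f"no parser for {content_type}")


document = parse_by_media_type(content_type, body)

# Nothing in the message says that "address" or "street" mean anything.
# This lookup works only because the developer assumed it, from the URL
# or from out-of-band documentation.
street = document["address"]["street"]
```

The parsing step is justified by the message itself. The last line is justified by nothing the message declares, which is exactly the assumption the self-descriptive constraint forbids.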
Why yes, I do speak Klingon
Example 2, which at first glance appears completely unintelligible to a developer, provides a media type that, in theory, should be registered with IANA. I should therefore be able to find a written specification that explains what all those weird attributes and elements mean. Once I have read the specification, I can write code in my client application to process content that is identified as "application/vnd.gabba.berg".
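As a rough sketch of what that client code might look like, the Python below keeps a table of handlers keyed by media type and refuses anything it has no handler for. The names HANDLERS, handle_berg and process are hypothetical, and so is the way the handler reads the elements: the real meaning of blurp, filk, bop and bip would come from the media type's specification, which is invented here.

```python
import xml.etree.ElementTree as ET


def handle_berg(body: str) -> dict:
    # Hypothetical interpretation: a real handler would follow the written
    # specification for application/vnd.gabba.berg.
    root = ET.fromstring(body)
    blurp = root.find("blurp")
    return {
        "blurp": blurp.text,
        "filk": blurp.get("filk"),
        "bip": root.findtext("bop/bip"),
    }


# The client's entire understanding: one handler per media type it was coded for.
HANDLERS = {"application/vnd.gabba.berg": handle_berg}


def process(content_type: str, body: str):
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        raise LookupError(f"client does not understand {media_type}")
    return handler(body)


# A body shaped like Example 2.
body = '<berg><blurp filk="iggy">ababa</blurp><bop><bip>yerk</bip></bop></berg>'
result = process("application/vnd.gabba.berg", body)
```

The lookup by media type is the important part. The client decides whether it understands the message from the identifier in the Content-Type header, before it looks at the payload, and it fails explicitly rather than guessing when the identifier is unknown.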
There is no magic
I get the impression that some developers perceive hypermedia and self-descriptiveness as some magical property that will allow clients to perform tasks that they previously had no idea how to do.
A client can only process media types it understands. A web browser knows how to render HTML, follow links, fill in forms and run scripts. The browser is completely ignorant of the fact that one HTML page might be doing banking transactions and another submitting an order for a year's supply of Shamwow products.
The effect might be magical, but the reality is that hypermedia-driven clients can only do exactly what they have been coded to do.
Image credit: Name tag https://flic.kr/p/27Y1J9