HTTP API Design Part 4: API Standards


This is the last of four articles on HTTP API Design. These articles are based on content from my recent book Advanced Microservices. This post is about standards for designing an API.

Simple Envelope

This first format is mostly a hypothetical one. When responding to a client it's easy to reply with just the requested resource, or an array of requested resources. However, there is usually metadata that we want to provide as well. While response headers do give us some conduits for this metadata, they're usually not powerful enough to convey everything a consumer needs to know.

Let's first take errors into consideration. Sure, we can supply a 4XX or 5XX status code when a request fails, but how can we be more specific? Whatever we choose to do we also need to be consistent so that a client always knows how to tell if a request is an error and always knows where to get the error information.

If we only reply with a resource or an array of resources we'd be forced to do something silly like add an error property to our resources. Instead, we can always reply with a parent object that wraps the actual data. We refer to this standardized JSON parent as an “envelope”, as it envelops the important data.

Here's a very simple envelope for requests which have failed:

{
  "error": "database_connection_failed",
  "error_human": "Unable to establish database connection",
  "data": null
}

Note that there are two error properties. The first one, error, is a machine-parseable error code. A lot of services choose to use numeric values for this, but why use an unreadable numeric format when a string will do? We also want a separate human-readable string. This string could theoretically be translated to match the Accept-Language of the request and displayed to an end-user. It would be maddening to write code in consumers to compare human readable error strings, especially if they can change, which is why we need to have separate properties.

Then, if the response is a successful one, we can add additional properties like so:

{
  "error": null,
  "error_human": null,
  "data": [{"id": "11"}, {"id": "12"}],
  "offset": 10,
  "per_page": 10
}

In this case we still have the error properties, but since there isn't an error we set them to null. The data property contains the content being requested by the consumer. Here the consumer is asking for a collection, so we provide an array of the resources in that collection. Finally we have two additional metadata properties, offset and per_page, which tell the consumer about the response. The client has requested the second page of results with 10 entries per page, so we echo those values back for context.
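
As a rough illustration, the envelope could be produced and consumed with a few small helpers. This is only a sketch: the function names success_envelope, error_envelope and unwrap are made up for this example and are not part of any standard.

```python
import json

def success_envelope(data, **meta):
    """Wrap a successful result; extra keyword arguments become metadata."""
    return {"error": None, "error_human": None, "data": data, **meta}

def error_envelope(code, message):
    """Wrap a failure: a machine-readable code plus a human-readable message."""
    return {"error": code, "error_human": message, "data": None}

def unwrap(envelope):
    """Consumer side: return the data, or raise using the machine-readable code."""
    if envelope["error"] is not None:
        raise RuntimeError(envelope["error"])
    return envelope["data"]

# Hypothetical usage mirroring the two documents above.
page = success_envelope([{"id": "11"}, {"id": "12"}], offset=10, per_page=10)
failure = error_envelope("database_connection_failed",
                         "Unable to establish database connection")
print(json.dumps(page))
```

Because every response has the same shape, the consumer only ever checks one place, the error property, to decide whether a request failed.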

JSON API

JSON API is a standard for returning data to a consumer while removing redundancies in this data. As an example, consider the situation where an API represents a bunch of blog posts on a website. Each blog post will have unique content, such as a title and an identifier and body text. However each blog post will have potentially redundant information, such as information about the author.

Usually in these situations we'd respond with the redundant author information in each post. If our blog is primarily authored by the same person, this is a lot of wasted content. JSON API allows us to instead define relationships between different types of resources, thereby removing the redundancies. Check out the following example:

{
  "data": [
    {
      "type": "articles",
      "id": "1",
      "attributes": {
        "title": "Article Title",
        "body": "Content"
      },
      "relationships": {
        "author": {
          "data": {
            "id": "42",
            "type": "people"
          }
        }
      }
    }
  ],
  "included": [
    {
      "type": "people",
      "id": "42",
      "attributes": {
        "name": "John",
        "age": 80
      }
    }
  ]
}
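
In the document above the article only refers to its author by type and id, while the author itself appears once in the included array. A consumer might stitch the two back together roughly like this. The helper names index_included and resolve are hypothetical, and the sketch only handles a to-one relationship like the author above.

```python
document = {
    "data": [{
        "type": "articles", "id": "1",
        "attributes": {"title": "Article Title", "body": "Content"},
        "relationships": {"author": {"data": {"id": "42", "type": "people"}}},
    }],
    "included": [{
        "type": "people", "id": "42",
        "attributes": {"name": "John", "age": 80},
    }],
}

def index_included(doc):
    """Map (type, id) to each resource in the top-level "included" array."""
    return {(r["type"], r["id"]): r for r in doc.get("included", [])}

def resolve(resource, name, lookup):
    """Follow a to-one relationship's type/id pair into the included resources."""
    linkage = resource["relationships"][name]["data"]
    return lookup.get((linkage["type"], linkage["id"]))

lookup = index_included(document)
article = document["data"][0]
author = resolve(article, "author", lookup)
print(author["attributes"]["name"])  # John
```

If a hundred articles shared this author, the included array would still contain that person only once.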

GraphQL

GraphQL is an API standard invented by Facebook. It includes a custom format for querying data. Typically responses are then formatted in JSON. The query format requires that the consumer specify all attributes which they want in the response, so attribute whitelisting is built in and is a first-class citizen. This was born of the need for mobile clients to only get important data, thereby wasting fewer bytes.

Another feature of GraphQL is that the attributes requested in the response can correlate to data from different collections. This makes GraphQL particularly attractive when building façades, which are services that consume data from other services. The GraphQL service can perform the necessary aggregations in a single request, saving the client from having to make multiple requests to different collections.

Requests usually use a single HTTP endpoint, with the body received via POST. GraphQL is NOT a RESTful HTTP practice, and can actually be used completely separately from HTTP. Here's an example of a GraphQL query:

{
  user(id: "tlhunter") {
    id
    name
    photo {
      id
      url
    }
    friends {
      id
      name
    }
  }
}

And here is an example of the correlated response:

{
  "data": {
    "user": {
      "name": "Thomas Hunter II",
      "id": "tlhunter",
      "photo": { "id": "12", "url": "http://im.io/12.jpg" },
      "friends": [
        { "name": "Rupert Styx", "id": "rupertstyx" }
      ]
    }
  }
}
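
To make the single-endpoint POST a little more concrete, here is a minimal sketch that builds such a request without sending it. It assumes the common convention of wrapping the query in a JSON body under a query key, and the endpoint URL and the build_graphql_request name are placeholders invented for this example.

```python
import json
import urllib.request

QUERY = """
{
  user(id: "tlhunter") {
    id
    name
  }
}
"""

def build_graphql_request(endpoint, query, variables=None):
    """Build (but do not send) a POST request carrying a GraphQL query as JSON."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint; a real service would publish its own URL.
request = build_graphql_request("https://api.example.com/graphql", QUERY)
print(request.get_method(), request.full_url)
```

Every query, whatever it asks for, goes to the same URL with the same HTTP method. Only the body changes.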

MessagePack

MessagePack can be thought of as a 1:1 binary representation of JSON. Any JSON document can be represented as MessagePack, which means it could be used with JSON API or GraphQL or anywhere as long as the consumer accepts it. Any superfluous whitespace is removed, and some other redundancies such as quote and colon characters are removed as well. The binary representation is typically going to be smaller and can be quicker to serialize and deserialize.

Consider the following document. This file is 31 bytes (not counting whitespace). It is an object with two properties, the first being a string and the second an array of three integers:

{
  "id": "tlhunter",
  "xyz": [1,2,3]
}

The following is the corresponding MessagePack message, which has been reduced to 21 bytes:

82 a2 69 64 a8 74 6c 68 75 6e 74
65 72 a3 78 79 7a 93 01 02 03

We can actually look at the output message and easily correlate it 1:1 with the input message. The first byte is 82, which means an object with 2 properties. Think of 8X as an object and the 2 as meaning two properties. This single-byte form only covers objects with up to 15 properties; larger objects use a more verbose encoding. The a2 which follows means a string of 2 bytes, and the 69 64 is our string “id”. Next we have a8, which is an 8 byte string, followed by 74 6c 68 75 6e 74 65 72 for “tlhunter”. Then we have a3 78 79 7a for our three byte string “xyz”. 93 tells us we have an array of 3 elements. Finally, the last three bytes 01 02 03 look suspiciously similar to our input of three numbers 1, 2, 3, because small non-negative integers are each stored in a single byte. Had this array contained mixed types or even numbers with decimals, the encoding of those elements would have changed.
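
The walkthrough above can be reproduced with a tiny hand-rolled encoder. This is a sketch covering only the compact forms used in this example (small objects, short strings, small arrays and integers from 0 to 127); it is not a substitute for a real MessagePack library, and the pack function is written just for this illustration.

```python
def pack(value):
    """Encode a small subset of MessagePack: short maps, strings, arrays, ints 0-127."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])                    # small integer: the byte is the value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("only strings up to 31 bytes are handled")
        return bytes([0xA0 | len(raw)]) + raw    # aX: string of X bytes
    if isinstance(value, list) and len(value) <= 15:
        items = b"".join(pack(v) for v in value)
        return bytes([0x90 | len(value)]) + items  # 9X: array of X elements
    if isinstance(value, dict) and len(value) <= 15:
        pairs = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + pairs  # 8X: object with X properties
    raise TypeError(f"unsupported value: {value!r}")

message = pack({"id": "tlhunter", "xyz": [1, 2, 3]})
print(" ".join(f"{byte:02x}" for byte in message))
print(len(message))  # 21
```

The printed bytes match the message shown earlier, and the length check confirms the 21 byte figure.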

One “limitation” of this format is that it doesn't have a schema. Imagine that the above resource is well known. There's an entire collection of them. Each item in the collection always has an id and an xyz property, the id is always a string, and xyz is always an array of numbers. Each time we transmit these documents we're wasting bytes by describing the properties as well as providing the data. If we wanted to remove this redundancy we could instead use a tool like Apache Thrift. Thrift allows us to create version-able schemas and share them amongst providers and consumers. This way we share the property descriptors only once, while also enforcing that documents adhere to the schema.
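
To get a feel for how much the repeated property names cost, here is a toy comparison. It is deliberately not Thrift and does not use Thrift's wire format. It simply assumes both sides have agreed on the field order ahead of time (the SCHEMA tuple is a hypothetical stand-in for a shared schema) and sends only the values, using JSON for readability.

```python
import json

SCHEMA = ("id", "xyz")  # hypothetical shared schema: field order agreed in advance

def encode(record):
    """Send only the values, in schema order."""
    values = [record[name] for name in SCHEMA]
    return json.dumps(values, separators=(",", ":")).encode("utf-8")

def decode(payload):
    """Re-attach the property names using the shared schema."""
    return dict(zip(SCHEMA, json.loads(payload)))

record = {"id": "tlhunter", "xyz": [1, 2, 3]}
with_keys = json.dumps(record, separators=(",", ":")).encode("utf-8")
without_keys = encode(record)
print(len(with_keys), len(without_keys))  # 31 20
```

The saving here is small, but it is paid on every single document in the collection.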

JSON RPC (Remote Procedure Call)

JSON RPC is a very different paradigm from the RESTful HTTP one we've been looking at throughout these posts. Instead of abstracting data into resources and performing CRUD operations on them, you can simply expose functions and their parameters and allow clients to call these functions semi-directly. This pattern is called RPC. JSON RPC is then simply RPC where the requests and responses are JSON documents.

Similar to GraphQL, if you're using these requests over HTTP you're probably going to use a single endpoint, accept the requests via a POST request, and respond. JSON RPC can also work completely outside of HTTP, e.g. with TCP or IPC (Inter-Process Communication).

Here's an example of a JSON RPC request. The document is very simple; the request requires a version number and an identifier (to properly correlate requests with responses, since we're not married to HTTP). We also name the RPC method we want to run and provide arguments as the params property. params can be either an array or an object, correlating to normal function parameters or named parameters, respectively.

{
  "jsonrpc":"2.0",
  "method":"subtract",
  "params":[42,23],
  "id":1
}

The response also contains a version and an identifier corresponding to the request. The important bit is the result property which contains the result of the operation.

{
  "jsonrpc": "2.0",
  "result": 19,
  "id": 1
}
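
On the provider side, handling such a request mostly comes down to looking up a function by name and calling it. Here is a minimal sketch of that idea. The METHODS table, the handle function and the parameter names minuend and subtrahend are assumptions made for this example, and only one kind of error is handled.

```python
import json

def subtract(minuend, subtrahend):
    return minuend - subtrahend

# Hypothetical registry of functions exposed to clients.
METHODS = {"subtract": subtract}

def handle(raw_request):
    """Parse a JSON RPC request, call the named function, and build the response."""
    request = json.loads(raw_request)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    method = METHODS.get(request.get("method"))
    if method is None:
        # -32601 is the JSON RPC 2.0 code for "Method not found".
        response["error"] = {"code": -32601, "message": "Method not found"}
        return json.dumps(response)
    params = request.get("params", [])
    if isinstance(params, dict):
        response["result"] = method(**params)  # object: named parameters
    else:
        response["result"] = method(*params)   # array: positional parameters
    return json.dumps(response)

print(handle('{"jsonrpc":"2.0","method":"subtract","params":[42,23],"id":1}'))
```

Since the same handle function works on any string, it could just as easily sit behind a TCP socket as behind an HTTP endpoint.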

This article is based on content from my book Advanced Microservices. There's also an accompanying HTTP API Design Presentation.

Thomas Hunter II

Thomas has contributed to dozens of enterprise Node.js services and has worked for a company dedicated to securing Node.js. He has spoken at several conferences on Node.js and JavaScript and is an O'Reilly published author.