David Hardy

Posted on Jan 14, 2022 • Updated on Mar 18, 2022

Live Subscriptions, the basic awesome feature GraphQL is missing

#graphql #livedata #subscriptions #apollo

Instantly receiving new data whenever it changes is every developer's dream: update the UI, trigger events, really make an application come to life. GraphQL supports this through subscriptions, but that support is lacking and becomes complex rather quickly; see this article for the rationale.

Middleware

Instead of breaking down relations into multiple queries, and fixing it in the client, let's drill down into the Live Subscriptions alternative.

Traditionally, subscriptions are used to receive information about updated data; instead, we want to receive only the updated data itself. For this, we need to tweak GraphQL a bit by injecting Live Subscription middleware. Both the client and the server must add this middleware to their pipeline, which comes down to a one-liner on each side; see the readme for details.

The middleware sits between the GraphQL logic of the client and the server. That means nothing changes in the resolver implementation on the server, and nothing changes in how application code uses GraphQL. The middleware does its work seamlessly; only a slight modification to the schema and the query is required.

The Query/Subscription setup changes, and the live keywords are introduced. The client-side query has to change accordingly, and the liveId field is mandatory. With this change, the middleware can start building state per client, per subscription.

subscription livePosts {
  livePosts {
    liveId,
    posts {
      my
      complex {
        data
      }
    }
  }
}

Live Subscription flow

Apart from the middleware and the introduction of the live keyword, almost nothing changes. To ease setting up subscriptions and triggering resolvers, the LiveManager utility class should be used in the subscribe resolver of GraphQL.

So instead of the typical subscription setup:

Subscription: {
  postUpdated: (...) =>
    pubSub.asyncIterator("POST_UPDATED"),
  ...
}

We now utilise LiveManager as follows:

const liveManager = new LiveManager();
liveManager.addTopic("livePosts");

// Forward every POST_UPDATED event to the livePosts topic.
pubSub.subscribe(
  "POST_UPDATED",
  liveManager.publish("livePosts"),
  {},
);

Subscription: {
  livePosts: (...) =>
    liveManager.addSubscription("livePosts"),
}

LivePosts: {
  posts: (...) => /* your resolver code here */,
}

That's it: these changes to your business code satisfy all requirements for Live Subscriptions. From now on, every subscription that is prefixed with live and includes a liveId field is optimized following this flow:

The client starts a new livePosts subscription.
The server receives the subscription and informs the utility class LiveManager of a new subscription called livePosts.
LiveManager creates a fresh liveId, and triggers the resolvers to do their work.
The server-middleware receives the resolved data, stores it in memory, and forwards it.
The client-middleware receives the data, stores it in memory, and forwards it.
The client receives the resolved data.
Something in the landscape informs the server that there is an update.
The server figures out this change is relevant to livePosts subscriptions and informs the LiveManager.
LiveManager triggers the resolvers for all livePosts subscriptions.
The server-middleware determines what changed by comparing the newly resolved data with the previous state, which it looks up by liveId. It forwards the changes instead of the full resolved data.
The client-middleware looks up its previous state by liveId and applies the received changes to it.
The client receives the resolved data.
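
To make the diff-and-apply steps of this flow concrete, here is a minimal sketch of what the two middleware halves could do. It is not the library's actual code: the names diff, applyPatch, ServerState and ClientState, and the $set/$unset message format, are assumptions made up for this illustration. The markers are safe to use here because GraphQL field names cannot start with a dollar sign.

```javascript
// Hypothetical sketch: NOT the Live Subscriptions implementation.
// All names and the $set/$unset patch format are invented for illustration.

// Describe how to get from prev to next; undefined means "no change".
function diff(prev, next) {
  if (prev === next) return undefined;
  const isPlainObject = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v);
  if (!isPlainObject(prev) || !isPlainObject(next)) {
    // Primitives and arrays are replaced as a whole in this sketch.
    return JSON.stringify(prev) === JSON.stringify(next)
      ? undefined
      : { $set: next };
  }
  const changes = {};
  for (const key of Object.keys(next)) {
    if (!(key in prev)) {
      changes[key] = { $set: next[key] };
      continue;
    }
    const change = diff(prev[key], next[key]);
    if (change !== undefined) changes[key] = change;
  }
  for (const key of Object.keys(prev)) {
    if (!(key in next)) changes[key] = { $unset: true };
  }
  return Object.keys(changes).length > 0 ? changes : undefined;
}

// Rebuild the new state from the previous state and a patch made by diff().
function applyPatch(state, patch) {
  if (patch === undefined) return state;
  if ("$set" in patch) return patch.$set;
  const result = { ...state };
  for (const [key, change] of Object.entries(patch)) {
    if (change.$unset) delete result[key];
    else result[key] = applyPatch(result[key], change);
  }
  return result;
}

// Server side: remember the last resolved data per liveId.
class ServerState {
  constructor() {
    this.byLiveId = new Map();
  }
  // First result: forward everything. Later results: forward only changes.
  next(liveId, resolved) {
    const hadPrevious = this.byLiveId.has(liveId);
    const previous = this.byLiveId.get(liveId);
    this.byLiveId.set(liveId, resolved);
    return hadPrevious
      ? { liveId, patch: diff(previous, resolved) }
      : { liveId, data: resolved };
  }
}

// Client side: mirror the state and apply incoming changes to it.
class ClientState {
  constructor() {
    this.byLiveId = new Map();
  }
  receive(message) {
    const next =
      "data" in message
        ? message.data
        : applyPatch(this.byLiveId.get(message.liveId), message.patch);
    this.byLiveId.set(message.liveId, next);
    return next;
  }
}

const server = new ServerState();
const client = new ClientState();
const v1 = { post: { title: "Hello", likes: 1 } };
const v2 = { post: { title: "Hello", likes: 2 } };

client.receive(server.next("live-1", v1)); // full data on the first round
const update = server.next("live-1", v2);  // only the changed field
const view = client.receive(update);       // client ends up with v2 again
console.log(JSON.stringify(update.patch), JSON.stringify(view));
```

In this sketch only the likes field travels over the wire on the second round, while the application code on the client still sees the complete resolved data.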

Costs - Statefulness

Nothing in the world is truly free. Live Subscriptions come at a cost, most of which has to do with the GraphQL server becoming stateful, since the server now keeps a copy of the state of all its clients.

Memory usage

Keeping state costs memory. This state is important to reduce costs in data transfer and reduce complexity client side. This isn't merely a cache, it's the state of several clients. Depending on the number of clients your application has, and the size and number of the subscriptions they start, RAM usage is going to grow.

Scaling

Vertical scaling, simply adding more resources to a server, will resolve RAM issues, but horizontal scaling is one of the powers of stateless servers. Even with the statefulness of Live Subscriptions, horizontal scaling is unaffected.

Subscriptions use WebSockets, and a WebSocket connection always talks to the same instance, so the state of a client is always maintained by that instance. When a client connects, new state is created; when a client disconnects, the state is cleared. When a new instance is added to the pool, new clients are automatically balanced to it and maintain their connection with that instance.

WebSockets give no guarantee that a connection won't drop, so the GraphQL client implementation already has a retry mechanism. When a connection drops, the state is dropped from the middleware on both sides. Next, the GraphQL client establishes a new connection, possibly to a different instance, and the middleware creates new state. A connection can drop for several reasons; one of them is horizontal scaling killing off an instance.
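
The lifecycle described above can be sketched as a small model. This is only an illustration under assumptions: ServerInstance, its methods, and the connection and liveId values are hypothetical names, not part of the Live Subscriptions package.

```javascript
// Hypothetical model of per-connection state on one server instance.
class ServerInstance {
  constructor(name) {
    this.name = name;
    // connectionId -> (liveId -> last resolved data)
    this.stateByConnection = new Map();
  }

  // A new WebSocket connection gets its own, empty state.
  connect(connectionId) {
    this.stateByConnection.set(connectionId, new Map());
  }

  // Store the latest result and report what would go over the wire.
  remember(connectionId, liveId, data) {
    const state = this.stateByConnection.get(connectionId);
    const hadPrevious = state.has(liveId);
    state.set(liveId, data);
    return hadPrevious ? "send changes" : "send full result";
  }

  // A dropped connection takes all of its state with it.
  disconnect(connectionId) {
    this.stateByConnection.delete(connectionId);
  }
}

const instanceA = new ServerInstance("a");
const instanceB = new ServerInstance("b");

instanceA.connect("client-1");
const first = instanceA.remember("client-1", "live-1", { likes: 1 });
const second = instanceA.remember("client-1", "live-1", { likes: 2 });

// Instance a is scaled down: the socket drops and its state is gone.
instanceA.disconnect("client-1");

// The client retries, lands on instance b, and resubscribes with a fresh
// liveId. Instance b has no previous state, so it starts from scratch.
instanceB.connect("client-1");
const afterReconnect = instanceB.remember("client-1", "live-2", { likes: 2 });

console.log(first, "|", second, "|", afterReconnect);
```

Because the state only lives as long as the connection, no instance ever needs to hand state over to another one, which is why adding or removing instances keeps working.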

Caveats

Then there are some more considerations, the major one being that subscription support in GraphQL is scarce. There is some additional overhead in making sure a subscription is correctly authenticated. This was already the case using 'classic' subscriptions, but when sending actual data using subscriptions this becomes even more important.

Also, when using the datasources pattern, the datasources need to be recreated manually before every execution, using preTrigger:

livePosts: {
  subscribe: (_, __, context) => {
    return liveManager.addSubscription('livePosts', {
      userId: context.userId,
      // Runs before every execution, so resolvers get fresh datasources.
      preTrigger: () => {
        context.dataSources = context.dataSourceBuilder();
      },
    });
  },
},

Lastly, to optimise lists of data, it is beneficial to inform Live Subscriptions of the unique id for a data type. Without unique ids, we cannot detect shifts in a list, and have to rely on the index in the list instead.

subscribe: liveSubscribeBuilder(subscribe, {
  idFieldsByTypename: {
    Post: 'id',
    Author: 'id',
    Book: 'isbn',
  },
}),
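
To see why unique ids matter for lists, compare the two ways of detecting changes in a small, self-contained example. The functions changedByIndex and changedById are hypothetical helpers written for this illustration; they do not claim to show how the library compares lists internally.

```javascript
// Hypothetical helpers, for illustration only.

// Without ids: compare position by position.
function changedByIndex(prev, next) {
  const changed = [];
  const length = Math.max(prev.length, next.length);
  for (let i = 0; i < length; i++) {
    if (JSON.stringify(prev[i]) !== JSON.stringify(next[i])) changed.push(i);
  }
  return changed;
}

// With ids: match items by their id field, wherever they sit in the list.
function changedById(prev, next, idField) {
  const prevById = new Map(prev.map((item) => [item[idField], item]));
  const nextIds = new Set(next.map((item) => item[idField]));
  const id = (item) => item[idField];
  return {
    added: next.filter((item) => !prevById.has(id(item))).map(id),
    removed: prev.filter((item) => !nextIds.has(id(item))).map(id),
    updated: next
      .filter(
        (item) =>
          prevById.has(id(item)) &&
          JSON.stringify(prevById.get(id(item))) !== JSON.stringify(item)
      )
      .map(id),
    order: next.map(id),
  };
}

const before = [
  { id: 1, title: "A" },
  { id: 2, title: "B" },
];
// A new post is inserted at the top; the existing posts shift down.
const after = [{ id: 3, title: "C" }, ...before];

const indexResult = changedByIndex(before, after);
const idResult = changedById(before, after, "id");
console.log(indexResult); // every position looks changed: [ 0, 1, 2 ]
console.log(idResult);    // one added item (3) plus the new order [3, 1, 2]
```

With index-based comparison a single insert at the top makes every position look different, so the whole list would be sent again. With ids, only the new item and the new order need to be sent.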

FAQ

Sure, there are some considerations; let's address them:

Is this completely revolutionary?

No. Live updates have been a thing for a while, just not in GraphQL. Take Firebase, for example, where they have been the default all along.

Why isn't this in the core framework?

Well, GraphQL solves quite a few issues that traditional REST has, but it isn't perfect.

Is it production tested?

Yes! It is. In fact, it is operational in mission-critical applications right now. It's robust and ready to be used safely.

Why is this even needed?

Please read more in this blog for the rationale behind Live Subscriptions.

Where can I find more?

Check out the open-source git repo at https://gitlab.com/livesubscriptions/monorepo.

the/experts. Blog
