Having been in the API space for the last decade (at Apigee, Google, and now StepZen), and having done databases for two decades prior (at IBM and during my dissertation at Berkeley), I can safely say that there are two tricks that databases have done well, that will revolutionize how APIs are built and managed.
The first is that databases operate declaratively. That means you tell the database what to do, not how to do it. You follow that principle whether you are creating data or querying for it. APIs, on the other hand, have mostly been created programmatically.
Second, equally importantly, many databases know how to federate, meaning that if your data is scattered across two database systems, then one can execute a query against both as if it was coming from one. The query that you submit is scattered to multiple backend databases, and the results are gathered to present as if they were coming from one centralized system.
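The scatter/gather idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two "databases" are plain in-memory lists, and the names `query_customers`, `query_orders` and `federated_query` are hypothetical, not part of any real database product.

```python
# Two hypothetical backends, modeled as in-memory lists (illustrative data only).
CUSTOMERS_DB = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]
ORDERS_DB = [
    {"customer_id": 1, "amount": 30},
    {"customer_id": 1, "amount": 12},
    {"customer_id": 2, "amount": 7},
]


def query_customers(customer_id):
    """Sub-query answered by the first backend."""
    return [c for c in CUSTOMERS_DB if c["id"] == customer_id]


def query_orders(customer_id):
    """Sub-query answered by the second backend."""
    return [o for o in ORDERS_DB if o["customer_id"] == customer_id]


def federated_query(customer_id):
    """Answer one logical query as if the data lived in a single system."""
    # Scatter: send a sub-query to each backend.
    customers = query_customers(customer_id)
    orders = query_orders(customer_id)
    # Gather: combine the partial results into one answer.
    return [
        {"name": c["name"], "orders": [o["amount"] for o in orders]}
        for c in customers
    ]


print(federated_query(1))
```

The caller only sees `federated_query`; which backend holds which piece of data stays hidden behind it.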
In this article, we will address the latter; we have previously written articles about the former.
Federation of APIs
I first heard of this concept while listening to how Netflix talked about its API tier, in particular from Dan Jacobsen, who is now at the New York Times. Netflix had “domain APIs” that reflected how backends viewed and surfaced their data, and “experience APIs” that reflected how applications wanted to access the data. Each experience API made the right calls to a set of domain APIs (scattered), and then combined the results (gathered) into one.
But scattering/gathering is not easy. In traditional APIs, it is hardwired into the way each API is built. If the experience API needed some more data, someone would go in and program that API to scatter to another backend. If the backend API changed its implementation, someone would go in and reprogram the experience API. Errors from backends? Program. Performance issues? Program (add a cache), and so on.
StepZen enables developers to easily build and deploy a single GraphQL API that gets the data they need from multiple backends. The API delivers the right data reliably, irrespective of backend protocols, schemas and authentications. We manage the API so that developers manage zero infrastructure.
Furthermore, why two levels? Does having only two levels make sense? If you look at the World Wide Web, it is an interconnection of pages, grouped together into sites, grouped together into domains, etc. The structure of interconnectivity (href) is the same, and it allows for arbitrarily complex relationships to be formed.
This needs to be the new world of APIs. However, for this world to form, there is one more fundamental shift that needs to happen. Scattering and gathering, when each backend produces arbitrary structures, is almost impossible. There has to be a “standardization” of some form. And that standardization is GraphQL.
GraphQL, which stands for Graph Query Language, has two wonderful features.
First, a single request can ask for several related pieces of data, for example returning customer and order data in one query, just like a federated database query.
Second, it returns the data in exactly the shape of the request. No more, no less.
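To make the "exact shape of the request" property concrete, here is a toy sketch. It is not a GraphQL implementation: the selection is modeled as a nested Python dict, and the `select` helper and the sample record are assumptions made up for this example.

```python
def select(data, selection):
    """Return only the fields named in `selection`, mirroring its nesting."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        # A nested selection means "descend"; None means "take the value as is".
        result[field] = select(value, sub_selection) if sub_selection else value
    return result


# Hypothetical backend record holding more data than the client asks for.
customer = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "orders": [
        {"id": 10, "amount": 30, "sku": "A-1"},
        {"id": 11, "amount": 12, "sku": "B-7"},
    ],
}

# Stands in for a GraphQL query such as:
#   { customer(id: 1) { name orders { amount } } }
selection = {"name": None, "orders": {"amount": None}}

print(select(customer, selection))
```

Fields that were not requested (`id`, `email`, `sku`) never appear in the response, and the nesting of the result matches the nesting of the selection.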
Now imagine a graph of GraphQL APIs:
A GraphQL query at any one level can be scattered to the next-level subgraphs. The responses from them are in the exact shape of the subgraph subqueries that were sent to them. Gathering (stitching) them is trivial: there are no shapes to wrestle with and no logic to write. And this can continue on down.
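The scatter and stitch steps described above can be sketched as follows. This is a simplified model under stated assumptions: queries are flat lists of field names, each field is owned by exactly one subgraph, and `customer_subgraph`, `order_subgraph`, `OWNERS` and `supergraph` are hypothetical names, not the API of any GraphQL product.

```python
def project(record, fields):
    """Return the record restricted to the requested fields."""
    return {f: record[f] for f in fields}


def customer_subgraph(fields):
    # Hypothetical subgraph that owns customer profile fields.
    return project({"name": "Ada", "email": "ada@example.com"}, fields)


def order_subgraph(fields):
    # Hypothetical subgraph that owns order-related fields.
    return project({"orderCount": 2, "lastOrderDate": "2022-01-15"}, fields)


# Which subgraph owns which field (an assumed, hand-written routing table).
OWNERS = {
    "name": customer_subgraph,
    "email": customer_subgraph,
    "orderCount": order_subgraph,
    "lastOrderDate": order_subgraph,
}


def supergraph(fields):
    # Scatter: split the query into one subquery per owning subgraph.
    plan = {}
    for field in fields:
        plan.setdefault(OWNERS[field], []).append(field)
    # Gather: each response already has the shape of its subquery,
    # so stitching is a plain merge.
    result = {}
    for subgraph, sub_fields in plan.items():
        result.update(subgraph(sub_fields))
    return result


print(supergraph(["name", "orderCount"]))
```

Because `supergraph` takes and returns the same shapes as a subgraph, it could itself be registered as an owner in a higher-level routing table, which is how the structure can continue on down.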
As you can see, this is an entirely new API architecture. It is a federated graph of APIs, and can be used to build a big supergraph — or a single graph of graphs — and many smaller graphs-of-graphs, which can be scoped at whatever is the right structure for an organization. It is a very clean, easy concept. And it is the future.
However, that is not all. This architecture has some enormous positive implications on performance, governance and multicloud.
Performance
By federating and sending GraphQL subqueries down, you are not shipping unnecessary data up and down the supergraph. In database terms, this is called pushdown: you are letting each subsystem do the most it can, and only send the results of the computation back for the gather stage. This is the difference between computing a member’s total order amount by fetching all the orders and sending them to the customer subgraph (and letting it compute the total), vs. computing the total in the member’s subgraph and only sending the total to the customer subgraph. Good GraphQL systems understand what each subgraph can do, and try to do maximal pushdowns. In traditional API architectures, such knowledge has to be hardwired into the APIs that are further up the chain.
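The difference between the two strategies can be shown with a small sketch. The order data and the function names `fetch_orders` and `fetch_total` are assumptions for illustration; a real system would be making network calls rather than reading a list.

```python
# Hypothetical order data held by one subgraph.
ORDERS = [
    {"member_id": 1, "amount": 30},
    {"member_id": 1, "amount": 12},
    {"member_id": 2, "amount": 7},
]


def fetch_orders(member_id):
    """Without pushdown: the subgraph ships every matching order upstream."""
    return [o for o in ORDERS if o["member_id"] == member_id]


def fetch_total(member_id):
    """With pushdown: the subgraph computes the total and ships one number."""
    return sum(o["amount"] for o in ORDERS if o["member_id"] == member_id)


# Without pushdown, the caller receives all rows and aggregates them itself.
shipped_rows = fetch_orders(1)
total_without_pushdown = sum(o["amount"] for o in shipped_rows)

# With pushdown, only the aggregate crosses the boundary.
total_with_pushdown = fetch_total(1)

print(len(shipped_rows), total_without_pushdown, total_with_pushdown)
```

Both paths produce the same total, but the first moves every order row between subgraphs while the second moves a single value.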
Governance
With this federation model, data does not leave the subgraph unless it has to. Imagine that you have an EU subgraph and a US subgraph. The scatter phase ensures that the EU subgraph is asked for something in its domain, and it can decide what data it can send upstream. Queries like “what is the total amount for a customer” can return the total amount, without violating any privacy rules or leaking specific order data. Or the subgraph can decide to obfuscate some data to preserve privacy.
In addition to the privacy issues, a federated model is just better for governance. Each team decides what its subgraph looks like. It can have a more detailed subgraph for internal use and expose fewer capabilities upstream. Of course, since the data returned has to make sense, it does not preclude the need for some lightweight governance across the subgraphs, but that is much less than what it would be if the whole thing were one tangled mess of programming logic.
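One way to picture a subgraph deciding what it sends upstream is the sketch below. The policy, the data and the names `eu_subgraph` and `mask_email` are all hypothetical; the point is only that the decision lives inside the subgraph, not in the caller.

```python
# Hypothetical data that stays inside the EU subgraph.
EU_CUSTOMERS = {1: {"name": "Ada", "email": "ada@example.com"}}
EU_ORDERS = [
    {"customer_id": 1, "amount": 30},
    {"customer_id": 1, "amount": 12},
]


def mask_email(email):
    """Obfuscate an address, keeping the first character and the domain."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain


def eu_subgraph(customer_id, fields):
    """Answer only the fields this subgraph has chosen to expose upstream."""
    exposed = {
        # Aggregates may leave the region; individual orders may not.
        "totalAmount": lambda: sum(
            o["amount"] for o in EU_ORDERS if o["customer_id"] == customer_id
        ),
        # Personal data is obfuscated before it leaves.
        "email": lambda: mask_email(EU_CUSTOMERS[customer_id]["email"]),
    }
    refused = [f for f in fields if f not in exposed]
    if refused:
        raise PermissionError(f"fields not exposed upstream: {refused}")
    return {f: exposed[f]() for f in fields}


print(eu_subgraph(1, ["totalAmount", "email"]))
```

A request for raw `orders` is rejected by the subgraph itself, so upstream graphs cannot obtain order-level data even by mistake.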
Multicloud
Imagine if some of your services were on Google Cloud and some on AWS and some on-premises. You would want to manage them for governance and for performance, separately. In that world, this federated API structure is the only way to go.
Summary
APIs are great. However, API architectures have not evolved. With GraphQL, a new way of forming a graph of graphs is emerging. This architecture leads to a simpler design, better performance, simpler governance and graceful migration to the cloud. It is the way forward.