GraphQL is a query language that you can use to interact with an API. GraphQL allows you to write your own queries that will bring back specifically the info you want from the GraphQL server’s database (within the boundaries of what the API provider wants you to see, which is defined in the schema).
From an API provider’s point of view, GraphQL can also reduce bandwidth and server resource usage and, therefore, improve performance.
GraphQL was designed at Facebook in 2012 and used internally for a few years before the specification was open sourced in 2015. The concept gradually gained traction, and companies like Shopify, Netflix, PayPal, GitHub and Airbnb started using GraphQL as part of their API technology stack.
However, GraphQL APIs are still nowhere near as mainstream as the RESTful APIs you see in the vast majority of applications. But a number of industry observers expect GraphQL to keep building momentum.
We’ll see.
Two flavors of APIs
To understand the purpose of GraphQL, you need a bit of context. There are two main types of APIs you are likely to come across: RESTful APIs and GraphQL APIs.
A RESTful API provides users with a list of predefined queries (endpoints) to choose from, each one targeting a given object type and bringing back data, typically a JSON object, from a fixed set of fields.
API providers will generally supply all these endpoints inside a collection that users can import into Postman and use to query the API, as you can see below.
This often also comes with API documentation that details how each endpoint works and what it brings back.
Bottom line: the collection of supplied queries is designed to cover everything a user may require and should be enough to keep everyone happy.
The benefit of this approach is that it’s simple and straightforward. In practice, however, there are limitations.
API providers generally want to keep their list of queries from growing into a bloated and unwieldy collection. To keep things lean, they are often tempted to limit the number of available queries by having each one return a larger set of data, leaving it up to the client application to parse what comes back and retain just what it needs.
This means a lot of unnecessary data is transiting over the network, potentially impacting performance. At the same time, this overfetching increases the risk of excessive data exposure, revealing elements that a hacker may end up using maliciously.
Also, because each endpoint returns a fixed data structure, a client application may need to query several endpoints to collect the data it needs from the API. This again increases network and server load while bringing back a lot of unnecessary fields.
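To put rough numbers on this, here is a small Python sketch of the problem. Everything in it is made up for illustration: the two endpoints, the field names and the canned responses are assumptions, and `fetch` is a stand-in for a real HTTP GET. The client only wants a user's name and order totals, but it has to make two requests and receives far more fields than it uses.

```python
# Hypothetical REST responses: each endpoint returns a fixed set of fields.
REST_RESPONSES = {
    "/users/42": {
        "id": 42,
        "name": "Ada",
        "email": "ada@example.com",
        "address": "1 Example Street",
        "created_at": "2020-01-01",
    },
    "/users/42/orders": [
        {"id": 1, "total": 30.0, "status": "shipped", "warehouse": "A"},
        {"id": 2, "total": 12.5, "status": "pending", "warehouse": "B"},
    ],
}


def fetch(endpoint):
    """Stand-in for an HTTP GET: returns the canned response for an endpoint."""
    return REST_RESPONSES[endpoint]


def build_summary():
    """Collect a user's name and order totals, as a client app might."""
    user = fetch("/users/42")           # request 1
    orders = fetch("/users/42/orders")  # request 2
    return {"name": user["name"], "totals": [o["total"] for o in orders]}


def count_fields(payload):
    """Count the fields in a flat dict, or in a list of flat dicts."""
    if isinstance(payload, list):
        return sum(count_fields(item) for item in payload)
    return len(payload)


summary = build_summary()
received = sum(count_fields(r) for r in REST_RESPONSES.values())
used = 1 + len(summary["totals"])  # one name plus one total per order
print(f"requests: 2, fields received: {received}, fields used: {used}")
# prints "requests: 2, fields received: 13, fields used: 3"
```

The ten fields that were received but never used are the overfetching described above, and any one of them could be something the provider would rather not expose.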
How GraphQL works
In contrast, GraphQL generally exposes a single endpoint and provides users with an interface that lets them write queries with just the fields they want, then send them through.
In the example above, the endpoint directs the user to a GraphiQL interface where queries can be written (on the left side) and responses received (on the right side). Here, we are asking for a list of the names of all the different queries we can make to the target API.
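For reference, a query that lists the names of the available queries can be written with GraphQL's standard introspection fields (`__schema`, `queryType`, `fields`, `name`). The Python sketch below holds such a query as a string and pulls the names out of a response. The sample response is invented for illustration, and a real server may have introspection disabled.

```python
# Introspection query: ask the schema for the name of every top-level query.
INTROSPECTION_QUERY = """
{
  __schema {
    queryType {
      fields {
        name
      }
    }
  }
}
"""


def query_names(response):
    """Extract the query names from an introspection response."""
    fields = response["data"]["__schema"]["queryType"]["fields"]
    return [field["name"] for field in fields]


# Hypothetical response, shaped like the query above (not from a real API).
sample_response = {
    "data": {
        "__schema": {
            "queryType": {
                "fields": [{"name": "user"}, {"name": "orders"}]
            }
        }
    }
}

print(query_names(sample_response))  # prints ['user', 'orders']
```

Note how the response mirrors the shape of the query, nested under a `data` key. That holds for any GraphQL query, not just introspection.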
To talk to a GraphQL API, you can also use browser extensions like Altair or Burp Suite extensions like InQL. Postman also has a great GraphQL interface that really facilitates writing queries (more on these tools in later articles).
By giving users the ability to write their own requests, GraphQL optimizes the API’s query process, avoiding the overfetching and potential data disclosure that traditionally come with RESTful APIs and limiting the number of queries a client app has to send to get the data it wants. The result is optimized bandwidth, reduced server load and improved overall performance.
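As a rough illustration, here is what a client-written request could look like in Python. The endpoint URL and the schema (`user`, `name`, `orders`, `total`) are hypothetical. The sketch assumes the common convention of sending the query as a JSON body in a POST request to a single endpoint, and it only builds the request without sending it.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.example.com/graphql"  # hypothetical endpoint

# Ask only for the fields we need: the user's name and each order's total.
QUERY = """
query {
  user(id: 42) {
    name
    orders {
      total
    }
  }
}
"""


def build_graphql_request(url, query, variables=None):
    """Wrap a GraphQL query in a JSON POST request for a single endpoint."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GRAPHQL_URL, QUERY)

# To send it against a real server, something like this would do:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

One request, one endpoint, and the response should contain only `name` and `total`, instead of the full user and order objects.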
The downside is that the overall process is more complex than just picking from a collection of provided RESTful queries. Here, users need to learn the GraphQL syntax, understand concepts like schema, types and introspection, and get to grips with a whole new set of tools.
But GraphQL is also more complex for API developers, with GraphQL implementations opening new possibilities for vulnerabilities. This creates a new space for us security researchers and bug bounty hunters to play.
This is why adding GraphQL hacking to your game is definitely worth the effort and time investment.
In the following articles, I will go into the concepts you need to understand and the techniques and tools that will help you get up to speed with GraphQL as an ethical hacker. So stick around. 🙂