APIdays Helsinki 2019 - Beyond REST: GraphQL API Management with Amit Acharya, IBM
1. IBM Confidential
GraphQL API Management
Amit P. Acharya
Head of Product – API Management (API Connect) & Gateways
IBM
linkedin.com/in/amitpa
@amitacharya
2. The Next 25 minutes…
1. Why GraphQL? It’s for the end-user, silly!
2. Wait... What about REST? It isn’t going anywhere
3. So what’s API Management for GraphQL? It’s no rocket science
3. REST
Get ../books/
Get ../authors/
Get results. Iterate to display “My
favorite authors”
GraphQL – The Value
GraphQL
query {
Books {
author { … }
}
}
Benefits for API Consumer
• Get exactly what you ask for
• Query in. Result out
• Roundtrip reduction
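The round-trip reduction above can be illustrated with a minimal in-memory sketch. Everything here is a hypothetical stand-in, not a real client or server: the sample data, the `rest_get` and `graphql_books_with_authors` helpers, and the `round_trips` counter.

```python
# Hypothetical in-memory data standing in for a books/authors backend.
BOOKS = [{"title": "Book A", "author_id": 1}, {"title": "Book B", "author_id": 2}]
AUTHORS = {1: {"name": "Author One"}, 2: {"name": "Author Two"}}

round_trips = {"rest": 0, "graphql": 0}


def rest_get(path):
    """Simulate one REST call; each call is one client round trip."""
    round_trips["rest"] += 1
    if path == "/books/":
        return BOOKS
    if path == "/authors/":
        return AUTHORS
    raise KeyError(path)


def graphql_books_with_authors():
    """Simulate one GraphQL call; the join happens on the server side."""
    round_trips["graphql"] += 1
    return [
        {"title": b["title"], "author": AUTHORS[b["author_id"]]["name"]}
        for b in BOOKS
    ]


# REST style: two calls, then the client iterates to join the results.
books = rest_get("/books/")
authors = rest_get("/authors/")
rest_view = [
    {"title": b["title"], "author": authors[b["author_id"]]["name"]} for b in books
]

# GraphQL style: one call returns exactly the requested shape.
graphql_view = graphql_books_with_authors()
```

Both views hold the same data, but the REST path costs the client two round trips plus a client-side join, while the GraphQL path costs one.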
4. • Do we really need GraphQL?
• Technology is always providing
developers new solutions to
existing problems
• Let’s step back and understand
how we got here
The Need…
5. • API interface represents the
“contract” (i.e. WSDL) created by
the API Provider
• API Provider exposes contract for
API consumer
• API Consumer consumes service
based on “the contract”
The 1.0 of APIs == Service Oriented Architecture (SOA)
6. • API Provider controls the service
interface
• API Consumer does not have much
input into the design of the
“contract”
Advantage: API Provider
7. • API interface represented by open
standard (OpenAPI), JSON
payloads, YAML configurations
• API Provider exposes standard
RESTful interface
• API Consumer discovers APIs via
self-service onboarding and
developer portal
The 2.0 of APIs == REST
8. • API Provider engages API
Consumer in API design
• API Provider provides simpler
interface of service
implementation
• API Consumer consumes APIs via
self-discovery and uses modern
standards
Advantage: API Provider and API Consumer
9. • Query language and
implementation paradigm for data-centric APIs
• API Consumer defines the data they need (and
nothing more)
• API Provider handles complexity of obtaining data
from backend systems
The 3.0 of APIs = GraphQL and Async Endpoints
10. • API Consumer maintains control over
the data definition
• API Consumer does not care about the
internal data structure within
systems
• API Provider endpoint completely
driven by API consumer needs
Advantage: API Consumer
11. In GraphQL, profiles or resource access rules
depend on the query:
POST ../graphql
{ me { name, age }}
POST ../graphql
mutation {
createK8Cluster (name: "c1"){
clusterId
}
}
vs.
In REST APIs, profiles or resource access rules are
defined for endpoints:
GET …/profiles/me
POST …/resources/k8cluster
Question: Is GraphQL replacing REST?
• No. REST APIs are well-defined interfaces with standard error codes
• Easily cached and optimized for the HTTP protocol
GraphQL provides an alternative query-based approach, optimized for data-intensive operations
REST and GraphQL
12. • A single GraphQL transaction may
invoke multiple backends
POST /sports/graphql? HTTP/1.1
query {
Players (name: "John T") {
name
league
team {
name
arena {
name
…
}
city
}
}}
Server
1. GET …/players/
2. GET …/team/player.name=?
3. GET …/arena/team.name=?
GraphQL Endpoints De-constructed
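A minimal sketch of the fan-out on this slide: one client request resolved through three backend lookups. The data values, the `get_player`, `get_team` and `get_arena` functions, and the `backend_calls` log are hypothetical; a real GraphQL server would wire these in as field resolvers.

```python
# Hypothetical backend data; each lookup function stands in for one backend GET.
PLAYERS = {"John T": {"name": "John T", "league": "League X", "team": "Team Y"}}
TEAMS = {"Team Y": {"name": "Team Y", "city": "City Z", "arena": "Arena W"}}
ARENAS = {"Arena W": {"name": "Arena W"}}

backend_calls = []  # records which backend was hit, in order


def get_player(name):
    backend_calls.append("players")
    return PLAYERS[name]


def get_team(name):
    backend_calls.append("team")
    return TEAMS[name]


def get_arena(name):
    backend_calls.append("arena")
    return ARENAS[name]


def resolve_player(name):
    """Resolve the nested player -> team -> arena selection for one request."""
    player = get_player(name)
    team = get_team(player["team"])
    arena = get_arena(team["arena"])
    return {
        "name": player["name"],
        "league": player["league"],
        "team": {
            "name": team["name"],
            "city": team["city"],
            "arena": {"name": arena["name"]},
        },
    }


result = resolve_player("John T")
```

The consumer sees one request and one response; `backend_calls` shows the three lookups the server made on its behalf.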
13. • Learnings from query
languages (i.e. SQL)
• Can a “poor Query”
overwhelm backend
systems?
• Bad queries can be malicious
or unintentional
Select * From Transactions
SELECT cust.name, address.name, …. {infinite
attributes}
FROM cust, address, … {infinite tables},
WHERE cust.name = address.name AND …. {infinite
joins}
Selecting all data from a database
Complex and nested queries with multiple table joins
Queries And
Implications
14. • Multiple nested backend calls triggered
by single GraphQL API call
• Throttling – Protect backends when
system usage spikes
• Variable compute time to resolve query
depends on query complexity
• Rate limits provide ability to limit
number of transactions per consumer
Server
Protection Via
Throttling & Rate Limits
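One way to rate limit by query cost instead of by request count is a token bucket that charges each request its estimated cost. This is a sketch under stated assumptions: the `CostBucket` class, its capacity and refill rate, and the per-query costs are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class CostBucket:
    """Token bucket where each request spends tokens equal to its estimated cost."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, cost, now):
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if cost > self.tokens:
            return False  # reject: not enough budget left for this query
        self.tokens -= cost
        return True


bucket = CostBucket(capacity=10, refill_per_second=1)
decisions = [
    bucket.allow(cost=1, now=0.0),  # cheap query
    bucket.allow(cost=8, now=0.0),  # expensive nested query
    bucket.allow(cost=5, now=0.0),  # exceeds the remaining budget
    bucket.allow(cost=5, now=4.0),  # fits again once the bucket has refilled
]
```

With this shape, a consumer sending a few expensive queries is throttled sooner than one sending many cheap ones.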
18. • Detect and reject requests with
complex nesting
• Pre-calculate load to determine if
query will overwhelm backends
• Use point/weight system to calculate
“cost” for different query parameters
(e.g. GitHub GraphQL APIs)
Threat Protection
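Detecting complex nesting can be sketched as a depth check that runs before the query is forwarded. This is a deliberate simplification: it counts braces rather than parsing the document, so braces inside string literals would be miscounted. The `MAX_DEPTH` limit and the `admit` helper are hypothetical.

```python
def max_depth(query):
    """Return the deepest selection-set nesting, counted by braces.

    Simplification: braces inside string literals are not handled.
    """
    depth = deepest = 0
    for ch in query:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return deepest


MAX_DEPTH = 3  # hypothetical policy limit


def admit(query):
    """Policy decision: forward the query only if it is shallow enough."""
    return max_depth(query) <= MAX_DEPTH


shallow = "{ me { name, age }}"
deep = 'query { Players (name: "John T") { name team { name arena { name } } } }'
```

Here `admit(shallow)` passes and `admit(deep)` is rejected without either query reaching a backend.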
21. • GraphQL enables API consumer to easily retrieve
exactly the data it requires (from data intensive
backends)
• GraphQL management requires insight into the
impact of a query on backend systems
• GraphQL API management enables differentiated API
plans & new threat protection policies
Summary
Editor’s notes
I wanted to tell you a story about how I got into GraphQL. When I first started reading about GraphQL, I thought ‘here we go again’. As technologists, we love to solve our technology challenges with new frameworks, programming languages, & tooling. We have already established REST / JSON as our standard for delivery of APIs, so why do we need yet another standard?
BTW, I just spent a year learning Angular 2, only to then realize that Angular is now on version 7. But then all my developer friends tell me that Angular is dead and that you need to learn React. I’m like, seriously, you’re telling me that I need to learn another way to write form input? Like, it all gets rendered the same way.
Let me bring this back full circle: so is GraphQL just another framework for writing Web forms? Thankfully not, but to truly understand its value and why we need API management of GraphQL, similar to REST APIs, let’s revisit our history.
My earliest days of APIs started with SOA- or ESB-based architecture. I know there are folks who would say that I should probably start with CORBA or ORB, but I did not want a full history assignment.
In the SOA days, we needed a human-readable way (for quite a complex human) of communicating between heterogeneous systems. This team was using .NET and another team was using J2EE, so here is SOAP, a common messaging format, described using WSDL files. These interfaces or “contracts” were created by the API / service provider and were quite often coupled to the service implementation.
The API consumer calling this service didn’t have much help. They were given a WSDL, perhaps some documentation via email, and it was really up to the consumer to invoke the API. I put up a WhatsApp image here, but I probably should have used ICQ or something popular at that time to illustrate that conversation.
- If we were to decide who has the advantage in this relationship, it’s clear that the API Provider holds the cards, or the chess pieces. The API consumer is at the mercy of the contract and depends on the help of the API provider to be able to consume its service.
Now we enter API 2.0. These versions are independent of other technology themes like Web 2.0, or <insert better architecture name> 2.0.
We move away from verbose WSDL files and into a simpler service definition represented as Swagger (or, as we now call it, OpenAPI). It also brings us indent hell with YAML.
We actually put some thought into our service design, trying to simplify our consumer user experience. We used Roy Fielding’s REST principles to leverage the HTTP protocol and the concept of resources in requests. Furthermore, we didn’t simply throw the Swagger over to the consumer; we hosted the definitions in developer portals, enabling self-service discovery, documentation & forums in a single spot.
- If we now decide who has the advantage, well, I think it’s a ‘tie’. The service provider gains the benefits of REST architecture, optimized for HTTP, and it’s much easier for the API consumer to discover and consume the APIs.
Is there really a 3.0 concept in technology? It seems like we stop at 2.0, and if we screw up there, then we try to use another buzzword, like ‘Reborn’.
GraphQL provides a query language for your APIs, leveraging the graph-like nature of backend data structures.
The fundamental difference is that the ‘API complexity’ shifts from the API consumer to the provider.
The goal is to make the user experience as simple as possible, reducing the round-tripping and taking advantage of compute on backend infrastructures to optimize data flow.
The API consumer is the big winner here: they simply express the data they require, and voila, they get it back in a single request.
Now, the question is, have we given too much control to the consumer? …. hmm, sounds like we might need some API Management.
This is my foreshadowing moment, but we will come back to this question, since to really understand what kind of API management is needed, we first need to understand GraphQL in a little bit more detail.
In a REST request, it’s clear from the request what action is being requested, whether it’s a read-only query via HTTP GET or a request to modify the state of the backend via an HTTP POST.
In a GraphQL query, there is a single endpoint (HTTP POST), where the body determines the action. It can be a read query, where you’re getting profile information, or a create query (a mutation in GraphQL lingo) that modifies data in your systems.
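Because the URL and HTTP verb no longer identify the action, an access rule has to look inside the request body. The sketch below classifies a document by its leading keyword only. That is a simplification (a real gateway would parse the document, which may contain several operations), and the `ALLOWED` role table and `authorize` helper are hypothetical.

```python
def operation_type(body):
    """Classify a GraphQL document by its leading keyword (simplified)."""
    text = body.lstrip()
    for keyword in ("mutation", "subscription"):
        if text.startswith(keyword):
            return keyword
    return "query"  # covers both `query { ... }` and the `{ ... }` shorthand


# Hypothetical access rules keyed by operation type rather than by URL.
ALLOWED = {"reader": {"query"}, "admin": {"query", "mutation"}}


def authorize(role, body):
    """Return True if the role may run the operation found in the body."""
    return operation_type(body) in ALLOWED.get(role, set())


read_body = "{ me { name, age }}"
write_body = 'mutation { createK8Cluster (name: "c1"){ clusterId } }'
```

Both bodies are sent as a POST to the same endpoint; only the body tells the gateway that `write_body` changes state and needs the stricter rule.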
The very common question is whether GraphQL is replacing REST. The answer is no. It’s an alternative for obtaining data from data-intensive services. REST provides you the advantages of well-defined interfaces with standard error codes, and it is easily cached and optimized for the HTTP protocol.
A key element of a GraphQL call is to take away the complexity of multiple server-side calls.
A single GraphQL call can potentially make one or more backend calls (the runtime magic of GraphQL optimizes the connectivity). The API consumer is likely sympathetic but is not losing sleep over the complexity of the server-side calls. That lack of accountability is why we need to manage the number of calls that the API consumer can potentially make …. hmmm, API management.
GraphQL is a query language, not much different in concept to our favorite query language, SQL.
We have all seen these examples: select * from table. Well, what if you have a million records? A single query could bring down your infrastructure.
Then there is the dreaded ‘join hell’, where you have multiple joins. These queries can sometimes be intentional (think SQL injection) or unintentional (newbie developer).
We have to think back to API 3.0: we gave the API consumer the control, but perhaps we gave them too much control, because there could be unintended consequences.
To protect ourselves from the API consumer, we need to look into the concept of throttling and rate limits.
Anyone who has worked in infrastructure knows that we need to protect our backend services, fail fast, control the number of requests allowed to backends based on compute available.
Especially with GraphQL and its graph-like data structures, we need to understand the impact of a single GraphQL query on the backend infrastructure.
In the traditional API 2.0 or REST use case, a single transaction had a single impact on the compute of backend services. Furthermore, it could be pre-calculated because the service contract was well-defined. In GraphQL, we don’t have that same 1:1 relationship; instead, we need a way to map compute time to GraphQL calls.
Consider rate limits: is it really fair to rate limit a consumer on a per-GraphQL-request basis like we do with REST? I don’t think so. We need to re-think the way we perform consumer rate limiting.
The API Management of GraphQL queries needs to evolve from the traditional model around REST and consider the implications of GraphQL to Rate Limiting and Threat protection, which is my catch-all term to ensure that we minimize the system impact of queries from API consumers.
For GraphQL management to be successful, we need to take advantage of our existing infrastructure and adapt it to understand the complexity of GraphQL queries; specifically, we are not asking our gateways to execute GraphQL, that is the wrong design architecture, but rather assess the complexity of queries and make policy decisions around whether we want to send them to backends.
This is analogous to what many API gateways do when assessing the runtime complexity of an XML or JSON payload. You can have a heavily nested XML payload that can cause software parsers to use a large amount of CPU, so having the detection in the gateway allows you to protect your backend and keep the bad messages from ever reaching them.
- Introspecting a GraphQL backend, we can build static analysis into our gateway, also enabling similar introspection from GraphQL clients.
In the static analysis of our GraphQL schema, we can start to analyze the structure of our query calls, specifically the maximum nesting, the resolve complexity (i.e. the number of backend calls) and even the number of data elements returned.
- With the analysis on our gateway, we can then start to perform policy enforcement of GraphQL requests. We can reject requests that have very complicated nesting or pre-calculate the compute time of a potential query and determine if we want to even allow that request to proceed.
- A great example of these designs is the GitHub API, which uses a point system and a token bucket: it calculates a cost for each query and only allows calls that fit within your remaining points. It even offers a query to pre-calculate the cost before you actually submit the request.
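A point system of this kind can be sketched as a weighted sum over the fields a query selects. This is not GitHub's actual formula. The `WEIGHTS` table, the default weight, and the nested-dict representation of the selection are all assumptions made for illustration.

```python
# Hypothetical weights: fields backed by their own backend call cost more.
WEIGHTS = {"Players": 5, "team": 5, "arena": 5}
DEFAULT_WEIGHT = 1  # plain scalar fields


def query_cost(selection):
    """Sum field weights over a selection tree given as nested dicts."""
    total = 0
    for field, children in selection.items():
        total += WEIGHTS.get(field, DEFAULT_WEIGHT)
        if children:
            total += query_cost(children)
    return total


# The players/team/arena query from the slides, as a selection tree.
selection = {
    "Players": {
        "name": None,
        "league": None,
        "team": {"name": None, "city": None, "arena": {"name": None}},
    }
}
cost = query_cost(selection)
```

The gateway can compute `cost` before forwarding the query and compare it against the consumer's remaining points, which is also what lets it quote a cost up front.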
- The second part of API management is to consider how we offer rate plans.
If we really think about it, GraphQL is a premium service. If we look at the transition from API 2.0 to API 3.0, we gave the API consumer a really great service, so as API providers, we can potentially monetize it as a differentiated API plan.
For example, we may consider REST our basic plan, a Kafka service a good plan, and GraphQL the premium plan. For developers who want the efficiency of GraphQL, API management systems can offer it as a premium rate plan.
In addition, you can have multiple GraphQL plans, where you are quota limited based on the compute time of the backend. It can be based on the resolve complexity and you can offer multiple plans to allow consumers to obtain more API data from a single GraphQL request.