"Getting Started with Your Initial GraphQL API"

Foreword

Some time ago, Facebook introduced GraphQL, a novel way to construct back-end APIs. Essentially, it’s a domain-specific language for querying and manipulating data. Initially, I didn’t give it much thought, but I eventually became involved in a Toptal project requiring me to use GraphQL for back-end APIs. This prompted me to learn how to adapt my existing REST knowledge to this new technology.

It proved to be a fascinating endeavor. The implementation process challenged me to rethink the usual strategies and techniques associated with REST APIs and find more GraphQL-suited solutions. This article aims to encapsulate some common points to keep in mind when venturing into GraphQL API development for the first time.

Required Libraries

Developed internally by Facebook, GraphQL was publicly launched in 2015. Fast forward to 2018, the project transitioned from Facebook to the newly established GraphQL Foundation under the stewardship of the nonprofit Linux Foundation. This foundation now oversees the maintenance and evolution of the GraphQL query language specification and a reference implementation for JavaScript.

GraphQL is a relatively recent technology, and because its initial implementation was in JavaScript, its most mature libraries are found primarily within the Node.js ecosystem. Two companies, Apollo and Prisma, offer open-source tools and libraries for GraphQL. The example project in this article uses the reference implementation of GraphQL for JavaScript, along with libraries from these two providers:

graphql-js – The reference implementation of GraphQL for JavaScript.
apollo-server – A GraphQL server compatible with Express, Connect, Hapi, Koa, and more.
graphql-tools (Apollo) – Facilitates building, mocking, and stitching a GraphQL schema using SDL.
graphql-middleware (Prisma) – Enables splitting GraphQL resolvers into middleware functions.

Describing your APIs in GraphQL involves using GraphQL schemas. The specification provides a dedicated language for this purpose, the GraphQL Schema Definition Language (SDL). SDL is simple and intuitive, yet powerful and expressive.

Two primary approaches exist for creating these schemas: code-first and schema-first.

The code-first method entails describing GraphQL schemas as JavaScript objects based on the graphql-js library, with the SDL being auto-generated from the source code.
Conversely, the schema-first approach involves defining GraphQL schemas directly in SDL and then integrating business logic using Apollo’s graphql-tools library.

My preference lies with the schema-first methodology, which I will employ for the sample project in this article. We’ll use a traditional bookstore example to build a back end that provides CRUD APIs for managing authors and books, as well as APIs for user management and authentication.

Creating a Basic GraphQL Server

Setting up a rudimentary GraphQL server involves creating a new project, initializing it with npm, and configuring Babel. The latter step, Babel configuration, requires installing the necessary libraries using the following command:

npm install --save-dev @babel/core @babel/cli @babel/preset-env @babel/node 

Once Babel is installed, create a file named .babelrc within your project’s root directory. Add the following configuration to this file:

{
    "presets": [
        [
            "@babel/env",
            { "targets": { "node": "current" } }
        ]
    ]
}

Next, modify the package.json file and include the following command in the scripts section:

{
    ...
    "scripts": {
        "serve": "babel-node index.js"
    },
    ...
}

With Babel configured, you can proceed to install the essential GraphQL libraries. Use the following command:

npm install --save express apollo-server-express graphql graphql-tools graphql-tag

After the installation of the required libraries, you can create a minimal GraphQL server by adding the following code snippet to your index.js file:

import gql from 'graphql-tag';
import express from 'express';
import { ApolloServer, makeExecutableSchema } from 'apollo-server-express';

const port = process.env.PORT || 8080;

// Define APIs using GraphQL SDL
const typeDefs = gql`
   type Query {
       sayHello(name: String!): String!
   }

   type Mutation {
       sayHello(name: String!): String!
   }
`;

// Define resolvers map for API definitions in SDL
const resolvers = {
   Query: {
       sayHello: (obj, args, context, info) => {
           return `Hello ${ args.name }!`;
       }
   },

   Mutation: {
       sayHello: (obj, args, context, info) => {
           return `Hello ${ args.name }!`;
       }
   }
};

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolvers maps
const schema = makeExecutableSchema({ typeDefs, resolvers });

// Build Apollo server
const apolloServer = new ApolloServer({ schema });
apolloServer.applyMiddleware({ app });

// Run server
app.listen({ port }, () => {
   console.log(`🚀Server ready at http://localhost:${ port }${ apolloServer.graphqlPath }`);
});

Now, you can launch the server using the command npm run serve. Navigating to the URL http://localhost:8080/graphql in a web browser will open GraphQL’s interactive visual shell, known as Playground. Here, you can execute GraphQL queries and mutations while observing the resulting data.

In GraphQL, API functions are categorized as queries, mutations, and subscriptions:

Clients employ queries to retrieve necessary data from the server.
Mutations are used by clients to create, update, or delete data on the server.
Subscriptions allow clients to establish and maintain a real-time connection with the server, enabling them to receive events from the server and respond appropriately.
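
Outside Playground, clients typically send queries and mutations to the endpoint as an HTTP POST whose JSON body carries the operation text and its variables. The sketch below builds such a body for the sayHello operations defined earlier. The helper buildRequestBody is a hypothetical name introduced only for illustration.

```javascript
// Operation documents for the sayHello API defined in the SDL above.
const sayHelloQuery = `
    query SayHello($name: String!) {
        sayHello(name: $name)
    }
`;

const sayHelloMutation = `
    mutation SayHello($name: String!) {
        sayHello(name: $name)
    }
`;

// Hypothetical helper: serializes an operation into the JSON body
// that GraphQL servers conventionally accept over HTTP POST.
function buildRequestBody(query, variables = {}) {
    return JSON.stringify({ query, variables });
}

const body = buildRequestBody(sayHelloQuery, { name: 'World' });
console.log(JSON.parse(body).variables); // { name: 'World' }
```

Such a body would be posted to http://localhost:8080/graphql with a Content-Type of application/json.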

Our focus in this article will be on queries and mutations. Subscriptions, being a substantial topic in themselves, merit a dedicated article and aren’t mandatory for all API implementations.

Advanced Scalar Data Types

As you delve deeper into GraphQL, you’ll notice that SDL only offers basic data types. More advanced scalar types, such as Date, Time, and DateTime, ubiquitous in APIs, are absent. The graphql-iso-date library addresses this issue. After installing it, you need to define these new advanced scalar data types within your schema and link them to their implementations provided by the library:

import { GraphQLDate, GraphQLDateTime, GraphQLTime } from 'graphql-iso-date';

// Define APIs using GraphQL SDL
const typeDefs = gql`
   scalar Date
   scalar Time
   scalar DateTime
  
   type Query {
       sayHello(name: String!): String!
   }

   type Mutation {
       sayHello(name: String!): String!
   }
`;

// Define resolvers map for API definitions in SDL
const resolvers = {
   Date: GraphQLDate,
   Time: GraphQLTime,
   DateTime: GraphQLDateTime,

   Query: {
       sayHello: (obj, args, context, info) => {
           return `Hello ${ args.name }!`;
       }
   },

   Mutation: {
       sayHello: (obj, args, context, info) => {
           return `Hello ${ args.name }!`;
       }
   }
};

Beyond date and time, other valuable scalar data type implementations can be helpful depending on your specific scenario. One example is graphql-type-json, which enables dynamic typing within your GraphQL schema. This allows the passing and returning of untyped JSON objects through your API. Another useful library, graphql-scalar, provides the capability to define custom GraphQL scalars with enhanced sanitization, validation, and transformation capabilities.

You also have the option to create your own custom scalar data type and register it in your schema and resolver map in the same way as the library-provided scalars above. While not particularly complex, a detailed explanation falls outside the scope of this article. Should you be interested, you can find more comprehensive information in the Apollo documentation.
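
As a rough idea of what a custom scalar involves, the sketch below defines the configuration for a hypothetical PositiveInt scalar: a name plus three functions that validate values leaving the server, values arriving through variables, and values written inline in the query. To stay dependency-free, it stops short of constructing the scalar. In a real project, the assumption is that this object would be passed to GraphQLScalarType from the graphql package.

```javascript
// Hypothetical custom scalar: an integer greater than zero.
function ensurePositiveInt(value) {
    if (!Number.isInteger(value) || value <= 0) {
        throw new TypeError(`PositiveInt cannot represent value: ${ value }`);
    }
    return value;
}

const positiveIntConfig = {
    name: 'PositiveInt',
    description: 'An integer greater than zero',

    // Called when a resolver result is sent to the client
    serialize: ensurePositiveInt,

    // Called for values supplied through query variables
    parseValue: ensurePositiveInt,

    // Called for values written inline in the query text;
    // 'IntValue' is the AST node kind for integer literals
    parseLiteral: ast => {
        if (ast.kind !== 'IntValue') {
            throw new TypeError('PositiveInt must be an integer literal');
        }
        return ensurePositiveInt(parseInt(ast.value, 10));
    }
};

// Assumed wiring in a real project, mirroring the Date scalar above:
//   import { GraphQLScalarType } from 'graphql';
//   const resolvers = { PositiveInt: new GraphQLScalarType(positiveIntConfig) };
//   plus `scalar PositiveInt` in the SDL.
console.log(positiveIntConfig.parseLiteral({ kind: 'IntValue', value: '5' })); // 5
```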

Splitting Schema

As you continue to enhance your schema with added functionality, it will inevitably grow in size. Maintaining all the definitions in a single file becomes impractical. Splitting it into smaller, manageable components becomes essential for code organization and scalability. Fortunately, Apollo’s makeExecutableSchema function accommodates schema definitions and resolver maps in array format. This allows you to break down your schema and resolver map into more manageable parts. This is precisely what I’ve done in the provided sample project, where I’ve divided the API into the following sections:

auth.api.graphql – Dedicated to user authentication and registration.
author.api.graphql – Provides CRUD API endpoints for managing author entries.
book.api.graphql – Contains CRUD API endpoints for handling book entries.
root.api.graphql – Serves as the schema root and houses common definitions (e.g., advanced scalar types).
user.api.graphql – Provides CRUD API endpoints for user management.

When dividing the schema, there’s one crucial aspect to bear in mind. One part must act as the root schema, with all other parts extending it. While this might sound intricate, it’s quite straightforward in practice.

In the root schema, queries and mutations are defined like so:

type Query {
    ...
}

type Mutation {
    ...
}

In other schema parts, the definition changes slightly:

extend type Query {
    ...
}

extend type Mutation {
    ...
}

That’s all there is to it!
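
Put together, a split schema might look like the following sketch. The two schema parts are inlined as strings here, whereas the sample project keeps them in .graphql files. The field names and placeholder data are assumptions made for illustration only.

```javascript
// Root part: declares the base Query type.
const rootTypeDefs = `
    type Query {
        sayHello(name: String!): String!
    }
`;

// Book part: extends the root Query type instead of redefining it.
const bookTypeDefs = `
    type Book {
        id: ID!
        title: String!
    }

    extend type Query {
        books: [Book!]!
    }
`;

const rootResolvers = {
    Query: {
        sayHello: (obj, args) => `Hello ${ args.name }!`
    }
};

const bookResolvers = {
    Query: {
        // Placeholder data; a real resolver would call a service or database
        books: () => [{ id: '1', title: 'Sample Book' }]
    }
};

// Arrays of schema parts and resolver maps, ready to be passed on:
//   const schema = makeExecutableSchema({ typeDefs, resolvers });
const typeDefs = [rootTypeDefs, bookTypeDefs];
const resolvers = [rootResolvers, bookResolvers];
```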

Authentication and Authorization

Many APIs must restrict access and enforce role-based access control. This is where authentication (confirming user identity) and authorization (enforcing rule-based access policies) come into play.

Like REST APIs, GraphQL APIs commonly use JSON Web Tokens (JWT) for authentication. Validating the provided token requires intercepting all incoming requests and inspecting their Authorization headers. To accomplish this in GraphQL, you can register a function as a context hook when setting up the Apollo server. This function is called with the current request and builds a context object that is shared among all resolvers. Here’s how it’s done:

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolver maps
const schema = makeExecutableSchema({ typeDefs, resolvers });

// Build Apollo server
const apolloServer = new ApolloServer({
    schema,
    
    context: ({ req, res }) => {
        const context = {};

        // Extract the bearer token from the Authorization header
        const parts = req.headers.authorization ? req.headers.authorization.split(' ') : [''];
        const token = parts.length === 2 && parts[0].toLowerCase() === 'bearer' ? parts[1] : undefined;

        // verify() is assumed to be the application's own JWT verification helper (not shown here)
        context.authUser = token ? verify(token) : undefined;

        return context;
    }
});
apolloServer.applyMiddleware({ app });

// Run server
app.listen({ port }, () => {
   console.log(`🚀Server ready at http://localhost:${ port }${ apolloServer.graphqlPath }`);
});

If the user provides a valid JWT token, it’s verified, and the corresponding user object is saved in the context. This object then becomes accessible to all resolvers throughout the request’s lifecycle.
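
The header parsing in the context hook can also be pulled out into a small function that is easy to test in isolation. The sketch below is one way to do it. The name extractBearerToken is hypothetical and is not part of the sample project.

```javascript
// Hypothetical helper: returns the token from an "Authorization: Bearer <token>"
// header value, or undefined when the header is missing or malformed.
function extractBearerToken(authorizationHeader) {
    if (typeof authorizationHeader !== 'string') {
        return undefined;
    }

    const parts = authorizationHeader.trim().split(/\s+/);
    const isBearer = parts.length === 2 && parts[0].toLowerCase() === 'bearer';

    return isBearer ? parts[1] : undefined;
}

console.log(extractBearerToken('Bearer abc.def.ghi')); // abc.def.ghi
console.log(extractBearerToken('Basic dXNlcjpwYXNz')); // undefined
console.log(extractBearerToken(undefined));            // undefined
```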

Despite verifying user identities, our API remains globally accessible. There’s nothing preventing unauthorized access. One option is to check the user object in the context directly within each resolver. However, this approach is error-prone and introduces boilerplate code, and it’s easy to forget the check when adding new resolvers. In REST API frameworks, this is often addressed using HTTP request interceptors. This concept doesn’t map cleanly to GraphQL, however. One HTTP request can contain multiple GraphQL queries, and an interceptor would only see the raw string representation of the query, which it would have to parse manually.

Therefore, we require a different approach for intercepting GraphQL queries. Enter Prisma’s graphql-middleware library. It allows you to execute custom code either before or after a resolver is called, promoting code reusability and a clear separation of concerns.
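
To make the idea concrete, here is a conceptual sketch of a resolver middleware. It is not how the library is implemented. The helper withMiddleware and the log array on the context are assumptions made for illustration. The point is only that a middleware receives the next resolver as its first argument and decides what happens before and after calling it.

```javascript
// A middleware receives the next resolver first, followed by the
// usual resolver arguments.
const logCalls = (resolve, parent, args, context, info) => {
    context.log.push(`before sayHello(${ args.name })`);
    const result = resolve(parent, args, context, info);
    context.log.push(`after sayHello -> ${ result }`);
    return result;
};

// Hypothetical helper that wraps a single resolver with a single middleware.
const withMiddleware = (resolver, middleware) =>
    (parent, args, context, info) => middleware(resolver, parent, args, context, info);

const sayHello = (obj, args) => `Hello ${ args.name }!`;
const wrappedSayHello = withMiddleware(sayHello, logCalls);

const context = { log: [] };
console.log(wrappedSayHello(null, { name: 'World' }, context, null)); // Hello World!
console.log(context.log.length); // 2
```

The resolver here is synchronous to keep the sketch short. Real resolvers often return promises, in which case the middleware would await the result.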

Leveraging the Prisma middleware library, the GraphQL community has developed numerous impressive middleware solutions addressing various use cases. For user authorization, there’s graphql-shield, which simplifies the creation of a permission layer for your API.

Once graphql-shield is installed, you can introduce a permission layer into your API like so:

import { allow, rule } from 'graphql-shield';

// Coerce to a strict boolean so the rule never returns undefined
const isAuthorized = rule()(
    (obj, args, { authUser }, info) => Boolean(authUser)
);

export const permissions = {
    Query: {
        '*': isAuthorized,
        sayHello: allow
    },

    Mutation: {
        '*': isAuthorized,
        sayHello: allow
    }
};

This layer is then applied as middleware to your schema:

import { applyMiddleware } from 'graphql-middleware';
import { shield } from 'graphql-shield';

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolver maps
const schema = makeExecutableSchema({ typeDefs, resolvers });
const schemaWithMiddleware = applyMiddleware(schema, shield(permissions, { allowExternalErrors: true }));

// Build Apollo server (the option must be named `schema`)
const apolloServer = new ApolloServer({ schema: schemaWithMiddleware });
apolloServer.applyMiddleware({ app });

// Run server
app.listen({ port }, () => {
    console.log(`🚀Server ready at http://localhost:${ port }${ apolloServer.graphqlPath }`);
});

When instantiating the shield object, we enable allowExternalErrors. By default, the shield catches and handles errors arising within resolvers, which wasn’t suitable for my sample application.

In this example, we merely restricted API access to authenticated users. However, the shield offers significant flexibility, allowing for the implementation of intricate authorization schemes. For instance, in our example application, we have two roles: USER and USER_MANAGER. Only users with the USER_MANAGER role can access user administration functionality. Here’s how it’s implemented:

export const isUserManager = rule()(
    (obj, args, { authUser }, info) => Boolean(authUser && authUser.role === 'USER_MANAGER')
);

export const permissions = {
    Query: {
        userById: isUserManager,
        users: isUserManager
    },

    Mutation: {
        editUser: isUserManager,
        deleteUser: isUserManager
    }
}

Another point worth mentioning is the organization of middleware functions within your project. Similar to schema definitions and resolver maps, it’s advantageous to separate them based on the schema and maintain them in distinct files. Unlike the Apollo server, which accepts and stitches together arrays of schema definitions and resolver maps, the Prisma middleware library requires a single middleware map object. Therefore, if you opt to split them, manual stitching is necessary. Refer to the ApiExplorer class in the sample project to see my solution to this.
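
One possible way to do this manual stitching is sketched below. It is not the ApiExplorer implementation from the sample project. The helper mergeMiddlewareMaps is a hypothetical name, and it assumes that every map only contains field-level entries grouped by type name. The rules are replaced by plain strings so that the sketch runs without dependencies.

```javascript
// Hypothetical helper: merges per-module middleware maps (Query, Mutation, ...)
// into a single map object. Later maps win when two modules define the same field.
function mergeMiddlewareMaps(...maps) {
    const merged = {};

    for (const map of maps) {
        for (const [typeName, fields] of Object.entries(map)) {
            merged[typeName] = { ...merged[typeName], ...fields };
        }
    }

    return merged;
}

// Stand-in rules; in the article these would be graphql-shield rules.
const isAuthorized = 'isAuthorized';
const isUserManager = 'isUserManager';

const bookPermissions = { Query: { books: isAuthorized } };
const userPermissions = {
    Query: { users: isUserManager },
    Mutation: { deleteUser: isUserManager }
};

const permissions = mergeMiddlewareMaps(bookPermissions, userPermissions);
console.log(Object.keys(permissions.Query)); // [ 'books', 'users' ]
```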

Validation

GraphQL SDL offers limited user input validation. You can only specify the type of each field and whether it is required or optional. Any additional validation rules must be implemented manually. While you could embed validation logic directly within the resolver functions, this isn’t ideal and presents another excellent opportunity to use GraphQL middleware.

Let’s consider a user signup request where we need to validate whether the username is a valid email address, if the entered passwords match, and if the chosen password meets specific strength criteria. Here’s how you can achieve this:

import { UserInputError } from 'apollo-server-express';
import passwordValidator from 'password-validator';
import { isEmail } from 'validator';

const passwordSchema = new passwordValidator()
    .is().min(8)
    .is().max(20)
    .has().letters()
    .has().digits()
    .has().symbols()
    .has().not().spaces();

export const validators = {
    Mutation: {
        signup: (resolve, parent, args, context, info) => {
            const { email, password, rePassword } = args.signupReq;

            if (!isEmail(email)) {
                throw new UserInputError('Invalid Email address!');
            }

            if (password !== rePassword) {
                throw new UserInputError('Passwords don\'t match!');
            }

            if (!passwordSchema.validate(password)) {
                throw new UserInputError('Password is not strong enough!');
            }

            // Pass all four resolver arguments on to the next resolver
            return resolve(parent, args, context, info);
        }
    }
}

This validation layer, along with the permissions layer, is then applied as middleware to your schema:

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolver maps
const schema = makeExecutableSchema({ typeDefs, resolvers });
const schemaWithMiddleware = applyMiddleware(schema, validators, shield(permissions, { allowExternalErrors: true }));

// Build Apollo server (the option must be named `schema`)
const apolloServer = new ApolloServer({ schema: schemaWithMiddleware });
apolloServer.applyMiddleware({ app });

N + 1 Queries

Another potential pitfall in GraphQL APIs, often overlooked, is the N + 1 queries problem. This arises when you have a one-to-many relationship between types defined in your schema. To illustrate this, let’s examine the book API from our sample project:

extend type Query {
    books: [Book!]!
    ...
}

extend type Mutation {
    ...
}

type Book {
    id: ID!
    creator: User!
    createdAt: DateTime!
    updatedAt: DateTime!
    authors: [Author!]!
    title: String!
    about: String
    language: String
    genre: String
    isbn13: String
    isbn10: String
    publisher: String
    publishDate: Date
    hardcover: Int
}

type User {
    id: ID!
    createdAt: DateTime!
    updatedAt: DateTime!
    fullName: String!
    email: String!
}

Here, the User type has a one-to-many relationship with the Book type, represented by the creator field in Book. The resolver map for this schema is defined as:

export const resolvers = {
    Query: {
        books: (obj, args, context, info) => {
            return bookService.findAll();
        },
        ...
    },

    Mutation: {
        ...
    },

    Book: {
        creator: ({ creatorId }, args, context, info) => {
            return userService.findById(creatorId);
        },
        ...
    }
};

Executing a books query against this API and observing the SQL statement log might reveal something like this:

select `books`.* from `books`
select `users`.* from `users` where `users`.`id` = ?
select `users`.* from `users` where `users`.`id` = ?
select `users`.* from `users` where `users`.`id` = ?
select `users`.* from `users` where `users`.`id` = ?
select `users`.* from `users` where `users`.`id` = ?
...

This shows that the resolver was initially called for the books query, returning a list of books. Subsequently, for each book object, the creator field’s resolver was invoked, leading to N + 1 database queries. Such behavior can overload your database and is far from ideal.
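
The effect is easy to reproduce without a database. The following sketch uses in-memory stand-ins for bookService and userService (the data and the query counter are assumptions made for illustration) and mimics what happens when the books list is resolved and the creator resolver then runs once per book.

```javascript
// In-memory stand-ins for bookService and userService that count "queries".
let queryCount = 0;

const books = [
    { id: 1, title: 'Book A', creatorId: 10 },
    { id: 2, title: 'Book B', creatorId: 11 },
    { id: 3, title: 'Book C', creatorId: 10 }
];
const users = [
    { id: 10, fullName: 'user10' },
    { id: 11, fullName: 'user11' }
];

const bookService = {
    findAll: () => { queryCount += 1; return books; }
};
const userService = {
    findById: id => { queryCount += 1; return users.find(user => user.id === id); }
};

// One call for the list, then one creator lookup per book.
const result = bookService.findAll().map(book => ({
    title: book.title,
    creator: userService.findById(book.creatorId)
}));

console.log(queryCount); // 4 (1 query for books + 3 for creators)
```

Note that user 10 is fetched twice, once for each book it created.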

To tackle this N + 1 queries issue, Facebook developers introduced an elegant solution known as DataLoader. Its README succinctly describes it as:

“DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching”

Understanding how DataLoader operates might not be immediately intuitive. Let’s first examine an example addressing the problem outlined above and then delve into the underlying logic.

In our sample project, DataLoader is defined as follows for the creator field:

import DataLoader from 'dataloader';

export class UserDataLoader extends DataLoader {
   constructor() {
       const batchLoader = userIds => {
           return userService
               .findByIds(userIds)
               .then(
                   users => userIds.map(
                       userId => users.filter(user => user.id === userId)[0]
                   )
               );
       };

       super(batchLoader);
   }

   static getInstance(context) {
       if (!context.userDataLoader) {
           context.userDataLoader = new UserDataLoader();
       }

       return context.userDataLoader;
   }
}

With the UserDataLoader in place, you can modify the resolver for the creator field:

export const resolvers = {
   Query: {
      ...
   },

   Mutation: {
      ...
   },

   Book: {
      creator: ({ creatorId }, args, context, info) => {
         const userDataLoader = UserDataLoader.getInstance(context);

         return userDataLoader.load(creatorId);
      },
      ...
   }
}

After making these changes, if you rerun the books query and inspect the SQL statement log, you should observe something similar to this:

select `books`.* from `books`
select `users`.* from `users` where `id` in (?)

As you can see, the N + 1 database queries have been reduced to two. The first query retrieves the list of books, while the second fetches the list of users corresponding to the creators of those books. Let’s break down how DataLoader achieves this.

Batching is a core feature of DataLoader. Within a single tick of the event loop, DataLoader gathers all distinct IDs from the individual load calls and then invokes the batch function once with all of them. Importantly, a DataLoader instance should not be shared across requests: once the batch function has been called, the returned values stay cached in the instance for its lifetime. Consequently, a new DataLoader instance must be created for each request. This is handled by the static getInstance function in our example. It checks whether a DataLoader instance already exists within the context object and creates one if it does not. Recall that a new context object is generated for each request and shared among all resolvers.
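
The mechanism can be illustrated with a deliberately simplified loader. This is not DataLoader’s actual implementation, and createBatchingLoader is a hypothetical name. The sketch only shows the two ideas described above: keys requested in the same tick are collected into one batch call, and results are cached per key for the lifetime of the loader.

```javascript
// Simplified illustration of batching and caching.
function createBatchingLoader(batchFn) {
    const cache = new Map(); // key -> promise, kept as long as the loader lives
    let queue = [];          // loads waiting for the next batch

    const dispatch = () => {
        const batch = queue;
        queue = [];
        batchFn(batch.map(item => item.key)).then(
            values => batch.forEach((item, index) => item.resolve(values[index])),
            error => batch.forEach(item => item.reject(error))
        );
    };

    return {
        load(key) {
            if (cache.has(key)) {
                return cache.get(key);
            }
            const promise = new Promise((resolve, reject) => {
                // The first load of a tick schedules one dispatch for the whole batch
                if (queue.length === 0) {
                    queueMicrotask(dispatch);
                }
                queue.push({ key, resolve, reject });
            });
            cache.set(key, promise);
            return promise;
        }
    };
}

// Usage: three load calls, two distinct IDs, one batch call.
const batchCalls = [];
const loader = createBatchingLoader(async ids => {
    batchCalls.push(ids);
    return ids.map(id => ({ id, fullName: `user${ id }` }));
});

const demo = Promise.all([loader.load(3), loader.load(1), loader.load(3)]);
demo.then(users => console.log(batchCalls.length, users.map(user => user.fullName)));
```

The batch function is called once with the keys 3 and 1, and the repeated load of key 3 is served from the cache.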

DataLoader’s batch loading function receives an array of unique requested IDs. It then returns a promise that resolves to an array of corresponding objects. When implementing this function, two crucial points need attention:

The length of the results array must match the length of the requested IDs array. For instance, if the requested IDs are [1, 2, 3], the returned array must contain precisely three objects: [{ "id": 1, "fullName": "user1" }, { "id": 2, "fullName": "user2" }, { "id": 3, "fullName": "user3" }]
The order of objects in the results array must align with the order of IDs in the requested IDs array. If the requested IDs are ordered as [3, 1, 2], the results array should mirror this order: [{ "id": 3, "fullName": "user3" }, { "id": 1, "fullName": "user1" }, { "id": 2, "fullName": "user2" }]

Our example ensures this order consistency using the following code:

.then(
   users => userIds.map(
       userId => users.filter(user => user.id === userId)[0]
   )
)
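
For larger batches, the same alignment can be done with a lookup table instead of filtering the whole result list once per ID. The sketch below is an alternative to the code above, not what the sample project uses. The name alignByIds is hypothetical, and returning null for IDs without a matching row is an assumption about how missing records should be represented.

```javascript
// Hypothetical helper: reorders rows so that result[i] corresponds to ids[i].
// IDs without a matching row yield null, keeping both arrays the same length.
function alignByIds(ids, rows) {
    const rowsById = new Map(rows.map(row => [row.id, row]));
    return ids.map(id => rowsById.get(id) ?? null);
}

// The database returned rows in its own order, and user 2 does not exist.
const rows = [
    { id: 1, fullName: 'user1' },
    { id: 3, fullName: 'user3' }
];

const aligned = alignByIds([3, 1, 2], rows);
console.log(aligned.map(row => (row ? row.fullName : null)));
// user3 first, then user1, then null for the missing user 2
```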

Security

Last but certainly not least, let’s address security. GraphQL empowers you to create highly flexible APIs, providing users with extensive control over data querying. This flexibility, however, can be exploited if security measures are inadequate. As the saying goes, “With great power comes great responsibility.” A malicious user could potentially craft an expensive query, leading to a Denial of Service (DoS) attack on your server.

One initial step towards securing your API is to disable GraphQL schema introspection. By default, GraphQL API servers allow introspection, enabling tools like GraphiQL and Apollo Playground to understand the schema. This feature, however, can also be leveraged by malicious actors to construct complex queries targeting your API. Disabling it involves setting the introspection parameter to false during Apollo Server creation:

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolver maps
const schema = makeExecutableSchema({ typeDefs, resolvers });

// Build Apollo server
const apolloServer = new ApolloServer({ schema, introspection: false });
apolloServer.applyMiddleware({ app });

// Run server
app.listen({ port }, () => {
   console.log(`🚀Server ready at http://localhost:${ port }${ apolloServer.graphqlPath }`);
})

Another important security measure is limiting query depth. This is particularly crucial if your data types have cyclic relationships. Consider our example project where the Author type has a books field, and the Book type has an authors field. This represents a cyclic relationship, and a malicious user could construct a query like this:

query {
   authors {
      id, fullName
      books {
         id, title
         authors {
            id, fullName
            books {
               id, title
               authors {
                  id, fullName
                  books {
                     id, title
                     authors {
                        ...
                     }
                  }
               }
            }
         }
      }
   }
}

With sufficient nesting, such a query could easily overwhelm your server. To mitigate this risk, you can use a library called graphql-depth-limit. Once installed, you can enforce a depth limit when creating the Apollo Server:

import depthLimit from 'graphql-depth-limit';

// Configure express
const app = express();

// Build GraphQL schema based on SDL definitions and resolver maps
const schema = makeExecutableSchema({ typeDefs, resolvers });

// Build Apollo server
const apolloServer = new ApolloServer({ schema, introspection: false, validationRules: [ depthLimit(5) ] });
apolloServer.applyMiddleware({ app });

// Run server
app.listen({ port }, () => {
   console.log(`🚀Server ready at http://localhost:${ port }${ apolloServer.graphqlPath }`);
})

In this case, we’ve restricted the maximum query depth to five.
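
To get a feel for how quickly nesting grows in a cyclic query, the rough sketch below counts the deepest level of curly braces in a query string. It is only an illustration: graphql-depth-limit works on the parsed query rather than on raw text and may count levels differently, and maxBraceDepth is a hypothetical helper that ignores strings and comments.

```javascript
// Rough illustration: deepest curly-brace nesting in a query string.
function maxBraceDepth(query) {
    let depth = 0;
    let max = 0;

    for (const character of query) {
        if (character === '{') {
            depth += 1;
            max = Math.max(max, depth);
        } else if (character === '}') {
            depth -= 1;
        }
    }

    return max;
}

const shallowQuery = 'query { authors { id fullName } }';
const cyclicQuery = 'query { authors { books { authors { books { id } } } } }';

console.log(maxBraceDepth(shallowQuery)); // 2
console.log(maxBraceDepth(cyclicQuery));  // 5
```

Every additional hop between authors and books adds one more level, which is what a depth limit puts a ceiling on.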

Post Scriptum: Moving from REST to GraphQL Is Interesting

This tutorial aimed to highlight common challenges encountered when starting with GraphQL API development. Given the breadth of some topics, certain code examples are simplified and only scratch the surface. For more comprehensive implementations, please refer to the Git repository of my sample GraphQL API project: graphql-example.

In conclusion, GraphQL is an intriguing technology. While its potential to replace REST remains uncertain, its innovative approach to API development makes it a valuable skill to acquire. In the ever-evolving world of IT, who knows what the future holds?