---
title: Solving the N+1 Problem with `DataLoader`
---

When building a server with GraphQL.js, it's common to encounter
performance issues related to the N+1 problem: a pattern that
results in many unnecessary database or service calls,
especially in nested query structures.

This guide explains what the N+1 problem is, why it's relevant in
GraphQL field resolution, and how to address it using
[`DataLoader`](https://github.com/graphql/dataloader).

## What is the N+1 problem?

The N+1 problem happens when your API fetches a list of items using one
query, and then issues an additional query for each item in the list.
In GraphQL, this usually occurs in nested field resolvers.

For example, in the following query:

```graphql
{
  posts {
    id
    title
    author {
      name
    }
  }
}
```

If the `posts` field returns 10 items, and each `author` field fetches
the author by ID with a separate database call, the server performs
11 total queries: one to fetch the posts, and one for each post's author
(10 author lookups). The number of database calls grows linearly with the
number of parent items, and each call adds its own round-trip latency.

Even if several posts share the same author, the server will still issue
duplicate queries unless you implement deduplication or batching manually.
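
The query count described above can be reproduced without a GraphQL server. The following is a minimal sketch, assuming hypothetical in-memory `users` and `posts` arrays in place of a database and a hypothetical `getUserById` lookup; `resolvePostsWithAuthors` only imitates how each post resolves its author independently.

```js
// Hypothetical in-memory data standing in for a database.
const users = [
  { id: '1', name: 'Ada' },
  { id: '2', name: 'Grace' },
];
const posts = [
  { id: 'a', title: 'First', authorId: '1' },
  { id: 'b', title: 'Second', authorId: '2' },
  { id: 'c', title: 'Third', authorId: '1' },
];

// Counts every simulated database call.
let queryCount = 0;

async function getPosts() {
  queryCount += 1; // one query for the whole list
  return posts;
}

async function getUserById(id) {
  queryCount += 1; // one query per call, even for an ID already fetched
  return users.find((user) => user.id === id) ?? null;
}

// Imitates per-field resolution: each post looks up its author on its own.
async function resolvePostsWithAuthors() {
  const list = await getPosts();
  return Promise.all(
    list.map(async (post) => ({
      ...post,
      author: await getUserById(post.authorId),
    })),
  );
}
```

After `await resolvePostsWithAuthors()`, `queryCount` is 4: one query for the list plus three author lookups, even though the first and third posts share an author.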

## Why this happens in GraphQL.js

In GraphQL.js, each field resolver runs independently. There's no built-in
coordination between resolvers, and no automatic batching. This keeps field
resolvers composable and predictable, but it is also what produces the N+1
pattern. Nested resolutions, such as fetching an author for each post in the
previous example, each call their own data-fetching logic, even if those calls
could be grouped.

## Solving the problem with `DataLoader`

[`DataLoader`](https://github.com/graphql/dataloader) is a utility library designed
to solve this problem. It batches the `.load(key)` calls made in the same tick of
execution into a single `batchLoadFn(keys)` call and caches results for the lifetime
of the loader instance, which is normally one request. This lets you eliminate
redundant fetches and group related lookups into efficient operations.
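
To make the batching and caching idea concrete, here is a deliberately simplified stand-in for a loader. It is a sketch for illustration only, not `DataLoader`'s implementation: the class name `TinyLoader` and the `demo` function are hypothetical, it schedules its dispatch with `queueMicrotask`, and it omits the options and per-key error handling the real library provides.

```js
// Simplified illustration of batching plus caching. NOT the DataLoader source.
class TinyLoader {
  constructor(batchLoadFn) {
    this.batchLoadFn = batchLoadFn;
    this.cache = new Map(); // key -> promise for that key's value
    this.queue = []; // keys waiting for the next batch
  }

  load(key) {
    // Cache hit: reuse the promise so each key is requested at most once.
    if (this.cache.has(key)) return this.cache.get(key);

    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first queued key schedules a single dispatch for this batch.
      if (this.queue.length === 1) {
        queueMicrotask(() => this.dispatch());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call for all queued keys; results must match the key order.
      const values = await this.batchLoadFn(batch.map((item) => item.key));
      batch.forEach((item, index) => item.resolve(values[index]));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}

// Three loads, two distinct keys, issued synchronously.
async function demo() {
  const batches = [];
  const loader = new TinyLoader(async (keys) => {
    batches.push(keys);
    return keys.map((key) => ({ id: key }));
  });
  await Promise.all([loader.load('1'), loader.load('2'), loader.load('1')]);
  return batches;
}
```

Awaiting `demo()` yields a single recorded batch containing `'1'` and `'2'`: the three `load` calls collapse into one batch function call, and the repeated key is served from the cache.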

To use `DataLoader` in a `graphql-js` server:

1. Create `DataLoader` instances for each request.
2. Attach the instance to the `contextValue` passed to GraphQL execution. You can attach the
   loader when calling [`graphql()`](https://graphql.org/graphql-js/graphql/#graphql) directly, or
   when setting up a GraphQL HTTP server such as [express-graphql](https://github.com/graphql/express-graphql).
3. Use `.load(id)` in resolvers to fetch data through the loader.

### Example: Batching author lookups

Suppose each `Post` has an `authorId`, and you have a `getUsersByIds(ids)`
function that can fetch multiple users in a single call:

```js
import {
  graphql,
  GraphQLObjectType,
  GraphQLSchema,
  GraphQLString,
  GraphQLList,
  GraphQLID,
} from 'graphql';
import DataLoader from 'dataloader';
import { getPosts, getUsersByIds } from './db.js';

const UserType = new GraphQLObjectType({
  name: 'User',
  fields: () => ({
    id: { type: GraphQLID },
    name: { type: GraphQLString },
  }),
});

const PostType = new GraphQLObjectType({
  name: 'Post',
  fields: () => ({
    id: { type: GraphQLID },
    title: { type: GraphQLString },
    author: {
      type: UserType,
      // Go through the per-request loader instead of querying directly.
      resolve(post, args, context) {
        return context.userLoader.load(post.authorId);
      },
    },
  }),
});

const QueryType = new GraphQLObjectType({
  name: 'Query',
  fields: () => ({
    posts: {
      // `new` is required as of graphql-js v16, where type wrappers are classes.
      type: new GraphQLList(PostType),
      resolve: () => getPosts(),
    },
  }),
});

const schema = new GraphQLSchema({ query: QueryType });

// Call once per request and pass the result as `contextValue`.
function createContext() {
  return {
    userLoader: new DataLoader(async (userIds) => {
      const users = await getUsersByIds(userIds);
      // Index by ID so results can be returned in the order of `userIds`.
      const usersById = new Map(users.map((user) => [String(user.id), user]));
      return userIds.map((id) => usersById.get(String(id)) ?? null);
    }),
  };
}
```

With this setup, all `.load(authorId)` calls made while resolving the `posts` list are
collected and batched into a single call to `getUsersByIds`. `DataLoader` also caches
results for the duration of the request, so repeated `.load(id)` calls for the same ID
don't trigger additional fetches.

## Best practices

- Create a new `DataLoader` instance per request. This ensures that caching is scoped
  correctly and avoids leaking data between users.
- Always return results in the same order as the input keys, and return exactly one
  result per key. This is required by the `DataLoader` contract. If a key is not found,
  return `null` or an `Error` instance in that position, depending on your policy.
  Throwing from the batch function rejects every key in the batch.
- Keep batch functions focused. Each loader should handle a specific data access pattern.
- Use `.loadMany()` sparingly. While it's useful when you already have a list of IDs, it's
  typically not needed in field resolvers, since `.load()` already batches individual calls
  made within the same execution cycle.
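
The ordering rule is the easiest one to get wrong, because a data source rarely returns rows in the order the keys were requested. Below is a minimal sketch of one way to realign rows with keys, assuming each row carries an `id` field. `alignToKeys` is a hypothetical helper name, not part of `DataLoader`.

```js
// Hypothetical helper (not part of DataLoader): returns one entry per key,
// in key order, using null for keys that have no matching row.
function alignToKeys(keys, rows, getKey = (row) => row.id) {
  // Normalize IDs to strings so that 1 and '1' refer to the same row.
  const rowsByKey = new Map(rows.map((row) => [String(getKey(row)), row]));
  return keys.map((key) => rowsByKey.get(String(key)) ?? null);
}

// Rows arrive in arbitrary order and one key ('3') has no match.
const rows = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' },
];
const aligned = alignToKeys(['2', '1', '3'], rows);
console.log(aligned.map((row) => (row ? row.name : null)));
```

Here `aligned` holds the row for Grace, then the row for Ada, then `null` for the unmatched key, so the result has the same length and order as the requested keys. A batch function could return `alignToKeys(keys, rows)` directly.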

## Additional resources

- [`DataLoader` GitHub repository](https://github.com/graphql/dataloader): Includes full API docs and usage examples.
- [GraphQL field resolvers](https://graphql.org/graphql-js/resolvers/): Background on how field resolution works.