Commit e2a4ba7

sarahxsanders authored and yaacovCR committed
docs: N+1 problem and DataLoader (#4383)
1 parent 00a25a0 commit e2a4ba7

3 files changed: +150 -0 lines changed

cspell.yml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ overrides:
  - swcrc
  - noreferrer
  - xlink
+ - deduplication

  validateDirectives: true
  ignoreRegExpList:

website/pages/docs/_meta.ts

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ const meta = {
   'constructing-types': '',
   'oneof-input-objects': '',
   'defer-stream': '',
+  'n1-dataloader': '',
   'resolver-anatomy': '',
   'graphql-errors': '',
   '-- 3': {

website/pages/docs/n1-dataloader.mdx

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
---
title: Solving the N+1 Problem with `DataLoader`
---

When building a server with GraphQL.js, it's common to encounter
performance issues related to the N+1 problem: a pattern that
results in many unnecessary database or service calls,
especially in nested query structures.

This guide explains what the N+1 problem is, why it's relevant in
GraphQL field resolution, and how to address it using
[`DataLoader`](https://github.com/graphql/dataloader).

## What is the N+1 problem?

The N+1 problem happens when your API fetches a list of items using one
query, and then issues an additional query for each item in the list.
In GraphQL, this usually occurs in nested field resolvers.

For example, in the following query:

```graphql
{
  posts {
    id
    title
    author {
      name
    }
  }
}
```

If the `posts` field returns 10 items, and each `author` field fetches
the author by ID with a separate database call, the server performs
11 total queries: one to fetch the posts, and one for each post's author
(10 author lookups). The number of database calls grows linearly with the
number of parent items, which can degrade performance.

Even if several posts share the same author, the server will still issue
duplicate queries unless you implement deduplication or batching manually.
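
To make the arithmetic concrete, here is a minimal sketch that counts lookups under naive per-field resolution. The in-memory data and the `getPosts` and `getUserById` helpers are hypothetical stand-ins for real database calls, and they are synchronous only to keep the example short.

```javascript
// Hypothetical in-memory "database" that counts every query it receives.
let queryCount = 0;

const users = new Map([
  [1, { id: 1, name: 'Ada' }],
  [2, { id: 2, name: 'Grace' }],
]);

// Ten posts that share only two distinct authors.
const posts = Array.from({ length: 10 }, (_, i) => ({
  id: i + 1,
  title: `Post ${i + 1}`,
  authorId: (i % 2) + 1,
}));

function getPosts() {
  queryCount += 1; // one query for the list
  return posts;
}

function getUserById(id) {
  queryCount += 1; // one query per call, even for repeated IDs
  return users.get(id);
}

// Naive resolution: the author is looked up separately for every post.
const result = getPosts().map((post) => ({
  title: post.title,
  author: getUserById(post.authorId),
}));

console.log(queryCount); // 11: 1 for the posts + 10 author lookups
```

Only two distinct authors exist in this sketch, yet ten author lookups are issued. That duplication is what batching and caching aim to remove.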

## Why this happens in GraphQL.js

In GraphQL.js, each field resolver runs independently. There's no built-in
coordination between resolvers, and no automatic batching. This makes field
resolvers composable and predictable, but it also creates the N+1 problem.
Nested resolutions, such as fetching an author for each post in the previous
example, will each call their own data-fetching logic, even if those calls
could be grouped.

## Solving the problem with `DataLoader`

[`DataLoader`](https://github.com/graphql/dataloader) is a utility library designed
to solve this problem. It batches multiple `.load(key)` calls into a single `batchLoadFn(keys)`
call and caches results for the lifetime of the loader instance, which should be a single
request. This means you can reduce redundant data fetches and group related lookups into
efficient operations.
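
The batching idea can be illustrated with a deliberately simplified sketch. This is not the `DataLoader` source: `createTinyLoader` and the batch function below are hypothetical, and the real library handles scheduling, errors, and configuration in far more detail. The sketch only shows how `.load(key)` calls made in the same tick can be collected and handed to one batch function.

```javascript
// Simplified illustration of batching and caching; NOT the real DataLoader.
function createTinyLoader(batchLoadFn) {
  const cache = new Map(); // key -> Promise, kept for the loader's lifetime
  let queue = []; // pending { key, resolve, reject } entries

  function dispatch() {
    const batch = queue;
    queue = [];
    const keys = batch.map((entry) => entry.key);
    // One call for the whole batch; values are matched to keys by position.
    batchLoadFn(keys).then(
      (values) => batch.forEach((entry, i) => entry.resolve(values[i])),
      (error) => batch.forEach((entry) => entry.reject(error)),
    );
  }

  return {
    load(key) {
      // A repeated key reuses the promise that is already cached.
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        // The first key of a batch schedules a single flush.
        if (queue.length === 0) queueMicrotask(dispatch);
        queue.push({ key, resolve, reject });
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// Hypothetical batch function that records every call it receives.
const batchCalls = [];
const loader = createTinyLoader(async (ids) => {
  batchCalls.push(ids);
  return ids.map((id) => ({ id, name: `User ${id}` }));
});

// Three loads with two distinct keys, issued in the same tick.
const pending = Promise.all([loader.load(1), loader.load(2), loader.load(1)]);

pending.then((loaded) => {
  console.log(batchCalls.length); // number of batch calls made
  console.log(loaded.map((user) => user.name));
});
```

In this sketch the three `load` calls produce a single batch call with the keys `1` and `2`, and the repeated key resolves to the same cached object.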

To use `DataLoader` in a `graphql-js` server:

1. Create a new `DataLoader` instance for each request.
2. Attach the instance to the `contextValue` passed to GraphQL execution. You can attach the
   loader when calling [`graphql()`](https://graphql.org/graphql-js/graphql/#graphql) directly, or
   when setting up a GraphQL HTTP server such as [express-graphql](https://github.com/graphql/express-graphql).
3. Use `.load(id)` in resolvers to fetch data through the loader.

### Example: Batching author lookups

Suppose each `Post` has an `authorId`, and you have a `getUsersByIds(ids)`
function that can fetch multiple users in a single call:

```js
import {
  graphql,
  GraphQLObjectType,
  GraphQLSchema,
  GraphQLString,
  GraphQLList,
  GraphQLID,
} from 'graphql';
import DataLoader from 'dataloader';
import { getPosts, getUsersByIds } from './db.js';

const UserType = new GraphQLObjectType({
  name: 'User',
  fields: () => ({
    id: { type: GraphQLID },
    name: { type: GraphQLString },
  }),
});

const PostType = new GraphQLObjectType({
  name: 'Post',
  fields: () => ({
    id: { type: GraphQLID },
    title: { type: GraphQLString },
    author: {
      type: UserType,
      // Queue the lookup; the loader batches it with the other posts' lookups.
      resolve(post, args, context) {
        return context.userLoader.load(post.authorId);
      },
    },
  }),
});

const QueryType = new GraphQLObjectType({
  name: 'Query',
  fields: () => ({
    posts: {
      // GraphQLList is a class and must be instantiated with `new`.
      type: new GraphQLList(PostType),
      resolve: () => getPosts(),
    },
  }),
});

const schema = new GraphQLSchema({ query: QueryType });

// Call this once per request so each request gets its own loader and cache.
function createContext() {
  return {
    userLoader: new DataLoader(async (userIds) => {
      const users = await getUsersByIds(userIds);
      // Return one entry per key, in key order; missing users become null.
      const usersById = new Map(users.map((user) => [user.id, user]));
      return userIds.map((id) => usersById.get(id) ?? null);
    }),
  };
}

// Pass a fresh context to each execution.
const result = await graphql({
  schema,
  source: '{ posts { id title author { name } } }',
  contextValue: createContext(),
});
```

With this setup, the `.load(authorId)` calls made while resolving a query are
collected and batched into a single call to `getUsersByIds`. `DataLoader` also caches
results for the duration of the request, so repeated `.load(id)` calls for the same ID
don't trigger additional fetches.

## Best practices

- Create a new `DataLoader` instance per request. This ensures that caching is scoped
  correctly and avoids leaking data between users.
- Always return results in the same order as the input keys, with exactly one entry per key.
  This is required by the `DataLoader` contract. If a key is not found, return `null` or an
  `Error` instance at that position, depending on your policy. Throwing from the batch
  function rejects every key in the batch.
- Keep batch functions focused. Each loader should handle a specific data access pattern.
- Use `.loadMany()` sparingly. While it's useful when you already have a list of IDs, it's
  typically not needed in field resolvers, since `.load()` already batches individual calls
  made within the same tick of the event loop.
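
The ordering rule is the easiest one to get wrong, because a database may return rows in any order and skip missing IDs. The following is a minimal sketch of a helper that realigns rows with the requested keys. The name `alignToKeys`, the `id` field, and the choice to compare keys as strings are assumptions made for illustration.

```javascript
// Hypothetical helper: return one entry per key, in key order.
// Keys are compared as strings so that numeric and string IDs still match;
// drop the String() calls if your key types are already consistent.
function alignToKeys(keys, rows, getKey = (row) => row.id) {
  const byKey = new Map(rows.map((row) => [String(getKey(row)), row]));
  // Missing rows become null so the output length always equals keys.length.
  return keys.map((key) => byKey.get(String(key)) ?? null);
}

// Rows arrive out of order, and the user with ID '2' does not exist.
const rows = [
  { id: '3', name: 'Lin' },
  { id: '1', name: 'Ada' },
];

const aligned = alignToKeys(['1', '2', '3'], rows);
console.log(aligned.map((user) => (user ? user.name : null)));
// [ 'Ada', null, 'Lin' ]
```

A batch function could return `alignToKeys(userIds, users)` directly, which satisfies both the ordering and the one-entry-per-key requirements.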

## Additional resources

- [`DataLoader` GitHub repository](https://github.com/graphql/dataloader): Includes full API docs and usage examples.
- [GraphQL field resolvers](https://graphql.org/graphql-js/resolvers/): Background on how field resolution works.
