# Optimizing N+1 queries
Have you noticed that some GraphQL queries can trigger hundreds of database queries, often fetching mostly repeated data? Let's look at why this happens and how to fix it.
## Query Resolution
Imagine you have a simple query like this:
```graphql
query { todos { users { name } } }
```
and the `User` resolver looks like this:
```rust
use async_graphql::*;
use sqlx::{Pool, Postgres};

struct User {
    id: u64,
}

#[Object]
impl User {
    async fn name(&self, ctx: &Context<'_>) -> Result<String> {
        let pool = ctx.data_unchecked::<Pool<Postgres>>();
        let (name,): (String,) = sqlx::query_as("SELECT name FROM user WHERE id = $1")
            // Postgres has no unsigned 64-bit integer type, so bind the id as i64.
            .bind(self.id as i64)
            .fetch_one(pool)
            .await?;
        Ok(name)
    }
}
```
The query executor first calls the `Todos` resolver, which runs `SELECT id, todo, user_id FROM todo` and returns N todos. Then, for each todo, it concurrently calls the `User` resolver, which runs `SELECT name FROM user WHERE id = todo.user_id`.
For example:
```sql
SELECT id, todo, user_id FROM todo
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
SELECT name FROM user WHERE id = $1
```
`SELECT name FROM user WHERE id = $1` is executed many times, even though most `Todo` objects belong to the same user. This code needs to be optimized!
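
To make the cost concrete, here is a minimal sketch that counts the queries issued by the naive approach. It uses only the standard library; `FakeDb`, `resolve_naively` and `sample_db` are hypothetical names invented for this illustration and stand in for a real database and resolver.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the database. It counts every "query" it serves.
struct FakeDb {
    /// The `user_id` column of each row in the `todo` table.
    todo_user_ids: Vec<u64>,
    users: HashMap<u64, String>,
    queries: Cell<usize>,
}

impl FakeDb {
    /// Models `SELECT id, todo, user_id FROM todo`.
    fn select_todos(&self) -> Vec<u64> {
        self.queries.set(self.queries.get() + 1);
        self.todo_user_ids.clone()
    }

    /// Models `SELECT name FROM user WHERE id = $1`.
    fn select_user_name(&self, id: u64) -> Option<String> {
        self.queries.set(self.queries.get() + 1);
        self.users.get(&id).cloned()
    }
}

/// Resolves the user name of every todo with one lookup per todo.
fn resolve_naively(db: &FakeDb) -> Vec<Option<String>> {
    db.select_todos()
        .into_iter()
        .map(|user_id| db.select_user_name(user_id))
        .collect()
}

/// Six todos that belong to only two distinct users.
fn sample_db() -> FakeDb {
    FakeDb {
        todo_user_ids: vec![1, 1, 2, 1, 2, 1],
        users: HashMap::from([(1, "Alice".to_string()), (2, "Bob".to_string())]),
        queries: Cell::new(0),
    }
}
```

Calling `resolve_naively(&sample_db())` issues one query for the todos plus one per todo, seven in total, although only two distinct users are involved.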
## Dataloader
We need to batch these queries and eliminate the duplicates, which is exactly what `DataLoader` does.
[Facebook's DataLoader](https://github.com/facebook/dataloader) provides a request-scoped batching and caching solution.
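
The core idea can be sketched without any library: collect the requested keys, drop the duplicates, fetch everything in one call, and hand each caller its result. This is a simplified synchronous illustration, not how `DataLoader` is actually implemented; `fetch_names_batch` and `load_many` are hypothetical names, and the fake names it returns are made up.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for one batched query such as
/// `SELECT id, name FROM user WHERE id IN (...)`.
fn fetch_names_batch(ids: &[u64]) -> HashMap<u64, String> {
    ids.iter().map(|&id| (id, format!("user-{}", id))).collect()
}

/// Resolves a name for every requested id using a single batch call.
/// Returns the names in request order and the number of keys actually sent.
fn load_many(requested: &[u64]) -> (Vec<Option<String>>, usize) {
    // Deduplicate: repeated ids collapse into a single key.
    let unique: Vec<u64> = requested
        .iter()
        .copied()
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect();
    // Batch: one call covers every unique key.
    let found = fetch_names_batch(&unique);
    // Fan the results back out to each original request.
    let names = requested.iter().map(|id| found.get(id).cloned()).collect();
    (names, unique.len())
}
```

With the requests `[1, 1, 2, 1, 3, 2]`, `load_many` sends only three keys in one batch while still returning six names.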
The following example uses `DataLoader` to optimize the queries:
```rust
use async_graphql::*;
use async_graphql::dataloader::*;
use futures_util::TryStreamExt; // provides `map_ok`, `map_err` and `try_collect`
use itertools::Itertools;
use sqlx::{Pool, Postgres};
use std::collections::HashMap;
use std::sync::Arc;

struct UserNameLoader {
    pool: Pool<Postgres>,
}

#[async_trait::async_trait]
impl Loader<u64> for UserNameLoader {
    type Value = String;
    type Error = Arc<sqlx::Error>;

    async fn load(&self, keys: &[u64]) -> Result<HashMap<u64, Self::Value>, Self::Error> {
        // The keys are integers, so interpolating them into the SQL text is injection-safe.
        let query = format!("SELECT id, name FROM user WHERE id IN ({})", keys.iter().join(","));
        // Select the id along with the name so each row can be keyed in the returned map.
        Ok(sqlx::query_as(&query)
            .fetch(&self.pool)
            .map_ok(|(id, name): (i64, String)| (id as u64, name))
            .map_err(Arc::new)
            .try_collect().await?)
    }
}

struct User {
    id: u64,
}

#[Object]
impl User {
    async fn name(&self, ctx: &Context<'_>) -> Result<String> {
        let loader = ctx.data_unchecked::<DataLoader<UserNameLoader>>();
        let name: Option<String> = loader.load_one(self.id).await?;
        name.ok_or_else(|| "Not found".into())
    }
}
```
In the end, only two SQL statements are needed to fetch the results we want!
```sql
SELECT id, todo, user_id FROM todo
SELECT name FROM user WHERE id IN (1, 2, 3, 4)
```
## Implement multiple data types
You can implement `Loader` for multiple key types on the same struct, like this:
```rust
struct PostgresLoader {
    pool: sqlx::Pool<Postgres>,
}

#[async_trait::async_trait]
impl Loader<UserId> for PostgresLoader {
    type Value = User;
    type Error = Arc<sqlx::Error>;

    async fn load(&self, keys: &[UserId]) -> Result<HashMap<UserId, Self::Value>, Self::Error> {
        // Load users from database
        todo!()
    }
}

#[async_trait::async_trait]
impl Loader<TodoId> for PostgresLoader {
    type Value = Todo;
    // The error type must be `Clone`, which `sqlx::Error` is not, so wrap it in `Arc`.
    type Error = Arc<sqlx::Error>;

    async fn load(&self, keys: &[TodoId]) -> Result<HashMap<TodoId, Self::Value>, Self::Error> {
        // Load todos from database
        todo!()
    }
}
```
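
The reason this works is that the trait is generic over the key type, so the compiler picks the implementation from the keys you pass. The following std-only sketch shows that dispatch with a simplified, synchronous stand-in trait. `BatchLoad`, `InMemoryLoader`, `sample_loader` and the sample data are assumptions made for this illustration and are not part of async-graphql.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct UserId(u64);

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct TodoId(u64);

#[derive(Clone, PartialEq, Debug)]
struct User {
    name: String,
}

#[derive(Clone, PartialEq, Debug)]
struct Todo {
    title: String,
}

/// Simplified, synchronous stand-in for a loader trait that is generic over its key type.
trait BatchLoad<K> {
    type Value;
    fn load(&self, keys: &[K]) -> HashMap<K, Self::Value>;
}

/// One loader struct that serves two different key types.
struct InMemoryLoader {
    users: HashMap<UserId, User>,
    todos: HashMap<TodoId, Todo>,
}

impl BatchLoad<UserId> for InMemoryLoader {
    type Value = User;

    fn load(&self, keys: &[UserId]) -> HashMap<UserId, User> {
        // Keys without a matching row are simply absent from the result.
        keys.iter()
            .filter_map(|k| self.users.get(k).map(|u| (*k, u.clone())))
            .collect()
    }
}

impl BatchLoad<TodoId> for InMemoryLoader {
    type Value = Todo;

    fn load(&self, keys: &[TodoId]) -> HashMap<TodoId, Todo> {
        keys.iter()
            .filter_map(|k| self.todos.get(k).map(|t| (*k, t.clone())))
            .collect()
    }
}

/// Builds a loader holding one user and one todo.
fn sample_loader() -> InMemoryLoader {
    InMemoryLoader {
        users: HashMap::from([(UserId(1), User { name: "Alice".to_string() })]),
        todos: HashMap::from([(TodoId(7), Todo { title: "Write docs".to_string() })]),
    }
}
```

Passing a `&[UserId]` slice to `load` selects the user implementation and passing a `&[TodoId]` slice selects the todo implementation, so distinct newtype keys keep the two lookups from being mixed up.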