Add better documentation for handling batch mappings with parameters #1168


Open
tstocker-black-cape opened this issue Apr 2, 2025 · 6 comments
Labels: status: feedback-provided · status: waiting-for-triage · type: documentation

Comments

@tstocker-black-cape

I submitted a Stack Overflow question with the spring-graphql tag, which you can find here.

Given a GraphQL schema that contains data like the following

type Person {
  name: String!
  age: Int!
  friends(filter: FriendsFilter): [Person!]!
  hobbies(filter: HobbiesFilter): [Hobby!]!
}

I can create a schema mapping in my controller which looks like the following

@SchemaMapping
public List<Person> friends(
    @Argument FriendsFilter filter, 
    Person person){
  // Fetch and return friends
}

However, this runs us into the N+1 problem. So to solve that, we need to batch. What I would expect to be able to do is modify my code to the following

@BatchMapping
public Map<Person, List<Person>> friends(
    @Argument FriendsFilter filter, 
    List<Person> people){
  // Fetch and return friends in bulk
}

I have found that Spring for GraphQL does not support this. While such support would be ideal, I'm willing to work around it, but all the other answers I'm finding lose the type information and attempt to register a mapper for the pair Person.class, List.class, which is insufficient because I have two fields that are both lists. What exactly is the simplest and most correct way forward here? I have to solve the N+1 problem, and I have to preserve the filtering functionality of my API.

I've tried reading through the closed issues asking for this feature and I still haven't quite found the answer I'm looking for. I could really just use some help finding the right thing to do in this case where a filter is required and we can't sacrifice the typing of List.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Apr 2, 2025
@bclozel bclozel self-assigned this Apr 3, 2025
@bclozel
Member

bclozel commented Apr 4, 2025

We've discussed this problem in a few issues; #232 is a good example, but there are others. Let's use this issue to work on some documentation in Spring for GraphQL on this.

Sorry I initially missed your question on StackOverflow. I'll paste the answer here as well:


The batching feature in graphql-java and Spring for GraphQL is not "just" a way to work around the N+1 issue for a single data fetcher. This is a more general mechanism for loading elements in batches and caching their resolution for the lifetime of the GraphQL request.

More specifically, the DataLoader API is a contract for loading Objects given a key (usually their id). DataLoader#load(...) calls can be invoked for different parts of the query which may have different arguments and different selection sets. Futures are kept around until their resolution is triggered with DataLoader#dispatch().

Batch loading functions do have access to the BatchLoaderEnvironment which contains the main GraphQLContext and key contexts (but this is outside of the scope of this question). @BatchMapping methods are merely shortcuts to registering a batch loading function and using it in a data fetcher.
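As a rough mental model of that contract, here is a simplified, self-contained sketch (plain Java, not the actual org.dataloader implementation): `load()` returns a future immediately and caches it for the duration of the request, while `dispatch()` invokes the batch function once for all pending keys.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Simplified model of the DataLoader contract: load() defers and caches,
// dispatch() batches. The real org.dataloader API has many more features.
public class ToyDataLoader<K, V> {
    private final Function<Set<K>, Map<K, V>> batchFunction;
    private final Map<K, CompletableFuture<V>> cache = new LinkedHashMap<>();
    private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();

    public ToyDataLoader(Function<Set<K>, Map<K, V>> batchFunction) {
        this.batchFunction = batchFunction;
    }

    // Returns a future immediately; repeated loads of the same key share one future.
    public CompletableFuture<V> load(K key) {
        CompletableFuture<V> future = cache.get(key);
        if (future == null) {
            future = new CompletableFuture<>();
            cache.put(key, future);
            pending.put(key, future);
        }
        return future;
    }

    // Calls the batch function once for all pending keys and completes their futures.
    public void dispatch() {
        if (pending.isEmpty()) {
            return;
        }
        Map<K, V> results = batchFunction.apply(pending.keySet());
        pending.forEach((key, future) -> future.complete(results.get(key)));
        pending.clear();
    }

    static String demo() {
        int[] batchCalls = {0};
        ToyDataLoader<Integer, String> loader = new ToyDataLoader<>(keys -> {
            batchCalls[0]++;
            Map<Integer, String> out = new HashMap<>();
            keys.forEach(id -> out.put(id, "person-" + id));
            return out;
        });
        CompletableFuture<String> a = loader.load(1);
        CompletableFuture<String> b = loader.load(2);
        CompletableFuture<String> c = loader.load(1); // cache hit: same future as a
        loader.dispatch();
        return a.join() + "," + b.join() + ",sharedFuture=" + (a == c) + ",batchCalls=" + batchCalls[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // person-1,person-2,sharedFuture=true,batchCalls=1
    }
}
```

Note how three `load()` calls result in a single batch invocation, and the duplicate key resolves to the very same future: this is the caching behavior described above.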

For your case, I would say that there are two possible approaches: fetching then filtering, or doing a tailored fetch.

Let's use the following schema for this example:

type Query {
    me: Person
    people: [Person]
}

input FriendsFilter {
    favoriteBeverage: String
}

type Person {
    id: ID!
    name: String
    favoriteBeverage: String
    friends(filter: FriendsFilter): [Person]
}

Fetching then filtering

One approach would be to fetch all friends for a given person, possibly caching their values for the entire lifetime of the GraphQL request.

@Controller
public class FriendsController {

	private final Map<Integer, Person> people = Map.of(
			1, new Person(1, "Rossen", "coffee", List.of(2, 3)),
			2, new Person(2, "Brian", "tea", List.of(1, 3)),
			3, new Person(3, "Donna", "tea", List.of(1, 2, 4)),
			4, new Person(4, "Brad", "coffee", List.of(1, 2, 3, 5)),
			5, new Person(5, "Andi", "coffee", List.of(1, 2, 3, 4))
	);

	public FriendsController(BatchLoaderRegistry registry) {
		registry.forTypePair(Integer.class, Person.class).registerMappedBatchLoader((personIds, env) -> {
			// fetch all friends and do not apply filter, caching Person by their id
			Map<Integer, Person> friends = new HashMap<>();
			personIds.forEach(personId -> friends.put(personId, people.get(personId)));
			return Mono.just(friends);
		});
	}

	@QueryMapping
	public Person me() {
		return this.people.get(2);
	}

	@QueryMapping
	public Collection<Person> people() {
		return this.people.values();
	}

	@SchemaMapping
	public CompletableFuture<List<Person>> friends(Person person, @Argument FriendsFilter filter, DataLoader<Integer, Person> dataLoader) {
		// load all friends THEN apply the given filter
		return dataLoader
				.loadMany(person.friendsId())
				.thenApply(filter::apply);
	}

	public record Person(Integer id, String name, String favoriteBeverage, List<Integer> friendsId) {
	}

	public record FriendsFilter(String favoriteBeverage) {

		List<Person> apply(List<Person> friends) {
			return friends.stream()
					.filter(person -> person.favoriteBeverage.equals(this.favoriteBeverage))
					.collect(Collectors.toList());
		}
	}
}

In practice, this request:

query {
  me {
    name
    friends(filter: {favoriteBeverage: "tea"}) { 
      name
      favoriteBeverage
    }
  }
  people {
    name
    friends(filter: {favoriteBeverage: "coffee"}) { 
      name
      favoriteBeverage
    }
  }
}

Will yield:

{
  "data": {
    "me": {
      "name": "Brian",
      "friends": [
        {
          "name": "Donna",
          "favoriteBeverage": "tea"
        }
      ]
    },
    "people": [
      {
        "name": "Andi",
        "friends": [
          {
            "name": "Rossen",
            "favoriteBeverage": "coffee"
          },
          {
            "name": "Brad",
            "favoriteBeverage": "coffee"
          }
        ]
      },
      {
        "name": "Brad",
        "friends": [
          {
            "name": "Rossen",
            "favoriteBeverage": "coffee"
          },
          {
            "name": "Andi",
            "favoriteBeverage": "coffee"
          }
        ]
      },
      {
        "name": "Donna",
        "friends": [
          {
            "name": "Rossen",
            "favoriteBeverage": "coffee"
          },
          {
            "name": "Brad",
            "favoriteBeverage": "coffee"
          }
        ]
      },
      {
        "name": "Brian",
        "friends": [
          {
            "name": "Rossen",
            "favoriteBeverage": "coffee"
          }
        ]
      },
      {
        "name": "Rossen",
        "friends": []
      }
    ]
  }
}

Note: we have here two different operations fetching friends with different filters, but they're both using the batch loading function.

  • Pros: Person values are shared via the DataLoader cache, meaning you will fetch more values but perform fewer I/O operations.
  • Cons: If people have lots of friends and filtering operations are costly, the server will consume more memory/CPU instead of delegating that work to the data store

Tailored fetch

Let's try and just fetch the values we need.

@Controller
public class FriendsController {

	private final Map<Integer, Person> people = Map.of(
			1, new Person(1, "Rossen", "coffee", List.of(2, 3)),
			2, new Person(2, "Brian", "tea", List.of(1, 3)),
			3, new Person(3, "Donna", "tea", List.of(1, 2, 4)),
			4, new Person(4, "Brad", "coffee", List.of(1, 2, 3, 5)),
			5, new Person(5, "Andi", "coffee", List.of(1, 2, 3, 4))
	);

	public FriendsController(BatchLoaderRegistry registry) {
		// we're now using a composed key
		registry.forTypePair(FriendFilterKey.class, Person[].class).registerMappedBatchLoader((keys, env) -> {
			// perform efficient fetching by delegating the filter operation to the data store
			Map<FriendFilterKey, Person[]> result = new HashMap<>();
			keys.forEach(key -> {
				Person[] friends = key.person().friendsId().stream()
						.map(people::get)
						.filter(friend -> key.friendsFilter().matches(friend))
						.toArray(Person[]::new);
				result.put(key, friends);
			});
			return Mono.just(result);
		});
	}

	@QueryMapping
	public Person me() {
		return this.people.get(2);
	}

	@QueryMapping
	public Collection<Person> people() {
		return this.people.values();
	}

	@SchemaMapping
	public CompletableFuture<Person[]> friends(Person person, @Argument FriendsFilter filter, DataLoader<FriendFilterKey, Person[]> dataLoader) {
		return dataLoader.load(new FriendFilterKey(person, filter));
	}

	public record Person(Integer id, String name, String favoriteBeverage, List<Integer> friendsId) {
	}

	public record FriendsFilter(String favoriteBeverage) {

		boolean matches(Person friend) {
			return friend.favoriteBeverage.equals(this.favoriteBeverage);
		}
	}

	// because this key contains both the person and the filter, we will need to fetch the same friend multiple times
	public record FriendFilterKey(Person person, FriendsFilter friendsFilter) {
	}

}
  • Pros: We only fetch the friends we need for a given person, delegating the memory/filter operations to the data store.
  • Cons: We're performing one call per list of friends and the DataLoader cache usage is not optimal
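What makes the composed key work is that Java records get value-based `equals()`/`hashCode()` for free, so "same person + same filter" maps to one DataLoader cache entry. A minimal sketch (the record shapes below are simplified stand-ins for the `FriendFilterKey` above, keyed by id rather than the full `Person`):

```java
import java.util.HashMap;
import java.util.Map;

// Records compare by field values, which is exactly what a composed DataLoader
// key needs: two structurally identical keys must hash to the same cache entry.
public class ComposedKeyDemo {
    record FriendsFilter(String favoriteBeverage) {}
    record FriendFilterKey(Integer personId, FriendsFilter filter) {}

    static boolean equalKeysShareOneCacheEntry() {
        Map<FriendFilterKey, String> cache = new HashMap<>();
        cache.put(new FriendFilterKey(1, new FriendsFilter("tea")), "cached friends");
        // A structurally identical key built in another part of the query
        // hits the same cache entry, so the batch is not fetched twice.
        return "cached friends".equals(
                cache.get(new FriendFilterKey(1, new FriendsFilter("tea"))));
    }

    public static void main(String[] args) {
        System.out.println(equalKeysShareOneCacheEntry()); // true
    }
}
```

The full-`Person` key in the example above works the same way, since records nested in records still compare by value. A class with default identity-based `equals()` would silently defeat the cache.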

Note: we can't use a record FriendFilterKey(Integer personId, FriendsFilter friendsFilter) {} here. If we did, the batch loading function would return null for filtered-out friends, leading to null entries in the response:

    "me": {
      "name": "Brian",
      "friends": [
        {
          "name": "Donna",
          "favoriteBeverage": "tea"
        },
        null // Rossen is filtered out
      ]
    },

There is more to this, as we could also explain how context keys can be used to pass a per-key context. The goal is not to replace the DataFetchingEnvironment (think, @QueryMapping arguments), but provide useful context data that doesn't need to be part of the loader key.

@tstocker-black-cape Please let us know if this comment helps you and if it needs some clarification. We can then consider adding a new section in the documentation. Thanks!

@bclozel bclozel added type: documentation A documentation task status: waiting-for-feedback We need additional information before we can continue labels Apr 4, 2025
@tstocker-black-cape
Author

@bclozel Thanks so much for taking the time to help me out with this! I appreciate your example. It definitely helped me wrap my head around how everything goes together better. I'm privileged to be able to "push the gas and make it go" for the most part thanks to spring boot.

Your second example "Tailored Fetch" is what I'm going for, but you side-stepped a key aspect of my issue by using a native array instead of a typed list. Type erasure is the root of my issue at the moment. To demonstrate this, I've created an example project that builds off the "People" API we've been using. Some key changes are

  • The response is not a list, but an edge type. Imagine that an edge is a pagination wrapper. It might contain aggregations or cursors, etc.
  • A person can have friends and enemies, both of which are Edge<Person> return types

If you run the only test in PersonControllerIT, you'll see an error that looks something like this

org.springframework.web.reactive.function.client.WebClientRequestException: Request processing failed: java.lang.IllegalStateException: More than one DataLoader named 'ts.bc.io.controller.PersonController$Edge'

	at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:137)
	at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4576)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
	at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
	at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
	at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:134)
	at reactor.core.publisher.Operators.error(Operators.java:198)
	at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4576)
	at reactor.core.publisher.Mono.block(Mono.java:1806)
	at org.springframework.test.web.reactive.server.DefaultWebTestClient$DefaultRequestBodyUriSpec.exchange(DefaultWebTestClient.java:369)
	at org.springframework.graphql.test.tester.WebTestClientTransport.execute(WebTestClientTransport.java:60)
	at org.springframework.graphql.test.tester.DefaultGraphQlTester$DefaultRequest.execute(DefaultGraphQlTester.java:179)
	at ts.bc.io.controller.PersonControllerIT.test_query(PersonControllerIT.java:55)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:146)
		at reactor.core.publisher.Mono.block(Mono.java:1807)
		... 7 more
Caused by: jakarta.servlet.ServletException: Request processing failed: java.lang.IllegalStateException: More than one DataLoader named 'ts.bc.io.controller.PersonController$Edge'
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1022)
	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
	at org.springframework.test.web.servlet.TestDispatcherServlet.service(TestDispatcherServlet.java:72)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658)
	at org.springframework.mock.web.MockFilterChain$ServletFilterProxy.doFilter(MockFilterChain.java:165)
	at org.springframework.mock.web.MockFilterChain.doFilter(MockFilterChain.java:132)
	at org.springframework.test.web.servlet.MockMvc.perform(MockMvc.java:201)
	at org.springframework.test.web.servlet.client.MockMvcHttpConnector.connect(MockMvcHttpConnector.java:110)
	at org.springframework.test.web.reactive.server.WiretapConnector.connect(WiretapConnector.java:72)
	at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.exchange(ExchangeFunctions.java:103)
	... 7 more
Caused by: java.lang.IllegalStateException: More than one DataLoader named 'ts.bc.io.controller.PersonController$Edge'
	at org.springframework.graphql.execution.DefaultBatchLoaderRegistry.registerDataLoader(DefaultBatchLoaderRegistry.java:117)
	at org.springframework.graphql.execution.DefaultBatchLoaderRegistry.registerDataLoaders(DefaultBatchLoaderRegistry.java:111)
	at org.springframework.graphql.execution.DefaultExecutionGraphQlService.lambda$applyDataLoaderRegistrars$6(DefaultExecutionGraphQlService.java:142)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at org.springframework.graphql.execution.DefaultExecutionGraphQlService.applyDataLoaderRegistrars(DefaultExecutionGraphQlService.java:142)
	at org.springframework.graphql.execution.DefaultExecutionGraphQlService.registerDataLoaders(DefaultExecutionGraphQlService.java:122)
	at org.springframework.graphql.execution.DefaultExecutionGraphQlService.lambda$execute$4(DefaultExecutionGraphQlService.java:104)
	at reactor.core.publisher.MonoDeferContextual.subscribe(MonoDeferContextual.java:47)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4576)
	at reactor.core.publisher.Mono.subscribeWith(Mono.java:4641)
	at reactor.core.publisher.Mono.toFuture(Mono.java:5153)
	at org.springframework.graphql.server.webmvc.GraphQlHttpHandler.prepareResponse(GraphQlHttpHandler.java:82)
	at org.springframework.graphql.server.webmvc.AbstractGraphQlHttpHandler.handleRequest(AbstractGraphQlHttpHandler.java:133)
	at org.springframework.web.servlet.function.support.HandlerFunctionAdapter.handle(HandlerFunctionAdapter.java:108)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
	at org.springframework.test.web.servlet.TestDispatcherServlet.service(TestDispatcherServlet.java:72)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658)
	at org.springframework.mock.web.MockFilterChain$ServletFilterProxy.doFilter(MockFilterChain.java:165)
	at org.springframework.mock.web.MockFilterChain.doFilter(MockFilterChain.java:132)
	at org.springframework.test.web.servlet.MockMvc.perform(MockMvc.java:201)
	at org.springframework.test.web.servlet.client.MockMvcHttpConnector.connect(MockMvcHttpConnector.java:107)
	... 9 more

Here is the example

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Apr 7, 2025
@bclozel
Member

bclozel commented Apr 16, 2025

Thanks for the detailed sample @tstocker-black-cape

The main issue here is that your batch loaders are registered against the Edge<T> type, which explains the error you are seeing. There are a couple of things to consider here:

  1. I think the Edge type should not be hand-rolled in Spring for GraphQL applications; our pagination support should be used instead. Using this support would get rid of the Edge type entirely. Your application could import org.springframework.data:spring-data-commons (even if it doesn't use any repository) and use the Slice and ScrollSubrange types. This would also move the "limit" and "skip" fields out of the filter types in your schema.
  2. Once this first issue is out of the picture, we're basically back to the problem being discussed earlier. In your case, clients can specify filters at each stage of the schema (root query and schema mappings), so caching by filtering key might be inefficient because the cache hit rate would be very low. We can also load people by their ids with the batch loader and then filter - better cache hit rate, but more memory usage if people have lots of friends.
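As an aside on the error itself: the "More than one DataLoader named ...$Edge" message is a type-erasure symptom. At runtime, Edge<Person> and Edge<Hobby> are the same Class object, so a registry that derives the loader name from the value class sees a duplicate registration. A self-contained demonstration:

```java
import java.util.List;

// Generic type parameters are erased at runtime: Edge<Person> and Edge<Hobby>
// share one Class object, so any registry keyed on the value class collides.
public class ErasureDemo {
    record Person(String name) {}
    record Hobby(String name) {}
    record Edge<T>(List<T> nodes) {}

    static boolean genericTypesEraseToSameClass() {
        Edge<Person> friends = new Edge<>(List.of(new Person("Donna")));
        Edge<Hobby> hobbies = new Edge<>(List.of(new Hobby("chess")));
        return friends.getClass() == hobbies.getClass(); // both are just Edge.class
    }

    public static void main(String[] args) {
        System.out.println(genericTypesEraseToSameClass()); // true
    }
}
```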

In general I don't think this changes much about the documentation to be added; this is more a matter of combining batch loading with pagination.

@tstocker-black-cape
Author

Thanks for taking the time to look at this!

  1. While I understand that "Change your library to fit my application" is not a fair ask, "Change your API to fit our library" isn't necessarily the best advice either. I'm giving minimal examples for the sake of simplicity, but my app has aggregations in the edge to fulfill business requirements, so we can't just drop them. I think many apps that leverage Spring GraphQL will have constraints, and engineers may not be able to change the API; I would just ask that you keep those folks in mind.
  2. I apologize, but I think you might be going over my head here. Which cache are you referring to here? In my real application, bulk filtering in memory is not an option due to the size of the data set.

If you feel that there is no action to take here, feel free to close the ticket. FWIW, I have found a way to achieve my goals using a name in the batch loader registry, so thank you for your example code! I still maintain that filtered sub-selections (schema mappings) are a common occurrence in GraphQL APIs and it would be amazing if Spring GraphQL supported them without running into the N+1 problem. If you ever find a way to enhance the @BatchMapping annotation to work in that way, I'll be the first to adopt it.

Thanks again for taking the time with me!

@bclozel
Member

bclozel commented Apr 16, 2025

While I understand that "Change your library to fit my application" is not a fair ask, "Change your API to fit our library" isn't necessarily the best advice either. I'm giving minimal examples for the sake of simplicity, but my app has aggregations in the edge to fulfill business requirements, so we can't just drop them.

Fair enough. I was merely pointing to the fact that Spring for GraphQL supports the popular relay connection spec and that it would greatly simplify your codebase and schema design. But I understand that you might need to stick to an existing schema that uses a different pagination mechanism.

I apologize, but I think you might be going over my head here. Which cache are you referring to here? In my real application, bulk filtering in memory is not an option due to the size of the data set.

The main goal behind using batch loading is not only to load many entities at once, but also to avoid loading the same entity twice. By default, the data loader will cache loaded entities in memory for the entire duration of the request. Given the size of your dataset, I understand that loading all connected types in the graph is not an option. By using specific filtered keys, you will most likely load the same entity (Person instance here) multiple times, as it will probably appear under several different keys. I guess this is the tradeoff that we've been discussing in this issue from the beginning: memory vs. number of I/O calls.

If you feel that there is no action to take here, feel free to close the ticket.

I think it would be beneficial for us to improve our reference documentation to explain a bit why batch loading is different from queries and mutations when it comes to the environment. We certainly can't get into specific details about design and tradeoffs; this would probably belong in a tutorial or a blog post, not our reference documentation.

I still maintain that filtered sub-selections (schema mappings) are a common occurrence in GraphQL API's and it would be amazing if Spring GraphQL supported them without running into the N+1 problem. If you ever find a way to enhance the @BatchMapping annotation to work in that way, I'll be the first to adopt it.

I'm glad you managed to work things out. We do recognize that this is a common pattern. But as we can see, @BatchMapping is merely a shortcut and there are many ways and tradeoffs when it comes to designing a batch loading key and a strategy. Right now I don't see a natural way to make the samples I pasted above more expressive or more condensed. Maybe we can refine the model later.

Thanks for the discussion and bringing this up in the first place!

@bclozel bclozel added this to the 1.4.0 milestone Apr 16, 2025
@rstoyanchev
Contributor

Great discussion here, and good documentation improvements to come. I'll just mention a couple of things.

While I understand that "Change your library to fit my application" is not a fair ask, "Change your API to fit our library" isn't necessarily the best advice either. I'm giving minimal examples for the sake of simplicity, but my app has aggregations in the edge to fulfill business requirements, so we can't just drop them.

It is also worth noting that our support for pagination is flexible and pluggable. For example, instead of Edge you could use your own pagination type designed to fit your needs, and then adapt it through a ConnectionAdapter to the graphql.relay.Connection for the response. We have adapters for the Spring Data Page and Slice types because those are known types that we can support, but any other can be adapted. In that sense the Relay spec is just the standard format required on the wire, but you can have some flexibility in your application with the types you use.

FWIW, I have found a way to achieve my goals using a name in the batch loader registry, so thank you for your example code!

The BatchLoaderRegistry provides convenience for the 80% of cases where the key and value types are sufficiently unique, but if you need to deviate you can use a custom name indeed.
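To see why an explicit name resolves the collision, here is a toy model of a registry's uniqueness check (this is a simplified sketch, not Spring's actual DefaultBatchLoaderRegistry; in Spring for GraphQL the explicit-name route goes through BatchLoaderRegistry.forName(...)): a name derived from the erased value class makes two Edge<...> loaders collide, while caller-supplied names keep them distinct.

```java
import java.util.HashMap;
import java.util.Map;

// Toy registry model: loader names must be unique, so deriving the name from
// the erased value class makes two Edge<...> registrations collide, while
// explicit caller-chosen names keep them apart.
public class NamedRegistryDemo {
    private final Map<String, Object> loaders = new HashMap<>();

    void register(String name, Object batchLoader) {
        if (loaders.putIfAbsent(name, batchLoader) != null) {
            throw new IllegalStateException("More than one DataLoader named '" + name + "'");
        }
    }

    static String outcome() {
        NamedRegistryDemo registry = new NamedRegistryDemo();
        try {
            // Names derived from the erased class: both registrations become "Edge".
            registry.register("Edge", "friends loader");
            registry.register("Edge", "enemies loader");
            return "no collision";
        } catch (IllegalStateException expected) {
            // Explicit names avoid the clash entirely.
            registry.register("friendsEdge", "friends loader");
            registry.register("enemiesEdge", "enemies loader");
            return "explicit names avoid the collision";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome()); // explicit names avoid the collision
    }
}
```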
