
[3.10.8] Caching leads to worse performance than not caching #12560


Open
JeffreyMJordan opened this issue Apr 16, 2025 · 6 comments


@JeffreyMJordan

Issue Description

Hi Apollo, I've noticed some instances where caching leads to worse application performance than when not caching. When our application sends a large volume of batched queries, disabling caching frequently leads to better performance.

I've been able to reproduce the issue locally, and it seems to be related to the watches field and the updates that take place when parts of the cache are invalidated. A lot of CPU time is spent iterating through the watches field and updating the relevant parts of the cache. I'm curious whether this process could be optimized; I've noticed that when sending many instances of the same query with the same arguments, each instance is added to the watches field.

I understand that there's inherently more compute needed when caching vs. not, but it's unintuitive that caching would decrease performance.
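
To illustrate my understanding (this is my assumption about the mechanism, not Apollo's actual code), it looks like each in-flight query registers its own watch entry even when the document and variables are identical, and every cache write then iterates all of them:

```ts
// Hypothetical sketch of the pattern I believe I'm seeing; the names
// below are illustrative, not Apollo's real internals.
type Watch = { query: string; variables: string; callback: () => void };

const watches = new Set<Watch>();

function watch(w: Watch): () => void {
  // No dedup by (query, variables): 5000 identical queries
  // produce 5000 separate entries.
  watches.add(w);
  return () => watches.delete(w);
}

function broadcastWatches(): void {
  // Every cache update walks every registered watch.
  for (const w of watches) w.callback();
}
```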

Intended Outcome
Caching improves performance when an application sends a large number of batched queries.

Actual Outcome
Caching degrades performance when an application sends a large number of batched queries.

Link to Reproduction

https://codesandbox.io/p/devbox/quirky-johnson-lt2hfy?workspaceId=ws_9rF45qqZxvP5ESx5Mhtmg2

Reproduction Steps

I would have uploaded a performance profile, but file upload seems to be broken.

  1. Open the Performance tab in Chrome DevTools and observe how long scripting takes to complete.
  2. Uncomment the fetchPolicy, then reload (a sketch of that toggle follows below).

runMicrotasks takes about 345 ms without caching and about 22.8 seconds with caching.
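
For reference, the toggle in step 2 looks roughly like this; the query and component names here are placeholders, not the sandbox's actual code:

```tsx
import { gql, useQuery } from "@apollo/client";

// Hypothetical query standing in for the one in the sandbox.
const MY_QUERY = gql`
  query GetItem($id: ID!) {
    item(id: $id) {
      id
      name
    }
  }
`;

function Item({ id }: { id: string }) {
  const { data } = useQuery(MY_QUERY, {
    variables: { id },
    // Uncommenting this bypasses InMemoryCache entirely
    // (no cache reads/writes, no watch registration):
    // fetchPolicy: "no-cache",
  });
  return <span>{data?.item.name}</span>;
}
```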

@apollo/client version

3.10.8

@jerelmiller
Member

Hey @JeffreyMJordan 👋

Would you mind checking out the values reported by the memory management section here? https://www.apollographql.com/docs/react/caching/memory-management

The inMemoryCache.executeSelectionSet and inMemoryCache.executeSubSelectedArray caches tend to really slow things down if their limits are hit. Run console.log(client.getMemoryInternals()) to get the values and see whether you're hitting those limits.
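
Something along these lines (getMemoryInternals() is only defined in development builds, so guard against undefined):

```ts
// Only available in development builds of @apollo/client 3.9+.
const internals = client.getMemoryInternals();

// Compare each result cache's current size against its limit:
console.log(internals?.sizes.inMemoryCache);
console.log(internals?.limits["inMemoryCache.executeSelectionSet"]);
console.log(internals?.limits["inMemoryCache.executeSubSelectedArray"]);
```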

@JeffreyMJordan
Author

JeffreyMJordan commented Apr 16, 2025

Hi @jerelmiller, thanks for the reply!

I'm taking these numbers from our internal reproduction (not the codesandbox). The internal repro is very similar to what I provided in the sandbox: I'm just repeating the same query with the same args a certain number of times.

Values in sizes.inMemoryCache are:

  • executeSelectionSet: 7
  • executeSubSelectedArray: 0
  • maybeBroadcastWatch: 0

Values in limits.inMemoryCache are:

  • executeSelectionSet: 50000
  • executeSubSelectedArray: 10000
  • maybeBroadcastWatch: 5000

That said, it does seem likely we're hitting a limit somewhere. I noticed there's a large drop-off in performance between 4,900 and 5,000 queries that doesn't occur when not caching. Below is the full output of client.getMemoryInternals() for 5,000 operations with caching enabled:

{
    "limits": {
        "parser": 1000,
        "canonicalStringify": 1000,
        "print": 2000,
        "documentTransform.cache": 2000,
        "queryManager.getDocumentInfo": 2000,
        "PersistedQueryLink.persistedQueryHashes": 2000,
        "fragmentRegistry.transform": 2000,
        "fragmentRegistry.lookup": 1000,
        "fragmentRegistry.findFragmentSpreads": 4000,
        "cache.fragmentQueryDocuments": 1000,
        "removeTypenameFromVariables.getVariableDefinitions": 2000,
        "inMemoryCache.maybeBroadcastWatch": 5000,
        "inMemoryCache.executeSelectionSet": 50000,
        "inMemoryCache.executeSubSelectedArray": 10000
    },
    "sizes": {
        "print": 5,
        "parser": 9,
        "canonicalStringify": 4,
        "links": [],
        "queryManager": {
            "getDocumentInfo": 8,
            "documentTransforms": []
        },
        "cache": {
            "fragmentQueryDocuments": 0
        },
        "addTypenameDocumentTransform": [
            {
                "cache": 8
            }
        ],
        "inMemoryCache": {
            "executeSelectionSet": 7,
            "executeSubSelectedArray": 0,
            "maybeBroadcastWatch": 0
        },
        "fragmentRegistry": {}
    }
}

I'm wondering if console.log isn't the best tool to diagnose this. Because console.log prints a live view of the object, if the value is mutated between logging and viewing the log, the observed value will differ from the actual value at that point in the program. I confirmed with debugger statements that the watches field was not deduplicating queries with the same document and args; the actual value differed from the value console.log displayed.
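
For anyone else hitting this, snapshotting before logging sidesteps the live-view problem; this is a generic devtools workaround, nothing Apollo-specific:

```ts
// Serialize at call time so later mutations can't change what the
// console shows when the logged entry is expanded.
console.log(JSON.parse(JSON.stringify(client.getMemoryInternals())));
```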

I also can't help but notice that the limit of maybeBroadcastWatch is 5000, and the performance drop-off occurs very close to this limit.

@JeffreyMJordan
Author

JeffreyMJordan commented Apr 16, 2025

My best guess is that the production examples are hitting the maybeBroadcastWatch limit. Can you explain more about how watches works and what this limit does internally? What are the implications of increasing this limit?

My current understanding is that watches is used to track queries that need to be re-fired when there's an update to the cache.

@phryneas
Member

It's safe to increase if you are hitting it with a single displayed page.
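
If you do raise it, note that cacheSizes has to be set before the cache is instantiated, e.g. (the value here is an arbitrary example, not a recommendation):

```ts
import { cacheSizes } from "@apollo/client/utilities";
import { ApolloClient, InMemoryCache } from "@apollo/client";

// Must run before the InMemoryCache is created.
cacheSizes["inMemoryCache.maybeBroadcastWatch"] = 10_000;

const client = new ApolloClient({
  cache: new InMemoryCache(),
  uri: "https://example.com/graphql", // placeholder endpoint
});
```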

All of these limits are memoization cache limits, so they are a tradeoff between memory pressure and not repeating work where the result could be cached.

If the limits are too high, you keep too many computed values in memory in case you might need them again at some point in the future. But if the limits are too low and not everything on screen can be memoized at once, the oldest memoized values will be evicted from the cache while they are still needed.
That means they have to be recalculated, which puts them back into the cache as "new" memoized values, pushing other still-needed values out, and potentially ending up in a feedback loop of recalculations.
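
As a rough illustration of that feedback loop (a simplified sketch, not Apollo's actual implementation):

```ts
// Simplified LRU-style memoization cache. When `capacity` is smaller
// than the working set, every pass over the keys evicts entries just
// before they're needed again, so every lookup becomes a recomputation.
class LruMemo<K, V> {
  private entries = new Map<K, V>();

  constructor(
    private capacity: number,
    private compute: (key: K) => V
  ) {}

  get(key: K): V {
    if (this.entries.has(key)) {
      const value = this.entries.get(key)!;
      // Refresh recency: Map preserves insertion order.
      this.entries.delete(key);
      this.entries.set(key, value);
      return value;
    }
    const value = this.compute(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Evict the least-recently-used entry (oldest insertion).
      this.entries.delete(this.entries.keys().next().value as K);
    }
    return value;
  }
}

// With capacity 5000 and 5000 keys, every pass is all hits; with 5001
// keys, every pass is all misses, which matches a sharp drop-off right
// around the limit.
```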

@JeffreyMJordan
Author

Hi, just following up on this. I increased the cache limits by quite a bit and I'm not seeing any improvement. This issue seems very similar to this one from 2022.

I did patch in this merged PR and I'm still not seeing much improvement. The times we've run into issues in production are all times the client has sent a large volume of queries, all of which are sped up by bypassing InMemoryCache.

@phryneas
Member

phryneas commented May 8, 2025

So, this is independent of a specific version number?
The fact that you put the version number so prominently in the issue made us believe this was related to a regression from upgrading (those memory limits were introduced around 3.9/3.10), so we focused on that route.
