Knowing when library has been reloaded #729
Comments
No library - no problem.
Hi @rgaudin @evrial @kelson42, I added my thoughts in this comment. Requesting your response.
Proposal: Automating Cache Purge with Kafka in Kiwix
Overview
This proposal outlines a Kafka-based architecture to streamline the process of purging the cache when the Kiwix server reloads its library. Instead of invoking the purge API manually from library-maintain.py, this approach leverages Kafka for event-driven communication, enhancing scalability and reliability.
Architecture Design
Components
Workflow
Implementation Details
Kafka Setup
Kiwix Server Integration
Cache Purge Daemon
Benefits
Conclusion
This Kafka-based approach improves the efficiency of cache purging by making it event-driven, reducing manual intervention, and enhancing scalability. By implementing this architecture, Kiwix can achieve a more resilient and automated workflow for managing library updates.
@Optimus-NP We cannot rely on an additional architecture element (like Kafka) to do that. It should work with only Varnish and Kiwix Server.
This is related to #728 but independent.
Our use case is that kiwix-serve sits behind a Varnish cache for library.kiwix.org because it cannot handle the load on its own.
Our cache does have a time-based expiration for all its entries, but that is not relevant here.
Because we frequently publish new ZIM files, we frequently (at most once per hour ATM) regenerate the library XML file.
When we do, we want to invalidate our cache entries related to the Catalog.
Our problem is that we don't know when kiwix-serve has actually reloaded the library and is ready to serve new data.
If we invalidate the cache too soon, chances are an incoming request arrives before the reload, and we would store the old data in the cache instead of the new data.
To work around this, we currently wait 10 s after writing the XML file to disk before purging the cache. Of course, that's arbitrary, ugly and fragile.
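For illustration, the current workaround can be sketched roughly as below. This is only a minimal sketch, not the actual code from library-maintain.py: `publish_library`, its parameters, and the `purge_cache` callable are hypothetical names, and the purge itself (whatever request Varnish is configured to accept) is left to the caller.

```python
import time
from pathlib import Path
from typing import Callable

# The arbitrary delay described above; nothing guarantees it is long enough.
RELOAD_GRACE_SECONDS = 10


def publish_library(
    xml_path: Path,
    xml_content: str,
    purge_cache: Callable[[], None],  # hypothetical: issues the cache purge
    sleep: Callable[[float], None] = time.sleep,
    grace_seconds: float = RELOAD_GRACE_SECONDS,
) -> None:
    """Write the library XML, wait a fixed delay, then purge the cache."""
    # 1. Write the regenerated library XML file to disk.
    xml_path.write_text(xml_content, encoding="utf-8")

    # 2. Fragile step: we only hope kiwix-serve has reloaded the library
    #    by the time this delay has elapsed.
    sleep(grace_seconds)

    # 3. Invalidate the Catalog-related cache entries.
    purge_cache()
```

If kiwix-serve takes longer than the delay to reload, a request arriving right after step 3 still repopulates the cache with old data, which is exactly the race described above.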
How could we be informed that the library has been reloaded?