Event resending: Performance Issues with large number of publications (Batching, Indexing, OoM) #1148

Open
michalkopacz opened this issue Apr 11, 2025 · 0 comments

Hi Spring Modulith Team,

I've tried to use Spring Modulith's event publication mechanism (JPA outbox pattern) and have encountered performance challenges when dealing with a large backlog of unpublished events.

Environment:

Spring Boot Version: 3.4.4
Spring Modulith Version: 1.3.1
Database: PostgreSQL 15
Java Version: 21

Problem Description:

When an external system consuming events is temporarily unavailable, a significant number of events can accumulate in the event_publication table. When the system recovers and Spring Modulith attempts to publish this backlog, I've observed the following issues:

  1. Inefficient Processing: The mechanism appears to fetch all incomplete publications and then process them one at a time, which makes clearing a large backlog slow.
  2. Potential OutOfMemory Errors: Fetching a large number of incomplete publications might lead to OoM errors if the implementation tries to load too many event details into memory simultaneously.
  3. Missing Index: The default PostgreSQL schema appears to lack an optimal index for the JpaEventPublicationRepository.findIncompletePublicationsOlderThan(...) query. This can cause performance degradation (e.g., slow queries, high DB CPU) when the event_publication table grows significantly. An index covering (completion_date, publication_date) might be necessary.

Requested Enhancements:

  1. Implement configurable batch processing for fetching and publishing incomplete events (e.g., using pagination or streaming) to improve throughput and prevent OoM errors during backlog processing.
  2. Review and add an optimized index to the default PostgreSQL schema to support efficient querying of incomplete publications by findIncompletePublicationsOlderThan.
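
To make the first request concrete, here is a minimal sketch of the kind of batched draining I have in mind. It is not Spring Modulith code: `BatchedResubmission`, `Publication`, `drain`, and `fetchFrom` are hypothetical names, the numeric `id` cursor is an assumption made for illustration, and the in-memory list stands in for a paged query against event_publication.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical sketch: drain a backlog in bounded batches using a keyset cursor,
// so at most batchSize publications are held in memory at any time.
public class BatchedResubmission {

    // Hypothetical stand-in for one incomplete row; not a Spring Modulith type.
    record Publication(long id, String payload) {}

    // fetchAfter.apply(lastId, limit) is assumed to return up to `limit`
    // incomplete publications with id > lastId, ordered by id ascending.
    // Returns the number of publications handed to the publisher.
    static int drain(BiFunction<Long, Integer, List<Publication>> fetchAfter,
                     Consumer<Publication> publisher,
                     int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long lastId = Long.MIN_VALUE;
        int handled = 0;
        while (true) {
            List<Publication> batch = fetchAfter.apply(lastId, batchSize);
            if (batch.isEmpty()) {
                return handled; // backlog exhausted
            }
            for (Publication p : batch) {
                publisher.accept(p);
                handled++;
            }
            // Advance the cursor so the loop terminates even if the publisher
            // did not mark an item as completed.
            lastId = batch.get(batch.size() - 1).id();
        }
    }

    // In-memory stand-in for a paged query of the form
    // "... WHERE id > :lastId ORDER BY id LIMIT :limit" (assumed, not the actual query).
    static List<Publication> fetchFrom(List<Publication> table, long afterId, int limit) {
        return table.stream()
                .filter(p -> p.id() > afterId)
                .sorted(Comparator.comparingLong(Publication::id))
                .limit(limit)
                .toList();
    }
}
```

Calling `BatchedResubmission.drain((after, limit) -> BatchedResubmission.fetchFrom(table, after, limit), publisher, 100)` would walk the backlog 100 rows at a time. A real implementation would need a cursor that matches the actual schema (for example the publication date plus a tie-breaker) and a configurable batch size.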

I believe these changes would significantly improve the robustness and performance of the event publication feature in recovery scenarios.

I am also willing to contribute to implementing these changes if the team is open to external contributions in these areas.

Thanks for considering this issue!

@michalkopacz michalkopacz changed the title Event Republication: Performance Issues with large number of publications (Batching, Indexing, OoM) Event resending: Performance Issues with large number of publications (Batching, Indexing, OoM) Apr 11, 2025