
Adding blog for date histogram optimizations #3782


Open · wants to merge 1 commit into base: main

Conversation

jainankitk

Description

The journey of optimizing the date histogram aggregation

Issues Resolved

Closes #3758

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

pajuric commented May 22, 2025

@kolchfa-aws - Please go ahead and provide a technical review for this blog

kolchfa-aws self-assigned this May 23, 2025
Comment on lines +21 to +27
## Understanding the Aggregation Types

Before diving into optimizations, let's quickly cover the fundamentals. OpenSearch supports three main types of aggregations:
1. **Metric Aggregations**: Used for computing statistics like min, max, sum, or average
2. **Bucket Aggregations**: Groups documents together (by date, terms, or ranges)
3. **Pipeline Aggregations**: Combines multiple aggregations, using output from one as input to another
In this blog post, we will focus mainly on date histogram aggregations, which are a subtype of bucket aggregations.

Member: Seems redundant; the function of the date histogram is already clear in the first paragraph and in the section below.
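
To make the excerpt above concrete, here is a minimal sketch of a date histogram (bucket) aggregation request using the opensearch-py client. The index name "logs", field "@timestamp", and the monthly interval are illustrative assumptions, not values taken from the blog draft.

```python
# Minimal sketch of a date histogram (bucket) aggregation request.
# Index name "logs", field "@timestamp", and the monthly interval are
# illustrative assumptions, not taken from the blog draft.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

body = {
    "size": 0,  # skip returning individual hits; we only want bucket counts
    "aggs": {
        "requests_per_month": {
            "date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "month",
            }
        }
    },
}

response = client.search(index="logs", body=body)
for bucket in response["aggregations"]["requests_per_month"]["buckets"]:
    print(bucket["key_as_string"], bucket["doc_count"])
```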


When we began analyzing performance, we ran a single date histogram aggregation query in a loop and collected flame graphs during its execution (see [9310](https://github.com/opensearch-project/OpenSearch/issues/9310)). The flame graphs pointed to two key limitations:
1. **Data Volume Dependency**: Query latency was directly proportional to data volume. For example, a one-month aggregation taking 1 second would take 12 seconds for a year's worth of data.
2. **Bucket Count Impact**: Large numbers of buckets (like in minute-level aggregations) led to hash collision issues, further degrading performance.

Member: I wasn't aware of hash collision issues before. Did we know this was the issue?


By default, OpenSearch processes aggregations by first evaluating each query condition on every shard using Lucene, which builds iterators that identify matching document IDs. These iterators are then combined (e.g., using logical AND) to find documents that satisfy all query filters. The resulting set of matching document IDs is streamed to the aggregation framework, where each document flows through one or more aggregators. These aggregators use Lucene’s doc values (explained above) to efficiently retrieve the field values needed for computation (e.g., for calculating averages or counts). This streaming model is hierarchical—documents pass through a pipeline of aggregators, allowing them to be grouped into top-level and nested buckets simultaneously. For example, a document can first be bucketed by month, then further aggregated by HTTP status code within that month. This design enables OpenSearch to process complex, multi-level aggregations efficiently in a single pass over the matching documents.

## Understanding the Setup

Member: I feel it may be better to organize "How numeric data is stored in OpenSearch" and "How Aggregation works in OpenSearch" under this section.
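
To make the hierarchical streaming model described in the quoted paragraph above concrete, here is a sketch of a multi-level aggregation in which each matching document is first bucketed by month and then by HTTP status code within that month. The field names "@timestamp" and "status" are illustrative assumptions.

```python
# Sketch of a multi-level aggregation: each matching document is first bucketed
# by month and then, within that month, by HTTP status code. Field names
# ("@timestamp", "status") are illustrative assumptions.
nested_aggregation = {
    "size": 0,
    "aggs": {
        "per_month": {
            "date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "month",
            },
            "aggs": {
                "per_status": {
                    "terms": {"field": "status"}
                }
            },
        }
    },
}
```

Each matching document flows through the date histogram aggregator and is then handed to the nested terms aggregator, so both levels of buckets are populated in a single pass.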


This date histogram aggregation query generates a filter corresponding to each bucket and leverages Lucene's Points index, which uses BKD trees, to significantly optimize the aggregation. This tree-based structure organizes data into nodes representing value ranges with associated document counts, enabling efficient traversal. By skipping irrelevant subtrees and leveraging early termination, the system reduces unnecessary disk reads and avoids visiting individual documents. The per-bucket filters are executed in a similar manner to a range query, using the index tree to determine counts for each bucket faster than iterating over document values. A similar approach was also applied to the auto date histogram aggregation, the composite aggregation with a date histogram source, and, later on, the numeric range aggregation with consecutive ranges (TODO: Why do we need consecutive ranges??).

Member: Regarding "Why do we need consecutive ranges": it's the outcome of a date histogram.


Author: Does the numeric range aggregation query also need that? IMO, if consecutive ranges are not a requirement of this approach, we need not make that classification.


Member: The range aggregation doesn't have such a requirement (consecutive ranges). The multi-range traversal algorithm should still be able to support the non-consecutive case with some modification, such as maintaining a list of ranges that are active at the current traversal point.


Author: Are you confusing consecutive with overlapping? A consecutive example is (0,10) (10,20) (20,30) ...; a non-consecutive example is (0,10) (20,30) (40,50) ...
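
For readers following this thread, a small sketch of the distinction using the example ranges above: consecutive ranges share boundaries (each bucket starts where the previous one ends), which is exactly what a date histogram produces, while non-consecutive ranges leave gaps.

```python
# Consecutive ranges, as produced by a date histogram: each bucket's upper
# bound equals the next bucket's lower bound (examples from the comment above).
consecutive = [(0, 10), (10, 20), (20, 30)]

# Non-consecutive ranges, as a user-defined range aggregation may specify:
# there are gaps between buckets.
non_consecutive = [(0, 10), (20, 30), (40, 50)]

def is_consecutive(ranges):
    """Return True if every bucket starts exactly where the previous one ends."""
    return all(prev[1] == cur[0] for prev, cur in zip(ranges, ranges[1:]))

print(is_consecutive(consecutive))      # True
print(is_consecutive(non_consecutive))  # False
```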


![Performance_Improvements_Initial.png](../assets/media/blog-images/2025-05-05-date-histogram-optimizations/Performance_Improvements_Initial.png)

The following diagrams illustrate how documents are counted per histogram bucket using the index tree. To efficiently count documents matching a range (e.g., 351 to 771), the traversal begins at the root, checking whether the target range intersects with the node’s range. If it does, the algorithm recursively explores the left and right subtrees. An important optimization is skipping entire subtrees: if a node’s range falls completely outside the query range (e.g., 1–200), it is ignored. Conversely, if a node’s range is fully contained within the query range (e.g., 401–600), the algorithm can return the document count from that node directly without traversing its children. This allows the engine to avoid visiting all leaf nodes, focusing only on nodes with partial overlaps. As a result, the operation becomes significantly faster—reducing complexity from linear (O(N)) to logarithmic (O(log N)) in many cases—since it leverages the hierarchical structure to skip large irrelevant portions of the tree and aggregate counts efficiently.

Member: I don't feel that O(N) versus O(log N) is a good way to convey the difference; the real time complexity is more complicated than that. Explaining that the tree is used to quickly find the start of the target range and to skip subtrees is already a good, simple explanation.
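
As a rough illustration of the traversal described in the quoted paragraph (a simplified sketch, not the actual Lucene BKD-tree implementation), the following code counts documents in a query range over a small binary tree whose nodes store their value range and document count. Subtrees entirely outside the query range are skipped, and subtrees fully contained in it contribute their stored count without being descended into.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    # Value range covered by this subtree and the number of documents under it.
    min_value: int
    max_value: int
    doc_count: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    values: List[int] = field(default_factory=list)  # populated on leaf nodes only

def count_in_range(node: Optional[Node], lo: int, hi: int) -> int:
    """Count documents with values in [lo, hi], skipping irrelevant subtrees."""
    if node is None or node.max_value < lo or node.min_value > hi:
        return 0  # completely outside the query range: skip this subtree
    if lo <= node.min_value and node.max_value <= hi:
        return node.doc_count  # fully contained: use the stored count, no descent
    if node.left is None and node.right is None:
        # Partially overlapping leaf: fall back to checking individual values.
        return sum(1 for v in node.values if lo <= v <= hi)
    return count_in_range(node.left, lo, hi) + count_in_range(node.right, lo, hi)

# Tiny example using the query range from the excerpt (351 to 771).
leaf_a = Node(1, 200, 3, values=[5, 150, 200])
leaf_b = Node(201, 400, 2, values=[250, 399])
root = Node(1, 400, 5, left=leaf_a, right=leaf_b)
print(count_in_range(root, 351, 771))  # 1: leaf_a is skipped, leaf_b is partially scanned
```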

Comment on lines +182 to +184
![MRT_Initial.png](../assets/media/blog-images/2025-05-05-date-histogram-optimizations/MRT_Initial.png)
![MRT_Middle.png](../assets/media/blog-images/2025-05-05-date-histogram-optimizations/MRT_Middle.png)
![MRT_Final.png](../assets/media/blog-images/2025-05-05-date-histogram-optimizations/MRT_Final.png)

Member: It seems there is no walkthrough explanation of these diagrams.


Author: Yeah, @kolchfa-aws also gave that feedback. Let me work on addressing it.

kolchfa-aws (Collaborator): @jainankitk Please let me know when you have addressed the tech review comments and this PR is ready for my review. Thanks!

kolchfa-aws added the "Tech review" label on Jun 3, 2025
Successfully merging this pull request may close these issues: [BLOG] Optimizing Date Histogram Aggregation Using the Index