Description
Is your feature request related to a problem? Please describe
Background
Currently, a data node performs both indexing and searching, so the two workloads interfere with each other. An expensive query can monopolize memory and CPU, causing indexing requests to fail, and vice versa. Additionally, scaling read traffic typically involves adding more replicas, which can slow down indexing and reduce throughput. Separating indexing from search would therefore improve both read and write performance. It would also allow each to scale independently: for example, additional resources can be allocated to indexing during data ingestion, while search capacity can be scaled separately to handle query load.
Describe the solution you'd like
At a high level, there are two approaches to separating indexing and search: node/role level and instance/cluster level.
Node/Role level separation
To separate indexing and search, we would build on the new “search” node role, which is split out from the existing data role so that the data role focuses on indexing only. Nodes with the “search” role would act as dedicated search nodes.
With remote storage, committed data is kept as segments and uncommitted data is added to the translog. To maintain consistency, the same semantics apply when storing data in the remote store: data from the local translog is backed up to the remote translog store with each indexing operation, and whenever new segments are created during refresh, flush, or merge, they are uploaded to the remote segment store.
The control plane runs as active-standby for redundancy and would have a built-in auto-failover mechanism.
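The write path described above can be sketched roughly as follows. This is a minimal in-memory illustration, not OpenSearch code: `RemoteStore`, `IndexingShard`, and the segment naming are hypothetical, and flush, merge, and translog trimming are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RemoteStore:
    """Hypothetical stand-in for the remote translog and segment stores."""
    translog: list = field(default_factory=list)
    segments: dict = field(default_factory=dict)


@dataclass
class IndexingShard:
    """Toy primary shard on a data node (assumption, not an OpenSearch class)."""
    remote: RemoteStore
    local_translog: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    generation: int = 0

    def index(self, doc: dict) -> None:
        # Uncommitted data goes to the local translog and is backed up to
        # the remote translog store with each indexing operation.
        self.local_translog.append(doc)
        self.remote.translog.append(doc)
        self.buffer.append(doc)

    def refresh(self) -> Optional[str]:
        # A refresh turns buffered documents into a new segment, which is
        # then uploaded to the remote segment store.
        if not self.buffer:
            return None
        self.generation += 1
        name = f"segment_{self.generation}"
        self.remote.segments[name] = list(self.buffer)
        self.buffer.clear()
        return name


# Usage: two indexing operations followed by one refresh.
remote = RemoteStore()
shard = IndexingShard(remote)
shard.index({"id": 1})
shard.index({"id": 2})
segment_name = shard.refresh()
```

After the refresh, the remote segment store holds one segment containing both documents, and the remote translog holds one entry per indexing operation.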
The search node directly downloads indexed data from remote storage and executes search operations, including aggregations. It operates in active-active mode to ensure availability during failures. The refresh interval should be configurable according to system limits.
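As a rough sketch of the read side, a search node could poll the remote segment store no more often than a configurable refresh interval and serve queries from the segments it has downloaded. Everything here is an assumption for illustration: `SearchNode`, `maybe_sync`, and the dictionary used as a remote segment store are hypothetical names, not OpenSearch APIs.

```python
from dataclasses import dataclass, field


@dataclass
class SearchNode:
    """Toy search node; class and field names are hypothetical."""
    refresh_interval_s: float = 10.0
    local_segments: dict = field(default_factory=dict)
    last_sync_s: float = float("-inf")

    def maybe_sync(self, remote_segments: dict, now_s: float) -> list:
        # Skip the download if the refresh interval has not elapsed yet.
        if now_s - self.last_sync_s < self.refresh_interval_s:
            return []
        # Download only the segments that are not available locally.
        new_names = sorted(set(remote_segments) - set(self.local_segments))
        for name in new_names:
            self.local_segments[name] = list(remote_segments[name])
        self.last_sync_s = now_s
        return new_names

    def search(self, field_name: str, value) -> list:
        # Search only the locally downloaded segments.
        return [
            doc
            for segment in self.local_segments.values()
            for doc in segment
            if doc.get(field_name) == value
        ]


# Usage: the second sync is skipped because only 2s of the 5s interval passed.
remote_segments = {"segment_1": [{"id": 1, "tag": "a"}]}
node = SearchNode(refresh_interval_s=5.0)
first = node.maybe_sync(remote_segments, now_s=0.0)
remote_segments["segment_2"] = [{"id": 2, "tag": "a"}]
second = node.maybe_sync(remote_segments, now_s=2.0)
third = node.maybe_sync(remote_segments, now_s=6.0)
hits = node.search("tag", "a")
```

A shorter interval reduces search staleness at the cost of more frequent remote store reads, which is why the interval should be bounded by system limits.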
Requirements:
- Traffic separation: separate the coordinator role, or consider a proxy layer on top to handle routing.
- Cluster status (Green/Yellow/Red):
a. Today, any unassigned shard copy changes the cluster status to red (primary) or yellow (replica). With separation, we would have finer-grained status for indexing and search. For example, when the primary fails to serve write traffic, there would be an indicator such as Indexing Unhealthy/Unavailable. The same applies to search: if any or all replicas fail, the status should indicate a search failure.
- Shard allocation strategy and zone/rack awareness:
a. Today, OpenSearch follows a set of resiliency policies and allocation preferences based on the primary/replica architecture. With separation, shard allocation would be based on role (search/data). For awareness, primary shards would need to account for primary active/standby (avoid allocating the primary active and standby in the same zone/rack). Search shards should be distributed across different zones/racks.
- Auto-failover mechanism for primary active and standby: when the primary active fails to serve traffic, it should automatically fail over to the primary standby for indexing durability.
- Consistency guarantee:
a. Ensure consistency guarantees comparable to today's by having a search replica shard monitor its synchronization with the data node. If the replica is out of date, it can redirect or fall back the request to the data node. Options could include requiring strict consistency, allowing a maximum lag of X, and so on.
- Lightweight snapshot
Related component
Other
Describe alternatives you've considered
Cluster / domain separation
Alternatively, similar indexing and search separation can be achieved through cross-cluster replication (CCR) with segment replication. With CCR (segrep), the leader cluster would mainly handle indexing/writes while the follower cluster stays in sync through segment replication. All indexing requests would be routed to the leader cluster and search requests to the follower cluster.
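The routing rule for this alternative can be sketched as a small function that a proxy or client might apply. The cluster URLs and the `pick_cluster` helper are hypothetical, and the set of endpoints treated as search traffic is an assumption that a real deployment would need to extend.

```python
# Path segments treated as read/search traffic (assumed, not exhaustive).
SEARCH_ENDPOINTS = {"_search", "_msearch", "_count"}


def pick_cluster(path: str, leader_url: str, follower_url: str) -> str:
    # Drop the query string, then inspect the individual path segments.
    segments = [s for s in path.split("?", 1)[0].split("/") if s]
    is_search = any(s in SEARCH_ENDPOINTS for s in segments)
    # Search requests go to the follower; everything else goes to the leader.
    return follower_url if is_search else leader_url


# Usage with placeholder cluster addresses.
LEADER = "https://leader.example:9200"
FOLLOWER = "https://follower.example:9200"
search_target = pick_cluster("/logs/_search?q=error", LEADER, FOLLOWER)
index_target = pick_cluster("/logs/_doc/1", LEADER, FOLLOWER)
bulk_target = pick_cluster("/_bulk", LEADER, FOLLOWER)
```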
Additional context
No response