[RFC] Indexing and Search Separation #14596

Open
@amberzsy

Description

Is your feature request related to a problem? Please describe

Background
Currently, a data node performs both indexing and searching, so the two workloads interfere with each other. An expensive query can monopolize memory and CPU, causing indexing requests to fail, and vice versa. In addition, scaling read traffic typically means adding more replicas, which can slow down indexing and reduce throughput. Separating indexing from search would improve both read and write performance and allow each to scale independently. For example, additional resources can be allocated to indexing during heavy data ingestion, while search can be scaled separately to handle query load.

Describe the solution you'd like

At a high level, there are two approaches to separating indexing and search: at the node/role level, or at the instance/cluster level.

Node/Role level separation
To separate indexing and search, we would build on the new “search” node role, splitting it out from the existing data role, which would then focus on indexing only. Nodes with the “search” role would act as dedicated search nodes.
[Image: node/role-level separation architecture diagram]

With remote storage, committed data is kept as segments and uncommitted data is added to the translog. To maintain consistency, the same semantics apply when storing data in the remote store: data from the local translog is backed up to the remote translog store with each indexing operation, and whenever new segments are created during refresh, flush, or merge, they are uploaded to the remote segment store.
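
To make the write-path semantics above concrete, here is a minimal in-memory sketch. It is a toy model, not OpenSearch code: `RemoteStore`, `IndexingShard`, and their methods are hypothetical names, segments are modelled as plain tuples of documents, merges are omitted, and the translog trimming on flush is an assumption of the sketch rather than something this proposal specifies.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteStore:
    """Hypothetical stand-in for the remote translog and remote segment stores."""
    translog: list = field(default_factory=list)
    segments: list = field(default_factory=list)


@dataclass
class IndexingShard:
    """Toy model of a primary shard on a data node, backed by a remote store."""
    remote: RemoteStore
    translog: list = field(default_factory=list)  # uncommitted operations
    buffer: list = field(default_factory=list)    # docs not yet in a segment
    segments: list = field(default_factory=list)  # local segments

    def index(self, doc):
        # Every indexing operation is appended to the local translog
        # and backed up to the remote translog store.
        self.translog.append(doc)
        self.remote.translog.append(doc)
        self.buffer.append(doc)

    def refresh(self):
        # A refresh turns buffered docs into a new segment and uploads it.
        if not self.buffer:
            return
        segment = tuple(self.buffer)
        self.buffer.clear()
        self.segments.append(segment)
        self.remote.segments.append(segment)

    def flush(self):
        # Assumption: a flush commits everything into segments, after which
        # the local and remote translogs can be trimmed.
        self.refresh()
        self.translog.clear()
        self.remote.translog.clear()
```

After two `index()` calls the remote translog holds both operations and the remote segment store is empty; `refresh()` uploads one segment, and `flush()` trims the translogs.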

The control plane runs as active-standby for redundancy and would have a built-in auto-failover mechanism.
The search node downloads indexed data directly from remote storage and executes search operations, including aggregations. It operates in active-active mode to ensure availability during failures. The refresh interval should be configurable within system limits.
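
The search-node behaviour described above could look roughly like the following sketch. It is illustrative only: `SearchReplica`, `maybe_sync`, and the dictionary used as a remote segment store are hypothetical, and the caller passes in the clock value to keep the example deterministic.

```python
class SearchReplica:
    """Toy model of a search node that pulls segments from a remote store."""

    def __init__(self, refresh_interval_s: float):
        self.refresh_interval_s = refresh_interval_s  # configurable refresh interval
        self.segments = {}                            # segment name -> list of docs
        self._last_sync = None

    def maybe_sync(self, remote_segments: dict, now: float) -> list:
        """Download missing segments, at most once per refresh interval.

        Returns the names of the segments downloaded in this call.
        """
        if self._last_sync is not None and now - self._last_sync < self.refresh_interval_s:
            return []
        self._last_sync = now
        # Drop local segments that no longer exist remotely (e.g. merged away).
        for name in list(self.segments):
            if name not in remote_segments:
                del self.segments[name]
        new = [name for name in remote_segments if name not in self.segments]
        for name in new:
            self.segments[name] = list(remote_segments[name])
        return new

    def search(self, predicate):
        # Search runs only against locally downloaded segments.
        return [doc for docs in self.segments.values() for doc in docs if predicate(doc)]
```

A shorter refresh interval makes new segments searchable sooner at the cost of more frequent remote-store reads, which is why the interval is left configurable.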

Requirements:

  1. Traffic separation: separate the coordinator role, or consider adding a proxy layer on top to handle routing.
  2. Cluster status (Green/Yellow/Red):
    a. Today, any unassigned shard copy changes the cluster status to red (primary) or yellow (replica). With separation, we would have finer-grained status for indexing and search. For example, when a primary fails to serve write traffic, there would be an indicator such as Indexing Unhealthy/Unavailable. The same applies to search: if any or all replicas fail, the status should indicate search failure.
  3. Shard allocation strategy and zone/rack awareness:
    a. Today, OpenSearch follows a set of resiliency policies and allocation preferences based on the primary/replica architecture. With separation, shard allocation would be based on role (search/data). For awareness, the primary active and primary standby should not be allocated to the same zone/rack. Search shards should be distributed across different zones/racks.
  4. Auto-failover for primary active and standby: when the primary active fails to serve traffic, it should automatically fail over to the primary standby to keep indexing durable.
  5. Consistency guarantees:
    a. Ensure consistency guarantees comparable to today's by having each search replica shard monitor its synchronization with the data node. If the replica is out of date, it can redirect the request to, or fall back to, the data node. Options could include requiring strict consistency, allowing a maximum lag of X, and so on.
  6. Lightweight snapshot
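
As an illustration of requirement 5, the fallback decision could be sketched as below. This is only one possible shape, under stated assumptions: lag is measured as the difference between checkpoints (a hypothetical monotonically increasing counter), and `ShardCopy` and `pick_target` are made-up names, not existing OpenSearch APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ShardCopy:
    node: str
    checkpoint: int  # highest operation visible on this copy (assumed counter)


def pick_target(primary: ShardCopy,
                replicas: List[ShardCopy],
                max_lag: Optional[int]) -> ShardCopy:
    """Choose which copy should serve a search request.

    max_lag=0 means strict consistency, max_lag=X allows replicas that are
    at most X operations behind, and max_lag=None accepts any replica.
    Falls back to the data node (primary) when no replica qualifies.
    """
    if max_lag is None:
        eligible = list(replicas)
    else:
        eligible = [r for r in replicas
                    if primary.checkpoint - r.checkpoint <= max_lag]
    if not eligible:
        return primary
    # Prefer the freshest eligible search replica.
    return max(eligible, key=lambda r: r.checkpoint)
```

With a primary at checkpoint 100 and search replicas at 90 and 97, `max_lag=5` routes to the replica at 97, while `max_lag=0` falls back to the data node.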

Related component

Other

Describe alternatives you've considered

Cluster / domain separation
Alternatively, similar indexing and search separation can be achieved through cross-cluster replication (CCR) with segment replication. With CCR (segrep), the leader cluster mostly handles indexing/writes while the follower cluster stays in sync through segment replication. All indexing requests would be routed to the leader cluster and search requests to the follower cluster.

[Image: Snip20240627_71]

Comparison:
[Image: comparison of the two approaches]

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    - Indexing: Indexing, Bulk Indexing and anything related to indexing
    - Indexing & Search
    - RFC: Issues requesting major changes
    - Roadmap:Modular Architecture: Project-wide roadmap label
    - Search: Search query, autocomplete ...etc
    - Storage:Remote
    - enhancement: Enhancement or improvement to existing feature or request
    - v3.0.0: Issues and PRs related to version 3.0.0

    Type

    No type

    Projects

    Status

    🏗 In progress

    Status

    🆕 New

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
