[Joins] Join Query DSL #15450

Open
@harshavamsi

Description

Is your feature request related to a problem? Please describe

Following up on #15185, we want to introduce the join DSL format that will be used to construct the join query. It will use the existing QueryBuilders within OpenSearch to parse the left and right queries. We will add new logic to SearchSourceBuilder to support the new join field in the query DSL.

Describe the solution you'd like

The join field will be parsed by a new JoinBuilder in OpenSearch that will take in the following:

  • left query (just query from the builder's perspective) - read from StreamInput and parsed into a Lucene query
  • fields - left query fields to broadcast back
  • join - XContent object containing
    • right_query
      • index - right index to perform the join against
      • query - right Lucene query
      • fields - right query fields to join on
    • type - type of join to perform (inner, outer, cross, left_join, right_join)
    • algorithm - type of join algorithm to use (hash_join, nested_join); we might support only one type to start with
    • condition - the join condition to evaluate while joining
      • left_field - left index field to use while evaluating
      • right_field - right index field to use while evaluating
      • comparator - the operator to use for the condition (<, <=, >, >=, =) // should this be just text?
    • fields - right query fields to broadcast back
    • aggs - aggregations to perform while joining // should this be outside the join clause?
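
To make the shape of the join object concrete, here is a minimal Python sketch of how it could be validated and turned into a typed structure. It is not the proposed JoinBuilder: JoinSpec, JoinCondition and parse_join are hypothetical names, the nesting follows the full query DSL example in this issue, and the rule that only a cross join may omit the condition is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Allowed values come from the field list above; all class and function
# names below are hypothetical and are not OpenSearch APIs.
JOIN_TYPES = {"inner", "outer", "cross", "left_join", "right_join"}
JOIN_ALGORITHMS = {"hash_join", "nested_join"}
COMPARATORS = {"<", "<=", ">", ">=", "="}


@dataclass
class JoinCondition:
    left_field: str
    right_field: str
    comparator: str


@dataclass
class JoinSpec:
    right_index: str
    right_query: Dict[str, Any]
    right_fields: List[str]
    join_type: str
    condition: Optional[JoinCondition] = None
    algorithm: Optional[str] = None  # marked optional in the proposal
    fields: List[str] = field(default_factory=list)
    aggs: Dict[str, Any] = field(default_factory=dict)


def parse_join(body: Dict[str, Any]) -> JoinSpec:
    """Validate a `join` object and convert it into a JoinSpec."""
    join_type = body["type"]
    if join_type not in JOIN_TYPES:
        raise ValueError(f"unsupported join type: {join_type!r}")

    algorithm = body.get("algorithm")
    if algorithm is not None and algorithm not in JOIN_ALGORITHMS:
        raise ValueError(f"unsupported join algorithm: {algorithm!r}")

    condition = None
    if "condition" in body:
        raw = body["condition"]
        if raw["comparator"] not in COMPARATORS:
            raise ValueError(f"unsupported comparator: {raw['comparator']!r}")
        condition = JoinCondition(
            raw["left_field"], raw["right_field"], raw["comparator"]
        )
    elif join_type != "cross":
        # Assumption: only a cross join may omit the condition.
        raise ValueError("condition is required for non-cross joins")

    right = body["right_query"]
    return JoinSpec(
        right_index=right["index"],
        right_query=right["query"],
        right_fields=list(right.get("fields", [])),
        join_type=join_type,
        condition=condition,
        algorithm=algorithm,
        fields=list(body.get("fields", [])),
        aggs=dict(body.get("aggs", {})),
    )
```

Rejecting unknown type, algorithm and comparator values at parse time keeps malformed requests from reaching the join execution path; whether the comparator stays symbolic or becomes text is still the open question noted in the list.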

Full query DSL

{  
  "query": {  
    "bool": {  
      "filter": [  
        {  
          "range": {  
            "@timestamp": {  
              "gte": "now-1h"  
            }  
          }  
        },  
        {  
          "match": {  
            "message": "error"  
          }  
        }  
      ]  
    }  
  },  
  "fields": ["instance_id", "status_code"],  
  "join": {  
    "right_query": {
      "index": "instance_details",
      "query": {
        "range": {
          "created_at": {
            "gte": "now-1y"
          }
        }
      },
      "fields": ["instance_id", "region"]
    },
    "type": "inner",
    "algorithm": "hash_join", // optional
    "condition": {
      "left_field": "instance_id",
      "right_field": "instance_id",
      "comparator": "="
    },
    "fields": ["region", "status_code"],
    "aggs": {  
      "by_region": {  
        "terms": {  
          "field": "region"  
        },  
        "aggs": {  
          "by_status_code": {  
            "terms": {  
              "field": "status_code"  
            },  
            "aggs": {  
              "status_code_count": {  
                "value_count": {  
                  "field": "status_code"  
                }  
              }  
            }  
          }  
        }  
      }  
    }  
  }  
}
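
For intuition about what the request above asks for, here is a rough Python sketch of an inner hash join with the "=" comparator, followed by a count per (region, status_code) pair similar to the nested terms aggregations. It works on in-memory dictionaries only and says nothing about how OpenSearch would execute the join across shards; hash_join and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def hash_join(left_docs, right_docs, left_field, right_field, fields):
    """Inner equi-join: build a hash table on the right side, probe with the left."""
    table = defaultdict(list)
    for doc in right_docs:
        if right_field in doc:
            table[doc[right_field]].append(doc)

    joined = []
    for left in left_docs:
        if left_field not in left:
            continue  # a document without the join key cannot match
        for right in table.get(left[left_field], []):
            merged = {**right, **left}  # left values win on name clashes
            joined.append({name: merged[name] for name in fields if name in merged})
    return joined


# Hypothetical hits, standing in for the results of the left and right queries.
left_hits = [
    {"instance_id": "i-1", "status_code": 500},
    {"instance_id": "i-2", "status_code": 503},
    {"instance_id": "i-3", "status_code": 500},  # no matching instance_details doc
]
right_hits = [
    {"instance_id": "i-1", "region": "us-east-1"},
    {"instance_id": "i-2", "region": "us-west-2"},
]

rows = hash_join(left_hits, right_hits, "instance_id", "instance_id",
                 ["region", "status_code"])

# Rough analogue of by_region -> by_status_code -> status_code_count.
counts = Counter((row.get("region"), row.get("status_code")) for row in rows)
```

Building the table on the right side is an arbitrary choice in this sketch; a real implementation would more likely build on whichever side is smaller.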

Related component

Search:Query Capabilities

Describe alternatives you've considered

No response

Additional context

No response
