
[BUG] PPL Join limitations #3779

Open
@noCharger

What is the bug?

A PPL join fails with a CircuitBreakingException even after the circuit breaker limits have been raised. The join appears to consume excessive memory when joining a large dataset on a timestamp field.

How can one reproduce the bug?
Steps to reproduce the behavior:

Enable Calcite:

curl -XPUT "http://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d'{
  "persistent": {
    "plugins.calcite.enabled": true
  }
}'

Raise the circuit breaker limits to near their maximum:

curl -XPUT "http://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d'{
  "persistent": {
    "indices.breaker.fielddata.limit": "95%",
    "indices.breaker.total.limit": "95%",
    "indices.breaker.request.limit": "90%",
    "indices.breaker.fielddata.overhead": "1.03",
    "indices.breaker.request.overhead": "1.0"
  }
}'
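To check whether the raised limits are in effect on the node, the breaker limits and current estimates can be read back from the node stats API. This is a minimal sketch, assuming the cluster is reachable at the same address used in the steps above. The `breaker_stats` function name and the `OPENSEARCH_URL` variable are illustrative and not part of the original report.

```shell
# Hypothetical helper: print the circuit breaker section of the node stats.
# OPENSEARCH_URL is an assumed variable; it defaults to the address used in this report.
breaker_stats() {
  curl -s -XGET "${OPENSEARCH_URL:-http://localhost:9200}/_nodes/stats/breaker?pretty"
}
```

Calling `breaker_stats` should list each breaker (such as `fielddata`, `request`, and `parent`) per node, with its `limit_size_in_bytes` and `estimated_size_in_bytes`, which can be compared against the limit quoted in the error.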

Execute a simple JOIN query:

curl -XPOST "http://localhost:9200/_plugins/_ppl/" \
-H "Content-Type: application/json" \
-d'{
  "query": "source = big5 | left join on @timestamp = @timestamp [source = big5 | where `event.id` = '\''ERROR'\'' | stats count() by span(@timestamp, 1h)]"
}'

Observe the circuit breaker error:

{
  "error": {
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: OpenSearchException[java.util.concurrent.ExecutionException: CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [859053756/819.2mb], which is larger than the limit of [858993459/819.1mb]]]",
    "type": "SearchPhaseExecutionException"
  },
  "status": 500
}
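The two human-readable sizes in the error round to almost the same value, so it can help to compare the raw byte counts. The sketch below extracts them from the `details` string and prints the difference. The `breaker_overage` name is hypothetical, and the parsing assumes the message keeps the `[<bytes>/<size>]` layout shown above.

```shell
# Hypothetical helper: read an error "details" string on stdin and print how many
# bytes the attempted allocation would have exceeded the breaker limit by.
breaker_overage() {
  # The byte counts are the only digit runs directly followed by "/".
  set -- $(grep -oE '[0-9]+/' | tr -d '/')
  echo $(( $1 - $2 ))
}

details='CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [859053756/819.2mb], which is larger than the limit of [858993459/819.1mb]]'
printf '%s\n' "$details" | breaker_overage
# prints: 60297
```

The request tripped the `fielddata` breaker by 60297 bytes (roughly 59 KiB), so the breaker fired right at the limit rather than on one large allocation beyond it.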

What is the expected behavior?

  • Successfully execute join without hitting circuit breaker limits
  • Handle large datasets efficiently
  • Provide memory-efficient execution of joins

What is your host/environment?

  • OS: Linux
  • Version: 3.1
  • Plugins

Do you have any additional context?
Using the standard big5 index from OpenSearch Benchmark (OSB).

curl localhost:9200/_cat/indices
green open big5 Ta16685cTqeehNeEZ86wmw 1 0 116000000 0 25.8gb 25.8gb
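The `_cat/indices` line above has no header row. As a reading aid, the sketch below labels the relevant fields, assuming the default column order (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size). The `label_cat_indices` name is hypothetical.

```shell
# Hypothetical helper: label selected default _cat/indices columns for each line on stdin.
label_cat_indices() {
  awk '{ printf "index=%s primaries=%s replicas=%s docs.count=%s store.size=%s\n", $3, $5, $6, $7, $9 }'
}

printf '%s\n' 'green open big5 Ta16685cTqeehNeEZ86wmw 1 0 116000000 0 25.8gb 25.8gb' | label_cat_indices
# prints: index=big5 primaries=1 replicas=0 docs.count=116000000 store.size=25.8gb
```

The index therefore holds 116,000,000 documents in a single primary shard with no replicas. Appending `?v` to the `_cat/indices` request prints the header row directly.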

Metadata

Assignees

No one assigned

    Labels

    PPL (Piped processing language), calcite (calcite migration related), enhancement (New feature or request)

    Type

    No type

    Projects

    Status

    In progress

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
