Running terms query with filter context causes a maxClauseCount error

I'm trying to run a terms query and use its results as a filter on some aggregations I run. It looks like this:

{
  "aggregations": {
    "aggs_name": {
      "aggregations": {
        ...
      },
      "filter": {
        "bool": {
          "filter": [
            {
              "terms": {
                "node_type": [
                  "type_a",
                  "type_b"
                ]
              }
            },
            {
              "terms": {
                "some_field": {
                  "id": "HUGE_DOC",
                  "index": "lables",
                  "path": "field_name",
                  "type": "_doc"
                }
              }
            }
          ]
        }
      }
    }
  },
  "query": {
    ...
  },
  "size": 0
}

The HUGE_DOC contains a field with something like 15k IP addresses, which I need to filter by.
Trying to run this query raises:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "too_many_clauses",
        "reason" : "maxClauseCount is set to 1024"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "index_name ",
        "node" : "eX4_hQltQZWZ013rCm8MSA",
        "reason" : {
          "type" : "too_many_clauses",
          "reason" : "maxClauseCount is set to 1024"
        }
      }
    ],
    "caused_by" : {
      "type" : "too_many_clauses",
      "reason" : "maxClauseCount is set to 1024",
      "caused_by" : {
        "type" : "too_many_clauses",
        "reason" : "maxClauseCount is set to 1024"
      }
    }
  },
  "status" : 500
}

What can I do in order to filter with a terms query that has a large number of values?

Hi,

The limit you are hitting is built into Lucene's BooleanQuery. Using more than 1024 clauses in such a query is considered an anti-pattern and should be avoided, because it can harm the performance of the whole cluster. You can, for example, try running several consecutive queries, each filtering on a smaller subset of the ids that you are now trying to filter on in one big query. However, there is a setting you can use to increase this limit. It should be available since at least version 5.6, but it's not widely documented and should be used with caution. I think it's a node-level setting called "indices.query.bool.max_clause_count", so you would need to set it to a higher value on all your nodes. See https://github.com/elastic/elasticsearch/pull/18341 for further reference.
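
To illustrate the "several smaller queries" idea, here is a minimal Python sketch. It assumes you have already read the list of IP addresses out of HUGE_DOC on the client side, and it only builds the filter bodies; sending them and merging the aggregation results is left to you. The names `chunked`, `build_filter_bodies` and `MAX_TERMS_PER_QUERY` are made up for this example, and the batch size of 1000 is just an assumption that leaves some headroom below 1024.

```python
from typing import Any, Dict, Iterator, List, Sequence

# Assumption: keep each batch a little below the default limit of 1024 clauses.
MAX_TERMS_PER_QUERY = 1000


def chunked(values: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `values`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(values), size):
        yield list(values[start:start + size])


def build_filter_bodies(
    ips: Sequence[str],
    field: str = "some_field",
    size: int = MAX_TERMS_PER_QUERY,
) -> List[Dict[str, Any]]:
    """Build one filter body per batch, mirroring the filter from the question."""
    return [
        {
            "bool": {
                "filter": [
                    {"terms": {"node_type": ["type_a", "type_b"]}},
                    # Inline values replace the terms lookup on HUGE_DOC.
                    {"terms": {field: batch}},
                ]
            }
        }
        for batch in chunked(ips, size)
    ]
```

Each returned body would go in place of the "filter" object of "aggs_name" in a separate request. Keep in mind that the per-batch results then have to be combined on the client. Document counts and sums add up cleanly only if no document matches more than one batch, and things like cardinality or percentiles cannot simply be added together.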
