Get partial results in case of too_many_clauses

Renat_Nasyrov · December 29, 2018, 1:57pm

Hello!
We use Elasticsearch to index small documents, making heavy use of the edge n-gram tokenizer and of synonyms at index time. Certain queries expand into a huge number of clauses, and we get the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "too_many_clauses",
        "reason": "too_many_clauses: maxClauseCount is set to 100000"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "my_index_name",
        "node": "KE0mOUoSQIaDtXE4KpeHeQ",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n  \"multi_match\" : {\n    \"query\" : \"weird query generating lots of clauses\",\n    \"fields\" : [\n      \"attr_1^4.0\",\n      \"attr_2^6.0\",\n      \"exact_attr_1^20.0\",\n      \"exact_attr_2^30.0\",\n      \"exact_name^20.0\",\n      \"name^3.0\"\n    ],\n    \"type\" : \"most_fields\",\n    \"operator\" : \"OR\",\n    \"slop\" : 0,\n    \"prefix_length\" : 0,\n    \"max_expansions\" : 1,\n    \"zero_terms_query\" : \"NONE\",\n    \"auto_generate_synonyms_phrase_query\" : true,\n    \"fuzzy_transpositions\" : true,\n    \"boost\" : 1.0\n  }\n}",
          "index_uuid": "WE3OBFBISoqVGzbkiP5qxQ",
          "index": "my_index_name",
          "caused_by": {
            "type": "too_many_clauses",
            "reason": "too_many_clauses: maxClauseCount is set to 100000"
          }
        }
      }
    ],
    "caused_by": {
      "type": "too_many_clauses",
      "reason": "too_many_clauses: maxClauseCount is set to 100000"
    }
  },
  "status": 400
}

Our server-side maxClauseCount is already set to 100000, which is large. The query-time parameter max_expansions does not constrain the clause count at all. In any case, we would like to get results even if only a subset of the clauses is evaluated. Is this possible?

cbuescher · January 2, 2019, 10:09am

No, this is not possible. The clause check is done before the actual query is executed, to protect the server from being loaded with such a huge query. Running the query on only a subset of clauses usually defeats the purpose of full text search (or any kind of retrieval), since you don't know and cannot really control which clauses actually make it into the query.

I think the best course of action would be to determine what kind of client-side search causes the clause count to exceed 100,000 and try to rephrase that query so it doesn't blow up that easily.
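
One way to "rephrase" on the client side is to catch the too_many_clauses error and retry with a shorter query string. The sketch below is only an illustration under assumptions, not something the thread prescribes: `search` is a hypothetical callable that sends a request body to the cluster and returns the parsed JSON response, the helper names are made up, and dropping trailing terms is an arbitrary policy that the application would have to choose deliberately. The field list is taken from the error output above.

```python
from typing import Callable

# Fields and boosts copied from the failing multi_match query in the error output.
FIELDS = ["attr_1^4", "attr_2^6", "exact_attr_1^20",
          "exact_attr_2^30", "exact_name^20", "name^3"]


def build_query(text: str, max_terms: int) -> dict:
    """Build a multi_match body from at most max_terms whitespace-separated terms."""
    terms = text.split()[:max_terms]
    return {
        "query": {
            "multi_match": {
                "query": " ".join(terms),
                "fields": FIELDS,
                "type": "most_fields",
            }
        }
    }


def is_too_many_clauses(response: dict) -> bool:
    """Return True if the response carries a too_many_clauses root cause."""
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    return any(cause.get("type") == "too_many_clauses"
               for cause in error.get("root_cause", []))


def search_with_fallback(search: Callable[[dict], dict],
                         text: str, min_terms: int = 1) -> dict:
    """Retry with half as many terms each time the clause limit is hit.

    `search` is a hypothetical function (an assumption of this sketch) that
    sends the body to Elasticsearch and returns the parsed JSON response.
    """
    max_terms = max(len(text.split()), min_terms)
    while True:
        response = search(build_query(text, max_terms))
        # Stop on success, on any other error, or when we cannot shrink further.
        if not is_too_many_clauses(response) or max_terms <= min_terms:
            return response
        max_terms = max(max_terms // 2, min_terms)
```

Here the client, not the server, decides which terms are dropped, so the choice of clauses is at least explicit. Every failed attempt still costs a round trip, so finding out which inputs trigger the explosion remains the better long-term fix.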

