Tweaking query to avoid "too_many_clauses"


I have a query that fails with too many clauses, and I was wondering whether there are optimizations that would let me keep my data as-is but make the query succeed. My data consists of two indices, contracts and products, where:

  1. A contract is a customer code and a list of product codes available to that customer
  2. Products are a product code along with all the metadata (description, price, etc.)

A contract, once indexed, might look like this:

        {
          "_index": "contracts",
          "_type": "customer",
          "_id": "1-000001",
          "_score": 1,
          "_source": {
            "product_id": [ ... ],
            "customer_code": "00001"
          }
        }

So I query the products index with the following:

        GET products/_search
        {
          "query": {
            "terms": {
              "id": {
                "index": "contracts",
                "type": "customer",
                "path": "product_id",
                "id": "1-000001"
              }
            }
          }
        }

This works fine when the product_id array on the contract is small, but some contracts have a lot of products (2000+), at which point the query fails with too_many_clauses. Even when the number of products is small enough for the query to execute properly, it is still quite slow.
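One workaround I could imagine is splitting the product codes into batches that stay under the clause limit and running one search per batch. Below is a minimal Python sketch of that idea, not something I have run against the cluster. It assumes the limit is 1024 (the default of `indices.query.bool.max_clause_count`), that the product codes have already been read from the contract's `product_id` field, and that products are matched on the `id` field as in my query above. The names `chunked` and `build_batched_bodies` are made up for illustration.

```python
from typing import Dict, Iterator, List

# Assumption: 1024 is the default of indices.query.bool.max_clause_count.
DEFAULT_MAX_CLAUSES = 1024


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batched_bodies(
    product_ids: List[str], max_clauses: int = DEFAULT_MAX_CLAUSES
) -> List[Dict]:
    """Build one search body per batch of product codes.

    Each body matches the `id` field against at most `max_clauses`
    values and wraps the terms query in a bool filter (no scoring).
    """
    return [
        {"query": {"bool": {"filter": {"terms": {"id": batch}}}}}
        for batch in chunked(product_ids, max_clauses)
    ]


if __name__ == "__main__":
    # Hypothetical contract with 2500 product codes.
    ids = [f"P{n:05d}" for n in range(2500)]
    bodies = build_batched_bodies(ids)
    print(len(bodies))  # prints 3 (batches of 1024, 1024 and 452 codes)
```

Each body would then be sent to `products/_search` separately and the hits merged on the client side, which costs extra round trips but keeps every single request under the limit.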

In short, what I'm trying to do is query the products index and say "give me all the products in this contract". Does anybody have an idea how to improve this? Thanks!

Note: We recently upgraded from an old (pre-1.0) version of ES to 5.1.1. We had a similar query working as a terms filter, which seemed to ignore the max clause count, while the terms query does not.
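Since the old terms filter ran without scoring, the closest 5.x equivalent I can think of is putting the same terms lookup inside the `filter` clause of a `bool` query. Here is a small Python sketch that builds such a request body. Whether this actually avoids the clause limit on 5.1.1 is an assumption I have not verified, and `build_lookup_filter_body` is a made-up helper name.

```python
import json
from typing import Dict


def build_lookup_filter_body(
    index: str, doc_type: str, doc_id: str, path: str, field: str = "id"
) -> Dict:
    """Build a search body that runs a terms lookup in filter context."""
    # Same lookup parameters as the plain terms query, fetched from the contract document.
    lookup = {"index": index, "type": doc_type, "id": doc_id, "path": path}
    # bool.filter clauses are not scored, similar to the pre-2.0 terms filter.
    return {"query": {"bool": {"filter": {"terms": {field: lookup}}}}}


if __name__ == "__main__":
    body = build_lookup_filter_body("contracts", "customer", "1-000001", "product_id")
    print(json.dumps(body, indent=2))
```

The printed JSON could be pasted as the body of `GET products/_search` to compare its behavior with the plain terms query.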
