Query + composite + top_hits = exception

I have a query that throws an exception, and I can't seem to work around it. My actual configuration is a bit more complex than what I have below, but I've managed to cook up an example that still produces the error.

Running Elasticsearch 7.6.2.

My index and mappings:

PUT /test
PUT /test/_mappings
{
  "properties": {
    "rootProperty": {
      "type": "keyword"
    },
    "children": {
      "type": "nested",
      "properties": {
        "childPropertyA": {
          "type": "keyword"
        },
        "childPropertyB": {
          "type": "text"
        }
      }
    }
  }
}

I then insert a document:

PUT /test/_doc/1
{
  "rootProperty": "root",
  "children": [
    {
      "childPropertyA": "1",
      "childPropertyB": "abc"
    },
    {
      "childPropertyA": "2",
      "childPropertyB": "def"
    },
    {
      "childPropertyA": "3",
      "childPropertyB": "ghi"
    },
    {
      "childPropertyA": "4",
      "childPropertyB": "jkl"
    }
  ]
}

I then perform the following query:

GET /test/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "rootProperty": "root" }}
      ]
    }
  },
  "aggs": {
    "aggsNested": {
      "nested": {
        "path": "children"
      },
      "aggs": {
        "aggsChildPropertyAValues": {
          "composite": {
            "size": 100,
            "sources": [
              { "childPropertyAValues": { "terms": { "field": "children.childPropertyA" }} }
            ]
          },
          "aggs": {
            "aggsAdditionalData": {
              "top_hits": {
                "size": 1,
                "_source": { 
                  "includes": [
                    "children.childPropertyB"
                  ]
                }
              }
            }
          }
        }   
      }
    }
  }
}

I receive this exception:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "array_index_out_of_bounds_exception",
        "reason" : "Index 129 out of bounds for length 129"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "test",
        "node" : "8vCzoJ9HS7KsMD168FxXqg",
        "reason" : {
          "type" : "array_index_out_of_bounds_exception",
          "reason" : "Index 129 out of bounds for length 129"
        }
      }
    ],
    "caused_by" : {
      "type" : "array_index_out_of_bounds_exception",
      "reason" : "Index 129 out of bounds for length 129",
      "caused_by" : {
        "type" : "array_index_out_of_bounds_exception",
        "reason" : "Index 129 out of bounds for length 129"
      }
    }
  },
  "status" : 500
}

Some notes on this setup:

  • The query in the search is meant to filter down the data before it is aggregated.
  • I'm using nested child documents because the length of that array can get large, and I need to maintain the structure of the child data.
  • The composite over childPropertyA is meant to extract all the unique childPropertyA values across the data.
  • I'm using a composite so I can 'page' through the data to get all unique values.
  • The size of 100 on the composite was set arbitrarily in my initial tests; the live system would use a tuned value.
  • I have a top_hits sub-aggregation under the composite because I need additional information for each of the unique values returned, namely some additional properties from the child objects.
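
A minimal Python sketch of the paging described in the notes above, under stated assumptions: `search` is a hypothetical callable that takes a request body and returns the parsed JSON response (for example a thin wrapper around whatever client is in use), and `build_body` and `iter_unique_values` are illustrative names. It relies on the composite aggregation returning an `after_key`, which is passed back as `after` in the next request. The top_hits sub-aggregation is left out here to keep the paging logic in focus.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Body = Dict[str, Any]


def build_body(page_size: int, after: Optional[Dict[str, Any]] = None) -> Body:
    """Build the search body from the post (without top_hits) for one page."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [
            {"childPropertyAValues": {"terms": {"field": "children.childPropertyA"}}}
        ],
    }
    if after is not None:
        # Resume after the last bucket key of the previous page.
        composite["after"] = after
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"term": {"rootProperty": "root"}}]}},
        "aggs": {
            "aggsNested": {
                "nested": {"path": "children"},
                "aggs": {"aggsChildPropertyAValues": {"composite": composite}},
            }
        },
    }


def iter_unique_values(
    search: Callable[[Body], Dict[str, Any]], page_size: int = 100
) -> Iterator[str]:
    """Yield every unique childPropertyA value, one composite page at a time.

    `search` is a hypothetical function: request body in, parsed response out.
    """
    after: Optional[Dict[str, Any]] = None
    while True:
        response = search(build_body(page_size, after))
        agg = response["aggregations"]["aggsNested"]["aggsChildPropertyAValues"]
        buckets = agg["buckets"]
        for bucket in buckets:
            yield bucket["key"]["childPropertyAValues"]
        after = agg.get("after_key")
        # Stop when a page comes back empty or no continuation key is given.
        if not buckets or after is None:
            return
```

With a real client, `search` could be something like `lambda body: client.search(index="test", body=body)`, where `client` is assumed to be an already configured Elasticsearch client object.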

Some observations while 'playing' with the query:

  • If I set the size on the composite to 1 or 2, the query runs, but if I set it to 3 or above, I get the exception.
  • If I remove the root-level query that filters down the data, the query runs.
  • If I remove the top_hits child aggregation, the query runs.
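
To check these observations systematically, the three knobs above (composite size, root query, top_hits) can be turned into a small matrix of request variants. The following is only a Python sketch: `build_variant` and `all_variants` are hypothetical names, the size list is an arbitrary choice, and the code only builds request bodies. Sending them to `GET /test/_search` and recording which ones fail is left to whatever client is in use.

```python
import itertools
import json
from typing import Any, Dict, Iterator, Sequence, Tuple

Body = Dict[str, Any]


def build_variant(composite_size: int, with_query: bool, with_top_hits: bool) -> Body:
    """Build the request from the post with individual parts switched on or off."""
    composite_agg: Dict[str, Any] = {
        "composite": {
            "size": composite_size,
            "sources": [
                {"childPropertyAValues": {"terms": {"field": "children.childPropertyA"}}}
            ],
        }
    }
    if with_top_hits:
        composite_agg["aggs"] = {
            "aggsAdditionalData": {
                "top_hits": {
                    "size": 1,
                    "_source": {"includes": ["children.childPropertyB"]},
                }
            }
        }
    body: Body = {
        "size": 0,
        "aggs": {
            "aggsNested": {
                "nested": {"path": "children"},
                "aggs": {"aggsChildPropertyAValues": composite_agg},
            }
        },
    }
    if with_query:
        body["query"] = {"bool": {"filter": [{"term": {"rootProperty": "root"}}]}}
    return body


def all_variants(
    sizes: Sequence[int] = (1, 2, 3, 100),
) -> Iterator[Tuple[Tuple[int, bool, bool], Body]]:
    """Yield ((size, with_query, with_top_hits), body) for every combination."""
    for size, with_query, with_top_hits in itertools.product(
        sizes, (True, False), (True, False)
    ):
        yield (size, with_query, with_top_hits), build_variant(
            size, with_query, with_top_hits
        )


if __name__ == "__main__":
    # Print each variant so it can be pasted into a console or sent by a client.
    for label, body in all_variants():
        print(label, json.dumps(body))
```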

Any help would be much appreciated.
