I have a query that throws an exception, and I can't seem to work around it. My actual configuration is a bit more complex than what I have below, but I've managed to cook up an example that still produces the error.
I'm running Elasticsearch 7.6.2.
My index and mappings:
PUT /test

PUT /test/_mappings
{
  "properties": {
    "rootProperty": {
      "type": "keyword"
    },
    "children": {
      "type": "nested",
      "properties": {
        "childPropertyA": {
          "type": "keyword"
        },
        "childPropertyB": {
          "type": "text"
        }
      }
    }
  }
}
I then insert a document:
PUT /test/_doc/1
{
  "rootProperty": "root",
  "children": [
    {
      "childPropertyA": "1",
      "childPropertyB": "abc"
    },
    {
      "childPropertyA": "2",
      "childPropertyB": "def"
    },
    {
      "childPropertyA": "3",
      "childPropertyB": "ghi"
    },
    {
      "childPropertyA": "4",
      "childPropertyB": "jkl"
    }
  ]
}
I then perform the following query:
GET /test/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "rootProperty": "root" }}
      ]
    }
  },
  "aggs": {
    "aggsNested": {
      "nested": {
        "path": "children"
      },
      "aggs": {
        "aggsChildPropertyAValues": {
          "composite": {
            "size": 100,
            "sources": [
              { "childPropertyAValues": { "terms": { "field": "children.childPropertyA" }} }
            ]
          },
          "aggs": {
            "aggsAdditionalData": {
              "top_hits": {
                "size": 1,
                "_source": {
                  "includes": [
                    "children.childPropertyB"
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
}
I receive this exception:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "array_index_out_of_bounds_exception",
        "reason" : "Index 129 out of bounds for length 129"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "test",
        "node" : "8vCzoJ9HS7KsMD168FxXqg",
        "reason" : {
          "type" : "array_index_out_of_bounds_exception",
          "reason" : "Index 129 out of bounds for length 129"
        }
      }
    ],
    "caused_by" : {
      "type" : "array_index_out_of_bounds_exception",
      "reason" : "Index 129 out of bounds for length 129",
      "caused_by" : {
        "type" : "array_index_out_of_bounds_exception",
        "reason" : "Index 129 out of bounds for length 129"
      }
    }
  },
  "status" : 500
}
Some notes on this setup:
- The query in the search is meant to filter down the data before it is aggregated.
- I'm using nested child documents because the length of that array can get large, and I need to maintain the structure of the child data.
- The composite over childPropertyA is meant to extract all the unique childPropertyA values across the data.
- I'm using a composite so I can 'page' through the data to get all unique values.
- The size of 100 on the composite was set arbitrarily in my initial tests; the live system would use a tuned value.
- I have a top_hits sub-aggregation because I need additional information for each of the unique values returned, namely some additional properties from the child objects.
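For context, this is roughly how I intend to page through the composite once it works: each response from a composite aggregation carries an after_key, and the next request passes that value back in the composite's "after" parameter. The sketch below assumes the previous page returned an after_key of { "childPropertyAValues": "2" }; that value is only illustrative, and I've left out the root query and the top_hits sub-aggregation to keep it short.

```
# Hypothetical follow-up page: the "after" value is assumed to be the
# after_key returned by the previous response, not a real result.
GET /test/_search
{
  "size": 0,
  "aggs": {
    "aggsNested": {
      "nested": {
        "path": "children"
      },
      "aggs": {
        "aggsChildPropertyAValues": {
          "composite": {
            "size": 100,
            "sources": [
              { "childPropertyAValues": { "terms": { "field": "children.childPropertyA" }} }
            ],
            "after": { "childPropertyAValues": "2" }
          }
        }
      }
    }
  }
}
```

I would repeat this until a response comes back with no buckets.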
Some observations while 'playing' with the query:
- If I set the size on the composite to 1 or 2, the query runs; if I set it to 3 or above, I get the exception.
- If I remove the root-level query that filters down the data, the query runs.
- If I remove the top_hits sub-aggregation, the query runs.
Any help would be much appreciated.