Ignore indexes with no hits during aggregation

Our ES cluster uses daily indexes. One field was previously not mapped with type = nested, but in newer daily indexes it is mapped with type = nested.

Our search queries run against a wildcard index pattern, not a specific index, so it's expected that certain queries may span several daily indexes.

If we run a nested aggregation specifically against a daily index created after the switch to type = nested, everything works fine.

However, if we run the same aggregation against our wildcard index pattern, even when the query filters all of the data down to the new index, we get proper results but also a shard failure saying that the nested path is not nested.

Minimal example:

PUT mytestindex-old
{
  "mappings": {
    "properties": {
      "field": {
        "type": "object"
      }
    }
  }
}

PUT /mytestindex-old/_doc/1
{
  "field": {}
}

PUT mytestindex-new
{
  "mappings": {
    "properties": {
      "field": {
        "type": "nested"
      }
    }
  }
}

GET mytestindex-*/_search
{
  "size": 10,
  "query": {"bool": {"must": [{"term": {"_index": "mytestindex-new"}}]}},
  "aggs": {"theAgg": {"nested": {"path": "field"}}}
}

The result comes back with valid results, but also a shard failure:

"reason" : {
  "index" : "mytestindex-old",
  "type" : "aggregation_execution_exception",
  "reason" : "[nested] nested path [field] is not nested"
}

This is obviously an extremely contrived, oversimplified example, but my question is: why does the shard failure occur when no documents from that index are included in the result? That is, although the GET is against mytestindex-*, the query that runs before the aggregation should ensure that the aggregation only runs against the new index.

Welcome!

I believe this is because Elasticsearch tries to validate the query before executing it. In the old index, the mapping for the field named "field" is not of type nested, so the nested aggregation can't be executed there.
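If that's what is happening, one way around it is to search only the indexes whose mapping actually declares the field as nested. Here is a minimal sketch in Python, assuming you have already fetched the mappings with GET mytestindex-*/_mapping. The sample_mappings dict is a hand-written stand-in for that response, and the helper names are hypothetical, not part of any client library.

```python
# Sketch only: pick the indexes whose mapping declares `field` as nested,
# so the search can target just those instead of the whole wildcard.
# `sample_mappings` is a hand-written stand-in for the JSON returned by
# GET mytestindex-*/_mapping; the helper names are hypothetical.

def indices_with_nested_field(mappings, field):
    """Return the sorted index names where top-level `field` has type nested."""
    matching = []
    for index_name, body in mappings.items():
        properties = body.get("mappings", {}).get("properties", {})
        if properties.get(field, {}).get("type") == "nested":
            matching.append(index_name)
    return sorted(matching)


def search_target(mappings, field):
    """Build a comma-separated index list usable in GET <target>/_search."""
    return ",".join(indices_with_nested_field(mappings, field))


sample_mappings = {
    "mytestindex-old": {"mappings": {"properties": {"field": {"type": "object"}}}},
    "mytestindex-new": {"mappings": {"properties": {"field": {"type": "nested"}}}},
}

print(search_target(sample_mappings, "field"))  # prints mytestindex-new
```

The resulting string would replace mytestindex-* in the search path. When only one known index has the wrong mapping, the multi-target syntax also lets you exclude it directly, for example mytestindex-*,-mytestindex-old.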

Appreciate your response!

So in other words, the aggregation is validated against every index that matches the pattern (in my case, mytestindex-*), even though the query filter would limit the aggregation to only those indexes that have the proper mapping?

Is this desirable behavior? I.e. if I am defining the query in such a way that I know the aggregation would succeed, should the validation occur against indexes that have no hits?

I believe it is.

I don't understand the use case where you'd mix indices in a query where you don't have a consistent mapping.

I'd try to use the same mapping for the same fields in all indices that are queried together, and move anything that is not shared by all of the indices under separate, index-specific fields.
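For old indexes that already exist, making the mapping consistent would probably mean recreating them, since as far as I know the type of an existing field can't be changed in place. Here is a rough sketch in Python of the two request bodies involved. The destination name mytestindex-old-v2 and the helper name are made up for illustration, and the code only builds the bodies, it doesn't send anything.

```python
# Sketch only: build the request bodies for recreating an old daily index
# with `field` mapped as nested, then copying the documents across.
# The destination index name and the helper name are hypothetical.

def nested_migration_requests(source_index, dest_index, field):
    """Return (method, path, body) tuples in the order they would be sent."""
    create_body = {"mappings": {"properties": {field: {"type": "nested"}}}}
    reindex_body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return [
        ("PUT", "/" + dest_index, create_body),
        ("POST", "/_reindex", reindex_body),
    ]


for method, path, body in nested_migration_requests(
    "mytestindex-old", "mytestindex-old-v2", "field"
):
    print(method, path, body)
```

Note that the old index still matches the wildcard after the copy, so it would have to be deleted or excluded from the search before the shard failure goes away.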
