No aggregation results unless more than half the shards are successful?


I would just like to confirm, and ideally find documentation for, something we noticed (and had to invest a bit of time to actually understand).

We had a case where some indices had an incorrect mapping which caused some shards to fail, while others could execute the query successfully.

What was strange was that sometimes we got no results at all, and sometimes we at least got results from the successful shards. After looking into this, we found that it seems to depend on the number of failed and successful shards: aggregation results appear to be returned only if more shards succeed than fail.

Is this expected? Is it documented somewhere? We couldn't find any description of this behavior, but would like to verify it so we can act accordingly.

This is with Elasticsearch 5.3.3. Indices typically have 6 shards, and between 3 and 20 indices are included in the queries.

This query returns values, 6 successful, 3 failed:
2017-12-04 07:00:31 UTC WARNING [<0x4>] [SearchResponseFutureImpl] 3 shard(s) failed in Elasticsearch SearchResponse while executing query on indexes [abc1 abc2, abc3] (96ms/99ms), hits: 72598, successful/failed shards: 6/3. Failure(s):

This query does not return any value at all because index abc4 adds more failing shards so that we have 6 successful and 6 failed:
2017-12-03 22:34:32 UTC WARNING [<0x4>] [SearchResponseFutureImpl] 6 shard(s) failed in Elasticsearch SearchResponse while executing query on indexes [abc1, abc2, abc3, abc4] (73ms/77ms), hits: 71703, successful/failed shards: 6/6. Failure(s):
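
For anyone comparing the two cases above: a search response carries a `_shards` section with `successful` and `failed` counts, so partial results can be detected on the client side. Below is a minimal sketch in plain Python, assuming the response is already parsed into a dict. The function name `summarize_shards` and the `majority_successful` field are made up for illustration; the "more successful than failed" rule is only the behavior we observed, not documented Elasticsearch logic.

```python
def summarize_shards(response):
    """Summarize the `_shards` section of a search response (a parsed dict)."""
    shards = response.get("_shards", {})
    successful = shards.get("successful", 0)
    failed = shards.get("failed", 0)
    return {
        "successful": successful,
        "failed": failed,
        # Any failed shard means hits/aggregations may be incomplete.
        "partial": failed > 0,
        # Hypothetical check mirroring the pattern observed in this thread.
        "majority_successful": successful > failed,
    }


# Shard counts taken from the two log lines above.
returned_values = {"_shards": {"successful": 6, "failed": 3}}
returned_nothing = {"_shards": {"successful": 6, "failed": 6}}

print(summarize_shards(returned_values))
print(summarize_shards(returned_nothing))
```

Checking `partial` on every response, rather than relying on whether aggregations happen to be present, would at least make the incomplete cases visible.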

I'm not aware of this logic. Returning results even if they are partial is the objective; however, we are moving towards a policy of returning no results and an error in future versions if there's any reason for missing responses. See "New query flag: allow_partial_results with default set to true" (Issue #27435 in elastic/elasticsearch on GitHub).

It looked fairly definite to us: more successful shards than failed ones made it return data, otherwise not. Unless we overlooked something in our own logic on top of Elasticsearch, naturally.

Thanks for the response and the issue link. It looks like we are currently "halfway there", kind of :)

Maybe some shards are "skipped" because they don't contain data relevant to the query, and the remaining non-skipped shards all failed, triggering a failure condition.
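
To make that hypothesis concrete, here is a rough Python sketch (not the Elasticsearch source) of a client-side check for "every shard that actually ran the query failed". It assumes that the `_shards` section may contain a `skipped` count and that skipped shards are included in `successful`; both are assumptions to verify against your version, and `skipped` simply defaults to 0 when it is absent. The function name is hypothetical.

```python
def all_queried_shards_failed(shards):
    """Return True if no shard that actually executed the query succeeded.

    `shards` is the `_shards` dict of a search response.
    Assumption: skipped shards are counted inside `successful`.
    """
    skipped = shards.get("skipped", 0)
    failed = shards.get("failed", 0)
    # Shards that really ran the query and succeeded.
    really_successful = shards.get("successful", 0) - skipped
    return failed > 0 and really_successful <= 0


# Hypothetical counts: 6 "successful" shards that were all skipped, 6 failed.
print(all_queried_shards_failed({"successful": 6, "skipped": 6, "failed": 6}))
# Counts from the first log line in this thread, with no skipped shards.
print(all_queried_shards_failed({"successful": 6, "failed": 3}))
```

If this check is true for the 6/6 case and false for the 6/3 case in your data, that would support the skipped-shards explanation over a simple majority rule.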
