I'm on Elasticsearch 7.4. I have queries that search by ID for documents I know exist in the index (which has 1 replica). As a sanity check, I raise an error if I get back fewer than 100% of the requested documents. This is rare, but it happens. The cluster is usually pretty idle, but occasionally it gets busy enough that search queues start to fill, CPU spikes, etc. Both the cluster and the query use the default behavior, where partial search results are allowed. I'm wondering whether, during busy times, I'm getting partial results because of temporary shard failures. From reading up on shard failures, I get the impression that they are different from (and worse than) the coordinating node simply getting a timeout.
Can someone shed some light on whether a busy cluster could return partial search results when the nodes hosting both a primary shard and its replica time out?
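
For context, one way to tell these cases apart would be to inspect the `timed_out` flag and the `_shards` section that every search response carries. Below is a minimal sketch, assuming the response body has already been parsed into a Python dict. The function name `diagnose_search_response`, the index and node names, and the sample response are all hypothetical and only illustrate the shape of the check; they are not output from a real cluster.

```python
from typing import Any, Dict, List


def diagnose_search_response(body: Dict[str, Any], expected_ids: List[str]) -> Dict[str, Any]:
    """Summarize why a search may have returned fewer documents than expected.

    `body` is a parsed Elasticsearch search response; `expected_ids` are the
    document IDs the caller asked for.
    """
    shards = body.get("_shards", {})
    hits = body.get("hits", {}).get("hits", [])

    returned_ids = {hit["_id"] for hit in hits}
    missing_ids = sorted(set(expected_ids) - returned_ids)

    failed_shards = shards.get("failed", 0)
    timed_out = bool(body.get("timed_out", False))

    # Each entry in `_shards.failures` carries a `reason` object with a `type`.
    failure_types = [
        failure.get("reason", {}).get("type")
        for failure in shards.get("failures", [])
    ]

    return {
        "missing_ids": missing_ids,
        "timed_out": timed_out,
        "failed_shards": failed_shards,
        "failure_types": failure_types,
        # Results are partial if any shard failed or the search timed out.
        "partial": timed_out or failed_shards > 0,
    }


# Hypothetical response: one of five shards was rejected by a full search queue.
sample = {
    "timed_out": False,
    "_shards": {
        "total": 5,
        "successful": 4,
        "skipped": 0,
        "failed": 1,
        "failures": [
            {
                "shard": 2,
                "index": "my-index",
                "node": "node-1",
                "reason": {
                    "type": "es_rejected_execution_exception",
                    "reason": "rejected execution (illustrative)",
                },
            }
        ],
    },
    "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]},
}

report = diagnose_search_response(sample, ["1", "2", "3"])
```

Logging a report like this whenever the sanity check fires would show whether the missing documents coincide with `failed` shards, with `timed_out: true`, or with neither. Separately, sending the search with `allow_partial_search_results=false` should make the request fail outright instead of silently returning a partial result, which may be easier to retry on.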