Looking for a little guidance/suggestions on how to debug an issue I'm having. I'm performing a non-trivial search, and each time I execute the same search I get back one of two results. One result has 3 hits and the other has about a dozen hits (including the 3 hits from the first result). So, for example, I will get back document IDs A, D, and G on one execution and A, B, C, D, E, F, G on the next. If I keep executing the search in succession, it toggles back and forth between the two results.
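To make the toggling concrete, here is a minimal Python sketch of how repeated runs of the same search could be grouped by the set of document IDs they return. The function name `summarize_runs` and the sample IDs are hypothetical (the IDs mirror the example above). How the ID lists are collected from the search responses is left out.

```python
from collections import Counter


def summarize_runs(runs):
    """Group repeated runs of one search by the set of document IDs returned.

    `runs` is a list of ID lists, one list per execution (hypothetical input;
    in practice the IDs would be pulled from each search response).
    """
    if not runs:
        raise ValueError("need at least one run")
    # Hit order is ignored; only which documents came back matters here.
    counts = Counter(frozenset(ids) for ids in runs)
    distinct = list(counts)
    always = frozenset.intersection(*distinct)  # IDs present in every run
    ever = frozenset.union(*distinct)           # IDs present in any run
    return {
        "distinct_result_sets": len(distinct),
        "always_returned": sorted(always),
        "sometimes_missing": sorted(ever - always),
    }


# Sample data mirroring the A/D/G versus A..G example in this post.
runs = [
    ["A", "D", "G"],
    ["A", "B", "C", "D", "E", "F", "G"],
    ["A", "D", "G"],
    ["A", "B", "C", "D", "E", "F", "G"],
]
summary = summarize_runs(runs)
print(summary)
```

The `sometimes_missing` list is the interesting part: those are the documents worth inspecting individually to see what they have in common.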
My search contains a number of nested boolean queries, function score queries, etc. I am also using a post-filter with 6 different terms lookup queries that are being cached and are set to expire every hour.
From my elasticsearch.yml:
indices.cache.filter.terms.expire_after_access: 3600s
indices.cache.filter.terms.expire_after_write: 3600s
I have a 3 node cluster running v1.7.1. There are 5 shards and 2 replicas for the index, so each node has all the shards on it (either a primary or replica).
This issue happens sporadically and isn't reproducible (at least I haven't discovered how to reproduce it on demand yet). When it does happen, I can query each of the nodes and the cluster, and the problem will only occur on a single node. I haven't yet been able to figure out whether it is always the same node that has the problem. There is activity (indexing/searching) occurring on my cluster while I'm executing these searches, but nothing that would affect the documents that should match this query.
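One way the per-node check could be scripted is sketched below in Python. It assumes the search `preference` parameter accepts `_only_node:<node_id>` on this version, which should be verified against the 1.7 documentation. The base URL, index name, node IDs, and helper names are all hypothetical. Real node IDs would have to come from the cluster itself.

```python
from urllib.parse import urlencode


def node_pinned_url(base_url, index, node_id):
    """Build a _search URL restricted to one node's shard copies.

    Assumes preference=_only_node:<node_id> is supported by the cluster.
    """
    query = urlencode({"preference": "_only_node:" + node_id})
    return "{}/{}/_search?{}".format(base_url.rstrip("/"), index, query)


def unstable_nodes(runs_by_node):
    """Return the nodes whose repeated runs did not all return the same ID set."""
    return sorted(
        node
        for node, runs in runs_by_node.items()
        if len({frozenset(ids) for ids in runs}) > 1
    )


# Hypothetical node IDs and recorded results, for illustration only.
url = node_pinned_url("http://localhost:9200/", "myindex", "node-1")
recorded = {
    "node-1": [["A", "D", "G"], ["A", "D", "G"]],
    "node-2": [["A", "D", "G"], ["A", "B", "C", "D", "E", "F", "G"]],
    "node-3": [["A", "B", "C", "D", "E", "F", "G"]] * 2,
}
suspects = unstable_nodes(recorded)
print(url)
print(suspects)
```

Recording which node ends up in `suspects` each time the problem appears would answer whether it is always the same node.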
Any help anyone can offer would be greatly appreciated. At this point I'm just trying to isolate the issue so that I can make a targeted fix rather than blindly making changes to my query structure (like simplifying my post filters).
Thanks