Inconsistent search result counts on a single node that I do not see with two nodes

I set up the following two environments:

  1. Two Node Cluster
    • 10 shards per index
    • 2 replicas
    • 24GB per node heap size
    • both nodes are on the same physical box, just running with
      different directories and ports
  2. One Node Cluster
    • 5 shards per index
    • 0 replicas
    • 48GB heap size

I loaded 500M records into both and performed hundreds of searches to
collect search timing statistics. I used the same test script on both
environments (the script iterates through and builds curl commands to send
a request to the local node). I performed the last set of tests after all
data had been fully loaded and no more data was coming into ES.
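
A minimal sketch of this kind of test loop, written in Python rather than
as generated curl commands. The search URL passed to run_repeated (host,
port and index name) and the helper names are placeholders for
illustration, not the actual script; the response fields "took" and
"hits.total" are the ones printed in the results below.

```python
import json
import urllib.request


def build_query(term):
    """Request body for a query_string search on the given term."""
    return {"query": {"query_string": {"query": term}}}


def extract_stats(response):
    """Return (total_hits, took_ms) from a parsed search response."""
    total = response["hits"]["total"]
    # Newer Elasticsearch versions report the total as
    # {"value": ..., "relation": ...}; older ones use a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    return total, response["took"]


def run_repeated(search_url, term, repeats=5):
    """Send the same search several times and collect (count, took) pairs.

    search_url is a placeholder such as
    "http://localhost:9200/myindex/_search" (hypothetical index name).
    """
    body = json.dumps(build_query(term)).encode("utf-8")
    results = []
    for _ in range(repeats):
        request = urllib.request.Request(
            search_url,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as resp:
            results.append(extract_stats(json.load(resp)))
    return results
```

Comparing the first element of each pair across repeats shows whether the
count is stable for a given term.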

When I ran the tests on the first environment, repeating the same query
always gave the same number of records in the output JSON, although the
timing varied slightly (which was expected):

Search for term 25344701 records match, took 9134ms
Search for term 25344701 records match, took 9639ms
Search for term 25344701 records match, took 274ms
Search for term 25344701 records match, took 288ms
Search for term 25344701 records match, took 304ms

The count was the same each time.

When I did the same test in the second environment, each time I searched
for the same term, I got a different count of records:

Search for term 25342102 records match, took 7911ms
Search for term 25339283 records match, took 942ms
Search for term 25344701 records match, took 1138ms
Search for term 25344701 records match, took 544ms
Search for term 25344628 records match, took 375ms

The counts are not off by much, but I don't understand why they are
complete in the two-node case and not in the single-node case. Is it only
because the queries are not being run in parallel on two nodes (or on more
replicas)?

It does not appear to be a time-out issue because I also have searches that
return in low hundreds of ms that have the same problem as searches that
take hundreds of seconds (massively wildcarded searches).
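
One way to check this per response, rather than inferring it from timing,
is to look at the "timed_out" flag and the "_shards" section that every
search response carries. The sketch below is only an assumption about how
such a check could be written (the function name is made up); it reports
why a given response might hold partial results.

```python
def partial_result_reasons(response):
    """List the reasons a parsed search response may be incomplete."""
    reasons = []
    if response.get("timed_out", False):
        reasons.append("timed_out")

    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    successful = shards.get("successful", total)
    failed = shards.get("failed", 0)

    if failed > 0:
        reasons.append(f"{failed} of {total} shards failed")
    elif successful < total:
        reasons.append(f"only {successful} of {total} shards succeeded")
    return reasons
```

An empty list for a response whose count is lower than expected would
suggest the difference is not caused by a time-out or a failed shard.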

I am using the query_string search API. Everything else besides the above
settings is default. I also have various tests with/without facets and
with/without sorting, but the same pattern happens every time. I get the
full count in environment 1, but not in environment 2 for every repeat
search.

Thanks,
Jerry
