I am running tests against a node with 2TB of data and 20GB of heap space. In elasticsearch.log I am seeing entries like these:
[2015-10-27 09:01:59,437][WARN ][monitor.jvm ] [00] [gc][old][6016][716] duration [10.4s], collections [1]/[11.1s], total [10.4s]/[5.8m], memory [19gb]->[18gb]/[19.9gb], all_pools {[young] [337.7mb]->[14.8mb]/[532.5mb]}{[survivor] [66.5mb]->[0b]/[66.5mb]}{[old] [18.6gb]->[18gb]/[19.3gb]}
When I run simultaneous test queries, the numfound value returned sometimes changes after the initial query. The following shows values I write to a log file during the queries:
2015-10-27 08:44:51 id [4250, 0]: current row 0; doc count 10000; numfound 109706
2015-10-27 08:44:51 id [4250, 0]: found 109706 with id 2331502282
2015-10-27 08:45:06 id [4250, 0]: current row 10000; doc count 10000; numfound 236890
2015-10-27 08:45:06 id [4250, 0]: found 236890 with id 2331482575
2015-10-27 08:45:22 id [4250, 0]: current row 20000; doc count 10000; numfound 236890
In the log above you can see the current row (the ES query's 'start'), the number of docs returned by the query (the ES query's 'size'), and the numfound value in the query's result set. Can somebody explain what situation could cause the numfound value to change after the initial query?
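To make the symptom easy to reproduce, here is a minimal sketch that scans log lines in the format shown above and reports every page whose numfound differs from the previous page. The regex and the function name `numfound_changes` are assumptions based only on the sample lines; they are not part of any Elasticsearch API.

```python
import re

# Pattern inferred from the sample log lines above (an assumption).
LINE_RE = re.compile(
    r"current row (?P<row>\d+); doc count (?P<count>\d+); numfound (?P<numfound>\d+)"
)

def numfound_changes(lines):
    """Return (row, previous, current) for each page whose numfound
    differs from the numfound of the previous page."""
    changes = []
    previous = None
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines such as "found ... with id ..."
        current = int(match.group("numfound"))
        if previous is not None and current != previous:
            changes.append((int(match.group("row")), previous, current))
        previous = current
    return changes

SAMPLE = [
    "2015-10-27 08:44:51 id [4250, 0]: current row 0; doc count 10000; numfound 109706",
    "2015-10-27 08:44:51 id [4250, 0]: found 109706 with id 2331502282",
    "2015-10-27 08:45:06 id [4250, 0]: current row 10000; doc count 10000; numfound 236890",
    "2015-10-27 08:45:06 id [4250, 0]: found 236890 with id 2331482575",
    "2015-10-27 08:45:22 id [4250, 0]: current row 20000; doc count 10000; numfound 236890",
]

print(numfound_changes(SAMPLE))  # [(10000, 109706, 236890)]
```

On the sample above, the only change is between the first page (row 0) and the second page (row 10000); the third page agrees with the second.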
This machine is running with very high old-generation heap usage (over 90%), but if memory were really becoming an issue I would have expected an error rather than inaccurate results.
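The "over 90%" figure can be checked against the GC log entry quoted earlier. The sketch below assumes the old pool is logged as `[before]->[after]/[max]` and that sizes use binary units; the helper names are hypothetical.

```python
import re

# Unit multipliers, assuming binary units (1 gb = 1024**3 bytes).
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}

# Matches the old-generation pool, assumed to read [before]->[after]/[max].
OLD_POOL_RE = re.compile(
    r"\{\[old\] \[([\d.]+)([a-z]+)\]->\[([\d.]+)([a-z]+)\]/\[([\d.]+)([a-z]+)\]\}"
)

def to_bytes(value, unit):
    """Convert a logged size such as ('18.6', 'gb') to bytes."""
    return float(value) * UNITS[unit]

def old_gen_usage_after_gc(log_line):
    """Return old-generation usage after the collection as a fraction
    of the pool maximum, or None if the line has no old-pool entry."""
    match = OLD_POOL_RE.search(log_line)
    if match is None:
        return None
    after = to_bytes(match.group(3), match.group(4))
    maximum = to_bytes(match.group(5), match.group(6))
    return after / maximum

GC_LINE = (
    "[2015-10-27 09:01:59,437][WARN ][monitor.jvm ] [00] [gc][old][6016][716] "
    "duration [10.4s], collections [1]/[11.1s], total [10.4s]/[5.8m], "
    "memory [19gb]->[18gb]/[19.9gb], all_pools "
    "{[young] [337.7mb]->[14.8mb]/[532.5mb]}"
    "{[survivor] [66.5mb]->[0b]/[66.5mb]}"
    "{[old] [18.6gb]->[18gb]/[19.3gb]}"
)

usage = old_gen_usage_after_gc(GC_LINE)
print(round(usage * 100, 1))  # 93.3
```

Under those assumptions, the old generation is still about 93% full after a 10.4 s collection that freed only about 0.6 GB.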