We are attempting to performance-test a copy of our existing operational cluster in AWS. We configured the test cluster to match the operational cluster as closely as we could. We are running Elasticsearch 1.6.2.
The short story is that performance is slightly better on the new test cluster when running the same queries against our small indexes. For our large indexes, however, performance is much worse: the median query time against our large indexes went from 150ms to 767ms.
When looking at the metrics in Marvel, everything looks similar except for the Lucene memory. In our operational cluster it is 8 GB on each node, but in the test cluster it is 780 MB on each node. I would like to understand what could cause such a difference. It seems like I must have misconfigured some setting, but I am not sure where to look. As far as I can tell, everything is configured the same for both clusters. Does anyone have advice on where to start looking? Or can you provide a link that better explains indices.segments.memory_in_bytes?
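To compare this figure outside of Marvel, one option is to pull the node stats from each cluster (for example with `GET /_nodes/stats/indices`) and summarize the segment count and segment memory per node. The sketch below is only a rough illustration: it assumes the response nests the values under `nodes.<id>.indices.segments` as `count` and `memory_in_bytes`, and the node names and segment counts in the sample are made-up placeholders, not measurements from either cluster.

```python
def segment_memory_by_node(node_stats):
    """Map node name -> (segment_count, memory_in_bytes) from a node stats response."""
    summary = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        # Assumed layout: nodes.<id>.indices.segments.{count, memory_in_bytes}
        segments = node.get("indices", {}).get("segments", {})
        name = node.get("name", node_id)
        summary[name] = (
            segments.get("count", 0),
            segments.get("memory_in_bytes", 0),
        )
    return summary


def format_summary(summary):
    """Render one line per node, including average memory per segment."""
    mb = 1024.0 * 1024.0
    lines = []
    for name in sorted(summary):
        count, mem = summary[name]
        per_segment = mem / float(count) if count else 0.0
        lines.append(
            "%s: %d segments, %.1f MB total, %.2f MB per segment"
            % (name, count, mem / mb, per_segment / mb)
        )
    return lines


# Hypothetical sample shaped like a node stats response (placeholder values).
sample = {
    "nodes": {
        "abc123": {
            "name": "node-a",
            "indices": {
                "segments": {"count": 120, "memory_in_bytes": 8 * 1024 ** 3}
            },
        }
    }
}

for line in format_summary(segment_memory_by_node(sample)):
    print(line)
```

Running this against the saved response from each cluster would show whether the difference comes from the number of segments per node or from the memory used per segment.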
I've set up the test cluster two different ways:
31GB heap size out of a total of 61GB on each node.
24GB heap size out of a total of 61GB on each node.
The Lucene memory is 780MB on each node (vs the expected 8 GB) with both of those configurations.
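Since the heap size does not change the figure, a settings difference between the two clusters is one possibility worth ruling out. A minimal sketch for that, assuming the settings from each cluster have been saved as nested JSON (for example from `GET /_cluster/settings` or `GET /<index>/_settings`): it flattens both documents into dotted keys and reports only the keys whose values differ. The setting names and values in the sample are hypothetical placeholders.

```python
def flatten(settings, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} form."""
    flat = {}
    for key, value in settings.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_settings(operational, test):
    """Return {key: (operational_value, test_value)} for keys that differ.

    A key missing on one side is reported with None for that side.
    """
    flat_op, flat_test = flatten(operational), flatten(test)
    diffs = {}
    for key in sorted(set(flat_op) | set(flat_test)):
        op_value, test_value = flat_op.get(key), flat_test.get(key)
        if op_value != test_value:
            diffs[key] = (op_value, test_value)
    return diffs


# Hypothetical placeholder settings, not taken from either real cluster.
operational_settings = {"index": {"number_of_shards": "5", "refresh_interval": "1s"}}
test_settings = {"index": {"number_of_shards": "5", "refresh_interval": "30s"}}

for key, (op_value, test_value) in diff_settings(operational_settings, test_settings).items():
    print("%s: operational=%r test=%r" % (key, op_value, test_value))
```

An empty result only means the saved settings match; it says nothing about differences in the index data itself, such as mappings or segment layout.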