index docs store
mp_v 106319763 8gb
A Spring Boot application generates 10,000 search queries with 10 parallel threads against a classic ELK system, both local and remote.
This takes 40 minutes.
Running the same set of queries against an ECE cluster fails with the exception
"None of the configured nodes were available". Our developers raise the exception only after the last 20 search queries against the ECE cluster have gone unanswered.
The exception occurs somewhere between two and eleven minutes after the search queries start.
The ECE cluster has 16 GB RAM, and JVM memory pressure peaks at 52 percent.
Where can we start analyzing these failures? It does not seem to be a network problem.
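For anyone trying to reproduce this, the load pattern described above (a fixed number of queries spread over parallel threads, giving up after 20 unanswered queries in a row) might look roughly like the sketch below. It is not the original application code: the class name `SearchLoadSketch`, the `run` method and the `query` callback are hypothetical, and the real search call would go where the callback is.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical sketch of the load generator described in the question.
class SearchLoadSketch {

    /** Outcome of one run: queries attempted and whether the run gave up early. */
    static final class Result {
        final int attempted;
        final boolean aborted;

        Result(int attempted, boolean aborted) {
            this.attempted = attempted;
            this.aborted = aborted;
        }
    }

    /**
     * Runs totalQueries queries on a fixed thread pool. The query callback
     * returns true if the search was answered. The run is aborted once
     * maxConsecutiveFailures queries in a row have failed. With several
     * threads "in a row" is only approximate, since completions interleave.
     */
    static Result run(int totalQueries, int threads, int maxConsecutiveFailures, IntPredicate query) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger next = new AtomicInteger();
        AtomicInteger attempted = new AtomicInteger();
        AtomicInteger consecutiveFailures = new AtomicInteger();
        AtomicBoolean aborted = new AtomicBoolean();

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                while (!aborted.get()) {
                    int id = next.getAndIncrement();
                    if (id >= totalQueries) {
                        break; // all queries have been handed out
                    }
                    attempted.incrementAndGet();
                    boolean answered;
                    try {
                        answered = query.test(id);
                    } catch (RuntimeException e) {
                        answered = false; // treat any client error as "no answer"
                    }
                    if (answered) {
                        consecutiveFailures.set(0);
                    } else if (consecutiveFailures.incrementAndGet() >= maxConsecutiveFailures) {
                        aborted.set(true); // this is where the application would raise its exception
                    }
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Result(attempted.get(), aborted.get());
    }
}
```

With the numbers from the question this would be called as `SearchLoadSketch.run(10000, 10, 20, id -> doSearch(id))`, where `doSearch` is a placeholder for the actual search call. Logging the timestamp of each failure inside the callback would show whether the failures come in bursts or are spread evenly.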
AFAIK with some Spring-related projects there's a chance you are using the TransportClient instead of the RestClient.
The TransportClient has been deprecated and is now removed in the master branch. So definitely switch to the RestClient instead.
I think you'll get much more stability as well.
My 2 cents.
We do see issues with HA ES 6.x+ and the transport client due to an incompatibility between the ECE proxy and transport client. The newest ECE proxy fixes it but won't be available for another release or so.
As per above, we recommend using the REST client where possible.
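To illustrate the difference: the REST client talks plain HTTP to the cluster's HTTP endpoint rather than the binary transport protocol. The sketch below is not the Elasticsearch RestClient itself; it uses only the JDK's `java.net.http` (Java 11+) to build a `_search` request, so the shape of the call is visible without extra dependencies. The class name `RestSearchRequestSketch`, the base URL and the port are assumptions; `mp_v` is the index from the question.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper: builds an HTTP search request against the REST endpoint.
class RestSearchRequestSketch {

    /**
     * Builds a POST to {baseUrl}/{index}/_search with a JSON query body.
     * baseUrl is assumed to be the cluster's HTTPS endpoint; authentication
     * headers are left out and would be needed on a secured cluster.
     */
    static HttpRequest buildSearch(String baseUrl, String index, String queryJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .timeout(Duration.ofSeconds(30)) // fail fast instead of hanging on an unresponsive node
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(queryJson))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. In a real application the Elasticsearch REST client would normally be preferable, since it also handles multiple hosts and retries.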
If the issue is performance related (and not the above transport client problem), the most common reason is that ECE throttles instances by default, to minimize "noisy neighbor" effects (basically proportional to the RAM, with a 20% overprovision).
As a result, when comparing performance to "classic" ES performance it sometimes seems like ECE is worse when in practice it's just that ECE is "starving" the cluster. The easiest way to compare apples-to-apples is to set
hard_limit: false in the "Data" section of the advanced editor.
It's also useful to set up monitoring and use the ECE performance dashboards to get a better idea of whether the cluster is being throttled, and where.
This seems to be the solution. Thanks!
This solved another problem for us: we had very high latency values until we changed hard_limit to false.
We still get "None of the configured nodes were available" messages.
Our developers have to do their homework now and change to RestClient. We hope that solves the issue.